Recently I worked with a customer piloting Office 365, moving from an on-premises Exchange 2007 server to Office 365. As part of that work we installed an Exchange 2013 server to act as their hybrid server during the pilot, and possibly for the entire move to the cloud. Once all the relevant infrastructure was in place (ADFS, ADFS Proxy (WAP), DirSync and Exchange 2013 Hybrid), we began testing basic functionality, starting with ADFS login and DirSync synchronization to the cloud. Next we tested mailbox moves to the cloud. Our first three test/IT mailboxes moved without major issues. The next test was mail flow between the cloud and on-premises:
- Email from Exchange 2007 to Office 365.
- Email from Office 365 to Exchange 2007.
- Email from Office 365 to external recipient.
- Email from external recipient to Office 365.
We noticed that the first two tests were not working as expected: messages were queuing up in transit to their destination. We began troubleshooting the flow of SMTP traffic. First we verified that the connectors existed in Office 365 and on the hybrid Exchange 2013 server; these appeared to be configured correctly. For email to the cloud we changed the Send connector's protocol logging level from None to Verbose. For more clues we also checked the message tracking logs and the Application log for Exchange. The errors seemed to indicate a communication error for outbound email from the Exchange 2013 hybrid server to Office 365. We saw this error in the logs:
The first thing that came to mind was which rules were configured on the firewall for SMTP traffic. The client confirmed that the firewall allowed only the older Exchange servers to send email to the Internet; no other IPs could send email. We then added a rule allowing the hybrid server to send email outside the company. Once the rule was added, we had good, predictable mail flow to mailboxes on Office 365.
One problem solved.
On to inbound traffic.
Messages from Office 365 were queued up on the Exchange 2013 hybrid server. The queue that held the messages did not show any errors pertaining to Exchange Server authentication, which seems to be the most common type of issue. For other migrations involving Exchange 2003/2007, I’ve adjusted tarpit settings as well as MaxAcknowledgementDelay settings. In this case, these settings would not help.
In the Queue Viewer for Exchange 2013 the error message was a 4.2.1 – socket error. We turned up the protocol logs on both sides of the connection and then reviewed the message tracking logs. On the Exchange 2007 server we could see the server connect, advertise its services and exchange certificates… then the conversation would just stop. No timeout, no errors, no further communication. Strange. We checked the connection settings, and the Receive connector was set correctly. I increased the timeout to make sure this was not the issue, but no email passed and the errors were the same. We created a connector dedicated to the two Exchange servers and received the exact same error.
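When a protocol log shows a session that simply stops after the TLS handshake, as described above, it can help to scan the whole log for other sessions with the same pattern. The following is a minimal Python sketch, not something used in the original troubleshooting. It assumes the log is in the CSV layout Exchange uses for SMTP protocol logs (comment lines starting with `#`, a `#Fields:` header, and `session-id`, `event` and `data` columns). The function name `find_stalled_sessions` and the heuristic (a `STARTTLS` command with no later `MAIL FROM`) are illustrative assumptions.

```python
import csv
import sys

# Column order assumed when the log carries no "#Fields:" header line.
DEFAULT_FIELDS = [
    "date-time", "connector-id", "session-id", "sequence-number",
    "local-endpoint", "remote-endpoint", "event", "data", "context",
]


def find_stalled_sessions(log_lines):
    """Return session IDs that issued STARTTLS but never reached MAIL FROM.

    This is a heuristic (an assumption, not an Exchange feature): a session
    that negotiates TLS and then carries no message is worth a closer look.
    """
    fields = DEFAULT_FIELDS
    records = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header/comment line; pick up the column names if present.
            if line.startswith("#Fields:"):
                names = line[len("#Fields:"):].split(",")
                fields = [name.strip() for name in names]
            continue
        records.append(line)

    sessions = {}  # session-id -> {"tls": bool, "mail": bool}
    for values in csv.reader(records):
        record = dict(zip(fields, values))
        session_id = record.get("session-id")
        if not session_id:
            continue
        data = record.get("data", "").strip().upper()
        state = sessions.setdefault(session_id, {"tls": False, "mail": False})
        if data == "STARTTLS":
            # The command itself, not the "250-STARTTLS" advertisement.
            state["tls"] = True
        elif state["tls"] and data.startswith("MAIL FROM"):
            state["mail"] = True

    return [sid for sid, s in sessions.items() if s["tls"] and not s["mail"]]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py <path to a protocol log file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for session in find_stalled_sessions(handle):
            print("possibly stalled session:", session)
```

A flagged session is only a hint. A connection that was dropped for unrelated reasons right after TLS would look the same, so the surrounding log lines still need to be read by hand.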
After a bit of research I came across this KB Article. We used Method 3 to resolve the issue. Once the fix was in place, we restarted the Transport service and forced the queue to retry. Email began to flow immediately.
Here are the tools that should be used for troubleshooting (some are not mentioned above):
- Protocol logging to verbose on both sides of the issue.
- Review message tracking logs (PowerShell), both in Office 365 and on-premises.
- Review message tracking in the EAC, both in Office 365 and on-premises.
- Review Protocol logs where you can.
- Verify that no firewall rules are blocking SMTP traffic.
- Verify your antivirus settings: port 25 must not be blocked.
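The firewall and antivirus checks at the end of this list can be approximated with a quick TCP test against port 25, run from the server that is trying to send. Below is a minimal Python sketch under a few assumptions: Python is available on that server (or on a machine subject to the same firewall rules), and the function name `check_smtp_port` is purely illustrative. It only shows whether a TCP connection opens and whether the remote side sends a greeting. It does not validate TLS or connector configuration.

```python
import socket
import sys


def check_smtp_port(host, port=25, timeout=10.0):
    """Open a TCP connection and return (reachable, detail).

    detail is the first data received (normally the SMTP greeting),
    or a short description of what went wrong.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # Refused, filtered, timed out, or the name did not resolve.
        return False, "connect failed: {}".format(exc)

    with conn:
        try:
            raw = conn.recv(1024)
        except OSError as exc:
            # The port is open, but no greeting arrived within the timeout.
            return True, "connected, but no banner: {}".format(exc)

    return True, raw.decode("ascii", errors="replace").strip()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py <hostname or IP of the target SMTP server>
    reachable, detail = check_smtp_port(sys.argv[1])
    print("reachable:", reachable)
    print("detail:", detail)
```

If the connection fails from the hybrid server but succeeds from one of the older Exchange servers, that points toward a source-based firewall rule like the one described earlier in this post.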