OWA / IIS Proxy Issue

This weekend Zach, a coworker of mine, came over to my house and we performed a CCR to SCR failover test for a customer. We failed over from a CCR cluster in a NY datacenter to an SCR server in a NJ datacenter that was configured as a passive mailbox server node. After we failed over to the NJ server we failed back to the NY servers. Follow this link for more on CCR to SCR failover. We started at 0700 and ended at about 2200 that night. The failover / failback went as planned, with only a few issues, such as:

  • During failback, when we ran Setup.com /ClearLocalCMS on the CCR cluster, we started running it on the wrong server. It must be run on the cluster node that used to be the active cluster node. On the next failover we do, we’ll make better note of which is which.
  • The backup CAS server was not set to redirect the root OWA folder to /OWA. This caused a bit of confusion for testers. We noted this as “verify that the backup and primary site CAS servers have the same settings for web services”.
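The "same settings" check from the second item could be sketched roughly like this in Python. This is only an illustration: the setting names (root_redirect, owa_auth) are made up for the example and are not real Exchange or IIS property names, and the values would have to be collected from each CAS server by hand or with whatever tooling you already use.

```python
# Placeholder shown when a setting exists on one server but not the other.
MISSING = "<missing>"


def diff_cas_settings(primary, backup):
    """Return {setting: (primary_value, backup_value)} for every setting that differs."""
    differences = {}
    for key in sorted(set(primary) | set(backup)):
        primary_value = primary.get(key, MISSING)
        backup_value = backup.get(key, MISSING)
        if primary_value != backup_value:
            differences[key] = (primary_value, backup_value)
    return differences


# Hypothetical values written down from each site's CAS server.
primary_cas = {"root_redirect": "/owa", "owa_auth": "forms"}
backup_cas = {"owa_auth": "forms"}

for name, (p, b) in diff_cas_settings(primary_cas, backup_cas).items():
    print(f"{name}: primary={p!r} backup={b!r}")
```

An empty result means the two lists match. Anything else is worth fixing before testers get near the DR site.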

Our time-consuming issues came before we started the failover. We were testing the DR site to make sure that it all worked, and we ran into a problem that ended up involving OWA / IIS proxy and the customer’s installation of RSA authentication for OWA. To read about that problem, what we did to fix it, and what the root cause is, click the “more” link below. For the non-techie readers, just wait for the next pictures that I post!

The Problem:

When the customer goes to https://contoso.com/owa in the production site they expect to see an RSA login page.
The customer had also set this up in the failover site, but they had issues with RSA there, so they uninstalled RSA from the CAS server.
When the customer went to https://contoso.com/owa in the DR site during testing prior to the failover, they were presented with a standard OWA forms-based authentication page. They logged in and were then presented with an RSA web page that they could not log in to. This was the problem we had to fix.

Troubleshooting Steps:

Our first thought was that RSA was not installed correctly on the DR CAS server, so we removed it. This did not fix the issue.
Next we figured that we had missed something in the IIS metabase, so we removed Exchange, IIS, and the IIS metabase, then reinstalled IIS and Exchange. This did not fix the issue.
We then built a new server from scratch and installed IIS and Exchange on it. This did not fix the issue.
While we built the new server we called RSA and had them help us remove everything from IIS and the Windows registry. This did not fix the issue.
Finally we installed the Mailbox Server role on the CAS server in the DR site, created a local mailbox, and tested with that local mailbox. The RSA page did not appear, so the problem was fixed.

Solution:

In the end the solution was ridiculously simple. Everything was working by design when we started. We did not have to fix anything to perform our test. Going forward the customer needs to fix and verify the RSA implementation on all CAS servers in all sites.

Root Cause:

When you connect to a CAS server, the CAS server looks up where your mailbox is located and proxies your web request to a CAS server that’s in the same site as your Mailbox server. This is the key, and it is why our last troubleshooting step worked. When we created a mailbox on the CAS server in the DR site, the CAS server talked only to itself and everything worked. When we tried to connect to a mailbox in the production site, our CAS server connected to the CAS server in the production site, which returned the RSA request to our CAS server as it is designed to do. The RSA request referenced local files that were not installed on the DR CAS server, so it could not complete the RSA authentication.
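To make the proxy behavior above a bit more concrete, here is a small Python sketch of it. It is a toy model of what we observed, not how Exchange is actually implemented: the class, the function names, and the server names (dr-cas, prod-cas) are all made up for illustration, and the "login fails when the far CAS asks for RSA and the near CAS has no RSA files" rule is just my simplified reading of our test results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CasServer:
    name: str
    site: str
    rsa_installed: bool


def route_owa_request(entry_cas, mailbox_site, cas_by_site):
    """Return the list of CAS servers that end up handling the request."""
    if entry_cas.site == mailbox_site:
        # Mailbox is in the same site: the CAS server talks only to itself.
        return [entry_cas]
    # Otherwise the request is proxied to a CAS server in the mailbox's site.
    return [entry_cas, cas_by_site[mailbox_site]]


def login_works(chain):
    """Simplified rule: the last CAS in the chain decides whether RSA is
    requested, and the first CAS needs the RSA files to complete it."""
    entry, target = chain[0], chain[-1]
    return not (target.rsa_installed and not entry.rsa_installed)


# Hypothetical servers matching our test: RSA removed in DR, present in production.
dr_cas = CasServer("dr-cas", "NJ", rsa_installed=False)
prod_cas = CasServer("prod-cas", "NY", rsa_installed=True)
cas_by_site = {"NJ": dr_cas, "NY": prod_cas}

for mailbox_site in ("NJ", "NY"):
    chain = route_owa_request(dr_cas, mailbox_site, cas_by_site)
    names = " -> ".join(server.name for server in chain)
    print(f"mailbox in {mailbox_site}: {names}, login works: {login_works(chain)}")
```

The local mailbox case never leaves the DR CAS server, which is why our last troubleshooting step looked like a fix even though nothing had actually been broken.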

What I learned from this:

IIS issues with Exchange 2007 and beyond might not be caused by the server that you think you are connecting to. Because of IIS proxy, they might be caused by a different CAS server that’s in the same site as your Mailbox server.


About Kevinm