I think that one of the coolest features of Exchange 2010 is the seamless free/busy and calendar federation between organizations. In order to get federation provisioned there are a number of steps you need to take which you can find detailed on Technet.
The first step of this setup involves creating a Federation Trust to the Microsoft Federation Gateway (MFG), but in order to create this trust you need to use a public certificate issued by one of the following Certificate Authorities (the haphazard thumbprint formatting is Technet’s, not mine):
CA certificate friendly name | Thumbprint
Comodo | NA
Digicert Global Root CA | 083B:E056:9042:46B1:A175:6AC9:5991:C74A
Digicert High Assurance EV Root CA | 91 8d a5 e4 99 c1 5f 7c 62 75 b1 24 fe de 53 35 7c 34 bd 36
I was recently involved in an Exchange deployment that required purchasing a SAN certificate from Comodo. One of the certificate authorities Comodo uses to issue SAN certs is the USERTrust Legacy Secure Server CA, which has its own certificate issued by the Entrust.net Secure Server Certification Authority. Bottom line: the certificate you get verifies up to the Entrust certificate you can see below, which the Federation Gateway supports.
After trying to create the Federation Trust we were seeing the following error:
An error occurred while attempting to provision Exchange to the Partner STS. Detailed information “An error occurred accessing Windows Live. Detailed information “The request failed with HTTP status 403: Forbidden.”.”
Basically this is the MFG’s way of saying “I don’t trust this certificate.” It turns out the MFG is geared to only accept certificates issued directly by one of the certificate authorities listed above, which is not something I saw in the documentation. So if the Entrust Secure Server Certification Authority had issued our webmail certificate directly, it would have been accepted. But if, as in our case, your certificate is issued by a third-party intermediate certificate authority, it won’t be accepted even if it technically verifies up to a supported root authority.
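To make the distinction concrete, here is a minimal Python sketch of the two checks. It is not how the MFG is implemented (I have no visibility into that); the chain is a simplified list of (subject, issuer) pairs, the host name webmail.example.com is a placeholder, and the accepted-issuer set is my assumption based on the behavior described above.

```python
# Hypothetical, simplified view of a certificate chain: each entry is
# (subject, issuer), ordered leaf first. Real code would parse X.509 data.
ACCEPTED_ISSUERS = {
    # Assumption: a CA the gateway accepts as a direct issuer.
    "Entrust.net Secure Server Certification Authority",
}


def chains_to_accepted_ca(chain, accepted):
    """True if any certificate in the chain was issued by an accepted CA."""
    return any(issuer in accepted for _subject, issuer in chain)


def issued_directly_by_accepted_ca(chain, accepted):
    """True only if the leaf certificate's own issuer is an accepted CA."""
    _leaf_subject, leaf_issuer = chain[0]
    return leaf_issuer in accepted


# The situation from this deployment: the leaf is issued by an intermediate,
# and the intermediate is issued by the Entrust CA.
chain = [
    ("webmail.example.com", "USERTrust Legacy Secure Server CA"),
    ("USERTrust Legacy Secure Server CA",
     "Entrust.net Secure Server Certification Authority"),
]

print("Chains to an accepted CA:", chains_to_accepted_ca(chain, ACCEPTED_ISSUERS))
print("Issued directly by one:", issued_directly_by_accepted_ca(chain, ACCEPTED_ISSUERS))
```

The first check passes and the second fails, which matches what we saw: the certificate validates fine everywhere else, but the trust creation is refused.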
The good news is a call to PSS resulted in Microsoft making a change on the MFG to accept certificates issued by this particular intermediate CA going forward for everyone. So if you ran into this error previously, you should be able to try again with the same certificate and see the trust succeed. As of this writing I’ve asked them to also add support for the AAA Certificate Services intermediate CA, which Comodo also issues certificates from.
One of the deployments I’ve been working on recently involved using F5 BigIP hardware load balancers to do SSL offloading for a two-node Exchange 2010 design. To give some background, usually you would just pass port 443 through (I’m skipping over the RPC Client Access piece since it’s not relevant here) from your load balancer straight to the Exchange servers, letting the servers handle the SSL encryption like in this diagram:
The benefit of that approach is it’s simple and a very common deployment method. On the flip side, you can benefit from offloading SSL encryption to the BigIPs and gain some more advanced forms of load balancing. In this case the improved load balancing was the goal along with some internal policies forcing this approach. What happens with SSL offloading is the HTTPS traffic ends at the BigIPs which turn around and pass port 80 clear-text traffic back to the Exchange servers so they have a bit less CPU work to do. That strategy looks more like this:
The problem with this configuration is Exchange is really designed to operate with SSL in mind and you have to go out of your way to allow it to operate in clear-text. What you’ll need to configure on each CAS server is:
The issue I ran into is after following all of these steps Autodiscover was still not functional through the load balancing. I could enter https://<CAS Array FQDN>/Autodiscover/Autodiscover.xml into a browser and reach the XML file with no problem, but running the Autodiscover test within Outlook would return a 404 error. Every other service was working just fine:
This threw me for a while, and after a bit of searching I ran across KB 980048, where it’s noted that Autodiscover cannot be used on port 80 with an HTTP POST request, which is what Outlook uses. My attempts at accessing the XML directly succeeded because I was only trying to download the file with a GET. Supposedly this is going to be fixed in Service Pack 1.
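If the GET versus POST difference is not obvious, this Python sketch builds both kinds of request side by side without sending either. The host name and mailbox address are placeholders, and the XML body is my approximation of an Outlook-style Autodiscover request, so treat it as an illustration rather than an exact capture of what Outlook sends.

```python
import urllib.request

# Assumptions: placeholder host and mailbox; the body follows the Outlook
# Autodiscover request schema as I understand it.
AUTODISCOVER_URL = "https://mail.example.com/Autodiscover/Autodiscover.xml"

REQUEST_BODY = """<?xml version="1.0" encoding="utf-8"?>
<Autodiscover xmlns="http://schemas.microsoft.com/exchange/autodiscover/outlook/requestschema/2006">
  <Request>
    <EMailAddress>user@example.com</EMailAddress>
    <AcceptableResponseSchema>http://schemas.microsoft.com/exchange/autodiscover/outlook/responseschema/2006a</AcceptableResponseSchema>
  </Request>
</Autodiscover>"""


def build_browser_style_request(url):
    """A plain GET, which is all a browser does when you paste in the URL."""
    return urllib.request.Request(url, method="GET")


def build_outlook_style_request(url, body):
    """A POST carrying an XML body, closer to what Outlook sends."""
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


get_req = build_browser_style_request(AUTODISCOVER_URL)
post_req = build_outlook_style_request(AUTODISCOVER_URL, REQUEST_BODY)

# Nothing is sent here; a real test would also need credentials.
print(get_req.get_method(), post_req.get_method())
```

Testing with the browser only exercises the first kind of request, which is why the URL looked healthy while Outlook failed.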
While the KB provides no immediate solution, what I found works is to use the same methodology Technet recommends for the Exchange Web Services web.config file. Go into your /Autodiscover folder and edit the web.config to replace all instances of httpsTransport with httpTransport (a simple search and replace should work). Be sure to save a copy before you make modifications, then restart your server after making the change, and you should be able to offload SSL for Autodiscover successfully. As far as I know this is undocumented today, so try it at your own risk, but it appears to be working.
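If you have several CAS servers to touch, the search and replace can be scripted. This is a minimal Python sketch, assuming the file is ASCII or UTF-8 encoded; the function name and the .bak suffix are my own choices, and the path in the usage note is an assumed default install location that you should verify on your server.

```python
import shutil
from pathlib import Path


def switch_to_http_transport(config_path):
    """Back up web.config, then replace httpsTransport with httpTransport.

    Returns the number of replacements made.
    """
    config_path = Path(config_path)
    backup_path = config_path.with_name(config_path.name + ".bak")
    shutil.copy2(config_path, backup_path)  # keep a copy before modifying

    # Work on raw bytes so line endings and any byte-order mark are preserved.
    data = config_path.read_bytes()
    count = data.count(b"httpsTransport")
    config_path.write_bytes(data.replace(b"httpsTransport", b"httpTransport"))
    return count
```

Usage would look something like switch_to_http_transport(r"C:\Program Files\Microsoft\Exchange Server\V14\ClientAccess\Autodiscover\web.config"), run from an elevated prompt. A return value of 0 means nothing matched, in which case check the path and the file encoding before assuming the change is in place.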
Now would normally be the time where everyone is running around like their head has been cut off because your Front-End server is totally hosed, but because you followed the backup procedures in Part 1 (you did run the backup, right?) restoring service to your OCS server is fairly simple.
Restore the Database
Open up the DPM console.
Click the Recovery tab at the top.
We need to restore the SQL database and files separately, but let’s start with the database. Expand the tree to <Forest Name>\<OCS Server>\All Protected SQL Instances\<OCS Server>\RTC\rtc
Highlight a suitable recovery date in the calendar and select the RTC database below.
Right-click and select Recover…
Press Next.
We’ve successfully screwed up the server to where we might as well recover to the original SQL server. Select that option and press Next.
Select Leave database operational and press Next.
No options needed. Just press Next.
Yup, those are the files we need. Press Recover.
Press Close while the recovery operation occurs.
If you click the Monitoring tab you can view the jobs in process.
Restore the Files
Now we need to restore the files separately. Expand the tree to <Domain Name>\<OCS Server>\All Protected Volumes\<OCS Installation Volume>
Highlight a suitable recovery date in the calendar and select the Program Files folder below.
Right-click Program Files and select Recover…
Press Next.
Select Recover to the original location and press Next.
Select to Overwrite the existing versions (if any), and then select to Apply the security settings of the recovery point version. Press Next.
Now press Recover.
Press Close while the recovery operation occurs.
If you click the Monitoring tab you can view the jobs in process.
Fix SQL Database Chaining
One thing DPM won’t restore is the database chaining option within SQL. If you miss this step your Front-End services will fail to start.
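For reference, here is a hedged Python sketch of how that setting could be reapplied from the command line. I am assuming the option in question is cross-database ownership chaining on the rtc database, that the database lives in a local SQL instance named RTC (as the DPM tree above suggests), and that sqlcmd is installed on the server. Verify all three against your own environment before running it for real.

```python
import subprocess

# Assumptions: database rtc, local instance RTC, and cross-database
# ownership chaining being the option that needs to be turned back on.
ENABLE_CHAINING_SQL = "ALTER DATABASE rtc SET DB_CHAINING ON"


def build_sqlcmd(instance=r".\RTC", query=ENABLE_CHAINING_SQL):
    """Build a sqlcmd command line that uses Windows authentication (-E)."""
    return ["sqlcmd", "-S", instance, "-E", "-Q", query]


def enable_db_chaining(dry_run=True):
    """Print the command by default; only execute when dry_run is False."""
    command = build_sqlcmd()
    if dry_run:
        print(" ".join(command))
        return command
    subprocess.run(command, check=True)
    return command
```

Calling enable_db_chaining() only prints the command so you can review it first; pass dry_run=False to actually run it.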
If you check your OCS Front-End you’ll find all the files you deleted previously have now returned. You could probably get away with restarting services at this point, but since the machine was completely hosed I’m just going to restart the server and cross my fingers.
Check Functionality
After the restart all of my OCS services started successfully and my errors have gone away. You can see now my Communicator list still has my contacts and access levels defined. Likewise, Device Updates and client auto updates should function normally now.
Now that we’ve verified the DPM backups are running successfully on a regular schedule we can get to really destroying the environment. First up: the RTC database. So shut down your OCS Front-End and SQL services. Then go and delete the RTC.mdf and RTC.ldf files. I know that doesn’t sound like a good idea, but really, delete them.
Open Explorer, jump in to the following volumes and delete the content there:
<OCS Installation Volume>\<OCS Installation Folder>\Application Host\Application Data
Now go and start your SQL services and try starting the OCS services up again. You’ll find a few errors and warnings in your OCS application log because it can’t read the RTC database. Communicator and Live Meeting clients won’t be able to connect to the server at this point either. Oops!
Congratulations, you’ve successfully messed up your Front-End server to the point where it is non-functional. The device update files have been lost, the MOC Auto-Update files have been lost and all your meeting content is gone. In the next section I’ll demonstrate how to get the server back to an operational state with DPM.
The goal of this series is to demonstrate how to recover your OCS Front-End’s RTC database in the event of a disaster where your database or disk hosting the RTC database has become corrupted. Or maybe you’ve recovered a server’s installation and configuration, but now need to recover the user information. I’m going to do this in 3 different parts: backing up, destroying, recovering. To get started we need to have a semi-realistic OCS environment running so in this example I have a Standard Edition Front-End running where I’ve done the following:
Added a few users to my contact list and changed the access levels around.
Uploaded and approved the latest UCUpdates.cab package for phone devices.
Added a MOC hotfix to the auto update feature.
Created a couple of conferences with content in Live Meeting.
These items may seem a little random, but they’ve been done to illustrate what’s restorable from the RTC database and the file shares on a Front-End. I also have another machine called OR1DEVDPM01 running the beta of DPM 2010, which is what we’ll be using for the backup and restore.
Now that we have a machine running we first need to get this thing backed up before we trash it. You’ll want to create an exception on the OCS machine for the firewall to allow any traffic from the DPM machine. This will allow installation of the DPM agent, and allow backups and restores to occur.
Install the DPM Agent
Open the DPM console.
Click the Management tab at the top.
Click the Agents tab below the main navigation line.
Click Install in the action pane.
Select Install agents and press Next.
Select the server you’re pushing the agent to (OR1DEVOCS01) and press the Add button. Then press Next.
Enter the credentials of an account with administrative rights on the server and press Next.
Since this is a lab and I’m using Server 2008 I’m not too concerned about the server restarting, but in production I’d advise opting for the manual restart.
Press Install and then you can click Close while the agent deploys.
After a minute or two the agent status should change to OK. Now we can start backing up the server.
Add the OCS Protection Group
In the DPM console again click on the Protection tab.
Click Create protection group in the action pane.
Select Servers and press Next.
Expand the OCS Front-End, OR1DEVOCS01 and you’ll see a few different nodes such as shares, SQL, volumes and system state. The OCS Backup and Restore guide provides some guidance on what actually needs to be backed up from the server. These options pertain to a Standard Edition Front-End so be sure to check the document for any other role. Here are the items we need to select:
All SQL Servers\<OCS Server>\RTC\rtc
All volumes\<OCS Installation Volume>\<OCS Installation Folder>\Application Host\Application Data
All volumes\<OCS Installation Volume>\<OCS Installation Folder>\Web Components\AutoUpdate
All volumes\<OCS Installation Volume>\<OCS Installation Folder>\Web Components\Data MCU Web\Web
All volumes\<OCS Installation Volume>\<OCS Installation Folder>\Web Components\Data MCU Web\Non-Web
All volumes\<OCS Installation Volume>\<OCS Installation Folder>\Web Components\DeviceUpdateFiles
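The folder selections above are easy to mistype or miss in the tree, so here is a small Python sketch that expands them into full paths you can use as a checklist. The sub-folder names come straight from the list; the helper name and the example install folder in the usage note are hypothetical.

```python
from pathlib import PureWindowsPath

# Sub-folders from the selection list, relative to the OCS install folder.
PROTECTED_SUBFOLDERS = [
    r"Application Host\Application Data",
    r"Web Components\AutoUpdate",
    r"Web Components\Data MCU Web\Web",
    r"Web Components\Data MCU Web\Non-Web",
    r"Web Components\DeviceUpdateFiles",
]


def protected_paths(install_folder):
    """Return the full Windows path of every folder that should be protected."""
    base = PureWindowsPath(install_folder)
    return [str(base / sub) for sub in PROTECTED_SUBFOLDERS]
```

For example, protected_paths(r"D:\OCS") returns five paths under D:\OCS (a made-up install folder), which you can compare against what is ticked in the protection group. The RTC database is selected separately under the SQL node and is not part of this list.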
You can press OK and ignore the warning that pops up about adding the system state backup. Press Next to continue after selecting all of the above options.
Name the Protection Group something descriptive. I’m going out on a limb here, but I used OCS Front-Ends as the name. I don’t have any tape libraries hooked up, so I’ll just be opting for short-term protection to disk. Press Next.
I imagine you’ll generally want more than 5 days of backups, but this works for the purpose here. 15 minute synchronizations are OK, but keep in mind OCS uses a simple recovery model in SQL meaning you take full backups and you restore full backups. None of this full, plus incremental and rolling logs forward fun. Just flat out restore of the entire DB and logs at once. The problem here is a simple database recovery model cannot leverage the synchronization feature of DPM like incremental backups can, so we’re limited to being able to restore only from a full backup, or an “Application recovery point” in DPM terms. You’ll see the default is to back up every day at 8 PM which may or may not be acceptable for you.
If you press the Modify button you can add additional time slots to run an express full backup. Unfortunately (hoping this is a beta bug), you can’t select all the time slots and press Add. So just press Add quite a few times until each time slot is added and you’ll have a recovery point every 30 minutes for your database. The trade-off to running with this kind of frequency is the disk space used. Pick a schedule that’s appropriate for your deployment. Press OK to accept the schedule and then press Next to save the short-term goals.
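To put numbers on the schedule choice, this Python sketch lists the time slots for a given interval and works out the longest gap between recovery points, which is roughly the most data you could lose when only full backups are restorable. The function names are mine and nothing here talks to DPM; it is just the arithmetic.

```python
from datetime import datetime, timedelta


def recovery_point_times(interval_minutes):
    """Return 'HH:MM' strings for recovery points spread evenly over a day."""
    if interval_minutes <= 0 or (24 * 60) % interval_minutes != 0:
        raise ValueError("interval must divide evenly into 24 hours")
    start = datetime(2000, 1, 1)  # arbitrary date; only the time is used
    count = (24 * 60) // interval_minutes
    return [
        (start + timedelta(minutes=i * interval_minutes)).strftime("%H:%M")
        for i in range(count)
    ]


def worst_case_gap_minutes(times):
    """Longest stretch between consecutive recovery points, in minutes."""
    minutes = sorted(int(t[:2]) * 60 + int(t[3:]) for t in times)
    gaps = [later - earlier for earlier, later in zip(minutes, minutes[1:])]
    gaps.append(24 * 60 - minutes[-1] + minutes[0])  # wrap past midnight
    return max(gaps)


default_schedule = ["20:00"]                 # the daily 8 PM default
half_hourly = recovery_point_times(30)       # one slot per Add click

print(len(half_hourly), "slots to add")
print("Default worst case:", worst_case_gap_minutes(default_schedule), "minutes")
print("Half-hourly worst case:", worst_case_gap_minutes(half_hourly), "minutes")
```

The default schedule leaves you exposed to a full day of changes, while the half-hourly one cuts that to 30 minutes at the cost of 48 clicks and more disk.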
On the next page you’ll see the disk allocation. Press Next to continue.
Select when to create the replica of your data (now) and press Next.
Choose when to run consistency checks and press Next.
On the last page you can review your selections and then press Create Group. The initial replica jobs will be created and then you can press Close.
If you click the Monitoring tab you can view the jobs in process.
At this point we should have the backups running from the Front-End server. The next part of this will be destroying the data and blowing up the server. After that I’ll show how to recover everything we destroyed.
This post was originally written based on the Exchange 2010 beta bits before the Technet documentation was updated to reflect the actual required permissions for a DAG’s FSW. Consequently, it had a major error. You’ll want to visit Devin’s page for a full explanation and the correct way to set up the DAG. Always helps when you can read the documentation, right?
This morning I set out to install Exchange 2010 on Server 2008 R2 and I was amazed I actually had this up and running within 20 minutes of booting my guest virtual machine. I have not looked into many of the technical advantages of R2 over R1 for Exchange yet, but I can say that the installation requires far fewer prerequisite installs than on Server 2008 R1. Here’s a quick guide to getting up and running on R2 with all the server roles installed.
Install a Server 2008 R2 RTM server. I’d recommend using Enterprise Edition so you can add a 2nd Exchange server later and test out the DAGs. I had a Sysprepped image I was able to boot up and join to the domain very quickly.
Copy the Exchange2010-RC1-x64.exe file to your server and run it. Choose a location to extract the files to.
Open a command prompt with administrative privileges and navigate to the folder where you extracted the Exchange files.
Issue the command: servermanagercmd.exe -ip scripts\exchange-all.xml
Ignore the warning about servermanagercmd being deprecated and restart the server when the installation completes.
Open the Services MMC.
Change the Net.Tcp Port Sharing Service startup type to Automatic. The prerequisite check for the CAS role requires this to be set.
Open a command prompt with administrative privileges and navigate to the folder where you extracted the Exchange files.
Issue the command: setup
Click Choose Exchange language option and then click Install only languages from the DVD.
Click Install Microsoft Exchange.
Click Next.
Accept the license terms and click Next.
Select Yes to enable error reporting and press Next.
Select Custom Exchange Server Installation and press Next.
Select the Mailbox Role, Client Access Role, Hub Transport Role, Unified Messaging Role and Management Tools. Press Next.
Name the Exchange organization and press Next.
Select No for Outlook 2003 clients or Entourage (pre-Web Services edition) and press Next.
Check the box Client Access server role will be Internet-facing, enter your public URL (mail.domain.com) and press Next.
Select the option to join the CEIP and press Next.
After the prerequisite check completes click Install.
You can see my installer completed in about 12 minutes, which is pretty damn cool. This was a VM with 3 GB of RAM with its VHD on a RAID 10 set. Imagine if this was a production machine with a real amount of RAM.
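If you are building more than one of these, the two command-line prerequisite steps from the guide above (the servermanagercmd call and the Net.Tcp Port Sharing startup change) could be wrapped in a short Python sketch like this one. It assumes the service's short name is NetTcpPortSharing and uses sc.exe to change the startup type instead of the Services MMC; the function names are my own.

```python
import subprocess


def prerequisite_commands():
    """Commands for the two command-line prerequisite steps."""
    return [
        # Installs the Windows features listed in the Exchange answer file.
        ["servermanagercmd.exe", "-ip", r"scripts\exchange-all.xml"],
        # Sets the Net.Tcp Port Sharing Service startup type to Automatic.
        # sc.exe expects "start=" and its value as separate arguments.
        ["sc.exe", "config", "NetTcpPortSharing", "start=", "auto"],
    ]


def run_prerequisites(extract_dir, dry_run=True):
    """Print each command, and run it from extract_dir unless dry_run is set."""
    for command in prerequisite_commands():
        print(" ".join(command))
        if not dry_run:
            # Exit codes are reported rather than treated as failures, since
            # a non-zero code may only mean a restart is pending.
            result = subprocess.run(command, cwd=extract_dir)
            print("exit code:", result.returncode)
```

Run it from an elevated prompt with extract_dir pointing at the folder where you extracted the Exchange files. It does not handle the restart the guide calls for after the feature install, so you still need to reboot before launching setup.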