Lync Meeting Content and Attachments That Won't Download

My most recent adventure involved a scenario where files uploaded or attached to a Lync meeting couldn't be saved by many of the meeting participants. You could press the Save As or Open buttons, but the progress indicator would just sit at 0% and never move. Some users in each meeting could actually download the content, but the behavior did not follow specific users or meetings. Permissions in the meeting were set to allow anyone to download, so it didn't appear to be an issue specific to the meeting settings. The Lync QoE report submitted by each client gave me a rather unhelpful message:

A resource was unable to be downloaded via HTTP

Time to do some tracing. Firing up Fiddler allowed me to see the client make a download attempt to the external web services site, but I was getting a 404 Not Found response from the server. Sure enough, I was able to find the same hit from my client in the IIS logs on the Front End server.

GET /DataCollabWeb/Fd/05f08f60-e18a-5f5c-b73d-2331aa486896/{ED48B5A7-1CD5-4D5F-9D04-6EC08A6F8FEF}/S4S3N9QD/bc2f8fb2-0948-f75b-9a96-fddf0018beef.bin - 4443 - 192.168.200.224 Mozilla/4.0+(compatible+;+MSIE+9.10.9200.16688+;+Microsoft+Windows+Professional++;+Placeware+RPC+1.0) - 404 0 2 22
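If you need to find hits like this one in a busy IIS log, a small script can do the filtering. This is only a sketch: it assumes the log is in W3C format with a `#Fields:` header line, and the function name `find_404s` and the `/DataCollabWeb/` prefix default are my own choices for illustration.

```python
def find_404s(log_lines, path_prefix="/DataCollabWeb/"):
    """Yield (uri, client_ip) for 404 responses under path_prefix.

    Column positions are read from the log's own '#Fields:' directive,
    so the parser does not depend on which W3C fields are enabled.
    """
    fields = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip other directives, blanks, and lines before the header
        values = line.split()
        if len(values) != len(fields):
            continue  # malformed or truncated entry
        entry = dict(zip(fields, values))
        if (entry.get("sc-status") == "404"
                and entry.get("cs-uri-stem", "").startswith(path_prefix)):
            yield entry["cs-uri-stem"], entry.get("c-ip", "-")
```

Passing an open log file to `find_404s` yields one tuple per failed download request, which makes it easy to see whether the failures cluster around particular files or clients.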

Odd. IIS was unable to locate the file the Lync client was asking for. IIS serves up these files from the Lync back-end file share, which pointed me in the direction of the DFS share. This was configured to replicate between two file servers, but nothing looked obviously wrong after opening the share folder.

After some more thought I opened the individual shares on each server. Bingo. They each had different, unique content, only one of which matched what I saw in the actual shared Lync namespace! The Front Ends' connections were being distributed across the different DFS members, and the content was not being replicated on the back end, which created scenarios where a file would appear to be missing.
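Eyeballing two folders works for a small share, but the same comparison can be scripted. A minimal sketch, assuming both member folders are reachable from one machine (for example over their UNC paths); the function names are hypothetical and it only compares file names, not contents.

```python
from pathlib import Path


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def compare_members(share_a, share_b):
    """Return (only_in_a, only_in_b) as sorted lists of relative paths."""
    files_a = relative_files(share_a)
    files_b = relative_files(share_b)
    return sorted(files_a - files_b), sorted(files_b - files_a)
```

If both returned lists are non-empty, each member holds files the other lacks, which is what I saw here. DFS housekeeping folders such as DfsrPrivate will likely need to be excluded before you trust the result.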

Digging into the DFS logs showed that the servers had stopped replicating almost 90 days earlier. DFS in 2008 R2 and later has a setting that keeps replication from resuming on its own when an automatic recovery is needed after a network or power outage. So a brief issue a few months back had stopped DFS replication, and it never recovered by itself.

You can switch the automatic recovery back to pre-Server 2008 R2 behavior with this command (from an elevated command prompt) on each DFS node:

wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set StopReplicationOnAutoRecovery=FALSE

Even so, that change does not magically fix DFS. If a node has been unable to replicate for more than 60 days, DFS also blocks replication because of the MaxOfflineTimeInDays parameter. You can view the current value (60 by default) with this command:

wmic.exe /namespace:\\root\microsoftdfs path DfsrMachineConfig get MaxOfflineTimeInDays

And if you need to adjust that value you can use this command on each node:

wmic.exe /namespace:\\root\microsoftdfs path DfsrMachineConfig set MaxOfflineTimeInDays=120
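To make the arithmetic concrete: with roughly 90 days since the last replication, the default of 60 is exceeded, while 120 is not. The sketch below is a simplified model of that check, not how DFS evaluates it internally; the function name and the dates are made up for illustration.

```python
from datetime import date


def exceeds_max_offline(last_replicated, today, max_offline_days=60):
    """Return True if the gap since last replication exceeds the limit.

    Simplified model of the MaxOfflineTimeInDays check: compares
    whole days elapsed against the configured threshold.
    """
    return (today - last_replicated).days > max_offline_days


# Hypothetical dates that are 90 days apart
last_good = date(2013, 1, 1)
now = date(2013, 4, 1)
print(exceeds_max_offline(last_good, now))                        # default limit of 60
print(exceeds_max_offline(last_good, now, max_offline_days=120))  # raised limit
```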

This actually still did not resolve my problem. In the interest of time I was able to use the following steps to get replication flowing again:

  • Disable the secondary node in the Replication Group through the DFS Management Console.
  • On the secondary node, remove the existing file share to close all connections to it.
  • Copy all of the content from the secondary node to a temporary location.
  • Remove all content from the (now unshared) folder on the secondary node.
  • Share the empty folder on the secondary node.
  • Re-enable the secondary node in the DFS Management Console.
  • Copy the temporarily saved data to the primary node's file share, skipping any duplicate files.
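The last step above, copying the saved data back while skipping duplicates, can be sketched as follows. This assumes "duplicate" means a file already exists at the same relative path on the primary, and it never overwrites; the function name is hypothetical, and a copy tool with a skip-existing option would do the same job.

```python
import shutil
from pathlib import Path


def merge_missing(source_root, dest_root):
    """Copy files from source_root into dest_root, skipping existing ones.

    Returns the list of relative paths that were actually copied.
    """
    source_root, dest_root = Path(source_root), Path(dest_root)
    copied = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dest = dest_root / rel
        if dest.exists():
            continue  # same relative path already on the primary: keep its copy
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(rel)
    return copied
```

Keeping the primary's copy on a name collision matches the "skip duplicates" approach, but it does mean a newer version sitting only on the secondary would be left behind, so review the skipped files if that matters.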

Issue the following command on both nodes to poll Active Directory for an update:

dfsrdiag.exe PollAD

You should see Event Log entries indicating that replication is starting and that an initial sync has completed. After validating that replication was working, I was able to successfully download newly uploaded meeting content from any connected client.

Lync 2013 and the RTCXDS 16 GB Transaction Log Limit

Be aware that when you deploy Lync 2013 there is a maximum size limit that is configured for the transaction log of the RTCXDS database. The other databases created by Install-CsDatabase all have a maximum size set to about 2 TB (so, essentially unlimited), but the RTCXDS log file is strangely capped at 16 GB.

This isn't normally a problem, but if you use SQL Mirroring and/or the Lync Backup Service, both of which write to this log more heavily, you may start experiencing problems when this limit is reached. KB 2756725 indicates you may be unable to start the Front End services once this occurs.

Or you may find users are simply unable to connect to Lync, and errors like this one are logged on your Front Ends:

The RtcDb Sync Agent has encountered an Exception: [System.Data.SqlClient.SqlException (0x80131904): The transaction log for database 'rtcxds' is full due to 'LOG_BACKUP'.

If you open SQL Server Management Studio you can check the space used in each log file with the following query:

DBCC SQLPERF (LOGSPACE)

You'll see this information for each database. In this case you can see the RTCXDS database has used 96.95% of the log space available.

[Screenshot: DBCC SQLPERF (LOGSPACE) results with the RTCXDS log nearly full]
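If you would rather not read the results by eye, the same check is easy to script once you have the rows. A minimal sketch: it assumes the rows arrive as tuples in the column order the query returns (Database Name, Log Size (MB), Log Space Used (%), Status), however you fetch them. The function name and the 80% default threshold are arbitrary choices of mine.

```python
def logs_over_threshold(rows, threshold_pct=80.0):
    """Return [(database, pct_used)] for logs at or above threshold_pct.

    rows: iterable of (database_name, log_size_mb, log_space_used_pct, status)
    in the column order returned by DBCC SQLPERF (LOGSPACE).
    Results are sorted with the fullest log first.
    """
    flagged = [
        (name, float(pct_used))
        for name, _size_mb, pct_used, _status in rows
        if float(pct_used) >= threshold_pct
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Fed the results described above, this would return rtcxds at 96.95 and nothing else, assuming the other logs are well under the threshold.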

You can free up this log space by running a SQL backup job of the transaction logs. I hope it's obvious, but if you've hit the 16 GB limit you're going to create a backup file of roughly 16 GB, so make sure you have at least that much free space on your backup target. This is not your usual database-level backup; you need to run a backup of the transaction log specifically. The specific steps to run this job are outlined here.

After the job completes run the query again and notice the change. RTCXDS now has only 0.14% log space used.

[Screenshot: DBCC SQLPERF (LOGSPACE) results with the RTCXDS log nearly empty]

So, the question is how to avoid this problem before you get burned. I see two options:

Back Up the Transaction Log File Regularly

Schedule regular backups of the RTCXDS transaction log file. This will flush the file and keep the Log Space Used % at a low value. It's fairly common for Lync deployments to use various scripts for dumping all the backup data to flat files instead of involving SQL admins and backup software, so this may not be an appealing option.

Increase the Transaction Log Maximum Size

Increase the maximum size of the RTCXDS transaction log to match the other databases at 2 TB. You'll most likely run out of disk space on the server before you ever hit this number, and a basic disk free space alert will warn you before that becomes a problem.

You can follow these steps to adjust the log file maximum size:

  • Connect to the Database Engine of the SQL instance.
  • Expand Databases.
  • Right-click on RTCXDS and choose Properties.
  • Under Select a page, select Files.
  • Select the rtcxds_log file, then move the scroll bar to the right.
  • In the Autogrowth/Maxsize column click the button with the three dots ("…").
  • Under the Enable Autogrowth menu, change the Maximum File Size to Unlimited.
  • Click OK twice to commit changes.
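The GUI steps above should be equivalent to a single T-SQL statement, which is handy if you want to script the change. The helper below only builds the statement text so you can review it before pasting it into a query window. Treat it as a sketch: the function name is hypothetical, and it assumes the database and logical file names are rtcxds and rtcxds_log as shown above, so confirm both on the Files page first.

```python
def max_log_size_statement(database="rtcxds", logical_file="rtcxds_log"):
    """Build the T-SQL that removes the size cap on a database's log file."""
    # Bracket-quote the database name, escaping any closing bracket
    db_quoted = "[" + database.replace("]", "]]") + "]"
    # The logical file name is passed as a string literal; double any quotes
    file_quoted = "N'" + logical_file.replace("'", "''") + "'"
    return (
        f"ALTER DATABASE {db_quoted} "
        f"MODIFY FILE (NAME = {file_quoted}, MAXSIZE = UNLIMITED);"
    )


print(max_log_size_statement())
```

For a log file, UNLIMITED still means a ceiling of 2 TB, which lines up with the maximum on the other Lync databases.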

Exchange 2013 Schema Prep Objects to Object References without an Object

When running the Exchange Server 2013 Schema Prep step you may find the Exchange pre-requisites test fails, and an incredibly helpful error message is provided for your reading enjoyment:

The On-Premises test failed with the message: Object reference not set to an instance of an object.

For more information, visit: http://technet.microsoft.com/library(EXCHG.150)/ms.exch.setupreadiness.DidOnPremisesSettingCreatedAnException.aspx

Well, actually, that was rather unhelpful. And the handy URL provided basically just says the content hasn't been added yet.

Digging into the ExchangeSetup.log file found in C:\ExchangeSetupLogs is the only way to diagnose what's really going on. This file lists all of the pre-requisite checks the installer runs through, and if you search for the phrase 'HasException:True' you can skip to anything flagged as an issue. This would theoretically identify the exact failure and let you be on your way, right? Nope.
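That search can also be scripted if the log is large. A minimal sketch: the pattern assumes flagged lines look like "[Setting:Name] [HasException:True]", which is the layout of the one entry I have from my own log, so other entries may need a looser match. The function name is hypothetical.

```python
import re

# Matches "[Setting:<name>] [HasException:True]" and captures the setting name
SETTING_RE = re.compile(r"\[Setting:(?P<name>[^\]]+)\]\s*\[HasException:True\]")


def failed_settings(log_lines):
    """Return [(line_number, setting_name)] for checks flagged HasException:True."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = SETTING_RE.search(line)
        if match:
            hits.append((number, match.group("name")))
    return hits
```

The line numbers tell you where to jump in the log, since the useful detail (the stack trace) sits on the lines after each hit.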

In my case, I ran across this exception:

Evaluated [Setting:IsHybridObjectFoundOnPremises] [HasException:True] [Value:
Microsoft.Exchange.Management.Deployment.HybridConfigurationDetection.HybridConfigurationDetectionException: The On-Premises test failed with the message: Object reference not set to an instance of an object.. ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.Exchange.Management.Deployment.HybridConfigurationDetection.HybridConfigurationDetection.TestOnPremisesOrgRelationshipDomainsCrossWithAcceptedDomain(IOnPremisesHybridDetectionCmdlets onPremCmdlets)
   at Microsoft.Exchange.Management.Deployment.HybridConfigurationDetection.HybridConfigurationDetection.RunOnPremisesHybridTest()
   --- End of inner exception stack trace ---

OK, so we're getting a little bit closer. You still need to do some Exchange-team-to-real-world translation, but the gist seems to be that the installer has an issue with the Organizational Relationships and/or Accepted Domains.

Nothing seemed obviously wrong with either of these at first glance, but after combing through each listed object in both sections about ten times I discovered a (non-functional) organizational relationship that had been defined without an Application URI value. Seems like someone had staged this object, but never completed configuration. Deleting the problematic organizational relationship allowed the installer to complete the schema prep step without any further complaints.

It sure would be nice if the Exchange installer was a bit more detailed about exactly which object it was checking or had a problem with, but who doesn't love a good Easter egg hunt from time to time?