The C Stands for Compact

I believe in my last post about damaged Communicator address book files I pointed out that it was a good idea to keep your OCS clients and servers on the same hotfix levels. I would still argue that’s a good thing to do in general, but in my case this wasn’t the actual resolution. While it worked for awhile after a few weeks the damaged address book files error popped up again:

Communicator cannot synchronize with the corporate address book because the corporate address book file appears to be damaged. Contact your system administrator with this information.

All the MOC clients and OCS servers were at the same revision level this time so that wasn’t the problem. Deleting the GalContacts.db and forcing a full download would succeed and the client goes along perfectly happy. Interestingly enough, deleting the GalContacts.db.idx file on a problematic machine would allow the delta file to download successfully so it appears the issue may be with the index file. Anti-virus logs also showed they weren’t trying to clean or repair the file in any way.

I couldn’t find any errors server-side and everything seemed to be functioning properly so I looked at the IIS logs on the Front-End again. Low and behold – the log was huge compared to previous days – about 10x as big. It was filled with many, many requests for downloads of files in the C-xxxx-xxxx.lsabs format which threw me off because the ABS documentation points out that F-xxxx files are fulls, and D-xxxx-xxxx files are delta changes, but has zero mention of the C-xxxx-xxxx files. These IIS requests were also successful downloads, not failures so I wouldn’t have expected clients to have an error, but every user was also repeatedly downloading the same sets of files and then trying to download previous C-xxxx-xxxx files as well.

I took one of the matching C-xxxx-xxxx and D-xxxx-xxxx files (they’ll have the same hex names) and dumped both to text files using abserver.exe –dumpfile to try and compare. Viewing them side-by-side they seemed to have the same content, but in a slightly different order. So it appears they were both delta files, but the C file was about 50% of the D’s size. Odd, but I still had no clue when they would be used over a D because there was zero documentation about the change.

Thanks to a few kind folks on Twitter (@aumblumberg and @MacRS4 ) who went down this same road with Microsoft via a support case previously, I found out the new C files stand for “Compact” and the change was implemented in the July server-side hotfixes. These are also delta files, but compressed in a way supposedly to make them more efficient. In our case (and theirs), it broke the address book downloads completely.

Fortunately, there is a registry key available to prevent clients from trying to use these compact files. This key only applies if you’re using the October Communicator client-side hotfix:

HKLM\Software\Policies\Microsoft\Communicator
Key name: GalUseCompactDeltaFile
Type: DWORD

Possible Values:

  • 0: Do not use compact delta file
  • 1: Use compact delta file (default)
  • 2: Use compact delta file, but do not issue an LDAP query to retrieve the “Title” and “Office” attribute values from Active Directory

You can read more about this registry setting from KB 976985 even though the actual KB is aimed at a different issue with LDAP queries and account lockouts.

I’ll find out today whether this actually fixes the downloads without having to clear out the GalContacts.db file on each client.

It looks like these constant address book changes like this and adding the 0-60 minute download delay are aimed at the larger organizations with a significant address book size, but. I almost feel like these updates are on par with the Resource Kit book providing examples for companies with 100,000+ users in various locations. Great info, but what about the real world? Not everyone using OCS is that big and it would be swell to have guidance around small and medium-sized deployments instead of trying to take those numbers and make them fit. I'd be happy to just let the ABS download the way it used to and leave it alone.

The most frustrating part here has been that this service has been something that traditionally just worked without intervention and instead I’ve been spending hours and hours troubleshooting to figure out what happened because there was nothing mentioned about the change in behavior server or client-side. Maybe there should be some threshold for these ABS disasters optimization changes where they only occur if the full address books are over some value like 10, 20, 50 or 100 MB? Until that happens I'll be disabling the compact delta files at future OCS deployments to make sure we avoid this problem.

Communicator and Damaged Address Book Files

The past few days I spent wrestling an address book server issue in OCS and I wanted to share the solution. The quick version: make sure you have your server side and client side hotfix revisions match up.

If you want the whole story…the specific details of this case involved a Front-End server which had the OCS 2007 R2 April hotfixes, but the MOC clients had the July hotfixes applied. The issue first manifested itself with clients reporting ABS damaged:

Communicator cannot synchronize with the corporate address book because the corporate address book file appears to be damaged. Contact your system administrator with this information.

We resolved this by deleting the entire contents of the address book file share and forcing a resync of the address book. We also deleted GalContacts.db from a few user workstations, but later found the client error actually disappeared on its own without removing the file.

Things hummed along nicely for a week or so until a large amount of users (600+) were enabled for OCS one Friday evening. The following Monday previously enabled pilot users were reporting they still didn’t have a SIP URI in their address books for the mass-enabled users. The GalContacts.db file was also still showing a timestamp from the day the mass change occurred, indicating they had not downloaded an update yet.

We took a peek at the Front-End logs and it appeared to be generating address book files correctly. The odd thing was in looking at the IIS logs we actually saw quite a few 404 errors of MOC clients trying to request delta files that did not exist. Other users showed successful downloads of the latest delta files which should have included the changes, but they weren’t being applied to their local GalContacts.db for some reason. I also saw those same clients registering a success end up using the fallback logic and downloading older address book files even though they had the newer versions. Very, very strange. Any client we deleted GalContacts.db on would pull down the latest full address book with no issues. The clients looking for deltas that didn't exist we probably caused by deleting the address book files previously.

Side tip when looking at the IIS logs: Full files start with F-xxxx and delta files follow a D-xxxx-xxxx naming convention. Also, .lsabs files are used by MOC while .dabs files are used by Tanjay devices.

At that point I noticed the mismatch in server (April hotfixes) vs. client (July hotfixes) versions and suggested we get the latest fixes installed on all sides. While that suggestion made its way through change control procedures we opened a PSS case with Microsoft to hit the problem from another angle. The engineer we spoke with immediately blurted out that we needed to match the hotfix versions as soon as we described the behavior. It sounded to me like this was one he had heard before or was familiar with so while we didn’t have a second approach this definitely helped accelerate the change control ticket to an authorized state. After we fully patched the Front-End using the new ServerUpdateInstaller (a lifesaver), applied the back-end database hotfix, and installed the client October hotfix the address book went back to functioning properly. There were a couple of users that needed to delete the GalContacts.db before everything went back to normal, but most of them picked it up without intervention.

As for root cause, the KB 972403 article actually does reference applying both the MOC and server fix together, but the July server hotfix document doesn’t describe this behavior or even mention it. Personally, I think the underlying issue was having the 3.5.6907.37 hotfix on clients while the abserver.exe file was still at 3.5.6907.0. In any case, I learned a lot more about the ABS than I ever cared to, but it was great information that will surely help in the future.