Lync Music on Hold with Media Gateways

With the CU7 release Microsoft has added Music on Hold abilities to the Lync Phone Edition clients. Many organizations are starting to turn this feature on now, but it's possible that external PSTN callers still might not hear any music on hold being played. If you're seeing (or not hearing) this behavior, there may be a gateway setting that's not passing the music on hold to PSTN callers.

On an AudioCodes MSBG 1000 this setting can be found under VOIP\GW and IP to IP\DTMF and Supplementary Services\Supplementary Services. Change the Hold Format parameter to "Send Only" if it is currently set to 0.0.0.0. This parameter controls whether the gateway should expect an SDP with fields set as a=sendonly and c= containing the client's IP address, or whether it should expect a=inactive and c=0.0.0.0. Setting this value to Send Only allows the Lync PCs and phones to successfully pass music on hold to a PSTN caller.
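To make the two hold formats concrete, here is a minimal Python sketch that classifies an SDP body by how it signals hold. The SDP bodies, the 192.0.2.10 address, and the hold_style function name are made up for illustration; they are not captured from a real Lync or AudioCodes call.

```python
def hold_style(sdp: str) -> str:
    """Classify how an SDP body signals hold (illustrative only)."""
    lines = [line.strip() for line in sdp.splitlines()]
    # "Send Only" style: the stream stays pointed at a real address, so the
    # holding party can keep sending media (the music on hold).
    if "a=sendonly" in lines:
        return "sendonly"
    # "0.0.0.0" style: the media is marked inactive and/or the connection
    # address is blanked out, so there is nowhere to send music.
    blanked = any(l.startswith("c=") and l.endswith(" 0.0.0.0") for l in lines)
    if "a=inactive" in lines or blanked:
        return "inactive"
    return "active"


# Hypothetical SDP bodies; addresses and ports are placeholders.
SEND_ONLY_HOLD = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=session
c=IN IP4 192.0.2.10
t=0 0
m=audio 50000 RTP/AVP 0
a=sendonly
"""

INACTIVE_HOLD = """v=0
o=- 0 0 IN IP4 192.0.2.10
s=session
c=IN IP4 0.0.0.0
t=0 0
m=audio 50000 RTP/AVP 0
a=inactive
"""

print(hold_style(SEND_ONLY_HOLD))
print(hold_style(INACTIVE_HOLD))
```

Running a check like this against the SDP in a gateway trace is a quick way to see which hold format is actually arriving.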

VMM 2012 SP1 Service Stops Every Night

If you have the System Center Configuration Manager 2012 RTM agent installed on a Windows Server 2012 server also running the System Center Virtual Machine Manager 2012 SP1 agent, you might find the VMM service unexpectedly stops each night. There is nothing obvious in the event logs about why the service stops, but each night the hosts will move to "Not responding" in VMM because the service is not running.

This is caused by an inventory process in SCCM that causes the VMM agent to stop. You can uninstall the SCCM agent using this command:

C:\Windows\ccmsetup\ccmsetup.exe /uninstall

One-Way Audio with AudioCodes and Encrypted Lync Media

This is a quick one: If you're using media encryption with the latest MSBG 1000 firmware posted to the AudioCodes website, you might find yourself experiencing issues with one-way audio. Specifically, Lync users are periodically unable to hear PSTN callers. The issue is sporadic, and may appear or disappear as the call continues, although a hold/resume tends to fix it temporarily.

After doing some packet captures you will see a bi-directional RTP stream between the gateway and Mediation server, but the Mediation server does not actually pass any RTP packets to the Lync endpoint side because it is unable to decrypt the media.

This is a known issue in the 6.40A.037.009 firmware and can be resolved by upgrading to version 6.40A.059. You'll have to call in to support to retrieve the newer firmware since the one with the issue is the latest posted to the AudioCodes website (and unfortunately very dated, seeing as it was released in May of 2012.) Happy upgrading.

Lync 2013 MCU and Old Conference Directories

After installing a Lync Server 2013 pool and moving users over, you might find they are unable to create new conferences if they are enabled for PSTN dial-in conferencing. When running a SIPStack trace you’ll find the following error during creation of the conference:

Start-Line: SIP/2.0 500 failedLookupForConferenceDirectoryOwner
ms-diagnostics: 3193;reason="Could not find a front end server that owns the given pstn meetind id."

Which might lead you to notice that Front-End is logging error Event 51048, LS MCU Factory:

McuFactory could not find the pool associated with one of the conference directories.
Failed to read pool FQDN associated with conference directory 20.

Cause: The pool associated with the conference directory does not exist anymore.
Resolution: Conference directories without a valid pool associated can be deleted using management tools.

The issue seems to be that if orphan conference directories exist, they may prevent new conferences for 2013 users. You can check for orphan directories via:

Get-CsConferenceDirectory | FT Identity,ServiceID

Review the output and check to see if there is a ServiceID matching a pool which no longer exists. BackCompatSite entries refer to OCS 2007 R2 servers added during a merge operation.

If a Service ID exists for a pool that has been deleted, the conference directory cannot be moved at this point, but you can use the following command to forcefully remove it:

Remove-CsConferenceDirectory <Identity Number> -Force

Take care to make sure you’re deleting the correct identity. Removing an active conference directory will force users to reschedule meetings.
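If you have more than a handful of directories, the comparison can be scripted. The following is only a rough Python sketch of the logic, not a replacement for the cmdlets. The directory list, the pool names, and the "UserServer:<pool FQDN>" shape of the ServiceID are all assumptions for illustration; check them against your own Get-CsConferenceDirectory output.

```python
def find_orphans(directories, active_pools):
    """Return identities of conference directories whose pool no longer exists.

    directories  -- iterable of (identity, service_id) pairs
    active_pools -- iterable of pool FQDNs that still exist in the topology
    """
    known = {pool.lower() for pool in active_pools}
    orphans = []
    for identity, service_id in directories:
        # Assumed format: "UserServer:<pool FQDN>"; keep only the FQDN part.
        pool_fqdn = service_id.split(":", 1)[-1].lower()
        if pool_fqdn not in known:
            orphans.append(identity)
    return orphans


# Hypothetical sample data, not output from a real deployment.
directories = [
    (1, "UserServer:pool2013.example.com"),
    (20, "UserServer:oldpool.example.com"),
]
active_pools = ["pool2013.example.com"]

print(find_orphans(directories, active_pools))
```

Anything this flags still deserves a manual look before you run the remove command.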

ADFS & RelayState

Active Directory Federation Services (ADFS) has been around for some time now, and many organizations use it to provide single sign-on capabilities to Office 365 without giving it a second glance, but ADFS is really a generic identity provider that can work with other Security Assertion Markup Language (SAML) 2.0 endpoints. The incredibly over-simplified gist of SAML is that some identity provider (ADFS + Active Directory) authenticates a user, hands them a token, and the user takes that token to log in to other web applications such as Office 365, Salesforce, Workday, Jobvite, or any SAML 2.0 compliant service. The benefit here is organizations have a single point of authentication in ADFS, but can use those credentials to access multiple services that they don’t necessarily host or deploy.

Oftentimes these services will email a user with a hyperlink embedded in the message that links to a specific page or “deep link” on the service. If a user clicks on that link, it should take them directly to the referenced opportunity or page. The issue with the initial release of ADFS was that this feature did not work. The users could click one of these links and get logged in to the service, but they would always end up on the home or main page – not the link they clicked on. The reason for this is that ADFS did not support the RelayState parameter, which actually contains that end state or desired URL after login occurs. Update Rollup 2 for ADFS adds this functionality and can be downloaded from here: http://support.microsoft.com/kb/2681584 .

However, this does not mean that RelayState will begin working automagically. There are some requirements about how the URL is formatted in order for ADFS to properly consume the RelayState and send the user to the link after authentication. Those guidelines are laid out here, and include the fact that part of it must already be URL encoded: http://technet.microsoft.com/en-us/library/jj127245(v=ws.10).aspx

So the options are to have the service provider deliver deep links in the format ADFS requires (unlikely to happen), or modify the links to the correct format. Fortunately, we can use the IIS URL Rewrite Module to match and manipulate patterns based on regular expressions (your Lync voice administrators should be very familiar with this concept.) Install the URL Rewrite module on your ADFS servers from here: http://www.iis.net/downloads/microsoft/url-rewrite

Let’s work through an example that uses the Jobvite service. I apologize for the color barf throughout the rest of the post, but I hope it highlights how everything lines up. A deep link that a user receives in an email from Jobvite resembles the following format:

https://www.jobvite.com/m/default.aspx?3EGNofwx

When a user clicks that link, Jobvite knows the tenant is enabled for SSO, and provides a redirect URL to the browser where the user should authenticate. It’s the equivalent of saying “I don’t authenticate you, but this URL (ADFS) can. Come back when you have a token that says you’ve authenticated.” In Jobvite’s case, that URL includes an identifier for ADFS to match a relying party trust (found in the loginToRp parameter), and two RelayState parameters – RelayState and Target. The values in RelayState and Target are identical, so I can only assume Jobvite is hoping the SSO provider will consume one of these. Unfortunately, ADFS doesn’t like either! This is the URL a user is redirected to by Jobvite:

https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?loginToRp=http://jobvite.com/saml&RelayState=/m/default.aspx?3EGNofwx&target=/m/default.aspx?3EGNofwx

Now the user’s browser flips to the ADFS link and logs in. Afterwards, the user is dumped back in to the main Jobvite page because the RelayState parameter didn’t match ADFS’s expected format. There are no errors, and a user is now logged in to Jobvite, but they didn’t end up on the URL they clicked in the email. It’s more of an annoyance than a major issue, but it can be fixed. The format ADFS wants to see looks like this, and everything after the first RelayState= must be URL encoded.

https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?RelayState=RPID=[Relying Party ID]&RelayState=[Relay State]

Covering how regular expressions or the rewrite module works is outside the scope of this article, but to be quick:

1. We match a specific URL and pattern received by the IIS server.

2. We then “rewrite” or modify that URL before it is actually processed and delivered to the ADFS application.

Start by opening the URL Rewrite module and click “Add Rule(s).” Select “Blank rule” and click OK.

First we need to set a match URL. The URL in this window is really just the FQDN part of the URL, so in this example it would be sso.confusedamused.com. I’ve simply entered .* as the match URL pattern to allow any FQDN that points to this server to be accepted. I could use DNS to reach this server via sso.confusedamused.com, adfs.confusedamused.com, yourmomgoestocollege.confusedamused.com, or anything else and it would still match.

image

Next, we need a pattern to match. Expand the Conditions section and press “Add”. The condition input we’ll check is the {REQUEST_URI}, which refers to everything after the FQDN we just checked. The pattern we’ll use looks like this.

^(.*)/adfs/ls/idpinitiatedsignon.aspx\?loginToRp=(.*)&RelayState=(.*)&(.*)$

image

In English, the pattern means “Must start with”, “any string” (that’s the .* part), followed by the exact text “/adfs/ls/idpinitiatedsignon.aspx\?” (we have to escape the “?” with a “\”), followed by “loginToRp=”, capture the next “any string” (because this one is in parentheses it means we’re capturing this data to re-use in the rewrite later), followed by the exact string “&RelayState=”, capture the next string, followed by the exact string “&”, capture the next string, and finally, “must end with”. Technically, we don’t need to capture that last string because it just contains the target=/m/default.aspx?3EGNofwx part, which we already know is identical to the RelayState parameter passed right before it. But you get the idea.

For reference, here is the URL again that matches the above pattern. The colors should line up.

https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?loginToRp=http://jobvite.com/saml&RelayState=/m/default.aspx?3EGNofwx&target=/m/default.aspx?3EGNofwx
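If you want to see what each capture group picks up before touching IIS, the pattern can be tried out in a few lines of Python. This is only a sketch: Python's re module is not the regex engine the URL Rewrite module uses, but for this particular pattern and input the groups should come out the same. The REQUEST_URI value below is simply the path and query string of the example URL above.

```python
import re

# Same pattern as the rewrite condition above.
PATTERN = re.compile(
    r"^(.*)/adfs/ls/idpinitiatedsignon.aspx\?loginToRp=(.*)&RelayState=(.*)&(.*)$"
)

# Everything after the FQDN in the Jobvite redirect.
REQUEST_URI = (
    "/adfs/ls/idpinitiatedsignon.aspx"
    "?loginToRp=http://jobvite.com/saml"
    "&RelayState=/m/default.aspx?3EGNofwx"
    "&target=/m/default.aspx?3EGNofwx"
)

match = PATTERN.match(REQUEST_URI)

# Print each captured group using the {C:n} naming from URL Rewrite.
for number, value in enumerate(match.groups(), start=1):
    print(f"{{C:{number}}} = {value!r}")
```

The first group comes back empty because nothing precedes /adfs in the request, which is why the interesting values start at the second group.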

In the Action section the type we’ll select is rewrite. Remember how we “captured” some parts of text in the pattern previously? Those line up to variables that we can now reuse and reference in the rewrite pattern in the format {C:[Variable Number]}. So we have {C:1} that matches the first string we captured, and {C:2} matches the second one, etc. {C:1} actually refers to the very beginning of our pattern (the first (.*)), so the text we’re interested in starts at {C:2} which is the Relying Party ID, and includes {C:3}, our RelayState link. 

Just to refresh, the end-state we need to get to is this, and we have to URL encode everything after the first RelayState=:

https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?RelayState=RPID=[Relying Party ID]&RelayState=[Relay State]  

The URL Rewrite module actually includes a function to URL Encode a section of text so the rewrite to use will look like this:

adfs/ls/idpinitiatedsignon.aspx?RelayState={UrlEncode:RPID={C:2}&RelayState={C:3}}

image 

After that, simply select “Stop processing of subsequent rules” and uncheck the box “Append query string.” The end result is that the original link returned by Jobvite that ADFS doesn’t recognize…

https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?loginToRp=http://jobvite.com/saml&RelayState=/m/default.aspx?3EGNofwx&target=/m/default.aspx?3EGNofwx 

… will be transformed into this URL before it hits ADFS. The user experience is now that a deep link works as expected. A user that is not logged in to Jobvite can click a URL from an email, be seamlessly authenticated via ADFS, and end up on the URL they clicked on.

https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?RelayState=RPID%3Dhttp%3A%2F%2Fjobvite.com%2Fsaml%26RelayState%3D%2Fm%2Fdefault.aspx%3F3EGNofwx
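The whole transformation can also be approximated outside of IIS, which is handy for checking what a new service provider's links would turn into. In this Python sketch urllib.parse.quote with safe="" stands in for the module's UrlEncode function. That equivalence is an assumption that holds for the characters in this example; it is not a statement about how IIS encodes every possible character. The rewrite function name is made up for illustration.

```python
import re
from urllib.parse import quote

PATTERN = re.compile(
    r"^(.*)/adfs/ls/idpinitiatedsignon.aspx\?loginToRp=(.*)&RelayState=(.*)&(.*)$"
)


def rewrite(request_uri: str) -> str:
    """Mimic the rewrite rule: rebuild the query string in the ADFS format."""
    match = PATTERN.match(request_uri)
    if match is None:
        # No match means the rule would not fire; leave the request alone.
        return request_uri
    relying_party, relay_state = match.group(2), match.group(3)
    inner = f"RPID={relying_party}&RelayState={relay_state}"
    # safe="" forces "/", ":", "=", "&" and "?" to be percent-encoded too.
    return "/adfs/ls/idpinitiatedsignon.aspx?RelayState=" + quote(inner, safe="")


original = (
    "/adfs/ls/idpinitiatedsignon.aspx"
    "?loginToRp=http://jobvite.com/saml"
    "&RelayState=/m/default.aspx?3EGNofwx"
    "&target=/m/default.aspx?3EGNofwx"
)

print(rewrite(original))
```

Requests that don't match the pattern pass through untouched, which mirrors how the rule only fires when its condition is met.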

To recap – the process goes:

1. User clicks link in email: https://www.jobvite.com/m/default.aspx?3EGNofwx

2. Jobvite redirects user to this link: https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?loginToRp=http://jobvite.com/saml&RelayState=/m/default.aspx?3EGNofwx&target=/m/default.aspx?3EGNofwx

3. URL Rewrite module matches the redirect link pattern and modifies the link to the following before delivering to ADFS: https://sso.confusedamused.com/adfs/ls/idpinitiatedsignon.aspx?RelayState=RPID%3Dhttp%3A%2F%2Fjobvite.com%2Fsaml%26RelayState%3D%2Fm%2Fdefault.aspx%3F3EGNofwx

You can create as many rewrite rules as necessary in case different service providers each send unique formats with the RelayState parameter.

Lync VMs and Virtual Audio Cable

The need to have dedicated PCs each with an audio device for testing Lync audio can sometimes be a challenge, especially in lab environments where all of the clients are virtualized. If you use the console or remote desktop on a virtual machine running Lync you’ll see the warning that no audio device is connected:

image

And if you check your Audio Device settings, Lync will tell you that no device has been found. This prevents you from testing any audio functionality, which can be handy in a lab to validate Edge server or gateway integration scenarios.

image

A solution here is to leverage a tool called Virtual Audio Cable, which offers a free trial available here: http://www.ntonyx.com/vac_demo.htm

To get started, unzip the package and run the installation on a VM. After the software is installed you can create the virtual audio devices. Navigate to the Virtual Audio Cable folder within the Start Menu. Right-click on Control Panel and select “Run As Administrator” to open the configuration utility.

Enter 2 for cables and press Set. Then exit the application.

image

Now open Lync again and check the Audio Device settings. You can select Virtual Audio Cable Line 1 for the speaker, and Line 2 for the microphone.

image

At this point you can now make and receive phone calls within the VM (although you can’t hear anything!)

image

I should note that this has to be done from the console session of a Hyper-V or VMware VM. Connections via Remote Desktop will not work the same way. And to be perfectly clear, this has absolutely nothing to do with the VDI functionality in Lync 2010 or 2013. It is purely for lab and testing purposes.

Site Refresh

It makes me sad to realize this blog redesign started over a year ago, and it's still ridiculously unfinished, but I think it's in a usable state. My original intent was to finish every possible corner, and clean up my CSS and HTML code before launching the page, but I've long given up on that idea and realized it's probably a hopeless dream. Since the day I started this redesign I've gotten married and had a daughter, so I think the days of having extra free time to work on the website are long gone. I'm not sure any component of the page is in its final state today, but I hope to keep fine-tuning and editing items incrementally (hopefully without breaking anything!)

So, enjoy.

Drop me a line on Twitter if you notice any problems and I'll try to clean them up. And yes, I probably make about 50 cents if you buy Lync Server 2010 Unleashed through the link on the pages. But you can feel good about funding my daughter's college education by buying it from Amazon.

Cisco UCS M81KR NICs on Server 2012

We use Cisco UCS internally at my company because of its flexibility and sheer awesomeness. On top of the ridiculously cool NIC failover abilities, it lets us do crazy stuff like move entire Hyper-V servers between different physical blades in a matter of minutes. I noticed today that after installing a new Server 2012 node from iSCSI boot only a single NIC was present on the server. I had added 3 other vNICs via UCS manager, but Windows had only activated the one iSCSI used for the boot process. I checked Device Manager and found all of the Cisco VIC Ethernet Interfaces appeared as if they were missing a driver. When I tried to update the driver Windows reported it was already up to date.

In order to fix this just open up the properties of the VIC in Device Manager, navigate to the Driver tab, and select Uninstall. Repeat for each VIC showing the yellow warning sign. At the end, right-click the Network adapters and choose Scan for hardware changes. The NICs should come right up with a functional driver.

Fixing VeriSign Certificates on Windows Servers

One item I’ve seen repeatedly cause issues in new Exchange or Lync environments centers around certificates from public providers such as VeriSign, Digicert, or Entrust. These providers generally use multiple tiers of certificates, so when you purchase a certificate it is generally issued by a subordinate, or issuing, certificate authority instead of the root certificate authority. The way that SSL certificate chains work requires an end client to trust only the topmost, or root, certificate in the chain in order to accept the server certificate as valid. But in order to properly present the full SSL chain to a client a server must first have the correct trusted root and intermediate certificate authorities loaded. So the bottom line here is that if you haven’t loaded the full certificate chain on the server then you may see clients have trouble connecting.

This becomes especially problematic in the case of VeriSign’s latest chain. If you are using a modern Windows client such as Windows 7 or 2008 R2 you’ll see the VeriSign Class 3 Public Primary Certification Authority – G5 certificate which expires in 2036 with thumbprint 4e b6 d5 78 49 9b 1c cf 5f 58 1e ad 56 be 3d 9b 67 44 a5 e5 installed in the Trusted Root Certification Authorities by default. There is some extra confusion generated because there is also a VeriSign Class 3 Public Primary Certification Authority – G5 certificate which expires in 2021 with thumbprint 32 f3 08 82 62 2b 87 cf 88 56 c6 3d b8 73 df 08 53 b4 dd 27 installed in the Intermediate Certification Authorities by default. The names of these certificates are identical, but they are clearly different certificates expiring on different dates.
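Since the two certificates share a name, the thumbprint is the reliable way to tell them apart. Here is a small Python sketch that normalizes a thumbprint pasted from the certificate dialog (spaces, mixed case, and the invisible character that often comes along when copying from the MMC) and reports which of the two G5 certificates it is. The function names are made up for illustration; the thumbprints are the two listed above.

```python
import re

# The two identically named G5 certificates, keyed by SHA-1 thumbprint.
G5_CERTS = {
    "4eb6d578499b1ccf5f581ead56be3d9b6744a5e5": "root (expires 2036)",
    "32f30882622b87cf8856c63db873df0853b4dd27": "intermediate (expires 2021)",
}


def normalize(thumbprint: str) -> str:
    """Lowercase the thumbprint and drop everything that is not a hex digit."""
    return re.sub(r"[^0-9a-f]", "", thumbprint.lower())


def identify_g5(thumbprint: str) -> str:
    """Report which G5 certificate a pasted thumbprint belongs to."""
    return G5_CERTS.get(normalize(thumbprint), "unknown")


# A pasted value with a leading invisible left-to-right mark and spaces.
pasted = "\u200e4e b6 d5 78 49 9b 1c cf 5f 58 1e ad 56 be 3d 9b 67 44 a5 e5"
print(identify_g5(pasted))
print(identify_g5("32 F3 08 82 62 2B 87 CF 88 56 C6 3D B8 73 DF 08 53 B4 DD 27"))
```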

What you’ll find after purchasing a VeriSign certificate is that the CA which actually issues your server certificate, VeriSign Class 3 Secure Server CA – G3, is cross-signed by both of the G5 certificates. This means that there are now 2 different certificate chains you could present to clients, but what is actually presented depends on how you configure the server. The two chain options you can present are displayed below, and while one is a bit longer, both paths are valid.

image

So if a client trusts either of the G5 certificates as a trusted root, it will trust any certificate issued by a subordinate CA such as the G3. What ends up happening is that the certificate chain will look correct when a Windows 7 or 2008 R2 server connects to it, because those operating systems already have the 2036 G5 CA as a trusted root. You’ll see only a 3-tier chain presented, and the connection will work just fine.

image

There’s nothing actually wrong with this if all you have are newer clients. In fact, that’s one advantage of cross-signing – that a client can leverage the shortest possible certificate chain. But any kind of downlevel client, such as Lync Phone Edition, does not trust that newer G5 CA by default. This means that when those devices try to connect to the site they are presented with the 2036 G5 certificate as the top-level root CA, and since they do not trust that root they will drop the connection. In order to support the lowest common denominator of devices the chain should actually contain 4 tiers, like in the following screenshot. Older devices typically have the VeriSign Class 3 Public Primary CA already installed as a trusted root, so you may get better compatibility this way.

image

The screenshots have been from the same certificate, but the difference is how the chain is presented. In order for a server to present the full chain you must log on to each server hosting the certificate and open the certificates MMC for the local computer. Locate the VeriSign Class 3 Public Primary Certification Authority – G5 certificate in the Trusted Root Certification Authorities node, right-click, and open the Properties. Select Disable all purposes for this certificate and press OK to save your changes.

image

By disabling the incorrect trusted root certificate the server will now be presenting the full chain. The big ‘gotcha’ here is that you can’t easily test this. If you browse to the site from a Windows 7 client and open the Certification Path tab for the certificate it’s still going to look the same as before. The reason for this is that Windows 7 also has the VeriSign Class 3 Public Primary Certification Authority – G5 certificate in the Trusted Root Certification Authorities machine node by default. And because Windows 7 trusts that as a root CA, it will trust any certificate below that point. Certificate testing tools you find on the Internet also aren’t going to be much help here because they also already trust the 2036 G5 certificate. The only way you can verify the full chain is to delete or disable that cert from the client you’re testing on. And no, this is not something you should ever attempt on multiple machines – I’m suggesting this only for testing purposes. If you’re using any kind of SSL decryption at a load balancer to insert cookies for persistence you’ll want to make sure the load balancer admin has loaded the full chain as well.

So now you’ve fixed the chain completely, and after the next Windows Update cycle you’ll probably find the G5 certificate enabled again on the server. The root certificate updates for Windows will actually re-enable this certificate for you (how kind of them!), and result in a broken chain for older clients again. In order to prevent this from occurring you can disable automatic root certificate updates from installing via Windows Update. This can be controlled through a Group Policy setting displayed here:

image

Publishing Exchange Web Services remotely only for Lync

A recent Lync project for me involved a scenario where the customer was not publishing any aspect of Exchange remotely. They were using Good for all email to mobile devices, and had elected not to publish Outlook Anywhere or Outlook Web App remotely. This presented some challenges in terms of features available to remote Lync clients since a lot of functionality comes from Exchange Web Services (EWS). If EWS is not available a Windows Lync client can fall back to pull some data from Outlook via MAPI, but that does nothing to help a Mac Lync user, Lync mobile devices for iOS, or any Lync Phone Edition clients. All of those require EWS to be published remotely or there will be some feature loss and very visible errors within the user interface.

To work around this issue we started with a goal to publish only Exchange Web Services to support Lync clients, and still prevent all other types of clients from connecting to Exchange remotely. Publishing only EWS is a straightforward process and can be done through the TMG publishing rules. After using the Outlook Anywhere publishing rule template in TMG simply edit the paths tab and remove everything except /EWS/* and /Autodiscover/*:

image

This will effectively allow remote users to connect and do autodiscover queries. Remote Outlook clients will perform a successful autodiscover lookup, but see Outlook Anywhere is disabled (assuming you’ve disabled it on the CAS!) and not attempt a connection. You’ll also want to make sure the ExternalURL property of your EWS virtual directories is populated with the correct FQDN and that public names for that URL and autodiscover are allowed in TMG:

image

This would work great if all you had in the world were PCs, but the challenge here is the Mac Outlook and older Entourage 2008 Web Services clients are entirely based on EWS and by configuring the previous tasks in TMG we were now allowing those remote clients to connect. We needed some way to block those clients so we settled on filtering based on the User Agent string, which is a unique string within the HTTP header identifying the type of client making a request. We started with working out which user agents we actually needed to allow and came up with the following list after running some logs and tracking the user agent headers in sample client connections:

  • Lync iOS (iPad and iPhone): Microsoft Lync iPhone
  • Lync Phone Edition: OCPhone
  • Lync PC: OC
  • Lync for Mac: MC

TMG actually has a User Agent filtering tool built in, but it was very unfortunately written backwards for what we needed. You can see here that TMG is expecting you to explicitly block specific user agents that you don't want to allow:

image

That seems like a good idea if you know all the strings for every possible EWS client out there, but what happens when a new client comes out that’s not specified here? It would be allowed to connect and bypass the restrictions we were trying to impose. In this scenario a load balancer such as an F5 BigIP or A10 Networks AX device can be incredibly handy because of their iRules and aFlex engine. In our case we were able to use the aFlex rules to block requests from everything except the clients we wanted to explicitly allow. The actual code is here:

when HTTP_REQUEST {
    #log local0. "[HTTP::header "User-Agent"] INBOUND"
    if { ([HTTP::header "User-Agent"] matches_regex "Lync.*") or ([HTTP::header "User-Agent"] matches_regex "Microsoft\+Lync\+iPhone\/.*") or ([HTTP::header "User-Agent"] matches_regex "MC\/.*") or ([HTTP::header "User-Agent"] matches_regex "OC\/.*") or ([HTTP::header "User-Agent"] matches_regex "OCPhone.*") } {
        pool ews
    } else {
        #log local0. "[HTTP::header "User-Agent"] REJECT"
        reject
    }
}

This method is not foolproof since a user agent can be spoofed fairly easily, but it met the needs of this particular project. You should also keep in mind that if Outlook Web App and the Exchange Control Panel are not published remotely then users will be unable to edit their voicemail options or manage personal call answering rules from the Internet. In my case requiring users to connect their VPN to use those features was an acceptable solution.
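To reason about which clients the allow-list in the aFlex rule lets through, the same patterns can be exercised in Python. This is a sketch of the matching logic only, not aFlex code. It assumes the patterns behave as unanchored regular-expression searches, and the sample user agent strings are invented for illustration rather than taken from real client traffic.

```python
import re

# The same five patterns used in the aFlex rule.
ALLOWED_PATTERNS = [
    r"Lync.*",
    r"Microsoft\+Lync\+iPhone/.*",
    r"MC/.*",
    r"OC/.*",
    r"OCPhone.*",
]
ALLOWED = [re.compile(pattern) for pattern in ALLOWED_PATTERNS]


def is_allowed(user_agent: str) -> bool:
    """Return True if any allow-list pattern is found in the user agent."""
    return any(pattern.search(user_agent) for pattern in ALLOWED)


# Hypothetical user agent strings.
for agent in ("OC/4.0 (example)", "OCPhone/4.0", "ExampleMailClient/1.0"):
    verdict = "pool ews" if is_allowed(agent) else "reject"
    print(f"{agent!r}: {verdict}")
```

Because the patterns are not anchored, any string that merely contains "OC/" or "Lync" is accepted. If that is too loose for your environment, anchoring each pattern at the start of the header with ^ is one option to consider.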

Some notes on Lync and Exchange UM QoS

If you haven’t found it yet, the Enabling Quality of Service documentation on TechNet is a fantastic resource to get started on configuring QoS marking for Lync servers and clients. So when planning on enabling QoS in your environment you should start there, and I’d also recommend following Elan Shudnow’s posts for step-by-step screenshots of how to configure these policies on Lync servers. What I’d like to cover here is one scenario that I don’t see documented at this point – Exchange UM and Lync Edge QoS. When a remote user calls in to UM Subscriber Access or an Auto-Attendant via Lync the audio stream will not flow through the Front-End servers. Instead, it will be User <> Edge <> UM.  So if your QoS policies on the Edge don’t take UM into account you won’t have audio traffic on the Edge > UM leg of the call being tagged with a DSCP value.

To get started you can reference the Configure Quality of Service for Unified Messaging documentation. If you’ve only ever used policy-based QoS settings like Lync Server 2010 leverages then you may find the UM setup a little confusing. The key to getting UM to start marking packets is to enable the QoS feature via registry key. On each UM server you’ll want to create a new DWORD called QoSEnabled inside HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RTC\Transport and set the value to 1 (don’t worry if some of those sub-keys don’t exist yet – it’s safe to create them.) You can ignore the confusing TechNet note that says you should restart your Lync or OCS servers after this change. The registry key and restart apply to the Exchange UM server you just configured this registry key on – not your Lync servers.

After restarting the UM services you’ll find it will mark all outbound audio packets as SERVICETYPE_GUARANTEED. Windows defaults to applying a DSCP value of 40 for this type of traffic, but you may need to modify this to be something more standard in the networking (Cisco) world where audio is typically marked with DSCP 46. In order to do this you can either apply a Group Policy to the machines or edit the local Group Policy settings on each UM server. You can adjust this value within the Computer Configuration\Administrative Templates\Network\QoS Packet Scheduler\DSCP Value of Conforming Packets section of Group Policy.

image

Edit the Guaranteed Service Type value to match the DSCP value your network devices are expecting for audio:

image
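A capture tool may show the marking as the raw DS (formerly ToS) byte rather than the decoded DSCP value. DSCP occupies the upper six bits of that byte, so the conversion is a two-bit shift. A quick Python sketch of the arithmetic (the function name is made up for illustration):

```python
def dscp_to_ds_byte(dscp: int) -> int:
    """Convert a DSCP value (0-63) to the DS/ToS byte seen in an IP header."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    # DSCP is the top six bits; the low two bits (ECN) are left at zero here.
    return dscp << 2


for dscp in (40, 46):
    print(f"DSCP {dscp} -> DS byte 0x{dscp_to_ds_byte(dscp):02X}")
```

So the Windows default of 40 appears as 0xA0 in the header, while DSCP 46 appears as 0xB8.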

At this point UM tagging of audio packets should be functional and you can (and should) verify this with a Wireshark or Netmon capture. What I’ve not seen called out is the fact that UM is just another client in the world of Lync with Edge servers and that it will be passing audio traffic through the Edge servers for remote users. UM will not respect the audio ports you limit Lync clients to, and it does not use the same range as Lync servers for audio. UM’s default port range is actually quite large since it uses UDP 1024-65535. If you’re tagging traffic from your Edge servers to Lync servers already you can simply re-use the same ports by configuring them in the msexchangeum.config file found within C:\Program Files\Microsoft\Exchange\v14\bin on each UM server.

If you’d prefer to not adjust the default port range you’ll want to be sure the UM servers are accounted for on each of your Lync Edge servers as a separate target in your QoS policy. In this example I’ve set up a separate policy towards each UM server and specified the dynamic range UM will be using as the destination port. This ensures any traffic leaving the internal-facing Edge NIC and heading towards Exchange UM will be marked with DSCP 46.

image

I also want to reiterate one point that Elan calls out since it’s not documented properly at this point – the TechNet docs suggest targeting the MediaRelaySvc.exe application in the QoS policy on the Edge servers. What you’ll find is that if you do specify an executable the packets leaving the internal-facing Edge interface will not be tagged at all. Your rule probably looks perfect and you can restart the server as many times as you’d like, but if you specify the executable you will find all packets leaving the server as DSCP 0. The workaround here is to either not specify the executable at all, or if you want to be more specific you can make sure the source IP in your QoS policy is the internal-facing NIC like I’ve done in the screenshot above.

Jabra 9450 and 9460 Headsets for Lync

This past week I had the chance to try out two different wireless headset options for Lync that are available from Jabra. I wouldn't have guessed it based on the model numbers, but the 9460 is the older headset of the two. It has a touchscreen control and mute button which I thought was a huge bonus.

9460 headset and base station:

photo 1

The charging stand for the headset has a docking station that swivels, but I found the angle options you can put the headset at weren't very well thought out. It's impossible to turn the headset to 90 degrees so the boom mic ends up behind the touchscreen in order to save some space. Oddly enough, they do let you turn the stand completely the opposite way in case you wanted to make the dock and headset take up as much room as humanly possible.

After using it for a bit I have to say I was pretty disappointed with the quality of the screen. It sure looks nice, but the touchscreen itself was not very responsive. I found I usually had to hit the button a few times before it would register. There also seemed to be a bit of a delay between the time I would press the button and when it would be marked as a mute in Lync.

9460 touch screen:

photo 2

I was a bit wary of that touchscreen up front, so I had also ordered the newer 9450 model as a backup option. The base station and headset itself are identical to the 9460 and it has the same swivel restrictions. It doesn't look nearly as slick, but it does have physical buttons for mute and the ability to switch between a desk phone and softphone. I've found these physical buttons to be much more responsive and was much happier with this model.

9450 headset and base station:

photo 3

9450 physical buttons:

photo 4

The headsets each have a charging connector on one of the ears which docks on to the charging station. I found it a little awkward to place the first few times, but it got easier the more I used it. It would be swell if you could point the boom mic up or some other direction than the default because I kept knocking it into my desk when trying to get it charged.

Headset and charging station:

photo 5

I should also say that both headsets were extremely comfortable to wear and the sound quality was fantastic. If you're looking for a new Lync headset I'd probably suggest the 9450 at this point. The models are roughly the same price, but the base station on the 9450 seems like much higher quality and it is a more recent piece of hardware, for whatever that's worth.

In the end I found the sweet spot for me was resting the free end of the headset on top of my Polycom CX600:

photo

Office 365 Migration with Cisco IronPort

I ran across an interesting issue recently where a client could not get Autodiscover to work properly during the “rich coexistence” period with an on-prem Exchange 2010 while migrating to Office 365. Autodiscover for an on-prem user would work fine, but as soon as the user had their mailbox moved to Office 365 the Autodiscover process wouldn’t work. The DNS records looked fine, and when looking at the log we saw the client would connect to the internal SCP, get a redirect to Office 365 for the correct SMTP address, and then fail. We couldn’t set up a brand new profile for the user internally, but we noticed it would work perfectly OK from an Internet client. Must be something internal at that point, right?

After some more testing we learned a Cisco IronPort was being used for outbound web proxy filtering. As soon as we added an exception for the test machine's IP address we found Autodiscover worked just fine for a cloud user. In the end we added an exception for the domains .outlook.com and .online.lync.com. Secure web filtering: keeping users safe and admins frustrated. Happy migrating.
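To sanity-check which hostnames an exception list like that would cover, a plain suffix match is enough for a first pass. This is only a sketch: the function name is hypothetical, and the IronPort's own matching rules are not modeled here and may differ.

```python
# The two exception entries from the post, treated as plain domain suffixes.
BYPASS_SUFFIXES = (".outlook.com", ".online.lync.com")


def bypasses_proxy(hostname: str, suffixes=BYPASS_SUFFIXES) -> bool:
    """Return True if the hostname ends with one of the exception suffixes."""
    host = hostname.strip().rstrip(".").lower()
    return any(host.endswith(suffix) for suffix in suffixes)


if __name__ == "__main__":
    for name in ("autodiscover-s.outlook.com", "webdir.online.lync.com", "example.com"):
        print(name, bypasses_proxy(name))
```

Because each suffix starts with a dot, a lookalike such as notoutlook.com does not match, and neither does the bare domain outlook.com.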

DHCP Snooping and Lync Options

A few weeks ago I was trying to track down a DHCP issue for some Lync Phone Edition clients. Tethered phones could sign in with no issues, but it didn't look like PIN authentication was working properly. We went through and validated all the DHCP options were present on the server, even removed and added them back, but would ultimately end up with "Certificate web service cannot be found" displayed on the phones. In order to isolate the problem a bit I started using the DHCPUtil.exe tool from a workstation (don't forget it must be x64 to use this) so I could simulate what the phones were requesting. You can copy the file from a Lync server and simply run DHCPUtil.exe -EmulateClient to simulate the options request process. While doing this I ran simultaneous packet captures on the workstation and DHCP server to see where the disconnect was.

Here you can see the client request:

The DHCP server sees the request come in with the vendor identifier:

The DHCP server responds to the client with Option 120 and 43:

But the client never sees the response:

After grabbing those traces it was pretty obvious something in the network path was preventing those responses from getting back to the client. The odd part was it wasn't preventing all parts of DHCP from working, since the clients could get an IP, gateway, and DNS information just fine - it was just the DHCP Inform message with the Lync options that was getting hosed.

We took those traces to the network folks and had them take a look at the Cisco switches in use. Lo and behold, someone had previously left a couple of DHCP snooping commands enabled on one of the switches, and that switch was dropping all of the DHCP Inform responses. We disabled that feature and found the phones were able to sign in just fine.
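When comparing captures from both ends of the path, the question is whether options 43 and 120 are present in the reply as it leaves the server and as it reaches the client. DHCP options are a simple code/length/value list, so they are easy to walk by hand. A minimal sketch follows; it assumes you have already extracted the options field (the bytes after the magic cookie) from the captured packet, and the function names are hypothetical.

```python
from typing import Dict, List

OPT_PAD = 0    # single-byte padding option
OPT_END = 255  # marks the end of the options field


def parse_dhcp_options(options: bytes) -> Dict[int, bytes]:
    """Parse a DHCP options field (bytes after the magic cookie) into {code: value}."""
    parsed: Dict[int, bytes] = {}
    i = 0
    while i < len(options):
        code = options[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:
            i += 1
            continue
        if i + 1 >= len(options):
            raise ValueError("truncated option header")
        length = options[i + 1]
        value = options[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError(f"option {code} is truncated")
        # An option that appears more than once is concatenated.
        parsed[code] = parsed.get(code, b"") + value
        i += 2 + length
    return parsed


def missing_lync_options(options: bytes, required=(43, 120)) -> List[int]:
    """Return the required option codes that are absent from the options field."""
    present = parse_dhcp_options(options)
    return [code for code in required if code not in present]
```

Running the same check against the server-side and client-side captures shows where the options go missing: present on the server side but no reply at all on the client side points at something in between.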

Site was hacked

So it turns out that updating your WordPress code on a fairly regular basis (or more regularly than every 2 years) is somewhat advisable, or you may find lots of your code containing links to Russian web sites. Sorry to anyone who hit the site and found a virus alert. In order to remove that garbage I've gone back to a clean install of WordPress. All of the content - good, bad, and potentially outdated - is still here, and I haven't found any evidence of problems in the actual WordPress database yet, but I haven't spent too long digging through it. Until I can find the free time (ha!) to verify everything is still OK I'll be leaving the site with this fugly default theme. I have an entirely new, custom theme in progress so I imagine I'll just move the site to that layout once I finish it up.

Not So Snomtastic

Disclaimer: This post is purely my own opinion and is based on my personal experience using Snom products against Lync Server 2010 circa Fall 2011.

A while back I was involved with a migration moving a customer to Lync from an OCS 2007 R2 Enterprise Voice deployment where they were using a large installation of Snom phones. There were a number of problems which could probably have been alleviated by keeping the Snom device firmware reasonably up to date before we started the project, but the scenario we started with was one where all the phones were running firmware approximately 2 years old. No big deal, right?

Updates
The first major issue with these phones is they are not updated using the same device update feature which is built right in to both Lync and OCS. This means you cannot download a Microsoft-produced UCUpdates.cab and load it on the phones the same way you update all of your other UC devices. All updates for these devices come from Snom and must be distributed to each phone through some other manner.

You have two options - you can manually update each device, or use the freely available Snomtastic server software (which is not actually produced by Snom, so good luck finding quality support or documentation around this product). Manually updating the phones has the approximate success rate of hitting the jackpot in Vegas, and is probably worth an entire post in itself. Let's just say it involves knowing the secret code to press on the phones as they boot (Up, Up, Down, Down, Left, Right, Left, Right, A, B, A, B, Select, Start... ok, just kidding.) In reality, it's a crapshoot whether you pressed the right sequence in time to even kick the phone into an updateable mode. And the odds of the update bricking the phone when coming from these old versions were about 50/50 from our testing.

Updating hundreds of phones manually isn't practical, so you're probably looking at using Snomtastic to perform these updates. In my case Snomtastic was already deployed, but it was an old version which was apparently single-threaded. Sidenote: it also appears to be in contention for the World's Worst User Interface, but it's free so I suppose you take what you get if you don't want to manually update each phone. The single-threaded limitation meant we could only safely update 10-15 phones at a time. If you attempted to do more than that the software would come to a screeching halt with the CPU pegged, and the phones would be stuck trying to load the firmware. The end result was you had to go physically touch and manually kick them out of this mode before they were operational again. Updating to a new version of Snomtastic was out of the question as it was going to involve physically touching each phone once more to change VLANs around.

Any Day Now...
In the end we turned a blind eye to the update process and began validating the product against Lync. This customer had a ton of Snom 3xx devices which did not actually have a firmware release for Lync quite yet. Snom was touting their "UC Edition" firmware, which is the version that runs against Lync, but that firmware only existed for their newer 8xx series phones when we started. This meant we were stuck with the older 8.5.8.1 "OCS Edition" version for the 3xx phones. We were told the UC Edition for the 3xx's was coming "any day now" when the project started. 3 months later the story hadn't changed, so the confidence in quick updates and issue resolution was not high. We did notice the 8.5.8.1 phones had MRAS issues against our Lync Edge server pool which was using DNS load balancing, but we were assured by support that the UC Edition firmware would solve this issue, whenever it came out. Surely the OCS Edition couldn't be expected to support DNS load balancing when that feature did not exist for OCS.

DNS Load Balancing
While waiting for the 3xx update Snom support encouraged us to test the 8xx devices because the code would be identical. This seemed like a good idea so we started doing some tests against the environment. Again we noticed that even the 8xx devices couldn't communicate with remote users. Internal calls worked fine, but any time a call required Edge traversal the call would fail. This was only occurring when the Snoms were involved, and a full Lync or Lync Phone Edition client received MRAS credentials just fine. We eventually tracked this down with Snom support to a problem with DNS load balancing in their code. The gist was if the Snom phone got more than one name back for the Edge pool it freaked out and wouldn't get any MRAS token. Snom support got us a "fix" in a nightly beta snapshot which I was told ignores any DNS entry other than the first one. Yes, you read that correctly - the fix was to essentially turn off the HA features of the Snom client so there was no resiliency to DNS load balancing of the Edge pool. It would be good to point out this was the initial public release of the UC Edition code, which specifically advertised support for DNS Load Balancing and had been stamped with the official "Qualified with Lync" label.

Snom support eventually gave us a beta version of the UC Edition for 3xx devices which had the exact same problem. The UC Edition introduced support for many Lync-specific features, but it was really hard to have any confidence in any of these claims when it was obvious that a pretty major new feature wasn't working at all. I have no clue if this bug has been fixed to this day, or if the code still ignores anything past the first DNS record returned for the pool.
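The difference between the two behaviors is easy to see in miniature. DNS load balancing only gives you resiliency if the client moves on to the next address when one does not answer. The sketch below is an illustration under assumptions, not Snom's or Lync's actual code: the function names are hypothetical, and connect stands in for whatever connection attempt a real client would make.

```python
from typing import Callable, Optional, Sequence


def connect_first_only(addresses: Sequence[str],
                       connect: Callable[[str], bool]) -> Optional[str]:
    """Try only the first returned record, as the workaround was described to me."""
    if not addresses:
        return None
    first = addresses[0]
    return first if connect(first) else None


def connect_with_failover(addresses: Sequence[str],
                          connect: Callable[[str], bool]) -> Optional[str]:
    """Try each returned record in order and stop at the first one that answers."""
    for address in addresses:
        if connect(address):
            return address
    return None


if __name__ == "__main__":
    # Pretend the first Edge server in the pool is down.
    reachable = {"192.0.2.12"}
    pool = ["192.0.2.11", "192.0.2.12"]
    print(connect_first_only(pool, lambda a: a in reachable))
    print(connect_with_failover(pool, lambda a: a in reachable))
```

With the first pool member down, the first-only approach gets nothing while the failover loop lands on the second server, which is the resiliency the workaround gives up.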

The Big Switch
In light of these issues the customer made the tough call to switch out all the phone hardware and went with one of the devices which carries the "Optimized for Lync" label from one of the major Lync Phone Edition vendors. The end-user and administrator feedback with this switch was overwhelmingly positive. Call quality was perceived as better by the users, and we were able to back up those claims with the quality reports from the Monitoring server. The administrative experience for updating and monitoring phones was also much improved by using the native reporting and management tools.

There are plenty of other Lync Phone Edition options available which may seem like a larger initial investment in total price, but you really need to consider a number of factors in that decision other than per-device pricing. With a lower-end device you're likely to spend an incredible amount of time troubleshooting issues which wouldn't normally be a problem. You also have additional overhead in terms of management for these devices since you now have to maintain a Snomtastic server (with no real support) alongside your Lync deployment, when you could just be using the native Lync tools for updates. The end-users are going to have a much nicer experience via the USB tethering abilities of Lync Phone Edition clients, and it's important to note that the Snoms also do not support RTAudio. They advertise wideband audio support, but it is G.722 only, which uses nearly twice the bandwidth of wideband RTAudio. This can be a concern for bandwidth-constrained branch sites.
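A rough back-of-the-envelope calculation shows where the "nearly twice" comes from. G.722 has a 64 kbps payload bitrate. The 29 kbps payload figure used below for wideband RTAudio is an assumption taken from Microsoft's published bandwidth planning numbers, and the 20 ms packet interval and IPv4/UDP/RTP-only overhead (no SRTP, no FEC) are simplifying assumptions as well, so treat the output as an estimate rather than a measurement.

```python
IPV4_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # per-packet overhead, no SRTP or FEC


def stream_kbps(payload_kbps: float,
                interval_ms: float = 20,
                header_bytes: int = IPV4_UDP_RTP_HEADER_BYTES) -> float:
    """Estimate one-way stream bandwidth: codec payload plus per-packet header overhead."""
    packets_per_second = 1000 / interval_ms
    overhead_kbps = packets_per_second * header_bytes * 8 / 1000
    return payload_kbps + overhead_kbps


if __name__ == "__main__":
    g722 = stream_kbps(64)      # G.722 payload bitrate
    rtaudio_wb = stream_kbps(29)  # assumed wideband RTAudio payload bitrate
    print(g722, rtaudio_wb, round(g722 / rtaudio_wb, 2))
```

Under these assumptions G.722 comes out around 80 kbps per stream against roughly 45 kbps for wideband RTAudio, a ratio of about 1.8. Multiply by the number of concurrent calls at a branch site and the gap adds up quickly.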

I'm not advocating a boycott of Snom products here. I'm sure others have had different, and hopefully more positive, experiences, but mine was incredibly poor with Lync for the reasons I've mentioned above. And I'm sure Snom is making a strong effort to improve these products, but my perspective at this point is that they just don't seem Enterprise-class. They are not an option I would present to a customer today over any of the vendors producing Lync Phone Edition devices.

Migrating OCS conference directories to Lync the hard way

A few evenings ago I ran into a scenario where moving a conference directory from OCS to Lync failed, and the conference directory ended up in a limbo state where it wasn't on Lync, I couldn't move it back to OCS, and the conferencing attendant wouldn't recognize any PSTN IDs which were part of the directory. Not a great scenario.

After running Move-CsConferenceDirectory I could verify the move was in progress, but it never completed. The status would show it was trying to move, and OCS eventually started throwing errors that it no longer had a conference directory, but it never fully made it to Lync. The TargetServerIfMoving parameter stayed populated:

Get-CsConferenceDirectory -Identity 5
Identity: 5
ServiceId: UserServer:OCSPOOL.ptown.local
TargetServerIfMoving: UserServer:LYNCPOOL.ptown.local

Trying to run Move-CsConferenceDirectory again would consistently fail with the following errors:

WARNING: Move operation failed for conference directory with ID "5". Cannot perform a rollback because data migration might have already started. Retry the operation.
WARNING: Before using the -Force parameter, ensure that you have exported the conference directory data using DBImpExp.exe and imported the data on the target pool. Refer to the DBImpExp-Readme.htm file for more information.
Exception from HRESULT: 0xC3EE7950, Microsoft.Rtc.Management.ConferenceDirectoryCmdlets.MoveConferenceDirectoryCmdlet

In the end I needed to export the data from the OCS directory via DbImpExp, force the directory to move, and then import the data. Not the cleanest route, but it works. The order is important, so be patient.

On the OCS pool and database export the conference directory data:

DbImpExp.exe /hrxmlfile:C:\Temp\OCSDirectory5.xml /SQLServer:OCS-SQL.ptown.local /restype:confdir
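Opening the export and reading through it is still the real check, but a quick automated sanity test can catch an empty or truncated file before anything gets forced. The sketch below only confirms that the file is non-empty, well-formed XML with something under the root element. It makes no assumptions about the DbImpExp schema, and the function name is hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def check_export(path: str) -> int:
    """Return the number of XML elements in the export file.

    Raises ValueError if the file is empty or contains only a root element,
    and xml.etree.ElementTree.ParseError if the XML is malformed or truncated.
    """
    if os.path.getsize(path) == 0:
        raise ValueError(f"{path} is empty")
    root = ET.parse(path).getroot()
    element_count = sum(1 for _ in root.iter())
    if element_count <= 1:
        raise ValueError(f"{path} has a root element but no data under it")
    return element_count


if __name__ == "__main__":
    # Path matches the export command above; adjust to wherever the copy lives.
    print(check_export(r"C:\Temp\OCSDirectory5.xml"))
```

A passing check only means the file parses. It says nothing about whether every conference ID made it into the export, so still eyeball the contents.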

Only once you're positive you have a good export (read: opened the file and checked!) and have made a copy of it should you force the Move-CsConferenceDirectory operation:

Move-CsConferenceDirectory 5 -TargetPool LYNCPOOL.ptown.local -Force

Congrats. You've moved the directory to Lync, but it's empty. Copy the .xml export file to a FE in the Lync pool. On the Lync pool and database import the directory data while specifying the conference directory ID to recover the old data:

DbImpExp.exe /import /hrxmlfile:C:\Temp\OCSDirectory5.xml /SQLServer:LYNC-SQL.ptown.local /restype:confdir /dirid:5

At this point I could see the directory was no longer moving because TargetServerIfMoving was empty, and the conference attendant was now recognizing PSTN IDs which had been created against this directory.

Get-CsConferenceDirectory -Identity 5
Identity: 5
ServiceId: UserServer:LYNCPOOL.ptown.local
TargetServerIfMoving: 

Again, this is a good reason to always do a DbImpExp.exe dump before moving directories or databases around. Those .XML files can save your skin!

Lync Mobile on a Windows Phone 7 Emulator

Looking to try out Lync Mobile on Windows Phone 7, but don't have a Windows Phone? That was my scenario when I needed to troubleshoot a sign-in issue specific to WP7. I figured it would be as easy as firing up the WP7 emulator, but there are a few roadblocks here, such as the fact that a WP7 emulator exists but you can't access the Marketplace in it. There are some workarounds for that, but even if you manage to launch the Marketplace you can't actually sign in with a Live ID to download anything.

So, Phone7Market to the rescue. This freebie application allows you to download an app from the Marketplace and load it into the emulator. The first thing you'll need is the Windows Phone 7 emulator so start by downloading and installing the Windows Phone 7 SDK from Microsoft.

Now that you have the SDK installed you can launch the emulator, but as you can see there's not much you can do with it out of the box:

Next you'll need to download and install the Phone7Market application in order to load Lync Mobile.

After installing, open the Phone7Market program and search for Lync.

Right-click the Lync 2010 result, select Quick actions, and select Deploy to Emulator.

This should launch your WP7 emulator and you'll see the Lync 2010 application loaded for you:

Tip: Press Page Up once to enable keyboard entry from the host PC. You should be able to sign in successfully:

Lync Mobile Clients and TMG Server Farms

Quick update here for those of you publishing Lync web services with TMG and having trouble with mobile clients:

If you're following the Mobility load balancing requirements you'll find that cookie-based persistence is recommended in order to ensure the clients are always directed to the same Front-End server and session. This isn't an issue for a single FE, but once you start publishing a farm of FEs within TMG you'll notice the Lync mobile clients can't sign in. Android clients can for some reason, but WP7 and iPhone cannot.

The issue you'll face is that while TMG offers you cookie persistence when publishing a web farm, it really only works when the web listener is enabled for forms-based authentication. Since the Lync Web Services cannot be published via FBA the cookie never gets inserted. The end result is that TMG will now round-robin requests between the published farm members and the mobile clients will never sign in due to a ping-pong behavior. You can verify this behavior by draining all Front-End servers from the farm except for one and you'll see the clients can now sign in.

For a small deployment where a single FE can handle your entire user load I recommend switching your TMG persistence to source IP. All requests will hit a single FE, but the mobile clients can now maintain their session. And if an FE fails TMG will then fail over to the next server in the farm automatically. For the folks where multiple FEs are used more for capacity reasons you'll need to use something other than TMG for publishing Lync going forward.
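The ping-pong behavior and the effect of source IP persistence can be shown with a toy model. This is a sketch of the two selection strategies in general, not TMG's actual algorithm; the class and function names are hypothetical.

```python
import hashlib
from typing import List, Set


class RoundRobinFarm:
    """Hands each request to the next server in turn, regardless of who is asking."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)
        self._next = 0

    def pick(self, client_ip: str) -> str:
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server


class SourceIpFarm:
    """Maps a given client IP to the same server every time."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = list(servers)

    def pick(self, client_ip: str) -> str:
        digest = hashlib.sha256(client_ip.encode("ascii")).digest()
        return self.servers[int.from_bytes(digest[:4], "big") % len(self.servers)]


def servers_seen(farm, client_ip: str, requests: int) -> Set[str]:
    """Return the set of servers one client lands on over several requests."""
    return {farm.pick(client_ip) for _ in range(requests)}


if __name__ == "__main__":
    front_ends = ["fe1", "fe2"]
    print(servers_seen(RoundRobinFarm(front_ends), "203.0.113.10", 4))
    print(servers_seen(SourceIpFarm(front_ends), "203.0.113.10", 4))
```

With round robin a single mobile client bounces between both Front-Ends and its session never sticks. With source IP persistence every request sharing a source address stays on one server, which is why sign-in starts working.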

Lync Topology Builder error: "The directory name is invalid"

Earlier this week I had a project where we were moving the back-end Lync database from a standalone SQL to a clustered instance (pilot to production for Enterprise Voice!), but we ran into an error I haven't hit before when trying to add the new SQL store to the topology - Topology Builder was failing on the enabling topology step with the error "The directory name is invalid." As a result the databases and permissions were not even being populated in the new instance.

We checked the usual suspects (firewalls, permissions, etc.) with no luck. In the end we just used a different machine that had Topology Builder installed, and the same topology file published OK. After doing some more research it seems this is a very generic SQL error that can typically be resolved with a reboot (of the server trying to talk to the SQL instance). So if you hit this error I would suggest just using a different machine, or restarting the one you're using.