Know Where to 'TURN' When Deploying WebRTC
Deploying successful IP communications and WebRTC solutions favors a "Relay First" design strategy.
June 6, 2016
That is how many WebRTC calls fail, Varun Singh, CEO at Callstats.io, told attendees of the WebRTC Conference-in-Conference at Enterprise Connect earlier this year. Of those failed calls, 22% required some form of media relay. The main reason behind this statistic is that many real-time communications (RTC) networks are built without accounting for network address translation (NAT) and firewall traversal, which is essential for enterprise deployment.
A Word About NAT & Firewall Traversal
NAT has long been the bane of VoIP services, since it changes the IP addresses and ports that VoIP elements need for addressing. Meanwhile, some firewalls outright block certain kinds of traffic in the interest of security. But NAT traversal and media relay products allow VoIP and WebRTC packets to pass through the majority of enterprise firewalls.
In simple terms, this means a user can connect and hear what the person at the other end is saying -- in itself a compelling reason to include NAT traversal as part of your enterprise WebRTC or UC network design strategy. Traditionally this has been done via session border controllers (SBCs), but WebRTC has forced the adoption of other technologies -- STUN, TURN, and ICE.
These technologies allow endpoints to communicate with each other, often directly, without expensive and quality-impacting devices like SBCs sitting in the path. STUN, formally Session Traversal Utilities for NAT, basically echoes a public IP address back to the endpoint. TURN, for Traversal Using Relays around NAT, acts like a lightweight media relay when a peer-to-peer connection cannot be established. ICE, for Interactive Connectivity Establishment, is a framework that combines local address, STUN, and TURN to find the best possible connection.
ICE is built into WebRTC. STUN servers are lightweight and readily available for free use. TURN servers, on the other hand, relay the media itself and so can consume a lot of bandwidth, depending on how you use them. TURN requires you to set up a separate server or use a TURN service, and such services are generally not free.
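As a rough sketch of how these three pieces meet in practice, a WebRTC application hands ICE a list of STUN and TURN servers and lets it choose among the resulting candidates. The interfaces below are modeled on the browser's RTCIceServer and RTCConfiguration dictionaries and are redeclared only so the example stands alone. The host names, ports, and credentials are placeholders, not real services.

```typescript
// Shapes modeled on the browser's RTCIceServer / RTCConfiguration
// dictionaries, redeclared so the sketch compiles outside a browser.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical hosts and credentials; substitute your own STUN/TURN service.
function buildIceConfig(turnUser: string, turnPass: string): PeerConfig {
  return {
    iceServers: [
      // STUN: echoes the endpoint's public address back; no credentials.
      { urls: "stun:stun.example.org:3478" },
      // TURN: relays media when no direct path exists; needs credentials.
      {
        urls: [
          "turn:turn.example.org:3478?transport=udp",
          "turn:turn.example.org:443?transport=tcp",
        ],
        username: turnUser,
        credential: turnPass,
      },
    ],
    // "all" lets ICE weigh local, STUN-derived, and relayed candidates.
    iceTransportPolicy: "all",
  };
}

const config = buildIceConfig("demo-user", "demo-secret");
// In a browser: const pc = new RTCPeerConnection(config);
```

Listing a TCP transport on port 443 alongside UDP is a common choice for restrictive enterprise firewalls, though whether it is needed depends on the networks your users sit behind.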
Deploying TURN for WebRTC
Now that you know NAT traversal must be part of the network design, you need to consider how to implement it for optimal efficiency in your network and to create the best user experiences. This comes down to two important aspects:
Minimizing Latency
A few guidelines help reduce the time packets take to get from point A to point B, which in turn improves call quality.
Geographic Distribution of Relay POPs
When we talk about relay, we have to be careful of a few things: where users are with respect to relay POPs, what we are using as a backhaul between the relay POPs, and what to do if the quality starts to suffer. As an example:
In this case the call needs to go all the way through Texas to make the connection, adding to latency and hurting call quality. If the TURN network had connections in Germany and New York, then the system could compare the end-to-end latencies between these various sites and choose the path with the lowest latency. Connecting through the closest site is generally, but not always, best. A geographically distributed TURN network will give more low-latency options.
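The comparison described above can be sketched as a small selection function: sum the latency of each leg of a candidate relay path and pick the lowest total. The `RelayPath` shape, the function names, and every number below are illustrative assumptions, not measurements from a real network.

```typescript
// One candidate relay path: the caller's hop to its POP, the backhaul between
// POPs (0 if a single POP serves both ends), and the hop to the callee.
// All values are one-way latencies in milliseconds.
interface RelayPath {
  name: string;
  callerToPopMs: number;
  backhaulMs: number;
  popToCalleeMs: number;
}

function endToEndMs(p: RelayPath): number {
  return p.callerToPopMs + p.backhaulMs + p.popToCalleeMs;
}

// Returns the path with the lowest total latency, or undefined if none given.
function pickLowestLatency(paths: RelayPath[]): RelayPath | undefined {
  let best: RelayPath | undefined;
  for (const p of paths) {
    if (best === undefined || endToEndMs(p) < endToEndMs(best)) {
      best = p;
    }
  }
  return best;
}

// Made-up numbers for illustration only.
const candidates: RelayPath[] = [
  { name: "single POP in Texas", callerToPopMs: 70, backhaulMs: 0, popToCalleeMs: 75 },
  { name: "Germany POP to New York POP", callerToPopMs: 10, backhaulMs: 45, popToCalleeMs: 8 },
];
const chosen = pickLowestLatency(candidates);
```

Because the backhaul leg counts toward the total, a path through the closest POP can still lose to another path with a faster link between sites, which is why the comparison is made end to end.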
Relay Backhaul
The next question is, "How is this relayed traffic going to be routed to the other relay POPs and then to the remote user destination?" In a word, the answer is "backhaul." We need a fast, resilient and, preferably, private network that ties all of these relay POPs together.
Using the open Internet as a backhaul will likely not cut it for enterprise use cases, but it may suffice for smaller, less-critical deployments.
QoS & Monitoring
We can take this one step further with a bit of machine learning to help optimize the path the streams take in real time. While running that call over our relay, we know the benchmark for call latency is something like 200 milliseconds. If we see the call rise above an upper limit -- e.g., the delay on the call is above 500ms -- we can flip to a faster relay to ensure we maintain a quality of service that is acceptable to our users.
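A minimal sketch of the monitoring decision follows. It uses a fixed-threshold rule over a short rolling average rather than machine learning, and it assumes 200ms is the target and 500ms the point at which a switch is triggered. The function names, the three-way result, and the five-sample window are all assumptions made for illustration.

```typescript
const TARGET_MS = 200;      // latency benchmark mentioned in the text
const UPPER_LIMIT_MS = 500; // example upper limit mentioned in the text

// Average of the most recent `window` samples, so one spike does not
// trigger a relay change on its own.
function recentAverage(samples: number[], window: number): number {
  const recent = samples.slice(-window);
  if (recent.length === 0) return 0;
  return recent.reduce((sum, v) => sum + v, 0) / recent.length;
}

type RelayAction = "stay" | "watch" | "switch";

// Simple threshold rule: stay under the target, watch between the two
// limits, switch relays once the average exceeds the upper limit.
function decide(samples: number[], window: number = 5): RelayAction {
  const avg = recentAverage(samples, window);
  if (avg > UPPER_LIMIT_MS) return "switch";
  if (avg > TARGET_MS) return "watch";
  return "stay";
}
```

A learned model could replace `decide` later without changing how latency samples are collected.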
Call Setup Times
Implementing TURN will certainly help you in reducing call failures, incomplete call setups, and half-duplex audio, but what about call setup times? Long call setup times can lead to a significant reduction in user satisfaction due to lack of confidence in your service. If a call takes more than a couple of seconds to set up, you are headed for long queues at your help desk, or worse -- users will just stop using your service and you will never know why.
The ICE process can take a lot of time. Fortunately, WebRTC includes technologies like Trickle ICE to let the call start before the whole ICE process finishes. Still, even starting the ICE process takes some time, and other tricks can make setup speedier still.
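To make the Trickle ICE idea concrete, the sketch below forwards each candidate to the remote side as soon as it is discovered instead of waiting for gathering to finish. The interfaces are simplified stand-ins modeled on the browser's `onicecandidate` event, where a null candidate marks the end of gathering. The `send` callback and the message format are hypothetical, since WebRTC leaves signaling up to the application.

```typescript
// Simplified stand-ins for the browser's ICE candidate event and peer object.
interface CandidateEvent {
  candidate: { candidate: string } | null; // null marks end of gathering
}

interface TricklePeer {
  onicecandidate: ((ev: CandidateEvent) => void) | null;
}

// `send` is a hypothetical signaling function (WebSocket, HTTP, etc.).
function trickleCandidates(pc: TricklePeer, send: (msg: string) => void): void {
  pc.onicecandidate = (ev: CandidateEvent) => {
    if (ev.candidate !== null) {
      // Forward each candidate immediately rather than batching them all.
      send(JSON.stringify({ type: "candidate", candidate: ev.candidate.candidate }));
    } else {
      send(JSON.stringify({ type: "end-of-candidates" }));
    }
  };
}
```

In a browser, equivalent handler logic would be attached to an RTCPeerConnection, with the remote side adding each received candidate as it arrives.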
TURN First
TURN First, or Relay First, is quickly becoming the de facto network design approach for reducing long call setup times so your users won't choke on them. Put simply, here is how it works:
Now we have set up our call as soon as possible and we don't waste any time looking for the best route in the first few precious seconds of the call.
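One way to approximate a relay-first start in browser WebRTC, offered as a sketch and not necessarily the exact procedure meant here, is to begin with the ICE transport policy set to "relay" and widen it to "all" once media is flowing. The `setConfiguration` and `restartIce` methods exist on RTCPeerConnection; the interfaces below redeclare only what the sketch needs, and the helper names are hypothetical.

```typescript
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

// The two RTCPeerConnection methods this sketch relies on.
interface ReconfigurablePeer {
  setConfiguration(config: PeerConfig): void;
  restartIce(): void;
}

// Step 1: offer relay candidates only, so the call connects through TURN
// without spending the first seconds probing for a direct path.
function relayFirstConfig(iceServers: IceServer[]): PeerConfig {
  return { iceServers, iceTransportPolicy: "relay" };
}

// Step 2: once the call is up, allow every candidate type and restart ICE
// so a better (possibly direct) path can be found in the background.
function widenAfterConnect(pc: ReconfigurablePeer, iceServers: IceServer[]): void {
  pc.setConfiguration({ iceServers, iceTransportPolicy: "all" });
  pc.restartIce();
}
```

In a browser, `widenAfterConnect` would typically be called once the connection state reaches "connected". An ICE restart requires a fresh offer/answer exchange, so the application's signaling still has to carry that renegotiation.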
Summary
When designing your WebRTC service, you need to take a few key considerations into account. TURN servers should be close to your users to minimize latency. If you have a globally distributed user base, then you should have a globally distributed TURN network. Look for reliable, fast backhaul between the TURN sites. Lastly, Relay/TURN First is a must when considering NAT traversal in your network design, and should be something you look for when considering a TURN solution.