The What and Why of Jitter

The three key network specifications for real-time (voice and video) traffic are packet loss, jitter and latency. Whenever I talk to folks about these, they understand the packet loss issue and they worry about the latency issue, but there is often little discussion about the jitter. So I wanted to tackle the jitter topic here and lay out both why jitter is a problem and some of the issues around how we measure it in the network.

John Bartlett

April 25, 2008

First let us define jitter. To define jitter we need to back up and understand latency. Latency is the time it takes a packet to traverse the network from source to destination. On a path with symmetric delay, one-way latency is roughly half of the 'ping' time, that is, half of the round-trip delay. Jitter is the variation in latency. For example, suppose that the average latency from source to destination is 100 ms. If a specific packet, packet A, traverses the network in exactly 100 ms, it arrives just as expected and has a jitter of 0. If packet B is delayed slightly and arrives 130 ms after it left, it is 30 ms later than expected and has a jitter of 30 ms.
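The packet A and packet B arithmetic can be written out as a minimal sketch. The function name `jitter_ms` and its parameters are made up for illustration, and the 100 ms expected latency is simply the figure from the example above.

```python
def jitter_ms(latency_ms, expected_ms=100.0):
    """Return how far one packet's latency deviates from the expected latency.

    A positive result means the packet arrived later than expected.
    """
    return latency_ms - expected_ms


# Latencies from the example: packet A takes 100 ms, packet B takes 130 ms.
for name, latency in {"A": 100.0, "B": 130.0}.items():
    print(f"packet {name}: jitter = {jitter_ms(latency)} ms")
```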

When we send real-time traffic across the network we are trying to encode and then decode a real-time event such as sound or a visual image. The sound and the visual image change constantly, so we have to continually take samples, encode them, send them across the network, decode them and reproduce the sound and images on the far end. The receiving end is expecting a continuous stream of data and needs that data to arrive at regular intervals so it can properly recreate the original audio or video. If a packet is late, the time slot in which that data was needed has gone by and the arriving packet is of no use.

Because we know that IP networks are asynchronous and can delay packets, we implement a jitter buffer on the receiving end. Let's consider a 40 ms jitter buffer. The jitter buffer predicts the expected time of arrival for each packet, but then delays the playing of those packets by 40 ms. So the real-time event at the far end is being recreated 40 ms later than it could otherwise have been recreated. The value of this is that if a packet arrives less than 40 ms late, it can still be slotted into its place in the jitter buffer and be available for its play window even though it arrived late. This is like your colleague telling you the train leaves a half hour earlier than it really does because he knows you often arrive late. The train actually leaves a half hour later than the schedule you were given, and that half hour is the jitter buffer.
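The play-window logic of a fixed jitter buffer can be sketched as follows. This is an illustration under stated assumptions, not how any particular endpoint implements it: the function name `playout_decisions` and its parameters are hypothetical, the expected arrival time of each packet is assumed to be known already, and a packet arriving exactly at the end of its window is treated as playable.

```python
def playout_decisions(arrivals_ms, expected_ms, buffer_ms=40.0):
    """Decide, packet by packet, whether a fixed jitter buffer can still play it.

    arrivals_ms: actual arrival time of each packet
    expected_ms: predicted arrival time of each packet (assumed known)
    buffer_ms:   playout delay added by the jitter buffer
    """
    decisions = []
    for arrival, expected in zip(arrivals_ms, expected_ms):
        play_at = expected + buffer_ms  # playout is deliberately delayed
        decisions.append("play" if arrival <= play_at else "discard")
    return decisions


# Four packets expected 20 ms apart; the second is 30 ms late, the third 45 ms late.
expected = [100, 120, 140, 160]
arrivals = [100, 150, 185, 160]
print(playout_decisions(arrivals, expected))
```

With a 40 ms buffer, the packet that is 30 ms late still makes its play window, while the one that is 45 ms late is discarded.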

So a 40 ms jitter buffer will take care of network jitter up to 40 ms. If packets are later than that, then again their play window has gone by and they are discarded by the jitter buffer. So why don't we just make the jitter buffer arbitrarily long to allow for any amount of jitter in the network? Remember that the jitter buffer adds delay. As we delay the recreation of the voice or video image, we reduce the ability of participants to interact easily. When there is a delay on the connection we find ourselves stepping on each other's speech. This effect is disconcerting. It can make you wonder if the other party is listening, and it can make a back-and-forth discussion very difficult. So we limit the size of the jitter buffer. This means we need to ensure that the network keeps packet jitter within the range that the jitter buffer can handle.

Jitter is often measured using the interarrival jitter calculation defined in RFC 1889 (the RTP specification, since superseded by RFC 3550). It keeps a running average of the difference in transit time between consecutive packets, smoothed by a factor of 16. This is the jitter that is reported in the RTCP packets that accompany each RTP stream of a voice or video call. Unfortunately, comparing the RFC 1889 jitter values to the size of the jitter buffer does not give us direct information about how jitter is affecting the quality of the voice or video stream. What we really want to know is how many packets were dropped by the jitter buffer. Voice and video quality is impacted when data is not available for reproducing the original event. So in the same way that voice and video are impacted by lost packets, they are impacted by packets whose jitter exceeds the jitter buffer and which are therefore dropped before reaching the codec. They arrived too late to be useful, so they have the same effect as if they had never arrived at all.
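The RTP interarrival jitter estimator updates a running value J with each packet as J = J + (|D| - J)/16, where D is the change in transit time relative to the previous packet. Here is a minimal sketch of that update. The function name and the use of milliseconds are choices made for this illustration (the RFC works in RTP timestamp units), and it assumes send and receive times are available for every packet.

```python
def interarrival_jitter(send_ms, recv_ms):
    """Running jitter estimate in the style of the RTP specification.

    For each packet after the first, D is the change in transit time
    compared with the previous packet, and the estimate moves 1/16 of
    the way toward |D|.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_ms, recv_ms):
        transit = received - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


# Two packets with 100 ms transit, then one that is 30 ms late.
print(interarrival_jitter([0, 20, 40], [100, 120, 170]))
```

In this small example a packet that is 30 ms late moves the reported value to only 1.875 ms, which illustrates why the smoothed figure cannot be compared directly with the size of the jitter buffer.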

Some of the test tools I use will report max jitter. This is somewhat useful because it tells me that jitter reached the max value at least once during the test interval. If I can create short enough test intervals (e.g. 10 seconds) I can get some idea of how often the network is creating jitter problems.
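One way to picture the short-interval approach is to bucket per-packet jitter readings into 10-second windows and keep the maximum of each. This is only a sketch: the function name, the `(time_s, jitter_ms)` sample format, and the sample values are all invented for illustration and do not come from any particular test tool.

```python
def max_jitter_per_interval(samples, interval_s=10.0):
    """Group (time_s, jitter_ms) samples into fixed intervals.

    Returns a dict mapping interval index to the largest jitter seen
    in that interval. Intervals with no samples are simply absent.
    """
    maxima = {}
    for time_s, jitter in samples:
        bucket = int(time_s // interval_s)
        maxima[bucket] = max(maxima.get(bucket, jitter), jitter)
    return maxima


# Made-up readings spread over 30 seconds.
samples = [(0.5, 5), (3.0, 12), (9.9, 8), (10.0, 45), (15.2, 3), (27.0, 20)]
maxima = max_jitter_per_interval(samples)
print(maxima)

# Count the intervals whose maximum exceeded a 40 ms jitter buffer.
print(sum(1 for peak in maxima.values() if peak > 40))
```

Here only the second interval shows a maximum above 40 ms, so one interval in three had a jitter problem.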

A better measure (IMHO) would be to know directly how many packets the receiving codec dropped due to jitter. When I study a sniffer trace I emulate a 40 ms jitter buffer and then generate a value for jitter loss, which is the percentage of packets dropped because their jitter exceeded my simulated jitter buffer. This value can then be evaluated the same way as packet loss, since its effect is the same.
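A jitter loss figure of this kind could be computed along the following lines. This is a sketch under assumptions and not the exact procedure used on the sniffer traces: the function name is hypothetical, send and receive timestamps are assumed to be available for every packet, and the expected latency is taken to be the mean transit time of the trace. A constant clock offset between sender and receiver cancels out, because only deviations from the mean are used.

```python
def jitter_loss_percent(send_ms, recv_ms, buffer_ms=40.0):
    """Percentage of packets a simulated fixed jitter buffer would drop.

    A packet counts as dropped when it arrives more than buffer_ms later
    than the mean transit time of the trace (an assumed baseline).
    """
    transits = [received - sent for sent, received in zip(send_ms, recv_ms)]
    if not transits:
        return 0.0
    expected = sum(transits) / len(transits)
    dropped = sum(1 for transit in transits if transit - expected > buffer_ms)
    return 100.0 * dropped / len(transits)


# Ten packets sent 20 ms apart; nine take 100 ms, one takes 200 ms.
send = [i * 20 for i in range(10)]
recv = [s + 100 for s in send]
recv[6] = send[6] + 200
print(jitter_loss_percent(send, recv))
```

The mean transit time in this example is 110 ms, so the one slow packet is 90 ms late, exceeds the 40 ms buffer, and gives a jitter loss of 10 percent.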

We'll have to ask Eric Krapf why he titled this site "No Jitter". It's a noble goal, but for me keeping jitter within the bounds of the jitter buffer will suffice.

About the Author

John Bartlett

John Bartlett is a principal with Bartlett Consulting LLC, where he provides technical, financial, and management leadership for creation or transition of Unified Collaboration (UC) solutions for large enterprises. John discovers the challenges in each enterprise, bringing disparate company teams together to find and execute the best strategy using Agile-based methodology to support quick wins and rapid, flexible change. John offers deep technical support both in collaboration solutions and IP network design for real-time traffic with global enterprises world-wide.

 

John served for 8 years as a Sr. Director in Business Development for Professional & Managed Services at Polycom. In this role he delivered, defined and created collaboration services and worked with enterprises to help them shorten time-to-value, increase the quality and efficiency of their UC collaboration delivery and increase their collaboration ROI.

 

Before joining Polycom, John worked as an independent consultant for 15 years, assessing customer networks for support of video applications and other application performance issues. John engaged with many enterprises and vendors to analyze network performance problems, design network solutions, and support network deployments.

 

John has 37 years of experience in the semiconductor, computer and communications fields in marketing, sales, engineering, manufacturing and consulting roles. He has contributed to microprocessor, computer and network equipment design for over 40 products.