Why is Packet Loss Such a Big Deal?
One of the big worries with carrying voice or video on an IP network is packet loss. When packets don't make it to the destination, voice and video quality is compromised. But why is this a big deal when our networks have been running fine for years? Network teams are often surprised by how much packet loss there is in a normal data network. Let's look first at why this is a big deal for voice and video, but not a big deal for data applications.
March 27, 2008
There are two key differences between real-time traffic (voice and video) and normal data applications. The first difference is that real-time traffic is constant. Packets will flow from the source to the destination for the duration of the voice or video call. The data represents a continuous event, such as speech or a video image, which needs to be constantly updated to track what is going on at the source. Data traffic, on the other hand, usually has to do with moving a block of data from the source to the destination. When we click on a web page there are a handful of data objects that are fetched across the network and then displayed on our screen. Once they are transferred, the work is done until the next page is requested.
The second key difference is that we are talking about real-time interactive applications (voice and video conferencing calls). That means that the data must be transferred quickly! We want to reproduce the original event with as little delay as possible so that the people at both ends believe they are in the same room having a conversation. Because of this requirement, there is no time to recover from a lost packet.
For a data application, the primary requirement is that all the data arrive and that it be exactly correct. If a packet is lost in transit, TCP detects the loss and the sender transmits the packet again. This retransmission costs at least one round-trip delay and slows the transfer rate as well. But the net effect on the user is slight, especially on a high-speed network, and waiting a couple hundred milliseconds or even a couple of extra seconds is not a problem.
Not so for real-time traffic. The data must be played as soon as it arrives. If it arrives late, the time slot in which it should have been played has passed and the data is useless. Because real-time traffic cannot wait for retransmissions, it is carried over UDP instead of TCP. UDP does not try to recover lost packets; it just gets as many through as possible and leaves it up to the application to solve the problem of how to muddle through without all the data.
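To make the "late is as bad as lost" point concrete, here is a minimal sketch of a receiver checking each packet against its playout deadline. The 20 ms frame size, the 60 ms buffering delay, and the names Packet and playable are assumptions chosen for illustration, not values from any particular product.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int           # sequence number assigned by the sender
    arrival_ms: float  # arrival time at the receiver, measured from call start


def playable(packets, frame_ms=20.0, buffer_ms=60.0):
    """Map each sequence number to True if it arrived before its playout slot.

    Assumed model: packet n must be played at buffer_ms + n * frame_ms.
    Anything arriving after that moment is discarded, exactly as if it
    had been lost in the network.
    """
    result = {}
    for p in packets:
        deadline = buffer_ms + p.seq * frame_ms
        result[p.seq] = p.arrival_ms <= deadline
    return result


# Packet 2 was delayed in the network and misses its slot at 100 ms.
arrivals = [Packet(0, 35.0), Packet(1, 52.0), Packet(2, 190.0), Packet(3, 110.0)]
status = playable(arrivals)
print(status)
```

A larger buffer_ms would rescue more late packets, but only by adding delay to the conversation, which is the trade-off an interactive call cannot afford.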
Voice and video codecs reproduce the sound and images as best they can with the data that arrives. Some codecs have packet loss concealment algorithms that play various tricks to cover up the fact that data is missing. Some compression strategies behave better when data is missing, others behave worse. Higher levels of compression often don't behave as well when packets are missing, because each compressed packet represents more time (more missing data) and because decompression may depend on data in adjacent packets. Thus when a packet is lost, not only is its own data gone, but the next packet may be impossible to decompress without it.
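One of the simplest concealment tricks is to replay the last good frame at reduced volume whenever a frame is missing. The sketch below assumes that approach; it is not the algorithm of any specific codec, and the function name conceal, the frame length, and the decay factor are all hypothetical.

```python
def conceal(frames, frame_len=4, decay=0.5):
    """Fill lost frames (None) by repeating the last good frame, fading out.

    frames: list of sample lists, with None marking a lost frame.
    Each consecutive loss multiplies the gain by `decay`, so a long burst
    of loss fades toward silence instead of looping audibly.
    """
    out = []
    last = [0.0] * frame_len  # play silence if the very first frame is lost
    gain = 1.0
    for frame in frames:
        if frame is None:
            gain *= decay
            out.append([s * gain for s in last])
        else:
            gain = 1.0
            last = frame
            out.append(list(frame))
    return out


# Two consecutive frames are lost between two good ones.
received = [[1.0, 2.0, 3.0, 4.0], None, None, [5.0, 6.0, 7.0, 8.0]]
played = conceal(received)
print(played)
```

Tricks like this hide an isolated loss reasonably well, but they only guess at the missing audio, so quality still degrades as loss becomes frequent or bursty.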
Microsoft has recently introduced a wideband voice codec which is designed to work well on a lossy network. I don't know the internal workings of this algorithm, but it may be dispersing data across multiple packets so that the effect of losing an individual packet is reduced. Polycom has recently introduced a technology they call LPR for their videoconferencing systems, which is a form of Forward Error Correction (FEC). This approach adds redundant data to each packet sent, which can be used to reproduce nearby packets that are lost in the network. FEC can be effective for consistent levels of low loss, but will add some latency to the connection because the source has to look at a whole block of packets before sending them to be able to add the right correction bits. When loss is occurring, the receiver has to do the same thing by collecting a whole group of packets before being able to reproduce the missing few.
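The general idea behind FEC can be shown with the simplest possible scheme: one XOR parity packet per block of data packets. This is only an illustrative sketch and not a description of how LPR works internally; the function names, the block size of four, and the use of a separate parity packet (rather than correction data spread into each packet) are all assumptions, and the packets are assumed to be of equal length.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(block):
    """Sender side: XOR every packet in the block into one parity packet."""
    return reduce(xor_bytes, block)


def recover(block, parity):
    """Receiver side: rebuild a single missing packet (marked None).

    XORing the parity with every packet that did arrive leaves exactly
    the missing packet. With zero or with two or more losses the block
    is returned unchanged, since one parity packet can repair only one.
    """
    missing = [i for i, p in enumerate(block) if p is None]
    if len(missing) != 1:
        return block
    present = [p for p in block if p is not None]
    fixed = list(block)
    fixed[missing[0]] = reduce(xor_bytes, present, parity)
    return fixed


sent = [b"pkt0", b"pkt1", b"pkt2", b"pkt3"]
parity = make_parity(sent)

# Packet 1 is lost in the network; the parity packet arrives.
received = [b"pkt0", None, b"pkt2", b"pkt3"]
repaired = recover(received, parity)
print(repaired)
```

Note that the receiver cannot rebuild the missing packet until the rest of the block and the parity have arrived, which is the latency cost described above. The scheme also costs extra bandwidth (here, one additional packet for every four) whether or not any loss occurs.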
Packet loss causes a fairly rapid deterioration of quality in both voice and videoconferencing streams. Network loss of 1% is very noticeable in both voice and video unless concealment algorithms are used. Although the concealment algorithms are useful, the best approach is to deploy a clean network with the right QoS so that real-time packets are delivered reliably and quickly. I'll write some more next time about where packet loss occurs in the network and how to ferret it out.