Solving the Pain of VoIP Quality

Voice quality over IP networks is never constantly acceptable. As the network and traffic change, so do the conditions that VoIP calls will encounter on the IP network. There will not only be good and bad days, there will be good and bad minutes. Listeners describe the voice quality problems using words like garbled, choppy, robotic, muffled, clipped, echoes, hissing and static. Speaker recognition may not be possible.

Gary Audin

January 23, 2008


VoIP quality is transient and ever changing. A voice quality problem can be so short-lived that by the time a troubleshooter attempts to diagnose it, network conditions have changed. As enterprises move to Unified Communications with conferencing traffic, especially video, the capacity and performance demands on the IP network will increase.

I had the opportunity to host a webcast, "Solving the Pain of VoIP Quality," with Psytechnics. The thrust of the webcast is "what you don't measure, you can't manage." During the webcast, we went beyond the traditional approaches to voice quality issues as viewed by network troubleshooters. What makes this presentation different is that it shows why measurements of the data side of a voice call are insufficient for determining what the voice quality problem is in human terms.

An early standard that attempts to measure voice quality is the RTP Control Protocol (RTCP), which reports on data network impairments. The impairment measurements are then used to calculate a prediction of the impact of the IP network on the voice Mean Opinion Score (MOS). MOS is the average of the opinions given by a group of people (subjects) for a given sample of voice quality in a subjective test. The subjects typically give their opinions against the ITU-T P.800 opinion scale: excellent (5), good (4), fair (3), poor (2), bad (1). A MOS prediction is then a numeric measure of voice quality, where 5 = perfect, 4.4 = toll quality and 3.5 = marginally acceptable voice quality. The RTCP measurements include:

  • RTP time stamp

  • Packet loss

  • Jitter

  • Delay

  • Sequence number

RTCP is a good start, but does not provide nearly enough information to really determine the call quality from a human perspective.
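
To make the network-side fields above concrete, here is a minimal sketch of how a receiver could turn sequence numbers and timestamps into the loss and jitter figures that RTCP reports. The jitter update follows the running estimate described in RFC 3550; the `RtpPacket` class, its `arrival` field, and the simplified loss calculation (no sequence number wraparound handling) are assumptions made for illustration, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class RtpPacket:
    """Hypothetical record of one received RTP packet."""
    seq: int            # RTP sequence number
    rtp_timestamp: int  # RTP timestamp, in sampling-clock units
    arrival: int        # local arrival time, in the same clock units (assumed)


def interarrival_jitter(packets):
    """Running jitter estimate in the style of RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    prev = None
    for p in packets:
        if prev is not None:
            # D is the change in transit time between two consecutive packets.
            d = (p.arrival - prev.arrival) - (p.rtp_timestamp - prev.rtp_timestamp)
            jitter += (abs(d) - jitter) / 16.0
        prev = p
    return jitter


def loss_fraction(packets):
    """Share of expected packets never received (ignores sequence wraparound)."""
    seqs = {p.seq for p in packets}
    expected = max(seqs) - min(seqs) + 1
    return (expected - len(seqs)) / expected


# Example: 8 kHz clock, 20 ms packets (160 clock units), packet 3 lost,
# packet 4 arriving 20 units late.
received = [
    RtpPacket(seq=1, rtp_timestamp=0, arrival=0),
    RtpPacket(seq=2, rtp_timestamp=160, arrival=160),
    RtpPacket(seq=4, rtp_timestamp=480, arrival=500),
    RtpPacket(seq=5, rtp_timestamp=640, arrival=640),
]
print(interarrival_jitter(received), loss_fraction(received))
```

Note that both numbers describe packets, not speech. Nothing in them says whether the listener heard the call as choppy, robotic or clipped.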

A newer standard, RTCP XR (extended reporting) adds several more measured elements:

  • Packet loss and discard rates

  • Burst length and density

  • Gap length and density

  • Packet path, end system and round trip delays

  • Signal, noise and echo levels

  • Jitter buffer configuration

  • Packet Loss Concealment (PLC) type

This is an improvement: the added metrics produce a better estimate of the IP network's impact on MOS and give more information about the factor(s) causing a voice quality issue. RTCP XR also provides fields for edge devices to report a locally calculated MOS, in which case it is better to rely on a standardized measure than on a proprietary approach. Still, RTCP XR is not the human ear. No listener is going to describe a voice quality complaint in terms of most of these measurements. There have also been cases where a vendor says it uses RTCP XR but does not include all of the fields.
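
As an illustration of what a standardized MOS estimate can look like, the sketch below converts an R factor into an estimated MOS using the mapping published with the ITU-T G.107 E-model. The article does not say which calculation any given device uses, so treat the choice of the E-model here as an assumption. The function names and quality labels are hypothetical; the 4.4 and 3.5 thresholds are the ones quoted earlier in this article.

```python
def r_to_mos(r):
    """Map an E-model R factor to an estimated MOS (ITU-T G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
    # The cubic dips slightly below 1 for very small R, so clamp the result.
    return max(1.0, mos)


def describe(mos):
    """Hypothetical labels based on the thresholds quoted in the article."""
    if mos >= 4.4:
        return "toll quality"
    if mos >= 3.5:
        return "marginally acceptable or better"
    return "below marginally acceptable"


for r in (93.2, 70, 50):
    mos = r_to_mos(r)
    print(f"R={r}: MOS ~ {mos:.2f} ({describe(mos)})")
```

Because the mapping is published, two vendors reporting the same R factor should arrive at the same MOS estimate, which is the practical argument for a standardized measure over a proprietary one.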

Here are other factors that will influence voice quality that are not part of the RTCP XR measurements:

  • CODEC type (G.711, G.729, G.722)

  • Packet size (20ms, 30ms, 40ms)

  • Silence suppression (VAD) competition

  • Clipping during silence suppression

  • Link utilization and low speed transmission

  • Adaptive jitter buffer operation

  • Competing with data traffic
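
Several of the factors above interact through simple arithmetic: the codec bit rate and the packet size together determine how much link capacity a call consumes. The sketch below is a rough calculation under stated assumptions: nominal codec bit rates (64 kbps for G.711 and G.722, 8 kbps for G.729), 40 bytes of IPv4, UDP and RTP headers per packet, and no link-layer overhead, header compression or silence suppression. The function and constant names are illustrative.

```python
# IPv4 (20) + UDP (8) + RTP (12) header bytes; link-layer overhead is ignored.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12

# Nominal codec bit rates in bits per second (assumed for this sketch).
CODEC_BITRATE_BPS = {"G.711": 64_000, "G.729": 8_000, "G.722": 64_000}


def payload_bytes(codec, packet_ms):
    """Voice payload carried in one packet of packet_ms milliseconds."""
    return CODEC_BITRATE_BPS[codec] * packet_ms // 8000


def ip_bandwidth_bps(codec, packet_ms):
    """One-way IP-layer bandwidth for a single call, in bits per second."""
    packets_per_second = 1000 / packet_ms
    bytes_per_packet = payload_bytes(codec, packet_ms) + IP_UDP_RTP_HEADER_BYTES
    return bytes_per_packet * 8 * packets_per_second


for codec in ("G.711", "G.729"):
    for packet_ms in (20, 30, 40):
        kbps = ip_bandwidth_bps(codec, packet_ms) / 1000
        print(f"{codec} at {packet_ms} ms: {kbps:.1f} kbps")
```

Larger packets reduce header overhead, but each lost packet then removes more speech, which is one reason packet size matters to the listener even though it never appears in an RTCP XR report.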

Further factors that directly determine voice quality are not caused by IP transport, and so go entirely unmeasured by IP network measurements. They include:

  • Echo (hybrid and acoustic)

  • Speech levels (volume)

  • Noise

  • Cross-talk

  • Speech distortion

The ITU has a group of standards for both the network side and the analog (listener) side of the voice call as shown in the following figure.

Figure legend: NB = narrowband, the classic voice bandwidth of about 3.4 kHz; WB = wideband, about 7 kHz bandwidth; PESQ = Perceptual Evaluation of Speech Quality.

These ITU standards, which were initially developed for service providers, are better than the RTCP XR standard for reporting voice quality. Using them makes it easier to compare voice quality experiences across many networks and VoIP products. Some vendors use non-standard approaches, but these make comparing call quality more difficult because they calculate MOS using different algorithms. Standard methods allow VoIP vendors and service providers to offer universal quantitative comparisons.

About the Author

Gary Audin

Gary Audin is the President of Delphi, Inc. He has more than 40 years of computer, communications and security experience. He has planned, designed, specified, implemented and operated data, LAN and telephone networks. These have included local area, national and international networks as well as VoIP and IP convergent networks in the U.S., Canada, Europe, Australia, Asia and the Caribbean. He has advised domestic and international venture capitalists and investment bankers on communications, VoIP, and microprocessor technologies.

For 30+ years, Gary has been an independent communications and security consultant. Beginning his career in the USAF as an R&D officer in military intelligence and data communications, Gary was decorated for his accomplishments in these areas.

Mr. Audin has been published extensively in the Business Communications Review, ACUTA Journal, Computer Weekly, Telecom Reseller, Data Communications Magazine, Infosystems, Computerworld, Computer Business News, Auerbach Publications and other magazines. He has been a keynote speaker at many user conferences and delivered many webcasts on VoIP and IP communications technologies from 2004 through 2009. He is a founder of the ANSI X.9 committee, a senior member of the IEEE, and is on the steering committee for the VoiceCon conference. Most of his articles can be found on www.webtorials.com and www.acuta.org. In addition to www.nojitter.com, he publishes technical tips at www.Searchvoip.com.