
Isn’t It Time HD Voice Was Standard?

Forget meaningless incremental feature improvements. We need to make high-definition voice available everywhere.

Michael Finneran

May 13, 2019


For my money, the single biggest advance we’ve seen from VoIP and unified communications is better-quality voice provided through high-definition (HD) or wideband voice. I realize the UC guys also provided convenience features like one click to call/video/text/email, but since they copied most of those ideas from smartphones they don’t score too high on the innovation scale.

 

HD voice, on the other hand, is pure pleasure. Whether on a phone call or in an audio or a video conference, the ability to hear someone clearly changes the whole communications experience -- we’re in the communications business, after all. I spend an inordinate amount of time listening to product pitches for all manner of capabilities, and to my amazement, I hear almost no one talking about liberating us from the cramped, noisy drudgery of listening to someone drone on over a miserable 3-kilohertz phone connection.

 

I expect that most of us in the field can now recognize HD audio as soon as we hear it. However, even the most audio-inept can recognize that one schlub on the conference call who dialed in on the phone. I don’t care what he has to say, he’s too hard to listen to!

 

For the uninitiated, HD voice describes an audio connection (telephone call, audio conference connection, or audio accompanying video) using a codec that captures around 7 kHz of the speaker’s audio signal as opposed to the 3-kHz codecs that were baked into the design of the traditional telephone network. That increased bandwidth (that is “bandwidth” in its original meaning: “the range of frequencies carried on the channel”) translates into vastly improved sound quality.
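To put rough numbers on that difference, here is a minimal Python sketch. The band edges are commonly cited nominal figures (roughly 300 to 3,400 Hz for narrowband telephony and roughly 50 to 7,000 Hz for wideband); they are illustrative assumptions, not values taken from any one codec specification. The sketch applies the Nyquist criterion, which says a signal has to be sampled at no less than twice its highest frequency.

```python
# Nominal audio bands in Hz. These band edges are commonly cited figures,
# used here as illustrative assumptions.
BANDS = {
    "narrowband": (300, 3400),
    "wideband (HD)": (50, 7000),
}


def audio_bandwidth(low_hz: float, high_hz: float) -> float:
    """Width of the audio band, i.e. 'bandwidth' in its original sense."""
    return high_hz - low_hz


def min_sample_rate(high_hz: float) -> float:
    """Nyquist criterion: sample at least twice the highest frequency."""
    return 2 * high_hz


for name, (low, high) in BANDS.items():
    print(
        f"{name}: {audio_bandwidth(low, high)} Hz of audio, "
        f"minimum sample rate {min_sample_rate(high)} Hz"
    )
```

In practice the sample rates are rounded up to leave room for filtering: narrowband codecs sample at 8 kHz, and wideband codecs such as G.722 and AMR-WB sample at 16 kHz.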

 

HD means that communication, particularly with the hearing impaired or among speakers of different native languages, is far easier to understand. If you’re looking for business impact, you don’t need a major research project to tell you that better understanding will translate into more productive communications and a vastly improved user experience.

 

I’m a musician, so maybe I’m a little more sensitive to sound quality than the average Joe. By the same token, I can never see any significant difference among computer displays, so don’t ask me about those. Audio is my thing, and frankly I can’t believe people don’t hear differences that are this pronounced. If they can’t, there really is a bright future in audiology.

 

The technology to deliver on this goal is readily available, but what’s lacking is vision, will, and industry cooperation to get it accomplished. To break down the problem, the essential components for an HD voice system are:

 

  • HD-compatible audio components in the endpoints -- Speakers and microphones in the endpoints have to be able to capture the higher frequencies, and IP desk and conference phone vendors have all brought their components up to snuff.

  • Compatible HD codecs at each end -- It would be great if all industry segments adopted the same HD coding format, but we don’t have the Bell System any longer so that’s unlikely to happen. However, for decades we’ve had digital transcoders that convert between various digital voice codings, so I can’t imagine that is a major impediment.

  • A signaling system that allows the codec, bit rate, and QoS requirement for the connection to be negotiated as part of the call set-up -- SIP provides all of that in the Session Description Protocol, so the failure to deploy it is clearly one of execution on the part of the carriers and endpoint providers.

  • A flexible (as opposed to one-size-fits-all) transport service that provides a connection with the required transmission rate and QoS parameters for the particular service being delivered -- for intracompany, intersite communications, we initially looked at MPLS and like services that could prioritize certain classes of service to ensure performance. As the Internet continues to develop, we’re clearly finding that even basic “best effort” Internet services (like we now routinely use to support our remote users) can support good-quality voice through brute force, or simply by increasing the network capacity to the point where any differences in delay or jitter become insignificant. We also refer to that solution as: “Bury your problem in bits.”


We’ve already seen that delivering end-to-end HD voice on a closed, proprietary basis is a gimme. Virtually all of the current UC&C platforms support HD audio for all intrasystem and intranetwork audio and video calls and conferences. That would be swell if we were all in the same building, but then we’d just book a conference room! Ninety-nine times out of 100, at least one participant is going to be remote. If we’ve got IP connectivity to that remote party, we’re golden. If we don’t, however, we’ll be dealing with one bad 3-kHz leg on this conference, and the persistent noise from that one leg will remind us all it’s still there!

 

That’s where the core challenge becomes apparent. To make this vision a reality, the public network carriers and the endpoint suppliers would actually have to cooperate. Unfortunately, having lived through the aggravation of getting carriers and equipment suppliers to get SIP to just set up and tear down a simple phone call, I’ve come to question the level of technical competence that our industry can achieve.

 

The most amazing derelicts in the move to HD are webinar providers. Yeah sure, they all have the ability to support HD voice over IP audio. But why aren’t they telling their customers that, despite all of this swell technology, if the speaker connects on a phone, quality will suck -- and all the worse if that presenter is using a cellphone? (For the technically literate, best-case mean opinion score on a cellphone connection is about 4.0.) Webinar providers should be insisting that all speakers are on HD audio.

 

Let’s not forget the cellular dimension. I’ve been dragged into dealing with this problem of late, as the cellular carriers are beginning to roll out HD voice over LTE. The AMR Wideband (AMR-WB) codecs they’re using get the voice down to the 12- or 13-Kbps range -- pretty good when you recall that the pulse-code-modulation voice coding we used in the legacy telephone network (and with PBXs) carried standard definition 3-kHz voice in 64 Kbps.
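The arithmetic behind that comparison is easy to check. The sketch below assumes the standard PCM parameters (8,000 samples per second at 8 bits per sample) and picks 12.65 Kbps, one of the defined AMR-WB modes, as a representative rate in the range mentioned above; that choice is an assumption for illustration. Both figures are codec payload rates and exclude RTP/UDP/IP packet overhead.

```python
PCM_SAMPLE_RATE_HZ = 8000   # narrowband PCM sampling rate
PCM_BITS_PER_SAMPLE = 8     # bits per companded PCM sample
AMR_WB_KBPS = 12.65         # assumption: one AMR-WB mode in the cited range


def pcm_kbps(sample_rate_hz: int, bits_per_sample: int) -> float:
    """Bit rate of uncompressed PCM voice in kilobits per second."""
    return sample_rate_hz * bits_per_sample / 1000


pcm = pcm_kbps(PCM_SAMPLE_RATE_HZ, PCM_BITS_PER_SAMPLE)
ratio = pcm / AMR_WB_KBPS

print(f"PCM: {pcm} Kbps for narrowband voice")
print(f"AMR-WB: {AMR_WB_KBPS} Kbps for wideband voice")
print(f"PCM uses about {ratio:.1f}x the bits for less than half the audio band")
```

In other words, the wideband cellular codec carries roughly twice the audio bandwidth in about one-fifth of the bits.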

 

When I first heard about wideband cellular voice, my question was, “Is that going to be compatible with the HD voice capability on my UC&C platform?” Not surprisingly, the answer was a resounding “No” -- accompanied by astonishment that anyone would ever actually think that something like that would be within the realm of possibility. Some went so far as to half-promise they might support HD connections to subscribers on other cellular carrier networks, but even that seemed to be a stretch for them.

 

So, whether we’re talking about basic telephony, UC&C, audio/video conferencing services, or cellular, we’re still dealing with service interconnection using the least common denominator, 3-kHz voice. That level of performance is 20 years behind where we should be, and no one seems to be taking the lead on doing something better.

 

To bring the point home, the Internet is pushing the boundaries of audio technology, and the “phone business” is still trying to sell 1960s-era vintage transistor radios. Is this really the best we can do?

 

Conclusion

One of my favorite axioms has always been: “The hallmark of an important new technology is that it solves a problem you didn’t know you had.” I can’t find a definitive source for it, and I can’t say that I’m overly optimistic such a leap of imagination will emerge in our space. Clearly, users aren’t going to rise up in revolt demanding we give them HD voice, so an initiative to move this forward would have to come from within. That means this movement would have to emerge from a plodding, bureaucratic process that starts with, “Let’s have a meeting about that.”

 

This is an initiative we as an industry would have to get out in front of. The reality is that most users make a judgment about “acceptable quality” by weighing a number of factors. The classic example of that is cellular. The voice quality in cellular peaks at “barely acceptable,” but people take the good with the bad, and the ability to make mobile phone calls far outweighs their disappointment with the sound quality. Remember, when cellphones first showed up, the alternative was yelling “breaker 1-9” into your CB radio.

 

The push for HD may have to come from the margins. We’re faced with an aging population, and with that, hearing impairments multiply. At some point, going to see your mom is more efficient than trying to talk to her over the phone. Also, with international businesses, we’re often dealing with people who speak different native languages. In either case, we’re already struggling to understand one another -- a crappy phone call isn’t making that any easier! It’s in these challenging environments where HD voice will pay the biggest benefits.

 

One interesting side note: we’re finding that voice recognition and natural language processing systems have a much easier time with a clearer voice signal. So whether you’re talking to your mom or to Alexa, it might be time we start focusing on providing a better voice connection.

 

Finneran is writing as a member of BCStrategies, an industry resource for enterprises, vendors, system integrators, and anyone interested in the growing business communications arena. A supplier of objective information on business communications, BCStrategies is supported by an alliance of leading communication industry advisors, analysts, and consultants who have worked in the various segments of the dynamic business communications market.

About the Author

Michael Finneran

Michael F. Finneran is Principal at dBrn Associates, Inc., a full-service advisory firm specializing in wireless and mobility. With over 40 years of experience in networking, Mr. Finneran has become a recognized expert in the field and has assisted clients in a wide range of project assignments spanning service selection, product research, policy development, purchase analysis, and security/technology assessment. The practice spans both an industry analyst role with vendors and a consulting role with end users, a combination that provides an in-depth perspective on the industry.

His expertise spans the full range of wireless technologies including Wi-Fi, 3G/4G/5G Cellular and IoT network services as well as fixed wireless, satellite, RFID and Land Mobile Radio (LMR)/first responder communications. Along with a deep understanding of the technical challenges, he also assists clients with the business aspects of mobility including mobile security, policy and vendor comparisons. Michael has provided assistance to carriers, equipment manufacturers, investment firms, and end users in a variety of industry and government verticals. He recently led the technical evaluation for one of the largest cellular contracts in the U.S.

As a byproduct of his consulting assignments, Michael has become a fixture within the industry. He has appeared at hundreds of trade shows and industry conferences, and helps plan the Mobility sessions at Enterprise Connect. Since his first piece in 1980, he has published over 1,000 articles in No Jitter, BCStrategies, InformationWeek, Computerworld, Channel Partners and Business Communications Review, the print predecessor to No Jitter.

Mr. Finneran has conducted over 2,000 seminars on networking topics in the U.S. and around the world, and was an Adjunct Professor in the Graduate Telecommunications Program at Pace University. Along with his technical credentials, Michael holds a Master’s degree in Management from the J. L. Kellogg Graduate School of Management at Northwestern University.