Is QoS Becoming Irrelevant?

Changes in applications and their locations are conspiring to make it increasingly difficult to implement QoS.

Terry Slattery

September 6, 2018

Evolving unified communications (UC) requires increasingly complex quality-of-service (QoS) strategies. Have we reached the point where QoS is too complex for normal use? Do cloud services over the Internet mean that QoS is irrelevant?

QoS Operation
The differentiated services (DiffServ) model of QoS prioritizes different types of traffic so that time-critical applications like voice, video, and interactive business applications receive priority over traffic flows that aren't as time critical. In the DiffServ model, a Differentiated Services Code Point (DSCP) value is set within the IP packet header. While up to 64 different per-hop behaviors can be specified with DSCP, most QoS designs use only six to 12 of them. (Note: An Integrated Services QoS model is also available, but it hasn't gained widespread acceptance.) IETF standards documents describe each of the per-hop behaviors, one of them being a priority queue that's typically reserved for time-critical protocols like voice. Some QoS designs may also prioritize critical device communications, such as from bedside monitors in healthcare or process control alerts in manufacturing plants.
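As a rough illustration of where the DSCP value lives: in IPv4 it occupies the upper six bits of the former Type of Service byte, with the remaining two bits used for ECN. The minimal Python sketch below converts between the two and shows one way an endpoint could request a marking on a UDP socket. The function names are invented for this example, and whether the operating system and the network honor the request varies by platform and policy.

```python
import socket

# Well-known DSCP code points from the IETF DiffServ RFCs.
DSCP_EF = 46    # Expedited Forwarding, commonly used for voice
DSCP_AF41 = 34  # Assured Forwarding class 4, low drop precedence
DSCP_BE = 0     # Best effort (default)


def dscp_to_tos(dscp: int) -> int:
    """Return the 8-bit ToS/Traffic Class byte that carries this DSCP."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit field: 0..63")
    # Shift past the two low-order ECN bits, which are left at zero here.
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Extract the DSCP value from a ToS/Traffic Class byte."""
    return (tos & 0xFF) >> 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Hypothetical helper: create a UDP socket that requests a DSCP marking.

    Some operating systems ignore or restrict IP_TOS, and the network edge
    may overwrite whatever the endpoint sets.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(dscp_to_tos(DSCP_EF))  # 184
```

The six-bit field is what gives the 64 possible values mentioned above.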

The DiffServ model of QoS consists of three basic functions: classification, marking, and queueing. Classification, which is typically performed at the ingress edge of the network, identifies different classes of traffic. Once a packet has been classified, it is marked by setting the DSCP value in its IP header. Performing classification and marking at the network's edge allows successive network equipment in the forwarding path to make forwarding decisions efficiently, because they don't have to repeat the classification function. Packets are then queued according to the applied classification and marking, with higher-priority packets selected for transmission over congested links before lower-priority packets.
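The classification and marking steps can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the `Packet` fields, the port ranges, and the class-to-DSCP mapping stand in for whatever policy a real network would define, and real equipment does this in hardware rather than in code like this.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    src_ip: str
    dst_port: int
    protocol: str  # "udp" or "tcp"
    dscp: int = 0  # whatever the endpoint set; not trusted at the edge


# Hypothetical policy: these port ranges are assumptions, not a standard.
VOICE_UDP_PORTS = range(16384, 32768)
BUSINESS_TCP_PORTS = {443, 1433}

# EF for voice, AF21 for business data, best effort for everything else.
CLASS_TO_DSCP = {"voice": 46, "business": 18, "default": 0}


def classify(pkt: Packet) -> str:
    """Identify the traffic class using simple header fields."""
    if pkt.protocol == "udp" and pkt.dst_port in VOICE_UDP_PORTS:
        return "voice"
    if pkt.protocol == "tcp" and pkt.dst_port in BUSINESS_TCP_PORTS:
        return "business"
    return "default"


def mark_at_ingress(pkt: Packet) -> Packet:
    """Overwrite the DSCP value according to the edge policy."""
    pkt.dscp = CLASS_TO_DSCP[classify(pkt)]
    return pkt


print(mark_at_ingress(Packet("10.1.1.20", 20000, "udp")).dscp)  # 46
```

Note that `mark_at_ingress` replaces any value the endpoint set, which reflects the common practice of not trusting endpoint markings.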

It's important to note that QoS only applies during congestion, when more packets need to be sent over a link than the link can handle. The network equipment only needs to make a queueing decision when multiple packets are waiting to be sent on a link. Ideally, the highest-priority packets get selected to go first, followed by lower-priority packets. The 64 possible DSCP values provide a lot of flexibility in defining traffic classes.
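The queueing decision can be illustrated with a strict-priority queue. The Python sketch below is a simplification under stated assumptions: the class name and the DSCP-to-priority table are invented for the example, and real devices often combine a priority queue with weighted scheduling for the remaining classes.

```python
import heapq
from itertools import count

# Hypothetical mapping from DSCP value to queue priority (0 is served first).
DSCP_PRIORITY = {46: 0, 18: 1, 0: 2}
LOWEST_PRIORITY = 2


class StrictPriorityQueue:
    """Serve the highest-priority waiting packet first, FIFO within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def enqueue(self, name: str, dscp: int) -> None:
        priority = DSCP_PRIORITY.get(dscp, LOWEST_PRIORITY)
        heapq.heappush(self._heap, (priority, next(self._arrival), name))

    def dequeue(self):
        """Return the next packet to transmit, or None if nothing is waiting."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


q = StrictPriorityQueue()
for name, dscp in [("bulk1", 0), ("voice1", 46), ("web1", 18), ("voice2", 46)]:
    q.enqueue(name, dscp)

print([q.dequeue() for _ in range(4)])
# ['voice1', 'voice2', 'web1', 'bulk1']
```

When only one packet is waiting, the queue simply returns it, which matches the point that prioritization matters only when packets compete for the link.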

Classification Challenges
Endpoints can add marking to packets that they originate, but the lack of a good security model forces most organizations to implement classification and marking at the point where endpoints connect to the network (the ingress access layer). Unfortunately, classification is becoming more and more difficult to implement.

Desktops, laptops, tablets, and phones are now sources of low-priority data traffic, higher-priority business data, and high-priority voice and video. At one point, network designers could assign phones to separate address ranges and use the packet addresses for classification. But with the lines between data and unified communications and collaboration (UC&C) blurring, building reliable classification rules is becoming more and more of a challenge. A possible answer is to use network-based application recognition. But with integrated applications becoming prevalent, even that solution may not be appropriate in all circumstances.

Microbursts
Recall that QoS is used only when congestion occurs. This, along with the complexity of QoS configuration, leads many organizations to consider increasing link speeds to the point that congestion never occurs. However, this line of thinking ignores the presence of microbursts. A microburst is a brief period in which many packets arrive back to back. A good example is a workstation downloading a picture or a large document. Another is a graphics-intensive Web page for which the browser opens multiple TCP connections to speed up loading of the page.

A typical enterprise-based application environment may have 10-Gbps links out to the access layer and 1-Gbps links to the desktop. TCP performs its initial handshake and a few packets are sent. During slow start, the window size doubles on each round-trip time, so starting from a single packet, a burst of 64 packets is possible by the seventh round trip. A 10G server infrastructure can easily deliver a big burst that can overwhelm the queues on a downstream 1G interface, causing undesirable packet loss and higher jitter than normal. TCP will handle the resulting packet loss by adjusting its transmission rate, but UDP will continue to blast away since it has no congestion feedback mechanism. QoS will need to be used if the packet loss due to microbursts grows too large.
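The arithmetic behind this example can be worked through in Python. This is a back-of-the-envelope sketch, not a TCP model: it assumes an initial window of one packet (many current TCP stacks start larger, which reaches big bursts sooner), 1,500-byte packets, and a hypothetical 64,000-byte egress buffer. The function names are invented for illustration.

```python
def slow_start_windows(initial_packets: int = 1, round_trips: int = 7) -> list:
    """Packets sent in each round trip while the window doubles."""
    return [initial_packets * 2**i for i in range(round_trips)]


def peak_backlog_bytes(burst_bytes: int, in_bps: int, out_bps: int) -> int:
    """Queue depth at the end of a burst arriving at in_bps, draining at out_bps.

    While the burst arrives, the egress link drains out_bps/in_bps of it,
    so the remainder must sit in the buffer.
    """
    if out_bps >= in_bps:
        return 0
    return burst_bytes * (in_bps - out_bps) // in_bps


def dropped_bytes(burst_bytes: int, in_bps: int, out_bps: int, buffer_bytes: int) -> int:
    """Bytes that do not fit in the buffer, assuming it starts empty."""
    return max(0, peak_backlog_bytes(burst_bytes, in_bps, out_bps) - buffer_bytes)


print(slow_start_windows())  # [1, 2, 4, 8, 16, 32, 64]

burst = 64 * 1500  # 64 full-size packets, in bytes
print(peak_backlog_bytes(burst, 10_000_000_000, 1_000_000_000))     # 86400
print(dropped_bytes(burst, 10_000_000_000, 1_000_000_000, 64_000))  # 22400
```

Under these assumptions, about 90% of a burst arriving at 10 Gbps has to be buffered by the 1-Gbps interface, so even a single 64-packet burst can exceed a small queue.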

Where QoS Works
QoS works very well for enterprises that have reasonable control over the endpoints and infrastructure, such as in healthcare. An enterprise may also have mechanisms in place that allow UC endpoints and flows to be easily identified so that the classification and marking processes can work efficiently. It may also be possible to provision sufficient bandwidth inexpensively so that queueing rarely occurs and any microbursts that do arise are too short to affect UC&C applications.

Private wide-area communications over MPLS support QoS, which is one of the big selling points of private WAN services. You purchase bandwidth from a WAN provider and then use QoS to prioritize the traffic that must transit the links.

Interestingly, a cloud-based UC controller might work well because the endpoints connect to the controller via TCP, which recovers from packet loss through retransmission. Only the endpoint-to-endpoint communications benefit from QoS, and if the endpoints are within the enterprise network, QoS can still be applied to that traffic.

Where QoS Doesn't Work Well
QoS doesn't work across the public Internet. That's because QoS requires all participants to agree on priority levels. Everyone considers their own traffic to be the most important, and getting agreement among the participants isn't possible. The result is that QoS markings are ignored on the Internet.

The lack of QoS over the Internet implies problems for cloud-based applications where QoS is desirable. Distributed UC endpoints that rely on Internet access for connectivity (e.g., telecommuters) also can't take advantage of QoS. Conference calling systems, another favorite cloud-based service, are handicapped as well.

Making QoS Useful
One way to make QoS useful is to keep all time-critical applications within the enterprise. But that's often not financially viable.

Most network equipment vendors now have software-defined WAN (SD-WAN) products that measure the transmission characteristics across multiple paths. SD-WAN devices can tell when the Internet path provides a level of service that is suitable for voice. Then, when the Internet latency and packet loss increase (typically on weekday afternoons), traffic can be switched to the more expensive MPLS paths. The neat thing about SD-WAN products is that you can specify that voice traffic only transit links that have certain characteristics (less than 1% packet loss, less than 100 ms latency, etc.). Network administrators can define the criteria for prioritizing traffic and links for their mix of applications and link types.
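The path-selection idea can be sketched as a simple policy check. The Python example below is not any vendor's algorithm: the `PathStats` fields, the relative cost values, and the function names are assumptions for illustration, and only the 1% loss and 100 ms latency thresholds come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PathStats:
    name: str
    loss_pct: float    # measured packet loss, in percent
    latency_ms: float  # measured latency, in milliseconds
    cost: int          # hypothetical relative cost; lower is preferred


def eligible_for_voice(path: PathStats,
                       max_loss_pct: float = 1.0,
                       max_latency_ms: float = 100.0) -> bool:
    """Apply the voice criteria: under 1% loss and under 100 ms latency."""
    return path.loss_pct < max_loss_pct and path.latency_ms < max_latency_ms


def pick_voice_path(paths: Iterable[PathStats]) -> Optional[PathStats]:
    """Choose the cheapest path that currently meets the voice criteria."""
    candidates = [p for p in paths if eligible_for_voice(p)]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.cost)


morning = [PathStats("internet", 0.2, 40.0, cost=1),
           PathStats("mpls", 0.0, 30.0, cost=5)]
afternoon = [PathStats("internet", 2.5, 140.0, cost=1),
             PathStats("mpls", 0.0, 30.0, cost=5)]

print(pick_voice_path(morning).name)    # internet
print(pick_voice_path(afternoon).name)  # mpls
```

Preferring the cheapest eligible path keeps voice on the Internet link while it performs well and moves it to MPLS only when the measurements degrade.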

Another approach to QoS is to have application servers communicate with the network's classification and marking system. For example, when a voice call is initiated, the call controller tells the QoS application to do ingress marking on a specific traffic flow. The ingress may be where the endpoint connects to the enterprise infrastructure or it may be where the flow enters the enterprise from an external connection. This type of interaction between the applications and network is rare, and only a few networking vendors provide the required functionality.
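A toy model of this interaction is shown below in Python. It does not represent any vendor's interface: `FlowKey`, `IngressMarker`, and the `call_started` and `call_ended` methods are hypothetical names that stand in for whatever signaling a call controller and a marking system would actually use.

```python
from typing import Dict, NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


class IngressMarker:
    """Hypothetical marking system that learns flows from a call controller."""

    def __init__(self) -> None:
        self._flows: Dict[FlowKey, int] = {}

    def call_started(self, flow: FlowKey, dscp: int = 46) -> None:
        """Called by the controller when a call is set up (EF by default)."""
        self._flows[flow] = dscp

    def call_ended(self, flow: FlowKey) -> None:
        """Called by the controller at teardown; stops marking the flow."""
        self._flows.pop(flow, None)

    def dscp_for(self, flow: FlowKey) -> int:
        """Marking to apply at ingress; unknown flows get best effort."""
        return self._flows.get(flow, 0)


marker = IngressMarker()
call = FlowKey("10.1.1.20", 20000, "10.2.2.30", 20002, "udp")

marker.call_started(call)
print(marker.dscp_for(call))  # 46

marker.call_ended(call)
print(marker.dscp_for(call))  # 0
```

Because the controller names the exact flow, the network does not need to infer the application from ports or addresses.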

Conclusion
QoS within the enterprise can be valuable for critical applications as well as voice and video. Cloud-based applications that require communications over the Internet cannot rely on QoS. What we can do for those applications is limited and depends on the organization's infrastructure.

Adding bandwidth is good if links are oversubscribed, and even if they aren't, it can move the point at which problems occur. SD-WAN is a great solution for optimizing QoS over disparate paths and for inexpensively providing redundant paths. Application and network integration is still in the future, so we'll need to rely on other mechanisms.

Changes in applications and their locations are conspiring to make it increasingly difficult to implement QoS. Time will tell if QoS is really becoming irrelevant.

About the Author

Terry Slattery


Terry Slattery is a Principal Architect at NetCraftsmen, an advanced network consulting firm that specializes in high-profile and challenging network consulting jobs.  Terry works on network management, SDN, network automation, business strategy consulting, and network technology legal cases. He is the founder of Netcordia, inventor of NetMRI, has been a successful technology innovator in networking during the past 20 years, and is co-inventor on two patents. He has a long history of network consulting and design work, including some of the first Cisco consulting and training. As a consultant to Cisco, he led the development of the current Cisco IOS command line interface. Prior to Netcordia, Terry founded Chesapeake Computer Consultants, a Cisco premier training and consulting partner.  Terry co-authored the successful McGraw-Hill text "Advanced IP Routing in Cisco Networks," is the second CCIE (1026) awarded, and is a regular speaker at Enterprise Connect. He blogs at nojitter.com and netcraftsmen.com.