Does QoS Really Work?
There are a few ideas that people keep bringing back to me one way or another, saying that QoS isn't really needed, or only has a limited range of usefulness. These ideas stem from a lack of understanding about the way QoS works. Let's take a look at these arguments and see where they fail.
February 22, 2008
Argument #1 is that with low link utilization I don't need QoS. If my link utilization is only 10%, and the addition of voice or video will only push it to 25%, why should I need QoS? There is plenty of bandwidth available, so it should work!
What the proponent of this argument doesn't understand is the behavior of traffic at different time scales. The 10% utilization number comes from a measurement tool that counts the bytes sent on the network link over a 5-minute, 15-minute, or 1-hour timeframe. Over that time period only 10% of the available bandwidth has been used. But if we look closely at what is going on, we see that bursty data traffic will drive utilization to 100% for short periods of time. An individual TCP connection will keep increasing its sending rate until packet loss occurs. This drives the momentary utilization all the way to 100%. But this 100% utilization lasts only a fraction of a second, or a few seconds at most, and so the 5-minute average is still low.
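To make the time-scale point concrete, here is a minimal Python sketch. The traffic trace is invented purely for illustration (a 1 Gbps link that carries a 100 ms line-rate burst once per second and is otherwise idle), and the `utilization` helper is a hypothetical name, not part of any monitoring tool.

```python
def utilization(bytes_per_ms, link_bytes_per_ms, window_ms):
    """Return link utilization (0.0 to 1.0) for each consecutive window."""
    result = []
    for start in range(0, len(bytes_per_ms), window_ms):
        chunk = bytes_per_ms[start:start + window_ms]
        result.append(sum(chunk) / (link_bytes_per_ms * len(chunk)))
    return result

# Assumed link speed: 1 Gbps is 125,000 bytes per millisecond.
LINK_BYTES_PER_MS = 125_000

# Made-up trace: each second holds a 100 ms burst at line rate, then 900 ms idle.
one_second = [LINK_BYTES_PER_MS] * 100 + [0] * 900
trace = one_second * 300  # 300 seconds = 5 minutes, one sample per millisecond

five_minute = utilization(trace, LINK_BYTES_PER_MS, len(trace))
per_100ms = utilization(trace, LINK_BYTES_PER_MS, 100)

print(f"5-minute average utilization: {five_minute[0]:.0%}")
print(f"busiest 100 ms window:        {max(per_100ms):.0%}")
```

With this synthetic trace the same link reports 10% over five minutes and 100% in its busiest 100 ms window. Real traffic is far less regular, but the averaging effect is the same.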
So if we are running TCP in parallel with our real-time traffic (and who isn't?) there will be moments in time where the data traffic is using 100% of the network link, and the queues will start to fill up, and our real-time traffic can be delayed or dropped. This is where QoS can ensure that the real-time traffic gets through. I have tested 10-Gbps data center links with 10% utilization and seen packet drops in real-time traffic. It is a normal part of the way TCP traffic behaves.
Argument #2 is that QoS is only useful when the network is loaded to a medium to mid-high level. There are long dissertations on the Internet comparing the behavior of a network to a highway, and priority traffic to emergency vehicles with their lights flashing and sirens blaring. The argument goes that if the highway is lightly loaded there is no need for priority, and if the highway is jammed then the emergency vehicles can't get through either. Thus QoS only works in that middle range where we have enough room to pull over and let the emergency vehicles go by.
Nice analogy, but the network doesn't work like a highway. When a highway gets overly crowded, it slows down. We are unwilling for safety reasons to drive at the speed limit with our bumpers only inches apart. Traffic engineers have well-defined trendlines for highways that show that throughput drops once the traffic reaches a certain volume for the size and type of the road.
But in networks this doesn't happen. We put packets on the wire at the full speed of the link, back to back. The link runs at its rated speed all the time, independent of the load. In fact, in most synchronous technologies today, we pack the link with dummy data when there is no real data to send.
The real action is in the queues that are feeding data into this full-speed highway. If there is no room on the highway, we back up the would-be users in a queue waiting for an available slot. If this were a highway, these users would be waiting on the on-ramp for an opening in the traffic, which continues to run at the speed limit no matter how full the highway has become. It is these queues that we manage with QoS. An ambulance would bypass the waiting line and jump onto the highway first.
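The on-ramp picture can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular router implements queuing: every packet takes exactly one transmit slot, the backlog size and link speed are made up, and `slots_until_voice_sent` is a hypothetical helper name.

```python
from collections import deque

def slots_until_voice_sent(backlog, use_priority):
    """Count transmit slots that pass before the voice packet reaches the wire.

    backlog: data packets already queued when the voice packet arrives.
    Assumes every packet occupies exactly one slot at line rate.
    """
    data_q = deque(["data"] * backlog)
    voice_q = deque()
    if use_priority:
        voice_q.append("voice")  # separate queue that is always served first
    else:
        data_q.append("voice")   # single FIFO: voice joins the back of the line
    slots = 0
    while True:
        queue = voice_q if voice_q else data_q
        packet = queue.popleft()
        if packet == "voice":
            return slots
        slots += 1

# Assumed numbers: 1500-byte packets on a 10 Mbps link (10,000 bits per ms).
SLOT_MS = 1500 * 8 / 10_000

fifo_slots = slots_until_voice_sent(backlog=50, use_priority=False)
prio_slots = slots_until_voice_sent(backlog=50, use_priority=True)

print(f"FIFO:     voice waits {fifo_slots} slots ({fifo_slots * SLOT_MS:.1f} ms)")
print(f"Priority: voice waits {prio_slots} slots ({prio_slots * SLOT_MS:.1f} ms)")
```

In both cases the link itself sends one packet per slot at full speed. The only thing that changes is where the voice packet stands in the waiting line: behind 50 data packets (about 60 ms with these assumed numbers) or at the front.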
Argument #3 is that the real-time traffic application is not a high priority for the organization. This argument is not usually made for voice, but I have heard enterprises argue that video is not mission critical, so it should not be given high priority on the network. This confuses priority with allocated bandwidth.
Real-time applications need to get their packets through in a timely manner, or they just don't work. Not giving priority to real-time applications is like not giving ice to skaters. You just can't skate without ice. It's not that the skating degrades slightly; it just doesn't work. So if you want to run voice and video in your converged network, they need to have priority.
Now it is still possible to ensure that your critical data applications run with good performance; we do that by allocating bandwidth. The network can limit the bandwidth that voice and video use, and it can guarantee the bandwidth available for vital data applications, by assigning each to a class-of-service group and giving each group a bandwidth allocation as required. So solve the issue of importance to the enterprise through bandwidth, and solve the need for timely delivery of real-time packets through priority.
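The split between priority and bandwidth can be shown with some simple arithmetic. The Python sketch below is only an illustration under assumptions: the class names, rates, and the `allocate` function are hypothetical, and handing leftover bandwidth out in dictionary order is a simplification (real schedulers typically share it by weight).

```python
def allocate(link_kbps, realtime_offered, realtime_cap, data_offered, data_guarantee):
    """Return (realtime_kbps, {data_class: kbps}) for one congested interval."""
    if realtime_cap + sum(data_guarantee.values()) > link_kbps:
        raise ValueError("cap plus guarantees exceed the link rate")
    # Real-time traffic is served first (priority) but is limited to its cap.
    realtime = min(realtime_offered, realtime_cap)
    remaining = link_kbps - realtime
    # Each data class first receives its guaranteed bandwidth, if it wants it.
    granted = {c: min(data_offered[c], data_guarantee[c]) for c in data_offered}
    remaining -= sum(granted.values())
    # Simplification: leftover goes to classes still wanting more, in dict order.
    for c in data_offered:
        extra = min(data_offered[c] - granted[c], remaining)
        granted[c] += extra
        remaining -= extra
    return realtime, granted

# Hypothetical 10 Mbps link: video capped at 3 Mbps, two data classes.
guarantee = {"critical": 5000, "bulk": 2000}
offered = {"critical": 6000, "bulk": 8000}

busy = allocate(10_000, 4000, 3000, offered, guarantee)
quiet = allocate(10_000, 1000, 3000, offered, guarantee)
print("video wants 4 Mbps:", busy)
print("video wants 1 Mbps:", quiet)
```

When video tries to take 4 Mbps it is held to its 3 Mbps cap, so the critical data class still receives its full 5 Mbps guarantee. When video needs only 1 Mbps, the unused bandwidth flows back to the data classes.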