Voice Codecs: the tale of the Secret Decoder Ring

Tuesday Oct 4th 2005 by Mark A. Miller

Codecs—which convert your voice's analog vibrations to digital signals—balance sound quality with bandwidth usage. Be sure to pick the right one(s).

Remember back when we were youngsters and searched for that secret decoder ring buried in our box of breakfast cereal? Those who possessed the ring became part of an exclusive club whose members could exchange messages without fear of unauthorized disclosure to unsuspecting classmates. In a similar (but less clandestine) fashion, VoIP networks employ coders/decoders, or codecs for short, that convert the analog voice signal into a digital pulse stream, and then back again. And like the exclusive group that possessed the decoder rings, VoIP stations must have compatible codecs in order to communicate.

But don’t dismiss the codec as just one more piece of firmware; it is a key to the operation and efficiency of the overall VoIP network. Embedded in the codec is an algorithm that converts the analog voice waveform into a digital format. Like many technical decisions, the operation of that algorithm boils down to a fairly simple tradeoff: voice quality versus bandwidth consumption. And like many other technical issues, there are two general categories of solutions: those that are covered by international standards, where the algorithms are available to all, and those that are proprietary to a specific vendor and kept close to the vest. Let’s begin by looking at the benchmark coding scheme, known as Pulse Code Modulation, or PCM.

PCM came into commercial use with the development of digital telephony in the 1960s, which led to the T-carrier systems, T-1 and T-3, that are widely deployed today. The speech was first band-limited to 4,000 Hz, and then the signal was sampled (or measured) 8,000 times per second, twice the highest frequency retained. Each sample was assigned to one of 256 discrete levels, using an eight-bit code. This yielded a data rate of 64 Kbps (8,000 samples/second * 8 bits/sample = 64,000 bits/second, or 64 Kbps). Two forms of quantization were defined, one called Mu-Law (predominant in North America and Japan), and one called A-Law (predominant in Europe). In both of these forms, the discrete levels were assigned in a logarithmic, not linear, manner, which provided greater resolution when the signal level was low. However, the digital output was always 64 Kbps. PCM encoding was first standardized by the International Telecommunication Union (ITU) in 1972 as Recommendation G.711 (the edition currently in force dates from 1988), with details available at www.itu.int.
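For readers who like to see the arithmetic, here is a minimal Python sketch of the logarithmic idea behind Mu-Law. It is an illustration under stated assumptions, not the G.711 algorithm itself: the Recommendation specifies a segmented, table-friendly approximation, whereas this sketch uses the continuous textbook curve with mu = 255, and the function names are made up for the example.

```python
import math

MU = 255                 # mu-law parameter (continuous textbook form, assumed here)
SAMPLE_RATE_HZ = 8000    # samples per second
BITS_PER_SAMPLE = 8      # 256 discrete levels

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto [-1, 1] along the logarithmic mu-law curve."""
    x = max(-1.0, min(1.0, x))
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    y = max(-1.0, min(1.0, y))
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def encode(x: float) -> int:
    """Compress a sample, then quantize it uniformly to one of 256 codes (0..255)."""
    y = mu_law_compress(x)
    return min(255, int((y + 1.0) / 2.0 * 256))

def decode(code: int) -> float:
    """Take the midpoint of the code's interval and expand it back to a sample."""
    y = (code + 0.5) / 256 * 2.0 - 1.0
    return mu_law_expand(y)

# Output rate is fixed no matter what is being encoded.
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE   # 64,000 bits/second

if __name__ == "__main__":
    print("bit rate:", bit_rate, "bits/second")
    for sample in (0.01, 0.1, 0.9):
        restored = decode(encode(sample))
        print(f"{sample:5.2f} -> code {encode(sample):3d} -> {restored:.5f}")
```

Running it shows that a quiet sample such as 0.01 comes back with far less error than the spacing of a linear eight-bit scale (2/256) would allow, which is the "greater resolution when the signal level was low" described above.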

Speech signals, however, contain redundancy, which PCM does not attempt to remove. With PCM you always get 64 Kbps output, whether you are encoding a politician’s rhetoric (with lots of variability), or a single tone (with no variability). As a result, other speech encoding algorithms that were developed after PCM endeavored to reduce this redundancy, and therefore reduce the output data rate. Reducing the data rate also reduces the bandwidth required for the system. For example, if you cut the data rate from 64 Kbps to 32 Kbps, you can pack twice as many voice calls into the same bandwidth. Similarly, if you can reduce the 64 Kbps to only 8 Kbps, you can pack eight times the number of conversations into the same bandwidth. Some of the bandwidth-reducing ITU codec standards include: G.722.1 (24/32 Kbps data rates), G.723.1 (5.3/6.3 Kbps data rates), G.726 (16/24/32/40 Kbps data rates), G.728 (16 Kbps data rate) and G.729 (8 Kbps data rate). A summary of these standards is available at the ITU site, and the following link provides a good summary of the support for these standards within various VoIP client implementations: http://compare.ozvoip.com/codecsupport.php.
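The "twice as many calls" arithmetic can be sketched in a few lines of Python. The codec rates are the ones listed above; the 1,536 Kbps link (24 channels of 64 Kbps, as on a T-1 payload) and the function name are assumptions chosen for illustration. The sketch also ignores the IP, UDP, and RTP header overhead that a real VoIP packet stream carries, so treat the results as upper bounds rather than planning figures.

```python
# Codec bit rates in Kbps, as listed in the text.
CODEC_RATES_KBPS = {
    "G.711": [64],
    "G.722.1": [24, 32],
    "G.723.1": [5.3, 6.3],
    "G.726": [16, 24, 32, 40],
    "G.728": [16],
    "G.729": [8],
}

def calls_per_link(link_kbps: float, codec_kbps: float) -> int:
    """Whole number of voice streams that fit, ignoring all packet overhead."""
    return int(link_kbps // codec_kbps)

if __name__ == "__main__":
    link_kbps = 1536  # hypothetical link: 24 x 64 Kbps
    for codec, rates in CODEC_RATES_KBPS.items():
        for rate in rates:
            calls = calls_per_link(link_kbps, rate)
            print(f"{codec:8s} at {rate:>4} Kbps: {calls} calls")
```

In practice the per-packet headers narrow the gap between codecs, which is one more reason to measure before committing to a design.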

Aside from the codecs that are defined by openly available international standards, there are a number of proprietary algorithms that appear in many vendor products. These codecs may claim bandwidth or voice quality advantages over those defined by the ITU. However, as with most vendor-proprietary offerings, such a system may lock you into one vendor’s end-station implementation for the economic life of your VoIP network. Word to the wise: Make the codec selection part of your overall system design criteria. Study which codecs (standard and/or proprietary) your prospective vendors support, and the various tradeoffs that come with each selection. A thorough understanding of this area will not only assist with the design and implementation phase of your VoIP network rollout, but may also have longer-lasting bandwidth (and therefore financial) implications down the road.

Copyright Acknowledgement: © 2005 DigiNet ® Corporation, All Rights Reserved

Author's Biography
Mark A. Miller, P.E. is President of DigiNet ® Corporation, a Denver-based consulting engineering firm. He is the author of many books on networking technologies, including Voice over IP Technologies, and Internet Technologies Handbook, both published by John Wiley & Sons.