How does MP3 compression work?

Thousands of songs in the palm of your hand. No more albums full of filler tracks when all you want is the latest single. And forget about buying plastic discs. Two letters and a number have changed the music industry forever: MP3. The format lets people store and trade music with ease thanks to its small file size. MP3 compression combines maths with psychoacoustics, the scientific study of how we perceive sound.

Sampling the best

Sound is analogue, a constantly varying wave, but computers are digital, representing data as binary numbers. Creating a digital audio recording involves measuring the sound waves at regular intervals, with the quality of the recording dependent on two factors: how often you take a measurement, known as the sampling rate, and how many possible values you assign to the wave, or the bit depth. If the sampling rate isn’t high enough then you won’t properly record high-frequency sounds, while a low bit depth will result in inaccurate measurements and a fuzzy recording.
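To see the effect of bit depth in practice, here is a minimal Python sketch. It samples a pure 440 Hz tone (an arbitrary choice for illustration) and rounds each measurement to the nearest available level. The function names quantize, record and worst_error are made up for this example and do not come from any audio library.

```python
import math

def quantize(value, bit_depth):
    """Round a value in [-1, 1] to the nearest of 2**bit_depth levels."""
    steps = 2 ** bit_depth - 1
    level = round((value + 1) / 2 * steps)   # integer code from 0 to steps
    return level / steps * 2 - 1             # map the code back to [-1, 1]

def record(freq_hz, sample_rate_hz, bit_depth, duration_s=0.01):
    """Sample a sine tone; return the exact and the quantized measurements."""
    count = int(sample_rate_hz * duration_s)
    exact = [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
             for n in range(count)]
    return exact, [quantize(v, bit_depth) for v in exact]

def worst_error(freq_hz, sample_rate_hz, bit_depth):
    """Largest gap between the true wave and its digital measurement."""
    exact, quantized = record(freq_hz, sample_rate_hz, bit_depth)
    return max(abs(a - b) for a, b in zip(exact, quantized))

print(worst_error(440, 44_100, 4))    # coarse: only 16 possible values
print(worst_error(440, 44_100, 16))   # fine: 65,536 possible values
```

The 4-bit recording can be off by several per cent of the wave's full height, which is the fuzziness described above, while the 16-bit error is thousands of times smaller.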

Song waves

CDs are recorded at 16-bit and 44.1 kHz, meaning the sound wave is sampled 44,100 times a second at one of 65,536 (2 to the power of 16) possible values. The result is a whole lot of numbers, and a large file size – a full 80-minute CD holds upwards of 700 MB of data. With files that big a typical 16 GB MP3 player would only hold around 20 albums, so how is it possible for people to walk around with an entire music collection in their pocket? By throwing most of it away!
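The arithmetic is easy to check. The sketch below assumes two stereo channels and a completely full 80-minute disc, and treats 16 GB as 16 billion bytes; all three are assumptions made for illustration. The raw audio total comes out somewhat above the nominal 700 MB, which is the disc's capacity for computer data, because audio discs spend less of their space on error correction.

```python
SAMPLE_RATE_HZ = 44_100   # measurements per second, per channel
BIT_DEPTH = 16            # bits per measurement
CHANNELS = 2              # assumption: stereo
MINUTES = 80              # assumption: a completely full disc

levels = 2 ** BIT_DEPTH
bytes_per_second = SAMPLE_RATE_HZ * (BIT_DEPTH // 8) * CHANNELS
total_bytes = bytes_per_second * MINUTES * 60
albums_on_player = 16e9 / total_bytes   # assumption: 16 GB = 16 * 10**9 bytes

print(levels)                      # 65536
print(bytes_per_second)            # 176400
print(total_bytes / 1e6)           # 846.72 (megabytes)
print(round(albums_on_player, 1))  # 18.9
```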

Bargain bin

It may surprise you, but the human ear can’t hear everything. We’re limited to a range of sound frequencies, spanning roughly 20 to 20,000 Hz when we’re young and shrinking with age as hearing deteriorates. The lowest sounds we can hear are the deep, rumbling bass notes of a giant church organ, while the highest are annoying whines that have been used to scare away teenagers. Some animals can hear outside of this range – whales and elephants at the lower end, bats and mice at the higher end – but to us these sounds might as well be pure silence.

Removing these inaudible frequencies would reduce the amount of memory a track takes up on your hard drive, but how can you separate the silence from the song? MP3 conversion uses a mathematical technique called the Fourier transform to split them up. Any wave, no matter how complex, can be described as a sum of simple sine and cosine waves, known as a Fourier series. These waves correspond to individual frequencies, and the MP3 algorithm picks out the ones we can’t hear while preserving those in the important 20 to 20,000 Hz range. The result is a wave that contains much less information than the original, but sounds more or less the same to our ears.
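The idea of splitting a wave into frequencies and discarding some of them can be sketched in a few lines of Python. This is an illustration only: it uses a plain (and slow) discrete Fourier transform, whereas real MP3 encoders rely on a filter bank and a closely related cosine-based transform, and they also decide what to discard in subtler ways than a simple cut-off. The test signal, a 500 Hz tone mixed with a 21,000 Hz whine, and the helper names are invented for this example.

```python
import cmath
import math

def dft(signal):
    """Discrete Fourier transform: how much of each frequency is present."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def inverse_dft(spectrum):
    """Rebuild the wave from its frequency components."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def keep_audible(signal, sample_rate_hz, low_hz=20, high_hz=20_000):
    """Zero every frequency component outside the audible range."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # Entries past the midpoint mirror the earlier ones (negative frequencies).
        freq = min(k, n - k) * sample_rate_hz / n
        if not low_hz <= freq <= high_hz:
            spectrum[k] = 0
    return inverse_dft(spectrum)

SAMPLE_RATE_HZ = 44_100
N = 441   # 10 ms of audio, so neighbouring components are 100 Hz apart
audible = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE_HZ) for t in range(N)]
whine = [0.5 * math.sin(2 * math.pi * 21_000 * t / SAMPLE_RATE_HZ) for t in range(N)]
mixed = [a + w for a, w in zip(audible, whine)]

filtered = keep_audible(mixed, SAMPLE_RATE_HZ)
worst = max(abs(f - a) for f, a in zip(filtered, audible))
print(worst)   # a tiny rounding error: only the 500 Hz tone survives
```

After filtering, the wave matches the 500 Hz tone on its own, so the inaudible component has been removed without touching the part we can hear.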

MP3 Compression and Huffman Coding

You’d think that dumping all this data would be enough to keep MP3s small, but Fourier transforms aren’t the only mathematical trick for squeezing the most out of your music. A technique called Huffman coding helps reduce file sizes by making the most common parts of an MP3 smaller.

An MP3 recording is basically a list of numbers that describe the sound wave at any given point, and each number is stored as a binary code that can vary in length. Just as the letter E crops up more often than a Z or a Q, some numbers in an MP3 are more common than others, and by examining their frequency we can tweak the binary code to make it more memory efficient.

Numbers that occur most often get a short binary representation, while the less frequent ones are assigned a longer code. On average, the binary for the whole sequence of numbers ends up shorter than if you used a fixed-length code. Huffman coding isn’t just used in MP3 compression – it also compresses data in zip files, JPEG images, and more. Visit Plus magazine for a full description of the maths behind Huffman coding.
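Here is a small Python sketch of the idea, using the textbook way of building a Huffman code: repeatedly merge the two least common groups of values. The list of numbers is made up for illustration, and a real MP3 encoder works from predefined code tables rather than building a fresh code for every track.

```python
import heapq
from collections import Counter

def huffman_codes(values):
    """Return {value: bit string}, with shorter codes for commoner values."""
    counts = Counter(values)
    if len(counts) == 1:                      # edge case: one distinct value
        return {next(iter(counts)): "0"}
    # Heap entries: (total count, tie-breaker, {value: code built so far})
    heap = [(count, i, {value: ""})
            for i, (value, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups, prefixing their codes with 0 and 1.
        count_a, _, codes_a = heapq.heappop(heap)
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in codes_a.items()}
        merged.update({v: "1" + c for v, c in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]

# Made-up data: 0 is very common, 2 and 3 are rare.
samples = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, -1, -1, 2, 3]
codes = huffman_codes(samples)
encoded = "".join(codes[s] for s in samples)
fixed_bits = 3 * len(samples)   # five distinct values need 3 bits each

print(codes)
print(len(encoded), "bits vs", fixed_bits)   # 30 bits vs 48
```

The most common value, 0, ends up with a one-bit code, while the rare values 2 and 3 need four bits each. Because no code is the start of another, the stream of bits can still be decoded unambiguously.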

Skip forward

Thanks to these two techniques, the average MP3 is about 10 times smaller than the same song on CD. Smaller means more versatile, and without the maths of Fourier transforms or Huffman coding, there’d be no iPod, no downloadable music, and the modern world of music would be very different. Other music formats such as AAC (used by iTunes) and Ogg Vorbis followed in the footsteps of MP3, and new efficient ways of encoding music are still in development today. Whatever technique we use to store music in the future, you can be sure mathematicians will help invent it.