Our most-read blog post in years was Video Formats Uncovered, which explained the ins and outs of video files and the details behind them.

With the successful launch of Music Converter and Music Converter Pro behind us, we thought it was time to give music files the same treatment.

Getting Started

Let's begin by looking at how digital audio is created, because this will help us understand the terms we so often encounter, both on the web and in software like iTunes and our own Music Converters.

When you create a digital audio file from a 'real-world' piece of music or sound you have to decide on a couple of things:

  • how often to take a sample of the sound wave to create the digital file. This is referred to as the Sample Rate; and
  • how precisely you represent each of those samples. This is referred to as the Sample Size.

These two factors combined go a long way to determining the quality of the resulting digital sound file, and also its file size.
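
To get a feel for the effect on file size, here is a rough sketch of the arithmetic for uncompressed audio. The function name is our own invention, and the calculation ignores file headers and metadata, so treat the results as approximations.

```python
def pcm_file_size_bytes(sample_rate_hz, sample_size_bits, channels, seconds):
    """Approximate size of the raw audio data, ignoring any file header."""
    bytes_per_second = sample_rate_hz * sample_size_bits * channels // 8
    return bytes_per_second * seconds

three_minutes = 180  # length of a hypothetical song, in seconds

# Stereo at 44.1 kHz with 16 bit samples
print(pcm_file_size_bytes(44_100, 16, 2, three_minutes))  # 31752000

# Stereo at 96 kHz with 24 bit samples
print(pcm_file_size_bytes(96_000, 24, 2, three_minutes))  # 103680000
```

So the same three minutes of sound takes roughly 32 MB or 104 MB, depending only on the two choices above.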

Sample Rate

So the sample rate is how frequently a sample is taken of the sound wave to create the digital file.

You can think of sample rate as being similar to the frame rate of a movie. Low frame rates result in 'jumpy' videos. High frame rates give you a smooth-playing video.

Sample rates are usually stated in kilohertz (kHz), which means one thousand samples per second of audio. A typical sample rate (for CD quality music) is 44.1 kHz, which means that every second of audio is sampled 44,100 times when creating the digital file for the CD.

Another common sample rate is 48 kHz, which is often used for movie soundtracks. Higher sample rates, such as 96 kHz and even 192 kHz, are sometimes used where very high audio quality is required.
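
If it helps to see sampling in action, the sketch below 'samples' a pure tone at two different rates. It is only an illustration: the sample_sine function is our own, and a real recording would sample a microphone signal rather than a mathematical sine wave.

```python
import math

def sample_sine(frequency_hz, sample_rate_hz, duration_s):
    """Return samples of a sine wave taken sample_rate_hz times per second."""
    count = int(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(count)
    ]

# One second of a 440 Hz tone at CD and movie-soundtrack sample rates
cd_quality = sample_sine(440, 44_100, 1.0)
film_audio = sample_sine(440, 48_000, 1.0)

print(len(cd_quality))  # 44100
print(len(film_audio))  # 48000
```

The higher the sample rate, the more values are stored for each second of sound.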

Sample Size

When audio is sampled, all of the 'samples' are stored within the digital file, and the size of the data (in bits) used to store each sample is the Sample Size. If you like, this is the precision or granularity of each stored sample.

You can think of the sample size as being akin to the colour depth of a digital image: the more bits used for each pixel, the more finely each value can be represented, and the better the quality.

Commonly used sample sizes are 16 and 24 bit. A digital file with a 16 bit sample size means that each sample (yes, each of the 44,100 samples per second of audio) is represented using 16 bits of data.

So the higher the sample size, the higher the quality of each stored sample - more bits means higher quality.
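
To put numbers on that, the sketch below counts how many distinct values a sample can take at a given sample size, and turns a signal value into a stored integer. Both function names are ours, and we assume samples are stored as signed integers with the signal scaled to the range -1.0 to 1.0.

```python
def levels(sample_size_bits):
    """Number of distinct values a single sample can take."""
    return 2 ** sample_size_bits

def quantize(value, sample_size_bits):
    """Map a value in the range -1.0..1.0 to a signed integer sample."""
    max_int = 2 ** (sample_size_bits - 1) - 1
    return round(value * max_int)

print(levels(16))  # 65536
print(levels(24))  # 16777216

# The same signal value stored at two different sample sizes
print(quantize(0.25, 8))   # 32
print(quantize(0.25, 16))  # 8192
```

A 24 bit sample has 256 times as many possible values as a 16 bit one, so each sample can sit much closer to the original signal.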

Codecs

The word codec is shorthand for the term coder/decoder.

A codec is the encoding and compression technique used to turn a real-world audio signal into a digital stream of data, and back again on playback.

A codec relates only to the actual audio streams within an audio file, not the file format itself.

What About Bit Rate?

In uncompressed audio files, the bit rate is a derived value. Files aren't recorded at a bit rate; they end up with one, based on the sample rate, the sample size and the number of channels used.

Let's look at a common audio recording, made at 44.1 kHz with a 24 bit sample size. This would give us 44,100 x 24 bits per second, which is 1,058,400 bits per second, or about 1,058 kilobits per second (roughly 1,000k). This is only one channel, so for a stereo recording we need two channels. That gives us a total bit rate of about 2,100k.
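
The same arithmetic can be written as a tiny function. This is just a sketch of the calculation for uncompressed audio; the function name is our own.

```python
def pcm_bit_rate(sample_rate_hz, sample_size_bits, channels):
    """Bit rate of uncompressed audio, in bits per second."""
    return sample_rate_hz * sample_size_bits * channels

mono = pcm_bit_rate(44_100, 24, 1)
stereo = pcm_bit_rate(44_100, 24, 2)

print(mono)           # 1058400
print(stereo)         # 2116800
print(stereo / 1000)  # 2116.8 (kilobits per second)
```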

Yet the MP3 and AAC files on your iPod or iPhone typically run at 320k or less, so how does this work? The answer is compression! MP3 and AAC are compressed formats which reduce the size considerably. That brings us to encoding types.

Encoding Types

No Compression

The first type of encoding that we will look at is No Compression. This means that no mathematical compression algorithm is applied to the digitally encoded audio stream.

Uncompressed audio is recorded using a technique called PCM (Pulse Code Modulation) and stored in file types such as WAV and AIFF. More on these later, but for now just be aware that audio files don't necessarily get compressed at all.
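
As a rough illustration, the sketch below writes a short tone as uncompressed PCM using the wave module from Python's standard library. The constants and helper names are our own choices, and we write into an in-memory buffer rather than a real file to keep the example self-contained.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100     # samples per second
SAMPLE_SIZE_BYTES = 2    # 16 bit samples
CHANNELS = 1             # mono

def tone_frames(frequency_hz, duration_s):
    """Return raw 16 bit little-endian PCM data for a sine tone."""
    count = int(SAMPLE_RATE * duration_s)
    samples = (
        round(32767 * math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE))
        for n in range(count)
    )
    return b"".join(struct.pack("<h", s) for s in samples)

def write_wav(target, frames):
    """Write PCM frames to a file name or file-like object as WAV."""
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_SIZE_BYTES)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(frames)

buffer = io.BytesIO()
write_wav(buffer, tone_frames(440, 0.1))
print(buffer.getvalue()[:4])  # b'RIFF'
```

Notice that the samples go into the file exactly as they were generated. No compression step is involved.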

Lossless Compression

When audio data is compressed, there are two types of compression - lossless and lossy.

Lossless compression is a compression type where the original data can be completely recovered from the compressed data. Data compression is a complex topic of its own, but here's an idea of what goes on.

Let's say we have an original piece of data that contains 100 bits in a row, all set to "1". The raw form of this data would take up 100 bits of space. This data could be compressed into far fewer bits using a compression technique. The technique could involve recording the number 100 and the bit "1". The number 100 can be recorded using 7 bits of space, and the bit itself takes 1 more. Hey presto, you've compressed 100 bits into 8!

This is a lossless compression technique because we can take our 8 bits of data and, knowing the compression algorithm used, can re-construct the original data.
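
The technique described here is usually called run-length encoding. Below is a minimal sketch of it, working on a string of "0" and "1" characters for readability. It is a toy: lossless audio codecs such as FLAC use considerably more sophisticated methods.

```python
from itertools import groupby

def rle_encode(bits):
    """Collapse each run of identical characters into (character, count)."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]

def rle_decode(runs):
    """Rebuild the original string from (character, count) pairs."""
    return "".join(bit * count for bit, count in runs)

original = "1" * 100
encoded = rle_encode(original)

print(encoded)                          # [('1', 100)]
print(rle_decode(encoded) == original)  # True

# Bits needed: 7 for the count of 100, plus 1 for the repeated bit
print((100).bit_length() + 1)           # 8
```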

Examples of lossless audio compression codecs include FLAC and Apple Lossless (ALAC).

Lossy Compression

Lossy compression techniques take the compression even further to the point that the original data cannot be fully recovered. While you would never use a lossy compression technique for data files, it can be applied to audio since many devices (and humans!) can't tell the difference when the audio is downgraded slightly.

Examples of lossy audio compression codecs include MP3 and AAC.

Why would you use a lossy compression technique? Lossy compression codecs create smaller files.
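
To show what 'cannot be fully recovered' means, here is a deliberately crude sketch that halves the data by throwing away the lower 8 bits of each 16 bit sample. Please don't read this as how MP3 or AAC work; they rely on models of human hearing to decide what to discard. The function names are ours, and the point is only that the discarded detail is gone for good.

```python
def to_8_bit(sample_16):
    """Keep only the top 8 bits of a 16 bit sample, discarding detail."""
    return sample_16 >> 8

def back_to_16_bit(sample_8):
    """Scale an 8 bit sample back up; the discarded bits cannot return."""
    return sample_8 << 8

original = [12345, -20000, 255, 256]
restored = [back_to_16_bit(to_8_bit(s)) for s in original]

print(restored)              # [12288, -20224, 0, 256]
print(restored == original)  # False
```

The restored samples are close to the originals, but not identical, and no decoder can work out what the missing bits were.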

File Formats

Audio files include a file container, metadata and data streams.

Unlike video files, audio files usually have a one-to-one alignment between codecs and file formats. For example, the MP3 audio codec also has a corresponding MP3 file format.

Common file formats include:

  • MP3 - MPEG-1 Audio Layer 3 (MPEG stands for Moving Picture Experts Group). The most popular codec and file format in use today.
  • M4A - Apple's file format for audio. This is really an MPEG-4 container, which is based on Apple's QuickTime container, holding audio only. It can contain AAC or ALAC audio.
  • FLAC - Free Lossless Audio Codec. A popular open source codec and file format for lossless audio. Often used for high quality recordings.
  • AIFF - Audio Interchange File Format. Apple's implementation of uncompressed audio. Audio data is uncompressed PCM in this case.

File Container

An audio file container is usually aligned to the codec it contains. The MP3 file format contains MP3 audio (two channels maximum) and some metadata.

Apple's M4A file container is a QuickTime-style container that can contain almost any media stream. It is closely related to the container used for QuickTime movies (MOV file extension).

Apart from the Apple M4A file format, audio file formats generally match the codec which they contain.

Metadata

Metadata is an important part of any digital media file. Audio files include a chunk of metadata which tells you things like the artist, album, track name and so on.

MP3 files often include metadata in an ID3 metadata block. ID3 is a de facto standard for the storage of metadata in an audio file.
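
As an illustration, the sketch below reads the older and simpler ID3v1 variant, which is a fixed 128-byte block at the very end of the file starting with the letters TAG. Newer ID3v2 tags live at the start of the file and are more involved, so they are not handled here. The function names and the sample bytes are made up for the demonstration.

```python
def read_id3v1(data):
    """Return ID3v1 fields from raw MP3 bytes, or None if no tag is present."""
    tag = data[-128:]
    if len(tag) < 128 or not tag.startswith(b"TAG"):
        return None

    def text(start, end):
        # Fields are fixed width and padded, usually with zero bytes
        return tag[start:end].split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(3, 33),
        "artist": text(33, 63),
        "album": text(63, 93),
        "year": text(93, 97),
    }

def field(value, width):
    """Pad a text value to a fixed-width ID3v1 field."""
    return value.encode("latin-1").ljust(width, b"\x00")

# Made-up bytes standing in for audio data, followed by a hand-built tag:
# title, artist, album, year, then a 30-byte comment and 1-byte genre
fake_mp3 = (
    bytes(100)
    + b"TAG"
    + field("American Idiot", 30)
    + field("Green Day", 30)
    + field("American Idiot", 30)
    + b"2004"
    + bytes(31)
)

print(read_id3v1(fake_mp3))
```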

Apple's M4A format holds metadata in tags called 'atoms'. Again, this is the same technique used in QuickTime movie files.
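
Each atom begins with a 4-byte size (big-endian, covering the whole atom) followed by a 4-character type code. The sketch below walks the top-level atoms in a block of bytes. It is a simplified illustration: the function name is ours, the sample bytes are hand-built, and it does not descend into the nested atoms where the actual tags are stored.

```python
import struct

def iter_atoms(data):
    """Yield (type, size) for each top-level atom in the given bytes."""
    offset = 0
    while offset + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[offset:offset + 8])
        if size < 8:
            # Sizes 0 and 1 have special meanings that this sketch skips
            break
        yield kind.decode("latin-1"), size
        offset += size

# Hand-built example: a 16-byte 'ftyp' atom followed by an empty 'free' atom
sample = (
    struct.pack(">I4s", 16, b"ftyp") + b"M4A " + bytes(4)
    + struct.pack(">I4s", 8, b"free")
)

print(list(iter_atoms(sample)))  # [('ftyp', 16), ('free', 8)]
```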

FLAC uses the Vorbis comment metadata approach. This is a series of name/value pairs about the file, e.g. Artist=Green Day, Album=American Idiot, etc.
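
Once the name/value pairs have been read out of the file, handling them is straightforward. The sketch below assumes the pairs are already available as text strings (reading them out of a FLAC file is not shown) and uses a function name of our own. Field names in Vorbis comments are not case sensitive and may appear more than once, so we upper-case the names and collect the values in lists.

```python
def parse_vorbis_comments(lines):
    """Turn 'Name=Value' strings into a dict of name -> list of values."""
    tags = {}
    for line in lines:
        name, _, value = line.partition("=")
        tags.setdefault(name.upper(), []).append(value)
    return tags

comments = ["Artist=Green Day", "Album=American Idiot"]
print(parse_vorbis_comments(comments))
# {'ARTIST': ['Green Day'], 'ALBUM': ['American Idiot']}
```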

Data Streams

An audio file will include a number of streams encoded using the required codec. Different file formats and codecs support different numbers of audio channels.

MP3 supports only two channels of audio, which means that MP3 is good for stereo, but won't do 5.1 surround sound, which needs 6 channels.

FLAC, ALAC, AAC and most other popular codecs support multi-channel audio streams.

Bringing it all together

Now you know all about sample rates, sample sizes, codecs, bit rates, compression and what's inside an audio file.

We hope this will help you understand more about audio files and how best to look after your music library.
