Table of contents:
Introduction | Audio Codecs | Audio File Formats | Assignments | Additional References | On to Lesson 2.3

Lesson 2.2: Audio Codecs and File Formats for the Web

Introduction

So... you have the desired media or script (for voice recordings). Now how do you decide what format to digitize it in? Well, the first thing you want to do is maximize sound quality while taking into account available RAM and hard disk resources. And if you are going to use the sound in a multimedia project, you are going to want the smallest, most portable sound you can get while maintaining acceptable quality. Not perfect, acceptable. In most projects, you won't have the space for perfect, especially if the project is to be played from the Web.

With sound for multimedia, you'll need to use different codecs and file formats. The word codec combines the terms compression/decompression. A codec is a circuit or program for converting audio or video signals into and out of compressed digital format. A file format is a standardized set of rules for storing and reproducing the sound, identified by a file extension that tells the browser or computer what rule set it is dealing with. A terrific comparison of audio codecs, complete with samples, can be found at the UTexas web site. There is a table of sound file formats in your text on page 227.

For purposes of this course, we'll deal with the three most common file formats:

The .ra format, proprietary to Real Media Players, will be discussed in Lesson 2.5 of this Unit. Real Audio is a very popular format for streaming media, but everyday users would probably use something else since they have to buy software to convert to this format.

The two ways of playing sound from the Web at this time are downloaded and streaming. If downloaded, then your machine downloads the sound to a temp file and then plays it using a helper application. If the sound is streaming, then a connection with the streaming server is made, a small portion of the sound is downloaded, and then play begins while the rest is being downloaded in the background. Streaming files are linear digital files, where every second of music that you hear is represented in a unique piece of data in the file.

Audio Codecs

Audio file formats are based on codecs. The word codec, once again, is a combination of terms, also read as coder and decoder. Think of it as a sort of modem (modulator-demodulator) for sound and video, where the modem works on computer signals. A codec converts audio or video signals into and out of digital format. If you cannot hear a sound file you download, the most likely reason is that you do not have the right codec installed for it.

Other common codecs include

CD rate, which is commonly encoded as .wav or .aiff files, is a raw, linear format. Music plays, uncompressed, from beginning to end. The only thing that can be modified using this codec is the sampling rate. You would use the CD rate codec when you want the same sound to come out that you recorded in. The result? A large file that sounds very good (depending on sampling rate).
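The size of a CD rate file follows directly from the sampling settings. Here is a minimal Python sketch of that arithmetic. It assumes the usual audio CD values of 44,100 samples per second, 16 bits per sample, and two channels; the lesson does not state these figures, so treat them as filled in for illustration. The helper name pcm_size_bytes is hypothetical.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of raw, uncompressed (linear) audio data.

    The defaults are the usual audio CD settings, assumed here for
    illustration; the lesson itself does not list them.
    """
    bytes_per_sample = bits_per_sample // 8
    return int(seconds * sample_rate * bytes_per_sample * channels)


# One minute at CD rate, and the same minute at half the sampling rate.
one_minute_cd = pcm_size_bytes(60)
one_minute_half_rate = pcm_size_bytes(60, sample_rate=22_050)

print(one_minute_cd, "bytes at 44.1 kHz")
print(one_minute_half_rate, "bytes at 22.05 kHz")
```

Because nothing is compressed, the size scales in a straight line with the sampling rate: halve the rate and the file is half as big.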

IMA 4:1 is a lightly compressed format that has been in use for years and enjoys wide compatibility among computer platforms and software. At this time, this format has been eclipsed by the MP3 codec. I myself was unable to hear any appreciable difference between the CD rate, IMA 4:1 and MP3 samples taken at the same sampling rate. You may have a different result.

Most of the audio codecs are just for...well, audio! But one that is in common use is the QDesign Music 2 codec, developed as a part of Apple's QuickTime program, and you must have QuickTime installed to hear music in that format. QDesign Music 2 was first released in QuickTime 3. Using the samples on the UTexas page, the files are two-thirds smaller even than MP3, although I do not hear any appreciable difference. So why is it not in more common use? Well, it's my opinion that developers don't want to pay the price for the professional version of QuickTime when they already have other (sometimes costly) multimedia development tools, and that users don't always want to have QuickTime on their computers, since it has gotten some bad press. We're going to explore QuickTime more fully in the Digital section of this course.

MP3 is now the standard codec/format for music files on the web, and, given a sufficiently high bitrate, it is all but indistinguishable from CD rate to my ears. Audiophiles may argue differently, but I don't have equipment that can reproduce the difference noticeably--and neither do most users. So, since it's a readily available and free codec that also compresses files to a reasonably small size, what's not to love?

The last audio codec to explore in this lesson is RealAudio Streaming. The compression rate is excellent, the quality generally acceptable, but is streaming the way to go for your project? We'll explore that in Unit IV.

Audio File Formats

The WAV File Format
Comparison file size: Chopin's Minute Waltz=18,415 KB of storage in this format

WAV files are probably the simplest of the common formats for storing audio samples. A WAV is a Windows format sound file. The beeps, bells, and whistles that Windows makes are from .wav files. WAV files have the file extension .wav.

Unlike MPEG and other compressed formats, WAVs store samples "in the raw," where no preprocessing is required other than formatting of the data. WAV files are a de facto standard in Windows sound software. If you want to create WAV files, you need a sound card and software that will let you record sound to WAV files. Luckily, sound recording software usually ships with the sound card, not to mention the Sound Recorder in Windows itself. And if you want to maintain CD quality on a CD-audio track you are copying to your computer, saving it as a WAV is the surest way to go.

The WAV file itself consists of three "chunks" of information: the RIFF chunk, which identifies the file as a WAV file; the FORMAT chunk, which identifies parameters such as sample rate; and the DATA chunk, which contains the actual data (samples). Fortunately, mere mortals do not need to know anything about the chunks to play or create WAV files.
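For the curious, the three chunks can be seen with a few lines of Python. This is only a sketch under stated assumptions: it builds a one-second silent test file in memory with the standard wave module rather than reading a real recording, and the helper names make_test_wav and parse_wav are made up for this example. Inside the file, the FORMAT and DATA chunks carry the four-character IDs "fmt " and "data".

```python
import io
import struct
import wave


def make_test_wav(seconds=1, sample_rate=8000):
    """Build a small mono, 16-bit WAV file of silence in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * (sample_rate * seconds))
    return buffer.getvalue()


def parse_wav(data):
    """Return the chunk list and the decoded FORMAT fields of a WAV file."""
    riff_id, riff_size, form = struct.unpack("<4sI4s", data[:12])
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = [("RIFF", riff_size)]
    fmt = None
    offset = 12
    while offset + 8 <= len(data):
        # Every sub-chunk starts with a 4-byte ID and a 4-byte size.
        chunk_id, size = struct.unpack("<4sI", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + size]
        chunks.append((chunk_id.decode("ascii"), size))
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack(
                "<HHIIHH", body[:16])
            fmt = {"format_tag": tag, "channels": channels,
                   "sample_rate": rate, "bits_per_sample": bits}
        offset += 8 + size + (size % 2)   # chunks are padded to an even size
    return chunks, fmt


chunks, fmt = parse_wav(make_test_wav())
print(chunks)   # chunk IDs with their sizes in bytes
print(fmt)      # parameters stored in the FORMAT chunk
```

The DATA chunk size is simply the number of samples times the bytes per sample, which is why WAV files are so easy to read and so large.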

The AIFF Format

AIFF, Audio Interchange File Format, is a format developed by Apple and it is based upon the Amiga IFF tagged file structure. AIFF is one of the formats used on Macintosh computers, and it is also used by Silicon Graphics workstations.

For purposes of this course, AIFF files are sufficiently like WAV files to merit no further discussion. If you have a Mac (and you know who you are!), you'll probably be familiar with this format. If not, any sound program you have will generally default to this format to save a sound file.

The MP3 Format
Comparison file size: Chopin's Minute Waltz=1,671 KB of storage in this format (about a factor of 11 smaller than .wav)

First and foremost, you must understand that MP3 is a "lossy" compressed file format, lossy meaning that not all of the original data is retained. There is no relation between how a WAV file is stored and how an MP3 file is stored. And an MP3 will NOT sound exactly the same as a WAV, although it may be perfectly acceptable for your purposes.

How is an MP3 created? As we discussed in Lesson 2.1, the sampling frequency is basically the number of times per second audio is sampled and stored as a number. MP3, however, also carries a "bitrate." This refers to the transfer bitrate for which the files are encoded, i.e. an MP3 file encoded "at a bitrate of 128 Kbps" is compressed such that it could be streamed continuously through a link providing a transfer rate of 128 thousand bits per second. But most of us don't use MP3 as a streaming medium, so what the MP3 "bitrate" really provides is a measure of how severely the file is being compressed: the lower the bitrate, the more the file has been compressed... and the more you compress a file, the more of the original data is lost, and so the worse the playback sound quality will be.
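Since the bitrate is just bits per second, you can estimate an MP3's size from its playing time. The Python sketch below shows the arithmetic. It ignores header and tag overhead, assumes a constant bitrate, counts a KB as 1,024 bytes, and compares against an assumed CD rate of 44,100 samples per second, 16 bits, stereo. None of those figures come from the lesson, and the function names are hypothetical.

```python
# Assumed CD rate: 44,100 samples/s x 16 bits x 2 channels, in kbps.
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000


def mp3_size_kb(seconds, bitrate_kbps):
    """Approximate size in KB (1 KB = 1,024 bytes) of a constant-bitrate MP3."""
    total_bits = seconds * bitrate_kbps * 1000
    return total_bits / 8 / 1024


def compression_ratio(bitrate_kbps):
    """How many times smaller than assumed CD rate audio the MP3 stream is."""
    return CD_BITRATE_KBPS / bitrate_kbps


for kbps in (32, 56, 128, 192):
    print(f"{kbps:>3} Kbps: one minute is about {mp3_size_kb(60, kbps):.0f} KB, "
          f"roughly {compression_ratio(kbps):.1f} times smaller than CD rate")
```

At 128 Kbps the ratio works out to about 11, which is in line with the two Minute Waltz sizes quoted in this lesson (18,415 KB as .wav and 1,671 KB as .mp3), if those files were made at CD rate and 128 Kbps.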

"Ripping" in this lesson refers to the process of taking audio data from an audio CD and storing it as digital audio data of some form on your PC hard disk. "Encoding" refers to the process of taking uncompressed digital audio data (e.g. WAV files on a PC, AIFF files on a Mac) and compressing them according to a particular compression scheme, such as MP3. So "ripping" would take a CD track and make a file on your PC hard disk. An example of encoding would be to take a WAV file on your PC and make an MP3 file from it. If you take a track from an audio CD and create an MP3 file on your hard disk from it directly, then you are ripping and encoding in one step.

While the encoder you use and the quality of the original recording will make a big difference, it is fair to say that 128 kbps and higher will produce acceptable sound. Go below that and you will really be able to hear the loss in quality.

To continue the baseline from above, Chopin's Minute Waltz requires 1,671 KB of storage as a .mp3 file, smaller by a factor of about 11 than its .wav storage size.

The MIDI Format
Comparison file size: Chopin's Minute Waltz=9 KB of storage in this format. That is, for those who don't have a calculator handy, about 185 times smaller than the same file in .mp3 format, and about 2,046 times smaller than the same file in .wav format, and it sounds better, to boot!
Chopin's Minute Waltz
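If you would like to check the comparison figures yourself, here is a short Python sketch that uses only the three Minute Waltz sizes quoted in this lesson. The dictionary and function names are made up for the example.

```python
# Minute Waltz file sizes in KB, as quoted in this lesson.
SIZES_KB = {"wav": 18_415, "mp3": 1_671, "mid": 9}


def times_smaller(larger, smaller):
    """How many times smaller the second format's file is than the first's."""
    return SIZES_KB[larger] / SIZES_KB[smaller]


print(f".mp3 vs .wav: {times_smaller('wav', 'mp3'):.1f} times smaller")
print(f".mid vs .mp3: {times_smaller('mp3', 'mid'):.1f} times smaller")
print(f".mid vs .wav: {times_smaller('wav', 'mid'):.1f} times smaller")
```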

MIDI, Musical Instrument Digital Interface, is an international hardware and software standard. There are actually three components to MIDI, which are the communications Protocol (language), the Connector (hardware interface) and a distribution format called Standard MIDI Files.

MIDI files contain the instructions that MIDI instruments and MIDI sound cards use to recreate or synthesize sounds. MIDI files store and recreate musical instrument sounds (notes and chords), but not speaking or singing voices. Because they store only the directions to reproduce notes, MIDI files are much more compact than any other form of audio file storage. Three minutes of MIDI music requires only about 10 kilobytes of storage, vs. 15 megabytes or more for WAV files, depending on the sampling rate.

MIDI allows computers to connect with electronic musical instruments from different manufacturers. In other words, it is "the wiring diagram." It also specifies a communication protocol for passing data from one device to another. This means, in plain English, that MIDI devices pass standardized messages.

The computer encodes the music as a sequence and stores it in a file with a .mid extension, or other MIDI-format extensions. A sequence is like a player piano roll, in that it carries instructions specifying the pitch of a note, the point at which a note begins, the instrument that plays the note, the volume of the note, and the duration of the note. Because a sequence stores notes rather than recorded samples, you can change the playing length (tempo) of a MIDI file without changing its pitch, which is not true of simply speeding up or slowing down a sampled format such as WAV.
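The piano roll idea can be sketched in a few lines of Python. This is an illustration only, not a Standard MIDI File reader or writer. The Note class and the function names are made up for the example, and the "instrument" is simplified to a channel number. The note-on and note-off status bytes (0x90 and 0x80) and the tuning of note 69 to 440 Hz are standard MIDI conventions.

```python
from dataclasses import dataclass


@dataclass
class Note:
    start_beat: float     # the point at which the note begins
    length_beats: float   # the duration of the note
    pitch: int            # MIDI note number (60 = middle C, 69 = A at 440 Hz)
    velocity: int         # the volume of the note, 1 to 127
    channel: int = 0      # stands in for "the instrument" in this sketch


def note_on_bytes(note):
    """The three-byte MIDI message that starts a note."""
    return bytes([0x90 | note.channel, note.pitch, note.velocity])


def note_off_bytes(note):
    """The three-byte MIDI message that ends a note."""
    return bytes([0x80 | note.channel, note.pitch, 0])


def pitch_hz(note_number):
    """Frequency of a MIDI note number in equal temperament."""
    return 440.0 * 2 ** ((note_number - 69) / 12)


def duration_seconds(sequence, tempo_bpm):
    """Playing time of the whole sequence at a given tempo."""
    last_beat = max(n.start_beat + n.length_beats for n in sequence)
    return last_beat * 60 / tempo_bpm


# A tiny sequence: C, E, G played one after another.
sequence = [Note(0, 1, 60, 100), Note(1, 1, 64, 100), Note(2, 2, 67, 100)]

for tempo in (120, 60):
    print(f"{tempo} beats per minute: {duration_seconds(sequence, tempo)} seconds, "
          f"pitches {[round(pitch_hz(n.pitch), 1) for n in sequence]} Hz")
```

Halving the tempo doubles the playing time, but the stored note numbers, and so the pitches, stay exactly the same. Each note costs only a handful of bytes, which is why the file is so small.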

The General MIDI (GM) specification includes

Assignments

  1. Go to the UTexas Page and play the Big Band samples. Write up a comparison of the effects of the different codecs. Which were clear? Which were muddy? What specific differences did you hear, like an instrument that sounded different? Please email me your analysis.
  2. Please use the Easy CD Ripper (PC) or MVP1.2 (Mac) even if you already have a tool for MP3 encoding, so that everyone is using the same software and should get about the same quality from this experiment. Then, rip a CD track (any CD) to MP3 using the ripper/encoder at the following bitrates: 32 Kbps mono, 56 Kbps stereo, and 192 Kbps stereo. If you are unsure how to do this, you can watch ripenc.exe on the course CD. It is merely a matter of changing the bitrate and saving the file with a unique name. Tell me as much as you can, by email, about what is different between your three versions. I won't ask you to send them; they'd be too big.

If you just want to play with MIDI (this is not an assignment), download Aldo's Pianito 2.3 (also on the Course CD) and enjoy! You can't save these, but it's fun.

When you have completed and emailed these assignments, please go on to Lesson 2.3.

Additional references

Digitizing Sound and the MP3 Format

How Do I Hear Digital Data?

MIDI Manufacturers' Association

Comparison of Audio Codecs.