Hi guys,

After a really long break, I'm back with my 4th eduinfoline.com newsletter. Due to hectic activity with my engineering studies, I was unable to draft one for a really long time. Also, much of my time last week was spent designing and planning eduinfoline.com, which should hopefully be ready within this month. With God's grace, we hope to meet our deadline. Anyway, on to today's topic.

Mp3. Mp3. Mp3 is all one hears from an audiophile. The format is now pretty old (and so is the technology), but I bet there are guys out there who don't understand how it works. It's actually pretty simple to understand, unless of course you go into the technical details. In this article I won't be delving into the depths, but I shall cover the questions you are most likely to ask about mp3s.

I won't be going into the history much, because it doesn't have much relevance to the technology. All I've got to say is that this technology was standardized by the guys at MPEG (Moving Picture Experts Group) and that mp3 is nothing but MPEG Audio Layer 3 (hence the name mp3). The obvious question is, was there an mp1 and an mp2 format? The answer is yes (Layer 1 and Layer 2), but their compression algorithms were not as efficient. In fact, most video CDs (the older ones) use the mp2 format for the audio stream.

Mp3 was the first widely used format to give a 10:1 to 12:1 compression ratio at near-CD quality. This means that a 30 MB audio track gets compressed to about 3 MB without any audible loss in the stream. I use the word audible because, technically speaking, some of the sound is actually thrown away when the song is encoded in the mp3 format. (Ideally, what gets thrown away is only what the listener would not have heard anyway.)
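
For those who like to check the numbers, here is a small Python sketch of where a ratio of roughly 10:1 to 12:1 comes from. The CD audio figures (44,100 samples per second, 16 bits per sample, 2 channels) are the standard ones; the 128 kbit per second mp3 target is just an assumption picked for illustration, and the function names are my own.

```python
# Back-of-the-envelope check of the compression ratio quoted above.
# The CD parameters are the standard ones; the 128 kbit/s target is an
# assumption chosen for illustration.

SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16
CHANNELS = 2              # stereo


def cd_bitrate_bps():
    """Bits per second of uncompressed CD audio."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def compression_ratio(mp3_bitrate_bps):
    """How many times smaller the mp3 stream is than the CD stream."""
    return cd_bitrate_bps() / mp3_bitrate_bps


if __name__ == "__main__":
    print(cd_bitrate_bps())                      # 1411200
    print(round(compression_ratio(128_000), 1))  # 11.0
```

So at 128 kbit per second the mp3 is about 11 times smaller than the CD track, which sits right between the two ratios mentioned above.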

Let's now have a look at how the encoding works. The encoding relies on 2 kinds of compression:

1. Lossless Compression
2. Lossy Compression

As the name suggests, in lossless compression an efficient algorithm is used to reduce the size of a song without altering the audio. For example, consider compressing a text file which uses the word music many times. If we first define '~'=music and then replace every occurrence of music with '~', the size obviously reduces. In practice, definitions such as '~'=music are stored in a dictionary (you must've seen the dictionary setting in compression programs like Winrar, Winace and Winzip), and building an efficient dictionary is where much of the processing time goes. Once the dictionary is ready, the actual compression takes place. Without going too much into the details of compression algorithms: in most audio signals (music), many sounds are repeated several times (like the beats of a drum), and these get compressed drastically by this method.

Plus, certain encoders (mp3 compressors) give the option of Joint Stereo, which improves compression even more. Joint stereo uses the fact that, most of the time, the audio in the left and right channels is nearly the same. Thus, instead of encoding the left and right streams separately, the sounds which occur simultaneously in both streams are encoded together, improving overall compression.

All this comes under lossless compression, as no information is lost. Generally, the file becomes at best half the size of the original sound. However, this compression is not enough: we know the mp3 is about 1/10 the original size. Let's now see where the major compression occurs.
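
Here is a toy Python sketch of the two lossless ideas described above: the '~'=music substitution, and joint stereo done as a sum and difference of the two channels (the variant usually called mid/side). This is only an illustration under my own assumptions. Real compressors and real mp3 encoders are far more elaborate, and every function name here is made up.

```python
# Toy versions of the two lossless ideas described above.
# All names are hypothetical; this is not how a real encoder is written.


def dict_compress(text, word, token="~"):
    """Replace every occurrence of `word` with a one-character token."""
    if token in text:
        raise ValueError("token already appears in the text")
    dictionary = {token: word}
    return dictionary, text.replace(word, token)


def dict_decompress(dictionary, packed):
    """Undo dict_compress by expanding each token back to its word."""
    for token, word in dictionary.items():
        packed = packed.replace(token, word)
    return packed


def to_mid_side(left, right):
    """Encode integer left/right samples as (sum, difference) lists."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Recover the original left/right samples exactly."""
    # mid + side == 2 * left and mid - side == 2 * right, so both
    # integer divisions below are exact.
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    text = "music is music, and more music"
    dictionary, packed = dict_compress(text, "music")
    print(packed)                                     # ~ is ~, and more ~
    print(dict_decompress(dictionary, packed) == text)  # True

    mid, side = to_mid_side([3, 5, -2], [3, 5, -1])
    print(side)                                       # [0, 0, -1]
    print(from_mid_side(mid, side))                   # ([3, 5, -2], [3, 5, -1])
```

Notice that when the two channels are nearly the same, the difference list is mostly zeros, and a long run of zeros is exactly the sort of repetition that compresses well. In both cases the original comes back exactly, which is what makes these steps lossless.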

For the sake of analogy, we all know how small jpeg (jpg) files are when compared to bitmaps (bmp). Also, you must have noticed that upon zooming into a jpeg image, you find a fuzz at the edges. That is the work of a lossy compression algorithm, and a similar one is employed for mp3 encoding. It is largely the refinement of this algorithm that makes mp3 better than mp2 or mp1. Mp3 uses the fact that, at a given instant, the human ear mostly hears the loudest part of the music and is unable to hear quieter sounds close to it. In other words, the louder sounds mask the quieter ones. The encoder removes these masked sounds, along with sounds which are too quiet to be heard at all and frequencies outside the range of human hearing. This forms the lossy compression algorithm.
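
To make the masking idea a bit more concrete, here is a very rough Python sketch. It takes the sounds present at one instant as (frequency, amplitude) pairs and throws away anything outside the commonly quoted 20 Hz to 20 kHz hearing range, plus anything much quieter than the loudest sound. The 40 dB margin is a number I picked for the example, not something from the mp3 standard, and a real encoder uses a far more detailed model of the ear (for one thing, how well a loud sound masks a quiet one depends on how close their frequencies are).

```python
# A toy illustration of the "drop what you can't hear" idea.
# MASK_MARGIN_DB is a hypothetical value chosen for this example.
import math

AUDIBLE_HZ = (20.0, 20_000.0)   # commonly quoted range of human hearing
MASK_MARGIN_DB = 40.0           # anything this far below the peak is dropped


def level_db(amplitude, reference):
    """Level of `amplitude` relative to `reference`, in decibels."""
    return 20.0 * math.log10(amplitude / reference)


def keep_audible(components):
    """Filter a list of (frequency_hz, amplitude) pairs for one instant.

    Keeps only components inside the audible band that are within
    MASK_MARGIN_DB of the loudest in-band component.
    """
    in_band = [(f, a) for f, a in components
               if AUDIBLE_HZ[0] <= f <= AUDIBLE_HZ[1] and a > 0]
    if not in_band:
        return []
    peak = max(a for _, a in in_band)
    return [(f, a) for f, a in in_band
            if level_db(a, peak) >= -MASK_MARGIN_DB]


if __name__ == "__main__":
    sounds = [(440, 1.0), (880, 0.5), (1000, 0.001), (25000, 0.8), (10, 0.9)]
    print(keep_audible(sounds))   # [(440, 1.0), (880, 0.5)]
```

In this example the 25,000 Hz and 10 Hz sounds go because nobody can hear them, and the very faint 1,000 Hz sound goes because it is 60 dB below the loudest one. Whatever is dropped never has to be stored, and that is where the big saving comes from.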

The mp3 encoding algorithm was developed starting in 1987 by the German research organization Fraunhofer-Gesellschaft, Institut für Integrierte Schaltungen, Audio & Multimedia, or FhG-IIS-A. Let's just call them FhG for short. Their code was eventually submitted to the International Organization for Standardization (ISO) and adopted as a standard format. It was then incorporated into much of today's software, like Audiocatalyst, Musicmatch Jukebox etc. Also, there are other independent encoders not using the FhG code (like Blade and Lame) and, surprisingly, it is Lame which has the most efficient algorithm (yup, unbelievable but true). Lame is also the slowest encoder (probably that's why it's more efficient). DBPowerAMP has a front-end for the Lame encoder (it is probably the only software which does).

Let's now have a look at the bit rate of mp3s. Bit rate is nothing but the number of bits used per second of audio in an mp3 file. For the geeks, 8 bits = 1 byte. The next obvious question (or is it obvious?) in your mind is "How come mp3s of equal duration have the same size when the encoding (or shall I say compression) depends upon the audio signal?" The answer is that the mp3 encoding algorithm tries to compress the audio stream to 128 kbit (or whatever you specify) per second, regardless of the actual audio. Most of the time, 128 kbit encoding ensures that there is no audible loss, whatever the audio. However, if we use 96 kbit encoding, in certain sections of the audio we would be able to make out the loss in the audio signal. For most cases, 128 kbit encoding gives the best trade-off between quality and compression. However, for high-end audio systems, one would prefer 256 kbit encoding. It is interesting to note that even if a given stretch of audio could be compressed below 128 kbit, say to 90 kbit, the encoder will still use 128 kbit if that is what the user specified, and move on. To remove this limitation of CBR (Constant BitRate), you could use the variable bitrate (VBR) option, which chooses the bit rate depending upon the input audio stream. (Not all encoders and players support variable bitrate.)
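
The arithmetic behind all this is simple enough to put in a few lines of Python. This is only a sketch: the function names are my own, the VBR example pretends the encoder picks one bit rate per second of audio, and it ignores small extras like file headers and tags. Note that in bit rates, 1 kbit means 1000 bits.

```python
# File size from bit rate, for constant and (toy) variable bit rate.
# Function names and the one-bitrate-per-second VBR model are assumptions
# made for illustration.


def cbr_size_bytes(bitrate_kbps, seconds):
    """Size of a constant-bitrate stream: bits per second x seconds / 8."""
    return bitrate_kbps * 1000 * seconds // 8


def vbr_size_bytes(bitrates_kbps):
    """Size of a stream where each list entry is the rate for one second."""
    return sum(bitrates_kbps) * 1000 // 8


if __name__ == "__main__":
    print(cbr_size_bytes(128, 240))            # 3840000
    print(cbr_size_bytes(96, 240))             # 2880000
    print(cbr_size_bytes(128, 4))              # 64000
    print(vbr_size_bytes([128, 90, 90, 128]))  # 54500
```

So a 4 minute (240 second) song at 128 kbit comes to about 3.8 MB no matter what the music is, which is why CBR files of equal duration have equal size. The last two lines show the VBR saving: if two of four seconds need only 90 kbit, the variable bitrate stream takes 54,500 bytes instead of 64,000.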

That's all for now folks. Hope to see you soon.

Venkat Krishnaraj,
eduinfoline.com