
"Swingman" wrote in message
...
|
| In the world of digital audio they are two different things. What follows
| is purposely a _very_ simplified explanation, so if some dip**** wants to
| get anal, GFY in advance:

Noted. I stand F in advance.

| A CD is 16 bit resolution, a standard mp3 is 128 mbps

As you noted, two very different things. What follows is a less simplified
explanation, meant neither to upstage, correct, nor to annoy Swingman.

Digitization of sound in the simplest case, as you noted, is a matter of
taking a certain number of samples every second and representing each sample
as a number that describes the amplitude of the sound at that instant.

Two values -- the sampling rate and the sampling resolution -- dictate the
quality of the encoding. The telephone uses 8 bits per sample. This gives
you 256 possible levels of sound between the faintest and the loudest sound.
It samples at 8 kHz, or 8,000 times per second. So once every 1/8000 second
it checks the sound amplitude and assigns it a number between 0 and 255
depending on where it falls in the loud-soft range. So every second the
telephone produces 64,000 bits of information that can be used to
reconstruct the signal at the other end.
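
For anyone who likes to see the arithmetic, here is a rough Python sketch of
the telephone case. The function names are made up for illustration, and it
assumes evenly spaced levels; real telephone systems space their 256 levels
on a logarithmic scale, which this ignores.

```python
# Toy model of telephone-style sampling: 8 bits per sample, 8,000 samples/s.
# pcm_bitrate and quantize are hypothetical helper names, not a real API.

def pcm_bitrate(sample_rate_hz, bits_per_sample, channels=1):
    """Bits produced every second by plain fixed-rate sampling."""
    return sample_rate_hz * bits_per_sample * channels

def quantize(amplitude, bits):
    """Map an amplitude in [0.0, 1.0] to an integer level 0 .. 2**bits - 1."""
    levels = 2 ** bits
    clamped = min(max(amplitude, 0.0), 1.0)
    # The top of the range would land on `levels`, so cap it at levels - 1.
    return min(int(clamped * levels), levels - 1)

print(pcm_bitrate(8000, 8))   # 64000
print(quantize(0.0, 8))       # 0   (faintest)
print(quantize(1.0, 8))       # 255 (loudest)
```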

A compact disc, on the other hand, uses 16 bits per sample, giving 65,536
possible levels of sound at each instant. This is an important increase in
quality. Sound, especially music, is a superposition of many different waves
at a wide range of frequencies. You tell the difference in character
between a violin, an oboe, and a trumpet all playing the same note by the
relative presence and absence of overtones that occur at very high
frequencies and "beat" against each other. Having only 256 levels of sound
"forces" sound to be at one level or another, possibly erasing an important
overtone. (The 8 kHz sampling rate hurts too, since it can't capture
anything above 4 kHz.) That's why you don't necessarily recognize voices
over the phone -- you rely on the overtones to discern Jim's voice from
Janice's.
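
To put a number on "forced to one level or another", here is a small sketch
that rounds a sine wave to the nearest available level and reports the worst
miss. It assumes evenly spaced levels across the range -1 to 1, and the
function name is my own invention.

```python
import math

def worst_rounding_error(bits, points=1000):
    """Largest error when one cycle of a sine wave in [-1, 1] is stored
    using 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)          # distance between adjacent levels
    worst = 0.0
    for i in range(points):
        x = math.sin(2 * math.pi * i / points)
        snapped = round((x + 1.0) / step) * step - 1.0   # nearest level
        worst = max(worst, abs(x - snapped))
    return worst

print(worst_rounding_error(8))    # telephone-style resolution
print(worst_rounding_error(16))   # CD-style resolution, far smaller
```

The error can never be more than half a step, so going from 256 to 65,536
levels shrinks the worst case by a factor of roughly 256.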

The CD also samples at 44.1 kHz. That 16-bit sample is taken once every
1/44100 second. This figure was chosen because it was thought at the time
that the human ear could only hear frequencies up to about 22 kHz and that
any higher frequencies were inaudible. (Some now argue that sounds as high
as 40-50 kHz may still affect what we hear, though that is far from
settled.) In signal processing, the Nyquist sampling theorem says that if
you want to digitally capture a signal at 22 kHz, you have to sample it at
twice that frequency or greater, or 44+ kHz.
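
What happens if you break the Nyquist rule? The tone doesn't vanish; it
"folds back" and shows up at a lower, wrong frequency. A minimal sketch,
assuming a single pure tone and an ideal sampler (the function name is
hypothetical):

```python
def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency a pure tone appears to have once it has been sampled.
    Anything above half the sample rate folds back below it."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(apparent_frequency(20_000, 44_100))   # 20000, captured correctly
print(apparent_frequency(30_000, 44_100))   # 14100, a false tone
print(apparent_frequency(5_000, 8_000))     # 3000, telephone-rate sampling
```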

These two values together define the CD format, which samples two channels
of sound at those parameters, producing 1,411,200 bits of information every
second.
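
The 1,411,200 figure is just multiplication. Here it is spelled out, with
the telephone rate alongside for comparison and the per-minute storage cost
worked out as well (the variable names are mine):

```python
# Plain fixed-rate arithmetic for the CD figures quoted above.
CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2

cd_bits_per_second = CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS
phone_bits_per_second = 8_000 * 8 * 1

print(cd_bits_per_second)                          # 1411200
print(cd_bits_per_second / phone_bits_per_second)  # 22.05 phone calls' worth

# Storage needed for one minute of CD audio, in bytes.
cd_bytes_per_minute = cd_bits_per_second * 60 // 8
print(cd_bytes_per_minute)                         # 10584000
```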

Now comes the bandwidth issue. Whatever you use to store and transmit those
signals in that format has to be capable of delivering the information at
the proper rate. Telephone equipment has to be capable of delivering 64,000
bits per second (bps) per telephone call. CD equipment has to be capable of
delivering 1.4 Mbps. In some contexts that bandwidth (data-carrying
capacity) simply isn't available, or is expensive to provide.

Enter MPEG and its sound encodings. MPEG is primarily an information
*transmission* format designed to control the transfer of audio and video
information over certain delivery systems such as cable and satellite.
Those systems have inherent data-delivery rate limits that may be fairly
draconian.

If you sample utter silence with the CD method, you still get exactly the
same amount of data as you would sampling a Grateful Dead song of the same
length. It would be nice if you had to transmit sound data only when
there was actual sound, since silence is the default output. That way you
could make the most of a limited or fixed data rate without sacrificing
quality. If your link is capable of only 100 kbps, you could send 100,000
bits in one second that may expand to ten seconds of silence followed by two
seconds of brilliantly reproduced sound. That way, when the decoder is
playing out those twelve seconds of music, your transmission system is busy
sending the next 1,200,000 bits of encoded information.
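
Here is a toy version of the "don't send silence" idea. To be clear, this is
not how MPEG actually encodes anything; it is the simplest scheme I can think
of that shows why the amount of data no longer has to track the length of the
recording. The token names are invented for the example.

```python
def encode_silence(samples):
    """Collapse runs of zero samples into ('silence', count) tokens and
    keep everything else as ('sound', [samples...]) tokens."""
    tokens = []
    for s in samples:
        if s == 0:
            if tokens and tokens[-1][0] == "silence":
                tokens[-1] = ("silence", tokens[-1][1] + 1)
            else:
                tokens.append(("silence", 1))
        else:
            if tokens and tokens[-1][0] == "sound":
                tokens[-1][1].append(s)
            else:
                tokens.append(("sound", [s]))
    return tokens

def decode_silence(tokens):
    """Expand the tokens back into the full list of samples."""
    out = []
    for kind, payload in tokens:
        if kind == "silence":
            out.extend([0] * payload)
        else:
            out.extend(payload)
    return out

samples = [0] * 10 + [3, -2, 5] + [0] * 4
tokens = encode_silence(samples)
print(tokens)   # [('silence', 10), ('sound', [3, -2, 5]), ('silence', 4)]
print(decode_silence(tokens) == samples)   # True
```

Seventeen samples go in, three tokens come out, and nothing is lost on the
way back.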

MP3, MP4, AC3, and other more advanced encoding schemes depart from the
plodding, "dumb" take-a-fixed-sample-every-nth-of-a-second method and spend
a variable number of bits on each stretch of sound, coupled with high-level
mathematical ways of approximating the shapes of sound waveforms. So where
the CD method dumbly
sends a sequence of numbers representing a climbing waveform: 4, 8, 12, 30,
70, 118, 200; the newer methods might simply record a digital shorthand that
says, "in the output, generate a geometrically-ramped signal from 4 to 200
over 0.001 second", and that takes fewer bits to describe. Now of course
you don't get exactly the same numbers back out at the other end as you put
in. So the art is to carefully establish those approximations so the
difference between them and the original isn't noticeable.
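
The ramp example can be tried out directly. This sketch assumes "geometric"
means each sample is a constant multiple of the one before it, and the
function name is made up; real codecs work on frequency content rather than
on raw ramps like this, so treat it purely as an illustration of shorthand
versus raw numbers.

```python
def geometric_ramp(start, end, count):
    """Rebuild `count` samples that grow by a constant ratio from start
    to end, rounded to whole numbers."""
    return [round(start * (end / start) ** (i / (count - 1)))
            for i in range(count)]

original = [4, 8, 12, 30, 70, 118, 200]   # seven raw numbers to send
shorthand = (4, 200, 7)                   # three numbers: start, end, count

rebuilt = geometric_ramp(*shorthand)
worst_miss = max(abs(a - b) for a, b in zip(original, rebuilt))

print(rebuilt)      # same endpoints, slightly different middle
print(worst_miss)   # how far the approximation strays from the original
```

Three numbers instead of seven, at the price of a middle that doesn't quite
match. Deciding how much of a miss the ear will forgive is the whole game.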

But that's why MP3 quality is expressed as a bandwidth -- so much information
per unit time -- and why CD quality is expressed as a sampling
rate/resolution. They are completely different *methods* of representing
sound in digital form and so they can't be directly compared. Obviously for
low transmission speeds the adaptive methods like MP3 have to rely more and
more on approximations that can be expressed in shorthand, and have to
extend those shorthands over longer ranges of input data. But the notion
behind 128 kbps is, "We have 128,000 bits per second of achievable
bandwidth; let's make the most of it by adapting our sampling strategy to
that ceiling."

MPEG compression methods can take into account things like channel coherency
and frequency separation issues. Low frequencies are non-directional, so
you can't tell whether they come from the left or right channel. Thus you
don't need to encode a rumbling bass on both the left and right. And in
most recordings, there isn't a lot of difference between the left and right
channels. So they can introduce the notion of a "common" data stream that
represents the common left-right agnostic information and then smaller data
streams that represent only what's different about the left or right
channel.
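
The "common stream plus differences" idea is usually called mid/side coding.
A bare-bones sketch follows, assuming simple lists of sample values and
function names of my own choosing; an actual encoder does a great deal more
than this.

```python
def to_mid_side(left, right):
    """Split stereo into what the channels share (mid) and how they
    differ (side)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Recover the original left and right channels."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left = [100, 102, 98]
right = [100, 100, 100]
mid, side = to_mid_side(left, right)
print(mid)    # [100.0, 101.0, 99.0]
print(side)   # [0.0, 1.0, -1.0]
print(from_mid_side(mid, side) == (left, right))   # True
```

When the channels are nearly alike, the side stream is mostly small numbers
near zero, and small numbers are cheap to encode.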

You can tune each method to produce sound that is fundamentally
indistinguishable in quality from the other's, or from a high-quality analog
recording. So saying that MP3s are inherently "better" or "worse" isn't
really addressing the question. You can make CD-type encoding bad (like the
telephone does) by lowering the sampling rate and the sample resolution. You can
make MP3 sound very good by increasing the raw bitrate available to it. But
perceived quality being equal, MP3 makes more efficient use of the available
bitrate.

If you wanted to compare the two as bitrates, a 192 kbps MP3 will sound as
good as a CD for all but the most sensitive listeners, but the MP3 will
require only 192,000 bits per second of storage and transmission space,
while the CD will require about 1,400,000 bits per second. Not quite, but
nearly an order of magnitude improvement in bandwidth usage.
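
Worked out for a four-minute song (the song length is just an example I
picked, and the helper name is mine):

```python
CD_BPS = 44_100 * 16 * 2    # 1,411,200 bits per second
MP3_BPS = 192_000           # a 192 kbps MP3

def stream_bytes(bits_per_second, seconds):
    """Bytes needed to store or send a stream of the given length."""
    return bits_per_second * seconds // 8

song_seconds = 4 * 60

print(CD_BPS / MP3_BPS)                       # 7.35
print(stream_bytes(CD_BPS, song_seconds))     # 42336000
print(stream_bytes(MP3_BPS, song_seconds))    # 5760000
```

So about 42 million bytes against not quite 6 million.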

There is actually a tie-in to woodworking here, so it's not as off-topic as
it seems.

Say you want to duplicate a contour in some piece -- say an old crown
molding. You can follow a fairly straightforward but tedious method of
taking the cross section at intervals and establishing the relative position
of all the points along the profile curve at small intervals, relative to
some reference. We have contour gauges that do this. So your "data set"
for that molding is a set of contour gauge tracings at intervals along the
length of the piece.

But a smarter method might be to note that the molding is an extrusion, so
you only have to sample one cross section that applies to the whole length.
Or at worst, the molding might be a repeating pattern, so you only have to
sample the pattern at intervals and then just specify that the pattern is to
be repeated as needed. Instead of a contour gauge that blindly collects the
same amount of data each time it is used, you might note instead that the
contour is composed of a circular arc (of a certain radius and center point,
with certain angular end points) followed by a line segment (of two end
points), and so forth. So your record of the contour is a high-level
description of the geometry, not a lengthy collection of raw points in space
(i.e., the settings of each pin in the contour gauge). That might actually
come in handy later as you're making the molding plane blade or selecting
router bits. You might have a router bit that cuts that specific circular
arc or line segment.

The point is that your description taken by the second method -- while more
complicated to obtain and possibly to reproduce than simple contour gauge
tracings -- is more concise and may boil down to standard tool crib
equipment. Where that's important, you have an advantage.
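
For the programmers in the shop, the same comparison in code. The profile
below is a made-up quarter-round followed by a flat, and the names and
dimensions are assumptions for the sake of the example, not any real molding.

```python
import math

def arc_points(cx, cy, radius, start_deg, end_deg, n):
    """n points along a circular arc (what the contour gauge would see)."""
    pts = []
    for i in range(n):
        t = math.radians(start_deg + (end_deg - start_deg) * i / (n - 1))
        pts.append((cx + radius * math.cos(t), cy + radius * math.sin(t)))
    return pts

def line_points(x0, y0, x1, y1, n):
    """n points along a straight segment."""
    return [(x0 + (x1 - x0) * i / (n - 1), y0 + (y1 - y0) * i / (n - 1))
            for i in range(n)]

# High-level description: nine numbers in total.
profile = [
    ("arc", (0.0, 0.0, 1.0, 90.0, 0.0)),   # center, radius, start/end angle
    ("line", (1.0, 0.0, 2.0, 0.0)),        # two end points
]

def expand(profile, n=50):
    """Turn the description back into raw points, n per segment."""
    pts = []
    for kind, params in profile:
        if kind == "arc":
            pts.extend(arc_points(*params, n))
        else:
            pts.extend(line_points(*params, n))
    return pts

points = expand(profile)
numbers_in_description = sum(len(params) for _, params in profile)
numbers_in_point_list = 2 * len(points)
print(numbers_in_description)   # 9
print(numbers_in_point_list)    # 200
```

Nine numbers describe what the "contour gauge" needs two hundred to record,
and you can regenerate the points at any spacing you like.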

--Jay
(who used to program MPEG satellite systems for a living)