Survey of Audio Digitization
Introduction
I'm J, and this is the page
of audio information I've gathered for the Spring '06 Survey of Digitization
(SoD) class at the School of Information at UT-Austin. I am not an expert, but
a musician, music collector, and audio enthusiast. This site will focus,
for the sake of the SoD students, on questions surrounding the digitization
of audio at libraries, collections, and archives. This site is for those who
may be involved in some way with the care of audio and its migration from analog
to digital.
For the first of my two three-hour classes I've decided to concentrate on
covering all the concepts and issues involved in working with analog/digital
(a/d) conversion. The second class can then focus on the dilemmas involved with
actual projects, particularly our own final project: the digitization of cassettes
from the Benson Library at UT-Austin.
Overview of Sound and Recording
Sound Collections and Guides
Our Digitization Project

A few basic points of my own:
° More than with the ears we hear with the mind.
° The audio preservationist and restorationist needs to understand/hear
every stage of a sound's life-cycle. (Ideally, this requires knowledge of
genres and traditions, ethnography and musicology, vocalization and instrumentation,
score and arrangement, players, acoustics, analog recording technology, formats
and playback technologies, sound engineering, digital technologies, etc.)
° Though audio archives need preservation standards and fixed playback
options, formats and players constantly evolve in a race to their own extinction.
Andrew Dylan's points from his
commencement of the Sound Savings Symposium on why sound matters:
° Sound is a medium that is intimately tied to tests of copyright limits
in our society. And this is most blatantly so since sound is a medium in high
demand by consumers.
° Music and sound are transcultural in a manner that is not so for text.
Whether white men can play the blues may never be resolved in some purists'
minds, but there is no doubting that the representations of history and culture
that are captured in music can be processed and enjoyed by people outside
that culture. The rise of world music, the merging of cultural styles, and
the worldwide love of opera by people who cannot speak a word of Italian are
testimony to the emotional response people have to music.
° The next tidal wave of digital content is rich media, a seamless convergence
of audio, video, and text. As yet, true hypermedia of the kind envisaged decades
ago has yet to emerge and even the Web, in all its glory, is (with some noticeable
exceptions) a text-heavy medium. Audio is the great underutilized resource.
Hypermedia in popular use is a visual medium, with audio seen as ‘extra’,
but there are signs that this will change.
Sound Basics
What is sound? "Sound can be defined as the change in
air pressure above and below an equilibrium (usually the barometric pressure).
For example, when a bass drum is struck, the skin vibrates back and forth. As
the skin travels outwards, away from the center of the drum, the air pressure
surrounding the drum rises above the barometric pressure; conversely as the
drum skin travels inwards, the air pressure lowers. This to-and-fro action occurs
numerous times per second creating waves of compression and decompression in
surrounding air. As air pressure increases by the outward motion of the bass
drum skin, the eardrum is pushed towards the center of the head; conversely,
as pressure decreases, the eardrum travels away from the center of the head.
The higher the vibration speed, the higher the pitch; the larger the change
in air pressure, the louder the sound." By Gilles St-Laurent, Music Division,
National Library Of Canada, from here.

There is a good animated gif of a sound wave's low/high pressure oscillation
against a membrane here.
Check out this old black and white short, "How
the Ear Functions."
Sound Perception
Humans generally perceive sound in the pitch range from 20Hz to 20kHz (20,000Hz).
"For those that are curious, the lowest note on the piano is A sounding
at 27.500 hz and the highest is the top C sounding at 4186.009 hz (cycles per
second). A440 concert pitch refers to tuning the A above middle C to 440.000
hz." from here.
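The piano figures quoted above can be checked with a little arithmetic. This is a minimal sketch that assumes twelve-tone equal temperament (the quote does not name a tuning system, but its numbers match it) and the common convention of numbering the 88 keys from 1 (lowest A) to 88 (top C), with A440 as key 49. The function name is my own.

```python
def piano_key_frequency(key_number: int, a4_hz: float = 440.0) -> float:
    """Frequency in Hz of a key on an 88-key piano.

    Assumes equal temperament: each key is a factor of 2**(1/12) above
    the previous one, and key 49 is the A above middle C.
    """
    return a4_hz * 2 ** ((key_number - 49) / 12)


lowest_a = piano_key_frequency(1)    # lowest A on the piano
concert_a = piano_key_frequency(49)  # A above middle C
top_c = piano_key_frequency(88)      # highest C on the piano

print(f"{lowest_a:.3f} Hz, {concert_a:.3f} Hz, {top_c:.3f} Hz")
```

Note that the whole keyboard sits comfortably inside the 20Hz to 20kHz range of human hearing; the overtones above the top C are what fill the rest.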


Humans can generally perceive sounds as quiet as about 4dBA. On the upper end,
noise generally becomes painful around 120dBA.
Hearing loss is a function of noise level over time. Maximum exposure should
be limited to about 85dBA for 8hrs. 110dBA can damage your ear after a minute
and a half, 140dBA can cause damage in an instant, and perforation of the ear
drum could occur around 160dBA.
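To see how quickly safe listening time shrinks as the level goes up, here is a rough sketch. It assumes the NIOSH-style rule of thumb of 8 hours at 85dBA with the allowed time halving for every additional 3dB; that rule is my assumption about where the figures above come from, not something the sources state. The function and parameter names are illustrative.

```python
def permissible_exposure_hours(level_dba: float,
                               reference_dba: float = 85.0,
                               reference_hours: float = 8.0,
                               exchange_rate_db: float = 3.0) -> float:
    """Allowed daily exposure time, halving for every exchange_rate_db above the reference."""
    return reference_hours / 2 ** ((level_dba - reference_dba) / exchange_rate_db)


for level in (85, 94, 100, 110):
    minutes = permissible_exposure_hours(level) * 60
    print(f"{level} dBA: about {minutes:.1f} minutes")
```

Under these assumptions 110dBA works out to roughly a minute and a half, which agrees with the figure given above.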
Here is a scale to give you an idea of decibel levels:
10 normal breathing
20 whispering at 5 feet
30 soft whisper
50 rainfall
60 normal conversation
70 TV audio
85 handsaw
110 shouting in ear
120 thunder
For a more exhaustive list of common sounds and their decibel levels go here.
Can You Trust Your Ears?
Equal-loudness
contour: this graph represents the change in loudness necessary across the
audible pitch range for the pitches to be perceived at equal loudness.

Effect of Loudness Changes on Perceived Pitch - A high pitch (>2kHz) will
be perceived to be getting higher if its loudness is increased, whereas a low
pitch (<2kHz) will be perceived to be going lower with increased loudness.
-from here.
Psycho-acoustics
(from Wikipedia): It is important to note that the question of what humans hear
is not only a physiological question of features of the ear but very much also
a psychological issue.
Say this: Maresy-doats and dozy-doats, and liddle lamzy divey. A kiddley
divey too, wouldn't you-oo? The way you hear it depends on whether you
know what you're saying, which greatly affects interpretation. It's about animals
eating oats. How do you hear it now? Sound is its own language. The more you
know about what you're listening to, the more you hear. Recording, listening,
making sound, dubbing recordings, making conversions, are all a psycho-acoustical
art and science.
Brief History of Recording
Upon hearing an early sound recording device in 1888, Sir Arthur Sullivan stated
that he was "astonished and somewhat terrified at the result of this evening's
experiments--astonished at the wonderful power you have developed, and terrified
at the thought that so much hideous and bad music may be put on record forever."
- Edward B Moogk
Sound Timelines
A Few Major Events in Early Audio Technology
1876 Telephone - Alexander Graham Bell patents his telephone, built with the
assistance of young self-trained engineer Thomas A. Watson. Elisha Gray, who
developed a similar device at about the same time, will unsuccessfully
1877 Phonograph - Working with a team of engineers at his Menlo Park, New Jersey,
laboratories, Thomas Alva Edison perfects a system of sound recording and transmission.
The first recording replayed is a voice saying "Mary had a little lamb
its fleece was white as snow." from here.
1916 Microphone - E.C. Wente at Bell Labs developed the condenser microphone
to translate soundwaves into electrical waves that could be transmitted by the
vacuum tube amplifier.
1918 Speaker - Henry Egerton at Bell Labs patented on Jan. 8 the first balanced-armature
loudspeaker driver.
1921 P.A. System - The amplifier, microphone, loudspeaker innovations were
combined at Bell Labs to create the first public address systems.
1928 - Nyquist presented his theorem stating that an analog signal should
be sampled at regular intervals of time, at a rate at least twice the signal's
bandwidth, to reproduce it with good fidelity.
1931 Hi-Fi Recording - in April, Leopold Stokowski invited Bell Labs to begin
sound recording experiments with his Philadelphia Orchestra. In December, the
first electrical recordings were made and continued throughout the 1931-32 concert
season. 125 of these test recordings have been preserved (a limited edition
album of these masters was released in 1980 by Bell Labs).
1932 Stereophonic Recordings - in March, several test recordings were made
at the Academy of Music using two microphones connected to two styli cutting
two tracks on the same wax disk. This recording is the earliest example of stereophonic
recording that has survived.
1957 A/D Conversion - "New forms of playback, file formats, compression and storage of
data are all changing on a seemingly daily basis, but the underlying mechanisms
for converting real-world sound into digital values, manipulating those data
and finally converting them back into real-world sound has not varied much since
Max Mathews developed MUSIC I in 1957 at Bell Labs...The importance of their
work to information theory, computing, networks and digital audio cannot be
understated. For example, the same data stream theory used in high-speed networking,
known as T-1 lines is the same technology used in higher-end Digidesign Pro
Tools systems today (TDM or time division multiplexing). A widely used method
for encoding and decoding binary data, such as that used in digital audio, pulse
code modulation or PCM was also developed early on at Bell Labs, attributed
to John R. Pierce..." Jeffrey
Hass
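The two ideas in the timeline that matter most for a/d conversion, Nyquist's sampling rule and pulse code modulation, can be sketched in a few lines. This is only an illustration, not how any real converter is built. The function names are my own, and the 44.1kHz/16-bit figures are just the CD-quality numbers used elsewhere on this page.

```python
import math


def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling at sample_rate_hz.

    Tones below half the sample rate come back unchanged; anything higher
    "folds" down to a lower frequency (aliasing).
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def sample_and_quantize(tone_hz: float, sample_rate_hz: float,
                        bit_depth: int, num_samples: int) -> list[int]:
    """Sample a full-scale sine wave and round each sample to a signed PCM integer."""
    max_code = 2 ** (bit_depth - 1) - 1
    return [round(max_code * math.sin(2 * math.pi * tone_hz * n / sample_rate_hz))
            for n in range(num_samples)]


print(alias_frequency(1_000, 44_100))    # below half the sample rate
print(alias_frequency(30_000, 44_100))   # above half the sample rate
print(sample_and_quantize(1_000, 44_100, 16, 8))
```

A 1kHz tone survives sampling at 44.1kHz, but a 30kHz tone comes back as 14.1kHz, which is why converters filter out everything above half the sample rate before sampling.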
Analog Formats
We've read "How
to Care for your Audio," from the National Film and Sound Archive about
basic care for your audio. Let's take a look at a few formats you might find
in the basement. Most info gathered from here.
Cylinders - The first recording format, and the first to be
mass-produced.
- Material - Originally they were a blend of plant and animal waxes that was
a bit too variable. Then Brown wax cylinders were made of metallic soap composite.
Then came Gold-Molded, Amberol, Celluloid and others.
- Handling - Handle by inserting middle and index fingers in the center hole,
then gently spread them to just keep the cylinder from slipping off. Do not
touch the grooves of wax cylinders; they are very susceptible to mold. Wax
cylinders should be at room temperature before touching; the thermal shock
from the warmth of your hand can cause cold wax cylinders to split.
Disks - Many types of disks have been produced
over the last 100 years. Here are two common ones.
Acetate
- Material - Continuous shrinkage of the lacquer coating due to the loss of
the castor oil plasticizer is the primary destructive force. The gradual loss
of plasticizer causes progressive embrittlement and the irreversible loss
of sound information. Because the coating is bonded to a core which cannot
shrink, internal stresses result, which in turn cause cracking and peeling
of the coating.
Vinyl

Vinyl Bowls - make your own!
- Material - Thus far, vinyl has proven to be the most stable of the materials
that have been used in the manufacture of sound recordings.
All Disks
- Storage - Use soft polyethylene inner sleeves. Do not use paper or cardboard
inner sleeves and do not store records without inner sleeves.
- Handling - Handle all grooved discs (78s, 45s, LPs, and acetate discs) by
their edge and label areas only.
Magnetic Tape - Magnetic tape first appeared
in North America just after World War II.


- Material - Magnetic tape is made up of two layers: a "base" layer,
and a thin "binder" layer which is bonded onto the base. The binder
contains ferromagnetic particles whose permanent alignment within the binder
produces the copy of sound waves. It is prone to sticky-shed syndrome: The
most common and serious magnetic tape degradation occurs through hydrolysis,
the chemical reaction wherein an ester such as the binder resin "consumes"
water drawn from humidity in the air to liberate carboxylic acid and alcohol.
Hydrolysis in magnetic tape results in the binder shedding a gummy and tacky
material which causes tape layers to stick together and inhibits playback
when it is deposited onto the tape recorder heads. The added friction increases
tape stress and can cause machines to stop. Hydrolysis also causes a weakening
in the bond holding the binder to the backing, which results in shedding or
possible detachment.
- Storage - Store tapes after they have been played through, not after rewinding.
- Handling - Handle by the outer edge of the reel flanges and center hub areas
only. Do not squeeze flanges together -- it will damage tape edge.
- Playback - many reel-to-reel players still exist, though there is a diversity
of formats, and the players too are usually in need of repair, cleaning, and calibration.
General Handling and Storage Rules
- Store in a facility that stays within a constant cool temperature and low
humidity, and has very limited exposure to u/v light.
- The air conditioning system should be equipped with dust filtering equipment.
- Store most formats standing up, not lying down on their sides.
- Never touch the surface of a recording. Use clean, white lintless cotton
gloves and handle by the edges.
- Recordings should not, unnecessarily, be left exposed to open air. Return
items to their containers when not in use and never leave storage containers
open.
- Do not place recordings near sources of dust including paper or cardboard
dust. Keep the surrounding area clean.
- Do not consume food or beverages in the area in which recordings are handled. Keep
storage facilities as clean and dust-free as possible.
- Keep labeling to a minimum, and limit the placement of labels, especially
pressure sensitive labels, to the container, using conservation ink.
- Keep equipment clean, well adjusted and in good working condition.
Good Resources
Analog/Digital Conversion
and Digital Formats
I found these sites to be good at explaining the process:
Digital Formats
Library of Congress
The
Save Our Sounds Project - Dr. Michael Taft (Library of Congress), Head,
Archive of Folk Culture, American Folklife Center
Factors considered in choosing which sounds to save for the Save Our Sounds
project:
° Content.
° Historical or cultural significance.
° Present state of accessibility.
° Fragility and deterioration.
° Variety of sound recording formats.
° Complexity of collections.
° Diversity of material.
° Other political considerations.
Review
of Audio Collection Preservation Trends and Challenges - Sam Brylawski (Library
of Congress), Head, Recorded Sound Section Motion Picture, Broadcasting and
Recorded Sound Division
"The Library's collections include over 100,000 audio cassettes and
170,000 open-reel tapes."
"Indeed, there is no permanent digital format."
"Following the Open Archival Information System (OAIS) model established
by NASA, digital objects for sound recordings in the repository will include
digital images of record labels or tape boxes and other graphics or accompanying
text, in addition to the audio files. The audio files themselves will be very
large, recorded at sampling rates of 96kHz or 192 kHz, with 24-bit word
lengths."
LOC uses Metadata
Encoding and Transmission Standard (METS):
"METS is complicated. Because it requires populating a very large number
of fields, at the present, it is time-consuming to create a full METS record.
Officials at the Library of Congress hope and expect to "develop tools
for automatically creating metadata," as recommended in a study of challenges
related to the preservation of digital content."
The Macaulay Library of
Natural Sounds at Cornell University
Info gleaned from this article at RLG DigiNews.
- 160,000 recordings of bird, insect, frog, and mammal vocalizations
- Analog formats included acetate disk, cassette, and open reel-- in various stages of deterioration
- Started with 3, moved to 6 studios

Audio control issues
- Playback equipment calibration
- visually inspected for excessive wear patterns
- adjust tape tension; set low speed to provide
gentle handling of fragile tapes
- calibrate reel-to-reel to known international
standards using calibration tapes from Magnetic
Reference Laboratories, Inc: head alignment (height, wrap, and azimuth),
playback equalization, playback reference levels, absolute speed, and
wow and flutter.
- biweekly calibration/alignment using a computer-based
test set manufactured by Audio
Precision; stored and routinely compared for performance over time;
easy to spot problems before they have a negative effect on the transfer
process
- A/D Converter and Editing System
- reviewed A/D converters, narrowed to six
possibilities, requested in-house testing units and audition; though all
six had very similar published specifications, the actual sound character
or lack thereof was very different.
- After grueling A/B listening tests and spectral
analysis, final decision: the Prism
Dream AD-2, the only device that did not color (alter) our signals...were
indistinguishable from our highest-quality analog sources.
- A/D conversion requires ultra-clean power supplies; exceptional grounding
procedures; ultra-stable clocking devices; high-quality, low-tolerance
components; and exceptional printed circuit board design
- Chose a 48-bit data path throughout the system; most workstations are
32 bit
- Digital Media
- pre-determined: optical-disc rather than
tape based; storage requirements were
going to be huge, 32MB/minute of 2-channel audio; 96kHz sampling rate,
24-bit
- options available at the time: CD-R, DVD-RAM,
and DVD-R
- CD-R: good player compatibility, excellent
life expectancy, and low cost but fell short in terms of capacity
- DVD-RAM offered increased capacity and
long life expectancy but was expensive and did not offer the archival
security of a write-once format
- DVD-R offered everything we require
- Digital Format - goal: as generic a digital collection
as possible, maximizing accessibility and migration to the next generation
of digital storage
- wanted DVD-Audio but soon realized that industry-imposed
copy-protection schemes would significantly hamper our accessibility and
future migration requirements
- chose DVD-ROM using the Universal
Disc Format (UDF) standard
- Audio Interchange File Format (AIFF) data
files
- audio file has a voice ID at the beginning
with the asset number, which in our case is the MLNS catalog number
- same number is also used as the digital file
name
- No other metadata is embedded in the audio
file. Due to the common occurrences of species splitting and renaming
we chose to store all relevant metadata in a separate relational database.
- Disk Writing
- written using Pioneer DVR-S201 using DVD-R 4.7GB Authoring version 2.0
media
- uses a 635nm-laser wavelength instead of the general-use 650nm-laser;
still fully compatible with all DVD readers
- designed for authoring and replica masters, higher quality, more consistent
than the general-use discs
- Maxell, TDK, and Pioneer in lots of 100 to 200 at a time to minimize
any batch-related problems
- custom DVD authoring program from Sonic Solutions handles the actual
disc formatting, writer control, and bit/bit verification
- 60 minutes/4.3GB of data.
- in-house testing has revealed that the disc quality decreases near the
extreme outer diameter of the disc, so we limit our data to 4.3 to 4.4GB/disc
- two, first-generation discs containing roughly 125 min. stereo/250 min.
mono; One disc is placed in a large Plasmon D-480 robotic jukebox for
in-house distribution, while the second is stored off site at a secure,
climate-controlled, underground storage facility
- Quality control - many years of experience testing CD-R technologies
- issues: writer/disc compatibility, writing-speed issues, and dye-formulation
problems; differences often manifest themselves as significantly higher-than-acceptable
error rates or tracking problems
- DVD-R technologies appear to share some of these same issues so every
disc created undergoes a series of rigorous tests
- Using an AudioDev Computer Aided Test System (CATS), we first test
every blank disc. During this test phase the disc is subjected to
20 different tests that measure such values as disc reflectivity,
push-pull, wobble signal-to-noise, land pre-pit level, block error
rate (BLER), etc
- testing blank disks saves time and money writing (w/expensive lasers)
to bad disks
- lastly, test quality of the writing process using the CATS: Over
50 important parameters are tested during this phase, including servo
and tracking, jitter analysis, digital errors, dropouts, HF parameters,
and physical measurements. The results of this process offer a pass/fail
report
- quality control reports stored electronically
- Accessibility/Distribution
- DVD-R discs are our high-resolution “core-archive,” available
only in-house via a high-speed network
- external distribution via the Internet includes a CD-Audio quality 44.1kHz/16-bit
wave file, a 96kbps MP3 file, a multi-bitrate RealAudio streaming file,
and soon, a QuickTime streaming file
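The storage figures in the notes above are easy to sanity-check, since uncompressed PCM audio takes sample rate × bytes per sample × channels for every second. Here is a small sketch using the 96kHz/24-bit settings and the 4.3GB-per-disc limit from the article. The function name is my own, and I am assuming "GB" means decimal gigabytes.

```python
def pcm_bytes_per_minute(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Size of one minute of uncompressed PCM audio, ignoring file headers."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * 60


stereo = pcm_bytes_per_minute(96_000, 24, 2)
mono = pcm_bytes_per_minute(96_000, 24, 1)
disc_capacity_bytes = 4.3e9  # per-disc data limit from the article; decimal GB assumed

print(f"{stereo / 2**20:.1f} MiB per stereo minute")
print(f"{disc_capacity_bytes / stereo:.0f} stereo minutes per disc")
print(f"{disc_capacity_bytes / mono:.0f} mono minutes per disc")
```

The results land close to the article's own numbers: a little over 32MB per stereo minute, and roughly 125 minutes of stereo or 250 minutes of mono per disc.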
Future Proofing
Only time will tell, but if history is any indication, we assume that in
the not-too-distant future some new and better digital format for long-term
preservation will appear in the market place. Unfortunately, as technology
changes, it typically renders existing hardware and software obsolete. In
the meantime we will monitor new digital storage technologies while continuing
to grow what we believe to be a very robust and accessible digital storage
solution. When an improved, standardized digital format does appear, we
feel confident that we have set the stage to migrate data from our current
storage strategy to the next generation in a relatively painless and automated
fashion.
When metadata is lost in the transfer from analog to digital, we lose more
than we realize. This article makes the point that, "Online music stores
should facilitate rather than hinder access to this information before, during
and after a song or album is purchased."


- The Digital Music Era - those downloading music from a free P2P network are aware
that, in terms of the absent and incorrect information that comes with MP3s,
they get what they pay for
- Release Dates - "released" often refers to the date the CD version
of an album was released
- Albums - the continuous release of compilations can make it difficult to
identify original albums
- Partial Albums - The store should list the complete songs in order, with
the unavailable songs shaded out.
- Identity Crisis - Removing the identity of artists is one of digital music's
largest threats to jazz preservation
- Album Notes - Most jazz albums for sale in the ITMS have none of the original
album's liner notes or session information. When "Album notes" are
included, their content and quality are extremely variable
- Get Info - album notes should be included in "Get Info" rather than making
the listener log back in to the site to view "Album notes"
- Album Design - Album covers shrank drastically when CDs were introduced.
They vanished in P2P interfaces. They have returned as thumbnails in the ITMS.
With the loss of so much other info, browsing by album art may be the most
effective way for jazz fans to find music.
- Act Now - Email your favorite digital music store. Suggested text: I am
a loyal customer who would buy more music from your store if it was sold with
the same information (cover art, photos, dates, liner notes, session and songwriting
credits) that I get when I purchase a CD.
Best Practice Guides
Resources of Resources
These are sites with lots of great links to information about
audio archiving.
Audio Assignment
Thanks to Uri, we have a box full of cassettes from the Benson Library in need
of digitization. Each of you will get a cassette to digitize and a spreadsheet
to fill in for your audio project. The purpose of this assignment is to give
you some experience with audio digitization and to help the Benson out. This
is not a difficult project; I will lead you through it step-by-step and hopefully,
we will produce digital copies of these tapes just as though we were the digitization
lab hired to do the job.
Assignment: Digitize both sides of the tape assigned to you,
filling out all the metadata fields given to you on the spreadsheet for documentation
and quality control. Produce the raw capture .wav files, archival access .wav
copies, and network/internet access .mp3 files for each side of the cassette.
Burn a CD of the archival access copies with a track for each side.
Deliverables:
- 2 archival masters, raw, un-edited 24bit/96kHz wave files - one for each
side of the tape.
- 2 archival access copies - 24bit/96kHz wave files with the major silences
edited out- one for each side of the tape.
- 2 variable bit rate .mp3 files of the edited archival access copies- one
for each side of the tape.
- 1 .xls spreadsheet, which I will give you, filled in as completely as possible.
- 1 CD containing 2 tracks, side a and side b, with a printed label
Where to deliver them:
The audio files will go in our project folder on the network, 2007-spring\projects\ischool-blac-aud-...
Inside this folder, create a folder with your UTEID for its name, in lowercase.
Inside your folder, create four folders:
- an "archival master" folder for the 24bit/96kHz archival master,
- a "publication master" folder for the 24bit/96kHz archival publication
master
- an "access derivative" folder for the VBR mp3 archival access
derivative
- a "CD" folder for each side of tape, saved at 16bit/44.1kHz, and
a 16bit/44.1kHz 2 track CD containing sides a and b
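If you would rather not create the folders listed above by hand, a few lines of Python will do it. This is just a convenience sketch: the project folder path is truncated above, so you would pass in the real location yourself, and "ABC123" is a made-up UTEID. The function name is my own. The demonstration runs in a temporary scratch directory so it does not touch the network share.

```python
import tempfile
from pathlib import Path

# Folder names as listed in the assignment.
FOLDER_NAMES = ["archival master", "publication master", "access derivative", "CD"]


def create_delivery_folders(project_root: Path, uteid: str) -> list[Path]:
    """Create <project_root>/<uteid in lowercase>/<each delivery folder>."""
    student_folder = project_root / uteid.lower()
    created = []
    for name in FOLDER_NAMES:
        folder = student_folder / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(folder)
    return created


# Demonstration in a scratch directory with a hypothetical UTEID.
with tempfile.TemporaryDirectory() as scratch:
    for folder in create_delivery_folders(Path(scratch), "ABC123"):
        print(folder.relative_to(scratch))
```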
Gradable Components
- 2 pts. for having completed the archival master transfer, named it correctly,
and named your folder correctly
- 2 pts. for having completed the archival publication master transfer, named
it correctly, and named your folder correctly
- 2 pts. for having completed the archival access derivative transfer, named
it correctly, and named your folder correctly
- 2 pts. for having filled in the metadata spreadsheet completely and performed
the quality control measures
- 2 pts. for producing the CD containing 2 tracks, side a and side b, with
a printed label
Quality Control
You will be performing three quality control measures and reporting them on
the spreadsheet:
- Are there strange noises? - I would like you to randomly select 10% of your
audio to listen to for noises that may have entered the audio file during
digitization. These are things that might occur because of a malfunction in
the A/D converter, the processor, power, etc. Note: using the computer you
are doing the digitization on for anything else while digitizing could cause
such noise. Don't have any other application running while digitizing. A momentary
lapse in processor resources could mar your audio. Since we are digitizing
90 minutes, 10% comes out to about 10 minutes to listen back to. Select any
5 minutes from each track to listen for anything "digital sounding"
that doesn't sound like it came from the tape. Once you have listened to 5
minutes of each, and are satisfied there are no strange digital noises, circle
the "no" under "strange noises," and move on. If you do
hear something funny, circle "yes" and make note of where it occurs
and what it sounds like.
- Does the digital dub sound the same as the original? This is a listening
test I would like you to perform on the first ten seconds and the last ten
seconds of each side. I've chosen such a small amount of time so that you can
effectively listen to the tape and then listen to the digital with the sound
of the tape in mind. You will need to unplug the headphones from the Audiophile
box and plug them into the tape player directly, going back and forth to compare.
Do it a few times. Do they sound the same? If yes, circle "yes" under
"sound the same?" If no, try to describe what
the difference is in the notes. I'd like you to do this at the beginning and
end to confirm that nothing dramatic changed during the process that you might
not have been aware of, but which colored the audio somehow.
- Was there any clipping? Clipping occurs especially easily in digital recording
when the signal is too loud to be processed resulting in distortion. It can
be detected visibly when the audio signal fills the audio wave window so much
that it hits the ceiling/floor and is simply flat. Visually scroll through
the capture audio for signs of clipping. If you see any, listen to it to see
if it distorts, causing unwanted noise. If you find any, circle "yes"
under "any clipping" and note where it occurs and how badly it interrupts
the audio track. If you don't see any, circle "no." In the following
picture clipping is occurring to the red and orange waves, not the blue ones:
- If you answered "yes" to "strange noise?", "no"
to "sounds the same", or "yes" to "any clipping,"
circle "yes" under "needs redub," and explain how seriously
you think the problem interrupts the audio, or whether perhaps it only needs
redubbing if we are going to be perfectionists about the audio.
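The clipping check described above is a visual one, but the same idea can be sketched in code: clipped audio shows up as several samples in a row stuck at the largest or smallest value the bit depth allows. This is a rough helper, not a replacement for looking and listening. The function names and the three-sample threshold are my own choices, and the reader assumes a plain 16-bit or 24-bit PCM .wav file that Python's wave module can open.

```python
import wave


def read_pcm_channels(path: str) -> tuple[list[list[int]], int]:
    """Return (one list of samples per channel, bit depth) from a PCM .wav file."""
    with wave.open(path, "rb") as wav_file:
        channel_count = wav_file.getnchannels()
        width = wav_file.getsampwidth()  # bytes per sample: 2 for 16-bit, 3 for 24-bit
        raw = wav_file.readframes(wav_file.getnframes())
    samples = [int.from_bytes(raw[i:i + width], "little", signed=True)
               for i in range(0, len(raw), width)]
    # Samples are interleaved (left, right, left, right...), so split them up.
    channels = [samples[c::channel_count] for c in range(channel_count)]
    return channels, width * 8


def find_clipped_runs(samples: list[int], bit_depth: int,
                      min_run: int = 3) -> list[tuple[int, int]]:
    """Return (start index, length) for each run of samples stuck at full scale."""
    ceiling = 2 ** (bit_depth - 1) - 1
    floor = -(2 ** (bit_depth - 1))
    runs = []
    run_start = None
    for index, value in enumerate(samples):
        at_limit = value >= ceiling or value <= floor
        if at_limit and run_start is None:
            run_start = index
        elif not at_limit and run_start is not None:
            if index - run_start >= min_run:
                runs.append((run_start, index - run_start))
            run_start = None
    # Handle a run that lasts until the end of the audio.
    if run_start is not None and len(samples) - run_start >= min_run:
        runs.append((run_start, len(samples) - run_start))
    return runs


# A tiny made-up 16-bit example: three samples flattened against the ceiling.
print(find_clipped_runs([0, 20000, 32767, 32767, 32767, 15000, -200], 16))
```

To turn a start index into a time for your spreadsheet notes, divide it by the sample rate (96,000 for the archival masters). Anything the helper flags should still be listened to, as described above.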