Citation
Analysis of zero-level sample padding of various MP3 codecs

Material Information

Title:
Analysis of zero-level sample padding of various MP3 codecs
Creator:
Berman, Josh ( author )
Language:
English
Physical Description:
1 electronic file (48 pages) : ;

Subjects

Subjects / Keywords:
Sound -- Recording and reproducing -- Digital techniques ( lcsh )
Computer science -- data compression ( lcsh )
MP3 (Audio coding standard) ( lcsh )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Abstract:
As part of the MP3 compression process, the codec used will often pad the beginning and end of a file with “zero-level samples”, or silence. The number of zero-level samples (ZLS) varies by codec used, sample rate, and bit depth of the compression. Each re-compression of a file in the MP3 format will typically add more silence to the beginning and/or end of the file. By creating multiple generations of files using various audio editors/codecs, we hope to be able to determine the generation of MP3 compression of the files based solely off of the number of ZLS at the beginning and end of the file.
Thesis:
Thesis (M.S.) - University of Colorado Denver.
Bibliography:
Includes bibliographic references
System Details:
System requirements: Adobe Reader.
General Note:
College of Arts and Media
Statement of Responsibility:
by Josh Berman.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
946652059 ( OCLC )
ocn946652059
Classification:
LD1193.A70 2015m B47 ( lcc )


Full Text
ANALYSIS OF ZERO-LEVEL SAMPLE PADDING OF VARIOUS MP3
CODECS
By
JOSH BERMAN
B.S., University of Colorado, Denver, 2013
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado, in partial fulfillment
of the requirements for the degree of
Masters of Science
Recording Arts
2015


2015
JOSH BERMAN
ALL RIGHTS RESERVED


This thesis for the Master of Science Degree by
Josh Berman
has been approved by the
Recording Arts Program
By
Lorne Bregitzer
Jeff Smith
Catalin Grigoras, Chair
11/20/2015


Berman, Josh (M.S. Recording Arts)
Analysis of Zero-Level Sample Padding of Various MP3 Codecs
Thesis directed by Assistant Professor Catalin Grigoras
ABSTRACT
As part of the MP3 compression process, the codec used will often pad the
beginning and end of a file with zero-level samples, or silence. The number of
zero-level samples (ZLS) varies by codec used, sample rate, and bit depth of the
compression. Each re-compression of a file in the MP3 format will typically add
more silence to the beginning and/or end of the file. By creating multiple
generations of files using various audio editors/codecs, we hope to be able to
determine the generation of MP3 compression of the files based solely off of the
number of ZLS at the beginning and end of the file.
The form and content of this abstract are approved. I recommend its
publication.
Approved: Catalin Grigoras


ACKNOWLEDGEMENTS
I'd like to thank my family, first and foremost, for being so awesome and
supportive throughout my education. Without you, none of this would have happened.
Another big thanks to all of the great teachers I've had throughout the years.


TABLE OF CONTENTS
CHAPTER
I. INTRODUCTION
    A Brief History
    Zero-Level Padding
    Purpose of the Study
II. MATERIALS AND METHODS
III. RESULTS
    Audacity
    Adobe Audition CS3 (Outlier Omitted)
    Adobe Audition CC2014
    Adobe Audition CC2015
    Blade Encoder
    dBpoweramp
    ffmpeg
    fpMP3 Encoder
    GOGO
    Apple iTunes
    Apple Logic Pro
    LAME (Mac Terminal)
    Mad
    mpg123
    Avid Pro Tools (Outlier Omitted)
    SoX
    Switch (Outlier Omitted)
    Wavelab (Fraunhofer)
    Wavelab (LAME)
    Xing
    All LAME Programs
    All Fraunhofer Programs
IV. DISCUSSION
    Inter-Program Variance
V. FUTURE RESEARCH / FRAMEWORK
VI. CONCLUSION
REFERENCES


CHAPTER I
INTRODUCTION
1.1 A Brief History
The MP3 format, developed in the early 1990s by the Fraunhofer Institute along
with the Moving Picture Experts Group (MPEG), is a very popular method of digital audio
compression. The motivation behind its development was to create a style of
compression that could drastically reduce the size of an audio file while still retaining as
much quality as possible. In the early 1990s, limited hard drive space and Internet
bandwidth made full-resolution audio files quite impractical. An average-length 3-minute
song can take up to 50 megabytes of storage, which even by today's standards can be
impractical. The MP3 codec takes advantage of limitations in human hearing to remove
information that we can't hear from a lossless audio file. The result is a file up to 90%
smaller, with minimal compromise to audible quality depending on the bit rate used. Over
time, various MP3 codecs have been developed (Fraunhofer, LAME, etc.), and each has its
own method of coding and decoding MP3 data. Regardless of the exact codec used, MP3 files
have become a standard for recording and distributing audio.
Each codec will potentially interpret the same file in different ways. This is due to
the nature of MP3 compression. The MP3 standard specifies what makes an MP3 file
readable by any decoder; however, each decoder may decode the information
differently.


1.2 Zero-Level Sample Padding
Part of the coding/decoding process for MP3 files is to pad the beginning and end
of the files with silence, or Zero-Level Samples (ZLS). Depending on the codec used
and the specs of the file, more or fewer zeroes will be added. Typically, each additional
re-compression will add more zeroes to the beginning and end of the file. Figure 1
shows the expected number of zeroes for each generation of MP3 compression. This
hypothesis is what is being tested in this study.
Figure 1: Expected number of ZLS for each additional MP3 compression
The exact number of zeroes will vary, but this graph shows a general trend we would
expect to see.
1.3 Purpose of the Study
The purpose of this study is not to determine the cause/reasoning behind zero-
level sample padding, but rather to provide a comprehensive sample of files and their
respective padding.


With this study, we hope to be able to find out how many generations of MP3
compression a file has undergone, based solely on the zero-level samples at the beginning
and end of a file on each channel. The default format for most handheld recorders is
MP3, so we expect to see some amount of ZLS on an authentic, unedited file. If a file
were to be loaded into an audio editor, manipulated, and re-saved, we would expect that
the codec used by the audio editor would add more zeroes to the file, telling us that the
file was re-compressed. By collecting files created by various handheld recorders and then
re-compressing them with various programs/codecs, we aim to develop a database to
reference when analyzing a file for authenticity. For example, if File A was claimed to
be recorded with Recorder A and has 2000 zeroes at the beginning of the file, yet all
sample recordings from Recorder A have only ~500 zeroes at the beginning, this is a
sign that the file was likely re-compressed.
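The Recorder A comparison above can be sketched as a simple range check. This is only an illustration: the function name, the reference counts, and the three-standard-deviation tolerance `k` are assumptions made for the example, not values or methods taken from the study.

```python
from statistics import mean, stdev


def is_inconsistent(observed_zls, reference_zls, k=3.0):
    """Flag a questioned file whose leading-ZLS count falls far outside
    the counts measured on known recordings from the claimed recorder.

    reference_zls needs at least two values; k is an assumed tolerance.
    """
    center = mean(reference_zls)
    spread = stdev(reference_zls)
    return abs(observed_zls - center) > k * spread


# Hypothetical reference recordings from "Recorder A" (about 500 zeroes each).
recorder_a = [480, 510, 495, 505, 520]
print(is_inconsistent(2000, recorder_a))  # questioned file with 2000 zeroes
print(is_inconsistent(500, recorder_a))   # file in line with the references
```

A flagged file is only a sign that re-compression is worth investigating; as the results later show, some codecs do not add zeroes at all.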


CHAPTER II
MATERIALS AND METHODS
The procedure of this study was to take various handheld audio recorders and
record samples of audio, re-compress that audio using different codecs, and analyze the
number of zeroes on the file for each generation. The handheld recorders used were:
(5) Tascam DR-07s
(1) Olympus DM-520
(1) Zoom H1
(1) Marantz PMD620
(1) Philips LFH0882
(2) Olympus WS-700s
Each recorder recorded ten (10) files. The files were all of varying loudness:
(3) Recordings of loud music
(3) Recordings of moderate speech
(4) Recordings of silence
The total number of files came to 110 (10 recordings on 11 recorders). The next
process was to recompress each file a number of times. Most of the programs used could
both encode and decode data, meaning that they can read and create MP3 files. The
process for such programs was as follows:
1. Convert each original MP3 file to uncompressed .wav PCM files
2. Convert new .wav PCM files back to MP3
3. Repeat for a total of four (4) generations of .wav files
4. Read the number of zeroes at the beginning and end of each .wav file on each
channel.
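As a rough sketch of the decode/encode loop listed above, the following builds the command lines for one file using the LAME command-line tool (`--decode`, `--cbr`, and `-b` are LAME options). The output file naming and the helper name are hypothetical, and the study itself ran this loop in many different programs, not only LAME. The commands are only constructed here, not executed.

```python
from pathlib import Path


def generation_commands(original_mp3, generations=4, bitrate_kbps=128):
    """Build the lame commands that take one MP3 through N .wav generations."""
    stem = Path(original_mp3).stem  # hypothetical naming: <stem>_genN.wav/.mp3
    commands = []
    current_mp3 = original_mp3
    for gen in range(1, generations + 1):
        wav = f"{stem}_gen{gen}.wav"
        # Steps 1 and 3: decode the current MP3 to uncompressed PCM.
        commands.append(["lame", "--decode", current_mp3, wav])
        if gen < generations:
            # Step 2: re-encode the .wav at constant bitrate for the next round.
            next_mp3 = f"{stem}_gen{gen + 1}.mp3"
            commands.append(
                ["lame", "--cbr", "-b", str(bitrate_kbps), wav, next_mp3]
            )
            current_mp3 = next_mp3
    return commands


for command in generation_commands("rec.mp3"):
    print(" ".join(command))
```

Each command list could be passed to `subprocess.run` once LAME is installed; step 4 (reading the zeroes) is then done on the resulting .wav files.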
Some of the programs/codecs used were strictly encoders, not decoders, meaning that
they could only create MP3 files from an uncompressed .wav file and could not read them
or convert them to .wav. The process for such programs was as follows:
1. Convert each original MP3 file to .wav PCM using dBpoweramp
2. Convert new .wav PCM files to MP3 using the codec in question
3. Repeat for a total of four (4) generations of .wav files
4. Read the number of zeroes at the beginning and end of each .wav file on each
channel.
Some programs were purely decoders, meaning that they could read an MP3 file
and save it as a .wav, but not convert to MP3. The process for such programs was as follows:
1. Convert each original MP3 file to an uncompressed .wav file using the decoder in
question
2. Convert new .wav files back to MP3 using Adobe Audition CC2015
3. Repeat for a total of four (4) generations of .wav files
4. Read the number of zeroes at the beginning and end of each .wav file on each
channel.
Each MP3 file was converted to .wav before reading the ZLS because each program
potentially uses a different decoder to read the files. This means that opening the same
MP3 file in different programs will potentially show a different number of zero-level
samples. By using each decoder to create uncompressed .wav files, we essentially "print"
the number of zeroes imparted on the file by that decoder. Opening this new .wav file in
any program will show the same number of zeroes because no decoder is being used to read
it. This way, all of the .wav files could be read/interpreted by the same MATLAB script
without MATLAB adding (or subtracting) more zeroes. All files were recorded as
16-bit/128 kbps MP3, with the exception of the files from the Philips LFH0882, which
does not have the capability to record at 128 kbps. These files were recorded at
16-bit/192 kbps instead.
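The study read the zeroes with a MATLAB script that is not reproduced in the text. As a minimal sketch of the same measurement, assuming 16-bit PCM .wav input, the Python function below counts the run of zero-valued samples at the start and end of each channel. The function name and return layout are illustrative choices, not the author's.

```python
import array
import sys
import wave


def count_zls(wav_path):
    """Return [(leading_zeros, trailing_zeros), ...], one tuple per channel."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_channels = wav.getnchannels()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    def run_of_zeros(values):
        # Count consecutive zero-valued samples from the start of `values`.
        count = 0
        for value in values:
            if value != 0:
                break
            count += 1
        return count

    results = []
    for ch in range(n_channels):
        channel = samples[ch::n_channels]  # de-interleave one channel
        results.append((run_of_zeros(channel), run_of_zeros(reversed(channel))))
    return results
```

For a stereo file the result has two tuples, left channel first. A channel that is silent throughout reports its full length for both counts.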
The following programs were used to create the sample files:
Table 1: Programs/Codecs Used
Program                          Program Version   MP3 Codec Used
Audacity                         2.1.1             LAME 3.98.3
Adobe Audition CS3               CS3               Fraunhofer
Adobe Audition CC2014            CC2014            Fraunhofer
Adobe Audition CC2015            CC2015            Fraunhofer
Blade Encoder                    0.82              Blade 0.82
dBpoweramp                       15.3              LAME 3.99
ffmpeg                           2.7.1             LAME 3.99.5
fpMP3 Encoder                    1.0.0.2           fpMP3 1.0.0.2
Gogo Encoder                     3.13              Gogo 3.13
Apple iTunes                     12.3.0.44         Fraunhofer
Apple Logic Pro                  9.1.8             Fraunhofer
LAME (Mac Terminal)              3.99.5            LAME 3.99.5
Mad Decoder                      0.15.2b           Mad 0.15.2b
mpg123 Decoder                   1.22.0            mpg123 1.22.0
Avid Pro Tools                   11.3.1            Fraunhofer
SoX (Mac Terminal)               14.4.2            LAME 3.99.5
NCH Software Switch              4.85              LAME 3.97
Steinberg Wavelab (LAME)         8                 LAME 3.98.4
Steinberg Wavelab (Fraunhofer)   8                 Fraunhofer 4.1.3
Xing Encoder                     1.5.0.5           Xing 1.5.0.5
The only setting modified in the programs was the bitrate. This was done in
order to keep each new file at 16-bit/128 kbps CBR. All other settings were left at the
default.
While there were 110 original files, file H1_L3.mp3 appeared to be corrupt
when opened in certain programs and displayed an abnormally high number of zeroes.
This outlier was omitted when testing the following programs:
Adobe Audition CS3
Avid Pro Tools
Switch


CHAPTER III
RESULTS
Legend for Data Tables/Graphs:
Term Definition
Initial Average Average of both left and right channels at the beginning of the file.
Final Average Average of both left and right channels at the end of the file.
ZLS Zero Level Samples, number of zero level samples.
Generation Number of the .wav file generation.
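The legend's averages can be sketched as follows. The text does not say whether the two channels were averaged per file before averaging across files, or whether a sample or population standard deviation was used; this example assumes per-file channel averaging and a population standard deviation (`statistics.pstdev`).

```python
from statistics import mean, pstdev


def summarize_generation(per_file_counts):
    """Summarize one generation from (left_zls, right_zls) pairs, one per file.

    Returns (average, standard_deviation) of the per-file channel averages.
    """
    per_file_average = [(left + right) / 2 for left, right in per_file_counts]
    return mean(per_file_average), pstdev(per_file_average)


# Hypothetical leading-ZLS counts for three files of one generation.
average, deviation = summarize_generation([(2, 4), (0, 0), (6, 6)])
print(average, round(deviation, 3))
```

Running the same function on the trailing counts would give the "Final Average" and "Final Standard Deviation" rows.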


3.1 Audacity
Generation I II III IV
Initial Average 3.2 3.5 3.1 3.4
Initial Standard Deviation 3.1 3.4 3.4 3.5
Final Average 0 0 0 0
Final Standard Deviation 0 0 0 0
Table 3.1: Audacity
Figure 3.1a: Initial Average Audacity
Figure 3.1b: Final Average Audacity


3.2 Adobe Audition CS3 (Outlier Omitted)
Generation I II III IV
Initial Average 1211.9 1973.8 3035.3 4191
Initial Standard Deviation 901.5 879.9 909.9 901.4
Final Average .2 1604.5 4385.9 7099.3
Final Standard Deviation 2.4 1976 3353.6 4625
Table 3.2: Adobe Audition CS3 (Outlier Omitted)
Figure 3.2a: Initial Average Adobe Audition CS3 (Outlier Omitted)
Figure 3.2b: Final Average Adobe Audition CS3 (Outlier Omitted)


3.3 Adobe Audition CC2014
Generation I II III IV
Initial Average 1206.9 848.8 843 837.7
Initial Standard Deviation 898.9 788.5 785.1 785.1
Final Average .2 0 0 0
Final Standard Deviation 2.4 .2 .2 .2
Table 3.3: Adobe Audition CC2014
Figure 3.3a: Initial Average Adobe Audition CC2014
Figure 3.3b: Final Average Adobe Audition CC2014


3.4 Adobe Audition CC2015
Generation I II III IV
Initial Average 1206.9 848.8 843 837.7
Initial Standard Deviation 898.9 788.5 785.1 785.1
Final Average .2 0 0 0
Final Standard Deviation 2.4 .2 .2 .2
Table 3.4: Adobe Audition CC2015
Figure 3.4a: Initial Average Adobe Audition CC2015
Figure 3.4b: Final Average Adobe Audition CC2015


3.5 Blade Encoder
Generation I II III IV
Initial Average 1206.9 2109.2 3097.7 4106.7
Initial Standard Deviation 898.9 954.8 953.6 956.4
Final Average 0 0 0 0
Final Standard Deviation .3 0 0 0
Table 3.5: Blade Encoder
Figure 3.5a: Initial Average Blade Encoder
Figure 3.5b: Final Average Blade Encoder


3.6 dBpoweramp
Generation I II III IV
Initial Average 1206.9 1206.9 1054.8 1037.1
Initial Standard Deviation 898.9 989.9 907.6 905.3
Final Average .1 .1 .1 .1
Final Standard Deviation .3 .3 .3 .1
Table 3.6: dBpoweramp
Figure 3.6a: Initial Average dBpoweramp
Figure 3.6b: Final Average dBpoweramp


3.7 ffmpeg
Generation I II III IV
Initial Average 825.1 347 21.2 7.1
Initial Standard Deviation 806.9 378.8 42.2 10.7
Final Average 0 .1 361.4 498
Final Standard Deviation .1 .5 133.6 235
Table 3.7: ffmpeg
Figure 3.7b: Final Average ffmpeg


3.8 fpMP3
Generation I II III IV
Initial Average 1206.9 694.6 375.7 112.2
Initial Standard Deviation 898.9 711.9 445.8 195.7
Final Average .1 0 0 0
Final Standard Deviation .3 0 0 0
Table 3.8: fpMP3
Figure 3.8a: Initial Average fpMP3
Figure 3.8b: Final Average fpMP3


3.9 Gogo
Generation I II III IV
Initial Average 1206.9 694.6 375.7 112.2
Initial Standard Deviation 898.9 711.9 445.8 195.7
Final Average .1 0 0 0
Final Standard Deviation .3 0 0 0
Table 3.9: Gogo
Figure 3.9a: Initial Average Gogo
Figure 3.9b: Final Average Gogo


3.10 Apple iTunes
Generation I II III IV
Initial Average 502.9 460.3 441.7 433.8
Initial Standard Deviation 549.7 515 503.7 498.3
Final Average 0 0 0 0
Final Standard Deviation 0 0 0 0
Table 3.10: iTunes
Figure 3.10a: Initial Average iTunes
Figure 3.10b: Final Average iTunes


3.11 Apple Logic Pro
Generation I II III IV
Initial Average 786.2 691.6 669.7 658.9
Initial Standard Deviation 782.5 731.5 717.6 709.6
Final Average 492.2 4.4 .4 .9
Final Standard Deviation 22.4 31.7 6.1 11
Table 3.11: Apple Logic Pro
Figure 3.11a: Initial Average Apple Logic Pro


3.12 LAME (Mac Terminal)
Generation I II III IV
Initial Average 787.4 723.5 694.4 672.6
Initial Standard Deviation 783.2 745.2 723.9 715
Final Average 0 0 0 0
Final Standard Deviation .3 .1 .1 .3
Table 3.12: LAME 3.99.5 (Mac Terminal)
Figure 3.12a: Initial Average LAME (Mac Terminal)
Figure 3.12b: Final Average LAME (Mac Terminal)


3.13 Mad
Generation I II III IV
Initial Average 1 1 1 1
Initial Standard Deviation 0 0 0 0
Final Average 0 .7 1.1 1.4
Final Standard Deviation .2 2.9 4 4.7
Table 3.13: Mad
Figure 3.13a: Initial Average Mad


3.14 mpg123
Generation I II III IV
Initial Average 881.8 1994 3048.8 4144.5
Initial Standard Deviation 874.2 905.1 924.8 927.2
Final Average .5 324.5 1178.5 2124.8
Final Standard Deviation 2.1 262.3 351.7 421.8
Table 3.14: mpg123
Figure 3.14a: Initial Average mpg123
Figure 3.14b: Final Average mpg123


3.15 Avid Pro Tools (Outlier Omitted)
Generation I II III IV
Initial Average 793.5 1060.1 1537.4 2168.5
Initial Standard Deviation 784.3 817.8 932 924.6
Final Average 2314.1 3165.2 3630.4 4845
Final Standard Deviation 1688.1 1727.1 2516.2 2792
Table 3.15: Avid Pro Tools (Outlier Omitted)
Figure 3.15a: Initial Average Pro Tools (Outlier Omitted)
Figure 3.15b: Final Average Pro Tools (Outlier Omitted)


3.16 SoX
Generation I II III IV
Initial Average 1206.9 2111.3 3178.9 4261.5
Initial Standard Deviation 898.9 968.4 967 964.4
Final Average 0 0 0 0
Final Standard Deviation 0 0 0 0
Table 3.16: SoX
Figure 3.16a: Initial Average SoX
Figure 3.16b: Final Average SoX


3.17 Switch (Outlier Omitted)
Generation I II III IV
Initial Average 1.6 1 1.1 5.1
Initial Standard Deviation 2 0 .4 21.2
Final Average .1 0 0 0
Final Standard Deviation 1.1 0 0 0
Table 3.17: Switch (Outlier Omitted)
Figure 3.17b: Final Average Switch (Outlier Omitted)


3.18 Wavelab (Fraunhofer)
Generation I II III IV
Initial Average 1256.5 2171.7 3284.8 4475.9
Initial Standard Deviation 886 883.5 917.1 938.7
Final Average 1.6 500.7 1393.6 2383.8
Final Standard Deviation 7.1 247.1 397 469.6
Table 3.18: Wavelab (Fraunhofer)
Figure 3.18a: Initial Average Wavelab (Fraunhofer)
Figure 3.18b: Final Average Wavelab (Fraunhofer)


3.19 Wavelab (LAME)
Generation I II III IV
Initial Average 1256.5 2258.2 3326.5 4436
Initial Standard Deviation 886 907.2 875.8 870.2
Final Average 1.6 1.5 3.2 4.6
Final Standard Deviation 7.1 6.9 13.5 18.1
Table 3.19: Wavelab (LAME)
Figure 3.19b: Final Average Wavelab (LAME)


3.20 Xing
Generation I II III IV
Initial Average 1206.9 1842.8 2824.9 3831.5
Initial Standard Deviation 898.9 807.6 866.1 877.1
Final Average .1 0 .1 0
Final Standard Deviation .3 0 .4 .2
Table 3.20: Xing
Figure 3.20b: Final Average Xing


3.21 All LAME Programs
This section compiles all programs that used LAME. The charts "Initial Average All
LAME Programs" and "Final Average All LAME Programs" show all LAME programs
superimposed.
Generation I II III IV
Audacity 3.2 3.5 3.1 3.4
Wavelab LAME 1256.5 2258.2 3326.5 4436
Switch 1.6 1 1.1 5.1
ffmpeg 825.1 347 21.2 7.1
dBpoweramp 1206.9 1206.9 1054.8 1037.1
LAME 787.4 723.5 694.4 672.6
SoX 1206.9 2111.3 3178.9 4261.5
Average 755.4 950.2 1182.9 1489
Table 3.21: Initial Average of all LAME programs
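As a quick arithmetic check, the "Average" row can be recomputed from the per-program initial averages. The sketch below takes each program's values from its own results table (Audacity from Table 3.1, ffmpeg from Table 3.7, and so on); the dictionary layout and function name are illustrative only.

```python
# Initial averages per generation (I, II, III, IV), from Tables 3.1 to 3.19.
LAME_PROGRAMS = {
    "Audacity": [3.2, 3.5, 3.1, 3.4],
    "Wavelab LAME": [1256.5, 2258.2, 3326.5, 4436],
    "Switch": [1.6, 1, 1.1, 5.1],
    "ffmpeg": [825.1, 347, 21.2, 7.1],
    "dBpoweramp": [1206.9, 1206.9, 1054.8, 1037.1],
    "LAME": [787.4, 723.5, 694.4, 672.6],
    "SoX": [1206.9, 2111.3, 3178.9, 4261.5],
}


def column_means(table):
    """Mean of each generation column across all programs, to one decimal."""
    rows = list(table.values())
    return [round(sum(column) / len(rows), 1) for column in zip(*rows)]


print(column_means(LAME_PROGRAMS))
```

With these inputs the result matches the Average row of Table 3.21 (755.4, 950.2, 1182.9, 1489).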
Figure 3.21a: Initial Average All LAME Programs (series: Audacity, Wavelab LAME, Switch, ffmpeg, dBpoweramp, SoX, LAME)
Figure 3.21b: Final Average All LAME Programs (series: Audacity, Wavelab LAME, Switch, ffmpeg, dBpoweramp, SoX, LAME)


3.22 All Fraunhofer Programs
This section compiles all programs that used Fraunhofer. The charts "Initial Average All
Fraunhofer Programs" and "Final Average All Fraunhofer Programs" show all
Fraunhofer programs superimposed.
Generation I II III IV
Audition CS3 1211.9 1973.8 3035.3 4191
Audition CC2014 1206.9 848.8 843 837.7
Audition CC2015 1206.9 848.8 843 837.7
iTunes 502.9 460.3 441.7 433.8
Logic Pro 786.2 691.6 669.7 658.9
Pro Tools 793.5 1060.1 1537.4 2168.5
Wavelab (Fraunhofer) 1256.5 2171.7 3284.8 4475.9
Average 995 1150.7 1522.1 1943.4
Table 3.22: All Fraunhofer Programs
Figure 3.22a: Initial Average All Fraunhofer Programs (series: Audition CS3, Audition CC2014, Audition CC2015, iTunes, Logic Pro, Pro Tools, Wavelab Fraunhofer)
Figure 3.22b: Final Average All Fraunhofer Programs (series: Audition CS3, Audition CC2014, Audition CC2015, iTunes, Logic Pro, Pro Tools, Wavelab Fraunhofer)


CHAPTER IV
DISCUSSION
While the original hypothesis of this study was that the number of zero-
level samples at the beginning and end of each file would increase with each additional
compression, that was found to not always be the case. While some programs showed the
general upward trend of adding more zeroes after each additional compression, many did
not. The ones that did not varied from having no zeroes on the files whatsoever to starting
with a substantial amount then going down after each recompression.
Programs/Codecs that behaved as expected (number of ZLS grew after each
recompression):
Wavelab (LAME)
Wavelab (Fraunhofer)
Pro Tools
Audition CS3
Blade
mpg123
Xing
SoX
Programs that did not behave as expected (number of ZLS did not grow after each
recompression):
ffmpeg
Switch
Audacity
Logic Pro
dBpoweramp
iTunes
Audition CC2014
Audition CC2015
Mad
fpMP3
Gogo
LAME (Mac Terminal)
Only 8 of the 20 programs (40%) behaved as expected, while 12 of 20 (60%) did
not. Among the programs that did not behave as expected, some had a high initial number
of zeroes that then decreased, whereas others had a small number of zeroes throughout
each re-compression.
The programs that had a high initial number of zero-level samples and decreased
throughout were:
ffmpeg
Logic Pro
dBpoweramp
iTunes
Adobe Audition CC2014
Adobe Audition CC2015
fpMP3
Gogo
LAME (Mac Terminal)
The programs that had a nominally small number of zeroes throughout were:
Switch
Audacity
Mad
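The three groupings above can be sketched as a simple rule applied to a program's four initial averages. The 10-sample cutoff for "small" and the exact rules are assumptions chosen for illustration; the text does not state a numeric threshold.

```python
def classify_trend(averages, small_threshold=10.0):
    """Label a series of per-generation ZLS averages.

    small_threshold is an assumed cutoff, not a value from the study.
    """
    if max(averages) < small_threshold:
        return "small"
    steps = [later - earlier for earlier, later in zip(averages, averages[1:])]
    if all(step > 0 for step in steps):
        return "increasing"
    if all(step <= 0 for step in steps) and averages[-1] < averages[0]:
        return "decreasing"
    return "mixed"


# Initial averages taken from Tables 3.5, 3.7, and 3.13.
print(classify_trend([1206.9, 2109.2, 3097.7, 4106.7]))  # Blade
print(classify_trend([825.1, 347, 21.2, 7.1]))           # ffmpeg
print(classify_trend([1, 1, 1, 1]))                      # Mad
```

The "mixed" label is a catch-all for series that fit none of the three groups and is not a category used in the text.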
One possible reason behind the zero-level samples is gapless playback
information [1]. Depending on the exact codec and specs used, the encoder and decoder will
add a number of samples. This is used as a buffer by many different audio players to
ensure gapless playback. However, in more recent revisions of MP3 codecs, developers
have been able to remove this buffer to some extent. As LAME states in its FAQ:
"Starting with LAME 3.55, we have a new MDCT/filterbank routine written by
Takehiro Tominaga with a 48 sample delay. With even more rewriting, this could be
reduced to 0."
Other popular codecs such as Fraunhofer also state that they are working to
reduce the encoder and decoder delays [2].
When using LAME directly in the Mac terminal, a message is displayed when
decoding MP3 files which clearly states that the zero-level samples are being skipped:
input: DM520_L1.mp3 (44.1 kHz, 2 channels, MPEG-1 Layer III)
output: DM520_L1.wav (16 bit, Microsoft WAVE)
skipping initial 529 samples (encoder+decoder delay)
Frame# 1164/1164 128 kbps MS
MP3 Generation 1 Files Being Decoded to WAV Generation 1

input: DM520_L1.mp3 (44.1 kHz, 2 channels, MPEG-1 Layer III)
output: DM520_L1.wav (16 bit, Microsoft WAVE)
skipping initial 1105 samples (encoder+decoder delay)
skipping final 576 samples (encoder padding-decoder delay)
Frame# 1164/1164 128 kbps L R
MP3 Generation 2 Files Being Decoded to WAV Generation 2

input: DM520_L1.mp3 (44.1 kHz, 2 channels, MPEG-1 Layer III)
output: DM520_L1.wav (16 bit, Microsoft WAVE)
skipping initial 1105 samples (encoder+decoder delay)
skipping final 576 samples (encoder padding-decoder delay)
Frame# 1164/1164 128 kbps L R
MP3 Generation 3 Files Being Decoded to WAV Generation 3

input: DM520_L1.mp3 (44.1 kHz, 2 channels, MPEG-1 Layer III)
output: DM520_L1.wav (16 bit, Microsoft WAVE)
skipping initial 1105 samples (encoder+decoder delay)
skipping final 576 samples (encoder padding-decoder delay)
MP3 Generation 4 Files Being Decoded to WAV Generation 4
Note the phrases "skipping initial 529 samples (encoder+decoder delay)" and
"skipping final 576 samples (encoder padding-decoder delay)". According to Mark
Taylor's LAME FAQ, available at http://lame.sourceforge.net/tech-FAQ.txt, most MP3
codecs have a roughly 528-sample decoder delay and 528-sample encoder delay. At the
time that FAQ was written, LAME was working to reduce this delay further. As the
messages from the current version (3.99.5) indicate, much of that delay is now skipped
when decoding, meaning the resulting .wav files will have no additional zero-level
samples added by the codec. This does not, however, mean that files decoded with LAME
will have no zeros. The data collected show that the files will have a very similar
number of zeros as the original file did. In other words, regardless of how many
generations are created with LAME 3.99.5, the file will always have a similar number of
zeros. The four generations created in this study showed a slightly decreasing number of
zeros in each generation on average, but most individual files showed no decrease.
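To put the sample counts in the decoder messages into perspective, the sketch below converts them to milliseconds at the 44.1 kHz rate shown in the log and shows how padding would accumulate if a fixed delay were left in place at every generation. The 1105-sample figure equals 529 + 576, which is one plausible reading of the two messages; the accumulation model is a simplifying assumption, not a measured behavior.

```python
SAMPLE_RATE_HZ = 44_100  # sample rate reported in the LAME messages


def samples_to_ms(sample_count, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration of a run of samples in milliseconds."""
    return 1000.0 * sample_count / sample_rate_hz


def cumulative_padding(delay_per_generation, generations):
    """Total leading padding after each generation if nothing is skipped."""
    return [delay_per_generation * g for g in range(1, generations + 1)]


print(round(samples_to_ms(529), 1))    # initial skip for generation 1, in ms
print(round(samples_to_ms(1105), 1))   # initial skip for later generations, in ms
print(cumulative_padding(1105, 4))     # hypothetical build-up over 4 generations
```

For comparison, the initial averages in Tables 3.5 and 3.16 (Blade and SoX) rise by roughly 900 to 1,100 samples per generation, which is of the same order as this per-generation delay.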


4.2 Inter-Program Variance
While one would expect multiple programs that use the same MP3 library to
produce identical results, this was found to not always be the case. Take, for example,
SoX and ffmpeg. Both programs were executed in the Mac terminal and use LAME
3.99.5, yet the two programs produced different results. It is clear that more than just
the MP3 codec used factors into the amount of zero-level sample padding added.


CHAPTER V
FUTURE RESEARCH / FRAMEWORK
Any future research on this topic would be useful in developing a more
comprehensive database with which to compare evidence files. The most popular codecs
used today are LAME and Fraunhofer, but each has many different revisions with
different behaviors. An interesting topic for further research would be ZLS variance
across different versions of the same codec, studying each version to see how its
behavior changes over time. This way, if we know which version of a codec was used to
compress a file (sometimes displayed in the file metadata), we can more accurately
estimate how many times the file was recompressed.
The current framework for audio authentication includes more than just zero-level
sample analysis. One of the more commonly used methods for audio authentication is a
metadata/structure analysis. This analysis includes looking at the metadata of an audio
file to see if it is consistent with an authentic file. A proposed addition to this current
framework is to combine these two analyses (zero-level sample and metadata analysis).
By looking at the metadata of a file, we are more likely to be able to determine the codec
and version used to compress the file. For example, files encoded with LAME will typically
have "LAME" and the version number in the header of the file.
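A minimal sketch of such a header check is shown below. It searches the first few kilobytes of a file for the ASCII text "LAME" followed by a dotted version number; the 4096-byte search window, the pattern, and the function name are assumptions made for illustration, and a real metadata analysis would parse the tag structure properly.

```python
import re

# "LAME" followed by a version such as 3.98.4 or 3.99 (assumed pattern).
LAME_TAG = re.compile(rb"LAME(\d+\.\d+(?:\.\d+)?)")


def find_lame_version(data, search_limit=4096):
    """Return the LAME version string found near the start of `data`, or None."""
    match = LAME_TAG.search(data[:search_limit])
    return match.group(1).decode("ascii") if match else None


# Bytes 32-47 of the hex dump in Figure 5.
header_bytes = bytes.fromhex("800000044C414D45332E39382E340000")
print(find_lame_version(header_bytes))
```

To check a file on disk, the same function can be given the output of `open(path, "rb").read(4096)`.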
Offset   Hex data                              Text
0        FFFB9060 000FF000 00690000 00080000
16       0D200000 01000001 A4000000 20000034
32       80000004 4C414D45 332E3938 2E340000   LAME3.98.4
48-304   00000000 00000000 00000000 00000000   (all rows zero)
Figure 5: Hex information of a 4th Generation MP3 file created with Steinberg Wavelab


CHAPTER VI
CONCLUSION
In the past examiners may have used the number of zero-level sample on an MP3
file to estimate whether or not the file has been re-compressed multiple times. According
to the results of this study, many of the new versions of codecs no longer add a
significant amount of zeroes to the file. As a result of this development, it is not
necessarily accurate for all codecs to use the number of zero-level samples on the file to
determine the number of re-compressions. It can be effective for some codecs, but not all.
Depending on the codec used, we can determine that a file was likely recompressed at
least once, but not always exactly how many times it was edited.
Take, for example, a questioned file that has no zero-level sample padding at
the beginning or end. It is unlikely that the file came directly from the recorder, as
every original file in this study had at least some zero-level samples straight from
the recorder. We cannot, however, make an accurate estimate of how many times the file
was re-compressed unless we know which codec and which version were used. Based on this
sample data alone, a file with no zero-level samples at the beginning could
be a 4th-generation file compressed with LAME 3.99.5, a 3rd-generation file compressed
with Logic Pro, or a number of other possibilities. Most of the recorders used in this
study use the Fraunhofer codec [3], but the exact version is not given. Because most of
the original files had some number of zero-level samples at the beginning, we can
guess that the recorders use an older version of the codec, but we cannot be sure.
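Whether a questioned file carries any zero-level padding can be checked by counting the run of zero-valued samples at each end of every channel of the decoded .wav file. The sketch below is a Python illustration under stated assumptions, not the MATLAB script used in this study. It assumes 16-bit PCM input, and the function names (`count_edge_zeros`, `zls_per_channel`) are hypothetical.

```python
import array
import sys
import wave


def count_edge_zeros(samples):
    """Return (leading, trailing) counts of zero-valued samples in a sequence."""
    n = len(samples)
    leading = 0
    while leading < n and samples[leading] == 0:
        leading += 1
    if leading == n:
        # The channel is entirely silent, so both runs span the whole channel.
        return n, n
    trailing = 0
    while samples[n - 1 - trailing] == 0:
        trailing += 1
    return leading, trailing


def zls_per_channel(path):
    """Count leading/trailing zero-level samples for each channel of a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch assumes 16-bit PCM")
        channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is stored little-endian
    # Samples are interleaved (L, R, L, R, ...), so take every `channels`-th sample.
    return [count_edge_zeros(samples[ch::channels]) for ch in range(channels)]
```

For a stereo file, `zls_per_channel("questioned.wav")` (a placeholder file name) returns two (leading, trailing) pairs, one per channel. A result of (0, 0) on both channels corresponds to the unpadded case discussed above.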


By combining the findings of this study with the existing parts of the
audio authentication framework, we can improve our methods and ensure more accurate
authentication results in the future.


REFERENCES
[1] Taylor, Mark. "LAME Technical FAQ." LAME Technical FAQ. LAME, June
2000. Web. 14 Oct. 2015.
[2] Allamanche, Eric, Ralf Geiger, Jürgen Herre, and Thomas Sporer. "MPEG-4 Low
Delay Audio Coding Based on the AAC Codec." Audio Engineering Society
(1999). Web. 14 Oct. 2015.
[3] Product manuals available on manufacturer websites.
Sripada, P. (2006). MP3 Decoder in Theory and Practice. Master's thesis,
Blekinge Institute of Technology, Sweden.
Raissi, R. (2002). The Theory Behind MP3.

