|
Jonathan Mooser Final Project: Java
Beat Detector Thursday,
April 28, 2005 [ Description (below) ][ Java Source Files ][ Sound Samples ]
The purpose of this project is to build software that
will automatically detect beats in a piece of recorded music using an algorithm
based on Eirc Scheirer’s paper Tempo and beat analysis of acoustic musical
signals [1]. It reads in a wav files, specified by a command line argument,
and plays back the music with a click inserted wherever it finds a beat. When
successful, the result it like listening someone snap their fingers in time
with a piece of music. Although
based mainly on Scheirer’s paper, I also borrowed algorithms from two
additional sources. The first is Beat This: A Beat Synchronization Project [2],
a web site at Rice University that implements an algorithm similar to
Scheirer’s using MATLAB. The second is Beat Detection Algorithms [3], a
web site run by Frederic Patin. This second web site describes a far less
sophisticated approach to beat detection than my other sources, but some of its
ideas proved useful in implementing my program. The details of my algorithm and
how I developed it are described below. 1. Input Format For simplicity, I restrict the audio input to 8-bit single-channel wav files sampled at 22050 samples/second. The files I used for testing were all 20 second clips cut from larger recordings. In principle, larger clips could be used, but memory limitations might cause the program to crash. 2. Frequency Analysis The first step is to break
the recording into 1024 sample frames and apply a Fast Fourier Transform (FFT)
to each. This process uses a sliding
window, that advances 256 samples at a time, so that each chunk overlaps with 6
others. This is probably the single largest departure from the methods
described in [1] and [2]. Scheirer uses a different type of frequency analysis
called an elliptical filter, then smoothes the results by convoluting with a
half-hanning window. The idea of
analyzing frequency one chunk at a time comes from [3]. This allows us to easily
find local peaks in the power spectrum at a resolution of 256/22050 = 11.6ms.
The chunk length of 1024 means we can frequencies as low as 22050/1024 = 21.5
Hz. Within each chunk, the FFT
output is summed into a small number of bands. Also at the suggestion of [3],
the bands are of differing sizes, with the lower bands being smaller. Through
trial and error, I chose 9 bands with upper limits of 100Hz, 200Hz, 300Hz,
400Hz, 600Hz, 800Hz, 1200Hz, 2400Hz, and 11025Hz. The sample rate of the input determines the upper limit of the highest
band. 3. Differentiation The Differentiation process
takes the output of step 2 and finds large increases for each band. The band
energy at each time is subtracted from the band energy at the next time. Those
results are then half-wave rectified, meaning that all values less than zero,
are clipped to zero. This differentiation data is
the basis of the actual tempo finding process. For each time, measured in 256
sample frames, we have an approximation for the energy increase for each band.
A large energy increases in a given band at a different time indicates a high
probability of finding a beat at that time. 4. Beat Finding The program searches for
beats beginning at a random point near the beginning of the input. Within a
certain window, the differentiation values are summed across bands and the
local maximum is taken to be an estimate of the first beat. This first guess
may not be very good, but combined with the comb filter (step 5), the estimates
will hopefully become successively better at the search progresses through the
input. 5. Tempo Estimation by
Comb Filter Once the time estimate for a
beat has been found, the program estimates the period of the current tempo be
searching backward from that point in time. The comb filter implemented here is
somewhat simpler than that described in [1] or [2]. For a given tempo estimate
T, at time t, my program simply find a sum across bands of the energy at times
t-T + t-2T + t-3T…. etc. Having an estimate of the
period T and time t, the next beat estimate is taken to be time t+T. The
program then returns to step 4, now searching a window centered around that new
time estimate. That peak is taken to be the second beat and the process
continues until it reaches the end of the input. Neither [1] nor [2] search
for energy peaks directly. Rather, they both apply a comb filter for all
possible periods over the entire length of the input signal. Both of those
systems, however work with input about 5 seconds long. Trying to work with much
longer inputs (>20 seconds) such an exhaustive search is not possible. It
makes more sense to perform the comb filter over a small window of time and use
the results as an estimate of tempo which may vary. 6. Click Insertion For every beat, the program
inserts a click into the audio signal. It plays back the results, allowing a
listener to evaluate how close the estimated beats are to the actual beats. 7. Results I tested the system on four
musical excerpts, each about 20 seconds long. Two of them, Let Forever Be,
by The Chemical Brothers and Amateur by St. Etienne, would generally be
classified as Techno. The results on these pieces were mixed, with the program
correctly identifying the beat about 50 per cent of the time. A more difficult
example was the jazz recording Descarga by Tito Puente. Although the
music has a strong beat, the percussion is often erratic, thus confusing the
program. The best results by far were those applied to an excerpt from the
second movement of Beethoven’s 9th. Several authors have
observed that classical music is the most difficult to track, but that was not
the case here. In all cases, the program
seemed to act more as a rhythm tracker than a beat tracker. Because it always
marks beats on an onset, the program will miss beats that do not coincide with
a strong onset. It will, however, often find repeated rhythmic patterns that
may be off the beat. Overall, the program is a
good start, and may become very accurate with further refinement. References[1] Tempo and Beat
Analysis of Acoustic Musical Signals,
Eric Scheirer. J. Acoust. Soc. Am. 103 (1), January 1998. [2] Beat This: A Beat
Synchronization Project, Cheng, Nazer, Uppuluri, Verret. http://www.owlnet.rice.edu/~elec301/Projects01/beat_sync/beatalgo.html [3] Beat Detection
Algorithms, Frederic Patin http://www.gamedev.net/reference/programming/features/beatdetection. |