Why do we like the things we like? There are well-understood
physiological reasons why we like ice cream. It's harder to
understand why we like those little dogs that are tiny versions of
really large dogs, but I read something once about a predisposition
toward neoteny, so that probably explains that. Within the realm of
art, however - painting, say, or music - the arguments are less clear,
though probably more loudly made and religiously cherished by their
makers. There is something divine about taste. At least, the
taste-makers would have it so.
Put more concretely, why are some people rabid about Bach, some about
Elvis, and others about Cannibal Corpse? More interestingly, why are
the rabid in each group equally rabid in their dislike of the others?
Recent research[1-2] has opened the door to sensible,
neurologically-grounded explanations of these kinds of phenomena. It
has addressed its efforts toward a limited domain -- perceptual
pleasure in visual scenes -- but the ideas behind those efforts can be
considered information-theoretic, and therefore generalizable to any
domain. The trick, of course, is getting the information formulation
right, which I entirely failed to do in this project.
Imaginary Experimental Protocol
The idea was as follows:
- Gather two groups of different "similar" musical pieces. (The exact notion of "similar" is problematic, but never mind.) For example, group A could be Baroque, group B could be death metal.
- Build two "generators" that can spit out pieces in the style of those sets of musical pieces. (Never mind what a "generator" is for now.)
- Train two sets of subjects, one set on pieces produced by GeneratorA, the other on pieces produced by GeneratorB. After the training the subjects should be familiar with the kinds of music produced by the generators.
- Make modified generators, GeneratorA' and GeneratorB', that are mostly similar to, but a little different from, GeneratorA and GeneratorB, respectively.
- Play pieces from GeneratorA' and GeneratorB' to all the subjects.
- Subjects trained on GeneratorA should prefer pieces produced by GeneratorA' to those produced by GeneratorB'.
The idea, based on the Biederman paper, is that cognitive pleasure
depends on some combination of familiarity and novelty, and that the
slightly familiar will be more pleasurable than the wholly novel. In
other words:
"Big deal," you say, "I knew that already." But did you know
exactly how much familiarity, and how much novelty? Did you even know
how to talk quantitatively about familiarity and novelty with respect
to music? That's the interesting question, and what I wanted to
address.
I didn't run the experiment described above. The final two steps of
that experiment would have gathered a bunch of data from subjects,
from which I could have tried to ferret out relationships between
familiarity, novelty, and pleasure. That would have been very
hard. Instead, I skipped that part and tried to get a handle on what
it means for GeneratorA and GeneratorA' to differ; in other words, to
get an idea of the relative amount of information in the different
generators.
Factor Oracles
Generators were built using a factor oracle. A factor oracle is
an automaton built incrementally from some corpus, which then
permits fast substring queries. In other words, after you
build the oracle, you can ask it whether some string is a
substring of the corpus on which it was built. (Strictly, the oracle
accepts every substring of the corpus, but it can also accept some
strings that are not substrings.) This has useful
musical application in that it allows both recognition and generation
of structured data without all the tedious mucking about with
probabilities that you get with standard n-gram models. The code was
written in Lisp and is hideously ugly because it was rushed, but you can
see it here if you must. It operates on
lists of MIDI events. The code that takes .mid files and turns them
into events, and takes files of events and turns them into .mid files,
is written in Python, and is here, with the
same caveats.
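For readers who would rather not wade through the Lisp, here is a minimal Python sketch of the standard online factor oracle construction, plus one simple way to query it and walk it. This is not the code linked above: the names `build_oracle`, `accepts`, and `generate` are hypothetical, and the random-walk generator is only one of many possible strategies, chosen because it is short.

```python
import random

def build_oracle(seq):
    """Build a factor oracle for seq (any sequence of hashable symbols).

    Returns (trans, sfx): trans[i] maps a symbol to a target state, and
    sfx[i] is the suffix link of state i (-1 for the initial state).
    """
    trans = [{}]
    sfx = [-1]
    for i, sym in enumerate(seq):
        new = i + 1
        trans.append({})
        sfx.append(0)
        trans[i][sym] = new              # "spine" transition i -> i+1
        k = sfx[i]
        # Walk back along suffix links, adding shortcut transitions.
        while k > -1 and sym not in trans[k]:
            trans[k][sym] = new
            k = sfx[k]
        sfx[new] = 0 if k == -1 else trans[k][sym]
    return trans, sfx

def accepts(trans, query):
    """True if the oracle can read query starting from state 0."""
    state = 0
    for sym in query:
        if sym not in trans[state]:
            return False
        state = trans[state][sym]
    return True

def generate(trans, sfx, length, rng):
    """Assumed strategy: pick a random outgoing transition at each step,
    and fall back along the suffix link when the last state is reached."""
    if len(trans) < 2:
        raise ValueError("oracle was built from an empty corpus")
    state, out = 0, []
    while len(out) < length:
        if not trans[state]:             # dead end: only the last state
            state = max(sfx[state], 0)
            continue
        sym = rng.choice(sorted(trans[state]))
        out.append(sym)
        state = trans[state][sym]
    return out

trans, sfx = build_oracle("abbbaab")
print(accepts(trans, "baab"))   # → True (a real substring)
print(accepts(trans, "abba"))   # → True (accepted, though not a substring)
print("".join(generate(trans, sfx, 12, random.Random(0))))
```

The same functions work on a list of pitches instead of a string, since the symbols only need to be hashable (and sortable, for the generator).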
There are a lot of different ways you can use a factor oracle for
music generation, but first you have to decide what the states mean,
and what the transitions mean. In this code the states are, uh,
states, and reflect the sequence of notes (and silence) that have been
received (or generated) thus far. The transitions are the notes that
are emitted at each time point. Specifically, they are MIDI note-on
events. For the purposes of equivalence, when constructing the
oracle, the events are specified by the note (pitch) alone; the
velocity (amplitude) is not taken into account. Further, all MIDI
note events are collapsed into the same channel, and all changes of
instrumentation or anything else are lost.
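A minimal sketch of that reduction, assuming a hypothetical `MidiEvent` record (the actual event format used by the code is not shown here). Only the pitch of each note-on survives; velocity, channel, and everything else are dropped. Treating a note-on with velocity 0 as a note-off follows the usual MIDI convention, but whether the original code did so is an assumption, and silence is not modeled in this sketch.

```python
from typing import NamedTuple

class MidiEvent(NamedTuple):
    """Hypothetical event record, for illustration only."""
    kind: str        # e.g. "note_on", "note_off", "program_change"
    channel: int
    note: int
    velocity: int

def to_symbols(events):
    """Reduce events to the pitch sequence an oracle would be built on."""
    return [e.note for e in events
            if e.kind == "note_on" and e.velocity > 0]

events = [
    MidiEvent("program_change", 0, 0, 0),   # instrumentation: lost
    MidiEvent("note_on", 0, 60, 90),
    MidiEvent("note_off", 0, 60, 0),
    MidiEvent("note_on", 3, 60, 40),        # other channel, other velocity
    MidiEvent("note_on", 1, 64, 0),         # velocity 0: treated as note-off
    MidiEvent("note_on", 1, 67, 100),
]
print(to_symbols(events))  # → [60, 60, 67]
```

Note that the two middle Cs (pitch 60) become the same symbol even though they differ in channel and velocity, which is exactly the equivalence described above.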
In the course of working with this proto-experiment I made three
generators: one from some Japanese folk songs, one from some Jewish
folk songs, and one from a fusion of the two, though a bit weighted
toward the Japanese. Then, using each of these generators, I
generated a song. Here they are:
Japanese generator
Jewish generator
Fusion generator
Information content of music
If we can
generate music based on other music, then we could presumably get
people to opine on what they think about it. We could, in this way,
determine that people trained on GeneratorA prefer GeneratorA' to
GeneratorB', if it happened that they did. The next important step is
to ask: how exactly does GeneratorA compare to GeneratorA'? What does
this preference mean? How can it be quantified?
Unfortunately, I didn't come up with a satisfying answer to this
question. Using the notion of "runs" in music I came up with some
data comparing the generators to each other. A run is a consecutive
string of recognized (or unrecognized) events. For instance, for a
"hello" generator, the string "hellohello" contains two runs, each of
length five, whereas "helloxxxhello" contains two (positive) runs of
length five, and one (negative) run of length three. This begins to
capture some of the informational aspects of musical sequences. We'd
expect similar generators to have similar run profiles, skewed heavily
toward positive runs, with few negative runs. The results, using the
pieces I made from the generators I extracted, are less clear. See the report for the graphs.
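To make the notion of a run concrete, here is a small sketch that reproduces the "hello" example. It is a stand-in, not the code behind the graphs: recognition is done with Python's plain substring test rather than a factor oracle, and the names `find_runs` and `run_profile` are hypothetical.

```python
from collections import Counter

def find_runs(corpus, text):
    """Split text into runs, greedily, as (recognized, length) pairs.

    A recognized run is extended for as long as it remains a substring
    of corpus; symbols that never occur in corpus form unrecognized runs.
    """
    runs = []
    current = ""      # recognized run being extended
    misses = 0        # length of the unrecognized run being extended
    for ch in text:
        if current + ch in corpus:
            if misses:                       # close a pending negative run
                runs.append((False, misses))
                misses = 0
            current += ch
        elif ch in corpus:
            # Cannot extend the current run, but ch starts a new one.
            runs.append((True, len(current)))
            current = ch
        else:
            if current:
                runs.append((True, len(current)))
                current = ""
            misses += 1
    if current:
        runs.append((True, len(current)))
    if misses:
        runs.append((False, misses))
    return runs

def run_profile(corpus, text):
    """Unordered run profile: how many runs of each kind and length."""
    return Counter(find_runs(corpus, text))

print(find_runs("hello", "hellohello"))     # → [(True, 5), (True, 5)]
print(find_runs("hello", "helloxxxhello"))  # → [(True, 5), (False, 3), (True, 5)]
```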
Analysis
The subject of information
content in music, and its relation to aesthetics, deserves better
treatment than the poor one I've given it here. Setting aside the
issues involved in formulating a factor oracle music generator, and
the assorted problems in the methodology I've outlined, the central
issue of quantifying musical information, relative to some knowledge
base, remains an important one. Returning to the string example, it seems
reasonable that, with reference to a "hello" generator, "helloxxxhello"
and "xxxhellohello" are different entities, even though they have the
same run profile. Were they encoded musically, they would likely evoke
different reactions. A better nomenclature is necessary before issues
like this can be addressed in earnest.
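The point about the two strings can be checked mechanically. In the sketch below the run lists are written out by hand for each string, using a hypothetical `(recognized, length)` encoding; an unordered profile cannot tell them apart, while the ordered sequence can.

```python
from collections import Counter

# Runs relative to a "hello" generator, written out by hand.
runs_a = [(True, 5), (False, 3), (True, 5)]   # "helloxxxhello"
runs_b = [(False, 3), (True, 5), (True, 5)]   # "xxxhellohello"

same_profile = Counter(runs_a) == Counter(runs_b)   # order ignored
same_order = runs_a == runs_b                       # order respected
print(same_profile, same_order)  # → True False
```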
[1] Biederman, I., & Vessel, E. A. (2006). Perceptual Pleasure and the Brain. American Scientist, 94, 249-255.
[2] Yue, X., Vessel, E. A., & Biederman, I. (2007). The neural basis of scene preferences. NeuroReport, 18(6), 525-529.