Looking for taste: musical aesthetics

Shane Hoversten
CSCI 575b

Why do we like the things we like? There are well-understood physiological reasons why we like ice cream. It's harder to understand why we like those little dogs that are tiny versions of really large dogs, but I read something once about a predisposition toward neoteny, so that probably explains that. Within the realm of art, however - painting, say, or music - the arguments are less clear, though probably more loudly made and religiously cherished by their makers. There is something divine about taste. At least, the taste-makers would have it so.

Put more concretely, why are some people rabid about Bach, some about Elvis, and others about Cannibal Corpse? More interestingly, why are the rabid in each group equally rabid in their dislike of the others? Recent research[1-2] has opened the door to sensible, neurologically grounded explanations of these kinds of phenomena. It has addressed its efforts toward a limited domain -- perceptual pleasure in visual scenes -- but the ideas behind those efforts can be considered information-theoretic, and therefore generalizable to any domain. The trick, of course, is getting the information formulation right, which I entirely failed to do in this project.


Imaginary Experimental Protocol

The idea was as follows:

  1. Gather two groups of musical pieces, where the pieces within each group are "similar". (The exact notion of "similar" is problematic, but never mind.) For example, group A could be Baroque, group B could be death metal.
  2. Build two "generators", GeneratorA and GeneratorB, that can spit out pieces in the style of group A and group B, respectively. (Never mind what a "generator" is for now.)
  3. Train two sets of subjects, one set on pieces produced by GeneratorA, the other on pieces produced by GeneratorB. After the training the subjects should be familiar with the kinds of music produced by their generator.
  4. Make modified generators, GeneratorA' and GeneratorB', that are mostly similar to, but a little different from, GeneratorA and GeneratorB, respectively.
  5. Play pieces from GeneratorA' and GeneratorB' to all the subjects.
  6. Subjects trained on GeneratorA should prefer pieces produced by GeneratorA' to those produced by GeneratorB' (and the reverse for subjects trained on GeneratorB).

The idea, based on the Biederman paper, is that cognitive pleasure depends on some combination of familiarity and novelty, and that the slightly familiar will be more pleasurable than the wholly novel.

"Big deal," you say, "I knew that already." But did you know exactly how much familiarity, and how much novelty? Did you even know how to talk quantitatively about familiarity and novelty with respect to music? That's the interesting question, and what I wanted to address.

I didn't run the experiment described above. The final two steps of that experiment would have gathered a bunch of data from subjects, from which I could have tried to ferret out relationships between familiarity, novelty, and pleasure. That would have been very hard. Instead, I skipped that part and tried to get a handle on what it means for GeneratorA and GeneratorA' to differ: in other words, to get an idea of the relative amount of information in the different generators.


Factor Oracles

Generators were built using a factor oracle. A factor oracle is an automaton built incrementally from some corpus, which then permits fast substring queries. In other words, after you build the oracle, you can ask it whether some string is a substring of the corpus on which it was built. (Strictly speaking, the oracle accepts every substring of the corpus, but it can also accept a few strings that are not substrings.) This has useful musical application in that it allows both recognition and generation of structured data without all the tedious mucking about with probabilities that you get with standard n-gram models. The code was written in Lisp and is hideously ugly because it was rushed, but you can see it here if you must. It operates on lists of MIDI events. The code that takes .mid files and turns them into events, and takes files of events and turns them into .mid files, is written in Python, and is here, with the same caveats.
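To make the idea concrete, here is a minimal Python sketch of the standard online factor oracle construction. It is not the Lisp code mentioned above; the names build_factor_oracle, accepts, transitions, and supply are made up for this illustration, and the corpus is a plain string rather than a list of MIDI events.

```python
def build_factor_oracle(sequence):
    """Build a factor oracle for `sequence`, one symbol at a time.

    Returns (transitions, supply). transitions[i] maps a symbol to the
    state reached from state i; supply[i] is the supply (suffix) link of
    state i. Both names are illustrative, not taken from the original code.
    """
    transitions = [{}]
    supply = [-1]  # state 0 has no supply link
    for i, symbol in enumerate(sequence):
        new_state = i + 1
        transitions.append({})
        supply.append(0)
        transitions[i][symbol] = new_state  # the "spine" transition
        k = supply[i]
        # Walk back along supply links, adding shortcut transitions.
        while k > -1 and symbol not in transitions[k]:
            transitions[k][symbol] = new_state
            k = supply[k]
        supply[new_state] = 0 if k == -1 else transitions[k][symbol]
    return transitions, supply


def accepts(transitions, query):
    """Follow `query` from state 0; every state counts as accepting."""
    state = 0
    for symbol in query:
        if symbol not in transitions[state]:
            return False
        state = transitions[state][symbol]
    return True


transitions, supply = build_factor_oracle("hello")
print(accepts(transitions, "ell"))  # "ell" is a substring of the corpus
print(accepts(transitions, "hex"))  # "hex" is not
```

The construction is linear in the length of the corpus, and a query costs one dictionary lookup per symbol. The price is the occasional false positive: an oracle built from "abbbaab" accepts "aba", which is not a substring of that corpus.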

There are a lot of different ways you can use a factor oracle for music generation, but first you have to decide what the states mean, and what the transitions mean. In this code the states are, uh, states, and reflect the sequence of notes (and silence) that have been received (or generated) thus far. The transitions are the notes that are emitted at each time point. Specifically, they are MIDI note-on events. For the purposes of equivalence, when constructing the oracle, the events are specified by the note (pitch) alone; the velocity (amplitude) is not taken into account. Further, all MIDI note events are collapsed into the same channel, and all changes of instrumentation or anything else are lost.
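As a rough illustration of these choices, the Python sketch below reduces note-on events to bare pitches, builds an oracle over them, and generates by a random walk. Everything in it is an assumption made for the example: the MidiEvent record is hypothetical (the real event format isn't shown here), silence is not modeled, and the walk is just one simple way to generate from an oracle, not necessarily what the Lisp code does.

```python
import random
from typing import NamedTuple


class MidiEvent(NamedTuple):
    """Hypothetical event record, invented for this sketch."""
    kind: str          # e.g. "note_on", "note_off", "program_change"
    channel: int = 0
    pitch: int = 0
    velocity: int = 0


def to_pitch_symbols(events):
    """Keep note-on events only; drop channel and velocity."""
    return [e.pitch for e in events if e.kind == "note_on"]


def build_factor_oracle(sequence):
    """Standard online factor oracle construction."""
    transitions, supply = [{}], [-1]
    for i, symbol in enumerate(sequence):
        new_state = i + 1
        transitions.append({})
        supply.append(0)
        transitions[i][symbol] = new_state
        k = supply[i]
        while k > -1 and symbol not in transitions[k]:
            transitions[k][symbol] = new_state
            k = supply[k]
        supply[new_state] = 0 if k == -1 else transitions[k][symbol]
    return transitions, supply


def generate(transitions, supply, length, rng):
    """Random walk: emit the symbol on each transition taken."""
    if len(transitions) < 2:
        raise ValueError("cannot generate from an empty corpus")
    state, output = 0, []
    while len(output) < length:
        # The final state has no way forward: fall back along its supply link.
        while not transitions[state]:
            state = supply[state]
        symbol = rng.choice(sorted(transitions[state]))
        output.append(symbol)
        state = transitions[state][symbol]
    return output


events = [
    MidiEvent("note_on", channel=0, pitch=60, velocity=90),
    MidiEvent("note_on", channel=3, pitch=62, velocity=40),
    MidiEvent("program_change", channel=3),
    MidiEvent("note_on", channel=1, pitch=60, velocity=10),
    MidiEvent("note_on", channel=0, pitch=64, velocity=70),
]
symbols = to_pitch_symbols(events)
transitions, supply = build_factor_oracle(symbols)
melody = generate(transitions, supply, 16, random.Random(0))
print(symbols)
print(melody)
```

The two events with pitch 60 differ in channel and velocity but become the same symbol, which is the equivalence described above. The generated melody only ever contains pitches that occur in the corpus.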

In the course of working with this proto-experiment I made three generators: one from some Japanese folk songs, one from some Jewish folk songs, and one from a fusion of the two, though a bit weighted toward the Japanese. Then, using each of these generators, I generated a song. Here they are:

Japanese generator

Jewish generator

Fusion generator

Information content of music

If we can generate music based on other music, then we could presumably get people to opine on what they think about it. We could, in this way, determine that people trained on GeneratorA prefer GeneratorA' to GeneratorB', if it happened that they did. The next important step is to ask: how exactly does GeneratorA compare to GeneratorA'? What does this preference mean? How can it be quantified?

Unfortunately, I didn't come up with a satisfying answer to this question. Using the notion of "runs" in music I came up with some data comparing the generators to each other. A run is a consecutive string of recognized (or unrecognized) events. For instance, for a "hello" generator, the string "hellohello" contains two runs, each of length five, whereas "helloxxxhello" contains two (positive) runs of length five, and one (negative) run of length three. This begins to capture some of the informational aspects of musical sequences. We'd expect similar generators to have similar run profiles, skewed heavily toward positive runs, with few negative runs. The results, using the pieces I made from the generators I extracted, are less clear. See the report for the graphs.
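One way to pin down the run idea is the Python sketch below. It is an interpretation, not the code behind the graphs: it assumes a greedy left-to-right scan, and it swaps the factor oracle for an exact substring test (is_factor) to keep the example short. The function names are invented for this illustration.

```python
def is_factor(corpus, piece):
    """Exact substring test that works for strings and for lists."""
    n = len(piece)
    return any(corpus[k:k + n] == piece for k in range(len(corpus) - n + 1))


def run_lengths(corpus, sequence):
    """Split `sequence` into runs, returned as (recognized, length) pairs.

    A positive run is grown greedily for as long as it stays a factor of
    the corpus. An event that cannot start a positive run counts as
    unrecognized, and consecutive unrecognized events merge into one
    negative run.
    """
    runs = []
    i = 0
    while i < len(sequence):
        j = i
        while j < len(sequence) and is_factor(corpus, sequence[i:j + 1]):
            j += 1
        if j > i:
            runs.append((True, j - i))
            i = j
        else:
            if runs and not runs[-1][0]:
                runs[-1] = (False, runs[-1][1] + 1)
            else:
                runs.append((False, 1))
            i += 1
    return runs


print(run_lengths("hello", "hellohello"))
# [(True, 5), (True, 5)]
print(run_lengths("hello", "helloxxxhello"))
# [(True, 5), (False, 3), (True, 5)]
```

With a real oracle in place of is_factor the scan would be the same, except that the oracle's occasional false positives could lengthen some positive runs.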

Analysis

The subject of information content in music, and its relation to aesthetics, deserves better treatment than the poor one I've given it here. Setting aside the issues involved in formulating a factor oracle music generator, and the assorted problems in the methodology I've outlined, the central issue of quantifying musical information relative to some knowledge base is an important one. Returning to the string example, it seems reasonable that, with reference to a "hello" generator, "helloxxxhello" and "xxxhellohello" are different entities with the same run profile. Were they encoded musically, they would likely evoke different reactions. A better nomenclature is necessary before issues like this can be addressed in earnest.
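The string example can be spelled out in a few lines of Python. The run lists are written out by hand from the definition of a run, and a "run profile" is assumed here to mean a count of runs by sign and length, which is only one reading of the term.

```python
from collections import Counter

# Runs as (recognized, length) pairs, relative to a "hello" generator.
runs_a = [(True, 5), (False, 3), (True, 5)]   # "helloxxxhello"
runs_b = [(False, 3), (True, 5), (True, 5)]   # "xxxhellohello"

# Assumed profile: how many runs of each sign and length occur.
profile_a = Counter(runs_a)
profile_b = Counter(runs_b)

print(profile_a == profile_b)  # the profiles are identical
print(runs_a == runs_b)        # the ordered sequences are not
```

Any measure built on the profile alone throws away the order in which the familiar and unfamiliar stretches arrive, which is exactly the information the two strings differ in.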

[1] Biederman, I., & Vessel, E. A. (2006). Perceptual pleasure and the brain. American Scientist, 94, 249-255.
[2] Yue, X., Vessel, E. A., & Biederman, I. (2007). The neural basis of scene preferences. NeuroReport, 18(6), 525-529.
