Grammar Induction for Musical Melodies
University of Southern California, Spring 2007 in ISE575/EE675/CSCI575/PSYCH675 by Reid Swanson (2007)
Syntactic Analysis
The focus of this work is a syntactic analysis of musical melodies. Many modern syntactic approaches in linguistics derive from Chomsky's early work on formal languages and phrase structure grammars (Chomsky 1956). Automatic syntactic parsers soon followed, based on handwritten phrase structure rules and the formal properties of the grammars. However, it was not until the introduction of the Penn Treebank (Marcus et al. 1993) that high-accuracy syntactic parsers became available. The Treebank is a large corpus, roughly one million words, of hand-annotated parse trees developed over several years by numerous experts and graduate students in linguistics. The figure on the left, from Klein & Manning (2001), is an example of a typical parse tree in the corpus. These parsers have led to an explosion of new techniques, which in some cases have brought dramatic improvements to language technologies such as information retrieval and question answering (Harabagiu 2003).
It is not hard to imagine how a similar tool for music could be equally useful in analysis, teaching and generation. Unfortunately, building such a large-scale corpus requires the skill of experts, takes years to complete and demands exquisite organizational, administrative and political skills. Supervised training on a hand-annotated corpus usually results in higher performance, but such data is rarely available because of the expense of creating it. Although generally less accurate, unsupervised methods offer a solution when enough unlabeled data is available.
One approach to automatically generating hierarchical tree structures on musical data is to cluster the data using similarity metrics. The figure on the right, from Tonal Pitch Space (Lerdahl 2001), shows an example tree that uses chord distance (a measure of a chord's distance from the tonic) to cluster the data.
Although these results do seem to produce interesting relationships, it may also be worthwhile to try approaches developed for natural language processing. Many methods have been devised for unsupervised parsing of strings, and research on them has continued despite the success of supervised parsing. In this work I use the constituent-context model proposed by Klein & Manning (2001), which performs well against strong baselines. English is a strongly right-branching language, and it turns out that always attaching to the right is a difficult baseline to beat.
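To make the right-branching baseline concrete, here is a minimal sketch of what "always attaching to the right" produces for a sequence of tokens. It is an illustration only, not code from this project: the function names and the four-note sequence are hypothetical, and the same functions apply unchanged to a list of words.

```python
def right_branching(tokens):
    """Return a binary tree (nested tuples) in which every split attaches to the right."""
    if not tokens:
        raise ValueError("need at least one token")
    if len(tokens) == 1:
        return tokens[0]
    # The first token is the left child; everything else forms the right child.
    return (tokens[0], right_branching(tokens[1:]))


def right_branching_spans(n):
    """Constituent spans (start, end), end exclusive, covering at least two tokens."""
    return [(i, n) for i in range(n - 1)]


melody = ["C4", "D4", "E4", "G4"]  # hypothetical note sequence, for illustration only
print(right_branching(melody))             # ('C4', ('D4', ('E4', 'G4')))
print(right_branching_spans(len(melody)))  # [(0, 4), (1, 4), (2, 4)]
```

Every constituent in such a tree ends at the last token, so an induced grammar has to recover structure beyond this trivial pattern in order to beat the baseline.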