Here's what I'm going to do to take a deeper look into the
prosodic nature of human-computer dialogues:
- Previous work:
- As an undergrad I annotated a portion of the
Communicator 2000 corpus and studied user behavior with
respect to errorful regions of dialog (ICSLP 2002).
- As far as prosody is concerned, at ASRU 2003 I looked at
acoustic features global to an utterance: a variety of
pitch, RMS, and spectral features that can be seen as
acoustic correlates of prosody.
- Now I want to look at smaller components within an utterance:
- pitch features from the f0 contours of voiced regions
- f0 min, max, range, median, and average for each
voiced region
- f0 slope, rise portion slope, fall portion slope,
magnitude of slope (rate of change)
- pitch contour characterizations: look into ToBI
and the Fujisaki model.
- energy features
- changes in energy between voiced segments
- using energy from band-limited filters to
calculate speech rate (alternative: use spectral
moment).
- spectral features
- using spectral moment to calc speech rate.
- changes to a given phonetic context given a
prosodic or emotional context (like Chul and Serdar's
work)
- Tools
- Here's what I have to do:
- Get file list for all the COM2000 utterances.
- Make feature files
- Global features:
- similar to what I did before.
- incorporate tags in the data
structure (maybe some other stuff
too, e.g. maxima, minima, and/or
count/frequency of the
local features within an utterance).
- Local features: (main focus of this study)
- divide utterance into voiced
regions/unvoiced regions (or
voiced/unvoiced/silence)
- time indices, duration.
- min, max, range f0
- slope measurements (avg slope, rise/fall slope, rate of change)
- characterize pitch contour
- Arrangement of the feature files?
- Management of data
- need to query and sort efficiently: DB, program, or XML metadata?
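
A minimal sketch of the local-feature step in the plan above: split an utterance's f0 track into voiced regions, then compute time indices, duration, f0 statistics, and slopes per region. Everything here is an assumption for illustration, not a settled design: the input is taken to be a list of per-frame f0 values in Hz with 0 marking unvoiced frames, the frame step is 10 ms, the names (voiced_regions, fit_slope, region_features, local_features) are hypothetical, and the "rise" and "fall" portions are simply defined as start-to-peak and peak-to-end of each region.

```python
from statistics import mean, median


def voiced_regions(f0, min_frames=2):
    """Return (start, end) frame index pairs (end exclusive) for runs of f0 > 0.

    Runs shorter than min_frames are dropped (an assumed threshold).
    """
    regions = []
    start = None
    for i, value in enumerate(f0):
        if value > 0 and start is None:
            start = i
        elif value <= 0 and start is not None:
            if i - start >= min_frames:
                regions.append((start, i))
            start = None
    # Close a region that runs to the end of the utterance.
    if start is not None and len(f0) - start >= min_frames:
        regions.append((start, len(f0)))
    return regions


def fit_slope(times, values):
    """Least-squares slope of values over times (Hz per second).

    Returns 0.0 when fewer than two points are available.
    """
    if len(values) < 2:
        return 0.0
    t_mean = mean(times)
    v_mean = mean(values)
    denom = sum((t - t_mean) ** 2 for t in times)
    if denom == 0:
        return 0.0
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    return num / denom


def region_features(f0, start, end, frame_step=0.01):
    """Features for one voiced region spanning frames [start, end)."""
    values = f0[start:end]
    times = [i * frame_step for i in range(start, end)]
    peak = values.index(max(values))  # first frame at the f0 maximum
    return {
        "start_time": start * frame_step,
        "duration": (end - start) * frame_step,
        "f0_min": min(values),
        "f0_max": max(values),
        "f0_range": max(values) - min(values),
        "f0_median": median(values),
        "f0_mean": mean(values),
        "slope": fit_slope(times, values),
        # Assumed definitions: rise = start..peak, fall = peak..end.
        "rise_slope": fit_slope(times[: peak + 1], values[: peak + 1]),
        "fall_slope": fit_slope(times[peak:], values[peak:]),
    }


def local_features(f0, frame_step=0.01):
    """One feature dict per voiced region, in time order."""
    return [
        region_features(f0, start, end, frame_step)
        for start, end in voiced_regions(f0)
    ]
```

Keeping one dict per voiced region means the global features (e.g. maxima, minima, or counts of the local features within an utterance) can be derived from the same list later, whichever storage format ends up being used.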