My research interests include spoken dialogue management, coherent interaction, and natural language processing. Specifically, I'm interested in applying machine learning techniques to dialogue modeling, using human-human corpora and minimal annotation work.
Here are some of the projects I've worked on.
Virtual Humans
ICT has been building life-sized virtual humans for simulation training environments. The project draws on various disciplines, such as computer animation, natural language processing, speech recognition and synthesis, emotion modeling, dialogue modeling, and automated reasoning. These agents can plan and participate in complex social interactions such as negotiations. I have been closely involved with dialogue management for the virtual humans and have developed a natural language generation module for them (Traum et al., 2005). One of the bottlenecks in developing virtual humans for different scenarios is the annotation work that must be carried out. Knowledge-rich, rule-based approaches require such annotations as well as manual effort for rule-writing. I've been working on bootstrapping the agents from un-annotated human-human dialogue corpora (Gandhe and Traum, 2007).
Coherent interactions
Answering free-text questions with pre-recorded video responses is a recurring theme in natural language interfaces, and it has proved very useful in applications for both training and entertainment. Users input a free-text question, which in turn elicits a pre-recorded video response. Although the video response adds considerable value to the immersive experience, the very design of the system allows for a lack of coherence. This happens when no video response directly answers the question, or when the available response is not phrased in the desired manner. We tried to address this issue by introducing a short linking dialogue between question and answer to bridge the gap. Experiments were carried out to assess whether such linking dialogues can increase the coherence of the interaction. They showed that interactions with human-generated linking dialogues have significantly better coherence (Gandhe et al., 2004). Further analysis of the human-generated linking dialogues reveals that they carry more information than is present in either the question or the answer. This points to the need for a knowledge base behind such a system. We have built such a knowledge base and have implemented initial techniques for creating simple computer-generated linking dialogues (Gandhe et al., 2006).
Speech-to-speech translation
I helped develop a speech-to-speech translation system for the medical domain (Narayanan et al., 2004; Belvin et al., 2005). Using this system, an English-speaking doctor can communicate with a Farsi-speaking patient and carry out a medical diagnosis. The system is composed of several modules: an automatic speech recognizer, machine translation, a dialogue manager, a GUI, and speech synthesis. My work focused on the dialogue manager. A Java-based GUI facilitates communication between the patient and the doctor. Only one participant, the doctor, can control the interaction. The GUI also shows the history of the current dialogue along with possible next utterances the doctor may choose to speak. The dialogue manager in this system differs from those in most dialogue systems in that it does not actively participate in the dialogue; it only assists the communication process.