Kartik Audhkhasi's Homepage
Ph.D. Candidate
Signal Analysis and Interpretation Lab (SAIL),
Advisor: Prof. Shrikanth Narayanan,
Electrical Engineering Department,
Viterbi School of Engineering,
University of Southern California, Los Angeles, U.S.A.
Previous Affiliations:
2003-2008: B.Tech. in Electrical Engineering,
M.Tech. in Information and Communication Technology,
Indian Institute of Technology - Delhi, India
Research Interests
- Specific: Modeling, analysis and design of ensembles of diverse human and machine experts. Applications include speech processing and recognition, ensemble methods in machine learning, crowd-sourcing, behavioral and multi-modal signal processing.
- Generic: Machine learning, statistical signal processing (especially stochastic resonance), optimization, speech recognition and natural language processing.
Publications
(Note: Papers under submission/review are not listed)
Peer-Reviewed Journals
- Kartik Audhkhasi and Shrikanth Narayanan, "A globally-variant locally-constant model for fusion of labels from multiple diverse experts without using reference labels", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 35, no. 4, pp. 769-783, April 2013. (Preprint) (Software) (USC EE Best Paper Award, 2013)
- Kartik Audhkhasi and Arun Kumar, "Two-scale auditory feature based non-intrusive speech quality evaluation", IETE Journal of Research, vol. 56, no. 2, pp. 111-118, March-April 2010. (Web link)
- Kartik Audhkhasi, "Automatic evaluation of fluency in spoken language" (Review article), IETE Technical Review, vol. 26, no. 2, pp. 108-114, March-April 2009. (Web link) (IETE M. N. Saha Memorial Award for Best Application Oriented Paper, 2010)
Peer-Reviewed Conference Proceedings
- Kartik Audhkhasi, Osonde Osoba, Bart Kosko, "Noise benefits in backpropagation and deep bidirectional pre-training", Proc. IJCNN, 2013, Dallas, USA.
- Kartik Audhkhasi, Osonde Osoba, Bart Kosko, "Noisy hidden Markov models for speech recognition", Proc. IJCNN, 2013, Dallas, USA.
- Fabrizio Morbini, Kartik Audhkhasi, Ron Artstein, Maarten Van Segbroeck, Kenji Sagae, Panayiotis Georgiou, David Traum and Shrikanth Narayanan, "A reranking approach for recognition and classification of speech input in conversational dialogue systems", Proc. SLT, 2012, Miami, USA. (Preprint)
- Kartik Audhkhasi, Angeliki Metallinou, Ming Li and Shrikanth Narayanan, "Speaker personality classification using systems based on acoustic-lexical cues and an optimal tree-structured Bayesian network", Proc. InterSpeech (Speaker Trait Challenge), 2012, Portland, USA. (Preprint)
- Kartik Audhkhasi, Panayiotis Georgiou and Shrikanth Narayanan, "Analyzing quality of crowd-sourced speech transcriptions of noisy audio for acoustic model adaptation", Proc. ICASSP, 2012, Kyoto, Japan. (Preprint) (Data, Readme)
- Kartik Audhkhasi, Abhinav Sethy, Bhuvana Ramabhadran, and Shrikanth Narayanan, "Creating ensemble of diverse maximum entropy models", Proc. ICASSP, 2012, Kyoto, Japan. (Preprint)
- Kartik Audhkhasi, Panayiotis Georgiou and Shrikanth Narayanan, "Reliability-weighted acoustic model adaptation using crowd sourced transcripts", Proc. InterSpeech 2011, Florence, Italy. (Preprint)
- Kartik Audhkhasi and Shrikanth Narayanan, "Emotion classification from speech using evaluator reliability-weighted combination of ranked lists", Proc. ICASSP 2011, Prague, Czech Republic. (Preprint)
- Kartik Audhkhasi, Panayiotis Georgiou and Shrikanth Narayanan, "Accurate transcription of broadcast news speech using multiple noisy transcribers and unsupervised reliability metrics", Proc. ICASSP 2011, Prague, Czech Republic. (Preprint)
- Kartik Audhkhasi and Shrikanth Narayanan, "Data-dependent evaluator modeling and its application to emotional valence classification from speech", Proc. InterSpeech 2010, Makuhari, Japan. (Preprint)
- Qun Feng Tan, Kartik Audhkhasi, Panayiotis Georgiou, Emil Ettelaie and Shrikanth Narayanan, "Automatic speech recognition system channel modeling", Proc. InterSpeech 2010, Makuhari, Japan. (Preprint)
- Kartik Audhkhasi, Panayiotis Georgiou and Shrikanth Narayanan, "Lattice-based lexical cues for word fragment detection in conversational speech", Proc. ASRU, Merano, Italy, pp. 568-573, December 2009. (Preprint)
- Kartik Audhkhasi, Kundan Kandhway, Om Deshmukh and Ashish Verma, "Formant-based technique for automatic filled-pause detection in spontaneous spoken English", Proc. ICASSP, Taipei, Taiwan, pp. 4857-4860, April 2009. (Preprint)
- Om Deshmukh, Kundan Kandhway, Ashish Verma and Kartik Audhkhasi, "Automatic evaluation of spoken English fluency", Proc. ICASSP, Taipei, Taiwan, pp. 4829-4832, April 2009. (Preprint)
- Kartik Audhkhasi and Ashish Verma, "Keyword search using modified minimum edit distance measure", Proc. ICASSP, Honolulu, Hawaii, pp. 929-932, April 2007. (Preprint)
Patents
- Kartik Audhkhasi, Om D. Deshmukh, Kundan Kandhway and Ashish Verma, "Automatic evaluation of spoken fluency", US Patent Appl. No. 12/541,927, filed August 12, 2009.
Curriculum Vitae: PDF
Updated on 09 February, 2013
SAIL Machine Learning Reading Group (MLRG): Webpage
I organize weekly meetings of the MLRG. Feel free to email me if you would like to give a talk or tutorial, attend a meeting, or suggest discussion topics or papers.
Projects at SAIL
- BABEL (IARPA)
Project focuses on keyword search in multiple languages using limited amounts of transcribed speech.
- An Integrated Approach to Creating Enriched Speech Translation Systems (NSF)
Currently involved in developing techniques for using crowd-sourced data in training conversational ASR systems in Mexican Spanish and English. Previously involved in developing features and algorithms for detecting anomalous events from spontaneous speech. Worked on detection of filled pauses, word fragments and repetitions using acoustic-prosodic and lexical features.
- ASR Training (CHAOS, TRANSTAC, TATRC and many others from various funding agencies)
Involved in training/testing acoustic and language models for speech recognition in diverse languages.
- AM Adaptation in ASR
Involved in implementing maximum likelihood linear regression (MLLR) for adapting acoustic models in a large vocabulary ASR system.
Dual Degree Project at IIT-Delhi
Non-Intrusive Speech Quality Evaluation (Advisor: Dr. Arun Kumar, Center for Applied Research in Electronics, IIT-Delhi)
Teaching Assistance
Why is teaching important for researchers?
*Industry folks: I do not necessarily agree with the last sentence on the above link ;)
- At USC
Fall 2011 - EE 562a: Random Processes in Engineering (Prof. Robert Scholtz)
- Discussion session - 5:00 to 5:50 pm, Friday, OHE 100C
- Office hours - 3:00 to 6:00 pm, Thursday, PHE 320
- TA Evaluation Summary*
Spring 2011 - EE 562a: Random Processes in Engineering (Prof. Robert Scholtz)
- Discussion session - 8:30 to 9:20 am, Monday, OHE 100B
- Office hours - 9:30 to 11:30 am, Monday, PHE 320
- TA Evaluation Summary*
Fall 2010 - EE 562a: Random Processes in Engineering (Prof. Robert Scholtz)
- Discussion session - 5:00 to 5:50 pm, Friday, OHE 100C
- Office hours - 3:00 to 5:00 pm, Tuesday, PHE 320
- TA Evaluation Summary*
*Ratings on a 1-5 Likert scale
EE 562a Syllabus: Click here
- At IIT-Delhi
Fall 2007 - EEP 306: Communication Engineering Laboratory (Prof. Ranjan Mallik)
Spring 2008 - EEL 316: Digital Communications (Prof. Ranjan Mallik) (laboratory sessions)
Industrial Experience
- IBM T. J. Watson Research Center, Yorktown Heights, New York (Human Language Technologies Group, May-August 2011)
Mentors: Abhinav Sethy, Bhuvana Ramabhadran
Worked on training a diverse ensemble of maximum entropy classifiers with a focus on natural language processing tasks.
- IBM India Research Lab, New Delhi (Human Language Technologies Group, May-July 2008)
Mentors: Om Deshmukh, Ashish Verma
Worked on filled pause detection and automatic fluency evaluation for spoken English.
- IBM India Research Lab, New Delhi (Human Language Technologies Group, May-July 2006)
Mentor: Ashish Verma
Worked on keyword spotting from spoken English.
Select Graduate Coursework
- At USC
Neural and Fuzzy Systems, Optimization: Theory and Algorithms, Machine Learning, Advanced Artificial Intelligence (Graphical Models), Empirical Methods in Natural Language Processing, Mathematical Pattern Recognition, Statistics for Engineers, Random Processes in Engineering, Wavelets.
- At IIT-Delhi
Human and Machine Speech Communication, Detection and Estimation Theory, Statistical Signal Processing, Multimedia Systems, Signal Theory, Computer Vision, Image Processing, Selected Topics in Communications (MIMO Communications), Neural Networks and Machine Learning.
Other Interests
Body-building, Racquetball, Dux Ryu Ninjitsu, Cricket, Movies
Contact Information
Office: EEB 405, 3740 McClintock Avenue,
University of Southern California,
Los Angeles, CA 90089-2564
Office phone: (213)-740-4660
E-mail: 
The University of Southern California does not screen or control the content on this website and thus does not guarantee the accuracy, integrity, or quality of such content. All content on this website is provided by and is the sole responsibility of the person from which such content originated, and such content does not necessarily reflect the opinions of the University administration or the Board of Trustees.