
Jorge Silva and Shrikanth Narayanan,
"Average Divergence Distance as a Statistical Discrimination
Measure for Hidden Markov Models," IEEE
Transactions on Audio, Speech, and Language Processing, vol.
14, issue 3, pp. 890-906, May 2006
Abstract
The
paper proposes and evaluates a new statistical discrimination
measure for hidden Markov models (HMMs) extending the notion
of divergence, a measure of average discrimination information
originally defined for two probability density functions.
Similar distance measures have been proposed for the case
of HMMs, but those have focused primarily on the stationary
behavior of the models [1], [2]. However, in speech recognition
applications, the transient aspects of the models play a
principal role in the discrimination process, and consequently
capturing this information is crucial in the formulation
of any discrimination indicator. This work proposes the
notion of Average Divergence Distance (ADD) as a statistical
discrimination measure between two HMMs, considering the
transient behavior of these models. The paper provides an
analytical formulation of the proposed discrimination measure,
a justification of its definition based on the Viterbi decoding
approach, and a formal proof that this quantity is well
defined for a left-to-right HMM topology with a final non-emitting
state, a standard model for basic acoustic units in automatic
speech recognition (ASR) systems. Experiments based
on this discrimination measure show that ADD provides
a coherent way to evaluate the dissimilarity
between acoustic models.
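
For readers unfamiliar with the starting point of the abstract, the sketch below illustrates the classical divergence between two probability density functions, i.e. the symmetrized sum of the two Kullback-Leibler distances. It is not the ADD measure for HMMs defined in the paper. The choice of univariate Gaussian densities and the function names `kl_gaussian` and `divergence` are assumptions made purely for illustration.

```python
import math


def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q), in nats, for two univariate Gaussians.

    Illustrative assumption: p = N(mu_p, var_p) and q = N(mu_q, var_q),
    with strictly positive variances.
    """
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mu_p - mu_q) ** 2) / var_q
        - 1.0
    )


def divergence(mu_p, var_p, mu_q, var_q):
    """Divergence between p and q: KL(p || q) + KL(q || p).

    This is the two-density quantity that the abstract says ADD extends;
    it is symmetric in its arguments and zero when p and q coincide.
    """
    return (
        kl_gaussian(mu_p, var_p, mu_q, var_q)
        + kl_gaussian(mu_q, var_q, mu_p, var_p)
    )


if __name__ == "__main__":
    # Two unit-variance Gaussians whose means differ by 1.
    print(divergence(0.0, 1.0, 1.0, 1.0))
```

Unlike either Kullback-Leibler term alone, the sum does not depend on the order of the two densities, which is why it can serve as a dissimilarity indicator between two models.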
Index Terms:
Acoustic discrimination measures, Kullback-Leibler distance and
divergence, hidden Markov models, speech recognition,
information theory.
File:
pdf
References
[1] B. H. Juang and L. R. Rabiner, "A probabilistic distance
measure for hidden Markov models," AT&T Technical
Journal, vol. 64, no. 2, pp. 391-408, 1985.
[2] M. N. Do, "Fast approximation of Kullback-Leibler
distance for dependence trees and hidden Markov models,"
IEEE Signal Processing Lett., vol. 10, no. 4, pp. 115-118,
Apr. 2003.
[3] M. N. Do and M. Vetterli, "Rotation invariant texture
characterization and retrieval using steerable wavelet-domain
hidden Markov models," IEEE Transactions on Multimedia,
vol. 4, no. 4, pp. 517-527, Dec. 2002.
[4] M. N. Do and M. Vetterli, "Wavelet-Based Texture
Retrieval Using Generalized Gaussian Density and Kullback-Leibler
distance," IEEE Transactions on Image Processing, vol.
11, no. 2, pp. 146-158, Feb. 2002.
[5] Y. Singer and M. K. Warmuth, "Training Algorithms for
Hidden Markov Models Using Entropy Based Distance Functions,"
in Advances in Neural Information Processing Systems 9, pp.
641-647, Morgan Kaufmann Publishers, 1996.
[6] M. Falkhausen, H. Reininger, and D. Wolf, "Calculation
of distance measures between hidden Markov models," in
Proceedings of Eurospeech 1995, pp. 1487-1490, 1995.
[7] M. Vihola, M. Harju, P. Salmela, J. Suontausta and J.
Savela, "Two dissimilarity measures for HMMs and their
application in phoneme model clustering," in Proc. ICASSP
2002, pp. 933-936, May 2002.
[8] N. Vasconcelos, "On the Efficient Evaluation of Probabilistic
Similarity Functions for Image Retrieval," IEEE Transactions
on Information Theory, vol. 50, no. 7, pp. 1482-1496, July 2004.
[9] S. Kullback, "Information Theory and Statistics,"
New York: Wiley, 1958.
[10] F. Jelinek, "Statistical Methods for Speech Recognition,"
MIT Press, 1997.
[11] D. J. C. MacKay, "Information Theory, Inference,
and Learning Algorithms," Cambridge University Press, 2003.
[12] L. R. Rabiner, "A tutorial on hidden Markov models
and selected applications in speech recognition," Proc.
IEEE, vol. 77, no. 2, pp. 257-286, Feb 1989.
[13] Ming-Yi Tsai and Lin-Shan Lee, "Pronunciation
Variations Based on Acoustic Phonemic Distance Measures with
Applications examples of Mandarin Chinese," in ASRU, Dec.
2003.
[14] H. Printz and P. Olsen, "Theory and Practice of
Acoustic Confusability," in ISCA ITRW ASR2000, pp. 77-84,
2000.
[15] J.R. Norris, "Markov Chains," Cambridge series
in Statistical and Probabilistic Mathematics, 1999.
[16] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum
Likelihood from Incomplete Data via the EM Algorithm," Journal
of the Royal Statistical Society, Series B, vol. 39, pp. 1-38,
1977.
[17] S. Young, J. Odell, D. Ollason, V. Valtchev, P. Woodland,
"HTK book," Cambridge Research Laboratory, 1997.
[18] P. Geutner, M. Finke, A. Waibel, "Selection Criteria
for hypothesis driven lexical adaptation," in ICASSP,
1999.
[19] R. Singh, B. Raj, R. Stern, "Structured redefinition
of sound units by merging and splitting for improved speech
recognition," in ICSLP, 2000.
[20] J. Kohler, "Multi-lingual phoneme recognition exploiting
acoustic-phonetic similarities of sounds," in ICSLP,
1996.
[21] A. J. Viterbi, "Error bounds for convolutional codes
and an asymptotically optimal decoding algorithm," IEEE
Trans. Information Theory, vol. IT-13, pp. 260-269, Apr. 1967.
[22] J. Li, R. M. Gray, and R. A. Olshen, "Multiresolution
image classification by hierarchical modeling with two-dimensional
hidden Markov models," IEEE Trans. on Information Theory,
vol. 46, no. 5, pp. 1826-1841, Aug. 2000.
[23] M. S. Crouse, R. D. Nowak, and R. G. Baraniuk, "Wavelet-based
statistical signal processing using hidden Markov models,"
IEEE Trans. on Signal Processing, vol. 46, pp. 886-902, Apr.
1998.
[24] R. M. Gray, "Entropy and Information Theory,"
Springer-Verlag, New York, 1990.
[25] S. Chretien and A. L. Hero III, "Kullback Proximal
Algorithms for Maximum-Likelihood Estimation," IEEE Trans.
on Information Theory, vol. 46, no. 5, pp. 1800-1810, Aug
2000.
[26] G. D. Forney, "The Viterbi algorithm," Proceedings
of the IEEE, vol. 61, pp. 268-278, 1973.
[27] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, "A
maximization technique occurring in the statistical analysis
of probabilistic functions of Markov chains," Ann. Math.
Stat., vol. 41, pp. 164-171, 1970.
[28] L. R. Liporace, "Maximum likelihood estimation for
multivariate observations of Markov sources," IEEE Trans.
on Information Theory, vol. IT-28, pp. 729-734, Sept. 1982.
[29] S. Yildirim and S. Narayanan, "An information-theoretic
analysis of developmental changes in speech," Proc. ICASSP,
April 2003.
[30] L. R. Bahl, P. F. Brown, P. V. de Souza, and R. L. Mercer,
"Maximum mutual information estimation of hidden Markov
model parameters for speech recognition," in Proc. ICASSP
86, pp. 49-52, Apr. 1986.
[31] Y. Normandin, R. Cardin, and R. De Mori, "High-performance
connected digit recognition using maximum mutual information
estimation," IEEE Trans. Speech Audio Processing, vol.
2, pp. 299-311, 1994.
[32] B.-H. Juang, W. Chou, and C.-H. Lee, "Minimum classification
error methods for speech recognition," IEEE Trans. Speech
Audio Processing, vol. 5, no. 3, pp. 257-265, 1997.
[33] Y. Ephraim, A. Dembo and L. R. Rabiner, "A minimum
discrimination information approach for hidden Markov modeling,"
IEEE Trans. on Information Theory, vol. 35, no. 5, pp. 1001-1013,
Sept. 1989.
[34] C. E. Shannon, "A mathematical theory of communication,"
Bell Syst. Tech. J., vol. 27, pp. 379-423, 623-656, 1948.
[35] Y. Singer and M. K. Warmuth, "Batch and on-line parameter
estimation of Gaussian mixtures based on the joint entropy,"
in Advances in Neural Information Processing Systems 11, pp.
578-584, 1998.
[36] O. Ronen, J.R. Rohlicek, and M. Ostendorf, "Parameter
estimation of dependence tree models using EM algorithm,"
IEEE Signal Proc. Letter, vol. 2, no. 8, pp. 157-159, Aug.
1995.
[37] P. Smyth, D. Heckerman, and M. Jordan, "Probabilistic
independence networks for hidden Markov models," Neural
Computation, vol. 9, no. 2, pp. 227-269, 1997.
[38] P. Smyth, "Clustering sequences using hidden Markov
models," in Advances in Neural Information Processing
Systems 9, MIT Press, pp. 648-654, 1997.
[39] A. Willsky, "Multiresolution Markov Models for
Signal and Image Processing," Proceedings of the IEEE,
vol. 90, no. 8, Aug. 2002.
[40] H. Lucke, "Which Stochastic Models Allow Baum-Welch
Training?," IEEE Transactions on Signal Processing, vol.
44, no. 11, Nov. 1996.
[41] T. M. Cover and J. A. Thomas, "Elements of Information
Theory," Wiley Interscience, New York, NY, 1991.