№1, 2015
AN APPROACH TO PITCH PERIOD DETECTION OF SPEECH SIGNAL BASED ON ADAPTED WAVELETS
Few of the existing speaker recognition methods can handle nonlinear and non-stationary speech signals. The pitch period is one of the most important features for speaker characterization. This paper presents a method for detecting the pitch period of nonlinear and non-stationary speech signals based on the empirical wavelet transform. Experiments show the high relative efficiency of the proposed approach at different noise levels. (pp. 33-41)
Keywords: pitch period, empirical wavelet transform, Teager-Kaiser energy operator, intrinsic mode function, instantaneous frequency
References
- Rabiner L.R., Cheng M.J., Rosenberg A.E., McGonegal C.A. A comparative performance study of several pitch detection algorithms // IEEE Trans. on Acoust., Speech and Signal Proc., 1976, no.5, pp.399–417.
- Tan L.N., Alwan A. Multi-band summary correlogram-based pitch detection for noisy speech // Speech Communication, 2013, vol.55, no.7–8, pp.841–856.
- Ba H., Yang N. BaNa: a hybrid approach for noise resilient pitch detection // IEEE Statistical Signal Processing Workshop, 2012, pp.369–372.
- De Cheveigne A., Kawahara H. YIN, a fundamental frequency estimator for speech and music // J. Acoust. Soc. Am., 2002, vol.111, no.4, pp.1917–1930.
- Kasi K., Zahorian S.A. Yet another algorithm for pitch tracking / Proc. of the ICASSP, 2002, pp.361–364.
- Camacho A. SWIPE: a sawtooth waveform inspired pitch estimator for speech and music. Ph.D. dissertation. Florida, 2007, 116 p.
- Gonzalez S., Brookes M. A pitch estimation filter robust to high levels of noise (PEFAC) / Proc. of EUSIPCO, 2011, pp.451–455.
- Boashash B. Estimating and interpreting the instantaneous frequency of a signal // Proc. IEEE, 1992, vol.80, no.4, pp.520–568.
- Maragos P., Kaiser J.F., Quatieri T.F. On amplitude and frequency demodulation using energy operators // IEEE Trans. on Signal Processing, 1993, vol.41, no.4, pp.1532–1550.
- Abe T., Kobayashi T., Imai S. Harmonics tracking and pitch extraction based on instantaneous frequency / Proc. of ICASSP, 1995, vol.1, pp.756–759.
- Abe T., Honda M. Sinusoidal model based on instantaneous frequency attractors // IEEE Trans. on Audio, Speech and Language Processing, 2006, vol.14, no.4, pp.1292–1300.
- Azarov E., Petrovsky A., Parfieniuk M. Estimation of the instantaneous harmonic parameters of speech / Proc. of EUSIPCO, 2008, pp.1–5.
- Huang N.E., Shen Z., Long S.R., Wu M.L., Shih H.H., Zheng Q., Yen N.C., Tung C.C., Liu H.H. The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis // Proc. Roy. Soc. London A, 1998, vol.454, pp.903–995.
- Gilles J. Empirical Wavelet Transform // IEEE Transactions on Signal Processing, 2013, vol.61, no.16, pp.3999–4010.
- Vakman D. On the analytic signal, the Teager–Kaiser energy algorithm, and other methods for defining amplitude and frequency // IEEE Trans. on Signal Process., 1996, vol.44, no.4, pp.791–797.
- Chu W., Alwan A. Reducing f0 frame error of f0 tracking algorithms under noisy conditions with an unvoiced/voiced classification frontend / Proc. of ICASSP, 2009, pp.3969–3972.
- Varga A., Steeneken H.J. Assessment for automatic speech recognition: II. Noisex-92: a database and an experiment to study the effect of additive noise on speech recognition systems // Speech Communication, 1993, vol.12, no.3, pp.247–251.
- Drugman T., Alwan A. Joint robust voicing detection and pitch estimation based on residual harmonics / Proc. of Interspeech, 2011, pp.1973–1976.
- Azarov E., Vashkevich M., Petrovsky A. Instantaneous pitch estimation based on RAPT framework / Proc. of EUSIPCO, 2012, pp.2787–2791.