Speaker based segmentation on broadcast news- on the use of ISI technique

Authors

  • S. Ouamour, USTHB, Electronics Institute, BP 32 Bab Ezzouar, Alger, Algeria
  • M. Guerti, ENP, Hacen Badi, El-Harrach, Alger, Algeria
  • H. Sayoud, USTHB, Electronics Institute, BP 32 Bab Ezzouar, Alger, Algeria

DOI:

https://doi.org/10.36505/ExLing-2006/01/0042/000042

Abstract

In this paper we propose a new segmentation technique called ISI, or “Interlaced Speech Indexing”, developed and implemented for the task of broadcast news indexing. It consists in finding the identity of a well-defined speaker and the moments of his interventions inside an audio document, so that his speech can be accessed rapidly, directly and easily. Our segmentation procedure is based on an interlaced equidistant segmentation (IES) associated with our new ISI algorithm. This approach uses a speaker identification method based on Second Order Statistical Measures (SOSM). As the SOSM measure, we choose “µGc”, which is based on the covariance matrix. However, experiments showed that this method needs a speech length of at least 2 seconds, which means that the segmentation resolution would be 2 seconds. By combining the SOSM with the new indexing technique (ISI), we demonstrate that the average segmentation error is reduced to only 0.5 second, which is more accurate and better suited to real-time applications. Results indicate that this association provides a high resolution and a high tracking performance: the indexing score (percentage of correctly labelled segments) is 95% on the TIMIT database and 92.4% on the Hub4 Broadcast News 96 database.
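To make the idea of an interlaced equidistant segmentation concrete, the following is a minimal sketch and not the authors' implementation. It assumes two sequences of fixed-length windows, the second shifted by a fixed offset. Only the 2-second window length comes from the abstract. The 1-second offset, the function names `interlaced_windows` and `boundary_grid`, and the 8-second example duration are hypothetical choices made for illustration.

```python
def interlaced_windows(total_s, win_s=2.0, offset_s=1.0):
    """Build two equidistant window sequences over a recording of total_s seconds.

    win_s is the analysis window length (2 s, as stated in the abstract).
    offset_s is the shift of the second sequence (an assumed value).
    """
    def sequence(start_s):
        # Keep only the windows that fit entirely inside the recording.
        count = int((total_s - start_s) // win_s)
        return [(start_s + k * win_s, start_s + (k + 1) * win_s)
                for k in range(count)]

    return sequence(0.0), sequence(offset_s)


def boundary_grid(seq_a, seq_b):
    """Return the sorted set of window edges from both sequences."""
    edges = {edge for window in seq_a + seq_b for edge in window}
    return sorted(edges)


if __name__ == "__main__":
    seq_a, seq_b = interlaced_windows(8.0)
    print(seq_a)                        # windows starting at 0 s
    print(seq_b)                        # windows starting at 1 s
    print(boundary_grid(seq_a, seq_b))  # combined edges of both sequences
```

Under these assumptions every window still contains the 2 seconds of speech that the identification method needs, while the combined edges of the two sequences are spaced more finely than 2 seconds. How ISI combines the speaker labels of overlapping windows is not described in the abstract, so that step is left out.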

 

References

Bimbot F. et al. 1995. Second-order statistical measures for text-independent speaker identification. Speech Communication, 17, 177-192.

Bonastre J.F. et al. 2000. A speaker tracking system based on speaker turn detection for NIST evaluation. IEEE ICASSP, Istanbul, June 2000.

Delacourt P. et al. 2000. DISTBIC: a speaker-based segmentation for audio data indexing. Speech Communication, 32, Issue 1-2.

Gish H. 1990. Robust discrimination in automatic speaker identification. IEEE Inter. Conference on Acoustics Speech and Signal Processing, April 1990, New Mexico, 289-292.

Liu D. and Kubala F. 1999. Fast speaker change detection for broadcast news transcription and indexing. Eurospeech, Vol. 3, 1031-1034.

Reynolds D.A. et al. 1998. Blind clustering of speech utterances based on speaker and language characteristics. ICSLP, Vol. 7, 3193-3196.

Published

01-01-2006

How to Cite

Speaker based segmentation on broadcast news- on the use of ISI technique. (2006). Linguistic Proceedings Series, 1(1), 193-196. https://doi.org/10.36505/ExLing-2006/01/0042/000042