Presented at the 1999 IEEE Workshop on Nonlinear Signal and Image Processing

Speaker Localization for Far-field and Near-field Wideband Sources Using Neural Networks

Guner Arslan (1), F. Ayhan Sakarya (2)(3), and Brian L. Evans (1)

(1) Department of Electrical and Computer Engineering, Engineering Science Building, The University of Texas at Austin, Austin, TX 78712-1084 USA
arslan@ece.utexas.edu - bevans@ece.utexas.edu

(2) Department of Electronics and Telecommunication Engineering, Yildiz Technical University, 80750 Istanbul, Turkey
sakarya@ana.cc.yildiz.edu.tr

(3) Wireless Technology Laboratory, Lucent Technologies, Holmdel, NJ 07733-3030 USA
sakarya@lucent.com

Abstract

Many applications, such as hands-free videoconferencing, speech processing in large rooms, and acoustic echo cancellation, use microphone arrays to track speaker locations in real time. A speaker is a wideband source that may be in the near field or far field of the array. Current source localization approaches based on neural networks can meet real-time constraints but assume far-field narrowband sources. In this paper, we (1) apply neural networks to estimate the direction of arrival of near-field and far-field wideband speakers, and (2) compute the instantaneous cross-power spectra between adjacent pairs of sensors to form the feature vector. We optimized the overall speaker localization system off-line to yield an absolute error of less than 6 degrees at an SNR of 10 dB and a sampling rate of 8000 Hz at each sensor. When performing speaker localization in real time, the system would require 1 MFLOP/s.
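
As an illustration of the feature vector described in the abstract, the following is a minimal Python sketch of forming features from instantaneous cross-power spectra between adjacent sensor pairs. The abstract does not give the details, so several choices here are assumptions rather than the paper's method: the function names `dft_bin` and `cross_power_features`, the selection of DFT bins, the normalization to unit magnitude, and the encoding of each value as a real and imaginary pair.

```python
import cmath
import math


def dft_bin(frame, k):
    """Return the k-th DFT coefficient of one frame (naive direct sum)."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(frame))


def cross_power_features(frames, bins):
    """Build a feature vector from adjacent-pair cross-power spectra.

    frames: list of equal-length sample lists, one per sensor, in array order.
    bins:   DFT bin indices to keep (hypothetical choice, e.g. speech band).
    """
    # Spectrum of each sensor's frame at the selected bins only.
    spectra = [[dft_bin(frame, k) for k in bins] for frame in frames]
    features = []
    # Pair each sensor with its neighbor: (0, 1), (1, 2), ...
    for left, right in zip(spectra, spectra[1:]):
        for x_left, x_right in zip(left, right):
            s = x_left * x_right.conjugate()  # instantaneous cross-power
            mag = abs(s)
            # Assumption: keep only the inter-sensor phase by normalizing.
            unit = s / mag if mag > 0 else 0j
            features.extend([unit.real, unit.imag])
    return features


if __name__ == "__main__":
    # Three sensors observing one tone, each delayed 2 samples from the last.
    n, k, delay = 64, 4, 2
    frames = [[math.cos(2 * math.pi * k * (i - m * delay) / n)
               for i in range(n)] for m in range(3)]
    print(cross_power_features(frames, [k]))
```

With M sensors and B selected bins, this sketch yields a vector of 2 * (M - 1) * B values per frame. For a tone delayed by a fixed number of samples between neighbors, each pair's cross-power phase equals the corresponding phase shift, which is the quantity a direction-of-arrival estimator would need to learn from.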


Last Updated 08/07/99.