Cocktail Party Problem as Binary Classification
Speech segregation, or the cocktail party problem, has proven to be extremely challenging. Part of the challenge stems from the lack of a carefully analyzed computational goal. While the separation of every sound source in a mixture is considered the gold standard, I argue that such an objective is neither realistic nor what the human auditory system does. Motivated by the auditory masking phenomenon, we have suggested instead the ideal time-frequency (T-F) binary mask as a main goal for computational auditory scene analysis. Ideal binary masking retains the mixture energy in T-F units where the local signal-to-noise ratio exceeds a certain threshold, and rejects the mixture energy in other T-F units. Recent psychophysical evidence shows that ideal binary masking leads to large speech intelligibility improvements in noisy environments for both normal-hearing and hearing-impaired listeners. The effectiveness of the ideal binary mask implies that sound separation may be formulated as a case of binary classification, which opens the cocktail party problem to a variety of pattern classification and clustering methods. As an example, I discuss a recent system that segregates unvoiced speech by supervised classification of acoustic-phonetic features.
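The masking rule described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the system discussed in the talk: the function names `ideal_binary_mask` and `apply_mask`, the `lc_db` threshold parameter, and the 0 dB default are assumptions made here, and the target and noise energies are assumed to be available separately (which is what makes the mask "ideal").

```python
import numpy as np


def ideal_binary_mask(target_energy, noise_energy, lc_db=0.0):
    """Return a 0/1 mask over T-F units (hypothetical helper).

    target_energy, noise_energy: arrays of the same shape holding the
    premixed target and interference energy in each T-F unit.
    lc_db: local SNR threshold in dB (assumed default of 0 dB).
    """
    target_energy = np.asarray(target_energy, dtype=float)
    noise_energy = np.asarray(noise_energy, dtype=float)
    # Tiny constant avoids division by zero and log of zero.
    eps = np.finfo(float).tiny
    local_snr_db = 10.0 * np.log10((target_energy + eps) / (noise_energy + eps))
    # 1 where the local SNR exceeds the threshold, 0 elsewhere.
    return (local_snr_db > lc_db).astype(float)


def apply_mask(mixture_tf, mask):
    """Retain mixture energy in units labeled 1 and reject the rest."""
    return np.asarray(mixture_tf) * mask


# Toy 2x2 T-F grid with made-up energies.
target = np.array([[4.0, 1.0], [0.25, 9.0]])
noise = np.array([[1.0, 2.0], [0.5, 1.0]])
mixture = np.array([[2.0, 3.0], [4.0, 5.0]])

mask = ideal_binary_mask(target, noise)
masked_mixture = apply_mask(mixture, mask)
```

Because each T-F unit receives a 0 or 1 label, a mask computed this way could serve as the training target for a binary classifier that sees only features of the mixture.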
Date Found: October 13, 2010
Date Produced: July 30, 2009