[chuck-users] Phoneme Recognition with ChucK
joerg at piringer.net
Wed Nov 14 03:43:48 EST 2007
for an interactive installation i had some success
differentiating phonemes using linear predictive coding (LPC).
as a starting point i used (in C++):
i extracted a set of features and compared them via a simple
n-dimensional distance formula to prerecorded samples.
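
A minimal sketch of that pipeline in Python, not the original C++ code: LPC coefficients are computed per frame with the autocorrelation method (Levinson-Durbin recursion) and an unknown frame is given the label of the nearest prerecorded template by Euclidean distance. The choice of raw LPC coefficients as the feature vector, the model order, the names `lpc`, `classify` and `sine_frame`, and the synthetic sine frames are all assumptions made for illustration; the actual feature set is not specified above.

```python
import math

def autocorrelation(frame, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def lpc(frame, order):
    """Predictor coefficients a[1..order] with x[n] ~ sum(a[k] * x[n-k]),
    computed with the Levinson-Durbin recursion."""
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    err = r[0]                      # prediction error energy so far
    for i in range(1, order + 1):
        if err <= 0.0:              # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]

def distance(u, v):
    """Plain n-dimensional Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def classify(features, templates):
    """templates: list of (label, feature_vector); returns the nearest label."""
    return min(templates, key=lambda t: distance(features, t[1]))[0]

def sine_frame(freq, length=200):
    """Hypothetical stand-in for a recorded frame (freq in cycles per sample)."""
    return [math.sin(2.0 * math.pi * freq * n) for n in range(length)]

# Two "prerecorded" templates and one unknown frame.
templates = [("low", lpc(sine_frame(0.02), 2)),
             ("high", lpc(sine_frame(0.40), 2))]
print(classify(lpc(sine_frame(0.03), 2), templates))
```

With real speech one would typically window each frame, use a higher order, and average over several frames per phoneme; those choices are left open here.
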
it works reliably for phonemes that are quite different from each other.
for example "a" and "s" produce almost 100% hits.
distinguishing between "a" and "o", for example, is much less reliable.
i am planning to add some learning algorithm to increase the
reliability, maybe neural networks.
i'd really recommend testing your algorithm as early as possible with
different speakers (male & female), because i think it is
relatively easy to create something that works for your own voice but
does not work for others.
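
One way to check this early is a leave-one-speaker-out test: hold back all samples of one speaker, classify them against templates from the remaining speakers only, and repeat for each speaker. The sketch below assumes samples are available as `(speaker, label, feature_vector)` tuples and uses a nearest-template classifier; the function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def nearest_label(features, templates):
    """templates: list of (label, feature_vector); returns the nearest label."""
    def dist(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
    return min(templates, key=lambda t: dist(features, t[1]))[0]

def leave_one_speaker_out(samples):
    """samples: list of (speaker, label, feature_vector).
    Returns {speaker: accuracy when that speaker is held out}."""
    by_speaker = defaultdict(list)
    for speaker, label, feats in samples:
        by_speaker[speaker].append((label, feats))
    results = {}
    for held_out, test_items in by_speaker.items():
        # Templates come only from the other speakers.
        templates = [item
                     for spk, items in by_speaker.items() if spk != held_out
                     for item in items]
        if not templates:
            continue
        hits = sum(1 for label, feats in test_items
                   if nearest_label(feats, templates) == label)
        results[held_out] = hits / len(test_items)
    return results

# Toy data: speaker C's "a" lies close to the other speakers' "s".
samples = [
    ("A", "a", [1.0, 0.0]),  ("A", "s", [-1.0, 0.0]),
    ("B", "a", [0.9, 0.1]),  ("B", "s", [-1.1, 0.0]),
    ("C", "a", [-0.8, 0.0]), ("C", "s", [-1.0, 0.1]),
]
print(leave_one_speaker_out(samples))
```

A speaker whose held-out accuracy is much lower than the rest is a sign that the templates are tuned to particular voices.
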
Inventor-66 at comcast.net wrote:
> On the electro-music.com forum we've been discussing a phoneme recognizer based on ChucK. So far we have an FFT grabber that gets its samples off the dac so you can ChucK-up any input file to it including a microphone listener, rec_in.ck, a little bit of feature extraction and a rules-based phoneme recognizer. It can basically tell an "ee" phoneme from an "ay" phoneme, and I plan to add all the vowels, a e i o u really soon.
> To help with the project, I was wondering if anyone from the list would like to contribute some suggestions or comments about our technique. Basically we are shunning the hidden markov models that are often used to recognize words from phonemes in favor of creating a simple phoneme recognizer. This way we offload much of the computation from the machine into the human user's brain, thus enabling a compact ChucK implementation as well as someday a simple dsPIC processor-based handheld product that streams text from a mic for the hearing impaired.
> For example, if you said "Mary had a little lamb", the device might output "mehry hahd ah letl lam" and it is up to the user to interpret that into regular speech. In this way a very compact little product with just a few chips could be produced that would retail for twenty dollars or lower, making it affordable and portable for the hearing impaired to use.
> I was just curious as to what your thoughts may be. I'm doing the code, and Kassen plus others like dewdrop_world are commenting and pointing me in the right direction as we go. Please feel free to offer suggestions and/or constructive criticism.