hi Les,

for an interactive installation i had some success differentiating phonemes using linear predictive coding. as a starting point i used (in C++): http://soundlab.cs.princeton.edu/software/rt_lpc/ i extracted a set of features and compared them to prerecorded samples via a simple n-dimensional distance formula. it works reliably for phonemes that are quite different from each other; for example, "a" and "s" produce almost 100% hits. distinguishing between "a" and "o", for example, is much less reliable. i am planning to involve some learning algorithm to increase the reliability, maybe neural networks.

i'd really recommend testing your algorithm as early as possible with different speakers (male & female), because i think it would be relatively easy to create something that works for your own voice but not for others.

best
joerg
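p.s. in case it's useful, here is a minimal sketch in ChucK (since this is the ChucK list) of the kind of n-dimensional distance comparison i mean; the feature vectors and labels below are just placeholders, not my actual feature set:

    // hypothetical prerecorded feature vectors (placeholder values, not real data)
    [ 0.82, 0.11, 0.43 ] @=> float templateA[];
    [ 0.20, 0.75, 0.31 ] @=> float templateS[];

    // simple n-dimensional (euclidean) distance between two feature vectors
    fun float distance( float x[], float y[] )
    {
        0.0 => float sum;
        for( 0 => int i; i < x.size(); i++ )
        {
            x[i] - y[i] => float d;
            d * d +=> sum;
        }
        return Math.sqrt( sum );
    }

    // a live frame gets the label of whichever template is closest
    [ 0.80, 0.14, 0.40 ] @=> float live[];
    if( distance( live, templateA ) < distance( live, templateS ) )
        <<< "a" >>>;
    else
        <<< "s" >>>;

Inventor-66@comcast.net wrote: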
Hi,
On the electro-music.com forum we've been discussing a phoneme recognizer based on ChucK. So far we have an FFT grabber that gets its samples off the dac (so you can ChucK-up any input file to it, including a microphone listener, rec_in.ck), a little bit of feature extraction, and a rules-based phoneme recognizer. It can basically tell an "ee" phoneme from an "ay" phoneme, and I plan to add all the vowels, a e i o u, really soon.
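To give a rough idea, a stripped-down sketch of the FFT-grabbing part could look something like this (this is not our actual code; the frame size and the bins printed are arbitrary choices):

    // minimal sketch: tap whatever is chucked to the dac and read magnitudes per frame
    dac => FFT fft => blackhole;
    1024 => fft.size;
    Windowing.hann( 1024 ) => fft.window;

    // half the frame holds the useful bins
    complex spec[ 512 ];

    while( true )
    {
        // let one frame of audio accumulate
        1024::samp => now;
        // take the transform and copy the spectrum out
        fft.upchuck();
        fft.spectrum( spec );
        // crude "features": magnitudes of a low bin and a higher bin
        <<< (spec[8] $ polar).mag, (spec[32] $ polar).mag >>>;
    }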
To help with the project, I was wondering if anyone from the list would like to contribute some suggestions or comments about our technique. Basically we are shunning the hidden Markov models that are often used to recognize words from phonemes, in favor of creating a simple phoneme recognizer. This way we offload much of the computation from the machine onto the human user's brain, thus enabling a compact ChucK implementation as well as, someday, a simple dsPIC processor-based handheld product that streams text from a mic for the hearing impaired.
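As a sketch of what "rules-based" means here, the decision can be as simple as thresholding a couple of extracted features; the feature values and the threshold below are purely illustrative, not the ones the recognizer actually uses:

    // hypothetical per-frame features (rough stand-ins for formant-region energies)
    0.31 => float lowBandEnergy;
    0.62 => float highBandEnergy;

    // rules-based decision: compare a band ratio against a fixed threshold
    highBandEnergy / ( lowBandEnergy + 0.0001 ) => float ratio;
    if( ratio > 1.5 )
        <<< "ee" >>>;
    else
        <<< "ay" >>>;

Adding the rest of the vowels would then just mean adding more such rules.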
For example, if you said "Mary had a little lamb", the device might output "mehry hahd ah letl lam", and it would be up to the user to interpret that into regular speech. In this way a very compact little product with just a few chips could be produced that would retail for twenty dollars or less, making it affordable and portable for the hearing impaired to use.
I was just curious as to what your thoughts may be. I'm doing the code, and Kassen plus others like dewdrop_world are commenting and pointing me in the right direction as we go. Please feel free to offer suggestions and/or constructive criticism.
Thanks,
Les
--
http://joerg.piringer.net
http://www.transacoustic-research.com
http://www.iftaf.org
http://www.vegetableorchestra.org/