On 31/07/2012, sunny chaudhary wrote:
Hi,
Hey Sunny!
I am new to this technology. I want to do some media processing on a static file: mixing a recorded human voice with some music. The voice should be time-stretched and pitch-shifted to follow the music's timing and pitch, so that the voice and the music are fully merged and, in the final output, the voice plays along with the flow of the music.
Any idea how to achieve this? Is ChucK the right tool for the job? I need this functionality on an urgent basis, and we are willing to pay for a solution.
This is quite possible, but it would, of course, take some work to set it all up and calibrate it properly.

As a basic method I'd take a "phase vocoder" (see Wikipedia or your favourite textbook). Examples of how to build something like that can be found in the examples directory, under "analysis", particularly the files dealing with the FFT. For the timing you will need to write some analysis of the transients in both the vocal recording and the music, then come up with a way of lining those up. From memory, examples/deep/ should have an envelope follower that you could use as a basis for that.

So it's possible, but if you are new to this kind of thing and also need it in a hurry, you have a lot of work ahead of you, because this doesn't sound especially easy. A short while ago Smule released an "app" for mobile platforms that does basically exactly this. Since you write that you are willing to pay for a solution, you could simply buy that; I think you'd still be running ChucK underneath it all. Of course, doing that means you don't get to learn how such things work. For me that would be a factor, but I realise it isn't for everyone.

Hope that helps a bit.

Yours,
Kas.

PS: thank you for agreeing to move the discussion to the list. I'm sure more people will be interested in the problem, as it's quite an interesting one.
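To make the spectral part a little more concrete, below is a minimal, untested sketch of the kind of FFT => IFFT analysis/resynthesis patch the "analysis" examples are built around. The file name "voice.wav", the window size and the hop size are just placeholder assumptions; a real phase vocoder would additionally track each bin's phase from frame to frame and rescale it to do the actual time stretching and pitch shifting, which is the part you would have to write and calibrate yourself.

// minimal FFT => IFFT skeleton (sketch, untested)
// a phase vocoder would modify the spectrum between upchuck() and transform()

// sound file in, into the spectral domain, back out to the dac
SndBuf buf => FFT fft =^ IFFT ifft => dac;

// hypothetical input file; replace with your own recording
"voice.wav" => buf.read;

// analysis/resynthesis parameters (placeholder values)
1024 => fft.size;
1024 => ifft.size;
Windowing.hann(1024) => fft.window;
Windowing.hann(1024) => ifft.window;

// buffer for the complex spectrum (size/2 bins)
complex spectrum[512];

// hop: a quarter of the window
(fft.size()/4)::samp => dur HOP;

while( true )
{
    // take the FFT of the most recent window
    fft.upchuck();
    // copy the spectrum out
    fft.spectrum( spectrum );

    // ... this is where the phase-vocoder work would go:
    // rescale bin phases/frequencies for pitch shifting,
    // change the analysis/resynthesis hop ratio for time stretching ...

    // resynthesise from the (possibly modified) spectrum
    ifft.transform( spectrum );

    // advance time by one hop
    HOP => now;
}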
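For the transient/timing side, this is roughly the envelope-follower idiom I had in mind (again a sketch from memory, untested): square the signal with a Gain set to multiply its inputs, then low-pass it with a OnePole. The pole setting and the threshold are things you would have to calibrate for your material, and for a static file you would replace adc with a SndBuf reading that file.

// simple envelope follower (sketch, untested)
// square the signal, then low-pass it to get a smooth amplitude envelope

adc => Gain g => OnePole p => blackhole;
// connect the same signal a second time and multiply the two inputs
adc => g;
3 => g.op;          // op 3 == multiply inputs: gives us signal squared

// pole position sets how quickly the envelope responds (calibrate to taste)
0.999 => p.pole;

// hypothetical threshold for calling something a "transient"
0.01 => float threshold;
0 => int above;

while( true )
{
    // the envelope is the filter's current output
    p.last() => float env;

    // crude onset detection: report when the envelope crosses the threshold
    if( !above && env > threshold )
    {
        <<< "onset around", now / second, "seconds" >>>;
        1 => above;
    }
    else if( above && env < threshold * 0.5 )
    {
        0 => above;
    }

    // check the envelope every 10 ms
    10::ms => now;
}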