check this out!
would be nice to port this to axolotiland! does somebody know neil?
It's a lot of fun!
I'm not sure if the voice model would run on the Axoloti- there's a big difference between the processing power of the Axoloti and that of a desktop computer. Also, the GUI is great- I'm not sure how well the whole thing would work without the nice interface.
Having said all that, I'd love to see if it could be done. I love those vocals-ey sounds.
a|x
yeah, i will try to contact him and see if he would share the code...
the ui is not so important i think, mapping those controls to 4 or 5 cc's should be fun as well..
It's JavaScript, so the code will be visible. It will probably be minified though, so not very readable.
You'd need quite a few controls, I think. If you could get hold of the basic synthesis code, you could probably work out a way of controlling the parameters of the model that makes more sense in the context of MIDI CCs. Simply having a knob for every parameter of the model probably won't make much sense.
In text-to-speech systems, these kinds of models are usually fed parameter streams derived from analysed natural speech.
a|x
I'm thinking an X/Y touchpad could be implemented to continuously control the tongue placement. Triggering the flips would be easy and a foot pedal or velocity keystroke could influence the palate/lip position. That way, you still have the different notes on the keyboard to play/sing a tune!
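Something like this, maybe- a minimal sketch in C of mapping a pair of MIDI CC values (e.g. from an X/Y pad) onto the tongue parameters. The struct, field names, and numeric ranges here are my assumptions for illustration, not taken from the actual model's code:

```c
/* Sketch: map two MIDI CC values (0-127) onto the tongue parameters of a
   vocal-tract model. Parameter names and ranges are assumed, not real. */
typedef struct {
    float tongue_index;     /* position along the tract (assumed 12..29) */
    float tongue_diameter;  /* constriction diameter   (assumed 2.0..3.5) */
} TongueParams;

/* Linearly rescale a 0-127 CC value into the range [lo, hi]. */
static float cc_to_range(int cc, float lo, float hi) {
    return lo + (hi - lo) * (cc / 127.0f);
}

/* X controls tongue position, Y controls tongue diameter. */
TongueParams tongue_from_xy(int cc_x, int cc_y) {
    TongueParams p;
    p.tongue_index    = cc_to_range(cc_x, 12.0f, 29.0f);
    p.tongue_diameter = cc_to_range(cc_y, 2.0f, 3.5f);
    return p;
}
```

The foot pedal or key velocity could feed a third `cc_to_range()` call for the palate/lip parameter in the same way.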
Sounds great! Getting it to speak or sing comprehensibly would require a lot of work, I think. Reminds me a bit of the Voder, Bell Labs' early manually-operated speech synthesizer, which required a year of solid training to master the multiple hand- and foot-operated controls.
On the other hand, if you just want to make cool vocal-esque sounds, it would work very well, I think!
a|x
yeah, just some cool vocal-esque sounds... i looked at the code, and it doesn't seem to use streamed data.
Browsers have text-to-speech and speech-to-text built in nowadays, accessible via JavaScript. It would be a huge effort to port a full speech engine to the Axoloti.
i'm pretty convinced it's not that, did you look at it? it's not text-to-speech... @toneburst did an lpc port a while back for the axoloti...
I’m pretty sure it’s not leveraging the built-in browser text-to-speech API. It’s a self-contained vocal-tract emulation.
a|x
Some text-to-speech systems use similar models for speech-synthesis, but they’re usually trained on analysed natural speech, and driven by complex sets of rules to generate parameter values from text input.
High-level text-to-speech APIs, like those built into browsers, are ‘black boxes’ and don’t give direct access to the parameters of the synthesis model.
a|x
Would be great to have an Axoloti speech synth object like the system BitSpeek uses as well!
My LPC objects go a lot further than BitSpeek in terms of sound-mangling potential, but they only work with pre-recorded LPC data.
I’d really love to make an object to convert audio to an LPC stream in real-time, but I’m not sure how to approach that, or if the Axoloti has the processing power to do that.
a|x
I think what's needed is a Levinson-Durbin implementation. There seems to be lots of source code available for that, since it's been around for a long time.
Unfortunately, the theory is all a bit over my head, and I don't really have the coding or DSP skills to tackle an Axoloti object implementation on my own.
I also have no real idea if it's practical to attempt this on an MCU like the Axoloti's. All the implementations I've seen documented are non-realtime, even on desktop computers.
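For what it's worth, the core recursion itself is short. Here's a textbook-style sketch in plain C: it solves for the predictor coefficients given the signal's autocorrelation values. This says nothing about whether computing the autocorrelation and running this per frame would be fast enough in real time on the Axoloti's MCU:

```c
/* Levinson-Durbin recursion: given autocorrelation values r[0..order],
   solve for LPC predictor coefficients a[0..order-1], where
   x[n] is predicted as sum_{j=1..order} a[j-1] * x[n-j].
   Returns the final prediction-error energy. order must be <= 32. */
double levinson_durbin(const double *r, double *a, int order) {
    double tmp[32];       /* scratch copy for the in-place update */
    double err = r[0];    /* error energy of the zeroth-order predictor */

    for (int i = 0; i < order; i++) a[i] = 0.0;

    for (int i = 1; i <= order; i++) {
        /* reflection coefficient for this order */
        double acc = r[i];
        for (int j = 1; j < i; j++)
            acc -= a[j - 1] * r[i - j];
        double k = acc / err;

        /* update the lower-order coefficients */
        for (int j = 0; j < i - 1; j++)
            tmp[j] = a[j] - k * a[i - 2 - j];
        for (int j = 0; j < i - 1; j++)
            a[j] = tmp[j];
        a[i - 1] = k;

        /* error energy shrinks by (1 - k^2) at each order */
        err *= (1.0 - k * k);
    }
    return err;
}
```

It's all multiply-adds over small arrays, so the recursion itself is cheap; the per-frame autocorrelation (and pitch/voicing detection, if you want a full LPC encoder) is where the real cost would be.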
a|x