I had a dream one night where a blackbird was talking in human language. When I woke up there was actually a blackbird singing outside the window. Its inflections were curiously speech-like. The dreaming mind only needed to imagine a bunch of additional harmonics to form phonemes and words. One was left wondering if speech could be transformed into a blackbird song by isolating one of the harmonics...
One way to do this would be to:
- Find the instantaneous fundamental frequency and amplitude of the speech. For example, filter the harmonics out and use an FM demodulator to find the frequency. Then find the signal envelope amplitude by AM demodulation.
- Generate a new wave with similar amplitude variations but greatly multiplied in frequency.
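The two steps above can be sketched in plain Python. This is a minimal illustration, not the Perl-SoX-csdr script itself: the function names, the one-pole filter, and the values of `center_hz`, `band_hz` and `multiplier` are all assumptions chosen for the example, and real speech would likely need the cutoffs tuned by trial and error.

```python
import cmath
import math


def lowpass(samples, cutoff_hz, rate, stages=4):
    """Cascade of one-pole low-pass filters; accepts real or complex samples."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / rate)
    for _ in range(stages):
        state, out = 0.0, []
        for x in samples:
            state += alpha * (x - state)
            out.append(state)
        samples = out
    return samples


def track_fundamental(speech, rate, center_hz=150.0, band_hz=100.0):
    """Return per-sample (frequency in Hz, amplitude) estimates of the fundamental.

    center_hz and band_hz are assumed values for a typical speaking voice.
    """
    # Shift the assumed pitch range down to 0 Hz, then low-pass away the harmonics.
    step = -2j * math.pi * center_hz / rate
    mixed = [x * cmath.exp(step * i) for i, x in enumerate(speech)]
    baseband = lowpass(mixed, band_hz, rate)

    freqs, amps = [], []
    prev = baseband[0]
    for cur in baseband:
        # FM demodulation: the phase advance per sample gives the frequency offset.
        offset_hz = cmath.phase(cur * prev.conjugate()) * rate / (2.0 * math.pi)
        freqs.append(center_hz + offset_hz)
        # AM demodulation: the magnitude of the complex signal follows the envelope.
        amps.append(2.0 * abs(cur))
        prev = cur
    return freqs, amps


def synthesize(freqs, amps, rate, multiplier=15.0):
    """Generate a sine wave whose pitch is the tracked pitch times `multiplier`."""
    phase, out = 0.0, []
    for f, a in zip(freqs, amps):
        phase += 2.0 * math.pi * f * multiplier / rate
        out.append(a * math.sin(phase))
    return out


def speech_to_birdsong(speech, rate, multiplier=15.0):
    """Chain both steps: track the fundamental, then resynthesize it higher up."""
    freqs, amps = track_fundamental(speech, rate)
    return synthesize(freqs, amps, rate, multiplier)
```

Here `speech` is assumed to be a list of mono samples as floats, and the sample rate has to be high enough that the multiplied pitch stays below half of it.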
A proof-of-concept script using the Perl-SoX-csdr command-line toolchain is available (source code here). The result sounds surprisingly blackbird-like. Even the little trills are there, probably as a result of FM noise or maybe vocal fry at the end of sentences. I got the best results by speaking slowly and using exaggerated inflection.
Someone hinted that the type of intonation used in certain automatic announcements is perfect for this kind of conversion. And it seems to be true! Here, a noise gate and reverb have been added to the result to improve it a little:
And finally, a piece of sound art where this synthetic blackbird song is mixed with a subtle chord and a forest ambience:
Think of the possibilities: A simultaneous interpreter for talking to birds. A tool for dubbing talking birds in animation or live theatre. Entertainment for cats.
What other birds could be done with a voice changer like this? What about croaky birds like a duck or a crow?
(I talked about this blog post a little on NPR: Here's What 'All Things Considered' Sounds Like — In Blackbird Song)
Awesomeness!
Is there any way (not being a programmer or source code nerd) to get this into my Reaktor? I really like it
Could you do the inverse process, from bird to human language? :)
Because so much information is lost in the process, the reverse would require completely re-imagining it. In other words, it would require a dreamer. :) Perhaps something like DeepDream could do it?
Had the same thought. Sometimes in spring, when the birds start singing, I think that they are telling me something and I almost understand them. In reality though, I get that it is just the intonations my brain interprets, and in the "bird language" the meaning is completely different. The diversity of their songs and intonations is impressive though; one day I listened to a bird for 10 minutes and did not hear one repetition.
What if you were to reverse the chain, and convert some blackbird speech to human speech?
The process is destructive, so you can't reverse it.
It would be like painting black over a painting. If I give you a black canvas, you can't guess what was painted before and reconstruct it. You'd have to make some opinionated choices.
Just lovely. Now you simply need to write a song to speech decoder!
Woow this is awesome!
Thanks Spiikki :)
This is beautiful! I'm glad I stumbled across your blog in a Hacker News comment thread.
I'm reminded of this [https://www.youtube.com/watch?v=-JftSgb69JY] composition by Chris Hughes, who was inspired by Steve Reich's 1967 notes on a slow-motion score: "Very gradually slow down a recorded sound to many times its original length without changing its pitch or timbre at all."
Thanks! Sorry to ask a signals question, but do you have any search terms or links or pointers so I can learn more about using FM demodulation to find the fundamental? It's kind of melting my brain trying to understand how that could work. Thanks!
FM demodulation works here because I first lowpass filtered the speech so that it only (or mostly) contained the fundamental frequency. I found the right cutoff frequencies by trial and error. Now, because FM encodes information in the signal frequency, we can extract this information by FM demodulation. The magnitude of the result is proportional to the frequency (of the fundamental, in this case).
There are more robust methods out there; you could search for "pitch detection algorithm".
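One well-known family that this search term leads to is autocorrelation-based pitch detection. The following is a minimal sketch of that idea in plain Python, not what the blackbird script does; the function name and the `fmin`/`fmax` search range are assumptions made for the example.

```python
def autocorrelation_pitch(frame, rate, fmin=70.0, fmax=400.0):
    """Estimate the pitch of one frame in Hz from its strongest autocorrelation lag.

    fmin and fmax are an assumed search range for a speaking voice.
    """
    min_lag = int(rate / fmax)
    max_lag = min(int(rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # A periodic signal correlates best with itself at a lag of one period.
    return rate / best_lag if best_lag else None
```

Unlike FM demodulation, this does not need the harmonics to be filtered out first, because harmonics repeat with the same period as the fundamental.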
Thanks for the reply. I think I have a glimmer now. :-) I've actually written some pitch detection software and studied that subject a bit, which is why I was surprised to read about your method because I had never heard of it. Is it a less precise approach than e.g. Schmitt triggering? And it sounds from what you wrote as if you want a relatively pure tone to use this approach, hence the low-pass filtering?
It can be precise, but it only does the right thing for a pure tone; there's a lot of contamination if there are any harmonics. But in this case it's not a big deal, since it's for art. Also, in this case the inflections in my voice mostly span less than 1 octave, so the second harmonic should be easy to filter out. FM demodulation was the first thing that came to my mind that can quickly be done with command-line tools familiar to me (csdr), so that's why I chose it.
What an interesting idea!
This reminded me of a Spanish language called Silbo Gomero, which is whistled. https://en.wikipedia.org/wiki/Silbo_Gomero
You should really encode a formant (the second one carries the most information, I think) instead of the fundamental frequency, especially for a non-tonal language like English. The fundamental frequency carries way too little information.
Amazing :)
can birds understand any of this? i.e. can you elicit specific bird behaviors?? like bring me a worm.
XD I must try this... I want worms now
This is incredible! I was listening to the songbirds when someone showed me this, I love it. What do you use to make the pretty purple and green diagrams?
Thanks! I design them myself in Inkscape. I've made a free-to-use SVG that contains the styles and some of the elements: signalflow.svg
Tres cool!
Thank you for creating this. Your blog is very inspirational.
Do you have an app, or would you be willing to create an app, so we could record our voice through the computer into birdsong so we can greet our friends and so forth? What fun. ;-}
That would be so cool!
An NPR radio piece on the same subject: https://www.npr.org/2021/04/16/988200892/heres-what-all-things-considered-sounds-like-in-blackbird-song
As a poet, I would love to take some lines from my poetry or favorite poems and translate them into birdsong. Maybe one day there will be a method to easily record and convert. This is truly beautiful. Thank you.
Is this translator hosted somewhere that it can be used? I'd love to try it!
Unfortunately it's not. Would be really cool!
Trrr
Uwhheeeee
Shreeee
Ssssssss
Tweeeeeeee
Just heard your NPR interview - very interesting!!
For a related project, consider converting voice to a whistled language (Silbo Gomero) or a drum language. Here's a good reference paper.
Rialland, A. (2005). "Phonological and phonetic aspects of whistled languages".
https://core.ac.uk/download/pdf/191755708.pdf
I love this! This is probably a dumb question, but is there a way to open/recreate this in MaxMSP?
It *should* be possible. It needs to be re-thought a little since there are no complex signals in MaxMSP (as far as I know). Also there's no MSP object to demodulate FM. I tried to get a fundamental frequency with fzero~ instead, but the result sounds quite different, not at all smooth. Here's the MSP patch I tried: [adc~] -- [fzero~ @threshold 0.01] -- [sig~] -- [*~ 8] -- [cycle~] -- [*~ 0.1] -- [dac~]. Additionally you could scale the resulting signal with the output 2 from fzero~ (using [*~]).
Fascinating! From your sample it seems certain phonemes generate more bird-like sounds than others (at least with the given processing).
That makes me wonder if certain human languages on average produce better results than others in this context :)
Translating bird sounds to human speech patterns would be interesting. As for cats, I am pretty sure mine mimics words like hello etc. I was listening to some music years ago that had what sounded like a bird that was slowed down and shifted in pitch; it sounded like some prehistoric animal. I always wanted to find that music again.
This is brilliant. I'm a poet so I'm not very well versed with the tools but I love this. This is amazing! I'm curious though as to how the birds would react to this. Even though it is a song in their language, is it something that makes sense for them or has any particular meaning? Perhaps, the study of birdsongs would tell us more about their psychology and social behaviour and help us gain more insight. Thank you for this though. It is quite beautiful and innovative!
You might be on to something - https://www.smithsonianmag.com/science-nature/do-birds-have-language-180979629/
Woww!! this is great man!
holy shit, i was just reading about that famous helicopter signal hack thing you did, and the first thought that came to my head was "i bet this person could figure out how to translate bird language to human language". and now you're doing it. hah.
If you get several pairs of voice + voice converted to "birdy", an AI could easily learn the inverse conversion, so a real bird would sound like a human voice. You will not get a translation, just "humanized birdy", but who knows if listening to birds with this filter could help us understand them easily one day. Just dreaming!
You may know something close to this: a form of Spanish used in La Gomera. It's known as Silbo. https://youtu.be/YrLcyV5P_GY?si=x-XWmGLJVbxzwwLL
Congratulations. You have (re)invented the "Silbo Gomero". https://en.wikipedia.org/wiki/Silbo_Gomero