Speech to birdsong conversion

I had a dream one night where a blackbird was talking in human language. When I woke up there was actually a blackbird singing outside the window. Its inflections were curiously speech-like. The dreaming mind only needed to imagine a bunch of additional harmonics to form phonemes and words. One was left wondering if speech could be transformed into a blackbird song by isolating one of the harmonics...

One way to do this would be to:

  • Find the instantaneous fundamental frequency and amplitude of the speech. For example, filter the harmonics out and use an FM demodulator to find the frequency. Then find the signal envelope amplitude by AM demodulation.
  • Generate a new wave with similar amplitude variations but greatly multiplied in frequency.
[Image: Signal path diagram.]
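
The two steps above can be sketched in a few lines of Python with NumPy. This is only an illustration of the idea, not the Perl-SoX-csdr script: the function names, the 70-250 Hz band, the 10 ms smoothing window, and the frequency multiplier of 8 are all assumptions chosen for the example. It isolates the fundamental with a brick-wall FFT filter, builds a complex (analytic) signal, reads the frequency from the phase step between consecutive samples (FM demodulation) and the envelope from the magnitude (AM demodulation), then resynthesizes a sine at the multiplied frequency.

```python
import numpy as np


def smooth(values, width):
    """Moving average with edge padding so the output keeps its length."""
    if width <= 1:
        return values
    kernel = np.ones(width) / width
    padded = np.pad(values, (width // 2, width - 1 - width // 2), mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def speech_to_birdsong(x, fs, multiplier=8.0, band=(70.0, 250.0), smooth_ms=10.0):
    """Turn a mono speech signal into a whistle that follows its pitch contour.

    multiplier, band and smooth_ms are illustrative guesses, not values
    taken from the original script.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)

    # Keep only positive frequencies inside the band. Doubling them yields
    # the analytic (complex) version of the band-limited fundamental.
    mask = (freqs >= band[0]) & (freqs <= band[1])
    z = np.fft.ifft(np.where(mask, 2.0 * spectrum, 0.0))

    # FM demodulation: phase advance per sample, converted to Hz.
    dphi = np.angle(z[1:] * np.conj(z[:-1]))
    f0 = np.concatenate(([dphi[0]], dphi)) * fs / (2.0 * np.pi)

    # AM demodulation: the magnitude of the analytic signal is the envelope.
    env = np.abs(z)

    # Light smoothing, then keep the frequency estimate inside the band.
    width = max(1, int(fs * smooth_ms / 1000.0))
    f0 = np.clip(smooth(f0, width), band[0], band[1])
    env = smooth(env, width)

    # Resynthesis: integrate the multiplied frequency to get the phase.
    phase = 2.0 * np.pi * np.cumsum(f0 * multiplier) / fs
    return env * np.sin(phase)
```

With these defaults the output stays below 2 kHz, so any common audio sample rate is high enough. Real speech will give a noisy frequency estimate during silences and unvoiced sounds; the envelope is close to zero there, which hides most of it.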

A proof-of-concept script using the Perl-SoX-csdr command-line toolchain is available (source code here). The result sounds surprisingly blackbird-like. Even the little trills are there, probably as a result of FM noise or maybe vocal fry at the end of sentences. I got the best results by speaking slowly and using exaggerated inflection.

Someone hinted that the type of intonation used in certain automatic announcements is perfect for this kind of conversion. And it seems to be true! Here, a noise gate and reverb have been added to the result to improve it a little.

And finally, a piece of sound art where this synthetic blackbird song is mixed with a subtle chord and a forest ambience:

Think of the possibilities: A simultaneous interpreter for talking to birds. A tool for dubbing talking birds in animation or live theatre. Entertainment for cats.

What other birds could be done with a voice changer like this? What about croaky birds like a duck or a crow?

(I talked about this blog post a little on NPR: Here's What 'All Things Considered' Sounds Like — In Blackbird Song)

44 comments:

  1. Replies
    1. Is there any way (not being a programmer or source code nerd) to get this into my Reaktor? I really like it.

  2. Could you make the inverse process, from bird to human language? :)

    Replies
    1. Because so much information is lost in the process, the reverse would require completely re-imagining it. In other words, it would require a dreamer. :) Perhaps something like DeepDream could do it?

    2. Had the same thought. Sometimes in spring, when the birds start singing, I think they are telling me something and I almost understand them. In reality, though, I get that it is just the intonations my brain interprets, and in the "bird language" the meaning is completely different. The diversity of their songs and intonations is impressive, though: one day I listened to a bird for 10 minutes and did not hear one repetition.

  3. What if you were to reverse the chain, and convert some blackbird speech to human speech?

    Replies
    1. The process is destructive, so you can't reverse it.

      It would be like painting black over a painting. If I give you a black canvas, you can't guess what was painted before and reconstruct it. You'd have to make some opinionated choices.

  4. Just lovely. Now you simply need to write a song to speech decoder!

  5. This is beautiful! I'm glad I stumbled across your blog in a hacker news comment thread.
    I'm reminded of this [https://www.youtube.com/watch?v=-JftSgb69JY] composition by Chris Hughes, who was inspired by Steve Reich's 1967 notes on a slow-motion score: "Very gradually slow down a recorded sound to many times its original length without changing its pitch or timbre at all."

  6. Thanks! Sorry to ask a signals question, but do you have any search terms or links or pointers so I can learn more about using FM demodulation to find the fundamental? It's kind of melting my brain trying to understand how that could work. Thanks!

    Replies
    1. FM demodulation works here because I first lowpass filtered the speech so that it only (or mostly) contained the fundamental frequency. I found the right cutoff frequencies by trial and error. Now, because FM encodes information in the signal frequency, we can extract this information by FM demodulation. The magnitude of the result is proportional to the frequency (of the fundamental, in this case).

      There are more robust methods out there; you could search for "pitch detection algorithm".

    2. Thanks for the reply. I think I have a glimmer now. :-) I've actually written some pitch detection software and studied that subject a bit, which is why I was surprised to read about your method because I had never heard of it. Is it a less precise approach than e.g. Schmitt triggering? And it sounds from what you wrote as if you want a relatively pure tone to use this approach, hence the low-pass filtering?

    3. It can be precise, but it only does the right thing for a pure tone; there's a lot of contamination if there are any harmonics. But in this case it's not a big deal, since it's for art. Also, in this case the inflections in my voice mostly span less than 1 octave, so the second harmonic should be easy to filter out. FM demodulation was the first thing that came to my mind that can quickly be done with command-line tools familiar to me (csdr), so that's why I chose it.

  7. This reminded me of a Spanish language called Silbo Gomero, which is whistled. https://en.wikipedia.org/wiki/Silbo_Gomero

  8. You should really encode a formant (the second one carries the most information, I think) instead of the fundamental frequency, especially for a non-tonal language like English. The fundamental frequency carries way too little information.

  9. Can birds understand any of this? I.e., can you elicit specific bird behaviors, like "bring me a worm"?

  10. This is incredible! I was listening to the songbirds when someone showed me this, I love it. What do you use to make the pretty purple and green diagrams?

    Replies
    1. Thanks! I design them myself in Inkscape. I've made a free-to-use SVG that contains the styles and some of the elements: signalflow.svg

  11. Thank you for creating this. Your blog is very inspirational.

  12. Do you have an app, or would you be willing to create one, so we could turn our voice into birdsong through the computer and greet our friends and so forth? What fun. ;-}

  13. An NPR radio piece on the same subject: https://www.npr.org/2021/04/16/988200892/heres-what-all-things-considered-sounds-like-in-blackbird-song

  14. As a poet, I would love to take some lines from my poetry or favorite poems and translate them into birdsong. Maybe one day there will be a method to easily record and convert. This is truly beautiful. Thank you.

  15. Is this translator hosted somewhere that it can be used? I'd love to try it!

    Replies
    1. Unfortunately it's not. Would be really cool!

  16. Trrr
    Uwhheeeee
    Shreeee
    Ssssssss
    Tweeeeeeee

  17. Just heard your NPR interview - very interesting!!
    For a related project, consider converting voice to a whistled language (Silbo Gomero) or a drum language. Here's a good reference paper.
    Rialland, A. (2005). "Phonological and phonetic aspects of whistled languages".
    https://core.ac.uk/download/pdf/191755708.pdf

  18. I love this! This is probably a dumb question- but is there a way to open/recreate this on MaxMSP?

    Replies
    1. It *should* be possible. It needs to be re-thought a little since there are no complex signals in MaxMSP (as far as I know). Also there's no MSP object to demodulate FM. I tried to get a fundamental frequency with fzero~ instead, but the result sounds quite different, not at all smooth. Here's the MSP patch I tried: [adc~] -- [fzero~ @threshold 0.01] -- [sig~] -- [*~ 8] -- [cycle~] -- [*~ 0.1] -- [dac~]. Additionally you could scale the resulting signal with the output 2 from fzero~ (using [*~]).

  19. Fascinating! From your sample it seems certain phonemes generate more bird-like sounds than others (at least with the given processing).
    That makes me wonder if certain human languages on average produce better results than others in this context :)

  20. Translating bird sounds to human speech patterns would be interesting. As for cats, I am pretty sure mine mimics words like hello etc. Years ago I was listening to some music that had what sounded like a bird slowed down and shifted in pitch; it sounded like some prehistoric animal. I always wanted to find that music again.

  21. This is brilliant. I'm a poet so I'm not very well versed with the tools but I love this. This is amazing! I'm curious though as to how the birds would react to this. Even though it is a song in their language, is it something that makes sense for them or has any particular meaning? Perhaps, the study of birdsongs would tell us more about their psychology and social behaviour and help us gain more insight. Thank you for this though. It is quite beautiful and innovative!

  22. You might be on to something - https://www.smithsonianmag.com/science-nature/do-birds-have-language-180979629/

  23. holy shit, i was just reading about that famous helicopter signal hack thing you did, and the first thought that came to my head was "i bet this person could figure out how to translate bird language to human language". and now you're doing it. hah.

  24. If you get several pairs of voice + voice converted to "birdy", an AI could easily learn the inverse conversion, so a real bird would sound like a human voice. You would not get a translation, just "humanized birdy", but who knows if listening to birds with this filter could help us understand them more easily one day. Just dreaming!

  25. You may know of something similar, a form of Spanish used in La Gomera. It's known as Silbo. https://youtu.be/YrLcyV5P_GY?si=x-XWmGLJVbxzwwLL

  26. Congratulations. You have (re)invented the "Silbo Gomero". https://en.wikipedia.org/wiki/Silbo_Gomero

