
Passing planes and other whoosh sounds

I always assumed that the recognisable 'whoosh' sound a plane or helicopter makes when passing overhead simply comes from the famous Doppler effect. But when you listen closely, this explanation doesn't make complete sense.

(Audio clipped from freesound - here and here)

A classic example of the Doppler effect is the sound of a passing ambulance constantly descending in pitch. When a plane flies overhead the roar of the engine sometimes does that as well. But you can also hear a wider, breathier noise that does something different: it's like the pitch goes down at first, but when the plane has passed us, the pitch goes up again. That's not how Doppler works! What's going on there?

Comb filtering.

Let's shed light on the mystery by taking a look at the sound in a time-frequency spectrogram. Here, time runs from left to right, frequencies from top (high) to bottom (low).

We can clearly see one part of the sound (the one starting at 2500 Hz) sweeping downwards, or from high to low frequencies; this should be the Doppler effect. But there's something else happening at the bottom of the image – another pattern is moving down at first, then up again...

This sound's frequency distribution seems to form a series of moving peaks and valleys. This resembles what audio engineers would call 'comb filtering', due to its appearance in the spectrogram. When the peaks and valleys move about it causes a 'whoosh' sound; this is the same principle as in the flanger effect used in music production. But these are just jargon for the electronically created version. We can call the acoustic phenomenon the whoosh.

The comb pattern is caused by two copies of the same exact sound arriving at slightly different times, close enough that they form an interference pattern. It's closely related to what happens to light in the double-slit experiment. In recordings this often means that the sound was captured by two microphones and then mixed together; you can sometimes hear this happen unintentionally in podcasts and radio shows. So my thought process is: are we hearing two copies of the plane's sound? How much later is the other one arriving, and why? And why does the 'whoosh' appear to go down in pitch at first, then up again?
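To get an intuition for the interference, here's a minimal numpy sketch (my own illustration, not part of the original recordings): summing white noise with a delayed copy of itself carves regularly spaced notches into the spectrum – the 'comb'.

    import numpy as np

    # Sum a signal with a delayed copy of itself: notches appear at odd
    # multiples of 1/(2*delay), spaced 1/delay apart (here ~200 Hz).
    fs = 44100                     # sample rate, Hz
    d = int(fs * 0.005)            # a 5 ms echo, in samples

    rng = np.random.default_rng(0)
    x = rng.standard_normal(fs)    # one second of white noise
    y = x.copy()
    y[d:] += x[:-d]                # add the delayed copy

    spec = np.abs(np.fft.rfft(y))  # plot this to see the comb
    freqs = np.fft.rfftfreq(len(y), 1 / fs)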

Into the cepstral domain.

The cepstrum, which is the inverse Fourier transform of the estimated log spectrum, is a fascinating plot for looking at delays and echoes in complex (as in complicated) signals. While the spectrum separates frequencies, the cepstrum measures time, or quefrency – see what they did there? It reveals cyclicities in the sound's structure even if it interferes with itself, like in our case. In that it's similar to autocorrelation.

It's also useful for looking at sounds that, experientially, have a 'pitch' to them but that don't show any clear spectral peak in the Fourier transform. Just like the sound we're interested in.
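As a quick sanity check, the cepstral peak really does land at the echo delay. Here's a minimal sketch using the plain real cepstrum (the cepstrogram below uses the autocepstrum, a refinement of the same idea):

    import numpy as np

    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    # An echo delayed by tau shows up as a peak at quefrency tau.
    def cepstrum(x):
        spectrum = np.abs(np.fft.rfft(x)) + 1e-12   # avoid log(0)
        return np.fft.irfft(np.log(spectrum))

    fs = 44100
    rng = np.random.default_rng(1)
    x = rng.standard_normal(fs)                     # noisy 'engine sound'
    lag = int(0.007 * fs)                           # a 7 ms ground echo
    y = x.copy()
    y[lag:] += 0.8 * x[:-lag]

    c = cepstrum(y)
    peak = np.argmax(c[50:fs // 100]) + 50          # search 1.1...10 ms
    print(peak / fs * 1000)                         # prints ~7.0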

Here's a time-quefrency cepstrogram of the same sound (to be accurate, I used the autocepstrum here for better clarity):

The Doppler effect is less prominent here. Instead, the plot shows a sweeping peak that seems to agree with the pitch change we hear. The delay time sweeps from around 4 milliseconds to 9 ms and back. Note the scale: higher frequencies (shorter times) are on the bottom this time.

Now why would the sound be so correlated with itself with this sweeping delay time?

Ground echo?

Here's my hypothesis. We are hearing not only the direct sound from the plane but also a delayed echo from a nearby flat surface. These two sounds get superimposed and interfere before they reach our ears. The effect would be especially prominent with planes and helicopters because there is little to obstruct either the sound coming from above or its reflection from a large surface. And what could be a large reflective surface outdoors? Well, the ground below!

Let's think about the numbers. The ground is around one and a half metres below our ears. When a plane is directly overhead, the reflected sound needs to take a path that's three metres longer (two-way) than the direct path. Since sound travels 343 metres per second, this translates to a delay of about 9 milliseconds – just what we saw in the cepstrogram!

Below, I used GeoGebra to calculate the time difference (between the yellow and green paths) in milliseconds.

When the plane is far away the angle is shallower, the two paths are more similar in distance, and the time difference is shorter.
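The geometry is easy to play with in a few lines of Python. A sketch under the same assumptions (1.5 m ear height, 343 m/s); the reflection behaves as if it came from a mirror image of the plane below the ground:

    import numpy as np

    c, h_ear = 343.0, 1.5          # speed of sound (m/s), ear height (m)

    def echo_delay_ms(altitude_m, horiz_dist_m):
        direct = np.hypot(horiz_dist_m, altitude_m - h_ear)
        mirrored = np.hypot(horiz_dist_m, altitude_m + h_ear)  # image source
        return (mirrored - direct) / c * 1000

    print(echo_delay_ms(1000, 0))      # overhead: ~8.7 ms
    print(echo_delay_ms(1000, 5000))   # far away: ~1.7 ms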

It would follow that a taller person hears the sound differently than a shorter one, or someone in a tenth-floor window! If the ground is very soft, maybe in a mossy grove, you probably wouldn't hear the effect at all; just the Doppler effect. But this prediction needs to be tested out in a real forest.

Here's what a minimal acoustic simulation model renders. We'll just put a flying white noise source in the sky and a reflective surface as the ground. Let's only update the IR at 15 fps to prevent the Doppler phenomenon from emerging.
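Something along these lines – a rough sketch of the idea, not the exact model I used; the 100 m/s speed and 300 m altitude are arbitrary numbers, and the noise blocks are spliced without crossfading:

    import numpy as np

    fs, c, h_ear = 44100, 343.0, 1.5
    rng = np.random.default_rng(2)

    out = []
    for t in np.arange(-30, 30, 1 / 15):         # 60 s flyover, 15 fps
        dx, alt = 100.0 * t, 300.0               # 100 m/s at 300 m altitude
        direct = np.hypot(dx, alt - h_ear)
        mirrored = np.hypot(dx, alt + h_ear)
        lag = int((mirrored - direct) / c * fs)  # echo delay in samples
        block = rng.standard_normal(fs // 15 + lag)
        frame = block[lag:] + 0.7 * block[:-lag] if lag else block
        out.append(frame / direct)               # rough distance attenuation
    audio = np.concatenate(out)                  # write to WAV to listen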

Whoosh!

Some everyday whooshes.

The whoosh isn't only associated with planes. When it occurs naturally it usually needs three things:

  • a sound with a lot of structure (preferably a hissy or breathy noise)
  • an unobstructed echo from a nearby surface
  • and some kind of physical movement.

I've heard this outdoors when the sound of a waterfall was reflecting off a brick wall (video); and next to a motorway when the sound barrier provided the reflection. You can hear it in some films – for instance, in the original Home Alone when Kevin puts down the pizza box after taking a whiff (video)!

You can even hear it in the sound of thunder when lightning strikes quite close. Nothing is physically moving in this case; it might instead be because the 'bang' is created simultaneously along a very long path, while sound only travels so fast.

Try it yourself: move your head towards a wall – or a laptop screen – and back away from it, while making a continuous 'hhhh' or 'shhh' noise. Listen closely, but don't close your eyes or you might bump your nose.

Where have you encountered the whoosh?

A simple little plot.

Finally, if you have JavaScript turned on you'll see (and hear) some more stuff in this blog post. In the interactive graph below you can move the aeroplane and listener around and see how the numbers change. The 'lag' or time difference we hear (orange arrow) comes from how much farther away the reflected virtual image is compared to the real aeroplane. For instance, when it's right overhead, the reflected sound travels 3 metres longer. In the lower right corner, the 'filter' spectrum up to 4.5 kHz is also drawn. The circles are there to visualize the direct distance.

FAQ

I get many questions that point out that planes have two of something: two engines, two ends in one engine, etc. I can see how this is an inviting chain of thought, but it has two problems. 1) The sound is not associated only with jet engines, or even with planes at all; 2) the sounds would have to be nearly identical for interference to happen, and random wind noise from two independent sources wouldn't be phase-coherent. Some discussion in the comments below.

Ultrasonic investigations in shopping centres

I can't remember how I first came across these near-ultrasonic 'beacons' ubiquitous in PA systems. I might have been scrolling through the audio spectrum while waiting for the underground train; or it might have been the screeching 'tinnitus-like' sensation I would often get near the loudspeakers at a local shopping centre.

[Image: Graph with frequencies from 16 to 24 kHz and decibels from -30 to -110. Two peaks are shown, the lower one marked REDI around 19600 Hz and a higher one marked Forum at 20000 Hz.]

Whatever the case, I learned that they are called pilot tones. Many multi-loudspeaker PA systems (like the Zenitel VPA and Axys End of Line detection unit) employ these roughly 20-kilohertz tones to continuously measure the system's health status: no pilot tone means no connection to a loudspeaker. It's usually set to a very high frequency, inaudible to humans, to avoid disturbing customers.

However, these tones are powerful, and some people will still hear them, especially if the frequency gets below 20 kHz. There is one such system at an uncomfortable 19.595 kHz in my city; it's marked green in the graph above. I've heard of several other people who also hear the sound. I don't believe it to be a sonic weapon like The Mosquito; those use even lower frequencies, down to 17 kHz. It's probably just a misconfiguration that was never fixed, because the people working on it couldn't experientially confirm any issue with it.
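If you want to go looking for pilot tones yourself, a narrowband peak search is enough. A sketch, assuming a mono recording sampled at 48 kHz in the hypothetical array recording:

    import numpy as np
    from scipy.signal import welch

    def find_pilot(recording, fs=48000):
        # ~1.5 Hz resolution; return the strongest peak at 19-21 kHz
        f, pxx = welch(recording, fs, nperseg=1 << 15)
        band = (f > 19000) & (f < 21000)
        return f[band][np.argmax(pxx[band])]

    # e.g. find_pilot(mall_recording) might return 19595.0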

Hidden modulation.

Pretty quickly it became apparent that this sound is almost never a pure tone. Some kind of modulation can always be seen wiggling around it in the spectrogram. Is it caused by the background music being played through the PA system? Is it carrying some information? Or is it something else altogether?

I've found at least one place where the tone appears to be amplitude modulated by the lowest frequencies in the music or commercials playing. It's probably a side effect of some kind of distortion.

Here's a spectrogram plot of this amplitude modulation around the strong pilot tone. It's colour-coded: purple comes from the right microphone channel and green from the left. I'm not quite sure what the other purple horizontal stripes are here.

But this kind of modulation is rare. It's more common to see the tone change in response to things happening around you, like people moving about. More on that in the following.

Doppler-shifted backscatter.

Look what happens to the pilot tone when a train arrives at an underground station:

The wideband screech in the beginning is followed by this interesting tornado-shaped pattern that seems to have a lot of structure inside of it. It lasts for 15 seconds, until the train comes to a stop.

It's my belief that this is backscatter – reverb, if you like – from the pilot tone reflecting off the slowing train and getting Doppler-shifted on the way. The pilot tone works as a continuous-wave bistatic sonar. Here, the left microphone (green) hears a mostly negative Doppler shift whereas the right channel (purple) hears a positive one, as the train is passing us from right to left. An anti-aliasing filter has darkened the higher frequencies, as I wasn't yet aware I would find something interesting up here when recording.

A zoomed-in view of this cone reveals these repeating sweeps from positive to negative and red to green. Are they reflections off of some repeating structure in the passing train? The short horizontal bursts of constant tone could then be surfaces that are angled in a different direction than the rest of the train. Or perhaps this repetition reflects the regular placement of loudspeakers around the station?

Moving the microphone.

Another interesting experiment: I took the lift to another floor and recorded the ride from inside the lift. It wasn't the metal-box type – the walls were made of glass – so I thought the pilot tone should be at least somewhat audible inside. Here's what I got during the 10-second ride. It's a little buried in noise.

[Image: A spectrogram zoomed into 19500 Hz. There's a pure tone at 19500 Hz in the beginning. Soon it starts to 'disintegrate' into wideband noise and then comes back together, forming a spindle-like pattern.]

Skater calculation.

For the next experiment I went into the underground car park of a shopping centre. I stood right under a PA loudspeaker and recorded a skateboarder passing by. A lot of interesting stuff is happening in this stereo spectrogram!

[Image: A pure tone near 19500 Hz, superimposed with an S-shaped pattern going from higher to lower frequencies and from red to green colour.]

First of all, there seem to be two pilot tones: one at 19,595 Hz and a much quieter one at 19,500 Hz. Are there two different PA systems in the car park?

Second, there's a clear Doppler shift in the reverb. The frequency shift goes from positive to negative at the same moment that the skater passes us, seen as the wideband wheel noise changing colour. It looks like the pattern is also 'filled in' with noise under this Doppler curve. How much information can we extract just by looking at this image?

If we ignore the fact that this is actually a bistatic Doppler shift, we could try to estimate the speed using a formula on Wikipedia. It was pretty chilly in the car park; I would say 15 °C, which puts the speed of sound at 340 m/s. The maximum Doppler shift here seems to be 350 Hz. Plugging all these into the equation we get 11 km/h, which sounds like a realistic speed for a skater.
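The arithmetic, spelled out (a monostatic two-way approximation):

    c = 340.0      # speed of sound at ~15 °C, m/s
    f0 = 19595.0   # pilot tone frequency, Hz
    df = 350.0     # maximum observed Doppler shift, Hz

    v = df * c / (2 * f0)        # reflector speed: ~3.0 m/s
    print(round(v * 3.6))        # ~11 km/h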

Why is it filled in? My thought is that these are reflections off different points on our test subject. There's variation in the reflection angles and, consequently, in the magnitudes of the velocity component that causes the frequency shifting, all the way down to nearly zero Hz.

What now?

What would you do with this ultrasonic beep all around you? I have some free ideas:

  • Automated speed trap in the car park
  • Detect when the escalators stop working
  • Modulate it with a positioning code to prevent people getting lost in the maze of commerce
  • Use it to deliver ads somehow
  • Use it to find your way to the quietest spots in a shopping centre

Speech to birdsong conversion

I had a dream one night where a blackbird was talking in human language. When I woke up there was actually a blackbird singing outside the window. Its inflections were curiously speech-like. The dreaming mind only needed to imagine a bunch of additional harmonics to form phonemes and words. I was left wondering if speech could be transformed into a blackbird song by isolating one of the harmonics...

One way to do this, sketched in code below, would be to:

  • Find the instantaneous fundamental frequency and amplitude of the speech. For example, filter the harmonics out and use an FM demodulator to find the frequency. Then find the signal envelope amplitude by AM demodulation.
  • Generate a new wave with similar amplitude variations but greatly multiplied in frequency.
[Image: Signal path diagram.]
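Here's a rough Python rendition of that signal path (the actual script, linked below, is a Perl-SoX-csdr chain; this sketch uses the analytic signal for both demodulators, and the 70-300 Hz band and the x16 frequency multiplier are assumptions):

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def birdify(speech, fs, mult=16.0):
        # Isolate the region around the fundamental (~70-300 Hz)
        b, a = butter(4, [70 / (fs / 2), 300 / (fs / 2)], btype="bandpass")
        f0_band = filtfilt(b, a, speech)

        analytic = hilbert(f0_band)
        envelope = np.abs(analytic)      # AM demodulation: the envelope
        inst_freq = np.gradient(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)

        # FM re-modulation: integrate the multiplied frequency into a phase
        phase = 2 * np.pi * np.cumsum(mult * inst_freq) / fs
        return envelope * np.cos(phase)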

A proof-of-concept script using the Perl-SoX-csdr command-line toolchain is available (source code here). The result sounds surprisingly blackbird-like. Even the little trills are there, probably as a result of FM noise or maybe vocal fry at the end of sentences. I got the best results by speaking slowly and using exaggerated inflection.

Someone hinted that the type of intonation used in certain automatic announcements is perfect for this kind of conversion. And it seems to be true! Here, a noise gate and reverb have been added to the result to improve it a little:

And finally, a piece of sound art where this synthetic blackbird song is mixed with a subtle chord and a forest ambience:

Think of the possibilities: A simultaneous interpreter for talking to birds. A tool for dubbing talking birds in animation or live theatre. Entertainment for cats.

What other birds could be done with a voice changer like this? What about croaky birds like a duck or a crow?

(I talked about this blog post a little on NPR: Here's What 'All Things Considered' Sounds Like — In Blackbird Song)

Virtual music box

A little music project I was writing required a melody be played on a music box. However, the paper-programmable music box I had (pictured) could only play notes on the C major scale. I couldn't easily find a realistic-sounding synthesizer version either. They all seemed to be missing something. Maybe they were too perfectly tuned? I wasn't sure.

Perhaps, if I digitized the sound myself, I could build a flexible virtual instrument to generate just the perfect sample for the piece!

[Image: A paper programmable music box.]

I hadn't really made a sampled instrument before, short of using terrible single-sample instruments in Impulse Tracker clones. So I proceeded in an improvised manner. Below I'll post some interesting findings and sound samples of how the instrument developed along the way. There won't be any source code for now.

By the way, there is a great explanatory video by engineerguy about the workings of music boxes that will explain some terminology ("pins" and "teeth") used in this post.

Recording samples

[Image: A recording setup with a microphone.]

The first step was, obviously, to record the sound to be used as samples. I damped my room using towels and mattresses to minimize room echo; this could be added later if desired, but for now it would only make it harder to cleanly splice the audio. The microphone used was the Audio Technica AT2020, and I digitized it using the Behringer Xenyx 302 USB mixer.

I perforated a paper roll to play all the possible notes in succession, and rolled the paper through. The sound of the paper going through the mechanism posed a problem at first, but I soon learned to stop the paper at just the right moment to make way for the sound of the tooth.

Now I had pretty decent recordings of the whole two-octave range. I used Audacity to extract the notes from the recording, and named the files according to the actual playing MIDI pitch. (The music box actually plays a G# major scale, contrary to what's marked on the blank paper rolls.)

The missing notes

Next, we'll need to generate the missing notes that don't belong in the scale of this music box. Because pitch is proportional to the speed of vibration, this could be done by simply speeding up or slowing down an adjacent note by just the right factor. In equal temperament tuning, this factor would be the 12th root of 2, or roughly 1.05946. Such scaling is straightforward to do on the command line using SoX, for instance (sox c1.wav c_sharp1.wav speed 1.05946).

[Image: Musical notation explaining transposition by multiplication by the 12th root of 2.]

This method can also be used to reach notes far outside the recorded range, even whole new octaves; for example, a transposition of +8 semitones would use a ratio of (2^(1/12))^8 ≈ 1.5874. Inter-note variance could be retained by using a random source file for each resampled note. But large-interval transpositions would probably not sound very good, due to the colouring of the harmonic series.

Here's a table of some intervals and the corresponding speed ratios in equal temperament:

    Semitones   Speed ratio
    -3          (2^(1/12))^-3 ≈ 0.840896
    -2          (2^(1/12))^-2 ≈ 0.890899
    -1          (2^(1/12))^-1 ≈ 0.943874
    +1          (2^(1/12))^+1 ≈ 1.059463
    +2          (2^(1/12))^+2 ≈ 1.122462
    +3          (2^(1/12))^+3 ≈ 1.189207
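In code, the same trick as the sox one-liner looks something like this (a sketch using scipy's FFT resampling; the filenames are from the example above):

    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import resample

    def transpose(infile, outfile, semitones):
        fs, x = wavfile.read(infile)
        ratio = 2.0 ** (semitones / 12.0)        # e.g. 1.05946 for +1
        # Fewer samples played at the same rate = faster and higher
        y = resample(x, int(len(x) / ratio))
        wavfile.write(outfile, fs, y.astype(x.dtype))

    transpose("c1.wav", "c_sharp1.wav", +1)      # same as the sox example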

First test!

Now I could finally write a script to play my melody!

It sounds pretty good already - there's no obvious noise and the samples line up seamlessly even though they were just naively glued together sample by sample. There's a lot of power in the lower harmonics, probably because of the big cardboard box I used, but this can easily be changed by EQ if we want to give the impression of a cute little music box.

Adding errors

The above sound still sounded quite artificial, I think mostly because simultaneous notes start on the same exact millisecond. There seems to be a small timing variance in real music boxes that is an important contributor to their overall delicate sound. In the sample below I added a timing error drawn from a normal distribution with a standard deviation of 11 milliseconds. It sounds a lot better already!
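The humanization itself is nearly a one-liner; a sketch, with the same 11 ms sigma:

    import numpy as np

    rng = np.random.default_rng()

    def jitter_onsets(onsets_s, sigma=0.011):
        # Nudge every note onset by a normally distributed timing error
        return [max(0.0, t + rng.normal(0.0, sigma)) for t in onsets_s]

    # A three-note chord no longer starts on the same exact millisecond:
    # jitter_onsets([1.0, 1.0, 1.0]) -> e.g. [0.994, 1.012, 1.003]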

Other sounds from the teeth

If you listen to recordings of music boxes you can occasionally hear a high-pitched screech as well. It sounds a bit like stopping a tuning fork or guitar string with a metal object. That's why I thought it must be the sound of the pin stopping a vibrating tooth just before playing another note on the same tooth.

[Image: Spectrogram of the beginning of a note with the characteristic screech, centered around 12 kilohertz.]

Sure enough, this sound could always be heard by playing the same note twice in quick succession. I recorded this sound for each tooth and added it to my sound generator. The screech is generated only if the previous note sample is still playing, and its volume is scaled in proportion to the tooth's envelope amplitude at that moment. It also silences the previous note. The amount of silence between the screech and the next note depends on a tempo setting.

Adding this resonance definitely brings about a more organic feel:

The wind-up mechanism

For a final touch I recorded sounds from the wind-up mechanism of another music box, even though this one didn't have one. It's all stitched up from small pieces, so the number of wind-ups in the beginning and the speed of the whirring sound can all be adjusted. I was surprised at the smoothness of the background sound; it's a three-second loop with no cross-fading involved. You can also hear the box lid being closed in the end.

Notation

[Image: VIM screenshot of a text file containing music box markup.]

The native notation of a music box is some kind of a perforated tape or drum, so I ended up using a similar format. There's a tempo marking and tuning information in the beginning, followed by notation one eighth per line. Arpeggios are indicated by a pointy bracket >. I also wrote a script to convert MIDI files into this format; but the number of notes in a music box loop is usually so small that it's not very hard to write manually.
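To give a flavour of the format, here's a made-up fragment; apart from the > arpeggio marker described above, the syntax is invented for illustration:

    tempo 90          ; header: tempo marking (hypothetical syntax)
    transpose +8      ; tuning information (hypothetical syntax)
    C E G             ; one eighth per line; notes on a line sound together
    -                 ; an empty eighth
    > C E G           ; arpeggiated chord
    -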

This format could include additional information as well, perhaps controlling the motor sound or box size and shape (properties of the EQ filter).

This format could also potentially be useful when producing or transcribing music from music drums.


Future developments

Currently the music box generator has a hastily written "engineer's UI", which means I probably won't remember how to use it in a couple of months' time. Perhaps it could be integrated into some music software as a plugin.

Possibilities for live performances are limited, I think. It wouldn't work exactly like a keyboard instrument usually does. At least there should be a way to turn on the background noise, and the player should take into account the 300-millisecond delay caused by the pin slowly rotating over the tooth. But it could be used to play a roll in an endless loop and the settings could be modified on the fly.

As such, the tool performs best at pre-rendering notated music. And I'm happy with the results!

Pea whistle steganography

[Image: Acme Thunderer 60.5 whistle]

Would anyone notice if a referee's whistle transmitted a secret data burst?

I really do follow the game. But every time the pea whistle sounds to start the jam I can't help but think of the possibility of embedding data in the frequency fluctuation. I'm sure it's alternating between two distinct frequencies. Is it really that binary? How random is the fluctuation? Could it be synthesized to contain data, and could that be read back?

I found a staggeringly detailed Wikipedia article about the physics of whistles – but not a single word there about the effects of adding a pea inside, which is obviously the cause of the frequency modulation.

To investigate this I bought a metallic pea whistle, the Acme Thunderer 60.5, pictured here. Recording its sound wasn't straightforward as the laptop microphone couldn't record the sound without clipping. The sound is incredibly loud indeed – I borrowed a sound pressure meter and it showed a peak level of 106.3 dB(A) at a distance of 70 cm, which translates to 103 dB at the standard 1 m distance. (For some reason I suddenly didn't want to make another measurement to get the distance right.)

[Image: Display of a sound pressure meter showing 106.3 dB max.]

Later I found a microphone that was happy about the decibels and got this spectrogram of a 500-millisecond whistle.

[Image: Spectrogram showing a tone with frequency shifts.]

The whistle seems to contain a sliding beginning phase, a long steady phase with frequency shifts, and a short sliding end phase. The "tail" after the end slide is just a room reverb and I'm not going to need it just yet. A slight amplitude modulation can be seen in the oscillogram. There's also noise on somewhat narrow bands around the harmonics.

The FM content is most clearly visible in the second and third harmonics. And it seems like it could very well fit FSK data!

Making it sound right

I'm no expert on synthesizers, so I decided to write everything from scratch (whistle-encode.pl). But I know the start phase of a sound, called the attack, is pretty important in identification. The rest of the fundamental tone is straightforward to write as a simple FSK modulator: at every sample point, a data-dependent increment is added to a phase accumulator, and the signal is the cosine of the accumulator. I used a low-pass IIR filter before frequency modulation to make the transitions smoother and more "natural".

Adding the harmonics is just a matter of measuring their relative powers from the spectrogram, multiplying the fundamental phase angle by the index of the harmonic, and then multiplying the cosine of that phase angle by the relative power of that harmonic. SoX takes care of the WAV headers.
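A condensed Python sketch of the same modulator (the real whistle-encode.pl is Perl and differs in details; the two frequencies and the harmonic powers here are placeholders):

    import numpy as np

    fs = 44100
    f_lo, f_hi = 2500.0, 2800.0          # placeholder whistle frequencies
    bits = [1, 0, 1, 1, 0, 0, 1, 0]      # data at 100 bps
    spb = fs // 100                      # samples per bit

    freq = np.repeat([f_hi if b else f_lo for b in bits], spb).astype(float)
    # One-pole IIR low-pass smooths the frequency transitions
    state = freq[0]
    for i in range(len(freq)):
        state += 0.002 * (freq[i] - state)
        freq[i] = state

    phase = np.cumsum(2 * np.pi * freq / fs)    # the phase accumulator
    powers = [1.0, 0.5, 0.25]                   # placeholder harmonic powers
    signal = sum(p * np.cos((k + 1) * phase) for k, p in enumerate(powers))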

Getting the noise to sound right was trickier. I ended up generating white noise (a simple rand()), lowpass filtering it, and then mixing a copy of it around every harmonic frequency. I gave the noise harmonics a different set of relative powers than for the cosine harmonics. It still sounds a bit too much like digital quantization noise.

Embedding data

There's a limit to the number of bits that can be sent before the result starts to sound unnatural; nobody has lungs that big. A data rate of 100 bps still sounded similar to the Acme Thunderer – which is quite a lot, nevertheless. I preceded the burst with two bytes for bit and byte sync (0xAA 0xA7), and one byte for the packet size.
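The framing is then just a few bytes in front of the payload:

    # Bit sync, byte sync, packet size, payload (bit order not shown here)
    payload = b"OHAI!"
    burst = bytes([0xAA, 0xA7, len(payload)]) + payload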

Here's "OHAI!":

Sounds legit to me! Here's a slightly longer one, encoding "Help me, I'm stuck inside a pea whistle":

Homework

  1. Write a receiver for the data. It should be as simple as receiving FSK; a rough starting point is sketched after this list. The frequency can be determined using atan2, a zero-crossing detector, or FFT, for instance. The synchronization bytes are meant to help decode such a short burst: the alternating 0s and 1s of 0xAA probably give us enough transitions to get a bit lock, and the 0xA7 serves as a recognizable pattern to lock the byte boundaries on.
  2. Build a physical whistle that does this! (Edit: example solution!)
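An atan2-style discriminator might begin like this sketch (it skips the sync search and assumes the input is already bandpassed around the fundamental):

    import numpy as np
    from scipy.signal import hilbert

    def discriminate(x, fs):
        # Instantaneous frequency from the analytic signal's phase
        phase = np.unwrap(np.angle(hilbert(x)))
        return np.gradient(phase) * fs / (2 * np.pi)

    def bits_out(freq_track, fs, baud=100):
        spb = fs // baud
        centre = np.median(freq_track)
        n = len(freq_track) // spb
        # Average each bit period and compare against the centre frequency
        return [int(np.mean(freq_track[i*spb:(i+1)*spb]) > centre)
                for i in range(n)]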

Case study: low-frequency tinnitus with distortion

[Image: A pure tone audiogram of both ears indicating no hearing loss.]

A periodically appearing low-frequency tinnitus is one of my least favorite signals. A doctor's visit only resulted in a WONTFIX and the audiogram shown here, which didn't really answer any questions. Also, the sound comes with some peculiarities that warrant a deeper analysis. So it shall become one of my absorptions.

The possible subtype (Vielsmeier et al. 2012) of tinnitus I have, related to a joint problem, is apparently even more poorly understood than the classical case (Vielsmeier et al. 2011), which of course means I'm free to make wild speculations! And maybe throw a supporting citation here and there.

Here's a simulation of what it sounds like. The occasional frequency shifts are caused by head movements. (There's only low-frequency content, so headphones will be needed; otherwise it will sound like silence.)

It's nothing new, save for the somewhat uncommon frequency. Sounds a bit like a car left idling outside the house. Now to the weird stuff.

Real-life audio artifacts!

This analysis was originally sparked by a seemingly unrelated observation. I listen to podcasts and documentaries a lot, and sometimes I've noticed the voice sounding like it had shifted up in frequency, by just a small amount. It resembles an across-the-spectrum linear shift that breaks the harmonic relationships, much like when listening to an SSB transmission. (Simulated sound sample from a podcast below.)

I always assumed this was a compression artifact of some kind. Or maybe broken headphones. But one day I also noticed it in real life, when a friend was talking to me! I had to ask her to repeat herself, even though I had heard her well. Surely not a compression artifact. Of course I immediately associated it with the tinnitus, which had been quite strong that day. But how could a pure tone alter the whole spectrum so drastically?

Amplitude modulation?

It's known that a signal gets frequency-shifted when amplitude-modulated, i.e. multiplied in the time domain, by a steady sine wave signal. This is a useful effect in the realm of radio, where it's known as heterodyning. My tinnitus happens to be a near-sinusoidal tone at 65 Hz; if this got somehow multiplied with part of the actual sound somewhere in the auditory pathway, it could explain the distortion.
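It's easy to see this in a couple of lines of numpy: multiplying by a 65 Hz sinusoid splits every spectral component f into f - 65 and f + 65 Hz. (Both sidebands appear; this is an illustration of the arithmetic, not a model of the ear.)

    import numpy as np

    fs = 8000
    t = np.arange(fs) / fs
    harmonic = np.cos(2 * np.pi * 300 * t)       # one speech harmonic
    modulated = harmonic * np.cos(2 * np.pi * 65 * t)

    spec = np.abs(np.fft.rfft(modulated))
    freqs = np.fft.rfftfreq(len(modulated), 1 / fs)
    print(freqs[spec > spec.max() / 2])          # [235. 365.] -- not 300 Hz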

[Image: Oscillograms of a wideband signal and a sinusoid tone, and a multiplication of the two.]

Where could such a multiplication take place physically? I'm guessing it should be someplace where the signal is still represented as a single waveform. The basilar membrane in the cochlea already mechanically filters the incoming sound into frequency bands one sixth of an octave wide for neural transmission (Schnupp et al. 2012). Modulating one of these narrow bands would likely not affect so many harmonics at the same time, so it should either happen before the filtering or at a later phase, where the signal is still being handled in a time-domain manner.

I've had several possibilities in mind:

  1. The low frequency tone could have its origins in actual physical vibration around the inner ear that would cause displacement of the basilar membrane. This is supported by a subjective physical sensation of pressure in the ear accompanying the sound. How it could cause amplitude modulation is discussed later on.
  2. A somatosensory neural signal can cause inhibitory modulation of the auditory nerves in the dorsal cochlear nucleus (Young et al. 1995). If this could happen fast enough, it could lead to amplitude modulation of the sound by modulating the amount of impulses transmitted; assuming the auditory nerves still carry direct information about the waveform at this point (they sort of do). Some believe the dorsal cochlear nucleus is exactly where the perceived sound in this type of tinnitus also originates (Sanchez & Rocha 2011).

Guinea pigs know the feeling

Already in the 1970s it was demonstrated that human auditory thresholds are modulated by low-frequency tones (Zwicker 1977). In a 1984 paper the mechanism was investigated further in guinea pigs (Patuzzi et al. 1984). A low-frequency tone (anywhere from 33 up to 400 Hz) presented to the ear modulated the sensitivity of the cochlear hair cell voltage to higher-frequency sounds. This modulation tracked the waveform of the low tone, such that the greatest amplitude suppression occurred at the peaks of the low tone's amplitude, and there was no suppression at its zero crossings. In other words, a low tone was capable of amplitude-modulating the ear's response to higher tones.

This modulation was observed already in the mechanical velocity of the basilar membrane, even before conversion into neural voltages. Some kind of an electro-mechanical feedback process was thought to be involved.

Hints towards a muscular origin

So, probably a 65 Hz signal exists somewhere, whether physical vibration or neural impulses. Where does it come from? Tinnitus with vascular etiology is usually pulsatile in nature (Hofmann et al. 2013), so it can be ruled out. But what about muscle cramps? After all, I know there's a problem with the temporomandibular joint and nearby muscles might not be happy about that. We could get some hints by studying the frequencies related to a contracting muscle.

A 1974 study of EEG contamination caused by various muscles showed that the surface EMG signal from the masseter muscle during contraction has its peak between 50 and 70 Hz (O'Donnell et al. 1974); just what we're looking for. (The masseter is located very close to the temporomandibular joint and the ear.) Later, there has been initial evidence that central neural motor commands to small muscles may be rhythmic in nature and that this rhythm is also reflected in EMG and the synchronous vibration of the contracting muscle (McAuley et al. 1997).

Sure enough, in my case, applying firm pressure to the deep masseter or the posterior digastric muscle temporarily silences the sound.

Recording it

Tinnitus associated with a physical sound detectable by an outside observer, a rare occurrence, is described as objective (Hofmann et al. 2013). My next plan was to use a small in-ear microphone setup to try and find out if there was an objective sound present. This would shed light on the way the sound is transmitted from the muscles to the auditory system, as if it made any difference.

But before I could do that, I went to a loud open-air trance party (with DJ Tristan) that, for some reason, eradicated the whole tinnitus that had been going on for a week or two. I had to wait for a week before it reappeared. (And I've noted that it tends to be the result of a stressful situation, as people on Twitter and HN have also pointed out.)

[Image: Sennheiser earplugs connected to the microphone preamp input of a Xenyx 302 USB audio interface.]

Now I could do a measurement. I used my earplugs as a microphone by plugging them into a mic preamplifier using a plug adapter. It's a mono preamp, so I disconnected the left channel of the adapter with a bit of tape, to record from the right ear only.

I set baudline to a 2-minute spectral integration time and a 600 Hz decimated sample rate, and the preamp to its maximum gain. Even though the setup is quite sensitive and the earplug has very good isolation, I wasn't able to detect even the slightest peak at 65 Hz. So either recording outside the tympanic membrane was an absurd idea to begin with, or maybe the neural explanation is the more likely cause of the sound.

[Image: Screenshot of baudline with the result of spectral integration from 0 to 150 Hz, with nothing to note but a slight downward slope towards the higher frequencies.]

My next post, The microphone bioamplifier, starts by exploring this further.


The flyback sonar

[Image: Close-up photo of a PCB with various components, including three potentiometers and a black bulky device labeled 'MONITRONICS INC TAIWAN'.]

One of the sounds of my childhood was the 15.6 kHz noise of the vibrating television flyback transformer, caused by a phenomenon known as magnetostriction. It has since disappeared, because cathode ray tubes have been replaced by modern display technologies that don't use magnetic deflection coils.

The piercing, ubiquitous sound would easily reveal if someone had a TV on in the house. It would also change a little when channels were changed. But due to its short wavelength, it would also reveal when someone was moving around in the room the TV was in. This was apparent as a modulation that could be heard even through a closed door and on another floor. Listening to it was much like using a passive sonar.

Now, inspired by some people at the University of Washington and Microsoft Research, I decided to investigate what the modulation was actually about.

Faking it a little

To get a stable tone at the flyback transformer, the TV apparently has to be tuned to a channel. This poses a problem: all analogue TV broadcasts in this country were discontinued in 2007. I can use my RF modulator to generate a TV channel, which stabilizes the sound. But it's still quite weak for scientific purposes, coming from such a tiny portable TV.

So I ended up sampling the sound and then using a digitally generated version, after confirming that it is indeed a pure sinusoid. I'll play the sound through the speakers and use the laptop's microphone to record the soundscape in the room. Of course, this is different from the setup where the moving person is between the sound source and the listener; but nevertheless, it should give us some insight.

It's a Doppler shift!

Blocking the sound source obviously modulates the signal amplitude to some extent. But it turns out that even a slight movement anywhere in the room causes noticeable frequency modulation in the echoed tone. This is evidently due to Doppler shift, since the sign of the shift correlates with the direction of the movement with respect to the laptop.
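The magnitude checks out, too: a reflector moving at walking speed shifts the ~15.6 kHz tone by enough to stand out in a spectrogram. A back-of-the-envelope two-way monostatic approximation:

    c, f0, v = 343.0, 15625.0, 1.0      # m/s, Hz (PAL line rate), m/s
    df = 2 * v / c * f0                 # two-way shift off a moving reflector
    print(round(df))                    # ~91 Hz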

In this video, the zero shift frequency has been filtered out of the spectrogram.

Another childhood mystery settled.