Jul 11, 2015

Case study: tinnitus with distortion

[Image: A pure tone audiogram of both ears indicating no hearing loss.]

A periodically appearing low-frequency tinnitus is one of my least favorite signals. A doctor's visit only resulted in a WONTFIX and the audiogram on the right, which didn't really answer any questions. Also, the sound comes with some peculiarities that warrant a deeper analysis. So it shall become one of my absorptions.

The possible subtype (Vielsmeier et al. 2012) of tinnitus I have, related to a joint problem, is apparently even more poorly understood than the classical case (Vielsmeier et al. 2011), which of course means I'm free to make wild speculations! And maybe throw a supporting citation here and there.

Here's a simulation of what it sounds like. The occasional frequency shifts are caused by head movements. (There's only low-frequency content, so headphones will be needed; otherwise it will sound like silence.)

It's nothing new, save for the somewhat uncommon frequency. Now to the weird stuff.

Real-life audio artifacts!

This analysis was originally sparked by a seemingly unrelated observation. I listen to podcasts and documentaries a lot, and sometimes I've noticed the voice sounding like it had shifted up in frequency, by just a small amount. It would resemble an across-the-spectrum linear shift that breaks the harmonic relationships, much like when listening to an SSB transmission. (Simulated sound sample from a podcast below.)

I always assumed this was a compression artifact of some kind. Or maybe broken headphones. But one day I also noticed it in real life, when a friend was talking to me! I had to ask her to repeat herself, even though I had heard her well. Surely not a compression artifact. Of course I immediately associated it with the tinnitus, which had been quite strong that day. But how could a pure tone alter the whole spectrum so drastically?

Amplitude modulation?

It's known that a signal gets frequency-shifted when amplitude-modulated, i.e. multiplied in the time domain, by a steady sine wave signal. This is a useful effect in the realm of radio, where it's known as heterodyning. My tinnitus happens to be a near-sinusoidal tone at 65 Hz; if this got somehow multiplied with part of the actual sound somewhere in the auditory pathway, it could explain the distortion.

[Image: Oscillograms of a wideband signal and a sinusoid tone, and a multiplication of the two.]
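The frequency shifting follows from the product-to-sum identity cos(a)·cos(b) = ½[cos(a−b) + cos(a+b)]: multiplication by a 65 Hz sinusoid replaces each frequency f with f ± 65 Hz. A quick numerical check of this (the 1000 Hz test tone is an arbitrary pick for illustration; only the 65 Hz figure comes from the tinnitus tone):

```python
import math

fs, n = 8000, 8000            # sample rate and length: one second
f_sig, f_mod = 1000.0, 65.0   # test tone and modulating tinnitus tone

# amplitude modulation: multiply the two waveforms sample by sample
x = [math.cos(2*math.pi*f_sig*t/fs) * math.cos(2*math.pi*f_mod*t/fs)
     for t in range(n)]

def dft_mag(x, f):
    """Magnitude of a single DFT bin at frequency f (naive correlation)."""
    re = sum(s * math.cos(2*math.pi*f*t/fs) for t, s in enumerate(x))
    im = sum(s * math.sin(2*math.pi*f*t/fs) for t, s in enumerate(x))
    return math.hypot(re, im) / len(x)

# all the energy sits at f_sig ± f_mod; nothing remains at f_sig itself
print(round(dft_mag(x, 935), 2), round(dft_mag(x, 1000), 2),
      round(dft_mag(x, 1065), 2))   # → 0.25 0.0 0.25
```

This is also why the harmonic relationships of a voice would break: every component moves by the same 65 Hz instead of by a common factor, which is exactly what an SSB-style shift sounds like.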

Where could such a multiplication take place physically? I'm guessing it should be someplace where the signal is still represented as a single waveform. The basilar membrane in the cochlea already mechanically filters the incoming sound into frequency bands one sixth of an octave wide for neural transmission (Schnupp et al. 2012). Modulating one of these narrow bands would likely not affect so many harmonics at the same time, so it should either happen before the filtering or at a later phase, where the signal is still being handled in a time-domain manner.

I've had several possibilities in mind:

  1. The low frequency tone could have its origins in actual physical vibration around the inner ear that would cause displacement of the basilar membrane. This is supported by a subjective physical sensation of pressure in the ear accompanying the sound. How it could cause amplitude modulation is discussed later on.
  2. A somatosensory neural signal can cause inhibitory modulation of the auditory nerves in the dorsal cochlear nucleus (Young et al. 1995). If this could happen fast enough, it could lead to amplitude modulation of the sound by modulating the amount of impulses transmitted; assuming the auditory nerves still carry direct information about the waveform at this point (they sort of do). Some believe the dorsal cochlear nucleus is exactly where the perceived sound in this type of tinnitus also originates (Sanchez & Rocha 2011).

Guinea pigs know the feeling

Already in the 1970s, it was demonstrated that human auditory thresholds are modulated by low-frequency tones (Zwicker 1977). In a 1984 paper the mechanism was investigated further in guinea pigs (Patuzzi et al. 1984). A low-frequency tone (anywhere from 33 up to 400 Hz) presented to the ear modulated the sensitivity of the cochlear hair cell voltage to higher-frequency sounds. This modulation tracked the waveform of the low tone, such that amplitude suppression was greatest during the peaks of the low tone's amplitude and absent at its zero crossings. In other words, a low tone was capable of amplitude-modulating the ear's response to higher tones.

This modulation was observed already in the mechanical velocity of the basilar membrane, even before conversion into neural voltages. Some kind of an electro-mechanical feedback process was thought to be involved.
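In signal terms, the Patuzzi result reads like a time-varying gain that follows the magnitude of the low tone: full sensitivity at its zero crossings, maximum suppression at its peaks. A toy model of the idea (the suppression depth k and all frequencies except the 65 Hz tone are arbitrary choices, not figures from the paper):

```python
import math

fs = 8000
f_low, f_high = 65.0, 1000.0   # low modulating tone and a higher probe tone
k = 0.8                        # hypothetical suppression depth

out = []
for t in range(fs):            # one second of samples
    low  = math.sin(2*math.pi*f_low*t/fs)
    high = math.sin(2*math.pi*f_high*t/fs)
    gain = 1 - k*abs(low)      # no suppression at the low tone's zero crossings
    out.append(gain * high)
```

Because the gain depends on |low|, the envelope repeats twice per cycle of the low tone, i.e. at 130 Hz; something that could, in principle, be looked for in a recording.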

Hints towards a muscular origin

So a 65 Hz signal probably exists somewhere, whether as physical vibration or as neural impulses. Where does it come from? Tinnitus with a vascular etiology is usually pulsatile in nature (Hofmann et al. 2013), so that can be ruled out. But what about muscle cramps? After all, I know there's a displaced joint disc, and nearby muscles might not be happy about that. We could get some hints by studying the frequencies related to a contracting muscle.

A 1974 study of EEG contamination caused by various muscles showed that the surface EMG signal from the masseter muscle during contraction has its peak between 50 and 70 Hz (O'Donnell et al. 1974); just what we're looking for. (The masseter is located very close to the temporomandibular joint and the ear.) There has since been initial evidence that central neural motor commands to small muscles may be rhythmic in nature and that this rhythm is also reflected in EMG and in the synchronous vibration of the contracting muscle (McAuley et al. 1997).

Sure enough, in my case, applying firm pressure to the deep masseter or the posterior digastric muscle temporarily silences the sound.

Recording it

Tinnitus associated with a physical sound detectable by an observer, a rare occurrence, is described as objective (Hofmann et al. 2013). My next plan was to use a small in-ear microphone setup to try and find out if there was an objective sound present. This would shed light on the way the sound is transmitted from the muscles to the auditory system, as if it made any difference.

But before I could do that, I went to a loud open-air trance party (with DJ Tristan) that, for some reason, eradicated the whole tinnitus that had been going on for a week or two. I had to wait a week for it to reappear. (And I noted that it reappeared in a stressful situation, something people on Twitter and HN have also pointed out.)

[Image: Sennheiser earplugs connected to the microphone preamp input of a Xenyx 302 USB audio interface.]

Now I could do a measurement. I used my earplugs as a microphone by plugging them into a mic preamplifier using a plug adapter. It's a mono preamp, so I disconnected the left channel of the adapter with a bit of sellotape to record from the right ear only.

I set baudline to a 2-minute spectral integration time and a 600 Hz decimated sample rate, and the preamp to its maximum gain. Even though the setup is quite sensitive and the earplug has very good isolation, I wasn't able to detect even the slightest peak at 65 Hz. So either recording outside the tympanic membrane was an absurd idea to begin with, or maybe the neural explanation is the more likely cause of the sound.

[Image: Screenshot of baudline with the result of spectral integration from 0 to 150 Hz, with nothing to note but a slight downward slope towards the higher frequencies.]


Apr 14, 2015

Trackers leaking bank account data

A Finnish online bank used to include a third-party analytics and tracking script in all of its pages. Ospi first wrote about it (in Finnish) in February 2015, and this caused a bit of a fuss.

The bank responded to users' worries by claiming that all information is collected anonymously:

[Image: A tweet by the bank, in Finnish. Translation: Our customers' personal data will not be made available to Google under any circumstances. Thanks to everyone who participated in the discussion! (2/2)]

But is it true?

As Ospi notes, a plethora of information is sent along with the HTTP request for the tracker script. This includes, of course, the IP address of the user; but also the full URL the user is browsing. The bank's URLs reveal quite a bit about what the user is doing; for instance, a user planning to start a continuous savings contract will send the URL continuousSavingsContractStep1.do.

I logged in to the bank (using well-known demo credentials) to record one such tracking request. The URL sent to the third party tracker contains a cleartext transaction archive code that could easily be used to match a transaction between two bank accounts, since it's identical for both users. But there's also a hex string called "accountId" (highlighted in red).

Remote Address: 80.***.***.***:443
Request URL:    https://www.google-analytics.com/collect?v=1&_v=j33&a=870588619&t
                =pageview&_s=1&dl=https%3A%2F%2Fonline.********.fi%2Febank%2Facco
                unt%2FinitTransactionDetails.do%3FbackLink%3Dreset%26accountId%3D
                69af881eca98b7042f18e975e00f9d49d5d5ee64%26rowNo%3D0%26type%3Dtra
                ns%26archivecode%3D20150220123456780002&ul=en-us&de=windows-1252&
                dt=Tilit%C2%A0%7C%C2%A0Verkkopankki%20%7C%20S-Pankki&sd=24-bit&sr
                =1440x900&vp=1440x150&je=1&fl=16.0%20r0&_u=QACAAQQBI~&jid=&cid=18
                39557247.1424801770&uid=&tid=UA-37407484-1&cd1=&cd2=demo_accounts
                &cd3=%2Ffi%2F&z=2098846672
Request Method: GET
Status Code:    200 OK

It's 40 hex characters long, which is 160 bits. This happens to be the length of an SHA-1 hash.

Could it really be a simple hash of the user's bank account number? Surely they would at least salt it.

Let's try.

The demo account's IBAN code is FI96 3939 0001 0006 03, but this doesn't give us the above hash. However, if we remove the country code, IBAN checksum, and all whitespaces, it turns out we have a match!

~ (zsh)
[~]$ echo -n "FI96 3939 0001 0006 03" | shasum
dcf04c4fd3b6e29b4b43a8bf43c2713ac9be1de2  -
[~]$ echo -n "FI9639390001000603" | shasum
3e3658e4c2802dd5c21b1c6c1ed55fc1f39c8830  -
[~]$ echo -n "39390001000603" | shasum
69af881eca98b7042f18e975e00f9d49d5d5ee64  -

This is a BBAN format bank account number. BBAN numbers are easy to brute-force, especially if the bank is already known. I wrote the following C program, 22 lines of code, that reversed the above hash to the correct account number in 0.5 seconds.

#include <openssl/sha.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
 
int main() {
  const unsigned char target_hash[SHA_DIGEST_LENGTH] = {
    0x69, 0xaf, 0x88, 0x1e, 0xca, 0x98, 0xb7, 0x04, 0x2f, 0x18,
    0xe9, 0x75, 0xe0, 0x0f, 0x9d, 0x49, 0xd5, 0xd5, 0xee, 0x64
  };
  unsigned char test_number[15];
  unsigned char test_hash[SHA_DIGEST_LENGTH];
 
  for (int bban_office=0; bban_office <= 9999; bban_office++) {
    for (int bban_id=0; bban_id <= 999999; bban_id++) {
      snprintf((char*)test_number, 15, "3939%04d%06d", bban_office,
               bban_id);
      SHA1(test_number, 14, test_hash);
      if (memcmp(test_hash, target_hash, 20) == 0) {
        printf("found %s\n", test_number);
        exit(0);
      }
    }
  }
}
~/koodi (zsh)
[~/koodi]$ gcc -lcrypto -o pankki pankki.c
[~/koodi]$ time ./pankki
found 39390001000603
./pankki  0.42s user 0.00s system 99% cpu 0.420 total

In conclusion, the third party is provided with the user's IP address, bank account number, addresses of subpages they visit, and account numbers associated with all transactions they make. The analytics company should also have no difficulty matching the user with its own database collected from other sites, including their full name and search history.

Incidentally, this is in breach of the Guidelines on bank secrecy (PDF) by the Federation of Finnish Financial Services: "In accordance with the secrecy obligation, third parties may not even be told whether a certain person is a customer of the bank" (p. 4) (same in Finnish).

The script was eventually removed from the site, leaving the bank regretful that such a useful tool was lost. However, alternatives do exist (like Piwik) that can be run locally, not involving a third party.

Feb 8, 2015

Receiving RDS with the RTL-SDR

redsea is a command-line RDS decoder. I originally wrote it as a script to decode RDS from demultiplexed FM stereo sound. Later I've experimented with other ways to read the bits, and the latest addition is to support the RTL-SDR television receiver via the rtl_fm tool.

Redsea is on GitHub. It has minimal dependencies (perl core modules, C standard library, rtl-sdr command-line tools) and has been tested to work on OSX and Linux with good enough FM reception. All test results, ideas, and pull requests are welcome.

What it says

The program prints out decoded RDS groups, one group per line. Each group will contain a PI code identifying the station plus varying other data, depending on the group type. The below picture explains the types of data you'll probably most often encounter.

[Image: Screenshot of textual output from redsea, with some parts explained.]

A more verbose output can be enabled with the -l option (it contains the same information though). The -t option prefixes all groups with an ISO timestamp.

How it works

The DSP side of my program, named rtl_redsea, is written in C99. It's a synchronous DBPSK receiver that first bandpass filters ① the multiplex signal. A PLL locks onto the 19 kHz stereo pilot tone; its third harmonic (57 kHz) is used to regenerate the RDS subcarrier, and dividing the pilot by 16 gives us the 1187.5 Hz clock frequency. Phase offsets of these derived signals are adjusted separately.

[Image: Oscillograms illustrating how the RDS subcarrier is gradually processed in redsea and finally reduced to a series of 1's and 0's.]

The local 57 kHz carrier is synchronized so that the constellation lines up on the real axis, so we can work on the real part only ②. Biphase symbols are multiplied by the square-wave clock and integrated ③ over a clock period, and then dumped into a delta decoder ④, which outputs the binary data as bit strings into stdout ⑤.
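The delta decoder ④ is, presumably, plain differential decoding: each output bit is the exclusive-or of two consecutive hard symbols. A nice property of this is that a constant polarity flip of the symbol stream decodes to the same bits. A minimal sketch of the idea, not the actual rtl_redsea code:

```python
def delta_decode(symbols):
    """Differentially decode hard symbols: bit = current XOR previous."""
    return [cur ^ prev for cur, prev in zip(symbols[1:], symbols)]

syms = [0, 1, 1, 0, 0, 0, 1]
flipped = [s ^ 1 for s in syms]   # same stream with polarity inverted

print(delta_decode(syms))                           # → [1, 0, 1, 0, 0, 1]
print(delta_decode(syms) == delta_decode(flipped))  # → True
```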

Signal quality is estimated a couple of times per second by counting the number of "suspicious" integrated biphase symbols, i.e. symbols with halves of opposite signs. The symbols are being sampled with a 180° phase shift as well, and we can switch to that stream if it seems to produce better results.

This low-throughput binary string data is then handled by redsea.pl via a pipe. Synchronization and error detection/correction happens there, as well as decoding. Group data is then displayed on the terminal, in semi-human-readable form.

Future

My ultimate goal is to have a tool useful for FM DX, i.e. pretty good noise resistance.

Jan 16, 2015

My chip collection

Old IC (integrated circuit) packages are fun and I collect them. This involves going to flea markets to look for cheap vintage electronics like telephones, answering machines, radios or toys, and then desoldering and salvaging all the ICs and other interesting parts. Selected packages from my disorganized pile of chips follow. Most are POTS-related.

Sony CXA1619BS

[Image: Photo of package]

A "one-chip-wonder", this is an FM/AM radio in a small package. It takes an RF signal (from the antenna) and an IF oscillator frequency as inputs and outputs demodulated monaural audio.

Sanyo LA2805

[Image: Photo of package]

This chip does general answering-machine tasks. It has a tape preamp for recording and playback; voice detector logic; beep detection using a zero-crossing comparator; a power amplifier; a line amplifier; and pins for interfacing with a microcontroller.

Unicorn Microelectronics UM91215C

[Image: Photo of package]

The UM91215C is a tone/pulse dialer. A telephone keyboard matrix is connected to the input pins, and the chip outputs DTMF-encoded audio or pulsed digits, depending on the selected dialing mode. An external oscillator needs to be connected as well. It can do a one-key redial of the last dialed number, and it can also flash the phone line.

Holtek HT9170

[Image: Photo of package]

A DTMF receiver, reversing the operation of UM91215C above. The chip, employing filters and zero-crossing detectors, is fed an external oscillator frequency and telephone line audio, and it outputs a four-bit code corresponding to the DTMF digit present in the signal. The use of external components is minimal, but a crystal oscillator is needed in this case as well.

SGS-Thomson TDA1154

[Image: Photo of package]

A speed regulator for DC motors, this chip can keep a motor running at a very stable speed under varying load conditions. In an answering machine, it is needed to keep distortion in the tape audio to a minimum.

Toshiba TC8835AN

[Image: Photo of package]

This chip can store and play back a total of 16 audio recordings of 512 kilobits each. It also contains a lot of command logic, explained in a 40-page datasheet. The type of audio encoding is not specified, but the bitrate can be chosen between 16 and 22 kbps. The analog output must be filtered prior to playback.

Intel 8049

[Image: Photo of package]

This monster of a chip is a 6 MHz, 8-bit microcontroller with 17 registers, 2 kilobytes of ROM, 128 bytes of RAM, and an instruction set of some 90 instructions. It's used in many older devices, from telephones to digital multimeters.

Oct 30, 2014

Visualizing hex dumps with Unicode emoji

Memorizing SSH public key fingerprints can be difficult; they're just long random numbers displayed in base 16. There are some terminal-friendly solutions, like OpenSSH's randomart. But because I use a Unicode terminal, I like to map the individual bytes into characters in the Miscellaneous Symbols and Pictographs block.

This Perl script does just that:

@emoji = qw( 🌀  🌂  🌅  🌈  🌙  🌞  🌟  🌠  🌰  🌱  🌲  🌳  🌴  🌵  🌷  🌸
             🌹  🌺  🌻  🌼  🌽  🌾  🌿  🍀  🍁  🍂  🍃  🍄  🍅  🍆  🍇  🍈
             🍉  🍊  🍋  🍌  🍍  🍎  🍏  🍐  🍑  🍒  🍓  🍔  🍕  🍖  🍗  🍘
             🍜  🍝  🍞  🍟  🍠  🍡  🍢  🍣  🍤  🍥  🍦  🍧  🍨  🍩  🍪  🍫
             🍬  🍭  🍮  🍯  🍰  🍱  🍲  🍳  🍴  🍵  🍶  🍷  🍸  🍹  🍺  🍻
             🍼  🎀  🎁  🎂  🎃  🎄  🎅  🎈  🎉  🎊  🎋  🎌  🎍  🎎  🎏  🎒
             🎓  🎠  🎡  🎢  🎣  🎤  🎥  🎦  🎧  🎨  🎩  🎪  🎫  🎬  🎭  🎮
             🎯  🎰  🎱  🎲  🎳  🎴  🎵  🎷  🎸  🎹  🎺  🎻  🎽  🎾  🎿  🏀
             🏁  🏂  🏃  🏄  🏆  🏇  🏈  🏉  🏊  🐀  🐁  🐂  🐃  🐄  🐅  🐆
             🐇  🐈  🐉  🐊  🐋  🐌  🐍  🐎  🐏  🐐  🐑  🐒  🐓  🐔  🐕  🐖
             🐗  🐘  🐙  🐚  🐛  🐜  🐝  🐞  🐟  🐠  🐡  🐢  🐣  🐤  🐥  🐦
             🐧  🐨  🐩  🐪  🐫  🐬  🐭  🐮  🐯  🐰  🐱  🐲  🐳  🐴  🐵  🐶
             🐷  🐸  🐹  🐺  🐻  🐼  🐽  🐾  👀  👂  👃  👄  👅  👆  👇  👈
             👉  👊  👋  👌  👍  👎  👏  👐  👑  👒  👓  👔  👕  👖  👗  👘
             👙  👚  👛  👜  👝  👞  👟  👠  👡  👢  👣  👤  👥  👦  👧  👨
             👩  👪  👮  👯  👺  👻  👼  👽  👾  👿  💀  💁  💂  💃  💄  💅 );

while (<>) {
  if (/[a-f0-9:]+:[a-f0-9:]+/) {
    ($b, $m, $a) = ($`, $&, $');
    print $b.join("  ", map { $emoji[$_] } map hex, split /:/, $m)." ".$a;
  }
}

What's happening here? First we create a 256-element array containing a hand-picked collection of emoji. Naturally, they're all assigned an index from 0x00 to 0xff. Then we'll loop through standard input and look for lines containing colon-separated hex bytes. Each hex value is replaced with an emoji from the array.

Here's the output:

[Image: Terminal screenshot showing a PGP key fingerprint and the same with all hex numbers replaced with emoji.]

The script could easily be extended to support output from other hex-formatted sources as well, such as xxd:

[Image: Terminal screenshot of xxd output with the hex bytes replaced by emoji.]

Jul 14, 2014

Mapping microwave relay links from video

Radio networks are often at least partially based on microwave relay links. They're those little mushroom-like appendices growing out of cell towers and building-mounted base stations. Technically, they're carefully directed dish antennas linking such towers together over a line-of-sight connection. I'm collecting a little map of nearby link stations, trying to find out how they're interconnected and which network they belong to.

Circling around

We can find a rough direction for any link antenna by approximating a tangent for the dish shroud surface from position-stamped video footage taken while circling the tower. Optimally we would have a drone make a full circle around the tower at a constant distance and elevation to map all antennas at once; but if our DJI Phantom has run out of battery, a GPS-positioned still camera at ground level will also do.

[Image: Five photos of the same directional microwave antenna, taken from different angles, and edge-detection and elliptical Hough transform results from each one, with a large and small circle for all ellipses.]

The rest can be done manually, or using the Hough transform and centroid calculation from OpenCV. In these pictures, the ratio of the diameters of the concentric circles is a sinusoidal function of the angle between the antenna direction and the camera direction. At its maximum, we're looking straight at the beam. (The ratio won't max out at unity in this case, because we're looking at the antenna slightly from below.) We can select the frame with the maximum ratio from high-speed footage, or we can interpolate a smooth sinusoid to get an even better value.

[Image: Diagram showing how the ratio of the diameters of the large and small circle is proportional to the angle of the antenna in relation to the camera.]

This particular antenna is pointing west-northwest with an azimuth of 290°.

What about distance?

Because of the line-of-sight requirement, we also know the maximum possible distance to the linked tower, using the formula 7140 × √(4 / 3 × h) where h is the height of the antenna from ground. If the beam happens to hit a previously mapped tower closer than this distance, we can assume they're connected!
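As a sanity check of the arithmetic (assuming h and the result are in metres; the units aren't stated above), an antenna height of about 34 metres gives the 48 km limit mentioned below:

```python
import math

def max_los_distance(h):
    """Maximum line-of-sight distance in metres for antenna height h (m),
    using the 7140 * sqrt(4/3 * h) rule of thumb from the text."""
    return 7140 * math.sqrt(4/3 * h)

print(round(max_los_distance(34) / 1000, 1))   # → 48.1 (km)
```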

This antenna is communicating to a tower not further away than 48 km. Judging from the building it's standing on, it belongs to a government trunked radio network.

Jun 16, 2014

Headerless train announcements

[Image: Information display onboard a Helsinki train, showing a transcript of an announcement along with the time of the day, current speed and other info.]

The Finnish state railway company just changed their automatic announcement voice, discarding old recordings from trains. It's a good time for some data dumpster diving for the old ones, don't you think?

A 67-megabyte ISO 9660 image surfaced that once belonged to an older-type onboard announcement device. It contains a file system of 58 directories with five-digit names, and one called "yleis" (Finnish for "general").

Each directory contains files with three-digit file names. For each number, there's 001.inf, 001.txt and 001.snd. The .inf and .txt files seem to contain parts of announcements as ISO 8859 encoded strings, such as "InterCity train" and "to Helsinki". The .snd files obviously contain the corresponding audio announcements. There's a total of 1950 sound files.

Directory structure

The file system seems to be structurally pointless; there's nothing apparent that differentiates all files in /00104 from files in /00105. Announcements in different languages are numerically separated, though (/001xx = Finnish, /002xx = Swedish, /003xx = English). Track numbers and time readouts are stored sequentially, but there are out-of-place announcements and test files in between. The logic connecting numbers to their meanings is probably programmed into the device for every train route.

Everything can be spliced together from almost single words. But many common announcements are also recorded as whole sentences, probably to make them sound more natural.

Audio format

The audio files are headerless; there is no explicit information about the format, sample rate or sample size anywhere.

The byte histogram and Poincaré plot of the raw data suggest a 4-bit sample size; this, along with the fact that all files start with 0x80, is indicative of an adaptive differential PCM encoding scheme.
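The idea behind the nibble histogram: split every byte into its high and low 4-bit halves and look at the two distributions separately; if the file really is packed 4-bit samples, both halves show the same bell-like shape centered mid-range. A sketch of the check, with synthetic data standing in for the actual .snd files:

```python
import collections, random

random.seed(0)

# synthetic stand-in: Gaussian-ish 4-bit samples packed two per byte
samples = [max(0, min(15, int(random.gauss(8, 2)))) for _ in range(20000)]
data = bytes((a << 4) | b for a, b in zip(samples[::2], samples[1::2]))

hi = collections.Counter(b >> 4 for b in data)    # high nibbles
lo = collections.Counter(b & 0x0f for b in data)  # low nibbles

# both histograms peak near the middle of the 0..15 range
print(max(hi, key=hi.get), max(lo, key=lo.get))
```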

[Image: Byte histogram and Poincare plot of a raw audio file, characteristic of Gaussian-distributed data encoded as four-bit samples.]

Unfortunately there are as many variations of ADPCM as there are manufacturers of encoder chips. None of the decoders known by SoX produce clean results. But with the right settings for the OKI-ADPCM decoder we can already hear some garbled speech under heavy Brownian noise.

For unknown reasons, the output signal from SoX is spectrum-inverted. Luckily it's trivial to fix (see my previous post on frequency inversion). The pitch sounds roughly natural when a 19,000 Hz sampling rate is assumed. A test tone found in one file comes out as a 1000 Hz sine when the sampling rate is further refined to 18,930 Hz.

This is what we get after frequency inversion, spectral equalization, and low-pass filtering:

There's still a high noise floor due to the mismatch between OKI-ADPCM and the unknown algorithm used by the announcement device, but it's starting to sound alright!

Peculiarities

There seems to be an announcement for every thinkable situation, such as:

  • "Ladies and Gentlemen, as due to heavy snowfall, we are running slightly late. Please accept our apologies."
  • "Ladies and Gentlemen, an animal has been run over by the train. We have to wait a while before continuing the journey."
  • "Ladies and Gentlemen, the arrival track of the train having been changed, the platform is on your left hand side."
  • "Ladies and Gentlemen, we regret to inform you that today the restaurant-car is exceptionally closed."

Also, there is an English recording of most announcements, even though only Finnish and Swedish are usually heard on commuter trains.

One file contains a long instrumental country song.

In an eerily out-of-place sound file, a small child reads out a list of numbers.

Final words

This is something I've wanted to do with this almost melodically intonated announcement about ticket selling compartments.