Barcode recovery using a priori constraints

Barcodes can be quite resilient to redaction. Not only is the pattern a strong visual signal, but also the encoded string that often has a rigidly defined structure. Here I present a method for recovering the data from a blurred, pixelated, or even partially covered barcode using prior knowledge of this higher-layer structure. This goes beyond so-called "deblurring" or blind deconvolution in that it can be applied to distortions other than blur as well.

It has also been a fun exercise in OpenCV matrix operations.

As example data, specimen pictures of Finnish driver's licenses shall be used. The card contains a Code 39 barcode encoding the cardholder's national identification number. This is a fixed-length string with well-defined structure and rudimentary error-detection, so it fits our purpose well. High-resolution samples with fictional data are available at government websites. Redacted and low-quality pictures of real cards are also widely available online, from social media sites to illustrations for news stories.

Nothing on the card indicates that the barcode contains sensitive information (knowledge of a name and this code often suffices as identification on the phone). Consequently, it's not hard to find pictures of cards with the barcode completely untouched either, even if all the other information has been carefully removed.

All cards and codes used in this post are simulated.

Image rectification

We'll start by aligning the barcode with the pixel grid and moving it into a known position. Its vertical position on the driver's license is pretty standard, so finding the card's corners and doing a reverse perspective projection should do the job.

Finding the blue EU flag seemed like a good starting point for automating the transform. However, JPEG is quite harsh on high-contrast edges and extrapolating the card boundary from the flag corners wasn't too reliable. A simpler solution is to use manual adjustments: an image viewer is opened and clicking on the image moves the corners of a quadrilateral on top of the image. cv::find­Homography() and cv::warp­Perspective() are then used to map this quadrilateral to a 857×400 rectangular image, giving us a rectified image of the card.

Reduction & filtering

The bottom 60 pixel rows, now containing our barcode of interest, are then reduced to a single 1D column sum signal using cv::reduce(). In this waveform, wide bars (black) will appear as valleys and wide spaces (white) as peaks.

In Code 39, all characters are of equal width and consist of 3 wide and 9 narrow elements (hence the name). Only the positions of the wide elements need to be determined to be able to decode the characters. A 15-pixel convolution kernel – cv::GaussianBlur() – is applied to smooth out any narrow lines.

A rectangular kernel matched to the bar width would possibly be a better choice, but the exact bar width is unknown at this point.


The format of the driver's license barcode will always be *DDMMYY-NNNC*, where

  • The asterisks * are start and stop characters in Code 39
  • DDMMYY is the cardholder's date of birth
  • NNN is a number from 001 to 899, its least significant bit denoting gender
  • C is a modulo-31 checksum character; Code 39 doesn't provide its own checksum

These constraints will be used to limit the search space at each string position. For example, at positions 0 and 12, the asterisk is the only allowed character, whereas in position 1 we can have either the number 0, 1, 2, or 3 as part of a day of month.

If text on the card is readable then the corresponding barcode characters can be marked as already solved by narrowing the search space to a single character.

Decoding characters

It's a learning adventure so the decoder is implemented as a type of matched filter bank using matrix operations. Perhaps it could be GPU-friendly, too.

Each row of the filter matrix represents an expected 1D convolution output of one character. A row is generated by creating an all-zeroes vector with just the peak elements set to plus/minus unity. These rows are then convolved with a horizontal Lanczos kernel.

The exact positions of the peaks depend on the barcode's wide-to-narrow ratio, as Code 39 allows anything from 2:1 to 3:1. Experiments have shown it to be 2.75:1 in most of these cards.

The above 1D wave is divided into character-length pieces which are then multiplied per-element by this newly generated matrix using cv::Mat::mul(). The result is reduced to a row sum vector.

This vector now contains a "score", a kind of matched filter output, for each character in the search space. The best matching character is the one with the highest score; this maximum is found using cv::minMaxLoc(). Constraints are passed to the command as a binary mask matrix.

Barcode alignment and length

To determine the left and right boundaries of the barcode, an exhaustive search is run through the whole 1D signal (around 800 milliseconds). On each iteration the total score is calculated as a sum of character scores, and the alignment with the best total score is returned. This also readily gives us the best decoded string.

We can also enable checksum calculation and look for the best string with a valid checksum. This allows for errors elsewhere in the code.


The barcodes in these images were fully recovered using the method presented above:

It might be possible to further develop the method to recover even more blurred images. Possible improvements could include fine-tuning the Lanczos kernel used to generate the filter bank, or coming up with a better way to score the matches.


The best way to redact a barcode seems to be to draw a solid rectangle over it, preferably even slightly bigger than the barcode itself, and make sure it really gets rendered into the bitmap.

Printing an unlabeled barcode with sensitive data seems like a bad idea to begin with, but of course there could be a logical reason behind it.

Pea whistle steganography

[Image: Acme Thunderer 60.5 whistle]

Would anyone notice if a referee's whistle transmitted a secret data burst?

I do really follow the game. But every time the pea whistle sounds to start the jam I can't help but think of the possibility of embedding data in the frequency fluctuation. I'm sure it's alternating between two distinct frequencies. Is it really that binary? How random is the fluctuation? Could it be synthesized to contain data, and could that be read back?

I found a staggeringly detailed Wikipedia article about the physics of whistles – but not a single word there about the effects of adding a pea inside, which is obviously the cause of the frequency modulation.

To investigate this I bought a metallic pea whistle, the Acme Thunderer 60.5, pictured here. Recording its sound wasn't straightforward as the laptop microphone couldn't record the sound without clipping. The sound is incredibly loud indeed – I borrowed a sound pressure meter and it showed a peak level of 106.3 dB(A) at a distance of 70 cm, which translates to 103 dB at the standard 1 m distance. (For some reason I suddenly didn't want to make another measurement to get the distance right.)

[Image: Display of a sound pressure meter showing 106.3 dB max.]

Later I found a microphone that was happy about the decibels and got this spectrogram of a 500-millisecond whistle.

[Image: Spectrogram showing a tone with frequency shifts.]

The whistle seems to contain a sliding beginning phase, a long steady phase with frequency shifts, and a short sliding end phase. The "tail" after the end slide is just a room reverb and I'm not going to need it just yet. A slight amplitude modulation can be seen in the oscillogram. There's also noise on somewhat narrow bands around the harmonics.

The FM content is most clearly visible in the second and third harmonics. And seems like it could very well fit FSK data!

Making it sound right

I'm no expert on synthesizers, so I decided to write everything from scratch ( But I know the start phase of a sound, called the attack, is pretty important in identification. It's simple to write the rest of the fundamental tone as a simple FSK modulator; at every sample point, a data-dependent increment is added to a phase accumulator, and the signal is the cosine of the accumulator. I used a low-pass IIR filter before frequency modulation to make the transitions smoother and more "natural".

Adding the harmonics is just a matter of measuring their relative powers from the spectrogram, multiplying the fundamental phase angle by the index of the harmonic, and then multiplying the cosine of that phase angle by the relative power of that harmonic. SoX takes care of the WAV headers.

Getting the noise to sound right was trickier. I ended up generating white noise (a simple rand()), lowpass filtering it, and then mixing a copy of it around every harmonic frequency. I gave the noise harmonics a different set of relative powers than for the cosine harmonics. It still sounds a bit too much like digital quantization noise.

Embedding data

There's a limit to the amount of bits that can be sent before the result starts to sound unnatural; nobody has lungs that big. A data rate of 100 bps sounded similar to the Acme Thunderer, which is pretty much nevertheless. I preceded the burst with two bytes for bit and byte sync (0xAA 0xA7), and one byte for the packet size.

Here's "OHAI!":

Sounds legit to me! Here's a slightly longer one, encoding "Help me, I'm stuck inside a pea whistle":


  1. Write a receiver for the data. It should be as simple as receiving FSK. The frequency can be determined using atan2, a zero-crossing detector, or FFT, for instance. The synchronization bytes are meant to help decode such a short burst; the alternating 0s and 1s of 0xAA probably give us enough transitions to get a bit lock, and the 0xA7 serves as a recognizable pattern to lock the byte boundaries on.
  2. Build a physical whistle that does this!

The microphone bioamplifier

As the in-ear microphone in the previous post couldn't detect a signal that would suggest objective tinnitus, the next step would be to examine EMG signals from facial muscles. This is usually done using a special-purpose device called a bioamplifier, special-purpose electrodes, and contact gel, none of which I have at hand. A perfect opportunity for home-baking, that is!

There's an Instructable called How to make ECG pads & conductive gel. Great! Aloe vera gel and table salt for the conductive gel are no problem, neither are the snap buttons for the electrodes. I don't have bottle caps, though, so instead I cut circular pieces out of some random plastic packaging.

[Image: An electrode made out of transparent plastic.]

As for the bioamplifier, why can't we just use the microphone preamplifier that was used for amplifying audio in the previous post? Both are weak low-frequency signals. There's no apparent reason for why it couldn't amplify EMG, if only a digital filter was used to suppress the mains hum.

It's a signal, but it's noise

First, a little disclaimer. It's unwise to just plug yourself into a random electric circuit, even if Oona survived. Mic preamps, for example, are not mere passive listeners; instead they will in some cases try to apply phantom power to the load. This can be up to 48 volts DC at 10 mA. There's anecdotal evidence of people getting palpitations from experiments like this. Or maybe not. But you wouldn't want to take the risk.

So I attached myself into some leads, soldered into a stereo miniplug, using the home-made pads that I taped on opposite sides of my face. I plugged the whole assembly into the USB sound card's mic preamp and recorded the signal at a pretty low sampling rate.

[Image: Spectrogram.]

The signal, shown here from 0 to 700 Hz, is dominated by a mains hum (colored red-brown), as I suspected. There is indeed a strong signal present during contraction of jaw muscles (large green area). Moving the jaw left and right produces a very low-frequency signal instead (bright green splatter at the bottom).

It's fun to watch but still a bit of a disappointment; I was really hoping for a clear narrow-band signal near the 65 Hz frequency of interest.

Einthoven's triangle

At this point I was almost ready to ditch the EMG thing as uninteresting, but decided to move the electrodes around and see what kind of signals I could get. When one of them was moved far enough, a pulsating low-frequency signal would appear:

[Image: Spectrogram with a regularly pulsating signal.]

Could this be what I think it is? To be sure about it I changed the positions of the electrodes to match Lead II in Einthoven's triangle, as used in electrocardiography. The signal from Lead II represents potential difference between my left leg and right arm, caused by the heart.

After I plugged the leads in the amp already did this:

[Image: Animation of the signal indicator LEDs of an amplifier blinking in a rhythmic manner.]

Looks promising! The mains hum was really irritating at this point, but I could get completely rid of it by rejecting all frequencies above 45 Hz, since the signal of interest was below that.

The result is a beautiful view of the iconic QRS complex, caused by ventricular depolarization in the heart:

[Image: Oscillogram with strong triple-pointed spikes at regular intervals.]

Case study: tinnitus with distortion

[Image: A pure tone audiogram of both ears indicating no hearing loss.]

A periodically appearing low-frequency tinnitus is one of my least favorite signals. A doctor's visit only resulted in a WONTFIX and the audiogram shown here, which didn't really answer any questions. Also, the sound comes with some pecularities that warrant a deeper analysis. So it shall become one of my absorptions.

The possible subtype (Vielsmeier et al. 2012) of tinnitus I have, related to a joint problem, is apparently even more poorly understood than the classical case (Vielsmeier et al. 2011), which of course means I'm free to make wild speculations! And maybe throw a supporting citation here and there.

Here's a simulation of what it sounds like. The occasional frequency shifts are caused by head movements. (There's only low-frequency content, so headphones will be needed; otherwise it will sound like silence.)

It's nothing new, save for the somewhat uncommon frequency. Now to the weird stuff.

Real-life audio artifacts!

This analysis was originally sparked by a seemingly unrelated observation. I listen to podcasts and documentaries a lot, and sometimes I've noticed the voice sounding like it had shifted up in frequency, for just a small amount. It would resemble an across-the-spectrum linear shift that breaks the harmonic relationships, much like when listening to a SSB transmission. (Simulated sound sample from a podcast below.)

I always assumed this was a compression artifact of some kind. Or maybe broken headphones. But one day I also noticed it in real life, when a friend was talking to me! I had to ask her repeat, even though I had heard her well. Surely not a compression artifact. Of course I immediately associated it with the tinnitus that had been quite strong that day. But how could a pure tone alter the whole spectrum so drastically?

Amplitude modulation?

It's known that a signal gets frequency-shifted when amplitude-modulated, i.e. multiplied in the time domain, by a steady sine wave signal. This is a useful effect in the realm of radio, where it's known as heterodyning. My tinnitus happens to be a near-sinusoidal tone at 65 Hz; if this got somehow multiplied with part of the actual sound somewhere in the auditory pathway, it could explain the distortion.

[Image: Oscillograms of a wideband signal and a sinusoid tone, and a multiplication of the two.]

Where could such a multiplication take place physically? I'm guessing it should be someplace where the signal is still represented as a single waveform. The basilar membrane in the cochlea already mechanically filters the incoming sound into frequency bands one sixth of an octave wide for neural transmission (Schnupp et al. 2012). Modulating one of these narrow bands would likely not affect so many harmonics at the same time, so it should either happen before the filtering or at a later phase, where the signal is still being handled in a time-domain manner.

I've had several possibilities in mind:

  1. The low frequency tone could have its origins in actual physical vibration around the inner ear that would cause displacement of the basilar membrane. This is supported by a subjective physical sensation of pressure in the ear accompanying the sound. How it could cause amplitude modulation is discussed later on.
  2. A somatosensory neural signal can cause inhibitory modulation of the auditory nerves in the dorsal cochlear nucleus (Young et al. 1995). If this could happen fast enough, it could lead to amplitude modulation of the sound by modulating the amount of impulses transmitted; assuming the auditory nerves still carry direct information about the waveform at this point (they sort of do). Some believe the dorsal cochlear nucleus is exactly where the perceived sound in this type of tinnitus also originates (Sanchez & Rocha 2011).

Guinea pigs know the feeling

Already in the 1970s, it was demonstrated that human auditory thresholds are modulated by low frequency tones (Zwicker 1977). In a 1984 paper the mechanism was investigated further in Guinea pigs (Patuzzi et al. 1984). A low-frequency tone (anywhere from 33 up to 400 Hz) presented to the ear modulated the sensitivity of the cochlear hair cell voltage to higher frequency sounds. This modulation tracked the waveform of the low tone, such that the greatest amplitude suppression was during the peaks of the low tone amplitude, and there was no suppression at its zero crossings. In other words, a low tone was capable of amplitude-modulating the ear's response to higher tones.

This modulation was observed already in the mechanical velocity of the basilar membrane, even before conversion into neural voltages. Some kind of an electro-mechanical feedback process was thought to be involved.

Hints towards a muscular origin

So, probably a 65 Hz signal exists somewhere, whether physical vibration or neural impulses. Where does it come from? Tinnitus with vascular etiology is usually pulsatile in nature (Hofmann et al. 2013), so it can be ruled out. But what about muscle cramps? After all, I know there's a displaced joint disc and nearby muscles might not be happy about that. We could get some hints by studying the frequencies related to a contracting muscle.

A 1974 study of EEG contamination caused by various muscles showed that the surface EMG signal from the masseter muscle during contraction has its peak between 50 and 70 Hz (O'Donnell et al. 1974); just what we're looking for. (The masseter is located very close to the temporomandibular joint and the ear.) Later, there has been initial evidence that central neural motor commands to small muscles may be rhythmic in nature and that this rhythm is also reflected in EMG and the synchronous vibration of the contracting muscle (McAuley et al. 1997).

Sure enough, in my case, applying firm pressure to the deep masseter or the posterior digastric muscle temporarily silences the sound.

Recording it

Tinnitus associated with a physical sound detectable by an outside observer, a rare occurrence, is described as objective (Hofmann et al. 2013). My next plan was to use a small in-ear microphone setup to try and find out if there was an objective sound present. This would shed light on the way the sound is transmitted from the muscles to the auditory system, as if it made any difference.

But before I could do that, I went to this loud open air trance party (with DJ Tristan) that, for some reason, eradicated the whole tinnitus that had been going on for a week or two. I had to wait for a week before it reappeared. (And I noted it being the result of a stressful situation, as people on Twitter and HN have also pointed out.)

[Image: Sennheiser earplugs connected to the microphone preamp input of a Xenyx 302 USB audio interface.]

Now I could do a measurement. I used my earplugs as a microphone by plugging them into a mic preamplifier using a plug adapter. It's a mono preamp, so I disconnected the left channel of the adapter using cellotape to just record from the right ear.

I set baudline for 2-minute spectral integration time and a 600 Hz decimated sample rate, and the preamp to its maximum gain. Even though the setup is quite sensitive and the earplug has very good isolation, I wasn't able to detect even the slightest peak at 65 Hz. So either recording outside the tympanic membrane was an absurd idea to begin with, or maybe the neural explanation is the more likely cause of the sound.

[Image: Screenshot of baudline with the result of spectral integration from 0 to 150 Hz, with nothing to note but a slight downward slope towards the higher frequencies.]


Trackers leaking bank account data

A Finnish online bank used to include a US-based third-party analytics and tracking script in all of its pages. Ospi first wrote about it (in Finnish) in February 2015, and this caused a bit of a fuss.

The bank responded to users' worries by claiming that all information is collected anonymously:

[Image: A tweet by the bank, in Finnish. Translation: Our customers' personal data will not be made available to Google under any circumstances. Thanks to everyone who participated in the discussion! (2/2)]

But is it true?

As Ospi notes, a plethora of information is sent along the HTTP request for the tracker script. This includes, of course, the IP address of the user; but also the full URL the user is browsing. The bank's URLs reveal quite a bit about what the user is doing; for instance, a user planning to start a continuous savings contract will send the url

I logged in to the bank (using well-known demo credentials) to record one such tracking request. The URL sent to the third party tracker contains a cleartext transaction archive code that could easily be used to match a transaction between two bank accounts, since it's identical for both users. But there's also a hex string called accountId (highlighted in red).

Remote Address: 80.***.***.***:443
Request URL:
Request Method: GET
Status Code:    200 OK

It's 40 hex characters long, which is 160 bits. This happens to be the length of an SHA-1 hash.

Could it really be a simple hash of the user's bank account number? Surely they would at least salt it.

Let's try!

The demo account's IBAN code is FI96 3939 0001 0006 03, but this doesn't give us the above hash. However, if we remove the country code, IBAN checksum, and all whitespaces, it turns out we have a match!

$ echo -n "FI96 3939 0001 0006 03" | shasum
dcf04c4fd3b6e29b4b43a8bf43c2713ac9be1de2  -

$ echo -n "FI9639390001000603" | shasum
3e3658e4c2802dd5c21b1c6c1ed55fc1f39c8830  -

$ echo -n "39390001000603" | shasum
69af881eca98b7042f18e975e00f9d49d5d5ee64  -

$ █

This is a BBAN format bank account number. BBAN numbers are easy to brute-force, especially if the bank is already known. I wrote the following C program, ~25 lines of code, that reversed the above hash to the correct account number in 0.5 seconds.

#include <openssl/sha.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define BBAN_LENGTH 14

int main() {
  const char target_hash[SHA_DIGEST_LENGTH] = {
  unsigned char try_accnum[BBAN_LENGTH+1];
  unsigned char try_hash[SHA_DIGEST_LENGTH];
  for (int bban_office=0; bban_office < 1e4; bban_office++) {
    for (int bban_id=0; bban_id < 1e6; bban_id++) {
      snprintf((char*)try_accnum, sizeof(try_accnum),
               "3939%04d%06d", bban_office, bban_id);
      SHA1(try_accnum, BBAN_LENGTH, try_hash);
      if (memcmp(try_hash, target_hash, SHA_DIGEST_LENGTH) == 0) {
        printf("found %s\n", try_accnum);
        return EXIT_SUCCESS;
  return EXIT_FAILURE;
$ gcc -lcrypto -o bban_hash bban_hash.c

$ time ./bban_hash
found 39390001000603
./bban_hash  0.42s user 0.00s system 99% cpu 0.420 total

$ █

In conclusion, the third party is provided with the user's IP address, bank account number, addresses of subpages they visit, and account numbers associated with all transactions they make. The analytics company should also have no difficulty matching the user with its own database collected from other sites, including their full name and search history.

Incidentally, this is in breach of the Guidelines on bank secrecy (PDF) by the Federation of Finnish Financial Services; "In accordance with the secrecy obligation, third parties may not even be disclosed whether a certain person is a customer of the bank" (pg 4) (sama suomeksi: "Salassapitovelvollisuus sisältää myös sen, että sivullisille ei ilmoiteta edes sitä, onko tietty henkilö pankin asiakas vai ei").


The script was eventually removed from the site, leaving the bank regretful that such a useful tool was lost.

However, alternatives do exist (like Piwik) that can be run locally, not involving a third party. Edit: The Intercept, a news website, is using non-privacy-invading metrics.

Receiving RDS with the RTL-SDR

redsea is a command-line RDS decoder. I originally wrote it as a script to decode RDS from demultiplexed FM stereo sound. Later I've experimented with other ways to read the bits, and the latest addition is to support the RTL-SDR television receiver via the rtl_fm tool.

Redsea is on GitHub. It has minimal dependencies (perl core modules, C standard library, rtl-sdr command-line tools) and has been tested to work on OSX and Linux with good enough FM reception. All test results, ideas, and pull requests are welcome.

What it says

The program prints out decoded RDS groups, one group per line. Each group will contain a PI code identifying the station plus varying other data, depending on the group type. The below picture explains the types of data you'll probably most often encounter.

[Image: Screenshot of textual output from redsea, with some parts explained.]

A more verbose output can be enabled with the -l option (it contains the same information though). The -t option prefixes all groups with an ISO timestamp.

How it works

The DSP side of my program, named rtl_redsea, is written in C99. It's a synchronous DBPSK receiver that first bandpass filters ① the multiplex signal. A PLL locks onto the 19 kHz stereo pilot tone; its third harmonic (57 kHz) is used to regenerate the RDS subcarrier. Dividing it by 16 also gives us the 1187.5 Hz clock frequency. Phase offsets of these derived signals are adjusted separately.

[Image: Oscillograms illustrating how the RDS subcarrier is gradually processed in redsea and finally reduced to a series of 1's and 0's.]

The local 57 kHz carrier is synchronized so that the constellation lines up on the real axis, so we can work on the real part only ②. Biphase symbols are multiplied by the square-wave clock and integrated ③ over a clock period, and then dumped into a delta decoder ④, which outputs the binary data as bit strings into stdout ⑤.

Signal quality is estimated a couple of times per second by counting the number of "suspicious" integrated biphase symbols, i.e. symbols with halves of opposite signs. The symbols are being sampled with a 180° phase shift as well, and we can switch to that stream if it seems to produce better results.

This low-throughput binary string data is then handled by via a pipe. Synchronization and error detection/correction happens there, as well as decoding. Group data is then displayed on the terminal, in semi-human-readable form.


My ultimate goal is to have a tool useful for FM DX, i.e. pretty good noise resistance.

My chip collection

Old IC (integrated circuit) packages are fun and I collect them. This involves going to flea markets to look for cheap vintage electronics like telephones, answering machines, radios or toys, and then desoldering and salvaging all the ICs and other interesting parts. Selected packages from my disorganized pile of chips follow. Most are POTS-related.

Sony CXA1619BS

[Image: Photo of package]

A "one-chip-wonder", this is an FM/AM radio in a small package. It takes an RF signal (from the antenna) and an IF oscillator frequency as inputs and outputs demodulated monaural audio.

Sanyo LA2805

[Image: Photo of package]

This chip does general answering machine related tasks. It has a tape preamp for recording and playback; voice detector logic; beep detection using zero-crossing comparation; power amplifier; line amplifier; and pins for interfacing with a microcontroller.

Unicorn Microelectronics UM91215C

[Image: Photo of package]

The UM91215C is a tone/pulse dialer. A telephone keyboard matrix is connected to the input pins, and the chip outputs DTMF-encoded audio or pulsed digits, depending on the selected dialing mode. An external oscillator needs to be connected as well. It can do a one-key redial of the last dialed number, and it can also flash the phone line.

Holtek HT9170

[Image: Photo of package]

A DTMF receiver, reversing the operation of UM91215C above. The chip, employing filters and zero-crossing detectors, is fed an external oscillator frequency and telephone line audio, and it outputs a four-bit code corresponding to the DTMF digit present in the signal. The use of external components is minimal, but a crystal oscillator is needed in this case as well.

SGS-Thomson TDA1154

[Image: Photo of package]

A speed regulator for DC motors, this chip can keep a motor running at a very stable speed under varying load conditions. In an answering machine, it is needed to keep distortions in tape audio in the minimum.

Toshiba TC8835AN

[Image: Photo of package]

This chip can store and play back a total of 16 audio recordings of 512 kilobits in size. It also contains a lot of command logic, explained in a 40-page datasheet. Type of audio encoding is not specified, but the bitrate can be chosen between 22kbps and 16kbps. The analog output must be filtered prior to playback.

Intel 8049

[Image: Photo of package]

This monster of a chip is a 6 MHz, 8-bit microcontroller with 17 registers, 2 kilobytes ROM, 128 bytes RAM, and an instruction set of 90 codes. It's used in many older devices, from telephones to digital multimeters.