Beeps and melodies in two-way radio

Lately my listening activities have focused on two-way FM radio. I'm interested in automatic monitoring and visualization of multiple channels simultaneously, and classifying transmitters. There's a lot of in-band signaling to be decoded! This post shall demonstrate this diversity and also explain how my listening station works.

Background: walkie-talkies are fun

The frequency band I've recently been listening to the most is called PMR446. It's a European band of radio frequencies for short-distance UHF walkie-talkies. Unlike ham radio, it doesn't require licenses or technical competence – anyone with 50€ to spare can get a pair of walkie-talkies at the department store. It's very similar to FRS in the US. It's quite popular where I live.

[Image: Photo of three different walkie-talkies.]

The short-distance nature of PMR446 is what I find perhaps most fascinating: in normal conditions, everything you hear has been transmitted from within a 2-kilometer (1.2-mile) radius. Transmitter power is limited to 500 mW and directional antennas are not allowed on the transmitter side. But I have a receive-only system, and my only directional antenna is for 450 MHz, which is how I originally found these channels.

Roger beep

The roger beep is a short melody sent by many hand-held radios to indicate the end of transmission.

The end of transmission must be indicated, because two-way radio is 'half-duplex', which means only one person can transmit at a time. Some voice protocols solve the same problem by mandating the use of a specific word like 'over'; others rely on the short burst of static (squelch tail) that can be heard right after the carrier is lost. Roger beeps are especially common in consumer radios, but I've heard them in ham QSOs as well, especially if repeaters are involved.

Other signaling on PMR

PMR also differs from ham radio in that many of its users don't want to hear random people talking on the same frequency; indeed, many devices employ tones or digital codes designed to silence unwanted conversations, called CTCSS, DCS, or coded squelch. They are very low-frequency tones that can't usually be heard at all because of filtering. These won't prevent others from listening to you though; anyone can just disable coded squelch on their device and hear everyone else on the channel.

Many devices also use a tone-based system for preventing the short burst of static, that classic walkie-talkie sound, from sounding whenever a transmission ends. Baofeng calls these squelch tail elimination tones, or STE for short. The practice is not standardized and I've seen several different sub-audible frequencies being used in the wild, namely 55, 62, and 260 Hz. (Edit: As correctly pointed out by several people, another way to do this is to reverse the phase of the CTCSS tone at the end of the transmission, called a 'reverse burst'. Not all radios use it though; many opt to send a 55 Hz tone instead, even when they are using CTCSS.)

Some radios have a button called 'alarm' that sends a long, repeating melody resembling a 90s mobile phone ring tone. These melodies also vary from one radio to the other.

My receiver

I have a system in place to alert me whenever there's a strong enough signal matching an interesting set of parameters on any of the eight PMR channels. It's based on a Raspberry Pi 3B+ and an Airspy R2 SDR receiver. The program can play the live audio of all channels simultaneously, or a single channel can be selected for listening. It also has an annotated waterfall view that shows traffic on the band during the last couple of hours:

[Image: A user interface with text-mode graphics, showing eight vertical lanes of timestamped information. The lanes are mostly empty, but there's an occasional colored bar with annotations like 'a1' or '62'.]

The computer is a headless Raspberry Pi with only SSH connectivity; that's why it's in text mode. Also, text-mode waterfall plots are cool!

The coloured bars indicate signal strength (colour) and the duty factor (pattern). The numbers around the bars are decoded squelch codes, STEs and roger beeps. Uncertain detections are greyed out. In this view we've detected roger beeps of type 'a1' and 'a2'; a somewhat rare 62 Hz STE tone; and a ring tone, or alarm (RNG).

Because squelch codes are designed to be read by electronic circuits and their frequencies and codewords are specified exactly, writing a digital decoder for them was somewhat straightforward. Roger beeps and ring tones, on the other hand, are only meant for the human listener and detecting them amongst the noise took a bit more trial-and-error.
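To illustrate why exactly specified tones are comparatively easy to detect, here is a minimal sketch using the Goertzel algorithm, which measures signal power at a single known frequency. This is not necessarily how the receiver does it. The sketch assumes audio that has already been low-pass filtered so that mostly the tone remains, and decimated to about 1 kHz. The function names and the 0.5 threshold are made up for illustration.

```python
import math

STE_CANDIDATES_HZ = (55.0, 62.0, 260.0)  # STE frequencies mentioned earlier


def goertzel_power(samples, freq_hz, fs):
    """Squared magnitude of the signal's spectrum at freq_hz (Goertzel)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / fs)
    s1 = s2 = 0.0
    for x in samples:
        # Second-order resonator tuned to freq_hz.
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_tone(samples, fs, candidates=STE_CANDIDATES_HZ, min_share=0.5):
    """Return the dominant candidate tone in Hz, or None if none stands out.

    min_share is a hypothetical threshold, not a value from the receiver.
    """
    energy = sum(x * x for x in samples)
    if energy == 0.0:
        return None
    n = len(samples)
    powers = {f: goertzel_power(samples, f, fs) for f in candidates}
    best = max(powers, key=powers.get)
    # For a pure sine the Goertzel power is about energy * n / 2,
    # so this ratio approaches 1 when the tone dominates the buffer.
    share = powers[best] / (energy * n / 2.0)
    return best if share >= min_share else None
```

Telling 55 Hz from 62 Hz needs a frequency resolution of a few hertz, so the buffer should cover at least a few tenths of a second.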

Melody detection algorithm

The melody detection algorithm in my receiver is based on a fast Fourier transform (FFT). When loss of carrier is detected, the last moments of the audio are searched for tones thusly:

[Image: A diagram illustrating how an FFT is used to search for a melody. The FFT in the image is noisy and some parts of the melody can not be measured.]
  1. The audio buffer is divided up into overlapping 60-millisecond Hann-windowed slices.
  2. Every slice is Fourier transformed and all peak frequencies (local maxima) are found. Their center frequencies are refined using Gaussian peak interpolation (Gasior & Gonzalez 2004). We need this, because we're only going to allow ±15 Hz of frequency error.
  3. The time series formed by the strongest maxima is compared to a list of pre-defined 'tone signatures'. Each candidate tone signature gets a score based on how many FFT slices match (+) corresponding slices of the tone signature. Slices with too much frequency error subtract from the score (−).
  4. Most tone signatures have one or more 'quiet zones', the quietness of which further contributes to the score. A quiet zone is usually placed after the tone, but some tones may also have a pause in the middle.
  5. The algorithm allows second and third harmonics (with half the score), because some transmitters may distort the tones enough for these to momentarily overpower the fundamental frequency.
  6. Every possible time shift (starting position) inside the 1.5-second audio buffer is searched.
  7. The tone signature with the best score is returned, if this score exceeds a set threshold.
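Steps 2, 3, 5 and 6 above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the receiver's actual code. It uses a naive DFT instead of an FFT to stay dependency-free, keeps only the single strongest peak per slice, ignores quiet zones, and uses made-up score weights (+1 for a match, +0.5 for a harmonic, −1 for a miss). The function names are hypothetical.

```python
import cmath
import math

TOLERANCE_HZ = 15.0  # allowed frequency error, from step 2


def strongest_peak_hz(samples, fs):
    """Frequency (Hz) of the strongest spectral peak in one audio slice."""
    n = len(samples)
    # Hann window, as in step 1.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(samples)]
    # Naive DFT magnitudes for bins 0 .. n/2-1 (an FFT would be faster).
    mags = [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2)]
    k = max(range(1, n // 2 - 1), key=lambda j: mags[j])
    # Gaussian interpolation: fit a parabola to the log-magnitudes of the
    # peak bin and its two neighbours to refine the centre frequency.
    a, b, c = (math.log(mags[j] + 1e-12) for j in (k - 1, k, k + 1))
    denom = a - 2 * b + c
    delta = 0.5 * (a - c) / denom if denom else 0.0
    return (k + delta) * fs / n


def score_signature(measured, signature, tol=TOLERANCE_HZ):
    """Best score of a tone signature over every possible time shift.

    measured:  strongest peak frequency of each slice, in Hz
    signature: expected frequency of each slice, in Hz
    Returns None if the signature is longer than the measurement.
    """
    best = None
    for shift in range(len(measured) - len(signature) + 1):
        score = 0.0
        for f_meas, f_sig in zip(measured[shift:], signature):
            if abs(f_meas - f_sig) <= tol:
                score += 1.0   # fundamental matches
            elif any(abs(f_meas - h * f_sig) <= tol for h in (2, 3)):
                score += 0.5   # 2nd or 3rd harmonic, half the score
            else:
                score -= 1.0   # too much frequency error
        best = score if best is None else max(best, score)
    return best
```

In a full detector, every known signature would be scored this way and the best one reported only if its score exceeds a threshold, as in step 7.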

This algorithm works quite well. It's not always able to detect the tones, especially if part of the melody is completely lost in noise, but it's good enough to be used for waterfall annotation. False positives are rare; most of them are detections of very short tone signatures that only consist of one or two beeps. My test dataset of 92 recorded transmissions yields only 5 false negatives and no false positives.

For example, this noisy recording:

was successfully recognized as having a ringtone (RNG), a roger beep of type a1, and CTCSS code XA:

Naming and classification

Because I love classifying stuff I've had to come up with a system for naming these roger tones as well. My current system uses a lower-case letter for classifying the tone into a category, followed by a number that differentiates similar but slightly different tones. This is a work in progress, because every now and then a new kind of tone appears.

My goal would be to map the melodies to specific manufacturers. I've only managed to map a few. Can you recognise any of these devices?

Class  Identified model    Recording
a      Cobra AM845         (a1)
c      Motorola TLKR T40   (c1)
h      Baofeng UV-5RC

I didn't list them all here, but there are even more samples. I've added some alarm tones there as well, and a list of all the tone signatures that I currently know of. (Why no full source code? FAQ)

In my rx log I also have an emoji classification system for CTCSS codes. This way I can recognize a familiar transmission faster. A few examples below (there are 38 different CTCSS codes in total):

[Image: Two-character codes grouped into categories and paired with emoji. Four categories, namely fruit, sound, mammals, and scary. The fruit category has codes beginning with an M, and emoji for different fruit, etc.]

Future directions

My project Trello currently holds mostly minor items, like adding the aforementioned emoji. But as the RasPi is not very powerful, the DSP chain could be made more efficient. Sometimes a block of samples gets dropped. Currently it uses a bandpass-sampled filterbank to separate the channels, exploiting aliasing to avoid CPU-intensive frequency shifting altogether:

This is quite fast. But the 1:20 decimation from the Airspy IQ data is done with SoX's 1024-point FIR filter and could possibly be done with fewer coefficients. Also, the RasPi has four cores, so half of the channels could be demodulated in a second thread. Currently all concurrency is thanks to SoX and pmrsquash being different processes.
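The aliasing idea behind bandpass sampling can be shown with a little arithmetic. The following is only a sketch under assumed numbers, not the actual DSP chain: an input rate of 125 kHz, an output rate of 12.5 kHz per channel, and hypothetical function names. A band-limited signal centred at a multiple of the output sample rate lands at 0 Hz after plain decimation, with no mixer needed.

```python
import cmath
import math


def alias_hz(freq_hz, fs_out):
    """Apparent frequency of a complex tone after sampling at fs_out."""
    return (freq_hz + fs_out / 2) % fs_out - fs_out / 2


def decimate_without_mixing(iq, factor):
    """Keep every factor-th sample.

    A band-pass filter that isolates one channel is assumed to have been
    applied already; without it, all channels would fold on top of each other.
    """
    return iq[::factor]


FS_IN = 125_000   # assumed input sample rate, Hz
FACTOR = 10       # assumed decimation factor, giving 12.5 kHz output

# A carrier 37.5 kHz above the tuned frequency is three output sample
# rates away, so it aliases straight down to 0 Hz.
carrier = [cmath.exp(2j * math.pi * 37_500 * i / FS_IN) for i in range(200)]
baseband = decimate_without_mixing(carrier, FACTOR)
```

Here `alias_hz(37_500, 12_500)` gives 0, and `alias_hz(38_500, 12_500)` gives 1000: a signal 1 kHz above the channel centre stays 1 kHz above baseband. Because neighbouring PMR446 channels are 12.5 kHz apart, the per-channel band-pass filter is what decides which channel ends up in the output.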

  1. Whoever designed the class d sound must have been a Half-Life fan: (first few seconds)!

  2. Maybe you find this interesting:

    1. Thanks :) Yes, reverse burst seems to always be there when PL codes are used.

  3. You would enjoy hackgreen. The displacement fields end up generating a sequence over nuclear reticules so you get a signal into the wire and that pletes into your own over some ranged wifi or something and you can literally find yourself in a wide band screener for free. Ecosia search for hackgreen. It's bad ass.

  4. Great work Oona, you are the best :-)

  5. Hi, for the tone E I can confirm it is from a Binatone style PMR radio, I have a few of those and they sound exactly the same. Not sure if there are more brands that sound same but I did hear this roger beep there.

    Hope it helps a bit :)

    Keep up the good work!

  6. Very interesting and well documented.
    I am wondering if there is an interesting use-case whereby we can create a set of command/data codes by using a combination of PL/CTCSS/DCS & these call melodies (or DTMF perhaps). These codes could be used in a low-cost long-range IOT application.
    Integrate a radio and en/decoder with MQTT or serial bus and a user of one of these radios could remotely control equipment at low cost, license-free.
    I'm in South Africa, we have PMR446 available and many farmers want to be able to control things, often in areas with no cellular/wifi etc coverage and they have these radios.
    I really fancy the idea of using my walkie-talkie to control stuff with a simple channel selection and short ptt.
    Confirmation could just be the melody or an mp3 voice.

  7. Hi, I enjoy reading your blog again and again.
    I had already asked myself this question a while ago, but I couldn't find an answer.
    In the block diagram in the "Future directions" paragraph, "Channel 1" section: You take the bandpass filtered analytical signal (the I/Q samples at 12.5ksps in cyan), apply a delay on one side (Z-1), do the conjugate on the other side and multiply the two before determining the argument.
    I don't understand this technique, what is it called? Is it the inverse of a Hilbert transform?
    I search and read books and publications in hopes of educating myself, but I always feel like I'm missing the basics. Also, you have to be resourceful and make good choices. It's fascinating and daunting at the same time.

    Thank you.

  8. I've got it! You should ignore my previous message. I missed an important point in the article: These are FM signals!
    The block I had a problem with is an FM demodulator. In other posts, you use a PLL for that.
    I didn't take the time to think about it, seeing "Audio" at the output of the block I thought it was a smart way to switch back from an analytical signal to a real signal.
    Thanks for your blog.

    1. Yes, I should label it for clarity! Don't know why I left a mystery block like that there.

      Sadly, this project has encountered some bad luck because the RasPi SD card keeps getting corrupted, and it's a pain to reinstall everything... :)

