CTCSS fingerprinting: a method for transmitter identification

Identifying unknown radio transmitters by their signals is called radio fingerprinting. It is usually based on rise-time signatures, i.e. characteristic differences in how the transmitter frequency fluctuates at carrier power-up. Here, instead, I investigate the fingerprintability of another feature in hand-held FM transceivers, known as CTCSS or Continuous Tone-Coded Squelch System.

Motivation & data

I came across a long, losslessly compressed recording of some walkie-talkie chatter and wanted to know more about it, things like the number of participants and who's talking with who. I started writing a transcript – a fun pastime – but some voices sounded so similar I wondered if there was a way to tell them apart automatically.

[Image: Screenshot of Audacity showing an audio file over eleven hours long.]

The file comprises several thousand short transmissions as FM demodulated audio lowpass filtered at 4500 Hz. Signal quality is variable; most transmissions are crisp and clear but some are buried under noise. Passages with no signal are squelched to zero.

I considered several potentially fingerprintable features, many of them unrealistic:

  • Carrier power-up; many transmissions were missing the very beginning because of squelch
  • Voice identification; would probably require pretty sophisticated algorithms (too difficult!) and longer samples
  • Mean audio power; not consistent enough, as it depends on text, tone of voice, etc.
  • Maximum audio power; too sensitive to peaks in FM noise

I then noticed all transmissions had a very low tone at 88.5 Hz. It turned out to be CTCSS, an inaudible signal that enables handsets to silence unwanted transmissions on the same channel. This gave me an idea inspired by mains frequency analysis: Could this tone be measured to reveal minute differences in crystal frequencies and modulation depths? Also, knowing that these were recorded using a cheap DVB-T USB stick – would it have a stable enough oscillator to produce consistent measurements?

Measurements

I used the liquid-dsp library for signal processing. It has several methods for measuring frequencies. I decided to use a phase-locked loop, or PLL; I could have also used FFT with peak interpolation.

In my fingerprinting tool, the recording is first split into single transmissions. The CTCSS tone is bandpass filtered and a PLL starts tracking it. When the PLL frequency stops fluctuating, i.e. the standard deviation is small enough, it's considered locked and its frequency is averaged over this time. The average RMS power is measured similarly.

Here's one such transmission:

[Image: A graph showing frequency and power, first fluctuating but then both stabilize for a moment, where text says 'PLL locked'. Caption says 'No, I did not copy'.]
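The tracking and lock-detection steps described above can be sketched roughly as follows. This is a minimal pure-Python sketch, not the liquid-dsp code used for the measurements. It assumes the tone has already been bandpass filtered and converted to complex (analytic) samples. The function names, loop gains, window length and lock threshold are all illustrative assumptions.

```python
import cmath
import math
import statistics

def pll_track(samples, fs, f0, alpha=0.05):
    """Track a complex tone with a second-order PLL.

    Returns the loop's frequency estimate in Hz after each sample.
    alpha is an assumed loop gain; beta = alpha^2 / 2 is a common rule of thumb.
    """
    beta = alpha * alpha / 2
    phase = 0.0
    freq = 2 * math.pi * f0 / fs  # rad/sample, start at the nominal tone
    track = []
    for x in samples:
        # Phase error between the input and the local oscillator
        err = cmath.phase(x * cmath.exp(-1j * phase))
        freq += beta * err
        phase = (phase + freq + alpha * err) % (2 * math.pi)
        track.append(freq * fs / (2 * math.pi))
    return track

def locked_average(track, window=200, max_std=0.01):
    """Average the frequency over every window whose standard deviation
    is below max_std (treated as 'locked'). Returns None if never locked."""
    locked = []
    for start in range(0, len(track) - window + 1, window):
        chunk = track[start:start + window]
        if statistics.pstdev(chunk) < max_std:
            locked.extend(chunk)
    return statistics.fmean(locked) if locked else None

# Synthetic, noise-free test tone 0.24 Hz below the nominal 88.5 Hz
fs = 1000.0
true_freq = 88.5 - 0.24
tone = [cmath.exp(2j * math.pi * true_freq * n / fs) for n in range(3000)]
estimate = locked_average(pll_track(tone, fs, f0=88.5))
print(f"{estimate:.3f} Hz")
```

With this clean synthetic tone the estimate should land close to 88.26 Hz. Real recordings would need the thresholds tuned to the noise level, and the RMS power could be averaged over the same locked windows.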

Results

When all transmissions are plotted according to their CTCSS power and frequency relative to 88.5 Hz, we get this:

[Image: A plot of RMS power versus frequency, with dots scattered all over, but mostly concentrated in a few clusters.]

At least three clusters are clearly distinguishable by eye. Zooming in to one of the clusters reveals it's made up of several smaller clusters. Perhaps the larger clusters correspond to three different models of radios in use, and these smaller ones are the individual transmitters?

A heat map reveals even more structure:

[Image: The same clusters presented in a gradual color scheme and numbered from 1 to 12.]

It seems at least 12 clusters, i.e. potential individual transmitters, can be distinguished.

Even though most transmissions are part of some cluster, there are many outliers as well. These appear to correspond to very noisy or very short transmissions. (Could the FFT have produced better results with these?)

Use as transcription aid

My goal was to make these fingerprints useful as labels aiding transcription. This way, a human operator could easily distinguish parties of a conversation and add names or call signs accordingly.

I experimented with automated k-means clustering, but that didn't immediately produce appealing results. Then I manually assigned 12 anchor points at apparent cluster centers and had a script calculate the nearest anchor point for all transmissions. Prior to distance calculations the axes were scaled so that the data seemed uniformly distributed around these points.
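The nearest-anchor step might look something like the sketch below. The anchor coordinates, scale factors and function name are hypothetical values chosen for illustration, not the ones used for the recording.

```python
import math

def nearest_anchor(point, anchors, scale):
    """Return the index of the anchor closest to point.

    point and anchors are (frequency offset in Hz, RMS power) pairs.
    Each axis is divided by its scale factor before measuring distance,
    so that neither axis dominates just because of its units.
    """
    def dist(anchor):
        return math.hypot((point[0] - anchor[0]) / scale[0],
                          (point[1] - anchor[1]) / scale[1])
    return min(range(len(anchors)), key=lambda i: dist(anchors[i]))

# Hypothetical anchors picked by eye at apparent cluster centers
anchors = [(-0.24, 400e-6), (0.05, 700e-6), (0.10, 300e-6)]
scale = (0.02, 50e-6)  # assumed typical cluster spread along each axis

label = nearest_anchor((0.09, 650e-6), anchors, scale)
unscaled = nearest_anchor((0.09, 650e-6), anchors, (1.0, 1.0))
print(label, unscaled)
```

The second call shows why scaling matters: without it the frequency axis dominates and the same point is assigned to a different anchor.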

This automatic labeling proved quite sensitive to errors. It could be useful when listing possible transmitters for an unknown transmission with no context; distances to previous transmissions positively mentioning call signs could be used. Instead I ended up printing the raw coordinates and colouring them with a continuous RGB scale:

[Image: A few lines from a conversation between Boa 1 and Cobra 1. Numbers in different colors are printed in front of each line.]

Here the colours make it obvious which party is talking. Call signs written in a darker shade are deduced from the context. One sentence, most probably by "Cobra 1", gets lost in noise and the RMS power measurement becomes inaccurate (463e-6). The PLL frequency is still consistent with the conversation flow, though.
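One way to colour a coordinate on a continuous scale is sketched below. It is only an assumption about how such a mapping could be done: the hue ramp, the value range and the function names are made up for illustration, and the output uses 24-bit ANSI escape codes, which not every terminal supports.

```python
import colorsys

def coord_colour(freq_offset, lo=-0.3, hi=0.3):
    """Map a frequency offset in Hz to an (r, g, b) tuple of 0-255 ints.

    The offset is clamped to [lo, hi] (assumed range) and mapped
    linearly onto hues from red (0) through green to blue (2/3).
    """
    t = min(1.0, max(0.0, (freq_offset - lo) / (hi - lo)))
    r, g, b = colorsys.hsv_to_rgb(t * 2 / 3, 1.0, 1.0)
    return round(r * 255), round(g * 255), round(b * 255)

def coloured(text, rgb):
    """Wrap text in a 24-bit ANSI foreground colour escape sequence."""
    return "\x1b[38;2;{};{};{}m{}\x1b[0m".format(rgb[0], rgb[1], rgb[2], text)

# Hypothetical measurement: tone 0.24 Hz below nominal
print(coloured("-0.240", coord_colour(-0.240)))
```

Because nearby coordinates get nearby colours, a borderline measurement shows up as an in-between shade rather than being forced into the wrong label.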

Countermeasures

If CTCSS is not absolutely required in your network, i.e. there are no unwanted conversations on the frequency, then it can be disabled to prevent this type of fingerprinting. In Motorola radios this is done by setting the CTCSS code to 0. (In the menus it may also be called a PT code or Interference Eliminator code.) In many other consumer radios it doesn't seem to be that easy.

Conclusions

CTCSS is a suitable signal for fingerprinting transmitters, reflecting minute differences in crystal frequencies and, possibly, FM modulation indices. Even a cheap receiver can recover these differences. It can be used when the signal is already FM demodulated or otherwise not suitable for more traditional rise-time fingerprinting.

14 comments:

  1. Funny coincidence :) I too have been experimenting with fingerprinting Wi-Fi packets from their Carrier Frequency Offset:

    https://rftap.github.io/blog/2016/09/01/rftap-wifi.html

    The -0.24 Hz offset is surprising - it corresponds to a ~2700 ppm error (0.24/88.5), which would be way too much for an RF carrier frequency error. This probably means the transmitter in question is using two different clock sources - a high accuracy (10-20 ppm) clock for generating the RF FM carrier, and a low accuracy clock (RC?) for generating the CTCSS tone.

    Replies
    1. Yes, I saw your blog post a while before this was completed! I guess you inspired me to hit Publish :) Also I notice you've written about FM subcarriers, my pet subject.

      It would have been interesting to have access to the raw signal and measure RF differences as well.

    2. Oona, have you posted a step-by-step tutorial on how you've accomplished all of this? I'd very much like to do this myself.

  2. m.youtube.com/watch?v=SyMUTqRQZPA My last cyber talk was on fingerprinting devices from the IR signature of the screen's proximity detector. I am gonna use some of your article's techniques to do more, thanks.

  3. Hi, in this and the previous post there are some nice looking graphs and plots. I'm wondering, what tools were used to generate them? Is it something widely available, or did you write it yourself?

    And by the way, your blog is very cool, I read it all in one night some time ago, when I was supposed to study for my finals :D

    Replies
    1. I'm happy you've liked my blog - sorry about your finals :)

      The first picture in the previous post is a baudline screenshot. The next ones were made in gnuplot and modified (just recolored with "invert" and "hue/saturation") in Gimp.

      The frequency/power plot in this post was also made in gnuplot and modified in Gimp (colors, gray box, text).

      The next one (green) is also a gnuplot plot.

      I think I made the heatmap by printing out an 8-bit Perl array to ImageMagick's convert tool (convert -t .gray). It's perhaps my favourite way to quickly visualize 2D arrays. This produced a grayscale image that I then colored using Gimp's Gradient map, and added the numbers.

  4. Oona, another radio fingerprinting scheme was invented by Phil Ferrel, K7PF. It was based on the repeatable frequency change of the RF carrier when first keyed. The patent was later assigned to Boeing. The technique is used on some US amateur repeater stations.

  5. This is a fantastic post. Transmitter fingerprinting is such an overlooked technology, as well as a potential threat in some situations.

    Historically, US and former Soviet submarines have used extremely low frequency comms for decades (trailing kilometers-long antennas behind them), because ELF is poorly attenuated by seawater. The US used 40-80 Hz.

    It's been repeatedly alleged that because the data rates were so incredibly low, they used one-time pads for crypto, which if deployed properly are provably unbreakable. This isn't verified by supporting evidence, but seems possible.

    So the US, and presumably the Soviets too, used transmitter fingerprinting for traffic analysis. They couldn't tell what was being sent, and triangulating ELF is apparently hideously difficult, but the fact that transmitter X transmitted Y bits at time Z was useful data. They determined X by fingerprinting, obviously.

    In the late 90's I heard that the CIA was trying to commercialize this technology, and I was told by a friend in Silicon Valley of unclassified demos he'd seen (circa 1998-1999) where they were fingerprinting cell phones and were able to detect even slight movements in the phones and calculate a rough displacement shift ("the cell phone on the left has been moved about a foot to the right.") He wasn't shown the receiving infrastructure, which was remote to the demo, and the company was selling this for the usual incoherent mix of drug/child abuse/terrorist/money-laundering/insert-bogeyman-here crime detection. This is fairly typical for ex-spook tech.

    Thanks, Oona. As I said on Twitter, awesome as always!

    Ian.

  6. Another parameter possibly worth exploring is the fact that, during a conversation, any single unit is unlikely to transmit twice in a row. This might provide further disambiguation among individual transmitters.

    Replies
    1. Good point - though a low signal often triggers the squelch several times during a single transmission.

  7. There was a company decades ago that marketed an ISA plug in board (crude A/D convertor) and software to fingerprint the characteristics of transmitters that interfered with ham repeaters. This early work is described in this article:


    http://kb9mwr.blogspot.com/2008/04/transmitter-fingerprinting.html

  8. While we are on the subject of RF fingerprinting: there was discussion in the field of transportation about using electromagnetic fingerprinting to identify specific automobiles driving over a magnetic coil buried in the pavement. The theory was that each vehicle would have a distinct electromagnetic profile. Most certainly specific models might have an identifiable fingerprint resulting from, for example, the speed of the alternator and emissions from CPUs and fuel injectors.

  9. Couldn't you also assume that one continuous transmission is likely to stay within a pretty narrow power band, since there are limits to how much a transmitter moves while it's operating? That means you could also factor in time to possibly give more contrast to points very close to each other.

    I.e., in a small scope of time you contrast clusters more than over a longer time period (you could for example use a sliding window algorithm).

    Replies
    1. Pretty interesting, this could definitely be exploited. However, in this case I only had access to FM demodulated audio, where information on signal strength is lost.

