Feb 1, 2014

Mystery signal from a helicopter

Last night, YouTube suggested a video for me. It was a raw clip from a news helicopter filming a police chase in Kansas City, Missouri. I quickly noticed weird interference in the audio, especially in the left channel, and thought it must be caused by the chopper's engine. I turned up the volume and realized it was not interference at all, but a mysterious digital signal! And off we went again.

The signal sits alone on the left audio channel, so I can completely isolate it. Judging from the spectrogram, the modulation scheme seems to be BFSK, switching the carrier between 1200 and 2200 Hz. I demodulated it by filtering it with a lowpass and highpass sinc in SoX and comparing outputs. Now I had a bitstream at 1200 bps.
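As a rough illustration of the idea, here is a minimal Python sketch of BFSK detection. It is not the SoX filter pipeline used above: instead of filtering, it correlates each bit period against the two tones and picks the stronger one. The 48 kHz sample rate, the function names and the assumption that bit boundaries are already known are all mine; the tone assignment (1200 Hz = 1, 2200 Hz = 0) follows the Bell 202 convention.

```python
import numpy as np

FS = 48000                  # sample rate in Hz (assumed, not from the video)
BAUD = 1200                 # bit rate from the post
MARK_HZ = 1200.0            # Bell 202 mark tone, read as 1
SPACE_HZ = 2200.0           # Bell 202 space tone, read as 0
SPB = FS // BAUD            # samples per bit


def modulate(bits):
    """Generate continuous-phase BFSK for a sequence of 0/1 bits."""
    freqs = np.repeat(np.where(np.asarray(bits) == 1, MARK_HZ, SPACE_HZ), SPB)
    phase = 2 * np.pi * np.cumsum(freqs) / FS
    return np.sin(phase)


def tone_level(window, freq):
    """Magnitude of the correlation between a window and one tone."""
    n = np.arange(len(window))
    return abs(np.sum(window * np.exp(-2j * np.pi * freq * n / FS)))


def demodulate(signal):
    """Decide each bit by comparing the mark and space tone levels."""
    bits = []
    for start in range(0, len(signal) - SPB + 1, SPB):
        window = signal[start:start + SPB]
        is_mark = tone_level(window, MARK_HZ) > tone_level(window, SPACE_HZ)
        bits.append(1 if is_mark else 0)
    return bits
```

A real recording would also need bit clock recovery before the windows line up with the bits; that step is left out here.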

The bitstream consists of packets of 47 bytes each, synchronized by start and stop bits and separated by repetitions of the byte 0x80. Most bits stay constant during the video, but three distinct groups of bytes contain varying data, marked blue below:

What could it be? Location telemetry from the helicopter? Information about the camera direction? Video timestamps?

The first guess seems to be correct. It is supported by the relationship between two of the three byte groups. If the first four bits of each byte are ignored, the data forms a smooth gradient of three-digit numbers in base 10. When plotted parametrically, they form an intriguing winding curve. It is very similar to this plot of the car's position (blue, yellow) along with viewing angles from the helicopter (green), derived from the video by magical image analysis (only the first few minutes shown):

When the received curve is overlaid with the car's location trace, we see that 100 steps on the curve scale correspond to exactly 1 minute of arc on the map!

Using this relative information, and the fact that the helicopter circled around the police station in the end, we can plot all the received data points in Google Earth to see the location trace of the helicopter:

Update: Apparently the video downlink to ground was transmitted using a transmitter similar to Nucomm Skymaster TX that is able to send live GPS coordinates. And this is how they seem to do it.

Update 2: Yes, it's 7-bit Bell 202 ASCII. I tried decoding it as 7-bit data earlier, ignoring parity, but must have gotten the bit order wrong! So I just chose a roundabout way and kept looking at the hex. When fully decoded, the stream says:

#L N390386 W09434208YJ
#L N390386 W09434208YJ
#L N390384 W09434208YJ
#L N390384 W09434208YJ
#L N390381 W09434198YJ
#L N390381 W09434198YJ
#L N390379 W09434188YJ

These are the full lat/lon pairs of coordinates (39° 3.86′ N, 94° 34.20′ W). Nucomm says the system enables viewing the helicopter "on a moving map system". Also, it could enable the receiving antenna to be locked onto the helicopter's position, to allow uninterrupted video downlink.
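For plotting, the decoded lines can be turned into decimal degrees. The following Python sketch assumes the layout suggested by the lines above: a hemisphere letter, then degrees (two digits for latitude, three for longitude), two digits of whole minutes, and the remaining digits as the fractional part of the minutes. The function name is hypothetical, and the trailing characters ("YJ") are ignored because their meaning is not known.

```python
import re

# Assumed layout: "#L N ddmmff W dddmmfff", where f digits are fractions of a minute.
LINE_RE = re.compile(r"#L ([NS])(\d{2})(\d{2})(\d+) ([EW])(\d{3})(\d{2})(\d+)")


def parse_position(line):
    """Return (latitude, longitude) in decimal degrees, or None if no match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    ns, lat_deg, lat_min, lat_frac, ew, lon_deg, lon_min, lon_frac = match.groups()
    lat = int(lat_deg) + float(lat_min + "." + lat_frac) / 60
    lon = int(lon_deg) + float(lon_min + "." + lon_frac) / 60
    if ns == "S":
        lat = -lat
    if ew == "W":
        lon = -lon
    return lat, lon


print(parse_position("#L N390386 W09434208YJ"))
```

Under these assumptions the first line lands at roughly 39.0643° N, 94.5701° W.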

Thanks to all the readers for additional hints!

Jan 13, 2014

Misleading representations of discrete-time signals

Many audio editor tools draw digitally sampled signals in ways that can be misleading or sometimes downright incorrect. I hope this post will be useful in explaining this recurring and confusing issue.

In Audacity, let's set the project sample rate to 48 kHz and generate a sinusoid tone at, say, 22.5 kHz. Using a tone near the Nyquist frequency and a high zoom level will make what I'm talking about quite apparent:

It appears as though our newly generated sinusoid is "pumping" or going up and down in amplitude! Of course this can't be true; it's just a pure sine wave and should be perfectly reconstructible as such, according to the sampling theorem. But what's happening here?

The problem lies in drawing straight lines between the samples (known as linear interpolation). This overlooks the fact that the original signal – in this case, coming from Audacity's signal generator – only contained frequencies below the Nyquist frequency. Such sharp corners would have been impossible.

A cleaner, more customary way of drawing discrete-time signals is shown below:

Here, only the samples are shown, and their order is discernible from the lines that connect them to the zero level. But we're not trying to claim anything about what's in between those samples (the value is actually undefined).

The "pumping" pattern we're still seeing emerges from the slowly changing phase difference between the sampling frequency and the signal; in other words, the sinusoid is being sampled at different points.
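The pattern can be checked numerically. With a 22.5 kHz tone at 48 kHz, each sample is sin(πn·15/16) = sin(πn − πn/16), so the sample magnitudes follow |sin(πn/16)|. That is an envelope repeating every 16 samples, or 3 kHz, which is twice the distance from the tone to the Nyquist frequency. A small Python sketch (variable names are mine):

```python
import numpy as np

FS = 48000      # sample rate in Hz
F = 22500       # tone frequency in Hz

n = np.arange(64)
samples = np.sin(2 * np.pi * F * n / FS)

# Predicted magnitude of each sample: a slow rectified sine, 16 samples long.
envelope = np.abs(np.sin(np.pi * n / 16))

# Rate of the apparent "pumping" in Hz.
pumping_rate_hz = 2 * (FS / 2 - F)

print(np.allclose(np.abs(samples), envelope), pumping_rate_hz)
```

The underlying sinusoid has constant amplitude; only the positions at which it gets sampled drift.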

A more faithful representation of the signal can be obtained by low-pass filtering the waveform shown by Audacity – after resampling at a higher rate – using a 'brick-wall' cut-off at the Nyquist frequency. This is equivalent to sinc interpolation, and it's pretty close to what actually happens in the sound card when the signal is played back. We can immediately see how the weird dot pattern was formed:

Sinc interpolation is computationally more complex (slower) than linear interpolation, which is probably why it isn't used by Audacity and others. Some audio editors, however, like Cool Edit, do use a better interpolation at high zoom levels.
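For a short block of samples, band-limited interpolation can be sketched in Python by zero-padding the spectrum. This is the periodic variant of sinc interpolation, so it treats the block as one period of a repeating signal; it is an illustration of the principle, not what any particular editor does. The function name and the upsampling factor are my own choices.

```python
import numpy as np


def upsample_fft(x, factor):
    """Band-limited upsampling of a real block by zero-padding its spectrum."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    padded = np.zeros(n * factor // 2 + 1, dtype=complex)
    padded[:len(spectrum)] = spectrum
    # irfft normalizes by the new length, so scale back up by the factor.
    return np.fft.irfft(padded, n * factor) * factor


FS = 48000
F = 22500
n = np.arange(320)                      # 320 samples hold exactly 150 cycles
coarse = np.sin(2 * np.pi * F * n / FS)
fine = upsample_fft(coarse, 8)          # same tone at 384 kHz

print(np.max(np.abs(fine)))
```

The upsampled waveform is a plain sinusoid of constant amplitude, with the original samples lying on it. For blocks that do not contain a whole number of cycles, this method produces edge artifacts, and the Nyquist bin of an even-length block would need separate handling.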

See also my related post, Rendering PCM with simulated phosphor persistence.

(And since it's been linked so many times – a great episode of Digital Show & Tell explains this and a lot more about sampled signals.)

Nov 24, 2013

Decoding radio-controlled bus stop displays

In the previous post I wrote about the 16 kbps data stream on FM broadcast frequencies, and my suspicion that it's being used by the bus and tram stop display system here in Helsinki. Now it's time to find out the truth.

I had the opportunity to observe a display stuck in the middle of its bootup sequence, displaying a version string. This revealed that the system is called IBus and it's made by the Swedish company Axentia. Sure enough, their website talks about DARC and how it requires no return channel, making it possible to use battery-powered displays in remote areas.

Not much else is said about the system, though; there are no specs for the proprietary protocol. So I implemented the five-layer DARC protocol stack in Perl and was left with a stream of fully error-corrected packets on top of Layer 5, separated into hundreds of subchannels. Some of these contained human-readable strings with names of terminal stations. They seemed like an easy starting point for reverse engineering.

Here's one such packet after dissection:

Each display seems to store in its memory a list of buses that can be expected to pass the stop, along with their destinations in both official languages. The above "bus & destination packets" are used to update the memory. This is done once a day for each display on a narrow-band subchannel, so that updating all the displays takes the whole day. The mapping of the bus stop ID to actual bus stops is not straightforward, and had to be guessed from the lists of buses, on a case-by-case basis.

A different kind of packet updates the remaining waiting time for each bus in minutes. This "minutes packet" is sent three times per minute for every display in the system.

These packets may contain waiting times for several displays using the same subchannel; the "bus & destination packet" contains an address to the correct minutes slot. A special flag is used to signify an unused slot; another flag indicates that the bus has no working GPS and that the time is an approximation based on timetables. This causes a tilde "~" to appear before the minutes field.

There's also an announcement packet that's used to send text messages to the displays. Often these will be about traffic disruptions and diverted routes. It's up to the displays to decide how to display the message; in the low-end battery-powered ones, the actual minutes function is (annoyingly) disabled for the duration of the message.

I have yet to figure out the meaning of some packet types with non-changing data.

A special subchannel contains test packets with messages such as "Tämä on Mono-Axentia-testiä... Toimiiko tää ees..." ("This is a Mono-Axentia test... Does this even work...") and "Määritykset ja tiedotteet tehty vain Monolla - ei mitään IBus:lla" ("Settings and announcements made with Mono only - nothing with IBus"), suggesting that they're planning to migrate from IBus to something called Mono. Interestingly, there's also a repeating test message in German – "Bus 61 nach Flughafen aus Haltestelle 1" ("Bus 61 to the airport from stop 1").

What good is it, you ask? Well, who wouldn't want a personal display repeater at home, telling when it's time to go?

Nov 5, 2013

Broadcast messages on the DARC side

By now I've rambled quite a lot about RDS, the data subcarrier on the third harmonic of the 19 kHz FM stereo pilot tone. And we know the second harmonic carries the stereo information. But is there something beyond that?

Using a wideband FM receiver, like an RTL-SDR, I can plot the whole demodulated spectrum of any station, say, from baseband to 90 kHz. Most often there is nothing but silence above the third pilot harmonic (denoted 3f here), which is the RDS subcarrier. But two local stations have this kind of a peculiar spectrogram:

There seems to be a lot more data on a pretty wide band centered on the fourth harmonic (4f)!

As so often with mysterious persistent signals, I got a lead by googling for the center frequency (76 kHz). So, meet the imaginatively named Data Radio Channel, or DARC for short. This 16,000 bps data stream uses level-controlled minimum-shift keying (L-MSK), which can be thought of as a kind of offset-quadrature phase-shift keying (O-QPSK) where consecutive bits are sent alternating between the in-phase and quadrature channels.

To find out more, I'll need to detect the signal first. Detecting L-MSK coherently is non-trivial ("tricky") for a hobbyist like me. So I'm going to cheat a little and treat the signal as continuous-phase but non-coherent binary FSK instead. This should give us good data, even though the bit error probability will be suboptimally high. I'll use a band-pass filter first; then a splitter filter to split the signal band into mark and space; detect both envelopes and calculate difference; lowpass filter with a cutoff at the bitrate; recover bit timing and synchronize using a PLL; and now we can just take the bits out by hard decision at bit intervals.

Oh yeah, we have a bitstream!

Decoder software for DARC does not exist, according to search engines. To read the actual data frames, I'll have to acquire block synchronization, regenerate the pseudorandom scrambler used and reverse its action, check the data against CRCs for errors, and implement the stack of layers that make up the protocol. The DARC specification itself is luckily public, albeit not very helpfully written; in contrast to the RDS standard, it contains almost no example circuits, data, or calculations. So, a little recap of Galois field mathematics and linear feedback shift registers for me.
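To make the descrambling step concrete, here is a generic additive scrambler in Python: a linear feedback shift register produces a pseudorandom bit sequence that is XORed onto the data, and applying the same operation again restores the original. The seed and tap positions below are illustrative assumptions, not values taken from the DARC specification, and the function names are mine.

```python
def lfsr_bits(seed, taps, count):
    """Return `count` output bits of a Fibonacci LFSR.

    seed: list of 0/1 register contents (stage 1 first).
    taps: 1-based stage numbers that are XORed to form the feedback bit.
    """
    state = list(seed)
    out = []
    for _ in range(count):
        out.append(state[-1])           # output is taken from the last stage
        feedback = 0
        for tap in taps:
            feedback ^= state[tap - 1]
        state = [feedback] + state[:-1]  # shift by one, feedback enters stage 1
    return out


def scramble(bits, seed, taps):
    """XOR the data with the LFSR sequence; the same call descrambles."""
    key = lfsr_bits(seed, taps, len(bits))
    return [b ^ k for b, k in zip(bits, key)]


SEED = [1, 0, 1, 0, 1, 0, 1, 0, 1]      # illustrative 9-bit seed
TAPS = (9, 4)                           # illustrative taps, x^9 + x^4 + 1

data = [1, 1, 0, 0, 1, 0, 1, 1] * 8
assert scramble(scramble(data, SEED, TAPS), SEED, TAPS) == data
```

In this structure the first few output bits come straight from the seed, whatever the taps are. A decoder with the right seed but wrong taps would therefore still get the beginning of each block right.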

But what does it all mean? And who's listening to it?

Now, it's no secret (Signaali 3/2011, p. 16) that the HSL bus stop timetable displays in Helsinki get their information about the next arriving GPS-positioned buses through this very FM station, YLE 1. That's all they're telling, though. I haven't found anything bus-related in the RDS data, so it's quite likely that they're listening to DARC.

DARC is known to be used in various other applications as well, such as DGPS correction signals. So decoding it could prove interesting.

Sniffing the packet types being sent, it seems that quite a big part of the time is being used to send a transmit cycle schedule (COT and SCOT). And indeed, the aforementioned magazine article hints that the battery-powered displays use a precise radio schedule to save power, receiving only during a short period every minute each. (This grep only lists metadata type packets, not the actual data.)

darcdec git:(master) perl darcdec-filters.pl|grep Type:|sort|uniq -c
  88 Type: 0 Channel Organization Table (COT)
   8 Type: 1 Alternative Frequency Table (AFT)
   8 Type: 2 Service Alternative Frequency Table (SAFT)
   1 Type: 4 Service Name Table (SNT)
   8 Type: 5 Time and Date Table (TDT)
 112 Type: 6 Synchronous Channel Organization Table (SCOT)

At the moment I'm only getting sensible data out at the very beginning of each packet (or "block"). I do get a solid, error-free block sync but, for example, the date is consistently showing the year 1922 and all CRCs fail. Other fields are similarly weird but consistent. This could mean that I've still got the descrambler PN polynomial wrong, only succeeding when it's using bits from the initialization seed. And this can only mean many more sleepless coding nights ahead.

(Continued in the next post)

(Pseudotags for Finns: HSL:n pysäkkinäytöt eli aikataulunäytöt.)

Sep 16, 2013

The burger pager

A multinational burger chain has a restaurant nearby. One day I (exceptionally) went there and ordered a take-away burger that was not readily available. The clerk gave me a device that looked like a thick coaster, and told me I could fetch the burger from the counter when the coaster starts blinking its lights and making noises.

Of course, this device deserved a second look! (We can forget about the burger now.) The device looked like a futuristic coaster with red LEDs all around it. I've blurred the text on top of it for dramatic effect.

Several people in the restaurant were waiting for orders with their similar devices, which suggested to me this could be a pager system of some sort. Turning the receiver over, we see stickers with interesting information, including a UHF carrier frequency.

For situations like this I often carry my RTL2832U-based television receiver dongle with me (the so-called rtl-sdr). Luckily this was one of those days! I couldn't help but tune in to 450.2500 MHz and see what's going on there.

And indeed, just before a pager went off somewhere, this burst could be heard on the frequency (FM demodulated audio):

Googling the device's model number, I found out it's using POCSAG, a common asynchronous pager protocol at 2400 bits per second. The modulation is binary FSK, which means we should be able to directly read the data from the above demodulated waveform, even by visual inspection if necessary. And here's the data.

There's no actual message being paged, just an 18-bit device address. It's preceded by a preamble of alternating 1's and 0's that the receiver can grab onto, and a fixed synchronization codeword that tells where the data begins. It's followed by numerous repetitions of the idle codeword.
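The structure of a single POCSAG codeword can be sketched in Python. A codeword is 32 bits: one flag bit (0 for an address), 18 address bits, 2 function bits, 10 BCH(31,21) check bits and one even parity bit. The generator polynomial and the idle codeword value are the standard ones as far as I know; the function names are my own, and the sketch only builds and checks individual codewords.

```python
GEN = 0x769                  # BCH generator x^10 + x^9 + x^8 + x^6 + x^5 + x^3 + 1
IDLE_CODEWORD = 0x7A89C197   # standard POCSAG idle codeword


def encode_codeword(data21):
    """Append 10 BCH check bits and an even parity bit to 21 data bits."""
    reg = data21 << 10
    for bit in range(30, 9, -1):         # polynomial long division over GF(2)
        if reg & (1 << bit):
            reg ^= GEN << (bit - 10)
    word31 = (data21 << 10) | (reg & 0x3FF)
    parity = bin(word31).count("1") & 1  # makes the total number of ones even
    return (word31 << 1) | parity


def address_codeword(address18, function=0):
    """Build an address codeword (flag bit 0) from 18 address bits."""
    return encode_codeword(((address18 & 0x3FFFF) << 2) | (function & 3))


def decode_address(codeword):
    """Return (address18, function), or None if invalid or not an address."""
    if codeword != encode_codeword(codeword >> 11):
        return None                      # check bits or parity do not match
    if codeword >> 31:
        return None                      # flag bit 1 marks a message codeword
    return (codeword >> 13) & 0x3FFFF, (codeword >> 11) & 3


# The idle codeword is itself a valid codeword, which makes a handy self-test.
assert encode_codeword(IDLE_CODEWORD >> 11) == IDLE_CODEWORD
```

Error correction, which the BCH code also allows, is not attempted here; a codeword with any bit error is simply rejected.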

The next question is inevitable. How much havoc would ensue if someone were to loop through all 262,144 possible addresses and send a message like this? I'll leave it as hypothetical.

Sep 8, 2013

LAN file transfer with netcat

Need to quickly transfer a file from one computer to another? They don't have AirDrop and you can't find a memory stick? No worries; netcat comes to the rescue. This tip works on Linux as well as OS X.

I'm going to suppose you're on the same LAN, the subnet address space begins with 192.168, and you want to transfer a file called file.tar. First, on the receiving computer, type ifconfig|grep 192.168 to find out its IP address. Then make netcat listen on a port by typing nc -l 12345 > file.tar (some netcat variants want nc -l -p 12345 instead). On the sending side, type nc 192.168.x.x 12345 < file.tar, replacing 192.168.x.x with the receiver's IP address (and using whatever the file name is). And magic happens!
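If netcat isn't installed, roughly the same thing can be done with Python's standard socket module. This is only a sketch of the two roles under my own assumptions (function names, Python 3.8 or newer, no encryption or authentication), not a replacement for nc.

```python
import socket


def receive_file(port, out_path, host="0.0.0.0"):
    """Listen on a port, accept one connection and save what it sends."""
    with socket.create_server((host, port)) as server:
        conn, _addr = server.accept()
        with conn, open(out_path, "wb") as out:
            while chunk := conn.recv(65536):
                out.write(chunk)


def send_file(host, port, in_path):
    """Connect to the receiver and stream the file contents to it."""
    with socket.create_connection((host, port)) as conn:
        with open(in_path, "rb") as src:
            conn.sendfile(src)
```

On the receiving computer one would call receive_file(12345, "file.tar"), and on the sending side send_file with the receiver's IP address, 12345 and "file.tar". As with netcat, the transfer ends when the sender closes the connection.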

Jul 13, 2013

Squelch it out

Many air traffic and maritime radio channels, like 2182 kHz and VHF 16, are being monitored and continuously recorded by numerous vessels and ground stations. Even though gigabytes are now cheap, long recordings would have to be compressed to make them easier to move around.

There are some reasons not to use lossy compression schemes like MP3 or Vorbis. Firstly, radio recordings are used in accident investigations, and additional artifacts at critical moments could hinder investigation. Secondly, lossy compression performs poorly on noisy signals, because of the amount of entropy present. Let's examine some properties of such recordings that may aid in their lossless compression instead.

A common property of radio conversations is that the transmissions are short, and they're followed by sometimes very long pauses. Receivers often silence themselves in the absence of a carrier (known as 'squelching'), so there's nothing interesting to record during that time.

This can be exploited by replacing such quiet passages with absolute digital silence, that is, a stream of zero-valued samples. Now, using lossless FLAC, passages of digital silence compress into a very small space because of run-length encoding; the encoder simply has to save the number of zero samples encountered.
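The silencing step can be sketched in a few lines of Python. This is a simplified illustration, not the actual tool used later in this post: it looks at fixed-size blocks of 16-bit samples and zeroes every block whose peak stays at or below a threshold. The function name, the threshold and the block size are assumptions.

```python
import numpy as np


def squelch_blocks(samples, threshold=5, block=64):
    """Replace low-level blocks of int16 samples with digital silence."""
    out = samples.copy()
    for start in range(0, len(out), block):
        chunk = out[start:start + block]        # a view into `out`
        peak = np.max(np.abs(chunk.astype(np.int32)))
        if peak <= threshold:
            chunk[:] = 0
    return out


rng = np.random.default_rng(0)
noise = rng.integers(-2, 3, size=1024).astype(np.int16)   # LSB-level noise
t = np.arange(1024)
voice = (10000 * np.sin(2 * np.pi * 1000 * t / 11025)).astype(np.int16)

recording = np.concatenate([noise, voice, noise])
cleaned = squelch_blocks(recording)
print(np.count_nonzero(cleaned[:1024]), np.count_nonzero(cleaned[2048:]))
```

A practical version would also want some hold time, so that short dips within a transmission are not cut out.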

Real-life test

Let's see how the method performs on an actual recording. A sample WAV file will be recorded from the radio, from a conversation on VHF FM. The SDR does the actual squelching, and then the PCM waveform will be piped through squelch, a silencer tool I wrote.

rtl_fm      -f 156.3M -p 96 -N -s 16k -g 49.2 -l 200 |\
  sox       -r 16k -t raw -e signed -b 16 -c 1 -V1 -\
               -r 11025 -t raw - sinc 200-4500 |\
  squelch   -l 5 -d 4096 -t 64 |\
  sox       -t raw -e signed -c 1 -b 16 -r 11025 - recording.wav

The waveform is sinc filtered so as to remove the DC offset resulting from FM demodulation without a PLL. This makes silence detection simpler. It also removes any noise outside the actual bandwidth of VHF voice transceivers. (The output of the first SoX at quiet passages is not digital silence; the resample to the common rate of 11,025 Hz introduces some LSB noise. This is what we're squelching in this example.)

30 minutes of audio was recorded. A carrier was on less than half of the time. FLAC compresses the file at a ratio of 0.174; this results in an average bitrate of 30 kbps, which is pretty good for lossless 16-bit audio. At this rate, a CD can fit almost 50 hours of recording.
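The figures can be checked with a little arithmetic, written here as Python. The sample rate, bit depth and compression ratio come from the test above; the CD capacity of 650 MiB is my assumption.

```python
SAMPLE_RATE = 11025          # Hz, mono
BITS_PER_SAMPLE = 16
FLAC_RATIO = 0.174           # compressed size / original size
CD_BYTES = 650 * 1024 ** 2   # assumed CD capacity

raw_bps = SAMPLE_RATE * BITS_PER_SAMPLE      # 176,400 bits per second
flac_bps = raw_bps * FLAC_RATIO              # about 30.7 kbps
cd_hours = CD_BYTES * 8 / flac_bps / 3600    # a bit over 49 hours

print(round(flac_bps / 1000, 1), round(cd_hours, 1))
```

The result depends heavily on how much of the time the channel is quiet, so a busier channel would compress less well.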