Decoding radio-controlled bus stop displays

In the previous post I wrote about the 16 kbps data stream on FM broadcast frequencies, and my suspicion that it's being used by the bus and tram stop display system here in Helsinki. Now it's time to find out the truth.

I had the opportunity to observe a display stuck in the middle of its bootup sequence, displaying a version string. This revealed that the system is called IBus and it's made by the Swedish company Axentia. Sure enough, their website talks about DARC and how it requires no return channel, making it possible to use battery-powered displays in remote areas.

[Image: Photo of a liquid crystal display showing the text: 8589 IBUS 6.5.70d]

Not much else is said about the system, though; there are no specs for the proprietary protocol. So I implemented the five-layer DARC protocol stack in Perl and was left with a stream of fully error-corrected packets on top of Layer 5, separated into hundreds of subchannels. Some of these contained human-readable ISO-8859-1 strings with names of terminal stations. They seemed like an easy starting point for reverse engineering.
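Spotting such strings in a pile of unknown packets can be sketched roughly as follows, in Python rather than the Perl used for the actual decoder. This is only an illustration: the function name and the sample bytes are made up, and the real field layout is not assumed here.

```python
import re

def latin1_strings(packet, min_len=3):
    """Return runs of printable ISO-8859-1 characters found in a packet."""
    # Printable ASCII plus the printable upper half of Latin-1 (0xA0-0xFF).
    pattern = re.compile(rb"[\x20-\x7e\xa0-\xff]{%d,}" % min_len)
    return [m.group().decode("latin-1") for m in pattern.finditer(packet)]

# Hypothetical packet fragment: binary fields around three text fields.
sample = (b"\x00\x12" + b"23N" + b"\x00"
          + "RUSKEASUO".encode("latin-1") + b"\x00"
          + "BRUNAKÄRR".encode("latin-1") + b"\x03")

print(latin1_strings(sample))
```

A minimum run length of three keeps short line numbers like "23N" while skipping most random binary bytes.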

Here's one such packet after dissection:

[Image: An infographic titled 'Bus & Destination Packet', showing the hex bytes of a 64-byte packet, divided into fields that are labeled according to their apparent purpose. There are counters, size fields, the bus stop identifier, bus line identifiers, and apparent references to other types of packets. Several fields containing Latin-1 text are also transcribed. The text in them reads '23N RUSKEASUO BRUNAKÄRR'.]

Each display seems to store in its memory a list of buses that can be expected to pass the stop, along with their destinations in both official languages. The above "bus & destination packets" are used to update the memory. This is done once a day for each display on a narrow-band subchannel, so that updating all the displays takes the whole day. The mapping of the bus stop ID to actual bus stops is not straightforward, and had to be guessed from the lists of buses, on a case-by-case basis.

A different kind of packet updates the remaining waiting time for each bus in minutes. This "minutes packet" is sent three times per minute for every display in the system.

[Image: An infographic titled 'Minutes Packet', showing another type of labeled packet. Most fields are one byte long and they contain the number of minutes until the next bus arrives, in 7 bits, and one bit telling whether this estimate is based on positioning or time tables. Information about which number belongs to which bus is contained in the references from other types of packets.]

These packets may contain waiting times for several displays using the same subchannel; the "bus & destination packet" contains an address to the correct minutes slot. (The subchannel address is signaled on a lower protocol layer.) A special flag is used to signify an unused slot; another flag indicates that the bus has no working GPS and that the time is an approximation based on timetables. The latter causes a tilde "~" to appear before the minutes field. In other words, all calculation is done centrally and the displays only show the numbers they hear on the radio.
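Turning one minutes slot into display text could look something like this sketch. The 7-bit minutes value and the one-bit timetable flag come from the packet description above; which bit carries the flag, and the value that marks an unused slot, are assumptions made up for this example.

```python
TIMETABLE_FLAG = 0x80  # assumption: the flag sits in the top bit
UNUSED_SLOT = 0xFF     # assumption: hypothetical marker for an unused slot

def format_minutes(slot_byte):
    """Render one minutes slot the way a display might show it."""
    if slot_byte == UNUSED_SLOT:
        return ""
    minutes = slot_byte & 0x7F  # lower 7 bits: minutes until arrival
    estimated = bool(slot_byte & TIMETABLE_FLAG)
    # A tilde marks a timetable-based estimate (no GPS position).
    return ("~" if estimated else "") + str(minutes)

print(format_minutes(0x8C))  # timetable-based estimate of 12 minutes
print(format_minutes(0x06))  # GPS-based estimate of 6 minutes
```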

There's also an announcement packet that's used to send text messages to the displays. Often these will be about traffic disruptions and diverted routes. It's up to the displays to decide how to display the message; in the low-end battery-powered ones, the actual minutes function is (annoyingly) blanked for the duration of the slowly scrolling message.

I have yet to figure out the meaning of some packet types with non-changing data.

A special subchannel contains test packets with messages such as "Tämä on Mono-Axentia-testiä... Toimiiko tää ees..." and "Määritykset ja tiedotteet tehty vain Monolla - ei mitään IBus:lla" suggesting that they're planning to migrate from IBus to something called Mono. Interestingly, there's also a repeating test message in German – "Bus 61 nach Flughafen aus Haltestelle 1".

What good is it, you ask? Well, who wouldn't want a personal display repeater at home, telling when it's time to go?

[Image: Photo of a small liquid crystal display kit with its PCB showing, obviously home-soldered to a bunch of wires, and displaying the text: '72 TAPANILA ~12'.]

Broadcast messages on the DARC side

By now I've rambled quite a lot about RDS, the data subcarrier on the third harmonic of the 19 kHz FM stereo pilot tone. And we know the second harmonic carries the stereo information. But is there something beyond that?

Using a wideband FM receiver, like an RTL-SDR, I can plot the whole demodulated spectrum of any station, say, from baseband to 90 kHz. Most often there is nothing but silence above the third pilot harmonic (denoted 3f here), which is the RDS subcarrier. But two local stations have this kind of a peculiar spectrogram:

[Image: A spectrogram showing a signal at the audible frequency range, labeled 'mono', and four carriers centered at 19, 38, 57, and 76 kHz, labeled pilot, 2f, 3f, and 4f, respectively. Pilot is a pure sinusoid; 2f and 3f are several kHz wide signals with mirrored sidebands; and 4f is 20 kHz wide and resembles wideband noise.]

There seems to be a lot more data on a pretty wide band centered on the fourth harmonic (4f)!

As so often with mysterious persistent signals, I got a lead by googling for the center frequency (76 kHz). So, meet the imaginatively named Data Radio Channel, or DARC for short. This 16,000 bps data stream uses level-controlled minimum-shift keying (L-MSK), which can be thought of as a kind of offset quadrature phase-shift keying (OQPSK) where consecutive bits are sent alternating between the in-phase and quadrature channels.

To find out more, I'll need to detect the signal first. Detecting L-MSK coherently is non-trivial ("tricky") for a hobbyist like me. So I'm going to cheat a little and treat the signal as continuous-phase but non-coherent binary FSK instead. This should give us good data, even though the bit error probability will be higher than with coherent detection. I'll use a band-pass filter first; then a splitter filter to split the signal band into mark and space; detect both envelopes and calculate their difference; low-pass filter the result with a cutoff at the bitrate; recover bit timing and synchronize using a PLL; and then just take the bits out by hard decision at the bit middle points.
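The core idea, comparing the mark and space envelopes and making a hard decision, can be sketched in plain Python. This is a simplified stand-in, not the filter chain described above: a per-bit correlator replaces the splitter filters and envelope detectors, bit timing is assumed to be known instead of being recovered with a PLL, and the sample rate, tone frequencies (76 kHz plus or minus 4 kHz) and the mapping of the higher tone to a 1 bit are assumptions for the example.

```python
import cmath
import math

FS = 256_000      # sample rate in Hz (assumed; gives 16 samples per bit)
BITRATE = 16_000  # DARC bit rate
F_MARK = 80_000   # assumed tone for a 1 bit
F_SPACE = 72_000  # assumed tone for a 0 bit
SPB = FS // BITRATE  # samples per bit

def modulate(bits):
    """Generate a continuous-phase FSK test signal, one tone per bit."""
    phase, samples = 0.0, []
    for bit in bits:
        step = 2 * math.pi * (F_MARK if bit else F_SPACE) / FS
        for _ in range(SPB):
            samples.append(math.cos(phase))
            phase += step
    return samples

def envelope(window, freq):
    """Magnitude of the window's correlation with a complex tone at freq."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, s in enumerate(window))
    return abs(acc)

def demodulate(samples):
    """Hard-decide each bit from the mark/space envelope difference."""
    bits = []
    for start in range(0, len(samples) - SPB + 1, SPB):
        window = samples[start:start + SPB]
        diff = envelope(window, F_MARK) - envelope(window, F_SPACE)
        bits.append(1 if diff > 0 else 0)
    return bits

test_bits = [1, 0, 1, 1, 0, 0, 1, 0]
print(demodulate(modulate(test_bits)))
```

Because the two tones are only half a bit rate apart, they are not orthogonal under non-coherent detection, so the wrong-tone envelope never drops to zero. That is one way to see why this shortcut costs some error performance.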

Oh yeah, we have a bitstream!

[Image: Three oscillograms followed by a stream of 1s and 0s. The first oscillogram is quite nondescript; the second one actually shows two waveforms, red and blue, in the same graph, with the red dominating in envelope power where blue is suppressed and vice versa. The third oscillogram shows a graph apparently following their envelope power difference, with sample points at regular intervals. The sign of this plot at sample points dictates whether a 0 or 1 is shown below that sample.]

Decoder software for DARC does not exist, according to search engines. To read the actual data frames, I'll have to acquire block synchronization, regenerate the pseudorandom scrambler used and reverse its action, check the data against CRCs for errors, and implement the stack of layers that make up the protocol. The DARC specification itself is luckily public, albeit not very helpfully written; in contrast to the RDS standard, it contains almost no example circuits, data, or calculations. So, a little recap of Galois field mathematics and linear feedback shift registers for me.
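As a refresher on the CRC part, here is a minimal sketch of a CRC check as polynomial long division over GF(2). The generator polynomial is a toy one (x^3 + x + 1) chosen for readability; it is not the DARC polynomial, which would have to be taken from the specification, and the function names are made up.

```python
def gf2_remainder(bits, poly):
    """Remainder of dividing bits by poly over GF(2), both MSB first."""
    work = list(bits)
    degree = len(poly) - 1
    for i in range(len(work) - degree):
        if work[i]:
            # Subtraction in GF(2) is XOR with the shifted generator.
            for j, coeff in enumerate(poly):
                work[i + j] ^= coeff
    return work[-degree:]

def append_crc(data, poly):
    """Append check bits so that the whole block divides evenly by poly."""
    degree = len(poly) - 1
    return data + gf2_remainder(data + [0] * degree, poly)

def crc_ok(block, poly):
    """A block passes when its remainder is all zeros."""
    return not any(gf2_remainder(block, poly))

TOY_POLY = [1, 0, 1, 1]  # x^3 + x + 1, a placeholder generator
data = [1, 0, 1, 1, 0, 1, 0, 0, 1]
block = append_crc(data, TOY_POLY)

corrupted = list(block)
corrupted[4] ^= 1  # flip a single bit

print(crc_ok(block, TOY_POLY), crc_ok(corrupted, TOY_POLY))
```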

But what does it all mean? And who's listening to it?

Now, it's no secret (Signaali 3/2011, p. 16) that the HSL bus stop timetable displays in Helsinki get their information about the next arriving GPS-positioned buses through this very FM station, YLE 1. That's all they're telling, though. I haven't found anything bus-related in the RDS data, so it's quite likely that they're listening to DARC.

[Image: Photo of a rugged LCD display mounted on a metal pole, displaying the text '55K FORSBY ~6' and stamped with the logo 'HSL HRT'.]

DARC is known to be used in various other applications as well, such as DGPS correction signals. So decoding it could prove interesting.

Sniffing the packet types being sent, it seems that quite a big part of the time is being used to send a transmit cycle schedule (COT and SCOT). And indeed, the aforementioned magazine article hints that the battery-powered displays use a precise radio schedule to save power, each one receiving only during a short period every minute. (This grep only lists metadata type packets, not the actual data.)

$ perl darcdec-filters.pl | grep Type: | sort | uniq -c
  88 Type: 0 Channel Organization Table (COT)
   8 Type: 1 Alternative Frequency Table (AFT)
   8 Type: 2 Service Alternative Frequency Table (SAFT)
   1 Type: 4 Service Name Table (SNT)
   8 Type: 5 Time and Date Table (TDT)
 112 Type: 6 Synchronous Channel Organization Table (SCOT)
$ █

At the moment I'm only getting sensible data out at the very beginning of each packet (or "block"). I do get a solid, error-free block sync but, for example, the date is consistently showing the year 1922 and all CRCs fail. Other fields are similarly weird but consistent. This could mean that I've still got the descrambler PN polynomial wrong, only succeeding when it's using bits from the initialization seed. And this can only mean many more sleepless coding nights ahead.
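A toy sketch shows why a wrong polynomial would produce exactly this symptom. With an additive scrambler driven by a Fibonacci LFSR, the first output bits come straight from the seed, so they descramble correctly whatever the feedback taps are. The taps and seed below are placeholders for illustration, not the values from the DARC specification.

```python
def lfsr_bits(taps, seed, count):
    """Fibonacci LFSR output; taps are 1-based register positions."""
    reg = list(seed)
    out = []
    for _ in range(count):
        out.append(reg[-1])          # output the last register stage
        feedback = 0
        for tap in taps:
            feedback ^= reg[tap - 1]
        reg = [feedback] + reg[:-1]  # shift, feeding back into stage 1
    return out

def scramble(bits, taps, seed):
    """XOR the data with the PN sequence; the same call descrambles."""
    pn = lfsr_bits(taps, seed, len(bits))
    return [b ^ p for b, p in zip(bits, pn)]

SEED = [1, 0, 1, 0, 1, 0, 1, 0, 1]  # placeholder 9-bit seed
RIGHT_TAPS = (9, 4)                 # placeholder "correct" taps
WRONG_TAPS = (9, 5)                 # a guess that is slightly off

data = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
sent = scramble(data, RIGHT_TAPS, SEED)

good = scramble(sent, RIGHT_TAPS, SEED)
bad = scramble(sent, WRONG_TAPS, SEED)

print(good == data)         # the full block is recovered
print(bad[:9] == data[:9])  # only the seed-length prefix survives
```

With the wrong taps, the output goes bad as soon as the first feedback bit reaches the output, which here is the tenth bit.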

(Continued in the next post)

(Pseudotags for Finns: HSL:n pysäkkinäytöt eli aikataulunäytöt.)

Update 1/2019: There's now a basic DARC decoder on GitHub; it's called darc2json.