Using HDMI radio interference for high-speed data transfer

This story, too, begins with noise. I was browsing the radio waves with a software radio, looking for mysteries to accompany my ginger tea. I had started to notice a wide-band spiky signal on a number of frequencies that only seemed to appear indoors. Some sort of interference from electronic devices, probably. Spoiler alert, it eventually led me to broadcast a webcam picture over the radio waves... but how?

It sounds like video

The mystery deepened when I listened to what this interference sounded like as an AM signal. It reminded me of a time I mistakenly plugged our home stereo system into the Nintendo console's video output and heard a very similar buzz.

Am I possibly listening to video? Why would there be analog video transmitted on any frequency, let alone inside my home?

[Image: Oscillogram of a noisy waveform that seems to have a pulse every 10 microseconds or so.]

If we plot the signal's amplitude against time we can see that there is a strong pulse exactly 60 times per second. This could be the vertical synchronisation signal of 60 Hz video. A shorter pulse (pictured above) can be seen repeating more frequently; it could be the horizontal one. Between these pulses there is what appears to be noise. Maybe, if we use the strong pulses for synchronisation and plot the amplitude of that noise as a two-dimensional picture, we could see something?
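
In code, the folding step is simple. Here's a rough Python sketch of the idea, assuming the demodulated amplitude is already in a NumPy array and assuming the standard 1024x768@60 timing of 806 total lines per frame; in practice the line period has to be tuned very precisely or the image shears sideways:

    import numpy as np
    from PIL import Image

    def raster(env, sample_rate, frame_rate=60.0, total_lines=806):
        # Samples per scanline (1024x768@60 has 806 total lines per frame).
        spl = sample_rate / (frame_rate * total_lines)
        n_lines = int(len(env) / spl)
        width = 640  # output image width in pixels
        # Nearest-neighbour pick one pixel every spl/width samples.
        idx = (np.arange(n_lines)[:, None] * spl +
               np.arange(width)[None, :] * (spl / width)).astype(int)
        img = env[idx.clip(0, len(env) - 1)].astype(float)
        img = 255 * (img - img.min()) / (img.max() - img.min())
        return Image.fromarray(img.astype(np.uint8))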

And sure enough, when main screen turn on, we get signal:

[Image: A grainy greyscale image of what appears to be a computer desktop.]

(I've hidden the bright synchronisation signal from this picture.)

It seems to be my Raspberry Pi's desktop with weirdly distorted greyscale colours! Somehow, some part of the monitor setup is radiating it quite loudly into the aether. The frequency I'm listening to is a multiple of the monitor's pixel clock frequency.

As it turns out, this vulnerability of some monitors has been known for a long time. In 1985, van Eck demonstrated how CRT monitors can be spied on from a distance[1]; and in 2004, Markus Kuhn showed that the same still works on flat-screen monitors[2]. The image is heavily distorted, but some shapes and even larger text can still be recognised.

The next thought was, could we get any more information out of these images? Is there any information about colour?

Mapping all the colours

HDMI is fully digital; there is no linear dependency between pixel values and greyscale brightness in this amplitude image. I believe the brightness is related to the number of bit transitions over my radio's sampling time (which is around 8 bit-lengths); and in HDMI, this depends on many things, not just the actual RGB value of the pixel. HDMI also uses multiple differential pairs that all transmit their own picture channels side by side.
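
For the curious: HDMI carries video using TMDS coding, where each 8-bit pixel byte becomes a 10-bit symbol in two stages. The first stage picks an XOR or XNOR chain to minimise transitions; the second stage conditionally inverts the result to keep the link DC balanced, based on a running disparity counter carried over from previous symbols. Here's a Python sketch of that encoder, following the DVI 1.0 rules (HDMI-specific details like control periods and guard bands are ignored):

    def tmds_encode(d, cnt):
        bits = [(d >> i) & 1 for i in range(8)]
        n1 = sum(bits)
        # Stage 1: XOR or XNOR chain, chosen to minimise transitions.
        use_xnor = n1 > 4 or (n1 == 4 and bits[0] == 0)
        q = [bits[0]]
        for i in range(1, 8):
            q.append(1 - (q[i - 1] ^ bits[i]) if use_xnor else q[i - 1] ^ bits[i])
        q.append(0 if use_xnor else 1)     # bit 8 records which was used
        n1q = sum(q[:8])
        n0q = 8 - n1q
        # Stage 2: conditional inversion to keep the link DC balanced.
        if cnt == 0 or n1q == n0q:
            invert = q[8] == 0
            cnt += (n0q - n1q) if invert else (n1q - n0q)
        elif (cnt > 0 and n1q > n0q) or (cnt < 0 and n0q > n1q):
            invert = True
            cnt += 2 * q[8] + (n0q - n1q)
        else:
            invert = False
            cnt += -2 * (1 - q[8]) + (n1q - n0q)
        symbol = [1 - b if invert else b for b in q[:8]] + [q[8], 1 if invert else 0]
        return symbol, cnt                 # 10 bits, LSB first

    def transitions(symbol):
        return sum(a != b for a, b in zip(symbol, symbol[1:]))

Because the inversion depends on the running counter, the same byte can come out as either a symbol or its complement depending on what preceded it, which shifts the transitions at the symbol boundaries.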

This is why I don't think it's easy to reconstruct a clear picture of what's being shown on the screen, let alone decode any colours.

But could the reverse be possible? Could we control this phenomenon to draw the greyscale pictures of our choice on the receiver's screen? How about sending binary data by displaying alternating pixel values on the monitor?

[Image: On the left, gradients of red, green, and blue; on the right, greyscale lines of seemingly unrelated brightness.]

My monitor uses 16-bit colours. There are "only" 65,536 different colours, so it's possible to go through all of them and see how each appears in the receiver. But it's not that simple; the bit-pattern of an HDMI pixel can actually get modified based on what came before it. And my radio isn't fast enough to even tell the bits apart anyway. What we could do is fill entire lines with one colour and average the received signal strength. We would then get a mapping for single-colour horizontal streaks (above). Assuming a long run of the same colour always produces the same bitstream, this could be good enough.
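
Generating the calibration frames could look something like this sketch; each line gets the next RGB565 value, and 86 frames cover the whole colour space (the 1024x768 geometry and the helper names are my assumptions):

    import numpy as np
    from PIL import Image

    def rgb565_to_rgb888(c):
        r, g, b = (c >> 11) & 0x1f, (c >> 5) & 0x3f, c & 0x1f
        # Expand to 8 bits per channel for display.
        return (r << 3 | r >> 2, g << 2 | g >> 4, b << 3 | b >> 2)

    def sweep_page(first_colour, width=1024, height=768):
        img = np.zeros((height, width, 3), np.uint8)
        for line in range(height):
            img[line, :] = rgb565_to_rgb888((first_colour + line) & 0xffff)
        return Image.fromarray(img)

    # 86 pages of 768 lines cover all 65,536 colours.
    pages = [sweep_page(i * 768) for i in range(86)]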

[Image: An XY plot where x goes from 0 to 65536 and Y from 0 to 1.2. A pattern seems to repeat itself every 256 values of x. Values from 16128 to 16384 are markedly higher.]

Here's the map of all the colours and their intensity in the radio receiver. (Whatever happens between 16,128 and 16,384? I don't know.)

Now we can resample a greyscale image so that its pixels become short horizontal lines, and then, for every greyscale value, find the closest matching RGB565 colour in the above map. When we display this psychedelic hodge-podge of colour on the screen (on the right), enough of the mapping seems to be preserved to produce a recognisable picture of a movie[3] on the receiver side (on the left):

[Image: On the right, a monitor shows a noisy green and blue image. On the left, another monitor shows a grainy picture of a man and the text 'Hackerman'.]
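
The lookup itself might look like this, given the measured map (`intensity_map` being the 65,536-entry array behind the plot above, normalised to 0..1):

    import numpy as np

    def build_grey_lut(intensity_map):
        # For each of 256 grey levels, find the RGB565 colour whose
        # measured radio intensity is nearest.
        order = np.argsort(intensity_map)
        vals = intensity_map[order]
        greys = np.linspace(0.0, 1.0, 256)
        pos = np.searchsorted(vals, greys).clip(1, len(vals) - 1)
        left = (greys - vals[pos - 1]) < (vals[pos] - greys)
        return order[np.where(left, pos - 1, pos)]

    def encode_image(grey_img, lut):
        # grey_img: 2D uint8 array; result: RGB565 value per pixel.
        return lut[grey_img]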

These colours are not constant in any way. If I move the antenna around, even if I turn it from vertical to horizontal, the greyscales will shift or even get inverted. If I tune the radio to another harmonic of the pixel clock frequency, the image seems to break down completely. (Are there more secrets to be unfolded in these variations?)

The binary exfiltration protocol

Now we should have enough information to be able to transmit bits. Maybe even big files and streaming data, depending on the bitrate we can achieve.

First of all, how should one bit be encoded? The absolute brightness will fluctuate depending on radio conditions, so I decided to encode bits as the brightness difference between two short horizontal lines: a positive difference means 1 and a negative one means 0. This should stay fairly constant, unless the colours completely flip around, that is.

[Image: When a bit is 0, the leftmost line is darker than the rightmost line, and vice versa. These lines are used to form 768-bit packets.]
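
The receiver's bit decision is then just a sign comparison. A minimal sketch, assuming the received amplitude has already been averaged over each line:

    import numpy as np

    def decode_bits(first, second):
        # first, second: mean received amplitude of each bit's two lines.
        # Brighter-first is read as 1, darker-first as 0.
        return (first > second).astype(np.uint8)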

The monitor has 768 pixels vertically. This is a nice number so I designed a packet that runs vertically across the display. (This proved to be a bad decision, as we will later see.) We can stack as many packets side-by-side as the monitor width allows. A new batch of packets can be displayed in each frame, or we can repeat them over multiple frames to improve reliability.

These packets should have some metadata, at least a sequence number. Our medium is also quite noisy, so we need some kind of forward error correction. I'm using a Hamming(12,8) code which adds 4 error correction bits for every 8 bits of data. Finally, we need to add a CRC to each packet so we can make sure it arrived intact; I chose CRC16 with the polynomial 0x8005 (just because liquid-dsp provided it by default).
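
For illustration, here's one standard construction of the same building blocks in Python: a Hamming(12,8) built as a shortened Hamming(15,11), and a bitwise CRC-16 with the 0x8005 polynomial. (This isn't necessarily bit-identical to what liquid-dsp computes; initial values and bit ordering may differ.)

    def hamming128_encode(byte):
        # Codeword positions 1..12; parity bits at 1, 2, 4, 8.
        data_pos = [3, 5, 6, 7, 9, 10, 11, 12]
        code = [0] * 13                    # index 0 unused
        for i, p in enumerate(data_pos):
            code[p] = (byte >> (7 - i)) & 1
        for p in (1, 2, 4, 8):
            code[p] = sum(code[i] for i in range(1, 13) if i & p) & 1
        return code[1:]                    # 12 bits

    def hamming128_decode(bits12):
        code = [0] + list(bits12)
        syndrome = 0
        for p in (1, 2, 4, 8):
            if sum(code[i] for i in range(1, 13) if i & p) & 1:
                syndrome |= p
        if 1 <= syndrome <= 12:
            code[syndrome] ^= 1            # correct a single-bit error
        data_pos = [3, 5, 6, 7, 9, 10, 11, 12]
        return sum(code[p] << (7 - i) for i, p in enumerate(data_pos))

    def crc16_8005(data, crc=0x0000):
        for byte in data:
            crc ^= byte << 8
            for _ in range(8):
                crc = ((crc << 1) ^ 0x8005 if crc & 0x8000 else crc << 1) & 0xffff
        return crc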

First results!

It was quite unbelievable: I was able to transmit a looping 64 kbps audio stream almost without any glitches, with the monitor and the receiver in the same room, approximately 2 meters from each other.

Quick tip. Raw 8-bit PCM audio is a nice test format for these kinds of streaming experiments. It's straightforward to set an arbitrary bitrate by resampling the sound (with SoX, for instance); there's no structure, headers, or byte order to deal with; and any packet loss, reordering, or buffer underrun is instantly audible. You can use a headerless companding algorithm like A-law to fit more dynamic range in 8 bits. Even stereo works; if you start from the wrong byte, the channels will just get swapped. SoX can also play back the stream.
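
For instance, producing a 64 kbps mono A-law test stream from a music file, and playing back whatever arrived, could look roughly like this (filenames made up; 8000 samples/s x 8 bits = 64 kbps):

    sox music.wav -r 8000 -c 1 -e a-law -b 8 -t raw testdata.raw
    play -t raw -r 8000 -c 1 -e a-law -b 8 received.raw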

But can we get more? Slowly I added more samples per second, and a second audio channel. Suddenly we were at 256 kbps and still running smoothly. 200 kbps was even possible from the adjacent room, with a directional antenna 5 meters away, and with the door closed! In the same room, it worked up to around 512 kilobits per second but then hit a wall.

[Image: Info window that says HasPreamble: 1. Total: 928.5 kbps, Fresh: 853.6 kbps, Fresh (payload): 515.7 kbps.]

A tearful performance

The heavy error correction and framing add around 60% of overhead, and we're left with 480 bits of 'payload' per packet. If we have 39 packets per frame at 60 frames per second, we should get more than a megabit per second (480 × 39 × 60 ≈ 1.1 Mbps), right? But for some reason it always caps at half a megabit.

The reason revealed itself when I noticed every other frame was often completely discarded at the CRC check. Of course; I should have thought of properly synchronising the screen update to the graphics adapter's frame update cycle (or VSYNC). This would prevent the picture information changing mid-frame, also known as tearing. But whatever options I tried with the SDL library I couldn't get the Raspberry Pi 4 to not introduce tearing.

Screen tearing appears to be an unsolved problem plaguing the Raspberry Pi 4 specifically (see this Google search). I tried another mini computer, the Asus Tinker Board R2.0, but I couldn't get the graphics drivers to work properly. I then realised it was a mistake to have the packets run from top to bottom; any horizontal tearing will cut every single packet in half! With a horizontal design only one packet per frame would suffer this fate.

A new design enables video-over-video

Packets that run horizontally across the screen indeed fix most of the packet loss. It may also help with CPU load as it improves memory access locality. I'm now able to get 1000 kbps from the monitor! What could this be used for? A live video stream, perhaps?

But the clock was ticking. I had a presentation coming up and I really wanted to amaze everyone with a video transfer demo. I quite literally got it working on the morning of the event. For simplicity, I decided to go with MJPEG, even though fancier schemes could compress way more efficiently. The packet loss issues are mostly kept at bay by repeating frames.

The data stream is "hidden" in a Windows desktop screenshot; I'm changing the colours in a way that both creates a readable bit and also looks inconspicuous when you look from far away.

Mitigations

This was a fun project, but this kind of vulnerability could, in the tinfoiliest of situations, be used for exfiltrating information out of a supposedly airgapped computer.

The issue has been alleviated in some modern display protocols. DisplayPort[4] makes use of scrambling: a pseudorandom sequence of bits is mixed with the bitstream to remove the strong clock oscillations that are so easily radiated out. This also randomizes the bitstream-to-amplitude correlation. I haven't personally tested whether their radio interference still contains some kind of video, though. (Edit: Scrambling seems to be optionally supported by later versions of HDMI, too – but it might depend on which features exactly the two devices negotiate. How could you know if it's turned on?)

[Image: A monitor completely wrapped in tinfoil, with the text IMPRACTICAL written over it.]

I've also tried wrapping the monitor in tinfoil (very impractical) and inside a cage made out of chicken wire (it had no effect - perhaps I should have grounded it?). I can't recommend either of these.

Software considerations

This project was made possible by at least C++, Perl, SoX, ImageMagick, liquid-dsp, Dear Imgui, GLFW, turbojpeg, and v4l2! If you're a library that feels left out, please leave a comment.

If you wish to play around with video emanations, I heard there is a project called TempestSDR. For generic analog video decoding via a software radio, there is TVSharp.

References

  1. Van Eck, Wim (1985): Electromagnetic radiation from video display units: An eavesdropping risk?
  2. Kuhn, Markus (2004): Electromagnetic Eavesdropping Risks of Flat-Panel Displays
  3. KUNG FURY Official Movie [HD] (2015)
  4. Video Electronics Standards Association (2006): DisplayPort Standard, version 1.

Spiral spectrograms and intonation illustrations

I've been experimenting with methods for visualising harmony, intonation (tuning), and overtones in music. Ordinary spectrograms aren't very well suited for that as the harmonic relations are not intuitively visible. Let's see what could be done about this. I'll try to sprinkle the text with Wikipedia links in order to immerse (nerd snipe?) the reader in the subject.

Equal temperament cents against time

We can examine how tuning evolves during a recording by choosing a reference pitch and plotting all frequencies relative to it modulo 100 cents. This is similar to what an electronic tuner does, but instead of just showing the fundamental frequency, we'll plot the whole spectrum. Information about the absolute frequencies is lost. This "zoomed-in" plot visualises how the distribution of frequencies fits the 12-tone equal temperament system (12-TET) common in Western music.
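
Here's a rough Python sketch of that projection for one FFT time slice; the bin count and reference pitch are arbitrary choices:

    import numpy as np

    def tuner_projection(freqs, mags, f_ref=440.0, n_bins=120):
        good = freqs > 0                          # skip the DC bin
        cents = 1200.0 * np.log2(freqs[good] / f_ref)
        wrapped = np.mod(cents + 50.0, 100.0)     # 0 c lands mid-axis
        hist = np.zeros(n_bins)
        idx = (wrapped / 100.0 * n_bins).astype(int).clip(0, n_bins - 1)
        np.add.at(hist, idx, mags[good])
        return hist                               # one spectrogram column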

Here's the first 20 seconds of Debussy's Clair de Lune as found on YouTube, played with a well-tuned (video) and an out-of-tune piano (video). The second piano sounds out of tune because there are relative differences in tuning between the strings. The first piano looks to be a few cents sharp as a whole, but consistently so, so it's not perceptible.

[Image: Two spectrograms labeled 'Piano in tune' and 'Piano out of tune'. The first one shows blobs of light along the center axis. In the second one, the blobs are jumping up and down the graph.]

The vertical axis is the same that electronic tuners use. All the notes of a chord will appear in the middle, as long as they are well tuned against the reference pitch of, say, A = 440 Hz. The top edge of the graph is half a semitone sharp (quarter tone = 50c), and the bottom is half a semitone flat.

Overtones barely appear in the picture because the first three conveniently hit other notes in tune. But from f5 onwards the harmonic series starts deviating from 12-TET and the harmonics start to look out-of-tune (f5 = −14c, f7 = −31c, ...). These can be cut out by filtering, or hoping that they're lower in volume and setting the color range accordingly.
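
As a quick sanity check, we can compute where each harmonic lands relative to the nearest 12-TET note:

    import math
    for n in range(2, 9):
        cents = 1200 * math.log2(n)
        dev = cents - 100 * round(cents / 100)
        print(f"harmonic {n}: {dev:+.0f} c")
    # Harmonics 2, 4, and 8 are exact octaves; 3 and 6 are only +2 c off,
    # but 5 comes out at -14 c and 7 at -31 c, as in the plot above.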

Six months later...

You know how you sometimes accidentally delete a project and have to rewrite it from scratch much later, and it's never exactly the same? That's what happened at this point. It's a little scuffed, but here's some piano music that utilises quarter tones (by Wyschnegradsky). I used the same scale on purpose, so the quarter tones wrap around from the top and bottom.

[Image: Similar as above, but blobs are also seen at the very top and bottom of the graph.]

More examples of these "intonation plots" in a tweet.

This works well for piano music. However, not even all Western classical music is tuned to equal temperament; for instance, solo strings may be played with Pythagorean intonation[1], whereas vocal ensembles[2] and string quartets may tune some intervals closer to just intonation. Unlike the piano, these wouldn't look too good in the equal-temperament plot.

Octave spiral

If we instead plot the spectrum modulo 1200 cents (1 octave) we get an interesting interval view. We could even ditch the time scale and wind the spectrogram into a spiral to make it prettier and preserve the overtones and absolute frequency information. Now each note is represented by an angle in this spiral; in 12-TET, they're separated by 30 degrees. At any point in the spiral, going 1 turn outwards doubles the frequency.
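
Mapping a frequency to a point on the spiral only takes a couple of lines; here middle C is an assumed reference for angle zero:

    import math

    def spiral_xy(freq, f_ref=261.63, r0=1.0, growth=1.0):
        octaves = math.log2(freq / f_ref)        # distance from reference C
        angle = 2 * math.pi * (octaves % 1.0)    # pitch class as an angle
        radius = r0 + growth * octaves           # one turn outwards = x2
        return radius * math.cos(angle), radius * math.sin(angle)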

Here's a C major chord on a piano. Note how the harmonic content adds several high-frequency notes on top of the C, E, and G of the triad chord, and how multiple notes can contribute to the same overtone:

[Image: 12 note names labeled around a spiral. The C, E, and G notes light up sequentially, and their frequency and harmonics are displayed on the spiral.]

I had actually hoped I would get an Ultimate Chord Viewer where any note, regardless of frequency, would have all its harmonics neatly stacked on top of it. But that's not what happens here: the harmonic series is not a stack of octaves (2^n) but of integer multiples (n). Some harmonics appear at seemingly unrelated angles. But it's still a pretty interesting visualisation, and perhaps makes more sense musically.

This plot is also better at illustrating different tuning systems. Let's look at a major third interval F-A in equal temperament and just intonation, with a few more harmonics.

[Image: The interval alternates between 400 and 386 cents. When it's 386, a few harmonics of the F note merge with those of the A note.]

The intuition from this plot is that equal temperament aims for equal distances (angles) between notes and just intonation tries to make more of the harmonics match instead.

Even though the Ultimate Chord Viewer was not achieved I now have ideas for Christmas lights...

Live visualization

Here's what a cappella music with some reverb looks like on the spiral spectrogram.

A shader toy

This little real-time GLSL demo on shadertoy draws a spiral FFT from microphone audio. But don't get your hopes up: The spectrograms in this blog post were made with a 65536-point FFT. Shadertoy's 512px microphone texture offers a lot less in terms of frequency range and bins. This greatly blurs the frequency resolution, especially towards the low end. Could it be improved with the right colormap? Or a custom FFT with the waveform texture as its input?

Speech to birdsong conversion

I had a dream one night where a blackbird was talking in human language. When I woke up there was actually a blackbird singing outside the window. Its inflections were curiously speech-like. The dreaming mind only needed to imagine a bunch of additional harmonics to form phonemes and words. One was left wondering if speech could be transformed into a blackbird song by isolating one of the harmonics...

One way to do this would be to:

  • Find the instantaneous fundamental frequency and amplitude of the speech. For example, filter the harmonics out and use an FM demodulator to find the frequency. Then find the signal envelope amplitude by AM demodulation.
  • Generate a new wave with similar amplitude variations but greatly multiplied in frequency.
[Image: Signal path diagram.]
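
Here's a minimal Python sketch of that signal path, assuming mono speech in x at sample rate fs (the band limits, smoothing, and multiplier are guesses; my actual chain used Perl, SoX, and csdr):

    import numpy as np
    from scipy.signal import butter, hilbert, sosfilt, sosfiltfilt

    def birdify(x, fs, mult=8.0):
        # Keep roughly the fundamental of the voice (assumed 50..400 Hz).
        sos = butter(4, [50, 400], btype="bandpass", fs=fs, output="sos")
        f0band = sosfilt(sos, x)
        analytic = hilbert(f0band)
        env = np.abs(hilbert(x))                     # AM: envelope
        freq = np.diff(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)
        freq = np.append(freq, freq[-1])             # FM: inst. frequency
        # Smooth the raw estimates a little.
        smooth = butter(2, 30, fs=fs, output="sos")
        freq = sosfiltfilt(smooth, freq)
        env = sosfiltfilt(smooth, env)
        phase = 2 * np.pi * np.cumsum(mult * freq) / fs
        return env * np.sin(phase)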

A proof-of-concept script using the Perl-SoX-csdr command-line toolchain is available (source code here). The result sounds surprisingly blackbird-like. Even the little trills are there, probably as a result of FM noise or maybe vocal fry at the end of sentences. I got the best results by speaking slowly and using exaggerated inflection.

Someone hinted that the type of intonation used in certain automatic announcements is perfect for this kind of conversion. And it seems to be true! Here, a noise gate and reverb have been added to the result to improve it a little:

And finally, a piece of sound art where this synthetic blackbird song is mixed with a subtle chord and a forest ambience:

Think of the possibilities: A simultaneous interpreter for talking to birds. A tool for dubbing talking birds in animation or live theatre. Entertainment for cats.

What other birds could be done with a voice changer like this? What about croaky birds like a duck or a crow?

(I talked about this blog post a little on NPR: Here's What 'All Things Considered' Sounds Like — In Blackbird Song)

Plotting patterns in music with a fantasy record player

[Image: Close-up of a vinyl record showing a wavy pattern.]

Back in April I bought a vinyl record that had a weird wavy pattern near the outer edge. I thought I might have broken it somehow, but couldn't even test this because I don't own a record player. *) But when I took a closer look at the pattern, it seemed to somehow follow changes in the music. That doesn't look like damage at all.

When I played the CD version it became clear: this was an artifact of the tempo of the electronic track (100 bpm) being a multiple of the rotational speed (33 1/3 rpm), and these were probably drum hits! My tweet sparked some interesting discussion and I've been pondering this ever since. Could we plot any song as a loop or grid based on its own tempo and see interesting patterns?

(*) I know, it's a little odd. But I have a few unplayed vinyl records waiting for the day that I finally have the proper equipment. By the way, the song was Black Pink by RinneRadio from their wonderful album staRRk.

I wrote a little script to do just this: plot the amplitude of the FLAC file into a grid with an adjustable width. The result looks very similar to the pattern on the vinyl surface! Note that this image is a "straightened-out" version of the disc surface, and it's showing three of those wavy patterns. The top edge corresponds to the outer edge of the vinyl.

[Image: A plot showing similar patterns that were on the disc surface.]
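
The core of such a script fits in a few lines. Here's a sketch, assuming a mono signal x at sample rate fs (decoded from the FLAC with, say, the soundfile module); at 100 bpm and 33 1/3 rpm, one revolution is exactly three of these beat rows:

    import numpy as np
    from PIL import Image

    def beat_grid(x, fs, bpm=100.0, width=1024):
        spb = fs * 60.0 / bpm                    # samples per beat
        n_rows = int(len(x) / spb)
        idx = (np.arange(n_rows)[:, None] * spb +
               np.arange(width)[None, :] * (spb / width)).astype(int)
        img = np.abs(x[idx.clip(0, len(x) - 1)])
        img = 255 * img / img.max()
        return Image.fromarray(img.astype(np.uint8))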

Later I wrote a little more ambitious plotter that shall be explained soon.

Computer-conducted music gives the best patterns

After plotting several different songs against their own tempo like this, it seemed that, in addition to electronic music, a lot of pop and rock has this type of pattern, too. The most striking and clear patterns can be seen in music that makes use of drum samples in a quantized time base (i.e. a drum machine): the same kick drum sample, for example, repeats four times in each bar, perfectly timed by a computer so that the hits align in phase.

Somewhat similar patterns can be seen in live music that is played to a "click track": each band member hears a common computer-generated time signal in their earpiece so that they won't drift from the average tempo. But of course the live beats won't be perfectly phase-aligned in this case, because the musicians are humans and there's also physics involved.

3D rendered video experiment

To demonstrate how the patterns on vinyl records are born, I made a video showing a fantasy record player that can play an "e-ink powered optical record" and morph it to any RPM. I say fantasy because it's just that: imagination, science fiction, rendered 3D art - it would be quite unfeasible in real life. You can't actually make e-ink displays that fast and accurate. But of course it would be possible to have a live display of a digitally sampled audio file as a spiral and use some kind of physical controls to change the RPM value in real time, and just play the sound from a digital file.

Making the video was really fun and I think the end result is equal parts weird and illustrative.

Programming the disk surface and audio

The disc surface is based on a video texture: a different image was rendered for each changed frame using a purpose-written C++ program. The program uses an oversampled (8x) version of the original music that it then resamples at a variable rate based on the recorded RPM value (let's call it a morphing value). Oversampling and low-pass filtering beforehand makes variable-rate resampling simple: just take a sample at the appropriate time instant and don't worry about interpolation. It won't sound perfect, but the artifacts actually add an interesting distortion, perhaps simulating pixel boundaries in the 'e-ink display'.
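
A sketch of that pickup in Python (the real program is C++; the names and details here are mine):

    import numpy as np

    def play_varispeed(over, base_rate, speed, out_rate=44100):
        # over: the music, oversampled 8x and low-pass filtered.
        # speed: one morphing value per output sample (1.0 = normal).
        step = 8.0 * base_rate / out_rate   # oversampled samples per tick
        pos = np.cumsum(speed * step)       # play-head position
        return over[pos.astype(int).clip(0, len(over) - 1)]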

The amplitude sample at each time instant was projected into polar coordinates and plotted as an image. The image is fairly large - at least 2048 by 2048 pixels. I use this as a sort of image-space oversampling to get the polar projection to look a little better. I even tried 8192 x 8192 video, but it was getting too heavy for my computer. But a new image only needs to be generated when the morphing value changes; the other frames can be copied.

[Image: A square image of the disc video texture.]

The soundtrack was made by continuously sampling the position of the "play head" 44,100 times per second, whether the disc was moving or not. Which sample ends up in the audio depends on the current rotational angle and the morphing value of the disc surface; when either of those values changes, the audio moves past the play head. A DC cancel filter was then applied, because the play head would often stop on a non-zero sample, and it didn't look nice in the waveform. There's also a quiet hum in the background.
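
The DC cancel can be an ordinary one-pole DC blocker, y[n] = x[n] - x[n-1] + a * y[n-1]; a sketch:

    from scipy.signal import lfilter

    def dc_block(x, a=0.995):
        # High-pass with a zero at DC: y[n] = x[n] - x[n-1] + a * y[n-1]
        return lfilter([1.0, -1.0], [1.0, -a], x)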

[Image: Screenshot of C++ code with a list of events.]

I made an event-based system where I could input events simulating button presses and other controls. The system responds to speed change events with a smoothstep function, so that the disc seems to have realistic inertia. Also, the slow startup and slowdown sound kind of cool this way. Here's an extra-slow version of the effect -- you can hear the slight aliasing artifacts in the very beginning and end:

3D modeling, texturing, shading

The models were all made in Blender, a tool that I've slowly learned to use during the pandemic. Its modeling tools are pretty fun to use, and once you learn them you can build pretty 3D models to look at that won't take up any room in your apartment.

[Image: A screenshot of Blender with the device without textures.]

I love the retro aesthetic of old reel-to-reel players and other studio equipment. So I looked for inspiration by searching "reel-to-reel" on Google Images. Try it out, it's worth it! Originally I wanted the disc to be transparent, with some type of movable particles inside and a laser shone through it, but this was computationally very expensive to render. So I made it an 'e-ink' display instead. (I regret this choice a little bit, since some people, at first glance, apparently thought the video depicted actual existing technology. But I tried to make it clear it's a photorealistic render :)

I made use of the boolean workflow and bevel modifiers to cut holes and other small details in the hard surfaces. The cables are Bezier curves with the round bevel setting enabled.

The little red LCD is also a video texture on an emission shader – each frame was an SVG that was changed a little in time to add flicker and then exported using Inkscape from a batch script.

The wood textures, fingerprints, and the room environment photo are from HDRi Haven, Texture Haven and CC0 Textures. I'm especially proud of all the details on the disc surface -- here's the shader setup I built for the surface:

[Image: A Blender texture node map.]

The video was rendered in Blender Eevee and it took maybe 10 hours at 720p60. It's sad that it's not in 1080p, but I was impatient. I spent quite some time making the little red LCD look realistic, but it was completely spoiled by compression!

Here's a bigger still rendered in Cycles:

[Image: A render of the record player.]

Mapping rotation to the disc

Rotation angles from the C++ program were sampled 60 times per second and output as CSV. These were then imported into Blender as keyframes for the rotating disc, using the Python API:

[Image: A screenshot of a Python script.]

Here you only need to print a new keyframe when the speed of rotation changes, or is about to change; Blender will interpolate the rest.
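
Under those assumptions, the import script can be this short (object and file names are made up; angles assumed to be in radians):

    import csv
    import bpy

    disc = bpy.data.objects["Disc"]          # hypothetical object name
    with open("rotation.csv") as f:          # hypothetical CSV: frame, angle
        for frame, angle in csv.reader(f):
            disc.rotation_euler[2] = float(angle)    # Z rotation
            disc.keyframe_insert(data_path="rotation_euler", index=2,
                                 frame=int(frame))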

A Driver was set up to make the Y rotation slightly follow the Z rotation with a very small factor to make the disc 'wobble' a bit.

What's next?

It's endless fun to build and program functional fantasy electronics and I may need to do more of that. I'm currently also obsessed with 3D modeled dollhouses and who knows how those things will combine?

By the way, there is an actual device somewhat resembling this 3D model. It's called the Panoptigon; it's sort of an optical Mellotron (video).