Capturing PAL video with an SDR (and a few dead-ends)

I play 1980s games, mostly Super Mario Bros., on the Nintendo NES console. It would be great to be able to capture live video from the console for recording or maybe even speedrun streaming. Now, how to make the 1985 NES and the 2013 MacBook play together, preferably using hardware that I already have? This project diary documents my search for the answer.

Here's a spoiler – it did work:

[Image: A powered-on NES console and a MacBook on top of it, showing a Tetris title screen.]

Things that I tried first

A capture device

Video capture devices, or capture cards, are devices specially made for this purpose. There was only one cheap (~30€) capture device for composite video available locally, and I bought it, hoping for the best. But it wasn't readily recognized as a video device on the Mac, and there seemed to be no Mac drivers available. Having already almost capped my budget for this project, I then ordered a 5€ EasyCap device from eBay, as there was some evidence of Mac drivers online. The EasyCap is still making its way to Finland as of this writing, so I continued to pursue other routes.

CRT TV + DSLR camera

The cathode-ray tube television that I use for gaming could be filmed with a digital camera. This posed interesting problems: The camera must be timed appropriately so that a full scan is captured in every frame, to prevent temporal aliasing (stripes). This is why I used a DSLR camera with a full manual mode (Canon EOS 550D in this case).

For the 50 Hz PAL television screen I used a camera frame rate of 25 fps and an exposure time of 1/50 second (set by camera limitations). Each exposure then spans exactly one 20 ms scan of the screen. The camera will miss every other frame of the original 50 fps video, but on the other hand, will get an evenly lit screen every time.

A Moiré pattern will also appear if the camera is focused on the CRT shadow mask. This is due to interference between two regular 2D arrays, the shadow mask in the TV and the pixel array of the camera sensor. I got rid of this by setting the camera on manual focus and defocusing the lens just a bit.

[Image: A screen showing Super Mario Bros., and a smaller picture with Oona in it.]

This produced surprisingly good quality video, save for the slight jerkiness caused by the low frame rate (video). This setup was good for one-off videos; however, I could not use it for live streaming, because the camera could only record on the SD card and not connect to the computer directly.

LCD TV + webcam

An old LCD TV that I have has significantly less flicker than the CRT, and I could have live video via the webcam. But my Microsoft LifeCam HD-3000 offers only a binary option for manual exposure (pretty much "none" and "lots"). Using the higher setting the video was quite washed out, with lots of motion blur. The lower setting was so fast that it looked like the LCD had visible vertical scanning. Brightness was also heavily dependent on viewing angle, which caused gradients over the image. I had to film at a slightly elevated angle so that the upper part of the image wouldn't go too dark, and this made the video look like a bootleg movie copy.

[Image: A somewhat blurry photo of an LCD TV showing Super Mario Bros.]

Composite video

Now to capturing the actual video signal. The NES has two analog video outputs: one is composite video and the other an RF modulator, which has the same composite video signal modulated onto an AM carrier in the VHF television band plus a separate FM audio carrier. This is meant for televisions with no composite video input: the TV sees the NES as an analog TV station and can tune to it.

In composite video, information about brightness, colour, and synchronisation is encoded in the signal's instantaneous voltage. The bandwidth of this signal is at least 5 MHz, or 10 MHz when RF modulated, which would require a 10 MHz IQ sampling rate.

[Image: Oscillogram of one PAL scanline, showing hsync, colour burst, and YUV parts.]

I happen to have an Airspy R2 SDR receiver that can listen to VHF and take 10 million samples per second - could it be possible? I made a cable that can take the signal from the NES RCA connector to the Airspy SMA connector. And sure enough, when the NES RF channel selector is at position "3", a strong signal indeed appears on VHF television channel 3, at around 55 MHz.

Software choices

There's already an analog TV demodulator for SDRs - it's a plugin for SDR# called TVSharp. But SDR# is a Windows program and TVSharp doesn't seem to support colour. And it seemed like an interesting challenge to write a real-time PAL demodulator myself anyway.

I had been playing with analog video demodulation recently because of my HDMI Tempest project (video). So I had already written a C++ program that interprets a 10 Msps digitised signal as greyscale values and sync pulses and shows it live on the screen. Perhaps this could be used as a basis to build on. (It was not published, but apparently there is a similar project written in Java, called TempestSDR.)

Data transfer from the SDR is done using airspy_rx from airspy-tools. This is piped to my program that reads the data into a buffer, 256 ksamples at a time.
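The reading step could look roughly like the sketch below. This is not the actual program (which is unpublished); the function names are made up for illustration, and the sample format is an assumption: interleaved, native-endian 16-bit signed I/Q pairs, scaled by 1/32768. The real format depends on how airspy_rx is invoked.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads up to maxSamples interleaved int16 I/Q pairs from `in` and converts
// them to complex floats. Returns the number of complete samples read.
// ASSUMPTION: native-endian int16 input; the 1/32768 scale is illustrative.
std::size_t readIqBlock(std::istream& in, std::size_t maxSamples,
                        std::vector<std::complex<float>>& out) {
    assert(maxSamples > 0);
    std::vector<std::int16_t> raw(2 * maxSamples);
    in.read(reinterpret_cast<char*>(raw.data()),
            static_cast<std::streamsize>(raw.size() * sizeof(std::int16_t)));
    // gcount() tells how many bytes actually arrived (may be a short read).
    const std::size_t bytes = static_cast<std::size_t>(in.gcount());
    const std::size_t n = bytes / (2 * sizeof(std::int16_t));
    out.resize(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = {raw[2 * i] / 32768.0f, raw[2 * i + 1] / 32768.0f};
    }
    return n;
}

// Convenience wrapper for trying the reader on an in-memory byte string.
std::vector<std::complex<float>> decodeIqBytes(const std::string& bytes) {
    std::istringstream in(bytes, std::ios::in | std::ios::binary);
    std::vector<std::complex<float>> out;
    readIqBlock(in, bytes.size() / 4 + 1, out);
    return out;
}
```

In a live setup the same function would be called in a loop on std::cin, for example readIqBlock(std::cin, 256 * 1024, block), if "256 ksamples" is taken to mean 262144 samples (a guess).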

Automatic gain control is an important part of demodulating an AM signal. I used liquid-dsp's AGC by feeding it the maximum amplitude over every scanline period; this roughly corresponds to sync level. This is suboptimal, but it works in our high-SNR case. AM demodulation was done using std::abs() on the complex-valued samples. The resulting real value had to be subtracted from 1, because TV is transmitted "inverse AM" (the carrier is strongest during sync and weakest at white) to save on the power bill. I then scaled the signal so that black level was close to 0, white level close to 1, and sync level below 0.
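As a rough sketch of the demodulation and scaling described above (not the actual code, and without liquid-dsp's AGC): the peak envelope of a block stands in for the sync level, and the black and white levels are passed in as fractions of it. The names peakEnvelope, demodulateAm, blackFrac, and whiteFrac are hypothetical, and the fractions used in any call are illustrative rather than measured values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Largest envelope value in a block. With inverted AM this is roughly the
// sync tip, so it can serve as a crude per-scanline gain reference.
float peakEnvelope(const std::vector<std::complex<float>>& samples) {
    float peak = 0.0f;
    for (const auto& s : samples) peak = std::max(peak, std::abs(s));
    return peak;
}

// Envelope detection followed by inversion and scaling.
// blackFrac / whiteFrac: carrier amplitude at black and at white, as
// fractions of the sync amplitude (hypothetical parameters).
std::vector<float> demodulateAm(const std::vector<std::complex<float>>& samples,
                                float syncAmp, float blackFrac, float whiteFrac) {
    assert(syncAmp > 0.0f && blackFrac > whiteFrac);
    std::vector<float> video(samples.size());
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const float env = std::abs(samples[i]) / syncAmp;  // 1.0 at sync tip
        // (1 - env) flips the signal; subtracting (1 - blackFrac) and
        // dividing by the black-to-white span maps black to 0, white to 1.
        video[i] = (blackFrac - env) / (blackFrac - whiteFrac);
    }
    return video;
}
```

With blackFrac = 0.75 and whiteFrac = 0.25, a sample at full sync amplitude comes out as -0.5, which keeps sync below black as described.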

I use SDL2 to display the video and OpenCV for pixel addressing, scaling, cropping, and YUV-RGB conversions. OpenCV is an overkill dependency inherited from the Tempest project and SDL2 could probably do all of those things by itself. This remains TODO.

Removing the audio

The captured AM carrier seems otherwise clean, but there's an interfering peak on the lower sideband side at about –4.5 MHz. I originally saw it in the demodulated signal and thought it would be related to colour, as it's very close to the PAL chroma subcarrier frequency of 4.43361875 MHz. But when it started changing frequency in triangle-wave shapes, I realized it's the audio FM carrier. Indeed, when it is FM demodulated, beautiful NES music can be heard.

[Image: A spectrogram showing the AM carrier centered in zero, with the sidebands, chroma subcarriers and audio alias annotated.]

The audio carrier is actually outside this 10 MHz sampled bandwidth. But it's so close to the edge (and so powerful) that the Airspy's anti-alias filter cannot sufficiently attenuate it, and it becomes folded, i.e. aliased, onto our signal. This caused visible banding in the greyscale image, and some synchronization problems.
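The folding can be checked with a little arithmetic. The sketch below wraps a frequency offset into the sampled band; the function name is made up, and the +5.5 MHz sound carrier offset used in the usage note is an assumption based on the PAL B/G standard, not something measured here.

```cpp
#include <cassert>
#include <cmath>

// Offset (Hz, relative to the tuned centre frequency) at which a tone at
// offsetHz shows up after complex (IQ) sampling at sampleRateHz.
// The result is wrapped into the range [-fs/2, fs/2).
double aliasedOffset(double offsetHz, double sampleRateHz) {
    assert(sampleRateHz > 0.0);
    return offsetHz - sampleRateHz * std::floor(offsetHz / sampleRateHz + 0.5);
}
```

For example, aliasedOffset(5.5e6, 10e6) gives -4.5e6, which matches the position of the interfering peak in the spectrogram.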

I removed the audio using a narrow FIR notch filter from the liquid-dsp library. Now, the picture quality is very much acceptable. Minor artifacts are visible in narrow vertical lines because of a pixel rounding choice I made, but they can be ignored.
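To illustrate what a notch on a complex signal does, here is a minimal sketch. Note that it swaps in a different technique: a first-order IIR notch written by hand, not the FIR notch from liquid-dsp that was actually used. The class and helper names and the pole radius of 0.99 are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

constexpr double kPi = 3.14159265358979323846;

// One-sided notch for complex samples: a zero on the unit circle at the
// interferer frequency and a pole just inside it at the same angle.
class ComplexNotch {
public:
    // freqHz is relative to the centre of the sampled band.
    // poleRadius closer to 1 gives a narrower notch (illustrative default).
    ComplexNotch(double freqHz, double sampleRateHz, float poleRadius = 0.99f)
        : rot_(std::polar(1.0f, static_cast<float>(2.0 * kPi * freqHz / sampleRateHz))),
          r_(poleRadius) {
        assert(sampleRateHz > 0.0 && poleRadius > 0.0f && poleRadius < 1.0f);
    }

    // y[n] = g * (x[n] - e^{jw0} x[n-1]) + r * e^{jw0} * y[n-1]
    std::complex<float> process(std::complex<float> x) {
        const float gain = 0.5f * (1.0f + r_);  // about unity gain far from the notch
        const std::complex<float> y = gain * (x - rot_ * xPrev_) + r_ * rot_ * yPrev_;
        xPrev_ = x;
        yPrev_ = y;
        return y;
    }

private:
    std::complex<float> rot_;
    float r_;
    std::complex<float> xPrev_{0.0f, 0.0f};
    std::complex<float> yPrev_{0.0f, 0.0f};
};

// Measurement helper: feeds a unit-amplitude complex tone through the filter
// and returns the magnitude of the last output sample.
float toneGain(ComplexNotch& notch, double toneHz, double sampleRateHz, int numSamples) {
    std::complex<float> y(0.0f, 0.0f);
    for (int n = 0; n < numSamples; ++n) {
        const std::complex<double> tone =
            std::polar(1.0, 2.0 * kPi * toneHz / sampleRateHz * n);
        y = notch.process({static_cast<float>(tone.real()),
                           static_cast<float>(tone.imag())});
    }
    return std::abs(y);
}
```

An IIR notch like this costs only a few multiplications per sample, but unlike a FIR filter it does not have linear phase, so it may smear edges near the notch frequency differently.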

[Image: Black-and-white screen capture of NES Tetris being played.]

Decoding colour

PAL colour is a bit complicated. It was designed in the 1960s to be backwards compatible with black-and-white TV receivers. It uses the YUV colourspace, the Y or "luminance" channel being a black-and-white sum signal that already looks good by itself. Even if the whole composite signal is interpreted as Y, the artifacts caused by colour information are bearable. Y also has a lot more bandwidth, and hence resolution, than the U and V (chrominance) channels.

U and V are encoded in a chrominance subcarrier in a way that I still haven't quite grasped. The carrier is suppressed, but a burst of carrier is transmitted just before every scanline for reference (so-called colour burst).

Turns out that much of the chroma information can be recovered by band-pass filtering the chrominance signal, mixing it down to baseband using a PLL locked to the colour burst, rotating it by a magic number (chroma *= std::polar(1.f, deg2rad(170.f))), and plotting the real and imaginary parts of this complex number as the U and V colour channels. This is similar to how NTSC colour is demodulated.
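The mixing and rotation steps can be sketched as follows. This is a simplified stand-in, not the real decoder: a one-shot correlation against the colour burst replaces the PLL, a plain moving average replaces the band-pass and low-pass filters, and all function and parameter names are hypothetical. The rotation angle is left as a parameter, since 170° was found empirically.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

// Estimates the phase (radians) of a cosine at w rad/sample inside
// line[begin, end) by correlating it with exp(-j*w*n).
double burstPhase(const std::vector<float>& line, std::size_t begin,
                  std::size_t end, double w) {
    assert(begin < end && end <= line.size());
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t n = begin; n < end; ++n) {
        acc += static_cast<double>(line[n]) *
               std::polar(1.0, -w * static_cast<double>(n));
    }
    return std::arg(acc);
}

// Mixes the line down to baseband with the reference phase, smooths it with
// a moving average of avgLen samples, and rotates by rotDeg degrees.
// Real part of each output is taken as U, imaginary part as V.
std::vector<std::complex<float>> demodChroma(const std::vector<float>& line,
                                             double w, double refPhase,
                                             double rotDeg, std::size_t avgLen) {
    assert(avgLen > 0 && avgLen <= line.size());
    const std::complex<double> rot = std::polar(1.0, rotDeg * kPi / 180.0);
    std::vector<std::complex<double>> mixed(line.size());
    for (std::size_t n = 0; n < line.size(); ++n) {
        mixed[n] = 2.0 * line[n] *
                   std::polar(1.0, -(w * static_cast<double>(n) + refPhase));
    }
    std::vector<std::complex<float>> out(line.size() - avgLen + 1);
    for (std::size_t i = 0; i < out.size(); ++i) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t k = 0; k < avgLen; ++k) sum += mixed[i + k];
        const std::complex<double> uv = rot * (sum / static_cast<double>(avgLen));
        out[i] = {static_cast<float>(uv.real()), static_cast<float>(uv.imag())};
    }
    return out;
}
```

At 10 Msps the subcarrier frequency corresponds to w = 2π · 4.43361875 / 10 radians per sample. The averaging length trades colour resolution against leakage of the double-frequency mixing product.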

In PAL, every other scanline has its chrominance phase shifted (hence the name, Phase Alternating [by] Line). I couldn't get consistent results demodulating this, so I skipped the chrominance part of every other line and copied it from the line above. This doesn't even look too bad for my purposes. However, there seems to be a pre-echo in UV that's especially visible on a blue background (most of SMB1 sadly), and a faint stripe pattern on the Y channel, most probably crosstalk from the chroma subcarrier that I left intact for now.
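The "copy from the line above" workaround is simple enough to show in a few lines. This is only a sketch with made-up names, and it assumes that the skipped lines are the odd-numbered ones (counting from zero), which the text does not specify.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using ChromaLine = std::vector<std::complex<float>>;

// Replaces the chroma of every other line with a copy of the line above it.
// ASSUMPTION: the lines to be skipped are those with odd index.
void fillSkippedChroma(std::vector<ChromaLine>& frame) {
    for (std::size_t y = 1; y < frame.size(); y += 2) {
        assert(frame[y].size() == frame[y - 1].size());
        frame[y] = frame[y - 1];
    }
}
```

This halves the vertical colour resolution, which is hardly noticeable given how little bandwidth the chroma channels have to begin with.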

[Image: The three chroma channels Y, U, and V shown separately as greyscale images, together with a coloured composite of Mario and two Goombas.]

I used liquid_firfilt to band-pass the chroma signal, and liquid_nco to lock onto the colour burst and shift the chroma to baseband.

Let's play Tetris!

Latency

It's not my goal to use this system as a gaming display; I'm still planning to use the CRT. However, total buffer delays are quite small due to the 10 Msps sampling rate, so the latency from controller to screen is pretty good. The laptop can also easily decode and render at 50 fps. Tetris is playable up to level 12!

Using a slow-mo phone camera, I measured the time it takes for a button press to make Mario jump. The latency is similar to that of a NES emulator:

Method                 Frames @240fps   Latency
RetroArch emulator     28               117 ms
PAL NES + Airspy SDR   26               108 ms
PAL NES + LCD TV       20               83 ms
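The latency column follows directly from the frame counts; a small sketch of the conversion (the function name is made up):

```cpp
#include <cassert>
#include <cmath>

// Converts a frame count from slow-motion footage into milliseconds.
double framesToMs(int frames, double cameraFps) {
    assert(cameraFps > 0.0);
    return 1000.0 * frames / cameraFps;
}
```

For instance, framesToMs(28, 240.0) is about 116.7 ms, which rounds to the 117 ms in the table. One camera frame at 240 fps lasts about 4.2 ms, so that is also the resolution of these measurements.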

Performance considerations

A 2013 MacBook Pro is perhaps not the best choice for dealing with live video to begin with. But I want to be able to run the PAL decoder and a screencap / compositing / streaming client on the same laptop, so performance is even more crucial.

When colour is enabled, CPU usage on this quad-core laptop is 110% for palview and 32% for airspy_rx. The CPU temperature is somewhere around 85 °C. Black-and-white decoding lowers palview usage to 84% and CPU temps to 80 °C. I don't think there are enough cycles left for a streaming client just yet. Some CPU headroom would be nice as well; a resync after dropped samples looks quite nasty, and I wouldn't want that to happen very often.

[Image: htop screenshot showing palview and airspy_rx on top, followed by some system processes.]

Profiling reveals that the most CPU-intensive tasks are those related to FIR filtering. FIR filters are based on convolution: every output sample costs one multiply-accumulate per filter tap, which adds up quickly at 10 Msps unless done in hardware. FFT convolution can also be faster, but only when the kernel is relatively long.
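To make the cost concrete, here is a minimal direct-form FIR filter together with a helper that counts multiply-accumulate operations. It is a textbook sketch, not liquid-dsp's implementation, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct-form FIR: y[n] = sum over k of taps[k] * x[n - k].
// Samples before the start of x are treated as zero.
std::vector<float> firFilter(const std::vector<float>& x,
                             const std::vector<float>& taps) {
    assert(!taps.empty());
    std::vector<float> y(x.size(), 0.0f);
    for (std::size_t n = 0; n < x.size(); ++n) {
        // The inner loop is the expensive part: one MAC per tap, per sample.
        for (std::size_t k = 0; k < taps.size() && k <= n; ++k) {
            y[n] += taps[k] * x[n - k];
        }
    }
    return y;
}

// Multiply-accumulates per second for a filter running at a given rate.
double macsPerSecond(std::size_t numTaps, double sampleRateHz) {
    return static_cast<double>(numTaps) * sampleRateHz;
}
```

As an illustration with a made-up tap count, a 100-tap filter at 10 Msps needs 10^9 multiply-accumulates per second, and on complex samples each of those involves more than one real multiplication.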

[Image: Diagram shows the Audio notch FIR takes up 27 % and Chroma Bandpass FIR 12 % of CPU. Several smaller contributors mentioned.]

I've thought of having another computer do the Airspy transfer, audio notch filtering, and AM demodulation, and then transmit this preprocessed signal to the laptop via Ethernet. But my other computers (Raspberry Pi 3B+ and a Core 2 Duo T7500 laptop) are not nearly as powerful as the MacBook.

Instead of a FIR bandpass filter, a so-called chrominance comb filter is often used to separate chrominance from luminance. This could be realized very efficiently as a linear-complexity delay line. This is a promising possibility, but so far my experiments have had mixed results.
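A delay-line comb can be sketched with a ring buffer, as below. This is a generic illustration with made-up names; which delay and which sign actually separate PAL chroma well is left open here, so both are assumptions. At 10 Msps one 64 µs scanline is 640 samples.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Comb filter built on a ring buffer: out[n] = (x[n] - x[n - D]) / 2.
// The cost per sample is constant, regardless of the delay D.
class DelayComb {
public:
    explicit DelayComb(std::size_t delay) : buf_(delay, 0.0f), pos_(0) {
        assert(delay > 0);
    }

    float process(float x) {
        const float delayed = buf_[pos_];  // sample from D steps ago
        buf_[pos_] = x;
        pos_ = (pos_ + 1) % buf_.size();
        return 0.5f * (x - delayed);
    }

private:
    std::vector<float> buf_;
    std::size_t pos_;
};
```

Replacing the subtraction with an addition gives the complementary output, so one delay line can feed both a luminance and a chrominance path.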

There's no source code release for now (Why? FAQ), but if you want some real-time coverage of this project, I did a multi-threaded tweetstorm: one, two, three.

18 comments:

  1. Nice hack!
    According to [1], you are connecting the video output of the NES un-attenuated to the input of your Airspy.
    Isn't the power level that you feed into the SDR greatly exceeding the spec?


    [1] https://pbs.twimg.com/media/ECujW4YWwAEaZgP.jpg:orig

    1. Hi! It's difficult to find information about the NES RF modulator output power, and I don't have any calibrated equipment. So I can't absolutely rule out the possibility of it exceeding the maximum RF input level. For Airspy R2, the rating is +10 dBm (10 milliwatts).

      But dBFS readings can be measured. If I set all gains in the Airspy to unity, the video carrier has -76 dBFS power on an 80 kHz band around the carrier. A local FM station gives a similar reading via antenna if I set LNA gain to +6 dB. This would indicate to me that the power is quite low.

    2. Very interesting, my intuition was different.

      AFAIK so far SDRs have been used almost exclusively for RX/TX of signals via some form of antenna. Connecting them with cable-bound communication systems is definitely very interesting and worth exploring further. E.g. speaking DSL with an SDR would be awesome.
      Oona, are you aware of other projects who have used SDRs like you did? Seems to me like you are a trailblazer in yet another area.

    3. I haven't heard of other people decoding colour PAL with an SDR, though there are no doubt many people who have done similar things and not written about them.

    4. Steinar H. Gunderson, 25 August, 2019 23:30

      Not exactly the same, but a certainly related project, was BBC's 2001 realtime PAL software decoder (eventually put into hardware). See http://www.jim-easterbrook.me.uk/pal/ .

    5. As to the signal level causing issues for you, there are some distributors that stock adjustable attenuators with F and IEC connectors that you could possibly use (and have in the RF adapters bin for future projects/jobs).
      Another route would be an attenuation and impedance conversion setup, basically resistors in an inverted L (Γ) configuration which presents the RF modulator with a clean 75 Ω termination (improper termination sometimes can cause artifacts in some makes of RF modulators).
      The horizontal resistor (Rh) (in series with the center conductor of the coax) and the notional resistor of the input impedance (Zin) of the Airspy (50 Ω) form a resistive divider that attenuates the signal, while the vertical resistor (Rv) is in parallel with the Rh and Zin of the Airspy (i.e. between center conductor of modulator and ground) to present the modulator something close to 75 Ω.
      As to resistors: use low impedance metal film types (single-turn cermets if you're going for a potentiometer approach) in a standard grounded metal enclosure.

    6. Thanks! The signal level is already perfect for me; I have no issues with it. What I demonstrated in my comment was that it is probably nowhere near maximum ratings for Airspy.

    7. Indeed, but impedance mismatch can cause all sorts of mayhem at RF: it reflects signal back into the source and overloads amplifiers, causing them to operate non-linearly, and AM depends on the signal path being linear for lowest distortion (and the mayhem is proportional to the frequency used).
      Simplest thing to try would be a 25 Ω resistance in series in the center conductor, that way, the modulator sees 75 Ω (25 + 50 (Zin)) and operates as near linear as is possible (Ignore if you have a 75 Ω input impedance option in the Airspy and enabled it).

  2. Could you share your SDR pipeline setup? That is, is it GNURadio, or piping data into entirely your own code, or what?

    1. I read IQ samples from airspy_rx via stdin pipe into my C++ array buffer. The signal flowchart above is an approximation of what happens in the code, though I left synchronization out for now.

  3. The "aliasing" link back to the same page made me smile

    1. Lol, that's funny. I meant to link to Wikipedia :D

  4. I encoded PAL color in my fpga "gpu" by calculating Y with basic rgb->greyscale conversion, then calculating G-Y and B-Y signals. Then I modulated sin() and cos() oscillators at color carrier frequency with those U and V signals and summed them up with the greyscale signal and syncs. PAL was implemented by just inverting one of those oscillators per scanline. (can't remember which)

    I guess that was a bit complex way though, but I wanted to try it out because I had just heard about quadrature modulation and complex signals. AFAIK Gideon did this on Ultimate64 by just having single oscillator and table of precalculated phase shift values for each color in C64 palette.

    1. The way you described isn't overly complex -- it's actually the straightforward way to do things if you want to support arbitrary RGB colors. The "precomputed phase offsets" method only works if you have a small palette of supported colors to begin with.

  5. If you're going down the cheap USB video capture dongle rabbit-hole via the EasyCAP passage, taking a look at the following article first might save you a fair amount of head-banging.

    https://linuxtv.org/wiki/index.php/Easycap

    The information here will help you determine the specific clone that's in the device you have on order and maybe assist in chasing down a working macOS driver for it.

    The dongle you already have may have a V4L driver in Linux - might be worth checking. Also, if you don't already have the 'lsusb' command on your Mac it can be had via MacPorts. It's a well-used tool in my bag of tricks.

    Could you farm out the USB video capture to one of your other machines using Linux at the expense of additional latency?

  6. You definitely could capture the DSLR's liveview via HDMI and stream it — see scanlime's famous "SERVO AF" on stream :) An HDMI capture device is required though.

