Visualizing hex dumps with Unicode emoji

Memorizing SSH public key fingerprints can be difficult; they're just long random numbers displayed in base 16. There are some terminal-friendly solutions, like OpenSSH's randomart. But because I use a Unicode terminal, I like to map the individual bytes into characters in the Miscellaneous Symbols and Pictographs block.

This Perl script does just that:


@emoji = qw( 🌀  🌂  🌅  🌈  🌙  🌞  🌟  🌠  🌰  🌱  🌲  🌳  🌴  🌵  🌷  🌸
             🌹  🌺  🌻  🌼  🌽  🌾  🌿  🍀  🍁  🍂  🍃  🍄  🍅  🍆  🍇  🍈
             🍉  🍊  🍋  🍌  🍍  🍎  🍏  🍐  🍑  🍒  🍓  🍔  🍕  🍖  🍗  🍘
             🍜  🍝  🍞  🍟  🍠  🍡  🍢  🍣  🍤  🍥  🍦  🍧  🍨  🍩  🍪  🍫
             🍬  🍭  🍮  🍯  🍰  🍱  🍲  🍳  🍴  🍵  🍶  🍷  🍸  🍹  🍺  🍻
             🍼  🎀  🎁  🎂  🎃  🎄  🎅  🎈  🎉  🎊  🎋  🎌  🎍  🎎  🎏  🎒
             🎓  🎠  🎡  🎢  🎣  🎤  🎥  🎦  🎧  🎨  🎩  🎪  🎫  🎬  🎭  🎮
             🎯  🎰  🎱  🎲  🎳  🎴  🎵  🎷  🎸  🎹  🎺  🎻  🎽  🎾  🎿  🏀
             🏁  🏂  🏃  🏄  🏆  🏇  🏈  🏉  🏊  🐀  🐁  🐂  🐃  🐄  🐅  🐆
             🐇  🐈  🐉  🐊  🐋  🐌  🐍  🐎  🐏  🐐  🐑  🐒  🐓  🐔  🐕  🐖
             🐗  🐘  🐙  🐚  🐛  🐜  🐝  🐞  🐟  🐠  🐡  🐢  🐣  🐤  🐥  🐦
             🐧  🐨  🐩  🐪  🐫  🐬  🐭  🐮  🐯  🐰  🐱  🐲  🐳  🐴  🐵  🐶
             🐷  🐸  🐹  🐺  🐻  🐼  🐽  🐾  👀  👂  👃  👄  👅  👆  👇  👈
             👉  👊  👋  👌  👍  👎  👏  👐  👑  👒  👓  👔  👕  👖  👗  👘
             👙  👚  👛  👜  👝  👞  👟  👠  👡  👢  👣  👤  👥  👦  👧  👨
             👩  👪  👮  👯  👺  👻  👼  👽  👾  👿  💀  💁  💂  💃  💄  💅 );

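# Read standard input (or files named on the command line) line by line.
while (<>) {
  # Look for a run of colon-separated hex bytes, e.g. "de:ad:be:ef".
  if (/[a-f0-9:]+:[a-f0-9:]+/) {
    # Keep the text before the match, the match itself, and the text after it.
    my ($before, $match, $after) = ($`, $&, $');
    # Turn each hex byte into a number and use it as an index into @emoji.
    print $before.join("  ", map { $emoji[$_] } map hex, split /:/, $match)." ".$after;
  }
}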

What's happening here? First we create a 256-element array containing a hand-picked collection of emoji, one for each possible byte value, so each emoji has an index from 0x00 to 0xff. Then we loop through standard input and look for lines containing colon-separated hex bytes. Each hex value is converted to a number and replaced with the emoji at that index. Lines without any colon-separated hex bytes are not printed.
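The per-line step can also be pulled out into a small subroutine, which makes the hex-to-index mapping easier to see and to test. This is only a sketch under a few assumptions: `emojify_line` is a hypothetical name, the table is a stand-in built from 256 consecutive code points starting at U+1F400 rather than the hand-picked array above, and unlike the script it returns non-matching lines unchanged instead of dropping them.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';

# Stand-in table (an assumption for this sketch): 256 consecutive code
# points starting at U+1F400, not the hand-picked array from the script.
my @table = map { chr(0x1F400 + $_) } 0 .. 255;

# Hypothetical helper: replace the first run of colon-separated hex bytes
# in a line with table entries. Lines without such a run come back unchanged.
sub emojify_line {
    my ($line, $table) = @_;
    return $line unless $line =~ /[a-f0-9:]+:[a-f0-9:]+/;
    my ($before, $match, $after) = ($`, $&, $');
    # "de" -> 0xde -> the table entry at index 222, and so on.
    my @symbols = map { $table->[hex $_] } split /:/, $match;
    return $before . join('  ', @symbols) . ' ' . $after;
}

print emojify_line("2048 de:ad:be:ef host (RSA)\n", \@table);
```

Passing the table as a reference means the hand-picked `@emoji` array could be swapped in with `emojify_line($_, \@emoji)`.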

Here's the output:

[Image: Terminal screenshot showing a PGP key fingerprint and the same with all hex numbers replaced with emoji.]

The script could easily be extended to support output from other hex-formatted sources as well, such as xxd:

[Image: Terminal screenshot showing a hex dump of a poem and the same with all hex numbers replaced with emoji. kissofoni; tassun kynsi neulana / musa korvista kajahtaa]
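One way such an xxd extension might look, as a rough sketch rather than the code behind the screenshot: it assumes default `xxd` output (an offset, a colon, space-separated groups of one or two hex bytes, then the ASCII column), uses the hypothetical name `emojify_xxd_line`, and again substitutes a stand-in table of 256 consecutive code points for the hand-picked array.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';

# Stand-in table (an assumption): 256 consecutive code points from U+1F400.
my @table = map { chr(0x1F400 + $_) } 0 .. 255;

# Hypothetical helper for default xxd output, assumed to look like
#   00000000: 6b69 7373 6f66 6f6e  kissofon
# Lines that do not fit this shape are returned unchanged.
sub emojify_xxd_line {
    my ($line, $table) = @_;
    return $line
        unless $line =~ /^([0-9a-f]+: )((?:(?:[0-9a-f]{2}){1,2} )+)(.*)\z/s;
    my ($offset, $hex, $rest) = ($1, $2, $3);
    # Swap every two-digit byte for its table entry; the spacing between
    # groups is left as it was.
    $hex =~ s/([0-9a-f]{2})/$table->[hex $1]/ge;
    return $offset . $hex . $rest;
}

print emojify_xxd_line("00000000: 6b69 7373 6f66 6f6e  kissofon\n", \@table);
```

Only the hex column is rewritten; the offset and the ASCII column pass through untouched, so the text on the right still lines up with the dump.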

Some additional methods for visualizing hex dumps and key fingerprints, from the comments section:

15 comments:

  1. Love it! I have also been playing with different ways to visualize hex digests. http://user.xmission.com/~atoponce/art/ for some visual eye candy goodness on PGP keys.

    1. Interesting!
      I have been playing with the concept of visual hashes too: http://sebsauvage.net/wiki/doku.php?id=php:vizhash_gd

      That's just a small PHP lib. Someone even ported it to JavaScript.

      I use it as avatars in a pseudonymous discussion board (ZeroBin, http://sebsauvage.net/wiki/doku.php?id=php:zerobin). Works well for this purpose.

      There are also other interesting uses for these kinds of visual hashes.

  2. Which terminal are you using?

  3. Heh, resembles an old Lotus password visualizing hint :)

    1. Did you ever work out how to read the password in the clear from that? As recently as 2011, this was possible.

  4. Memorizing SSH fingerprints is your first mistake; representing them in a form which is even more open to "social engineering" is the second.
    Memorizing 16 icons, words, or shapes isn't easy either, and if you start training people to verify the identity of an SSH server that way, it will just lead to attacks that use public keys with only a single byte difference. Most people will not notice a single icon out of place, especially if you substitute a different one for it, and even fewer people will notice when you flip the order of a pair of icons.
    Server verification should be done by software you trust against a pre-existing list of keys which are stored in a secure location that you can control and monitor. The rest is just pure nonsense.

    1. I fully agree. How dare she do anything fun not directly useful to your existence. Bad hacker!

    2. As far as I know, the attacks you describe would not be a problem. Generating a key with "a few different bytes" is almost as hard as just cracking the RSA key in the first place.

  5. Good thing that the emoji are double-wide, so they take up the same width as two hex digits. Pure luck?

  6. Or use radare2 => "r2 -qfncpxe file" :)

  7. Love the idea, thanks! Made my own Bash version: https://github.com/onnozweers/scripts/blob/master/emoji

  8. What do you think could be done to employ all emoji code points (or another arbitrary set of code points)? There are currently 1093 characters with "emoji presentation" attribute.

    My idea was to consider the file a positional representation of a long 256-ary number and convert it into a 1093-ary number, but that process is O(N^2) and requires me to store the whole representation in memory (also, as a kludge, I had to imagine that the file starts with a non-zero byte, then discard it on output - otherwise the leading zero bytes would be lost). Maybe I can keep it fast by employing another kludge and encoding small blocks to avoid quadratic rise in complexity.

    There has got to be a better way.
