Making MIDI Art from Text

When it’s someone’s birthday, you usually sing them “Happy Birthday”. When you’re in quarantine, this is not feasible, and sending a recording is too cheesy. The text “Happy birthday! 🎉” doesn’t really do the music much justice. Thinking about this, I was reminded of MIDI art, so I decided to make a program that could take in text and make MIDI art from it. Here’s what the script produced:

Happy Birthday

A normal person might just do this using the marker tool in the piano roll section of a DAW, but where’s the fun in that?

Before we go there though, what about existing solutions? I’ve heard of Algoart and similar software, but they don’t really do what I want them to. So here we are.

Interfacing with MIDI

MIDI uses serialized data to represent music. A MIDI file is made up of tracks, and each message on a track is addressed to one of 16 channels. For what I’m doing here, we don’t really have to go into the depths of how MIDI works. All we really have to know is that a MIDI file can have multiple tracks, and that we only need to write notes to a single one.

mido was the quickest way to get this done in Python. Here’s how you’d write out a single note to a MIDI file with it:

import mido

mid = mido.MidiFile()
track = mido.MidiTrack()
mid.tracks.append(track)  # attach the track so it gets written out with the file

track.append(mido.Message('note_on', note=60, velocity=100, time=32))
track.append(mido.Message('note_off', note=60, velocity=100, time=0))

# Save the file
mid.save('test_single_note.mid')

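The note=60 specifies which note you want to play (60 is middle C), the velocity=100 specifies how hard you want to hit the key, and time=32 handles the timing. One thing to watch: in mido, the time of a message on a track is a delta, the number of ticks to wait after the previous message. So here the note_on fires 32 ticks in and the note_off follows immediately; if you want the note to ring for a while, put the wait on the note_off instead. From this, it’s easy to see how the code above creates a note (by turning it on and off).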
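
To make the timing concrete, here is a minimal sketch in plain Python, with no mido involved. The (kind, note, delta) tuples and the note_spans helper are made up for illustration; they only mimic how ticks accumulate along a track.

```python
def note_spans(messages):
    """Return {note: (start_tick, length_in_ticks)} for (kind, note, delta) tuples."""
    now = 0          # absolute position in ticks
    started = {}     # note -> tick at which it was switched on
    spans = {}
    for kind, note, delta in messages:
        now += delta  # every message first waits `delta` ticks
        if kind == 'note_on':
            started[note] = now
        elif kind == 'note_off' and note in started:
            start = started.pop(note)
            spans[note] = (start, now - start)
    return spans


# Same timing as the snippet above: wait 32 ticks, then on and off together.
delayed = [('note_on', 60, 32), ('note_off', 60, 0)]
# Wait moved to the note_off: the note starts at once and rings for 32 ticks.
held = [('note_on', 60, 0), ('note_off', 60, 32)]

print(note_spans(delayed))  # {60: (32, 0)}
print(note_spans(held))     # {60: (0, 32)}
```

For MIDI art the first variant is usually good enough, since all that matters is where the notes land in the piano roll.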

What’s the sound of a character?

Although we see the correspondence between a string and how it appears on screen, a computer doesn’t. s = 'Hello World' is just a sequence of character codes; it does not tell the computer that the H is being rendered with this font, at this size, etc. So we actually have to render this text into an image to be able to write it out to MIDI.

Since a typeface is typically represented by parametric curves, we can’t just directly create an image of a string. We have to rasterize our string using a particular font to get our image. Visually, this is what that process looks like:

Image from “A Closer Look At Font Rendering”, Smashing Magazine

We can convert this into a binary image, and lower the resolution enough so that each pixel can be converted into a note. This note can then be written out to a MIDI file.
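
A rough sketch of that step, in plain Python. It assumes the text has already been rasterized into a grayscale grid (rows of 0 to 255 values) by some font library; the to_note_grid helper, the block size, the threshold, and the fixed velocity of 100 are all illustrative choices, not necessarily what text2midi does.

```python
def to_note_grid(gray, block=2, threshold=128, velocity=100):
    """Shrink a grayscale image by `block` and turn bright cells into velocities."""
    rows = len(gray) // block
    cols = len(gray[0]) // block
    grid = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Average the block x block patch of pixels behind this cell
            patch = [gray[r * block + dr][c * block + dc]
                     for dr in range(block) for dc in range(block)]
            mean = sum(patch) / len(patch)
            # Binarize: bright enough becomes a note, everything else is silence
            out_row.append(velocity if mean >= threshold else 0)
        grid.append(out_row)
    return grid


# A 4x4 "image": bright patches in the top-left and bottom-right corners.
gray = [
    [255, 255,   0,   0],
    [255, 200,   0,  10],
    [  0,   0, 255, 255],
    [ 30,   0, 255, 255],
]
note_vels = to_note_grid(gray)
print(note_vels)  # [[100, 0], [0, 100]]
```

Each nonzero cell in the result stands for one note, with its row deciding the pitch and its column deciding when it plays.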

Writing to MIDI

Now we just have to carefully iterate over this array of numbers (note_vels, one velocity per pixel) and write it out to MIDI to get our MIDI art back. Here’s the for-loop that does exactly that:

new_col = True
for i, col in enumerate(note_vels.T):
    for j, note_vel in enumerate(col):
        if note_vel != 0:
            # Only the first note in a column advances time
            time = 20 if new_col else 0
            # Rows further down the image map to lower pitches
            note = 60 - 2 * j
            track.append(mido.Message('note_on', note=note, velocity=note_vel, time=time))
            track.append(mido.Message('note_off', note=note, velocity=127, time=0))
            new_col = False
    new_col = True
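
To see what the loop emits without opening a DAW, here is a small dry run in plain Python. It follows the same logic but collects (delta, note) pairs instead of appending mido messages; the plan_notes helper and the tiny grid are made up for illustration, and the grid is a list of rows rather than the array used above.

```python
def plan_notes(rows, top_note=60, step=2, col_gap=20):
    """Mirror the loop above: return (delta_ticks, note) for every lit pixel."""
    events = []
    for col in zip(*rows):  # zip(*rows) walks the grid column by column
        new_col = True
        for j, vel in enumerate(col):
            if vel != 0:
                # First lit pixel in a column advances time; the rest stack on it
                delta = col_gap if new_col else 0
                events.append((delta, top_note - step * j))
                new_col = False
    return events


# 3 rows x 3 columns: a full left column, an empty middle, one pixel on the right.
rows = [
    [100, 0,   0],
    [100, 0, 100],
    [100, 0,   0],
]
events = plan_notes(rows)
print(events)  # [(20, 60), (0, 58), (0, 56), (20, 58)]

# MIDI note numbers must stay within 0..127, so with these defaults
# the grid can be at most 31 rows tall (60, 58, ..., 0).
assert all(0 <= note <= 127 for _, note in events)
```

Note that the empty middle column adds no gap: a column with no lit pixels never emits a message, so its 20 ticks are never spent.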


Technically this is not that exciting, but it let me do something that I couldn’t quite get done with the software out there. I find it quite awesome that all it took was a really small chunk of code and some boredom.

Here’s the original MIDI art I made by hand,


If you want to try this out for yourself, you can install it as follows:

pip install text2midi

Once you’ve installed the script, the following is a generic template for using text2midi:

python -m text2midi "<message string>" path/to/output/file.mid

For example, you could try running something like this:

python -m text2midi 'Hello, World!' hello.mid

To view the generated MIDI file you can use the DAW of your choice. Something like Logic Pro X, Reaper, or (probably) GarageBand will work just fine.

Next Steps