blog · git · desktop · images · contact & privacy · gopher


Algorithmic music for nixers.net

There’s a contest going on at nixers.net: Who can create the best intro for the podcast?

These are my entries for the contest:

Well, I don’t consider these epic masterpieces. What’s interesting about them is that they’re created using very concise C code. Here you go:

#include <stdio.h>

int
main()
{
    int t;

#if 0
    /* Intro (8kHz) */
    for (t = 0; t < 32768; t++)
        putchar((unsigned char)(t | (-t >> 8)) - 128);
    for (t = 0; t < 65536; t++)
        putchar((unsigned char)((t * 4 | t * 2 | t) | (t >> 4) | (t >> 8)) / 2);
#endif

#if 0
    /* Intro (44kHz) */
    for (t = 0; t < 4 * 32768; t++)
        putchar((unsigned char)((t | (-t >> 8)) >> 2) - 128);
    for (t = 0; t < 4 * 65536; t++)
        putchar((unsigned char)(((t * 4 | t * 2 | t) | (t >> 4) | (t >> 8)) >> 2) / 2);
#endif

#if 0
    /* Outro (8kHz) */
    for (t = 0; t < 65536; t++)
        putchar((unsigned char)(t | (-t >> 10)) - 128);

    for (t = 32768; t < 65536; t++)
        putchar((unsigned char)((t * 8) | (t >> 4) | (t >> 8)) / 2);
#endif

#if 0
    /* Outro (44kHz) */
    for (t = 0; t < 4 * 65536; t++)
        putchar((unsigned char)((t | (-t >> 10)) >> 2) - 128);

    for (t = 4 * 32768; t < 4 * 65536; t++)
        putchar((unsigned char)(((t * 8) | (t >> 4) | (t >> 8)) >> 2) / 2);
#endif

    return 0;
}

Some time ago, I read about this kind of algorithmic music. I’ve never tried it before, so this was the perfect opportunity.

Basics

The code prints data to stdout. It has no inherent meaning. You can interpret it as raw 8 bit unsigned PCM audio, though. Do this and you get the song. On GNU/Linux:

$ ./sound | aplay -q -r 8000 -f U8

Of course, you could also use another sampling rate, which would result in a different playing speed and higher pitch. I’ll be doing everything in 8kHz for now. You’re also not forced to use U8, but you’ll soon understand why this choice makes sense.

You can also write it to a file and import it into Audacity:

$ ./sound >audio.raw

Breakdown of the intro

Basic idea and first part

The basic idea of the whole thing is this piece of code:

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; ; t++)
        putchar(t);
}

Here is what it sounds like: sawtooth.ogg.

It’s an integer that counts upwards. A little magic happens in putchar(): Before printing anything to stdout, it casts the value to an unsigned char (hence the U8). This means that our loop produces the following output:

0x00 0x01 0x02 0x03 ... 0xFD 0xFE 0xFF 0x00 0x01 0x02 ...
                                  ^^^^^^^^^

I’ve marked the important part. After 0xFF comes 0x00 in the output, but the real value of t will be 0x100, not 0x00. This means that we have more bits to work with than what you see on stdout. We can and will make use of those upper bits later.

It also means that an operation like 123 * 32 will result in an “audible” value of 96 – everything we do is modulo 256. This is one of the reasons for many unexpected sound effects.

Let’s move on. Consider this:

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; ; t++)
        putchar(t >> 8);
}

Remember that a plain t results in ascending numbers, a sawtooth pattern. What happens if we shift the bits to the right? Firstly, we discard the lower eight bits. Secondly, we reduce the frequency of the sawtooth pattern: Up until the point where we hit the first “overflow” (0xFF0x00), we get only zeroes. After that, we get a 0x01 until the next overflow happens. And so on.

When doing a >> 8, we reduce the frequency by a factor of 256, which is below audible range. I can only show you a screenshot:

scrot-t-8.png

As you can see, this is not a simple straight line but a stairway pattern. This is due to the fact that we only do integer arithmetic. Yet another cause for neat effects. Expressions like t >> 8 remain at the same value for some milliseconds. This is the basis for the illusion of distinct tones, instead of smoothly “gliding” frequencies. Let me illustrate.

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; ; t++)
        putchar(t | (-t >> 8));
}

This is the first part of the actual Nixers intro. What happens here? -t >> 8 is the same as t >> 8 but mirrored along the x axis, so it starts at 0xFF (almost – let’s ignore the first zero) and slowly decreases to 0x00. We then do a bitwise OR with t. And this is where it gets really interesting.

Let’s listen to the code above: intro-part-1.ogg.

Highly unexpected, isn’t it? Let’s try to analyze what causes this pattern.

-t >> 8 is:

0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF 0xFF ... 0xFF
0xFE 0xFE 0xFE 0xFE 0xFE 0xFE 0xFE 0xFE 0xFE 0xFE 0xFE ... 0xFE
0xFD 0xFD 0xFD 0xFD 0xFD 0xFD 0xFD 0xFD 0xFD 0xFD 0xFD ... 0xFD
...

t is:

0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x10 ... 0xFF
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x10 ... 0xFF
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x10 ... 0xFF
...

When you OR these two together, nothing happens as long as -t >> 8 stays at 0xFF. But once it drops below that, it makes room for bits to be affected by the OR. Let’s say, -t >> 8 is at 0xFE, after the OR we get:

0xFE 0xFF 0xFE 0xFF 0xFE 0xFF 0xFE 0xFF 0xFE 0xFF ...

An alternating pattern! When listening to this, it’s a very high frequency but also extremely quiet. Let’s look at another example, -t >> 8 is at 0xF8:

0xF8 0xF9 0xFA 0xFB 0xFC 0xFD 0xFE 0xFF 0xF8 0xF9 ...

You can see the original sawtooth pattern of t emerging, but it’s still a much higher frequency than t itself: One cycle goes from 0xF8 to 0xFF, that’s only eight bytes as opposed to the full 255 bytes of t. And the amplitude is larger than the 0xFF ←→ 0xFE flip above.

In other words, as -t >> 8 decreases, it makes room for more and more tones, with lower frequencies becoming more dominant.

Second part

Again, this is all about “overlaying”.

Let’s begin with the following code.

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; ; t++)
        putchar(t * 16 | t);
}

What do you think this sounds like? I first expected it to be two sawtooth patterns, the base frequency and 16 times that frequency. Indeed, quite close. Here’s a screenshot:

t-and-16.png

The “big” sawtooth pattern is t. “On top of it”, you get something similar to t * 16. But you also get a hell lot of “artifacts”, resulting in many, many overtones.

And here’s an audio file of it:

t-and-16.ogg

So, the expression t * 4 | t * 2 | t that I’m using in the actual intro is nothing but a very “rich” sawtooth pattern.

In a second step, I’m ORing that sawtooth with (t >> 4) | (t >> 8). Let’s look at that second expression first. Here’s a screenshot of t >> 4, t >> 8, and both of them ORed together:

t-4-and-t-8.png

Once more, the OR function acts as an “overlay” that also creates many overtones. You can almost see the individual bits being flipped around. Surprisingly, when you listen to this, it’s not just simple noise:

t-4-and-t-8.ogg

You can already hear a tiny little melody. More importantly, though, t >> 4 creates a “beat”. Let’s quickly listen to the following code:

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; ; t++)
        putchar((t * 4 | t * 2 | t) | (t >> 4));
}

Here it goes: drum-loop.ogg. A bass drum and a keyboard. Isn’t that nice?

t >> 8, on the other hand, works like -t >> 8 in the first part – but the other way around: It keeps low frequencies (the “beat”) at the beginning and blocks them later on.

Both bit shifts ORed together add slight variations in the melody and some overtones.

The complete second part of the intro:

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; ; t++)
        putchar((t * 4 | t * 2 | t) | (t >> 4) | (t >> 8));
}

intro-part-2.ogg

Alignment

First of all, the code so far had simple endless loops. This is not what you want for a “short podcast intro”. :-) So, we need to add bounds. Quite frankly, just use trial and error here, or add debug code to print the current value of t. Go with multiples of two. It’s easy to find the right values:

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; t < 32768; t++)
        putchar(t | (-t >> 8));
    for (t = 0; t < 65536; t++)
        putchar((t * 4 | t * 2 | t) | (t >> 4) | (t >> 8));

    return 0;
}

There’s another important detail. Let’s listen to that code: intro-unaligned.ogg. Do you hear that “blop” at the start and at the end? We want to get rid of it.

It’s simple to explain those artifacts by looking at a hexdump of our data:

00000000: 00ff ffff ffff ffff ffff ffff ffff ffff  ................
00000010: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000020: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00000030: ffff ffff ffff ffff ffff ffff ffff ffff  ................
...

When interpreted as a waveform, this is a big jump: Right from full negative peak (0x00) to full positive peak (0xFF). Something similar happens at the end of the stream:

...
00017fc0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00017fd0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00017fe0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00017ff0: ffff ffff ffff ffff ffff ffff ffff ffff  ................

There’s no final 0xFF 0x00, but the file still ends with 0xFF, that’s a full positive peak. Once playback stops, your speakers jump from 0xFF back to the center, 0x80.

A screenshot of the waveform sheds additional light on the problem: intro-unaligned.png. Here is what I want it to look like: intro-aligned.png. So, the first part must be moved halfway down. The second part must be moved down as well, but we can’t use a simple subtraction here because we use the full amplitude at the beginning. Thus, we divide by two.

And that’s just it.

#include <stdio.h>

int
main()
{
    int t;

    for (t = 0; t < 32768; t++)
        putchar((unsigned char)(t | (-t >> 8)) - 128);
    for (t = 0; t < 65536; t++)
        putchar((unsigned char)((t * 4 | t * 2 | t) | (t >> 4) | (t >> 8)) / 2);

    return 0;
}

Note that I have to do a type cast before doing the alignment. Of course. Otherwise, we’d be operating in the full integer range – but we have to get down to 0x00 to 0xFF.

The outro

… is left as an excercise for the reader. :-)

The 44kHz versions

What’s going on here? Not much, really. A higher sampling rate means that you need to provide more data to cover the same amount of time. Hence, you need to slow everything down. In our case, this is very easy by simply doing a bit shift to the right (remember the first part of the intro where we did just that?). We also have to multiply our loop bounds by a factor of four.

Why four? Because 44kHz is 4 * 11kHz and, of course, 11kHz is roughly 8kHz. :-)

Conclusion

This is not an exhaustive analysis. Many details are still uncovered. I think, though, that you can get a good feeling on what’s going on here.

And, to be honest, let’s not spoil it. Just go ahead, play with the code, and be surprised. That’s the actual fun!