Random thoughts on `for`, `goto`, readability

2019-12-29

nullprogram.com is one of the best blogs that I know of. Virtually every article is interesting and I learn something new.

Today, he published an article about purgeable memory. Short summary of the concept: It’s memory that might be freed (not swapped!) by the kernel if memory pressure rises. Of course, your code must be prepared to handle this situation.

The idea is: You have some computation to do that takes some time and a considerable amount of memory. You may want to use the result multiple times, but there are long pauses in between. And, it’s faster to just re-do the computation than to wait for the kernel to swap things from disk. So, you can tell the kernel about this.

As an example, there is this snippet of code:

/* Version 0: Original code */

uint32_t *texture = 0;
struct png *png = png_load("texture.png");
if (!png) die();

/* ... */

for (;;) {
    if (!texture) {
        texture = purgeable_alloc(png->width * png->height * 4);
        if (!texture) die();
        png_decode_rgba(png, texture);
    } else if (!purgeable_lock(texture)) {
        purgeable_free(texture);
        texture = 0;
        continue;
    }
    glTexImage2D(
        GL_TEXTURE_2D, 0,
        GL_RGBA, png->width, png->height, 0,
        GL_RGBA, GL_UNSIGNED_BYTE, texture
    );
    purgeable_unlock(texture);
    break;
}

A texture is loaded and used in an OpenGL context. texture is a long-living variable. Just think of a game menu: The menu is only opened every now and then, and always shows the same background image – that’s our texture. You want the menu to open very quickly, so you would like to keep the decoded texture in memory. But then again … If the menu is rarely shown, do we really have to keep the texture there? It’s nice to have, but not absolutely necessary. Thus, use purgeable memory for this purpose.

I want to talk about the for loop.

This for loop baffled me. I didn’t understand what was going on at first. Why is there a loop? Why is it infinite? Do we have to retry many times, because something might fail? After a moment, I noticed the continue and break, and then it dawned on me.

What’s actually going on, is this (pseudo code):

if (texture not loaded)
    load texture;
else /* texture was once loaded */
    if (kernel has purged the memory in the meantime)
        mark texture as "not loaded";
        TRY AGAIN FROM THE TOP;

use texture in OpenGL;
mark texture memory as purgeable;

In other words, the for loop is used to avoid a goto.

One advantage is that the brackets of the loop make it clear where this code section begins and ends. There can’t be any unexpected jumps from somewhere else in the code, so it eliminates one of the major pitfalls of goto.

On the other hand, I think it fails to actually create “better” code. You have to read it twice and it’s not a loop in the first place. It’s kind of a “try again” mechanism.

Once you understand it, it’s trivial. But it’s one of those things that create a tiny additional barrier. If tricks like this are used over and over, the code will be less readable and it can make it quite hard for new people to join your team. I’m no exception here, I do this, too. See, when you write code yourself, you always understand it and everything is trivial. It’s easy to forget, though, that you will at some point be “new to the team” again, too, simple because you have forgotten all those tiny little tricks in a couple of years not touching the code. This is not a new idea at all, but it’s so easy to forget.

Using goto, it looks like this:

/* Version 1: goto */

load_texture:
    if (!texture) {
        texture = purgeable_alloc(png->width * png->height * 4);
        if (!texture) die();
        png_decode_rgba(png, texture);
    } else if (!purgeable_lock(texture)) {
        purgeable_free(texture);
        texture = 0;
        goto load_texture;
    }
    glTexImage2D(
        GL_TEXTURE_2D, 0,
        GL_RGBA, png->width, png->height, 0,
        GL_RGBA, GL_UNSIGNED_BYTE, texture
    );
    purgeable_unlock(texture);

Is this better? I think so. There is no confusion about a loop anymore. We got rid of the misnomers continue and break. The fact that you read the label first makes you aware of an upcoming goto. Label and goto are in close proximity. I honestly think that this version is easier to understand.

I use goto here and there, but I’m always cautious. There has to be a good reason for it. It must assist in creating code that’s easier to understand. This, right here, would be legitimate use in my opinion.

But the obvious question is: Do we need a jump in the first place? What about this:

/* Version 2: Restructured code */

if (texture && !purgeable_lock(texture)) {
    purgeable_free(texture);
    texture = 0;
}

if (!texture) {
    texture = purgeable_alloc(png->width * png->height * 4);
    if (!texture) die();
    png_decode_rgba(png, texture);
}
glTexImage2D(
    GL_TEXTURE_2D, 0,
    GL_RGBA, png->width, png->height, 0,
    GL_RGBA, GL_UNSIGNED_BYTE, texture
);
purgeable_unlock(texture);

I restructured the code and now there is no “try again“ anymore. The pseudo code now looks like this:

if (kernel has purged our memory)
    reset pointer;

if (texture not loaded)
    load texture;
use texture in OpenGL;
mark texture memory as purgeable;

It’s a bit ugly because the check for texture == NULL has been duplicated. It’s also crucial not to write it as else if, which is why I included the empty line.

Can we boil it down even more? In essence, what we actually want to do, should look like this:

if (texture not available)
    load texture;
use texture in OpenGL;
mark texture memory as purgeable;

There is no loop, no jump, nothing. We clearly stated our intention. Can we actually do this? At the moment, no. But let’s suppose we change the library slightly:

Support handling NULL pointers: purgeable_lock() would return False or 0.
Automatically free the memory on unsuccesful locking attempts (as was suggested in the original article).

We could then do this:

/* Version 3: Altered API */

if (!purgeable_lock(texture)) {
    texture = purgeable_alloc(png->width * png->height * 4);
    if (!texture) die();
    png_decode_rgba(png, texture);
}
glTexImage2D(
    GL_TEXTURE_2D, 0,
    GL_RGBA, png->width, png->height, 0,
    GL_RGBA, GL_UNSIGNED_BYTE, texture
);
purgeable_unlock(texture);

This is the most concise version I could come up with.

On the other hand, this is a grey area. In version 3, you have to do more thinking:

If texture == NULL, the library does nothing and simply returns False. We have to decode the PNG.
If texture != NULL:
- If locking succeeds, we’re good, because we have already decoded the PNG in the past.
- If locking fails, we have to decode the PNG again.

The older I get, the easier I get annoyed by this kind of thing. It’s just one function call, but I have to simulate a variety of cases in my head. If you’re not careful, you might forget that the NULL case exists altogether. So, even if the library’s API was extended as mentioned above, I would write the code as follows:

/* Version 4: Altered API and explicit check */

if (!texture || !purgeable_lock(texture)) {
    texture = purgeable_alloc(png->width * png->height * 4);
    if (!texture) die();
    png_decode_rgba(png, texture);
}
glTexImage2D(
    GL_TEXTURE_2D, 0,
    GL_RGBA, png->width, png->height, 0,
    GL_RGBA, GL_UNSIGNED_BYTE, texture
);
purgeable_unlock(texture);

The check for texture == NULL is not strictly required as the library would do this internally. Still, it’s worth it, because now the reader immediately knows that this if is about two different cases: We decode the PNG if it’s the very first time or if the memory has been purged.

On a final note, I like writing texture == NULL instead of !texture, all for the same reasons. It’s a pointer and we’re checking if it’s NULL. More bytes to type, but clear as day.

Comments?

Random thoughts on for, goto, readability

Random thoughts on `for`, `goto`, readability