blog · git · desktop · images · contact & privacy · gopher


fontconfig caches vs. /tmp

My setup isn’t really that complex. The system boots to a text console, so I can log in on tty1. When I do that, my ~/.bashrc decides to automatically run startx. There is no display manager. My ~/.xinitrc consists of a bunch of programs to auto-start, mostly some terminals, the bars at the edges of the screen, and the window manager. And it sets a wallpaper. That’s pretty much it.

So even though little of the “big” “modern” “bloated” software is involved, there was a short freeze right after the start of X and before the bars appeared. It used to be short enough for me not to bother, but it got a lot worse yesterday.

The debugging session

There was no obvious cause for this. I trimmed down my ~/.xinitrc to just contain

openbox

and nothing else. I chose Openbox for my test runs, because it’s a mature program and won’t do something silly, right? Maybe there is a bug in my own window manager. Well, didn’t help. When I did a right-click to open Openbox’s menu, it froze for about 3-4 seconds.

But only the first time. Opening the menu a second time was fast. I also noticed that this only happened the first time after booting.

Okay, what now? Log in to a second text console on tty2 and run this, then start X on the other console:

while true; do ps aux >"$(date +%T.%N)"; done

Interesting, openbox had 50 to 80% CPU time for some seconds. What is it doing?

Out of curiosity, I switched back to my original ~/.xinitrc and had a look at the process list here. It showed that pretty much all X clients needed a large amount of CPU time, too. Each terminal, each bar, the window manager itself – but not tools like sxhkd.

So, this appeared to be unrelated to the window manager and it’s probably not a general X problem. But what are these programs doing? Time for strace.

I changed one of the calls in my ~/.xinitrc from

terminal -e foo

to:

strace -tt -ff -o /tmp/term.log terminal -e foo

And there you have it:

...
06:48:17.516879 stat("/usr/share/fonts/100dpi/UTI___10-ISO8859-1.pcf.gz", {st_mode=S_IFREG|0644, st_size=5080, ...}) = 0
06:48:17.516914 openat(AT_FDCWD, "/usr/share/fonts/100dpi/UTI___10-ISO8859-1.pcf.gz", O_RDONLY) = 5
06:48:17.516939 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
06:48:17.516959 fstat(5, {st_mode=S_IFREG|0644, st_size=5080, ...}) = 0
06:48:17.516981 mmap(NULL, 5080, PROT_READ, MAP_PRIVATE, 5, 0) = 0x7efcdda1b000
06:48:17.517006 close(5)                = 0
06:48:17.517211 munmap(0x7efcdda1b000, 5080) = 0
06:48:17.517235 openat(AT_FDCWD, "/usr/share/fonts/100dpi/UTI___10-ISO8859-1.pcf.gz", O_RDONLY) = 5
06:48:17.517260 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
06:48:17.517280 fstat(5, {st_mode=S_IFREG|0644, st_size=5080, ...}) = 0
06:48:17.517302 mmap(NULL, 5080, PROT_READ, MAP_PRIVATE, 5, 0) = 0x7efcdda1b000
06:48:17.517324 close(5)                = 0
06:48:17.517457 munmap(0x7efcdda1b000, 5080) = 0
...

This was done for every font file on my system. And each and every program, who uses fonts in some way, did it. sxhkd is just a hotkey daemon, it doesn’t display text, so it didn’t show up in the ps aux log files above. But the bars, the terminals, the window manager, … they all display text and they all crawled through my fonts.

How many fonts are there? The calls above finish in under one millisecond, so how can this take so long? Here are some numbers from ncdu:

--- /usr/share/fonts ---------------------
   24.3 MiB [##########]     39 /TTF
   17.0 MiB [######    ]  1,897 /100dpi
   15.7 MiB [######    ]  1,897 /75dpi
   10.8 MiB [####      ]    584 /misc
    7.0 MiB [##        ]      2 /noto
    5.9 MiB [##        ]     26 /OTF
    3.2 MiB [#         ]     38 /gsfonts
    2.8 MiB [#         ]     60 /Type1
  776.0 KiB [          ]     54 /encodings
  704.0 KiB [          ]     13 /local
  600.0 KiB [          ]      8 /cantarell
  208.0 KiB [          ]     18 /util
   16.0 KiB [          ]      3 /cde
   12.0 KiB [          ]      2 /cyrillic
    4.0 KiB [          ]         .uuid

The megabytes aren’t that important, but there are close to 2'000 files in /usr/share/fonts/100dpi and …/75dpi.

But what exactly is going on? Why are these fonts being processed? Why only the first time? At the end of the strace log, there was something along these lines:

06:54:55.253048 access("/var/cache/fontconfig", W_OK) = -1 EACCES (Permission denied)
06:54:55.253078 access("/tmp/cache-void/fontconfig", W_OK) = 0

A-ha.

Having $XDG_CACHE_HOME in /tmp

My ~/.bashrc does this:

export XDG_CACHE_HOME=/tmp/cache-$USER
install -dm0700 "$XDG_CACHE_HOME"

A lot of programs adhere to the freedesktop.org proposals these days, so they’ll use $XDG_CACHE_HOME among others. A lot of junk ends up there:

I want to remove all these files on a regular basis. I have often found caches to be the cause of “evil” things. Some program decides something, puts it in its cache, and it’ll stay there for a long time. This can be really annoying and it can lead to misbehavior.

Some people go one step further and have their entire $HOME in RAM. I think that’s a good idea, I’ve just been way too lazy to set it up. See, when I boot the computer, I want a clean, well-defined state. Leftovers from previous sessions can be useful, but most of the time they feel like litter and garbage from the last party.

fontconfig’s cache

Well, okay then, but the thing is that fontconfig will put its cache in $XDG_CACHE_HOME, too. And, as we already saw, creating that cache is rather expensive. What can I do about that?

The first thing to note: The strace logs show that the programs process files in /usr/share/fonts and then they try to write to /var/cache/fontconfig. This means that

  1. there is a system-wide fontconfig cache
  2. and it’s out of date.

Me having $XDG_CACHE_HOME point to /tmp only revealed this issue, it’s not the cause of it. The issue revolves around the system-wide cache, not about the one of my user.

On Arch Linux, there is a pacman hook to automatically update the system’s fontconfig cache:

$ cat /usr/share/libalpm/hooks/fontconfig.hook
[Trigger]
Type = File
Operation = Install
Operation = Upgrade
Operation = Remove
Target = usr/share/fonts/*

[Action]
Description = Updating fontconfig cache...
When = PostTransaction
Exec = /usr/bin/fc-cache -s

Sadly, I didn’t dig deep enough to find out why it got out of date. I was in a bit of a hurry and just ran fc-cache -s as root.

What’s in that cache anyway? Do you really need it? Well, yes, you do – the mere fact that it takes several seconds to build the cache is a strong indicator. From what I can tell, fontconfig opens each font file, possibly decompresses it, and reads attributes like

Removing the font cache and running FC_DEBUG=32 fc-list gives you an idea of what’s probably in there.

A workaround for future breakages

I don’t want to give up having caches in /tmp. Plus, it really isn’t the cause of this issue: Having a persistent per-user fontconfig cache in $HOME would have covered up a broken system-wide cache, which is exactly the kind of problem I’m trying to avoid. Instead, my ~/.xinitrc now contains this, before starting any other programs:

echo 'Running fc-list to make sure fontconfig cache is up to date ...'
fc-list >/dev/null 2>&1
echo 'fc-list done.'

This triggers an update of my user’s cache in /tmp. That cache is only meant for the tiny set of fonts I put in ~/.fonts – none of the /usr/share/fonts stuff will end up here. Running this implicitly means that all programs starting afterwards operate on a fresh cache and won’t trigger a cache update by themselves – which is what happened before, because they all started at the same time and found an outdated cache.

The echo statements are important, too. Right at the top of my ~/.xinitrc, there is this:

#!/bin/bash

[[ "$1" != logging ]] && exec systemd-cat "$0" logging

This collects all output of those programs in my journal. (Due to file descriptor inheritance, this also covers programs run by sxhkd later on, and thus also by dmenu.) While systemd’s journal is still controversial, it does two nice things:

  1. It automatically “rotates” log files, so I don’t have to implement that.
  2. It automatically adds time stamps.

(Yes, these aren’t unique features. As I said, it’s controversial.)

The next time I’m having issues like this one right here, I can look at journalctl -ra. The time stamps will quickly reveal if it was a fontconfig issue.