Where are my files? Part 2

2018-03-24

The topic of how to find your files came up again.

In the first part, I only thought about “normal” binaries, i.e. compiled C programs. Things work a bit differently, though, when it comes to interpreted scripts that are launched using the shebang mechanism.

Quick recap of shebang lines

Suppose you have a simple shell script:

#!/bin/sh

echo hi

When you run it, …

$ ./my_script foo bar baz

… the kernel actually runs the binary “/bin/sh” behind the scenes. The first argument will be the path to your script, followed by the arguments you specified on the command line. So, the actual call looks like this:

$ /bin/sh ./my_script foo bar baz

And there you have it. When you run something using a shebang line, the interpreter knows very well where your script is located. This is a necessity, because it must open and read that file.

It’s now just a matter of passing this information from the interpreter to your actual code.
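To see that both calls really are the same thing, here is a minimal sketch. It assumes a throwaway script with the made-up name “demo” in a temporary directory, and it relies on the fact that a POSIX shell exposes the script path it was given as “$0”:

```shell
#!/bin/sh
# Create a throwaway script (the name "demo" is made up for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/demo" <<'EOF'
#!/bin/sh
# $0 is the script path that was handed to /bin/sh.
printf 'script: %s\n' "$0"
printf 'arg: %s\n' "$@"
EOF
chmod +x "$tmp/demo"

# Once through the shebang mechanism, once by calling the interpreter by hand.
via_shebang=$("$tmp/demo" foo bar)
by_hand=$(/bin/sh "$tmp/demo" foo bar)

echo "$via_shebang"
```

Both variables should end up with identical content, because the script sees the same path and the same arguments either way.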

GNU Bash

Bash has a special variable which holds the path: “$BASH_SOURCE”. Suppose you have this directory structure:

opt/
└── mytool/
    ├── mytool*
    ├── some-resource.gif
    └── something-else.ogg

With “/opt/mytool/mytool” being:

#!/bin/bash

my_dir=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")
ls -al "$my_dir"

And you will see your files.
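If you want to try this without touching “/opt”, the following sketch rebuilds the layout in a temporary directory and calls the script through a symlink from somewhere else. The temporary paths and the “bin” directory are made up for the demo, and it assumes that “/bin/bash” and GNU “readlink” (for the “-e” option) are available:

```shell
#!/bin/sh
# Build a throwaway copy of the layout (all paths here are made up for the demo).
root=$(mktemp -d)
mkdir -p "$root/opt/mytool" "$root/bin"
touch "$root/opt/mytool/some-resource.gif"

cat > "$root/opt/mytool/mytool" <<'EOF'
#!/bin/bash
my_dir=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")")
# Use the directory to address a file that ships with the script.
ls "$my_dir/some-resource.gif"
EOF
chmod +x "$root/opt/mytool/mytool"

# Call the script through a symlink that lives in a different directory.
ln -s "$root/opt/mytool/mytool" "$root/bin/mytool"
found=$(cd / && "$root/bin/mytool")
echo "$found"
```

The script should still report the resource next to the real file, not next to the symlink.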

This approach isn’t new and has been discussed over here, for example:

Making your own shebang “interpreter”

Interpreters mentioned in shebang lines are not special programs. They don’t need to be interpreters at all and you can, of course, easily write your own.

Let’s create a simple file with only one line in it and make it executable:

#!/opt/mytool/mytool-shebang

Let’s call this file “mytool” and put it in “/opt/mytool” again.

When you try to execute this file, the operating system will evaluate the shebang line and it will try to execute “/opt/mytool/mytool-shebang”. Let’s create that file as well. It can be another script or something else like a C program – let’s go for the latter.

#define _XOPEN_SOURCE 500

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
    int i;
    char *rp;

    if (argc < 2)
    {
        fprintf(stderr, "Need argv[1]\n");
        return 1;
    }

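    /* realpath() returns NULL on failure, e.g. if the file does not exist. */
    rp = realpath(argv[1], NULL);
    if (rp == NULL)
    {
        perror("realpath");
        return 1;
    }
    printf("arg 1: '%s', resolved: '%s'\n", argv[1], rp);
    free(rp);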

    for (i = 2; i < argc; i++)
        printf("arg %d: '%s'\n", i, argv[i]);

    return 0;
}

Compilation:

$ cc -std=c99 -Wall -Wextra -o mytool-shebang mytool-shebang.c

And that’s pretty much it. You can now call “mytool” in various ways and it will find your files:

$ cd /opt/mytool
$ ./mytool
arg 1: './mytool', resolved: '/opt/mytool/mytool'

$ ./mytool foo bar
arg 1: './mytool', resolved: '/opt/mytool/mytool'
arg 2: 'foo'
arg 3: 'bar'

$ cd /tmp
$ /opt/mytool/mytool
arg 1: '/opt/mytool/mytool', resolved: '/opt/mytool/mytool'

$ cd /tmp
$ PATH=$PATH:/opt/mytool mytool
arg 1: '/opt/mytool/mytool', resolved: '/opt/mytool/mytool'

That “realpath()” business makes sure that you can use symlinks as well:

$ cd /tmp
$ ln -s /opt/mytool/mytool
$ ./mytool 
arg 1: './mytool', resolved: '/opt/mytool/mytool'

$ cd /tmp
$ rm mytool 
$ ln -s /opt/mytool 
$ cd mytool
$ ./mytool
arg 1: './mytool', resolved: '/opt/mytool/mytool'
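As mentioned, the “interpreter” could also be another script. Here is a rough sketch of that variant. It assumes a Linux kernel (which accepts a script as a shebang interpreter; other systems may refuse to do so) and GNU “readlink -e” as a stand-in for “realpath()”. Everything lives in a temporary directory instead of “/opt/mytool”:

```shell
#!/bin/sh
# Temporary directory as a stand-in for /opt/mytool (an assumption of this sketch).
dir=$(mktemp -d)

# The "interpreter": a plain shell script instead of the C program.
cat > "$dir/mytool-shebang" <<'EOF'
#!/bin/sh
[ $# -ge 1 ] || { echo 'Need $1' >&2; exit 1; }
script=$1
shift
printf "arg 1: '%s', resolved: '%s'\n" "$script" "$(readlink -e "$script")"
i=2
for arg in "$@"; do
    printf "arg %d: '%s'\n" "$i" "$arg"
    i=$((i + 1))
done
EOF

# The tool itself consists of nothing but a shebang line.
printf '#!%s\n' "$dir/mytool-shebang" > "$dir/mytool"
chmod +x "$dir/mytool-shebang" "$dir/mytool"

output=$(cd "$dir" && ./mytool foo bar)
echo "$output"
```

The output should have the same shape as that of the C version: the script path as given, the resolved path, and then the remaining arguments.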

Tweak: Putting everything in “$PATH”

“/opt/mytool/mytool” currently reads like this:

#!/opt/mytool/mytool-shebang

You can do what most Python scripts do and abuse “/usr/bin/env” to traverse the path for you:

#!/usr/bin/env mytool-shebang

The interpreter “mytool-shebang” now has to be found via “$PATH” as well, so you have to call it like this:

$ PATH=$PATH:/opt/mytool mytool
arg 1: '/opt/mytool/mytool', resolved: '/opt/mytool/mytool'
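A small sketch of what happens with and without the right “$PATH”. It uses a tiny stand-in interpreter written in shell (so it assumes a system that accepts a script as interpreter, like Linux) and made-up temporary directories “bin” and “tool”:

```shell
#!/bin/sh
# Made-up layout: interpreter in "bin", the tool in "tool".
dir=$(mktemp -d)
mkdir "$dir/bin" "$dir/tool"

# Tiny stand-in interpreter that only reports which script it was given.
cat > "$dir/bin/mytool-shebang" <<'EOF'
#!/bin/sh
printf "arg 1: '%s'\n" "$1"
EOF

# The tool locates its interpreter through env and $PATH.
printf '#!/usr/bin/env mytool-shebang\n' > "$dir/tool/mytool"
chmod +x "$dir/bin/mytool-shebang" "$dir/tool/mytool"

# Without the interpreter's directory in $PATH, env cannot find it.
status=0
without=$("$dir/tool/mytool" 2>/dev/null) || status=$?

# With both directories in $PATH, the lookup works.
with=$(PATH="$PATH:$dir/bin:$dir/tool" mytool)

echo "exit status without \$PATH entry: $status"
echo "$with"
```

So the price for not hard-coding the interpreter path is that the call fails if “$PATH” isn’t set up properly.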

Conclusion

All of this should be fairly portable, but I only tested it on Linux and OpenBSD.

I’m not sure if I’d use a hack like this in real life, because it does feel a little convoluted. I’ve never seen it, either. It’s nice to know your options, though. :-)