How loop optimization in GCC uses undefined behaviour to make inferences

There are many ways to try and understand how an optimizing C compiler might transform your code.

Let’s start by thinking about the goal of an optimizing compiler. The goal is to modify the code it is presented with so that it executes more quickly, without altering the observable behaviour of a correctly formed program.

It is very important when describing the effects of an optimizer to remember that it only cares about a correctly formed program. If your program is not correctly formed (normally expressed by compiler writers as relying upon undefined behaviour) then the optimizer is allowed to make modifications. That’s it! They don’t have to preserve observable behaviour. They don’t have to execute more quickly. They don’t have to try to guess the programmer’s intent. The compiler is allowed to make modifications of more or less any form.

Over the years optimizing compilers have come to make inferences based on undefined behaviour. One of the simplest occurs when a pointer is dereferenced. In this case the compiler knows that the pointer cannot be NULL (because dereferencing a NULL pointer is undefined). It can then treat any later checks for pointer validity as unreachable code and remove them.
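A minimal sketch of that inference, using a hypothetical function (first_or_default is invented for illustration and is not taken from any real code base). Because the pointer is dereferenced first, the compiler may assume it is not NULL and is then entitled to delete the later check:

```c
#include <stddef.h>

/* Hypothetical example, for illustration only. */
int first_or_default(const int *p)
{
    int value = *p;     /* undefined behaviour if p is NULL ... */

    if (p == NULL)      /* ... so an optimizer may treat this test as */
        return -1;      /* always false and delete the branch */

    return value;
}
```

The safe ordering is to test the pointer before it is dereferenced, so that the check is still meaningful to the compiler.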

Recently GCC gained a much more powerful mechanism to track undefined behaviour that occurs at specific loop iteration counts. Having read about this and looked at a few spurious bug reports, I settled down and wrote the following plausible but buggy code.

Note: If you like puzzles then don’t scroll down past the return 0; statement and closing brace. That way you can try and spot the bug yourself. As far as I know there is only one in there!

int is_whitespace(char c)
{
    const char lookup_table[] = { ' ', '\t', '\n' };

    for (int i=0; i<=sizeof(lookup_table); i++)
        if (c == lookup_table[i])
            return 1;

    return 0;
}

So spotting the error is hopefully the easy part.

If you need a clue then be reassured that the lookup table contains three elements (it is initialized from character literals rather than a string literal so it does not get nil-terminated).

If you just want it solved so you can keep reading then observe that the loop exit condition uses <= rather than <, meaning the loop exit condition is only reached after four cycles round the loop. This results in reading past the end of the array, leading to undefined behaviour.
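For reference, a sketch of the corrected loop. The name is_whitespace_fixed is hypothetical, and the switch to size_t for the index is my own choice to match the type of sizeof rather than something the original code used:

```c
#include <stddef.h>

/* Hypothetical corrected variant of the function above. */
int is_whitespace_fixed(char c)
{
    const char lookup_table[] = { ' ', '\t', '\n' };

    /* '<' keeps i in the range 0..2, so every read stays inside the array */
    for (size_t i = 0; i < sizeof(lookup_table); i++)
        if (c == lookup_table[i])
            return 1;

    return 0;
}
```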

Now for the hard bit. What do you expect to happen when you run this code?

One answer, and one that I might have given had I been asked this question instead of writing it, is: “strictly speaking this is undefined so it needs fixing, however the padding put in by the linker probably means that lookup_table[3] will evaluate to '\0' and so the function will work more or less as intended assuming the caller never cares whether the '\0' character is whitespace or not”.

That answer might even have been adequate two years ago. However with modern compilers it is better to rely on the simpler answer to the question of what can happen: “anything”.

Nevertheless, even knowing all of the above, the transformation the compiler actually makes may still surprise you. The code comes out, unconditionally, as:

int is_whitespace(char c)
{
    return 1;
}

If you don’t believe me look at the assembler (generated using gcc-4.8.2 with -O2 -std=c99 -S on x86-64).

        .file   "g.c"
        .p2align 4,,15
        .globl  is_whitespace
        .type   is_whitespace, @function
is_whitespace:
        movl    $1, %eax
        ret
        .size   is_whitespace, .-is_whitespace
        .ident  "GCC: (GNU) 4.8.2 20131212 (Red Hat 4.8.2-7)"
        .section        .note.GNU-stack,"",@progbits

What has happened is that the compiler has realized that when i == 3 the code makes an undefined memory access. From this the compiler infers that i must be strictly less than three during execution of the function, and therefore that the end condition is unreachable and can be optimized away. Once the loop cannot exit by its end condition, the function can only return 1, so the compiler can get rid of the loop itself as well.

As a further exercise imagine what happens if the loop contains a function call that is opaque to the optimizer (that is, it is not inlined). Now we still know that the end condition is unreachable but it might matter how many times we make the function call. Of course the compiler still believes the end condition is unreachable and can optimize it away, leaving an infinite loop!
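A rough sketch of the shape of code that exercise has in mind. Everything here is hypothetical: opaque_visit stands in for a function that would really live in another translation unit, and in this sketch it only counts its calls so the result can be checked. The loop is deliberately written with the correct '<' test so that the sketch itself is well defined; the comment marks where the buggy '<=' would go.

```c
#include <stddef.h>

/* Hypothetical stand-in for a function the optimizer cannot see into.
 * Here it only counts its calls so the sketch can be checked. */
static int calls_seen;

void opaque_visit(char c)
{
    (void)c;
    calls_seen++;
}

int visit_table(void)
{
    const char lookup_table[] = { ' ', '\t', '\n' };

    calls_seen = 0;

    /* Written with '<' so the sketch is well defined. With '<=' the read of
     * lookup_table[3] would be undefined, and the compiler would be entitled
     * to drop the exit test and keep calling opaque_visit() forever. */
    for (size_t i = 0; i < sizeof(lookup_table); i++)
        opaque_visit(lookup_table[i]);

    return calls_seen;
}
```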

If you’re currently busy moaning about arrogant out-of-touch compiler writers who don’t understand the real world then please stop. Compiler writers tend to dog food (by compiling their compilers using their own compilers) and are just as in touch with the real world as every other working programmer. Perhaps instead you could thank them for transforming a rare, hard to tickle bug that could easily slip into production, into a deterministic one that should be caught almost immediately! That’s way easier to debug.

Of course the other thing you get out of the dedication of your compiler writer is faster code. In a system where there is heavy inlining of defensively written code (for example the C++ standard library) then, in principle, throwing away unreachable loop end conditions could pay significant dividends.


Fifteen minutes using a Boss Micro BR

The fifteen minute figure is a slight poetic license. I did record this track during my first fifteen minutes playing about with the recorder. However after that it took me nearly twenty minutes to figure out how to master it into a format I can share with the world!

Recording is an Epiphone Dot plugged directly into a Boss Micro BR using the P01 “SuperCln” preset. The preset has been changed from the factory setting only by turning off the chorus.

To be honest the chorus effect sounds pretty terrific but, with such a naked guitar part, it makes it sound like I have something to hide.



Faking try/catch/finally in bourne shell (and jenkins)

When Bourne shell was first released in 1977 it turned out that, for several very good reasons, Stephen Bourne had designed a nice simple language with no need for exception handling. That is, it did not need exception handling until Jenkins, also for very good reasons, started running it with -ex, where the -e part causes the shell to bail out on the first error it encounters.

Normally Jenkins’ behaviour is exactly what you need. Scripts stop as soon as something goes wrong. However a typical glue script to run a test suite overnight on a shared development board might look like the following pseudo-code:


The problem with the above code is that, when it is run using Jenkins or any other tool that runs the shell with the -ex argument, the unlock command is not run if the test suite fails: make returns the error to us, the shell exits and the board is never unlocked. A simple fix might be:


If you favour compactness (and only having to type out the unlock command once) then perhaps:

make test TARGET=$MY_DEVELOPMENT_BOARD && res=$? || res=$?
[ 0 -ne "$res" ] && false

However what about the following? Note that the “unlock” command is similar to a “finally” operation in some languages but will be executed before the catch statement:

catch echo "System tests failed! Please see logs"

The above can readily be implemented aided by a couple of simple shell functions:

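try () {
        # Run the command unless an earlier try has already failed
        if [ -z "$exception_has_been_thrown" ]
        then
                "$@" || exception_has_been_thrown=1
        fi
}

catch () {
        if [ ! -z "$exception_has_been_thrown" ]
        then
                "$@"    # Run the handler, e.g. report the failure
                false   # If "sh -ex" then exit at this point
                unset exception_has_been_thrown
        fi
}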

These functions don’t make a big difference for the simple script above. However what if you are running multiple test suites sequentially under lock?

try make smoke_test TARGET=$MY_DEVELOPMENT_BOARD
try make heavy_regression_test TARGET=$MY_DEVELOPMENT_BOARD
catch echo "System tests failed! Please see logs"

Now at last the benefits of these wrapper functions really make sense. Because all the test suites are run using the try function then they will be skipped if previous try blocks have reported an error. This gives us behaviour similar to -e but delays the reporting of the error until the unlock has been performed.

To close, and for the historians among you: despite my starting this post with a reference to 1977, the code presented wouldn’t have run in the original Bourne shell because it uses features that were added later. However I think that by 1986 (SVR3) all the features used here would have been available. Certainly if you know different then please let me know… I’d be interested.


The Encore project – Introduction

Recently I allowed myself to be given a very old beat up Encore strat copy. The idea is to have something that is provably worthless to mess about with as a confidence building re-finish project.


Somebody has attempted to refinish this guitar before and failed in quite an extreme manner. The body, originally black, is still just as black but has a weird tar like coating on top of the slightly cracked original poly finish. It must have looked better before they started work, simply because I haven’t found much underneath that I wouldn’t have been happy to address with nail varnish and T-cut. In the picture below that big rectangular patch on the side is some of that charming tar. From the shape I can only assume the new paint reacted badly with the glue of some stickers.


The scratch plate is almost worse. It looks like it is coated in Tipp-Ex…


This last picture is a close up of the controls that shows the edge of the scratch plate nicely. It’s actually a three ply scratch plate but you can hardly see the black middle ply in places.


Oddly the neck is actually in a pretty good state. Not only that but I was actually pretty stunned by its build quality (let’s just say I was not expecting great things from this guitar). The wood of the fretboard is a bit cheap (and not stained enough) but the main wood of the neck is sufficiently pretty and the frets are really quite well finished. It’s a fairly nice profile too so when I can get over my prejudices it doesn’t feel that bad to play.

The body? Not so much, they’ve put the money where it shows and put the MDF where it doesn’t! I had at least hoped for plywood (like my old Epiphone SG) so I could play about with transparent finishes. Not to be, so it goes…

Anyhow, ideas for how to refinish would be cool. I’ve more or less given up on the idea of a trans finish. I did briefly toy with doing an “industrial” finish of incompletely sanded back black poly next to very dark stained MDF under a tinted transparent finish. I might learn something doing that but I suspect what I would learn is not to put half sanded MDF under a transparent finish…


The Egmond Project – Update #2

The next part of the project turned out to be getting everything lined up. After rebuilding the guitar I left it at pitch for a week or two to see if anything moved… it didn’t. Since the last post I have also moved the bridge over a bit to improve the line up of the strings. Unfortunately with the bridge in the right place the bass string started popping out if anyone breathed too loudly so I took the bridge out to cut a much deeper saddle slot for the bass strings.

Note: This post is part of a series: [Update #1] -> [Original post]

Even with these changes it still lacked quite a bit in the playability department. After several hours trying to shave the bridge I concluded that the neck joint really did need the neck to lay back a bit more. The Egmond’s neck joint is rather different from the Fender-style neck joint you might see on electric guitars. Instead of the four big screws it relies on a single bolt to counter the tension of the strings.

The Egmond cantilever neck joint


Basically the neck pivots on the plywood at the top of the neck joint, whilst the single bolt goes through the heel of the neck, through the spring and into the body. Adding some shims to the pivot point allows the neck angle to be increased slightly, dropping the strings down a bit and making the guitar more playable.

Neck joint with shims inserted


The shims were made by setting a plane to a fairly brutal setting and running it up and down a piece of beech. Sadly they were not quite thick enough, even when doubled up. The folks over at The Guitar Grounds were very helpful: they described how 70s era Fenders were generally shimmed using whatever was at hand (often business cards). I figured if business cards were good enough for US made Fenders then I could get away with four layers of wooden shims. Actually I have three shims on one side and four on the other to correct the angle slightly.

It’s done the job. The neck is now laid back enough that the little metal wheels on the bridge can adjust the action properly. I’ve been forced to set the action a little higher than I would like because the frets could really do with the attention of a crowning file (which I don’t have). Nevertheless it makes a world of difference to its playability.

Electrification will come soon!


Crafting indentation

The code in this blog post wasn’t written by me. The code actually comes from an automatic code generator used by the wayland project (and is credited to Kristian Høgsberg). To me, the code displays simple craftsmanship and should be appreciated for something beyond its mere utility.

The code itself deserves to be read.

So here it is:

static const char *indent(int n)
{
	const char *whitespace[] = {
		"\t\t\t\t\t\t\t\t\t\t\t\t",
		"\t\t\t\t\t\t\t\t\t\t\t\t ",
		"\t\t\t\t\t\t\t\t\t\t\t\t  ",
		"\t\t\t\t\t\t\t\t\t\t\t\t   ",
		"\t\t\t\t\t\t\t\t\t\t\t\t    ",
		"\t\t\t\t\t\t\t\t\t\t\t\t     ",
		"\t\t\t\t\t\t\t\t\t\t\t\t      ",
		"\t\t\t\t\t\t\t\t\t\t\t\t       "
	};

	return whitespace[n % 8] + 12 - n / 8;
}

[From ]

I can even forgive it for neglecting to assert() its preconditions!
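For the curious, here is a sketch of what that assert() might look like. The name indent_checked is hypothetical, and the sketch assumes the table holds eight strings of twelve tabs followed by zero to seven spaces, which is what indexing with n % 8 requires. Each string ends in n % 8 spaces, and stepping the pointer back n / 8 places from the end of the tabs yields n / 8 tabs, so the largest indent that fits is 12 * 8 + 7 = 103 columns.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical checked variant; the table and arithmetic follow the post. */
static const char *indent_checked(int n)
{
    static const char *const whitespace[] = {
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t",
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t" " ",
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t" "  ",
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t" "   ",
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t" "    ",
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t" "     ",
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t" "      ",
        "\t\t\t\t" "\t\t\t\t" "\t\t\t\t" "       "
    };

    /* Only 12 tabs are available: n / 8 must not exceed 12 */
    assert(n >= 0 && n <= 12 * 8 + 7);

    return whitespace[n % 8] + 12 - n / 8;
}
```

So indent_checked(19) points at the last two tabs of the string with three trailing spaces, giving two tabs plus three spaces without allocating or copying anything.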


Bootable Fedora USB stick with encrypted home partition – part 1

In this tutorial we will repartition a USB stick and install Fedora on it allowing it to be used:

  • As encrypted storage with any modern Linux system
  • As a bootable USB stick running Fedora and using an encrypted home partition
  • To copy files to/from other computers, including those running non-Linux operating systems (this bit uses an unencrypted partition).

The basic idea is to split the disc into two partitions, Boot and Vault.

Boot is a FAT partition that interoperates well with non-Linux operating systems. The FAT partition will also contain, as files, the bootloader, read only compressed file system image and “overlay” image that allows us to amend the main filesystem. It is the compression that makes this scheme attractive. A very rich development workstation (including eclipse and lots of header packages) weighs in at less than 2GB. The other big advantage of basing things on the live images is that all the logic to stop temporary (and log) files writing out to the USB media is ready and working out of the box. This keeps down the wear on the media.

Note: The read-only compressed file system comes from the Fedora “Live” media. Thus the images easily available are the live CD and the live DVD published by the Fedora project. However it is possible to use the Fedora tools to custom roll your own live media.

The Vault is an encrypted home partition where the user files (including audio/video streams) can be stored. It is also automounted, subject to password, on any modern Linux system allowing it to be used for encrypted file exchange.

Recommended partition sizes

This is just a rough guide since it’s up to you to decide what you’ll be using the bootable stick for.

For a 4GB USB stick a 3GB FAT partition leaving a 1GB encrypted partition would be fairly flexible and allow big files to be transferred to a non-Linux operating system. Consider using a CD sized live image and a relatively small overlay partition (300MB or so).

For an 8GB USB stick, either a 4GB/4GB or a 5GB/3GB division would make sense. With a 5GB/3GB split the DVD sized live image is possible together with a generous home area and the capacity to transfer large files.

For 16GB media I like to have a very big encrypted area so I can keep lots of audio/video material on the encrypted partition. For me a 6GB/10GB split gives me exactly what I want: a 2GB live image together with a generous overlay partition (1GB) so I can easily install extra software whilst travelling if I need to.

I seldom use non-Linux operating systems these days so these recommendations assume I can use the encrypted partition for file transfer. If the primary thing you use the USB stick for is file transfer to non-Linux operating systems then perhaps you want to just pick a relatively small size for the encrypted partition (say 1GB) and give all the rest to the boot partition.

Putting it into practice

After inserting the USB media it is likely to be auto-mounted by the OS. Therefore the first thing we need to do is identify the media and unmount it. I recommend using the command line for this. Many GUI “eject” commands do more than just unmount the file system; they also do a USB shutdown that makes it impossible to use the media until you unplug and replug it (at which point it auto mounts again). Here we use mount to list the mounted devices and hunt for the device mounted on either /media or /run/media/<username>/ and then use the device name on the left to do the unmount. Remember the device name (below it is /dev/sdb1) since we’ll need that later.

[root@lobster ~]# mount
proc on /proc type proc (rw,relatime)
sysfs on /sys type sysfs (rw,relatime)
/dev/sda1 on /boot type ext3 (rw,relatime,data=ordered)
/dev/sdb1 on /run/media/drt/9A63-9772 type vfat
[root@lobster ~]# umount /dev/sdb1

Now we need to repartition the USB media to create separate Boot and Vault partitions. THIS WILL ERASE EVERYTHING ON THE DISC. Here we use parted and the argument is the device name from above (/dev/sdb1) with the numeric part at the end shaved off (/dev/sdb).

Note: The following examples are taken from my own system where I’m setting up a 16GB USB stick with a 6GB/10GB split.

[root@lobster ~]# parted /dev/sdb
GNU Parted 3.0
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: SanDisk Cruzer Fit (scsi)
Disk /dev/sdb: 16.0GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End     Size    Type     File system  Flags
 1      16.4kB  16.0GB  16.0GB  primary  fat32        lba

Remove the original partition:

(parted) rm 1

Make a 6GB FAT partition to act as the boot partition, a 10GB encrypted partition and double check things by printing the partition table:

(parted) mkpart primary fat32 16.4kB 6.0GB
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? i
(parted) mkpart primary ext2 6.0GB 16GB
(parted) print
Model: SanDisk Cruzer Fit (scsi)
Disk /dev/sdb: 16.0GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End     Size    Type     File system  Flags
 1      16.4kB  6000MB  6000MB  primary  fat32        lba
 2      6001MB  16.0GB  10.0GB  primary

(parted) quit
Information: You may need to update /etc/fstab.

Now is a good time to unplug the media, just to make sure that the kernel adopts the new partition table. This is paranoid but, hey, unplugging a USB stick isn’t so hard now is it?

Having done that, the automounter might end up deciding to mount the old filesystem (not caring that half of it is now missing). However because the file system has changed size we must make a new one in order to be safe.

Firstly we format the boot partition:

[root@lobster ~]# umount /dev/sdb1
[root@lobster ~]# mkfs.vfat -F 32 -n LIVE /dev/sdb1
mkfs.vfat 3.0.12 (29 Oct 2011)
[root@lobster ~]#

Having done that we now need to create an encrypted ext4 partition ready to use as the home area (and for Linux to Linux file transfers):

[root@lobster ~]# cryptsetup --verify-passphrase luksFormat /dev/sdb2
 This will overwrite data on /dev/sdb2 irrevocably.
Are you sure? (Type uppercase yes): YES
 Enter LUKS passphrase:
 Verify passphrase:
 [root@lobster ~]# cryptsetup luksOpen /dev/sdb2 tmp
 Enter passphrase for /dev/sdb2:
 [root@lobster ~]# mkfs.ext4 -L Vault -m 0 /dev/mapper/tmp
 mke2fs 1.42.3 (14-May-2012)
 Filesystem label=Vault
 OS type: Linux
 Block size=4096 (log=2)
 Fragment size=4096 (log=2)
 Stride=0 blocks, Stripe width=0 blocks
 610800 inodes, 2442752 blocks
 0 blocks (0.00%) reserved for the super user
 First data block=0
 Maximum filesystem blocks=2503999488
 75 block groups
 32768 blocks per group, 32768 fragments per group
 8144 inodes per group
 Superblock backups stored on blocks:
 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
 Writing inode tables: done
 Creating journal (32768 blocks): done
 Writing superblocks and filesystem accounting information: done
[root@lobster ~]# cryptsetup luksClose tmp
[root@lobster ~]#

Again this is paranoia but just to make sure everything writes out before we unplug I like to run a:

[root@lobster ~]# sync

That’s it. The USB stick is ready. You can confirm this by hot-plugging one last time and you should be prompted to enter your password by the auto mounter.

We’re now half way there. The disk is all ready to run liveusb-creator to install the bootable operating system. After that there’s one last trick to get the live operating system to mount the encrypted home partition automatically and we’re all set.

I’ll tell you about all that in another post!


I like GNOME 3 and gnome-shell…

That’s it actually.

GNOME is brilliant. Each new release gets very slightly better than last time and the developers are doing a really good job of not listening to the regressive forces that want to halt all progress in UI design and happily assume Mac OS 8 and Windows 95 were the pinnacle of human achievement.


tintdrum – tintamp’s junior stablemate

For some time now I’ve been playing with constructing my own digital modeller.

By and large I’ve deliberately kept the scope of the project to making a practice tool (rather than a studio effect) in order to try and keep things achievable. The ultimate aim is to make a device that you use like an amPlug (or iRig for iPhone) to practice with. More exactly it is a battery operated “thing” that allows you to practice without any wires except those joining the guitar to your ears.

I’m working in phases of very limited scope so that I can lose interest in the project whilst still having achieved something and there’s still a long way to go before I can call it a modeller. Nevertheless since Christmas I’ve been able to move from playing with software in the PC to playing with something real.

STM32F4-Discovery running an early version of tintdrum
The above is my recently acquired STM32F4-Discovery board running tintdrum, a fixed function groove machine designed to be used like a metronome but with a stronger groove. The idea is that this type of drum machine is a vital component of a digital practice tool so tintamp will definitely have to have one when it’s finished. However it is actually useful enough to be a separate thing in its own right. Something I can put in a box and call “done”.

The board above is running my own drum machine software. You can tell, can’t you? Until last week all it was able to do was start playing drums via the headphone socket at the bottom as soon as it was plugged in… it played a really basic 4/4, kick, snare, kick, snare beat (plus hi-hat)… at exactly 100 beats per minute… and that’s it.

This week however I’ve been able to extend it to flash an LED on the beat (a vital feature in a practice tool) and also been able to rig up a tap tempo button. This means it’s starting to feel real. That said I still need to implement controls to change the groove and volume. I’d also like to extend the drum machine code to include humanization to stop it sounding quite so stark.

Nevertheless the journey from PC to real hardware has begun. Bon voyage.


Daniel Thompson: Meeting requirements since 1978