Low Discrepancy Color Sequences

I have been working on a project where there are a bunch of objects next to each other and I want different colors for each, so that I can tell where one ends and another starts. In the past I’ve simply hacked this sort of palette:

for (my $i=0; $i<8; $i++){
    my $r = $i % 2;
    my $g = (int($i/2)) % 2;
    my $b = (int($i/4)) % 2;
    print "Color $i is ($r, $g, $b)\n";
}

varying the red, green, and blue channels between their min and max values. (Yes, I’m using Perl; I imprinted on it before Python existed. It’s easy enough to understand.)

The 8 colors produced:

Color 0 is (0, 0, 0)
Color 1 is (1, 0, 0)
Color 2 is (0, 1, 0)
Color 3 is (1, 1, 0)
Color 4 is (0, 0, 1)
Color 5 is (1, 0, 1)
Color 6 is (0, 1, 1)
Color 7 is (1, 1, 1)

which gives:

Color cube colors, in “ascending” order

Good enough, when all I needed was up to 8 colors. But, I was finding I needed 30 or more different colors to help differentiate the set of objects. The four-color map theorem says we just need four distinct colors, but figuring out that coloring is often not easy, and doesn’t animate. Say you’re debugging particles displayed as squares. Giving each a unique color helps solve the problem of two of them blending together and looking like one.

To make more colors, I first tried something like this, cycling each channel between 0, 0.5, and 1:

my $n=3;    # number of subdivisions along each color axis
for (my $i=0; $i<$n*$n*$n; $i++){
    my $r[$i] = ($i % $n)/($n-1);
    my $g[$i] = ((int($i/$n)) % $n)/($n-1);
    my $b[$i] = ((int($i/($n*$n))) % $n)/($n-1);
    print "Color $i is ($r, $g, $b)\n";
}

Which looks like:

3x3x3 colors, in “ascending” order

These are OK, I guess, but you can see the blues are left out until the later colors. The colors also start out pretty dark, building up and becoming mostly light at the end of the set.

And it gets worse the more you subdivide. Say I use $n = 5. We’re then just walking through variants where the red channel walks up by +0.25. Here are the first 10, to show what I mean:

Color 0 is (0, 0, 0)
Color 1 is (0.25, 0, 0)
Color 2 is (0.5, 0, 0)
Color 3 is (0.75, 0, 0)
Color 4 is (1, 0, 0)
Color 5 is (0, 0.25, 0)
Color 6 is (0.25, 0.25, 0)
Color 7 is (0.5, 0.25, 0)
Color 8 is (0.75, 0.25, 0)
Color 9 is (1, 0.25, 0)

The result for 125 colors:

5x5x5 colors, in “ascending” order

These might be OK if I was picking out a random color, and that would actually be the easiest way: just shuffle the order. After calculating a set of colors and putting them in arrays, go through each color and swap it with some other random location in the array (here the colors are now in arrays $r[], $g[], $b[]):

for (my $i=($n*$n*$n)-1; $i>=1; $i--){
    my $idx = int(rand($i+1));    # pick random index from remaining colors, [0,$i]
    my @tc = ($r[$i],$g[$i],$b[$i]);    # save color so we can swap to its location
    $r[$i] = $r[$idx]; $g[$i] = $g[$idx]; $b[$i] = $b[$idx];    # swap
    $r[$idx] = $tc[0]; $g[$idx] = $tc[1]; $b[$idx] = $tc[2]; 
}
Shuffled 5x5x5 colors

Some colors don’t look all that different, and the palette tends to be dark. This can be improved with simple gamma correction:

my $gamma = 1/2.2;
$r[$i] = (($i % $n)/($n-1))**$gamma;
$g[$i] = (((int($i/$n)) % $n)/($n-1))**$gamma;
$b[$i] = (((int($i/($n*$n))) % $n)/($n-1))**$gamma;

That’s a little better, I think:

Gamma corrected shuffled 5x5x5 colors

The bigger problem is that these are just random colors over a fixed range, 125 colors in this case. Sometimes I’m displaying 4 objects, sometimes 15, sometimes 33. With this sequence, the first four colors have two oranges that are not considerably different – much worse than my original 8-color palette. This was just (bad) luck, but doing another random roll of the dice isn’t the solution. Any random swizzle will almost always give colors that are close to each other in some sets of the first N colors, missing out on colors that would have been more distinctive.

I’d like them to all look as different as possible, as the number grows, and I’d like to have one table. This goal reminded me of low-discrepancy sequences, commonly used for progressively sampling a pixel for ray tracing, for example. Nice run-through of that topic here by Alan Wolfe.

The idea is simple: start with a color. To add a second color, look at every possible pixel RGB triplet and see how far it is from that first color. Whichever is the farthest is the color you use. For your third color, look at every possible pixel triplet and find which color has the largest “closest distance” to one of the first two. Lather, rinse, repeat, choosing the next color as that which maximizes the distance to its nearest neighbor.

Long and short, it works pretty well! Here are 100 colors, in order:

Maximum closest distance to previous colors, starting at white

I started with white. No surprise, the farthest color from white is black. For the next color, the program happened to pick a blue, (0,128,255), which is (0,186,255) after gamma correction. At first, I thought this third color was a bug. But thinking about it, it makes sense: any midpoint of the six edges of the color cube that don’t touch the white or black corner (these form a hexagon) are equally far from both corner colors (the other RGB cube corners are not).

The other colors distribute themselves nicely enough after that. At a certainly point, some colors start to look a bit the same, but I at least know they’re all different, as best as can be, given the constraints.

In Perl it took an overnight run on a CPU to get this sequence, as I test all 16.7 million (256^3) triplets against all the previous colors found for the largest of the closest approach distances computed. But, who cares. Computers are fast. Once I have the sequence, I’m done. Here’s the sequence in a text file, if of interest.

This is a sequence, meant to optimize all groups of the first N colors for any given N. If you know you’ll always need, say, 27 colors, the colors on a 3x3x3 subdivided color cube (in sRGB) are going to be better, because you’re globally optimizing for exactly 27 colors. Here I did not want to find some optimal set of colors for every number N from 1 to 100, but just wanted a single table I could store and reasonably use for a group of any size.

What’s surprising is that none of the other color cube corner colors – red (255,0,0), cyan (0,255,255), etc. – appear in this sequence. If you start with another color than white, you get a different sequence. Starting with a different RGB cube corner results in some rotation or flip of the color sequence above, e.g., start with black and your next color is white, then the rest are (or can be; depends on tie breaks) the same. Start with red, cyan is next, and then some swapping and flipping of the RGB values in the original sequence. But, start with “middle gray” and the next eight colors are the corners of the color cube, followed by a different sequence. Here are the first twenty:

Start at middle gray

I tried some other ideas, such as limiting the colors searched to those that aren’t too dark. If I want to, for example, display black wireframes around my objects, black and other dark colors are worth avoiding:

Avoid dark colors

This uses a rule of “if red + green + blue < 0.2, don’t use the color.” Gets rid of black, though that dark blue is still pretty low contrast, so maybe I should kick that number up. But dark greens and reds are not so bad, so maybe balance by the perceptual brightness of each channel… Endless fiddling is possible.

I also tried “avoid grays, they’re boring” by having a similar rule that if a test color’s three differences among the three RGB channel values were all less than 0.15, don’t use that color. I started with the green corner of the color cube, to avoid white. Here’s that rule:

Avoid grays (well, some grays)

Still some pretty gray-looking swatches in there – maybe increase the difference value? One downside is that these types of rules remove colors from the selection set, forcing the remaining colors to be closer to one another.

I could have made this process much faster by simply choosing from fewer colors, e.g., looking at only the color levels (0.0, 0.25, 0.5, 0.75, 1.0), which would give me 125 colors to choose from, instead of 16.7 million. But it’s fun to run the program overnight and have that warm and fuzzy feeling that I’m finding the very best colors, each most distant from the previous colors in the sequence.

I should probably consider better perceptual measures of “different colors,” there’s a lot of work in this area. And 100 colors is arbitrary – above this number, I just repeat. I could probably get away with a smaller array (useful if I was including this in a shader program), as the 100-color list has some entries that look pretty similar. Alternately, a longer table is fine for most applications, it does not take a lot of space. Computing the full 16.7 million entry table might take quite a while.

There’s lots of other things to tinker with… But, good enough – done puttering! Here’s my perl program. If you make or know of a better, perceptually based, low-discrepancy color sequence, great, I’d be happy to use it.

Addendum: Can’t get enough? See what other people say and more things I try here, in a follow-up blog post.