Time-based (version 1) UUIDs ordering in PostgreSQL

The problem

Not so long ago I wrote about a little strange problem with time-based UUIDs I faced (Retrospective time-based UUID generation (with Spark)). This time I needed to do something more usual. As you surely know, UUID standard has several versions. Version 4 is purely random numbers, version 1 relies on the identity of a machine and the timestamp with the counter, etc.

Let’s consider version 1 UUIDs. They include timestamps with counters, so they naturally can be ordered by it. In other words, having two time-based UUIDs, I wanted to say if the first is lower, greater or equal to the second. No big deal, say, in Java: u1.timestamp().compare(u2.timestamp()). But I wanted to do this in PostgreSQL, inside SQL query. Postgresql does have uuid data type, but it provides no functions for comparing them by version 1 timestamps. That is why I decided to write such a function myself.

Version 1 UUID structure

You can find the full UUID specification in RFC 4122. Let’s consider here only the part of version 1 which we are interested in. UUIDs have length of 128 bits, i.e. 16 bytes, which have different meaning in different versions of the standard. Version 1 layout is shown on the picture:

UUID version 1 structure

UUID timestamp is a count of 100-nanoseconds intervals since 00:00:00.000 15 Oct 1582. This looks strange, but it is for the sake of rational usage of bits and to give room for the counter (which is used to prevent collisions). The timestamp is 60-bit long, which requires 8 full bytes to store – the first half of the UUID. Actually, we are only interested in this first half. The timestamp is cut into three pieces: hi - 12 bits | mid - 16 bits | low - 32 bits. First 4 bytes of an UUID is low, next two – mid, and the next two is hi. But the higher 4 bits of hi‘s byte are used to store the version number. The counter is a part of low.

UUID timestamps ordering

As we know the structure of timestamps, it is easy to order them. The algorithm is the following:

Now, let’s write this in a form of PL/pgSQL function:

I know no standard way in PostgreSQL for transforming uuid type into a byte array or getting a single byte from it. That is why I used the conversion into a string with the following decoding as a hex number.

The function is quite transparent, but I tested it by comparing its result to Java’s timestamp comparison. I tested it on equal UUIDs, on UUIDs with the same time but different counters, on UUIDs with different times, sequentially and randomly, etc. You can find the tests (in Scala) in the repository. As for the performance, I have not measured it.

So, if you need to order time-based (version 1) UUIDs in PostgreSQL in this way, that is one of possible solutions. Maybe one day we will have something out-of-the-box for doing this.

All relevant code can be found in the repository.