Type-safe query builders in Scala

Update 11.01.2016: there is a follow-up post about type-safe query builders using shapeless.

Recently, I was hacking on phantom, a Scala library for querying the Cassandra database. At the time, it could not build prepared statements, which I needed, so I added this functionality. However, the phantom developers also implemented prepared statements quickly :) Nevertheless, I decided to write this post about the core idea of my implementation.

Caution: Don’t do this at home, use shapeless :)

My plan was to be able to prepare queries like this:

and after that to execute this query: query.execute("string", 1, false). Moreover, I wanted this execution to be type-safe, so that something like query.execute(1, "string", 0) would not compile.

I will abstract away from phantom queries and focus on the general idea. The code is in this repository.

The approach

Let us start with the QueryParameters trait and its descendants.

The main idea here is a type chain: each descendant of QueryParameters has a typed reference to its successor in terms of parameter addition: QueryParameters1[String]#Next[Int] is basically QueryParameters2[String, Int].
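To make the chain concrete, here is a minimal sketch of how such a hierarchy could look. It is cut down to two parameters instead of 22, and the names FinalFlag and TypeChainSketch are my assumptions for illustration, not necessarily what the repository uses.

```scala
import scala.language.higherKinds

object TypeChainSketch {
  // Type-level flags: may more parameters be added after this link?
  sealed trait FinalFlag
  sealed trait NotFinal extends FinalFlag
  sealed trait Final extends FinalFlag

  sealed trait QueryParameters {
    type FF <: FinalFlag            // the final flag of this link
    type Next[T] <: QueryParameters // the link we get by adding a parameter of type T
  }

  final class QueryParameters0 extends QueryParameters {
    type FF = NotFinal
    type Next[T] = QueryParameters1[T]
  }

  final class QueryParameters1[T1] extends QueryParameters {
    type FF = NotFinal
    type Next[T] = QueryParameters2[T1, T]
  }

  // In this shortened sketch the chain ends after two parameters.
  final class QueryParameters2[T1, T2] extends QueryParameters {
    type FF = NotFinal
    type Next[T] = QueryParametersFinal
  }

  final class QueryParametersFinal extends QueryParameters {
    type FF = Final
    type Next[T] = QueryParametersFinal
  }

  // This line compiles only if both sides are the same type.
  val chainHolds =
    implicitly[QueryParameters1[String]#Next[Int] =:= QueryParameters2[String, Int]]
}
```

The last line is a compile-time check: if the chain were wired incorrectly, the compiler would find no =:= instance and reject the file.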

This Next, a type member that takes its own type parameter, is called a type of a higher kind (article). To enable such types, we have to import the Scala language feature scala.language.higherKinds (the compiler uses such imports as flags to turn some language features on).

The last one, QueryParametersFinal, is a special case: it indicates that no more parameters can be added. The final flag FF = Final is used for this (see later).

I have implemented an empty tuple type Tuple0 to represent an empty parameter list that can still be iterated (via the Product trait).

Let us look at QueryBuilder:

Building starts with QueryBuilder.begin(), which returns an empty QueryBuilder (without parameters). Then we add a new parameter with the addParameter method. It has one explicit parameter, name (for the parameter name; the pun is unintended), and two implicit ones.

The first is ev of type =:=[P#Next[T]#FF, NotFinal] (expressed in infix form). This cumbersome construct says ‘compiler, give me evidence that the type on the left is exactly the same as the type on the right’ and ensures that the next QueryParameters will not be QueryParametersFinal.

For example: let P be QueryParameters0, so P#Next[T] is QueryParameters1[T], and P#Next[T]#FF is QueryParameters1[T]#FF, which is NotFinal – we have the evidence we need, and the code compiles.

Another example: let P be QueryParameters22[T1, ..., T22], so P#Next[T] is QueryParametersFinal, and P#Next[T]#FF is QueryParametersFinal#FF, which is Final – we do not have the evidence, and the code does not compile.

This prevents us from building a malformed chain of types.
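The evidence check can be sketched as follows. This is a simplified stand-in for the real builder, assuming Scala 2 (it relies on type projections from the abstract type P) and a chain shortened to two parameters; the Parameter case class and the BuilderSketch object are hypothetical names introduced for illustration.

```scala
import scala.language.higherKinds
import scala.reflect.ClassTag

object BuilderSketch {
  sealed trait FinalFlag
  sealed trait NotFinal extends FinalFlag
  sealed trait Final extends FinalFlag

  sealed trait QueryParameters {
    type FF <: FinalFlag
    type Next[T] <: QueryParameters
  }
  final class QueryParameters0 extends QueryParameters {
    type FF = NotFinal
    type Next[T] = QueryParameters1[T]
  }
  final class QueryParameters1[T1] extends QueryParameters {
    type FF = NotFinal
    type Next[T] = QueryParameters2[T1, T]
  }
  final class QueryParameters2[T1, T2] extends QueryParameters {
    type FF = NotFinal
    type Next[T] = QueryParametersFinal
  }
  final class QueryParametersFinal extends QueryParameters {
    type FF = Final
    type Next[T] = QueryParametersFinal
  }

  // Hypothetical holder for what the builder remembers about each parameter.
  final case class Parameter(name: String, tag: ClassTag[_])

  final class QueryBuilder[P <: QueryParameters] private (val parameters: Vector[Parameter]) {
    // ev exists only while the next link of the chain is not final.
    def addParameter[T: ClassTag](name: String)(
        implicit ev: P#Next[T]#FF =:= NotFinal): QueryBuilder[P#Next[T]] =
      new QueryBuilder[P#Next[T]](parameters :+ Parameter(name, implicitly[ClassTag[T]]))
  }

  object QueryBuilder {
    def begin(): QueryBuilder[QueryParameters0] =
      new QueryBuilder[QueryParameters0](Vector.empty)
  }

  val twoParameters =
    QueryBuilder.begin().addParameter[String]("name").addParameter[Int]("age")

  // A third call is rejected at compile time in this shortened chain,
  // because QueryParametersFinal#FF is Final, not NotFinal:
  // twoParameters.addParameter[Boolean]("flag")
}
```

Uncommenting the last line should produce a "Cannot prove that ..." compilation error, which is exactly the protection described above.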

The second implicit parameter has no name and is expressed as a context bound: [T : ClassTag]. We get its instance as val classTag = implicitly[ClassTag[T]] and store it along with the name. It has been added for the purpose of illustration and will probably not be needed in a real situation.

When we are done with the parameters, we call build(), which returns a Query.

Query has a little magic inside:

The main idea is that Scala automatically converts "str", 1, false into Tuple3("str", 1, false) when the only parameter of a function is Tuple3[String, Int, Boolean] (this is known as argument adaptation, or auto-tupling). This allows us to write execute calls in a natural way, without an explicit tuple.
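The adaptation itself can be seen in isolation with a tiny example. The execute method below is a hypothetical stand-in that only formats its arguments; it is not the real Query.execute.

```scala
object AutoTuplingSketch {
  // A method whose only parameter is a tuple.
  def execute(params: (String, Int, Boolean)): String =
    s"${params._1}, ${params._2}, ${params._3}"

  // Explicit tuple: nothing special happens here.
  val explicitTuple = execute(("str", 1, false))

  // Three separate arguments: the compiler packs them into a Tuple3 for us.
  val adapted = execute("str", 1, false)
}
```

Both calls produce the same result. Note that the compiler may warn about the adapted argument list when linting options are enabled.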

There are two exceptions: no parameters and one parameter. We create implicit conversions for these cases:
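A minimal sketch of such conversions, assuming a simplified Query that is parameterized directly by the tuple type, could look like this. The names unitToTuple0, valueToTuple1 and SmallAritySketch are mine, and this Tuple0 is only one possible way to define an empty Product.

```scala
import scala.language.implicitConversions

object SmallAritySketch {
  // An empty "tuple": a Product with arity 0, so it can still be iterated.
  sealed trait Tuple0 extends Product
  case object Tuple0 extends Tuple0

  // No parameters: the unit value () becomes Tuple0.
  implicit def unitToTuple0(u: Unit): Tuple0 = Tuple0

  // One parameter: a bare value becomes Tuple1.
  implicit def valueToTuple1[T](value: T): Tuple1[T] = Tuple1(value)

  // Simplified stand-in for Query: it just returns the parameter values.
  final class Query[T <: Product] {
    def execute(params: T): List[Any] = params.productIterator.toList
  }

  val noParameters  = (new Query[Tuple0]).execute(())
  val oneParameter  = (new Query[Tuple1[String]]).execute("str")
  val twoParameters = (new Query[(String, Int)]).execute("str", 1)
}
```

A conversion as broad as valueToTuple1 applies to any value wherever it is in scope, so in real code it is worth keeping it confined to a narrow scope.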

That is all: now we can build queries and execute them in a type-safe way:

Drawbacks

There are a couple of nuances. Firstly, the implicit conversion from no parameters to (): Unit still exists but is going to be deprecated. Thus, in the future we will have to write a slightly ugly execute(()):

Secondly, because one implicit conversion (to a tuple) has already been applied, Scala will not apply another one after it. Thus, implicit conversions for individual parameters will not work, except for Int → Long and similar cases (numeric widenings and some other conversions are applied by special rules):

You can deal with this, though, by using explicit conversions: execute(A2B(...)). Another way is to add an explicit tuple in such problematic places: execute(Tuple3(..., ..., ...)). If you do so, the compiler will have room for the implicit conversions that are needed.

Alternatives

If you do not like the approach described, I can offer you some alternatives:

  1. As was said earlier, the shapeless library is an option. It provides facilities for conveniently building heterogeneous lists of parameters. I am sure this should be considered the default way.
  2. Use macros. With them you can generate final query code that is indistinguishable from hand-written code, in which no implicit conversions to tuples are needed and an empty execute is natural. However, macros have their own problems (difficult to debug, still poor IDE support, etc.)
  3. Avoid all this and write a type-safe wrapper by hand. Seriously, sometimes it is better.

Links

See also the relevant article The Magnet Pattern.