Type-safe query builders in Scala

October 18, 2015

Update 11.01.2016: see the next post about type-safe query builders using shapeless.

Recently, I was hacking on phantom, a Scala library for querying the Cassandra database. At the time, it could not build prepared statements, which I needed, so I added this functionality. However, the phantom developers also implemented prepared statements quickly :) Nevertheless, I decided to write this post about the core idea of my implementation.

Caution: Don’t do this at home, use shapeless :)

My plan was to be able to prepare queries like this:

val query = select
  .where(typedParamPlaceholder1)
  .and(typedParamPlaceholder2)
  .and(typedParamPlaceholder3)
  .build()

and then to execute the query: query.execute("string", 1, false). Moreover, I wanted this execution to be type-safe, so that something like query.execute(1, "string", 0) would not compile.

I will abstract away from phantom queries and focus on the general idea. The code is in this repository.

The approach

Let us start with the QueryParameters trait and its descendants.

package me.ivanyu.typed_queries

import scala.language.higherKinds

// Type flags to mark query parameter set that is the last possible.
private[typed_queries] sealed trait FinalFlag
private[typed_queries] sealed trait NotFinal extends FinalFlag
private[typed_queries] sealed trait Final extends FinalFlag

// Empty tuple (to make it iterable as Product).
class Tuple0 extends Product {
  // There are no elements, so every index is out of range.
  override def productElement(n: Int): Any =
    throw new IndexOutOfBoundsException(n.toString)
  override def productArity: Int = 0
  override def canEqual(that: Any): Boolean = that.isInstanceOf[Tuple0]

  override def toString: String = "()"
}

sealed trait QueryParameters {
  // Final flag.
  type FF <: FinalFlag

  // Parameter type that will go to the final execute method.
  type MethodParameters <: Product

  // The next QueryParameters type, used during parameter addition.
  type Next[T] <: QueryParameters
}

// Parameter sets for 0 to 22 parameters

class QueryParameters0 extends QueryParameters {
  override type FF = NotFinal
  override type MethodParameters = Tuple0
  override type Next[T] = QueryParameters1[T]
}

class QueryParameters1[T1] extends QueryParameters {
  override type FF = NotFinal
  override type MethodParameters = Tuple1[T1]
  override type Next[T] = QueryParameters2[T1, T]
}

class QueryParameters2[T1, T2] extends QueryParameters {
  override type FF = NotFinal
  override type MethodParameters = (T1, T2)
  override type Next[T] = QueryParameters3[T1, T2, T]
}

// ... 3 - 21

class QueryParameters22[T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22] extends QueryParameters {
  override type FF = NotFinal
  override type MethodParameters = (T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22)
  override type Next[T] = QueryParametersFinal
}

// The marker of "cannot add more parameters"
class QueryParametersFinal extends QueryParameters {
  override type FF = Final
  override type MethodParameters = Nothing
  override type Next[T] = Nothing
}

The main idea here is a type chain: each descendant of QueryParameters has a typed reference to its successor in terms of parameter addition. For example, QueryParameters1[String]#Next[Int] is QueryParameters2[String, Int].
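The chain can be checked directly at compile time. Here is a minimal sketch, assuming Scala 2 and a chain shortened to three links. Params0, Params1, Params2 and sameType are hypothetical names used only for illustration; they are not part of the repository.

```scala
import scala.language.higherKinds

object TypeChainSketch {
  // Shortened stand-ins for QueryParameters0..2 (hypothetical names).
  sealed trait Params { type Next[T] <: Params }
  class Params0 extends Params { type Next[T] = Params1[T] }
  class Params1[A] extends Params { type Next[T] = Params2[A, T] }
  class Params2[A, B] extends Params { type Next[T] = Nothing }

  // Compiles only when the compiler can prove that A and B are the same type.
  def sameType[A, B](implicit ev: A =:= B): Boolean = true

  // One step along the chain.
  val oneStep: Boolean =
    sameType[Params1[String]#Next[Int], Params2[String, Int]]

  // Two steps along the chain, starting from the empty parameter set.
  val twoSteps: Boolean =
    sameType[Params0#Next[String]#Next[Int], Params2[String, Int]]

  // sameType[Params0#Next[String], Params2[String, Int]] // does not compile
}
```

Each val only exists if the corresponding type equality holds, so a successful compilation is the actual check.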

Next, which takes a type parameter, is called a type of a higher kind (article). To enable such types, we have to import the Scala language feature scala.language.higherKinds (the compiler uses such imports as flags to turn some language features on).

The last one, QueryParametersFinal, is a special case: it is an indicator that no more parameters can be added. The final flag FF = Final is used for this (see later).

I have implemented an empty tuple type Tuple0 to represent an empty parameter list that could be iterated (via Product trait).

Let us look at QueryBuilder:

package me.ivanyu.typed_queries

import scala.reflect.ClassTag

class QueryBuilder[P <: QueryParameters] private(parameters: Seq[TypedParameter]) {
  final def addParameter[T : ClassTag](name: String)
                                      (implicit ev: P#Next[T]#FF =:= NotFinal): QueryBuilder[P#Next[T]] = {
    val classTag = implicitly[ClassTag[T]]
    val param = TypedParameter(name, classTag)
    new QueryBuilder[P#Next[T]](parameters :+ param)
  }

  final def build(): Query[P] = {
    new Query[P](parameters)
  }
}

object QueryBuilder {
  final def begin(): QueryBuilder[QueryParameters0] = {
    new QueryBuilder[QueryParameters0](Nil)
  }
}

The build starts with QueryBuilder.begin(), which returns an empty QueryBuilder (without parameters). Then we add a new parameter with the addParameter method. It has one explicit parameter, name (for the parameter name; the pun is unintended), and two implicit ones.

The first is ev of type =:=[P#Next[T]#FF, NotFinal] (written in infix form). This cumbersome construct says ‘compiler, give me evidence that the type on the left is exactly the same as the type on the right’ and ensures that the next QueryParameters will not be QueryParametersFinal.

For example: let P be QueryParameters0, so P#Next[T] is QueryParameters1[T], and P#Next[T]#FF is QueryParameters1[T]#FF, which is NotFinal - we have the evidence we need, and the code compiles.

Another example: let P be QueryParameters22[T1, ..., T22], so P#Next[T] is QueryParametersFinal, and P#Next[T]#FF is QueryParametersFinal#FF, which is Final. We do not have the evidence, and the code does not compile.

This prevents us from creating a malformed set of types.
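The same guard can be seen at a smaller scale. This is a minimal sketch, assuming Scala 2 (it relies on type projections such as P#Next[T]#FF) and a limit of one parameter instead of 22. Params0, Params1, ParamsFinal and Builder are hypothetical, shortened stand-ins for the classes above.

```scala
import scala.language.higherKinds

object FinalGuardSketch {
  sealed trait FinalFlag
  sealed trait NotFinal extends FinalFlag
  sealed trait Final extends FinalFlag

  sealed trait Params {
    type FF <: FinalFlag
    type Next[T] <: Params
  }
  // Hypothetical chain that allows at most one parameter.
  class Params0 extends Params { type FF = NotFinal; type Next[T] = Params1[T] }
  class Params1[A] extends Params { type FF = NotFinal; type Next[T] = ParamsFinal }
  class ParamsFinal extends Params { type FF = Final; type Next[T] = Nothing }

  class Builder[P <: Params](val names: Vector[String]) {
    // The evidence exists only while the next link in the chain is NotFinal.
    def add[T](name: String)(implicit ev: P#Next[T]#FF =:= NotFinal): Builder[P#Next[T]] =
      new Builder[P#Next[T]](names :+ name)
  }

  // Params0#Next[String]#FF is NotFinal, so the first addition compiles.
  val one: Builder[Params1[String]] =
    new Builder[Params0](Vector.empty).add[String]("a")

  // Params1[String]#Next[Int]#FF is Final, so no evidence can be found:
  // one.add[Int]("b") // does not compile
}
```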

The second implicit parameter has no name and is expressed as a context bound: [T : ClassTag]. We get its instance with val classTag = implicitly[ClassTag[T]] and store it along with the name. It is here for illustration and will probably not be needed in a real situation.
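TypedParameter itself is not shown in this post. The following is a minimal sketch of what it could look like, assuming it is just a name paired with a ClassTag; the case class shape and the describe helper are assumptions for illustration, not the repository code.

```scala
import scala.reflect.ClassTag

object TypedParameterSketch {
  // Assumed shape: the post calls TypedParameter(name, classTag) but does not define it.
  final case class TypedParameter(name: String, classTag: ClassTag[_])

  // The context bound [T: ClassTag] adds an implicit ClassTag[T] parameter,
  // which implicitly[...] then retrieves.
  def describe[T: ClassTag](name: String): TypedParameter =
    TypedParameter(name, implicitly[ClassTag[T]])

  val param: TypedParameter = describe[String]("param1")

  // The ClassTag keeps the erased runtime class available.
  val runtimeClassName: String = param.classTag.runtimeClass.getName
}
```

With such a tag stored next to the name, an implementation could, for example, check or log the runtime class of each parameter.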

When we finish with the parameters, we call build() that returns Query.

Query has a little magic inside:

package me.ivanyu.typed_queries

class Query[P <: QueryParameters] private[typed_queries](parameters: Seq[TypedParameter]) {
  def execute(parameterValues: P#MethodParameters): Unit = {
    val zipped = parameters zip parameterValues.productIterator.toList
    println(s"Much magic: $zipped")
    // Your code
  }
}

The main idea is that Scala automatically converts "str", 1, false into Tuple3("str", 1, false) when the only parameter of a function is Tuple3[String, Int, Boolean]. This allows us to write execute calls in a natural way, without an explicit tuple.
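The tupling of arguments can be tried in isolation. Here is a minimal sketch, assuming Scala 2 with default compiler options (some lint settings warn about this adaptation); describe is a hypothetical method that mirrors the shape of Query.execute.

```scala
object AutoTuplingSketch {
  // A method whose only parameter is a tuple, like Query.execute.
  def describe(values: (String, Int, Boolean)): List[Any] =
    values.productIterator.toList

  // Three arguments are adapted into a single Tuple3 by the compiler.
  val adapted: List[Any] = describe("str", 1, false)

  // The explicit form passes the same tuple.
  val explicit: List[Any] = describe(("str", 1, false))
}
```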

There are two exceptions: no parameters and one parameter. Implicit conversions are created for these cases, so that all of the following calls look the same:

beginQuery()
  .addParameter[String]("param1")
  .build()
  .execute("'some string'")

beginQuery()
  .addParameter[String]("param1")
  .addParameter[Int]("param2")
  .build()
  .execute("'some string'", 1)

beginQuery()
  .addParameter[String]("param1")
  .addParameter[Int]("param2")
  .addParameter[Boolean]("param3")
  .build()
  .execute("'some string'", 1, false)
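The conversions for the zero- and one-parameter cases are not shown here. A minimal sketch of how they could look follows; the names unitToTuple0, valueToTuple1, execute0 and execute1 are hypothetical, and the Tuple0 below is a stand-in for the one defined earlier.

```scala
import scala.language.implicitConversions

object TupleConversionsSketch {
  // Minimal empty tuple, standing in for the post's Tuple0.
  final class Tuple0 extends Product {
    def productElement(n: Int): Any = throw new IndexOutOfBoundsException(n.toString)
    def productArity: Int = 0
    def canEqual(that: Any): Boolean = that.isInstanceOf[Tuple0]
  }

  // Hypothetical conversions: () becomes Tuple0, a single value becomes Tuple1.
  implicit def unitToTuple0(unit: Unit): Tuple0 = new Tuple0
  implicit def valueToTuple1[T](value: T): Tuple1[T] = Tuple1(value)

  // Stand-ins for Query.execute with zero and one parameter.
  def execute0(values: Tuple0): Int = values.productArity
  def execute1(values: Tuple1[String]): List[Any] = values.productIterator.toList

  val none: Int = execute0(())                   // Unit is converted to Tuple0
  val single: List[Any] = execute1("some string") // String is converted to Tuple1[String]
}
```

A conversion from any T to Tuple1[T] is very broad, so in real code it is probably worth keeping it in a narrow scope.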

Drawbacks

There are a couple of nuances. Firstly, the implicit conversion from no parameters to (): Unit still exists but is going to be deprecated. Thus, in the future we will have to write the slightly ugly execute(()):

beginQuery().build()
  .execute() //.execute(())

Secondly, because one implicit conversion (to a tuple) has already been applied, Scala will not apply another one on top of it. Thus, implicit conversions for individual parameters will not work, except for Int → Long and similar widenings (numeric widenings and some other conversions are applied by special rules):

// Won't work because one implicit conversion (to a tuple) has already been applied :(
case class A()
case class B()

import scala.language.implicitConversions
implicit def A2B(a: A): B = B()

beginQuery()
  .addParameter[B]("long param")
  .build()
//  .execute(A())

You can deal with this, though, by using explicit conversions: execute(A2B(...)). Another way is to pass an explicit tuple in such problematic places: execute(Tuple3(..., ..., ...)). If you do so, the compiler has room for the implicit conversions that are needed.

Alternatives

If you do not like the approach described, I can offer some alternatives:

As mentioned earlier, the shapeless library is an option. It provides facilities for conveniently building heterogeneous lists of parameters. I am sure this should be considered the default way.
Use macros. With them you can generate final query code that is indistinguishable from hand-written code, in which no implicit conversions to tuples are needed and an empty execute is natural. However, macros have their own problems (difficult to debug, still poor IDE support, etc.).
Avoid all this and write a type-safe wrapper by hand. Seriously, sometimes it is better.
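For comparison, the hand-written option can be as small as this. It is a minimal sketch with a hypothetical FindUsersQuery and made-up parameter names, not code from phantom or from the repository.

```scala
object HandWrittenWrapperSketch {
  // Hypothetical query with three fixed parameters; all names are illustrative.
  final class FindUsersQuery {
    private val parameterNames = List("name", "age", "active")

    // An ordinary method signature gives the type safety directly,
    // with no tuples or implicit conversions involved.
    def execute(name: String, age: Int, active: Boolean): List[(String, Any)] =
      parameterNames.zip(List[Any](name, age, active))
  }

  val bound: List[(String, Any)] =
    new FindUsersQuery().execute("some string", 1, false)
}
```

The price is one such class per query, which is why this only pays off when there are few queries.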

Links

See also the related article, The Magnet Pattern.