Type-safe query builders in Scala revisited: shapeless

Not so long ago, I wrote a post about creating type-safe query builders in Scala from scratch. In it, I also suggested using the shapeless library to do what was described. Now I have decided to show how it could be done. The code is in this repository.

Problem reminder

Without going into much detail, the problem was to provide a type-safe way to build queries (to an abstract database, for instance) with parameters of different types. Something like

val query = beginQuery()

// Compiles:
query.execute("some string", 1, false)
// Won't compile:
query.execute(42, true, "another string")

Retrospective time-based UUID generation (with Spark)

Update 26.03.2023: 8 years ago 50 Gb sounded more serious than it does now, but even then we could and should have done this easily with one beefy machine, without Spark or any other then-fancy tool.

I faced a problem: given about 50 Gb of data in one database, export the records to another database with slight modifications, including UUID generation based on the timestamps of the records (collisions were intolerable). We had a Spark cluster, and with it the problem did not seem even a little tricky: create an RDD, map it and send the result to the target DB.

(In this post I talk about Spark. I have not written about it on this blog so far, but I will. Broadly, it is a framework for large-scale data processing, like Hadoop. It operates on abstract distributed data collections called Resilient Distributed Datasets (RDDs). Their interface is quite similar to that of functional collections (map, flatMap, etc.), and it also has some operations specific to Spark's distributed, large-scale nature.)

However, I was too optimistic: there were difficulties.
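For reference, here is a minimal sketch of how a version 1 (time-based) UUID can be assembled from a record timestamp using only `java.util.UUID`. It is an illustration under assumptions, not the code used in the export job: the object name `TimeUuidSketch`, the function `timeBased` and its `clockSeq` and `node` parameters are hypothetical, and it is up to the caller to vary `clockSeq` or `node` for records that share a timestamp.

```scala
import java.util.UUID

object TimeUuidSketch {
  // Number of 100 ns intervals between the UUID epoch (1582-10-15)
  // and the Unix epoch (1970-01-01).
  private val EpochOffset = 0x01B21DD213814000L

  // Hypothetical helper: builds a version 1 UUID from a Unix timestamp in
  // milliseconds, a 14-bit clock sequence and a 48-bit node identifier.
  def timeBased(unixMillis: Long, clockSeq: Int, node: Long): UUID = {
    val ts      = unixMillis * 10000L + EpochOffset // 100 ns units since 1582
    val timeLow = ts & 0xFFFFFFFFL
    val timeMid = (ts >>> 32) & 0xFFFFL
    val timeHi  = (ts >>> 48) & 0x0FFFL
    // Most significant bits: time_low | time_mid | version (1) | time_hi
    val msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi
    // Least significant bits: variant (binary 10) | clock sequence | node
    val lsb = 0x8000000000000000L |
      ((clockSeq & 0x3FFFL) << 48) |
      (node & 0xFFFFFFFFFFFFL)
    new UUID(msb, lsb)
  }
}
```

Two records with the same millisecond timestamp get the same UUID unless `clockSeq` or `node` differs, which is exactly where collisions can creep in.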

Type-safe query builders in Scala

Update 11.01.2016: the next post about type-safe query builders using shapeless.

Recently, I was hacking on phantom, a Scala library for querying the Cassandra database. At that time, it was not able to build prepared statements, which I needed, so I added this functionality. However, the phantom developers also implemented prepared statements quickly :) Nevertheless, I decided to write this post about the core idea of my implementation.

Caution: Don’t do this at home, use shapeless :)

My plan was to be able to prepare queries like this:

val query = select

and after that to execute this query: query.execute("string", 1, false). Moreover, I wanted this execution to be type-safe, so that it would not be possible to execute something like query.execute(1, "string", 0).

I will abstract from phantom queries and will focus on the general idea. The code is in this repository.
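To give a feel for the general idea, here is a minimal sketch in plain Scala. It is not the code from the repository: the names `QuerySketch`, `prepare`, `param` and `Query0` to `Query3` are hypothetical, and `execute` only renders a string instead of talking to a database. Each builder step returns a new type that remembers the parameter types collected so far.

```scala
object QuerySketch {
  // Stand-in for real execution: just shows the query text and its arguments.
  private def render(text: String, args: Any*): String =
    text + " <- " + args.mkString(", ")

  // One class per arity; the type parameters record the expected argument types.
  final class Query0(text: String) {
    def param[A]: Query1[A] = new Query1[A](text + " ?")
  }
  final class Query1[A](text: String) {
    def param[B]: Query2[A, B] = new Query2[A, B](text + " ?")
    def execute(a: A): String = render(text, a)
  }
  final class Query2[A, B](text: String) {
    def param[C]: Query3[A, B, C] = new Query3[A, B, C](text + " ?")
    def execute(a: A, b: B): String = render(text, a, b)
  }
  final class Query3[A, B, C](text: String) {
    def execute(a: A, b: B, c: C): String = render(text, a, b, c)
  }

  def prepare(text: String): Query0 = new Query0(text)
}
```

With `val query = QuerySketch.prepare("q").param[String].param[Int].param[Boolean]`, the call `query.execute("string", 1, false)` type-checks, while `query.execute(1, "string", 0)` is rejected by the compiler. The obvious drawback is one class per arity, which is the kind of boilerplate shapeless is meant to remove.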

Introduction to Akka

Update 26.03.2023: In September 2022, Lightbend changed the license of Akka from Apache 2.0 to the source-available Business Source License (BSL) 1.1. Akka was forked as Pekko.

There are several models of concurrent computing, and the actor model is one of them. I am going to give a glimpse of this model and one of its implementations, the Akka toolkit.

The actor model

In the actor model, actors are objects that have state and behavior and communicate with each other by message passing. This sounds like good old objects from OOP, but the crucial difference is that message passing is one-way and asynchronous: an actor sends a message to another actor and continues its work. In fact, actors are totally reactive: all their activity happens in reaction to incoming messages, which are processed one by one. However, this is not a limitation, because messages can be of any sort, including scheduled messages (by timer) and network messages.
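The idea can be sketched without Akka at all. The toy below is not Akka's API: `ActorSketch`, `CounterActor` and the message types are hypothetical names, and a single-threaded executor plays the role of the mailbox, so that messages are processed one by one while senders never wait.

```scala
import java.util.concurrent.{CompletableFuture, Executors}

object ActorSketch {
  sealed trait Message
  case object Increment extends Message
  // A request carrying the place where the actor should put its answer.
  final case class GetCount(reply: CompletableFuture[Int]) extends Message

  final class CounterActor {
    // The single worker thread processes queued messages strictly in order.
    private val mailbox = Executors.newSingleThreadExecutor()
    // State is touched only from the mailbox thread, so no locks are needed.
    private var count = 0

    // One-way, asynchronous send: enqueue the message and return immediately.
    def !(message: Message): Unit =
      mailbox.execute(new Runnable { def run(): Unit = receive(message) })

    // Behavior: how the actor reacts to each incoming message.
    private def receive(message: Message): Unit = message match {
      case Increment       => count += 1
      case GetCount(reply) => reply.complete(count)
    }

    def stop(): Unit = mailbox.shutdown()
  }
}
```

A caller sends `Increment` a few times with `actor ! ActorSketch.Increment` and carries on; the only way to observe the state is to send another message, here `GetCount`, and wait for the reply.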

Value Classes in Scala

Type systems and compile-time type checking are great things that can save you a couple of hours of debugging. They also have documentation value: they can make the code more understandable. In my opinion, it’s wise to use them, and unfortunately, sometimes we don’t do this enough. Consider Integer/Int/int. A counter could be an Integer, an entity identifier could be an Integer, an integer number in an arithmetic expression could be an Integer. In most cases all these Integers have nothing to do with each other: in your domain it is a bad idea to compare them, do arithmetic operations on them, pass one instead of another as a function parameter, etc.

In one of my projects (in C#) there are a dozen domain entities with integer identifiers that are passed all over the code. After a couple of bugs caused by mixed-up identifiers of different entities, I solved this problem by replacing plain integers with structs (in C#, structs are value types used for representing lightweight objects such as Point or Color) like Id<EntityName>T (the T is there to distinguish the type from property names). The key idea was to introduce a new level of types so that the type checker, rather than me, would scrutinize the code. It worked: I got rid of some old bugs in rarely used parts of the code and hope new bugs of this kind won’t bother me in the future. (Aside: I hope this post will persuade you not only to consider using value classes but also to think about the role of types in code quality.)
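In Scala, the same trick can be expressed with value classes, that is, classes extending `AnyVal`. A minimal sketch, with hypothetical names (`Ids`, `UserId`, `OrderId`, `describeUser`) chosen only for illustration:

```scala
object Ids {
  // Each wrapper is a distinct type at compile time; in many cases the
  // compiler represents it as a plain Int at runtime, without allocation.
  final case class UserId(value: Int) extends AnyVal
  final case class OrderId(value: Int) extends AnyVal

  // Accepts only user identifiers, not arbitrary integers.
  def describeUser(id: UserId): String = "user #" + id.value
}
```

Here `Ids.describeUser(Ids.UserId(42))` compiles, whereas `Ids.describeUser(Ids.OrderId(42))` and `Ids.describeUser(42)` are rejected as type mismatches, so mixed-up identifiers are caught before the code runs.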

Why I like Scala

I am familiar (more or less) with a number of programming languages and have both emotional and rational opinions about them. Scala is certainly in the group of languages I like. I have decided to summarize my views on Scala's attractive parts in a blog post, and here it is. I also have some ideas for posts about Scala and its technology stack, so an introduction is probably in order.

Scala is a general-purpose programming language created by Martin Odersky more than ten years ago. It compiles to JVM bytecode and is interoperable with Java in both directions (including mixed compilation), which lets Scala use the enormous amount of code created for the JVM. An interesting property, and also one of the strongest selling points of the language, is its fusion of the object-oriented and functional programming paradigms.