Posts

Type systems and compile-time type checking are great things that can save you a couple of hours of debugging and also have documenting potential, could make the code more understandable. In my opinion, it’s wise to use them, and unfortunately, sometimes we don’t do this enough. Consider Integer/Int/int. A counter could be Integer, an entity identifier could be Integer, an integer number in arithmetic expression could be Integer. In most cases all this Integers have nothing to do with each other: in your domain it is a bad idea to compare them, do arithmetic operations on them, pass one instead of another as a function parameter etc.

In one of my projects (in C#) there are a dozen of domain entities that have integer identifiers that are passed all over the code. After a couple of bugs connected with mixed up identifiers of different entities I’ve solved this problem by replacing plain integer numbers with structs (in C#, structs are value types used for representing lightweight objects such as Point or Color) like Id<EntityName>T (T is to distinct type from property names). The key idea was to introduce a new level of types to let the type checker intently look at the code instead of me. It’s worked: I’ve gotten rid of some old bugs in rarely used parts of code and hope new bugs of such a type won’t bother me in the future. (Aside: I hope, this post will persuade you not only to consider using value classes but also to think about the role of types in code quality).

I am familiar (more or less) with a number of programming languages and have both emotional and rational thoughts of them. Scala is for certain in the group of languages I like. I have decided to summarize my judgments of Scala attractive parts in a blog post and here it is. Also, I have got some ideas of posts about Scala and its technology stack and an introduction is possibly needed.

Scala logo

Scala is a general purpose programming language created by Martin Odersky more than ten years ago. It compiles into JVM byte code and interoperable (both direction) with Java (including mixed compilation), which gives Scala an ability to use all this enormous amount of code created for JVM. The interesting property and also one of the strongest selling points of the language is fusion of object-oriented and functional programming paradigms.

Recently, I’ve faced a task of developing a tool which allows the application to have base of (not very complex) logical rules. There were three demands:

The rules were to be written by non-programmers, so using of the languages which the program is written in (Java/Scala), wasn’t very good.
The rule base should be changeable without redeployment of the application, ideally, should be stored in a database.
We should have control on compilation and error emission.

The first and the second demand could be met by developing some kind of Scala- or Groovy-based DSL, extremely simple. But I’ve come with several arguments against:

The third requirement might be hard to meet.
The rules are quite simple, so embedding an interpreter of a general-purpose language might be overkill.
The language which rules consumer is written in might be changed (from Java/Scala to Python e.g.)

So, I’ve decided to write a very simple rule parser/compiler. After I’d created a prototype I decided to write this post, hope it’ll be useful. I say in advance that you can see the code adapted for this post in this repository.

How much CPU cores does your computer have? 2-8, I think. It’s very time to use them all, isn’t it? But there are plenty of Unix utils such as grep, find, wc etc., which have no idea about parallel data processing. They can’t split their input into 8 pieces and spawn the corresponding number of threads or processes to process it using all the power of your modern CPU.

Definitely, this problem is quite interesting and practical to rest unsolved. According to the Unix philosophy, 1) it is good for programs to do one thing well; 2) it is a good idea to combine simple programs to do complex things. grep do pattern matching well. How about parallelization?

There is an utility know as GNU Parallel which main purpose is to execute arbitrary jobs in parallel on one or even multiple machines. The program is quite complex and multifunctional, look at man and tutorial. Here, I want just to give a little flavor of it.

GNU Parallel

There is a popular task at software developer job interviews: having a singly linked list, write a piece of code which tells if the list has a cycle.

In a linked list, each element is a structure which contains the value of an element and the link to the next element. The next-link of the last element has a special value which marks the end (usually, null). If a list has a cycle, the last element points to some element inside the list.

Linked list types

Introduction

There are some algorithms of exact substring searching (e.g. Knuth-Morris-Pratt, Boyer-Moore etc.) I want to explain one of them which is called Z algorithm in some sources.

Z-boxes and Z-values

Let’s consider the concept of Z-box. Take the string S = “abcxxxabyyy”. We have an internal part “ab” in the string which repeats its prefix. The internal “ab” is a Z-box. It has the beginning at the position with index 6 and the end in 7 (0-based). Z-boxes are substrings which match string prefixes with the same length. For the Z-box, let’s call the corresponding prefix the prototype (it’s my term but I think it’s OK to use it.)

Z algorithm

Value Classes in Scala

Why I like Scala

Creating a simple parser with ANTLR

GNU Parallel

Finding a cycle in a linked list

Z algorithm

Introduction

Z-boxes and Z-values