Kafka

Proptest: property testing in Rust

You probably heard about property testing (AKA property-based testing), a testing technique where test inputs are randomly generated in great quantity and particular desired properties of the code under test are checked. It’s very practical and I’m a big proponent of using it where it makes sense.

In this post, I will tell you how I used property testing with the Proptest library in Rust to ensure the correctness of a bunch of generated serialization/deserialization code for the Apache Kafka protocol.

In this post:

  1. A real-life example of using property testing in Rust with Proptest.
  2. A step-by-step explanation.

The post assumes you’re familiar with Rust. However, to follow it you don’t need any knowledge about Kafka or its protocol. I will explain the problem I was working on briefly, but will not go into much detail.

Introduction

Apache Kafka has a sizeable custom binary protocol that is versioned, with various data types, optional fields, etc. Unfortunately, it doesn’t use a well-known serialization format like Protobuf. The protocol message schema is described in JSON. The actual Java code that does serialization and deserialization is generated from this description1.

I created a library kafka_wire_protocol. It’s a generated de-/serialization code for the Kafka protocol, but for Rust (and in the future, for Go). The correctness of the generated code is paramount. I applied many testing techniques to it: quick unit tests, integration tests against a real Kafka instance, fuzzying. And also kafka_wire_protocol is a good example of software that can benefit from property testing. About this is going to be this post.

You don’t need to familiarize yourself with kafka_wire_protocol to understand what will be discussed. I will give examples from the code, but they will be self-contained. However, if you’re interested, feel free to check out this repo and follow the instruction in README to run the code generator, compile, and run the tests.

Kafka protocol practical guide

I worked with the Apache Kafka protocol on the low level quite a bit. It wasn’t easy to start doing this following the official guide only and I read the code a lot. With this post, I want to give you a head start by guiding you step by step from primitive values to meaningful requests.

In this post:

  1. Explore the Kafka protocol code and the protocol in action with Wireshark.
  2. Learn how to read and write primitive values.
  3. Combine primitives to perform meaningful requests.

We will use Python as the programming language. However, the code will be zero-dependency and easily portable to the language of your choice.