Proptest: property testing in Rust

September 22, 2024

You have probably heard of property testing (also known as property-based testing), a testing technique where test inputs are randomly generated in large quantities and particular desired properties of the code under test are checked. It’s very practical, and I’m a big proponent of using it where it makes sense.

In this post, I will tell you how I used property testing with the Proptest library in Rust to ensure the correctness of a bunch of generated serialization/deserialization code for the Apache Kafka protocol.

In this post:

A real-life example of using property testing in Rust with Proptest.
A step-by-step explanation.

The post assumes you’re familiar with Rust. However, to follow it you don’t need any knowledge about Kafka or its protocol. I will explain the problem I was working on briefly, but will not go into much detail.

Introduction

Apache Kafka has a sizeable custom binary protocol that is versioned, with various data types, optional fields, etc. Unfortunately, it doesn’t use a well-known serialization format like Protobuf. The protocol message schema is described in JSON. The actual Java code that does serialization and deserialization is generated from this description¹.

I created a library, kafka_wire_protocol. It contains generated serialization/deserialization code for the Kafka protocol, but for Rust (and in the future, for Go). The correctness of the generated code is paramount. I applied many testing techniques to it: quick unit tests, integration tests against a real Kafka instance, and fuzzing. kafka_wire_protocol is also a good example of software that can benefit from property testing, which is what this post is about.

You don’t need to familiarize yourself with kafka_wire_protocol to understand what will be discussed. I will give examples from the code, but they will be self-contained. However, if you’re interested, feel free to check out this repo and follow the instructions in the README to run the code generator, compile, and run the tests.

Property testing

If we quote Wikipedia:

Property testing is a testing technique where, instead of asserting that specific inputs produce specific expected outputs, the practitioner randomly generates many inputs, runs the program on all of them, and asserts the truth of some “property” that should be true for every pair of input and output.

The programmer writes the test code similar to unit tests, i.e. an input, some operation(s) on this input, and finally some checks on the output. The set of checks represents the properties we’re testing in code. Normally, some property testing library is used to support this. The library does two main things:

It generates many inputs randomly, guided by the programmer. Setting up the generation strategy should be convenient: mostly automatic and easily composable from building blocks.
Once a violation of a property under test is detected, the library shrinks the input, that is, minifies it to make it more comprehensible to people.

For example, every output from a serialization function should be accepted by the corresponding deserialization function.

This sounds quite like what we want to achieve here! Property testing is particularly good for “paired” functions, like serialization/deserialization, encryption/decryption, compression/decompression, and so on, because they are so easy to check against each other.
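To make the idea concrete before bringing in a library, here is a minimal hand-rolled sketch of a round-trip property check using only the standard library. Everything in it is a hypothetical illustration: the `XorShift` generator, the toy length-prefixed `encode`/`decode` pair (not the Kafka format), and the `check_roundtrip` helper. A real property testing library adds composable generators and shrinking on top of a loop like this.

```rust
/// Tiny xorshift64 pseudo-random generator, so the sketch needs no crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Toy format: one length byte followed by that many payload bytes.
fn encode(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u8::MAX as usize);
    let mut out = vec![payload.len() as u8];
    out.extend_from_slice(payload);
    out
}

/// Inverse of `encode`; rejects input whose length byte does not match.
fn decode(bytes: &[u8]) -> Option<Vec<u8>> {
    let (&len, rest) = bytes.split_first()?;
    if rest.len() != len as usize {
        return None;
    }
    Some(rest.to_vec())
}

/// Generates `cases` random payloads and checks that decode(encode(x)) == x.
/// Returns the first counterexample, if any.
fn check_roundtrip(seed: u64, cases: usize) -> Option<Vec<u8>> {
    let mut rng = XorShift(seed);
    for _ in 0..cases {
        let len = (rng.next_u64() % 16) as usize;
        let payload: Vec<u8> = (0..len).map(|_| rng.next_u64() as u8).collect();
        if decode(&encode(&payload)).as_deref() != Some(payload.as_slice()) {
            return Some(payload);
        }
    }
    None
}
```

Calling `check_roundtrip(42, 1000)` runs a thousand random cases; a `None` result means no counterexample was found, which is evidence rather than proof of correctness.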

Property testing was invented (to the best of my knowledge) by Koen Claessen and John Hughes in the library QuickCheck back in 1999. If you’re interested, you can read the story of QuickCheck here.

Originally made for Haskell, QuickCheck inspired many similar libraries for numerous programming platforms.

Proptest

Proptest, inspired by Hypothesis (a Python library), is a property testing library for Rust. I used Proptest for kafka_wire_protocol.

Step by step towards property tests

Let’s manually re-create the generated code with property tests step by step. We will use MetadataRequest version 12. If you’re interested in the final generated variant of the code in the following sections, check out this file. Meanwhile, I skip all the cfg(test) attributes here to not clutter the code.

Step 1. Simplified generated code

The root struct is called MetadataRequest, which contains a vector of another struct MetadataRequestTopic. The implementations of Readable and Writable are generated, so feel free to ignore what’s inside them.

use std::io::{Read, Result, Write};

use uuid::Uuid;

use crate::arrays::{read_nullable_array, write_nullable_array};
use crate::readable_writable::{Readable, Writable};
use crate::tagged_fields::{read_tagged_fields, write_tagged_fields, RawTaggedField};

#[derive(Clone, PartialEq)]
pub struct MetadataRequest {
    pub topics: Option<Vec<MetadataRequestTopic>>,
    pub allow_auto_topic_creation: bool,
    pub include_topic_authorized_operations: bool,
    pub _unknown_tagged_fields: Vec<RawTaggedField>,
}

/* Generated code */
impl Readable for MetadataRequest {
    fn read(#[allow(unused)] input: &mut impl Read) -> Result<Self> {
        let topics = read_nullable_array::<MetadataRequestTopic>(input, "topics", true)?;
        let allow_auto_topic_creation = bool::read(input)?;
        let include_topic_authorized_operations = bool::read(input)?;
        let tagged_fields_callback = |tag: i32, _: &[u8]| {
            match tag {
                _ => Ok(false)
            }
        };
        let _unknown_tagged_fields = read_tagged_fields(input, tagged_fields_callback)?;
        Ok(MetadataRequest {
            topics, allow_auto_topic_creation, include_topic_authorized_operations, _unknown_tagged_fields
        })
    }
}

/* Generated code */
impl Writable for MetadataRequest {
    fn write(&self, #[allow(unused)] output: &mut impl Write) -> Result<()> {
        write_nullable_array(output, "topics", self.topics.as_deref(), true)?;
        self.allow_auto_topic_creation.write(output)?;
        self.include_topic_authorized_operations.write(output)?;
        write_tagged_fields(output, &[], &self._unknown_tagged_fields)?;
        Ok(())
    }
}

#[derive(Clone, PartialEq)]
pub struct MetadataRequestTopic {
    pub topic_id: Uuid,
    pub name: Option<String>,
    pub _unknown_tagged_fields: Vec<RawTaggedField>,
}

/* Generated code */
impl Readable for MetadataRequestTopic {
    fn read(#[allow(unused)] input: &mut impl Read) -> Result<Self> {
        let topic_id = Uuid::read(input)?;
        let name = Option::<String>::read_ext(input, "name", true)?;
        let tagged_fields_callback = |tag: i32, _: &[u8]| {
            match tag {
                _ => Ok(false)
            }
        };
        let _unknown_tagged_fields = read_tagged_fields(input, tagged_fields_callback)?;
        Ok(MetadataRequestTopic {
            topic_id, name, _unknown_tagged_fields
        })
    }
}

/* Generated code */
impl Writable for MetadataRequestTopic {
    fn write(&self, #[allow(unused)] output: &mut impl Write) -> Result<()> {
        self.topic_id.write(output)?;
        self.name.write_ext(output, "name", true)?;
        write_tagged_fields(output, &[], &self._unknown_tagged_fields)?;
        Ok(())
    }
}

If we ignore the meaning of these fields, this is quite a trivial situation: a simple struct containing a bunch of instances of another struct plus some extra data. You can find this in every other program.

Step 2. Property test for serialization/deserialization

First, let’s add the proptest crate to the project. Then, write code like this:

#[cfg(test)]
mod tests {
    use std::io::{Cursor, Seek, SeekFrom};

    use proptest::prelude::*;

    use super::*;

    proptest! {
        #[test]
        fn test_serde(data: MetadataRequest) {
            // Serialize.
            let mut cur = Cursor::new(Vec::<u8>::new());
            data.write(&mut cur).unwrap();

            // Deserialize.
            cur.seek(SeekFrom::Start(0)).unwrap();
            let data_read = MetadataRequest::read(&mut cur).unwrap();

            // Compare.
            prop_assert_eq!(data_read, data.clone());
        }
    }
}

As you can see, we don’t explicitly create instances of MetadataRequest. Thanks to the proptest! macro, Proptest does this automatically for us.

However, if we try to compile this, it fails:

error[E0277]: the trait bound `v12::MetadataRequest: Arbitrary` is not satisfied
   --> src/schema/metadata_request/v12.rs:90:29
    |
90  |         fn test_serde(data: MetadataRequest) {
    |                             ^^^^^^^^^^^^^^^ the trait `Arbitrary` is not implemented for `v12::MetadataRequest`

This means that the macro is expecting an implementation of Arbitrary to be available for the structure MetadataRequest for the magic to work.

Let’s understand the central concept of the Proptest library (and many other property testing libraries), which is Strategy. Quoting the documentation:

A strategy defines two things:

How to generate random values of a particular type from a random number generator.

How to “shrink” such values into “simpler” forms.

Arbitrary determines a canonical Strategy for the implementing type. To generate test instances, Proptest requires the type to implement the Arbitrary trait (there are other ways to tell Proptest what Strategy to use, too).
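The two responsibilities of a strategy can be modeled in a few lines of standard-library Rust. This is a deliberately simplified sketch, not Proptest’s actual `Strategy` trait (which is richer and works differently internally); `ToyStrategy`, `UpTo`, and `minimize` are hypothetical names introduced only for illustration.

```rust
/// Simplified stand-in for the idea of a strategy (hypothetical trait).
trait ToyStrategy {
    type Value;
    /// Produce a value from a source of random bits.
    fn generate(&self, random_bits: u64) -> Self::Value;
    /// Propose "simpler" candidates for a value, simplest first.
    fn shrink(&self, value: &Self::Value) -> Vec<Self::Value>;
}

/// Integers in `0..=max`; "simpler" means closer to zero.
struct UpTo {
    max: u32,
}

impl ToyStrategy for UpTo {
    type Value = u32;

    fn generate(&self, random_bits: u64) -> u32 {
        (random_bits % (self.max as u64 + 1)) as u32
    }

    fn shrink(&self, value: &u32) -> Vec<u32> {
        match *value {
            0 => vec![],
            v => vec![0, v / 2, v - 1],
        }
    }
}

/// Repeatedly replaces a failing value with the first simpler candidate
/// that still violates the property, until no candidate fails.
fn minimize<S: ToyStrategy>(
    strategy: &S,
    mut failing: S::Value,
    holds: impl Fn(&S::Value) -> bool,
) -> S::Value {
    loop {
        let simpler = strategy
            .shrink(&failing)
            .into_iter()
            .find(|candidate| !holds(candidate));
        match simpler {
            Some(value) => failing = value,
            None => return failing,
        }
    }
}
```

For the property “x is below 100”, starting from the failing value 750, `minimize` walks down to 100, the smallest value that still breaks the property. That is the kind of minimal counterexample a shrinking step aims to report.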

There are many strategies for fundamental types provided by the library. Let’s implement (or rather, derive) the strategy for our structs.

Step 3. Deriving `Arbitrary`

Implementing Arbitrary manually for each struct we may be interested in would take a lot of boring and error-prone work. Luckily, generating simple strategies for structs and enums is straightforward and can be done automatically with the proptest-derive crate. It provides the #[derive(Arbitrary)] macro, which we’re going to use here. We will also need Debug (because Arbitrary requires it), and Clone and PartialEq (for the test code), which can also be derived.

#[derive(Arbitrary, Debug, Clone, PartialEq)]
pub struct MetadataRequest {
...

Since derivation is recursive, the same is needed for MetadataRequestTopic used by MetadataRequest:

#[derive(Arbitrary, Debug, Clone, PartialEq)]
pub struct MetadataRequestTopic {
...

However, this still doesn’t compile:

error[E0277]: the trait bound `uuid::Uuid: types::_::_proptest::arbitrary::Arbitrary` is not satisfied
 --> src/schema/metadata_request/v12.rs:9:10
  |
9 | #[derive(Arbitrary, Debug)]
  |          ^^^^^^^^^ the trait `types::_::_proptest::arbitrary::Arbitrary` is not implemented for `uuid::Uuid`

MetadataRequestTopic uses Uuid inside, which doesn’t implement Arbitrary. Moreover, Arbitrary and Uuid are defined in different crates. Rust prohibits implementing the former for the latter following the “orphan rule”. A workaround may be to create a new wrapper type. However, this is a public-facing API of kafka_wire_protocol and it didn’t sound great to me to expose a newtype for UUIDs there.
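For completeness, the newtype workaround can be sketched with standard-library types only. Here `std::fmt::Display` stands in for the foreign trait (Arbitrary) and `Vec<u8>` for the foreign type (Uuid); the `HexBytes` wrapper is a hypothetical name, not something from kafka_wire_protocol.

```rust
use std::fmt;

// `impl fmt::Display for Vec<u8> { ... }` is rejected by the compiler (E0117):
// both the trait and the type are defined outside this crate.

/// Local wrapper type: implementing a foreign trait for it is allowed.
pub struct HexBytes(pub Vec<u8>);

impl fmt::Display for HexBytes {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print every byte as two lowercase hexadecimal digits.
        for byte in &self.0 {
            write!(f, "{:02x}", byte)?;
        }
        Ok(())
    }
}
```

The price is visible in the signature: callers now have to wrap and unwrap `HexBytes` instead of using `Vec<u8>` directly, which is exactly the kind of API leakage I wanted to avoid for UUIDs.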

Proptest offers another workaround.

Step 4. Custom strategy for `Uuid`

Apart from the default strategy provided by Arbitrary, one can define as many custom strategies as needed and instruct proptest-derive to use them with annotations. Here you can find some info on how to create custom strategies.

What’s a random UUID? It’s random 128 bits. Proptest has two things:

The out-of-the-box strategy for generating 128-bit numbers.
A way to transform strategies to produce new strategies.

Let’s combine these two:

pub(crate) fn uuid() -> impl Strategy<Value=Uuid> {
    any::<u128>().prop_map(Uuid::from_u128)
}
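Mapping from u128 is enough because every 128-bit value has exactly one textual UUID form, so the mapped strategy needs no filtering. The standard-library sketch below illustrates this with a hypothetical `format_as_uuid` helper; it is not part of the uuid crate, which performs the real conversion.

```rust
/// Formats 128 bits in the canonical 8-4-4-4-12 hexadecimal UUID layout.
fn format_as_uuid(bits: u128) -> String {
    // 128 bits are always exactly 32 hexadecimal digits when zero-padded.
    let hex = format!("{:032x}", bits);
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}
```

Both extremes, `0` and `u128::MAX`, produce well-formed strings, so any number the generator comes up with is usable as a topic ID in the tests.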

Now let’s ask Proptest to use this strategy when deriving Arbitrary for MetadataRequestTopic. First, import the strategy if it’s defined in another module:

use crate::test_utils::proptest_strategies;

Then, annotate the field in the MetadataRequestTopic struct:

#[derive(Arbitrary, Debug, Clone, PartialEq)]
pub struct MetadataRequestTopic {
    #[proptest(strategy = "proptest_strategies::uuid()")]
    pub topic_id: Uuid,
    pub name: Option<String>,
    pub _unknown_tagged_fields: Vec<RawTaggedField>,
}

Step 5. First run and first failure

The code finally compiles. Let’s run the tests:

Test failed: called `Result::unwrap()` on an `Err` value: Custom { kind: Other, error: "Invalid raw tag field list: tag -1 comes after tag -1, but is not higher than it." }.
minimal failing input: data = MetadataRequest {
    topics: None,
    allow_auto_topic_creation: false,
    include_topic_authorized_operations: false,
    _unknown_tagged_fields: [
        RawTaggedField {
            tag: -1,
            data: [],
        },
        RawTaggedField {
            tag: 0,
            data: [],
        },
    ],
}
    successes: 0
    local rejects: 0
    global rejects: 0
thread 'schema::metadata_request::v12::tests::test_serde' panicked at src/schema/metadata_request/v12.rs:97:34:
called `Result::unwrap()` on an `Err` value: Custom { kind: Other, error: "Invalid raw tag field list: tag -358390220 comes after tag -1, but is not higher than it." }

(Your run may have different tag values or number of tagged fields, but the idea will be the same.)

The run failed with the error Invalid raw tag field list: tag -1 comes after tag -1, but is not higher than it. The check behind it is an implementation detail (you can read the code here if you’re interested), but the high-level idea is that the tags in a list of tagged fields must be strictly increasing. So the serde code may well be correct; it is the derived Proptest strategy that produces invalid inputs. Let’s fix this issue.

Step 6. Custom strategy for `Vec<RawTaggedField>`

There are multiple strategies we could employ for generating the _unknown_tagged_fields field. However, a simple one will suffice. In plain language, it reads:

Either no unknown tagged fields, or a single field tagged with 999 and random bytes.

Let’s define it:

pub(crate) fn unknown_tagged_fields() -> impl Strategy<Value=Vec<RawTaggedField>> {
    prop_oneof![
        Just(Vec::<RawTaggedField>::new()),
        bytes().prop_map(|data| vec![RawTaggedField{ tag: 999, data }]),
    ]
}

The prop_oneof! macro tells Proptest to select one of its elements, which are strategies themselves. It’s like an if or switch statement in the randomized property testing world. For example, check out how it can be used for generating enums. prop_oneof! supports assigning weights to variants in case you need some of them to appear more frequently than others.

Just is a very simple strategy that always produces the provided value. In this case, the strategy gives out an empty vector of RawTaggedField.

In the second arm of prop_oneof! there’s the strategy for creating a single-element vector of RawTaggedField. It’s created by transformation from our yet-to-be-written random byte vector strategy bytes() into a single element vector strategy where the bytes represent the tagged field data.
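The weighted selection that prop_oneof! supports can be modeled conceptually as follows. This is a standard-library sketch of the general idea, not Proptest’s implementation; `pick_weighted` is a hypothetical helper. In Proptest itself, a weight is written in front of an arm as `weight => strategy`.

```rust
/// Returns the index of the arm selected by `roll`, where arm `i` owns a
/// share of the range `0..total` proportional to `weights[i]`.
fn pick_weighted(weights: &[u32], roll: u32) -> usize {
    let total: u32 = weights.iter().sum();
    assert!(total > 0, "at least one arm needs a non-zero weight");
    let mut remaining = roll % total;
    for (index, &weight) in weights.iter().enumerate() {
        if remaining < weight {
            return index;
        }
        remaining -= weight;
    }
    unreachable!("roll % total always falls into some arm")
}
```

With weights `[3, 1]`, three out of every four rolls select the first arm, so a uniformly random roll picks it about 75% of the time.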

Let’s now define bytes(). Proptest offers an out-of-the-box strategy for Vec<u8>. However, it works with an arbitrary length. Huge values would not bring any new information to the tests, only slow them down and increase the resource consumption. So let’s limit the length of the output.

pub(crate) fn bytes() -> impl Strategy<Value=Vec<u8>> {
    collection::vec(prop::num::u8::ANY, collection::size_range(0..10))
}

collection::vec is Proptest’s way to define a vector strategy based on an element strategy. The second parameter defines the range of allowed vector lengths.

Having created the unknown_tagged_fields strategy, let’s apply it to our structs:

#[derive(Arbitrary, Debug, Clone, PartialEq)]
pub struct MetadataRequest {
    ...
    #[proptest(strategy = "proptest_strategies::unknown_tagged_fields()")]
    pub _unknown_tagged_fields: Vec<RawTaggedField>,
}

#[derive(Arbitrary, Debug, Clone, PartialEq)]
pub struct MetadataRequestTopic {
    ...
    #[proptest(strategy = "proptest_strategies::unknown_tagged_fields()")]
    pub _unknown_tagged_fields: Vec<RawTaggedField>,
}

Now when we run the test, it finally passes.
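The simple strategy in this step exercises at most one unknown tagged field. One richer alternative would be to generate several fields and then enforce the invariant by sorting and removing duplicate tags. Below is a standard-library sketch of that normalization step; `ToyTaggedField` is a hypothetical stand-in for RawTaggedField, and modeling tags as unsigned is my assumption here. With Proptest, such a function could be attached to a vector strategy via prop_map.

```rust
/// Hypothetical stand-in for the library's `RawTaggedField`;
/// tags are modeled as unsigned in this sketch.
#[derive(Debug, Clone, PartialEq)]
struct ToyTaggedField {
    tag: u32,
    data: Vec<u8>,
}

/// Turns arbitrary generated fields into a list with strictly increasing
/// tags by sorting and dropping fields whose tag repeats.
fn normalize(mut fields: Vec<ToyTaggedField>) -> Vec<ToyTaggedField> {
    fields.sort_by_key(|field| field.tag);
    fields.dedup_by_key(|field| field.tag);
    fields
}

/// The invariant that the error message in Step 5 complains about.
fn tags_strictly_increase(fields: &[ToyTaggedField]) -> bool {
    fields.windows(2).all(|pair| pair[0].tag < pair[1].tag)
}
```

Whether the extra coverage is worth the added complexity depends on how much logic the tagged-field code path contains; for my purposes the one-field strategy was enough.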

Conclusion

There’s more to Proptest than I wrote in this post (but not much more), so please check the documentation. I hope this practical introduction will help you get started with the library in your projects. Also, check out similar libraries for other languages, like the already mentioned Hypothesis for Python or jqwik for Java.

Footnote: testing against real Java Kafka code

With property testing done, we’re more or less sure that our serialization and deserialization work correctly together. But what if they contain a dual error that makes them correct with respect to each other, but incorrect for the real Kafka serialization/deserialization, which we consider the gold standard and source of truth? There’s only one way to be sure about that, and that is to test against the Java code.
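Here is a minimal standard-library sketch of such a dual error. It assumes a format that encodes 32-bit integers in big-endian order, as network protocols such as Kafka’s typically do; the three functions are hypothetical and not taken from kafka_wire_protocol.

```rust
/// What the reference implementation produces: 4 bytes, big-endian.
fn reference_write(value: i32) -> [u8; 4] {
    value.to_be_bytes()
}

/// Buggy writer: mistakenly emits little-endian bytes.
fn buggy_write(value: i32) -> [u8; 4] {
    value.to_le_bytes()
}

/// Buggy reader: makes the mirror-image mistake, so it agrees with the writer.
fn buggy_read(bytes: [u8; 4]) -> i32 {
    i32::from_le_bytes(bytes)
}
```

The round-trip property holds for every input, because `buggy_read(buggy_write(x))` always returns `x`. Yet for most values the bytes differ from the reference ones, and only a comparison against the reference implementation can reveal that.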

In my day job, we developed an open source library, Kio. It essentially does what kafka_wire_protocol does, but in Python. Kio has a component named (very straightforwardly) Java Tester, implemented by yours truly some time ago. Its point is simple yet powerful: we generate random objects like in our serde tests. But instead of just checking the serialization-deserialization pair with our generated code, why don’t we also ask the real Java Kafka serde code to do the same and compare the results? Java Tester does exactly that: it serializes the object in binary form and also in JSON and sends these two representations to a special Java tester process. This process, in turn, reconstructs the object from its JSON representation and then serializes it to binary form itself. After that, it compares what the external code produced with its own result. The expectation is byte-for-byte equality; otherwise an error is reported.
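The final comparison step can be sketched as below. This is a hypothetical standard-library helper, not code from Kio or kafka_wire_protocol; reporting the offset of the first difference is my own addition to make failures easier to read.

```rust
/// Returns the offset of the first difference between two serializations,
/// or `None` when they are byte-for-byte equal.
fn first_mismatch(ours: &[u8], reference: &[u8]) -> Option<usize> {
    let common = ours.len().min(reference.len());
    (0..common)
        .find(|&i| ours[i] != reference[i])
        // Equal prefix but different lengths: the mismatch starts where
        // the shorter input ends.
        .or_else(|| (ours.len() != reference.len()).then_some(common))
}
```

An offset is usually enough to map the failure back to a particular field, since the fields are written in a fixed order.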

The Java Tester approach is what was needed for kafka_wire_protocol, so I copied it from Kio with a few changes. It’s pretty sizeable (due to the reconstruction logic); you can check it out in the repo. I’m not going to explain its internal details, because that would be mostly off-topic.

¹ See my recent post on the Kafka protocol.