Understanding Serde: A Deep Dive into Rust Serialization and Deserialization

Serde is one of the most widely used crates in the Rust ecosystem, and for good reason. As its tagline says, Serde is "a framework for serializing and deserializing Rust data structures efficiently and generically." What’s remarkable is how Serde’s data model supports both human-readable formats like JSON and YAML and compact binary formats like Bincode, all while remaining highly performant.

In this article, we’ll explore how Serde (and its ecosystem of data formats) achieves this. To keep things focused, we’ll look specifically at serialization to JSON, but the concepts apply to other formats and to deserialization as well.

The Serde Data Model: A Mental Model

When learning a new library, I like to imagine how I might implement it myself. Sometimes my guess is close; other times, not so much. With Serde, my initial mental model was:

flowchart TD
    A[Rust struct] -- Serialize --> B[Serde Data Model]
    B -- Format (JSON/Bincode/etc) --> C[Output Format]

I thought that #[derive(Serialize)] would generate code like:

impl Serialize for Point {
    fn serialize(&self) -> SerdeDataModel {
        // ...
    }
}

And that serde_json::to_string would look like:

fn to_string<T: Serialize>(input: T) -> String {
    let serde_data_model = input.serialize();
    let mut output = String::new();
    // Traverse the data model and append JSON to `output`
    // ...
    output
}
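
To make that guess concrete, here is a minimal, std-only sketch of the tree-based design I had in mind. Every name in it (Value, ToValue, render) is hypothetical and invented for illustration; none of them exist in Serde.

```rust
// Hypothetical intermediate representation; Serde has no such type.
#[derive(Debug, PartialEq)]
enum Value {
    I32(i32),
    Struct(Vec<(&'static str, Value)>),
}

// Hypothetical trait matching the guessed `serialize(&self) -> SerdeDataModel` shape.
trait ToValue {
    fn to_value(&self) -> Value;
}

struct Point {
    x: i32,
    y: i32,
}

impl ToValue for Point {
    fn to_value(&self) -> Value {
        // First pass: allocate a tree before any output is produced.
        Value::Struct(vec![("x", Value::I32(self.x)), ("y", Value::I32(self.y))])
    }
}

// Second pass: walk the tree and render it as JSON.
fn render(value: &Value, out: &mut String) {
    match value {
        Value::I32(n) => out.push_str(&n.to_string()),
        Value::Struct(fields) => {
            out.push('{');
            for (i, (name, field)) in fields.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                out.push('"');
                out.push_str(name);
                out.push_str("\":");
                render(field, out);
            }
            out.push('}');
        }
    }
}

fn to_string<T: ToValue>(input: &T) -> String {
    let tree = input.to_value();
    let mut out = String::new();
    render(&tree, &mut out);
    out
}

fn main() {
    let point = Point { x: 1, y: 2 };
    println!("{}", to_string(&point)); // {"x":1,"y":2}
}
```

Note the two passes: one to build the tree (with a Vec allocation per struct) and one to render it.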

But as I dug into the source, I realized this isn’t how Serde works at all.

A Real Example

Let’s ground this with a real example:

use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };
    let serialized = serde_json::to_string(&point).unwrap();
    println!("serialized = {}", serialized); // {"x":1,"y":2}
}

Tracing the Code: How Serde Actually Works

Let’s follow the path of serde_json::to_string(&point):

pub fn to_string<T: ?Sized>(value: &T) -> Result<String>
where
    T: Serialize,
{
    let vec = to_vec(value)?;
    let string = unsafe { String::from_utf8_unchecked(vec) };
    Ok(string)
}

This calls to_vec, which calls to_writer, which creates a Serializer and calls value.serialize(&mut ser).
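
The layering is easier to see in miniature. The sketch below mirrors the to_string, to_vec, to_writer chain using only the standard library. ToySerializer and ToySerialize are hypothetical stand-ins, not serde_json types, and the sketch uses the checked String::from_utf8 rather than the unchecked variant.

```rust
use std::io::{self, Write};

// Hypothetical stand-in for a format's serializer: it only owns a writer.
struct ToySerializer<W: Write> {
    writer: W,
}

// Hypothetical stand-in for `Serialize`, tied to the toy serializer.
trait ToySerialize {
    fn serialize<W: Write>(&self, ser: &mut ToySerializer<W>) -> io::Result<()>;
}

impl ToySerialize for i32 {
    fn serialize<W: Write>(&self, ser: &mut ToySerializer<W>) -> io::Result<()> {
        write!(ser.writer, "{}", self)
    }
}

// Innermost layer: build a serializer around the writer and let the value drive it.
fn to_writer<W: Write, T: ToySerialize>(writer: W, value: &T) -> io::Result<()> {
    let mut ser = ToySerializer { writer };
    value.serialize(&mut ser)
}

// Middle layer: the writer is just a growable byte buffer.
fn to_vec<T: ToySerialize>(value: &T) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    to_writer(&mut buf, value)?;
    Ok(buf)
}

// Outer layer: turn the bytes into a String (checked here, for simplicity).
fn to_string<T: ToySerialize>(value: &T) -> io::Result<String> {
    let vec = to_vec(value)?;
    String::from_utf8(vec).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

fn main() {
    println!("{}", to_string(&42i32).unwrap()); // 42
}
```

Because the innermost layer accepts any writer, the same code path can target a file or socket instead of a Vec.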

Here’s the key: the Serialize trait is implemented for your type (via the derive macro), and it drives the serialization process by calling methods on the serializer.

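Expanded and simplified, the generated code for Point looks roughly like:

// serialize_field and end are methods of this trait, so it must be in scope.
use serde::ser::SerializeStruct;

impl serde::Serialize for Point {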
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        let mut state = serializer.serialize_struct("Point", 2)?;
        state.serialize_field("x", &self.x)?;
        state.serialize_field("y", &self.y)?;
        state.end()
    }
}

Notice: there’s no intermediate "Serde data model" struct or enum. Instead, the derive macro generates code that directly calls methods on the serializer.

The Serializer Trait: The Real Data Model

The real "data model" in Serde is a set of methods defined by the Serializer trait. Each data format (like JSON, Bincode, etc.) implements this trait. The generated Serialize impl for your type calls these methods in the right order, passing in the data.
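
A heavily reduced, std-only analogue shows what "the data model is a set of methods" can look like. MiniSerializer, MiniSerialize, and JsonText are hypothetical names for this sketch. Serde's actual Serializer trait has many more methods, takes self by value, and returns a Result.

```rust
// Hypothetical, heavily reduced analogue of serde's `Serializer` trait:
// the "data model" is just this list of methods.
trait MiniSerializer {
    fn serialize_bool(&mut self, v: bool);
    fn serialize_i32(&mut self, v: i32);
    fn serialize_str(&mut self, v: &str);
}

// Hypothetical analogue of `Serialize`: a type describes itself by calling methods.
trait MiniSerialize {
    fn serialize<S: MiniSerializer>(&self, serializer: &mut S);
}

impl MiniSerialize for bool {
    fn serialize<S: MiniSerializer>(&self, serializer: &mut S) {
        serializer.serialize_bool(*self);
    }
}

impl MiniSerialize for i32 {
    fn serialize<S: MiniSerializer>(&self, serializer: &mut S) {
        serializer.serialize_i32(*self);
    }
}

impl MiniSerialize for str {
    fn serialize<S: MiniSerializer>(&self, serializer: &mut S) {
        serializer.serialize_str(self);
    }
}

// One "format": JSON text appended to a String.
struct JsonText {
    out: String,
}

impl MiniSerializer for JsonText {
    fn serialize_bool(&mut self, v: bool) {
        self.out.push_str(if v { "true" } else { "false" });
    }

    fn serialize_i32(&mut self, v: i32) {
        self.out.push_str(&v.to_string());
    }

    fn serialize_str(&mut self, v: &str) {
        // Partial escaping only; other control characters are not handled here.
        self.out.push('"');
        for c in v.chars() {
            match c {
                '"' => self.out.push_str("\\\""),
                '\\' => self.out.push_str("\\\\"),
                '\n' => self.out.push_str("\\n"),
                other => self.out.push(other),
            }
        }
        self.out.push('"');
    }
}

fn to_json<T: MiniSerialize + ?Sized>(value: &T) -> String {
    let mut ser = JsonText { out: String::new() };
    value.serialize(&mut ser);
    ser.out
}

fn main() {
    println!("{}", to_json(&true)); // true
    println!("{}", to_json(&7i32)); // 7
    println!("{}", to_json("say \"hi\"")); // "say \"hi\""
}
```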

For example, in serde_json, serialize_struct essentially delegates to serialize_map, since a struct is written as a JSON object (shown here without a few feature-gated special cases):

fn serialize_struct(self, name: &'static str, len: usize) -> Result<Self::SerializeStruct> {
    self.serialize_map(Some(len))
}

And serialize_map writes the opening { to the output, then returns a state object that handles writing keys and values.
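
Here is one way such a state object could work, as a std-only sketch. MapState and its first flag are assumptions made for illustration, and values are limited to i32 to keep it short. This is not serde_json's actual implementation.

```rust
// Hypothetical JSON serializer that only owns its output buffer.
struct Json {
    out: String,
}

// Hypothetical state object returned by `serialize_map`.
// It remembers whether a comma is needed before the next entry.
struct MapState<'a> {
    out: &'a mut String,
    first: bool,
}

impl Json {
    fn serialize_map(&mut self) -> MapState<'_> {
        // The opening brace is written immediately, before any entry is known.
        self.out.push('{');
        MapState { out: &mut self.out, first: true }
    }
}

impl MapState<'_> {
    fn serialize_entry(&mut self, key: &str, value: i32) {
        if !self.first {
            self.out.push(',');
        }
        self.first = false;
        self.out.push('"');
        self.out.push_str(key);
        self.out.push_str("\":");
        self.out.push_str(&value.to_string());
    }

    // Consumes the state so no further entries can be written after the closing brace.
    fn end(self) {
        let MapState { out, .. } = self;
        out.push('}');
    }
}

fn point_to_json(x: i32, y: i32) -> String {
    let mut json = Json { out: String::new() };
    let mut state = json.serialize_map();
    state.serialize_entry("x", x);
    state.serialize_entry("y", y);
    state.end();
    json.out
}

fn main() {
    println!("{}", point_to_json(1, 2)); // {"x":1,"y":2}
}
```

Having end take self by value means the compiler rejects any attempt to write a field after the object is closed.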

Walking Through Serialization

Here’s a simplified flow of what happens when you serialize a struct:

sequenceDiagram
    participant User as User Code
    participant Derive as #[derive(Serialize)]
    participant SerdeJson as serde_json::Serializer
    User->>Derive: #[derive(Serialize)] on Point
    User->>SerdeJson: serde_json::to_string(&point)
    SerdeJson->>Derive: Point::serialize(&mut ser)
    Derive->>SerdeJson: serialize_struct("Point", 2)
    Derive->>SerdeJson: serialize_field("x", &self.x)
    Derive->>SerdeJson: serialize_field("y", &self.y)
    Derive->>SerdeJson: end()

Each call to serialize_field writes a key and value to the output, using the serializer’s methods. There’s no intermediate representation—just a series of method calls.
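
One way to see that "series of method calls" is to swap the output-writing serializer for one that only records what it was asked to do. The Trace type below is a hypothetical, std-only recorder, and Point::serialize is a hand-written stand-in for the derived impl.

```rust
// Hypothetical recorder: instead of writing output, it logs each call it receives.
#[derive(Default)]
struct Trace {
    calls: Vec<String>,
}

impl Trace {
    fn serialize_struct(&mut self, name: &str, len: usize) {
        self.calls.push(format!("serialize_struct({}, {})", name, len));
    }

    fn serialize_field(&mut self, key: &str, value: i32) {
        self.calls.push(format!("serialize_field({}, {})", key, value));
    }

    fn end(&mut self) {
        self.calls.push("end()".to_string());
    }
}

struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // Hand-written stand-in for what the derive macro would generate.
    fn serialize(&self, ser: &mut Trace) {
        ser.serialize_struct("Point", 2);
        ser.serialize_field("x", self.x);
        ser.serialize_field("y", self.y);
        ser.end();
    }
}

fn trace(point: &Point) -> Vec<String> {
    let mut ser = Trace::default();
    point.serialize(&mut ser);
    ser.calls
}

fn main() {
    for call in trace(&Point { x: 1, y: 2 }) {
        println!("{}", call);
    }
}
```

The recorded sequence matches the arrows from the derived impl to the serializer in the diagram above.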

Why This Design?

My original guess involved building an intermediate data structure, but Serde’s approach avoids this for performance. The derive macro generates code that walks your data structure and calls serializer methods directly, so output is produced in a single pass with no intermediate tree to allocate and then traverse.

The "Serde data model" isn’t a struct or enum—it’s a set of trait methods. The derive macro generates the driver (the Serialize impl), and each data format provides the engine (the Serializer trait implementation).
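
The driver/engine split can be sketched with one driver and two engines. StructSerializer, Json, and Binary are hypothetical, std-only types. The binary layout is made up for this example and is not Bincode's wire format.

```rust
// Hypothetical reduced `Serializer`: only structs with i32 fields.
trait StructSerializer {
    fn begin(&mut self, name: &'static str, len: usize);
    fn field(&mut self, key: &'static str, value: i32);
    fn end(&mut self);
}

struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // The driver: written once, with no knowledge of any output format.
    fn serialize<S: StructSerializer>(&self, ser: &mut S) {
        ser.begin("Point", 2);
        ser.field("x", self.x);
        ser.field("y", self.y);
        ser.end();
    }
}

// Engine 1: JSON text.
struct Json {
    out: String,
    first: bool,
}

impl StructSerializer for Json {
    fn begin(&mut self, _name: &'static str, _len: usize) {
        self.out.push('{');
        self.first = true;
    }

    fn field(&mut self, key: &'static str, value: i32) {
        if !self.first {
            self.out.push(',');
        }
        self.first = false;
        self.out.push_str(&format!("\"{}\":{}", key, value));
    }

    fn end(&mut self) {
        self.out.push('}');
    }
}

// Engine 2: a made-up compact layout that drops field names
// and writes each i32 as 4 little-endian bytes.
struct Binary {
    out: Vec<u8>,
}

impl StructSerializer for Binary {
    fn begin(&mut self, _name: &'static str, _len: usize) {}

    fn field(&mut self, _key: &'static str, value: i32) {
        self.out.extend_from_slice(&value.to_le_bytes());
    }

    fn end(&mut self) {}
}

fn to_json(point: &Point) -> String {
    let mut ser = Json { out: String::new(), first: true };
    point.serialize(&mut ser);
    ser.out
}

fn to_binary(point: &Point) -> Vec<u8> {
    let mut ser = Binary { out: Vec::new() };
    point.serialize(&mut ser);
    ser.out
}

fn main() {
    let point = Point { x: 1, y: 2 };
    println!("{}", to_json(&point)); // {"x":1,"y":2}
    println!("{:?}", to_binary(&point)); // [1, 0, 0, 0, 2, 0, 0, 0]
}
```

Point::serialize never changes. Adding a third format means adding another engine only.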

Conclusion

Serde’s design is elegant and efficient. By generating code that directly drives the serializer, it avoids unnecessary allocations and enables support for a wide range of data formats. If you want to understand how Serde works for other formats or for deserialization, you can follow a similar process—trace the trait implementations and see how the pieces fit together.
