June 6, 2025
Serde is one of the most widely used crates in the Rust ecosystem, and for good reason. As its tagline says, Serde is "a framework for serializing and deserializing Rust data structures efficiently and generically." What’s remarkable is how Serde’s data model supports both human-readable formats like JSON and YAML and compact binary formats like Bincode, all while remaining highly performant.
In this article, we’ll explore how Serde (and its ecosystem of data formats) achieves this. To keep things focused, we’ll look specifically at serialization to JSON, but the concepts apply to other formats and to deserialization as well.
When learning a new library, I like to imagine how I might implement it myself. Sometimes my guess is close; other times, not so much. With Serde, my initial mental model was:
flowchart TD
A[Rust struct] -- Serialize --> B[Serde Data Model]
B -- Format (JSON/Bincode/etc) --> C[Output Format]
I thought that #[derive(Serialize)] would generate code like:
impl Serialize for Point {
    fn serialize(&self) -> SerdeDataModel {
        // ...
    }
}
And that serde_json::to_string would look like:
fn to_string<T: Serialize>(input: T) -> String {
    let serde_data_model = input.serialize();
    // Traverse the data model and build JSON
    // ...
    output
}
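To make that guess concrete, here is a minimal std-only sketch of the design I had in mind. The DataModel enum and the to_data_model and write_json names are my own hypothetical stand-ins, not anything Serde defines, and the sketch only handles i32 fields:

```rust
use std::fmt::Write;

// Hypothetical intermediate representation; Serde defines no such type.
enum DataModel {
    I32(i32),
    Struct(Vec<(&'static str, DataModel)>),
}

struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // Step 1 of the guessed design: build a tree (this allocates a Vec).
    fn to_data_model(&self) -> DataModel {
        DataModel::Struct(vec![
            ("x", DataModel::I32(self.x)),
            ("y", DataModel::I32(self.y)),
        ])
    }
}

// Step 2 of the guessed design: walk the tree and emit JSON text.
fn write_json(value: &DataModel, out: &mut String) {
    match value {
        DataModel::I32(n) => write!(out, "{}", n).unwrap(),
        DataModel::Struct(fields) => {
            out.push('{');
            for (i, (name, field)) in fields.iter().enumerate() {
                if i > 0 {
                    out.push(',');
                }
                write!(out, "\"{}\":", name).unwrap();
                write_json(field, out);
            }
            out.push('}');
        }
    }
}

fn to_string(point: &Point) -> String {
    let mut out = String::new();
    write_json(&point.to_data_model(), &mut out);
    out
}

fn main() {
    println!("{}", to_string(&Point { x: 1, y: 2 }));
}
```

The thing to notice is the two passes: one to build the tree and one to walk it, with a heap allocation in between.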
But as I dug into the source, I realized this isn’t how Serde works at all.
Let’s ground this with a real example:
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let point = Point { x: 1, y: 2 };
    let serialized = serde_json::to_string(&point).unwrap();
    println!("serialized = {}", serialized); // {"x":1,"y":2}
}
Let’s follow the path of serde_json::to_string(&point):
pub fn to_string<T: ?Sized>(value: &T) -> Result<String>
where
    T: Serialize,
{
    let vec = to_vec(value)?;
    let string = unsafe { String::from_utf8_unchecked(vec) };
    Ok(string)
}
This calls to_vec, which calls to_writer, which creates a Serializer and calls value.serialize(&mut ser).
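The layering of that call chain can be sketched with the standard library alone. This is a toy under stated assumptions: the WriteJson trait is a hypothetical stand-in for Serialize, the function bodies are my own, and only the three function names mirror serde_json:

```rust
use std::io::{self, Write};

// Hypothetical stand-in for a serializable value; Serde uses `Serialize` here.
trait WriteJson {
    fn write_json<W: Write>(&self, writer: &mut W) -> io::Result<()>;
}

struct Point {
    x: i32,
    y: i32,
}

impl WriteJson for Point {
    fn write_json<W: Write>(&self, writer: &mut W) -> io::Result<()> {
        write!(writer, "{{\"x\":{},\"y\":{}}}", self.x, self.y)
    }
}

// Innermost layer: write into any `io::Write` sink.
fn to_writer<W: Write, T: WriteJson>(mut writer: W, value: &T) -> io::Result<()> {
    value.write_json(&mut writer)
}

// Middle layer: the sink is a growable byte buffer.
fn to_vec<T: WriteJson>(value: &T) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    to_writer(&mut buf, value)?;
    Ok(buf)
}

// Outer layer: turn the bytes into a String.
fn to_string<T: WriteJson>(value: &T) -> io::Result<String> {
    let bytes = to_vec(value)?;
    // This sketch uses the checked conversion; the serde_json excerpt above
    // uses the unchecked one.
    Ok(String::from_utf8(bytes).expect("writer only emits UTF-8"))
}

fn main() -> io::Result<()> {
    println!("{}", to_string(&Point { x: 1, y: 2 })?);
    Ok(())
}
```

Because everything funnels into to_writer, the same code path can target a file, a socket, or an in-memory buffer.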
Here’s the key: the Serialize trait is implemented for your type (via the derive macro), and it drives the serialization process by calling methods on the serializer.
Expanded, the generated code for Point looks like:
// Needed so that `serialize_field` and `end` resolve when this impl is written by hand.
use serde::ser::SerializeStruct;

impl serde::Serialize for Point {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        let mut state = serializer.serialize_struct("Point", 2)?;
        state.serialize_field("x", &self.x)?;
        state.serialize_field("y", &self.y)?;
        state.end()
    }
}
Notice: there’s no intermediate "Serde data model" struct or enum. Instead, the derive macro generates code that directly calls methods on the serializer.
The real "data model" in Serde is a set of methods defined by the Serializer trait. Each data format (like JSON, Bincode, etc.) implements this trait. The generated Serialize impl for your type calls these methods in the right order, passing in the data.
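A heavily reduced, std-only sketch of this idea: one driver, two formats. All names here (MiniSerializer, MiniSerialize, JsonText, CompactBinary) are hypothetical. Serde's real Serializer trait has many more methods and uses associated types for its state and errors:

```rust
// Hypothetical miniature of the "data model as trait methods" idea.
trait MiniSerializer {
    fn begin_struct(&mut self, name: &'static str, len: usize);
    fn field_i32(&mut self, key: &'static str, value: i32);
    fn end_struct(&mut self);
}

trait MiniSerialize {
    fn serialize<S: MiniSerializer>(&self, serializer: &mut S);
}

struct Point {
    x: i32,
    y: i32,
}

// The driver: roughly what a derive macro would generate in this toy world.
impl MiniSerialize for Point {
    fn serialize<S: MiniSerializer>(&self, serializer: &mut S) {
        serializer.begin_struct("Point", 2);
        serializer.field_i32("x", self.x);
        serializer.field_i32("y", self.y);
        serializer.end_struct();
    }
}

// Format 1: JSON-like text.
#[derive(Default)]
struct JsonText {
    out: String,
    first: bool,
}

impl MiniSerializer for JsonText {
    fn begin_struct(&mut self, _name: &'static str, _len: usize) {
        self.out.push('{');
        self.first = true;
    }
    fn field_i32(&mut self, key: &'static str, value: i32) {
        if !self.first {
            self.out.push(',');
        }
        self.first = false;
        self.out.push_str(&format!("\"{}\":{}", key, value));
    }
    fn end_struct(&mut self) {
        self.out.push('}');
    }
}

// Format 2: compact binary; field names are dropped, values are little-endian.
#[derive(Default)]
struct CompactBinary {
    out: Vec<u8>,
}

impl MiniSerializer for CompactBinary {
    fn begin_struct(&mut self, _name: &'static str, _len: usize) {}
    fn field_i32(&mut self, _key: &'static str, value: i32) {
        self.out.extend_from_slice(&value.to_le_bytes());
    }
    fn end_struct(&mut self) {}
}

fn to_json<T: MiniSerialize>(value: &T) -> String {
    let mut ser = JsonText::default();
    value.serialize(&mut ser);
    ser.out
}

fn to_binary<T: MiniSerialize>(value: &T) -> Vec<u8> {
    let mut ser = CompactBinary::default();
    value.serialize(&mut ser);
    ser.out
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{}", to_json(&p));
    println!("{:?}", to_binary(&p));
}
```

The Point impl never mentions JSON or bytes. It only makes calls, and each format decides what those calls mean.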
For example, in serde_json, serialize_struct essentially just delegates to serialize_map (since JSON objects are maps). Simplified:
fn serialize_struct(self, name: &'static str, len: usize) -> Result<Self::SerializeStruct> {
    self.serialize_map(Some(len))
}
And serialize_map writes the opening { to the output, then returns a state object that handles writing keys and values.
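The "state object" pattern can be sketched like this. It is a std-only toy with hypothetical names (JsonSerializer, MapState, begin_map, entry), restricted to i32 values and unescaped keys. serde_json's real types differ in detail:

```rust
// Hypothetical serializer that accumulates JSON text in a String.
struct JsonSerializer {
    out: String,
}

// State object returned by `begin_map`. It borrows the serializer mutably and
// remembers whether a comma is needed before the next entry.
struct MapState<'a> {
    ser: &'a mut JsonSerializer,
    first: bool,
}

impl JsonSerializer {
    // Writes the opening brace, then hands control to the state object.
    fn begin_map(&mut self) -> MapState<'_> {
        self.out.push('{');
        MapState { ser: self, first: true }
    }
}

impl<'a> MapState<'a> {
    fn entry(&mut self, key: &str, value: i32) {
        if !self.first {
            self.ser.out.push(',');
        }
        self.first = false;
        self.ser.out.push_str(&format!("\"{}\":{}", key, value));
    }

    // Consumes the state, so no entry can be written after the closing brace.
    fn end(self) {
        self.ser.out.push('}');
    }
}

fn point_to_json(x: i32, y: i32) -> String {
    let mut ser = JsonSerializer { out: String::new() };
    let mut state = ser.begin_map();
    state.entry("x", x);
    state.entry("y", y);
    state.end();
    ser.out
}

fn main() {
    println!("{}", point_to_json(1, 2));
}
```

Taking self by value in end is a deliberate choice in this sketch: the compiler then rejects any attempt to keep writing entries into a map that has already been closed.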
Here’s a simplified flow of what happens when you serialize a struct:
sequenceDiagram
participant User as User Code
participant Derive as #[derive(Serialize)]
participant SerdeJson as serde_json::Serializer
User->>Derive: #[derive(Serialize)] on Point
User->>SerdeJson: serde_json::to_string(&point)
SerdeJson->>Derive: Point::serialize(&mut ser)
Derive->>SerdeJson: serialize_struct("Point", 2)
Derive->>SerdeJson: serialize_field("x", &self.x)
Derive->>SerdeJson: serialize_field("y", &self.y)
Derive->>SerdeJson: end()
Each call to serialize_field writes a key and value to the output, using the serializer’s methods. There’s no intermediate representation, just a series of method calls.
My original guess involved building an intermediate data structure, but Serde’s approach avoids this for performance. The derive macro generates code that walks your data structure and calls serializer methods directly, minimizing allocations and maximizing speed.
The "Serde data model" isn’t a struct or enum; it’s a set of trait methods. The derive macro generates the driver (the Serialize impl), and each data format provides the engine (the Serializer trait implementation).
Serde’s design is elegant and efficient. By generating code that directly drives the serializer, it avoids unnecessary allocations and enables support for a wide range of data formats. If you want to understand how Serde works for other formats or for deserialization, you can follow a similar process—trace the trait implementations and see how the pieces fit together.