Show HN: Apache Fory Rust – 10-20x faster serialization than JSON/Protobuf

fory.apache.org

58 points by chaokunyang 6 hours ago

Serialization framework with some interesting numbers: 10-20x faster on nested objects than JSON/Protobuf.

  Technical approach: compile-time codegen (no reflection), compact binary protocol with meta-packing, little-endian layout optimized for modern CPUs.

  Unique features that other fast serializers don't have:
  - Cross-language without IDL files (Rust ↔ Python/Java/Go)
  - Trait object serialization (Box<dyn Trait>)
  - Automatic circular reference handling
  - Schema evolution without coordination

  Happy to discuss design trade-offs.

  Benchmarks: https://fory.apache.org/docs/benchmarks/rust
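
For readers unfamiliar with the "automatic circular reference handling" item above, here is a minimal sketch of the general reference-tracking technique, assuming a toy linked structure. It is not Fory's wire format or API: `Node`, `Token`, `encode` and `two_node_cycle` are hypothetical names made up for this illustration. The idea is to remember each object's address the first time it is written and emit a short back-reference when the same object is reached again, so a cycle terminates instead of recursing forever.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

type Link = Rc<RefCell<Node>>;

// Hypothetical node type; `next` may point back to an earlier node.
struct Node {
    value: i32,
    next: Option<Link>,
}

// Hypothetical output tokens standing in for bytes on the wire.
#[derive(Debug, PartialEq)]
enum Token {
    Object { id: u32, value: i32 },
    BackRef(u32),
    Null,
}

// Walk the chain, assigning an id to each node on first visit and
// emitting BackRef(id) when a node is seen a second time.
fn encode(start: &Link) -> Vec<Token> {
    let mut seen: HashMap<*const RefCell<Node>, u32> = HashMap::new();
    let mut out = Vec::new();
    let mut current = Some(Rc::clone(start));
    while let Some(node) = current {
        let ptr = Rc::as_ptr(&node);
        if let Some(&id) = seen.get(&ptr) {
            out.push(Token::BackRef(id));
            return out;
        }
        let id = seen.len() as u32;
        seen.insert(ptr, id);
        out.push(Token::Object { id, value: node.borrow().value });
        let next = node.borrow().next.clone();
        current = next;
    }
    out.push(Token::Null);
    out
}

// Build a -> b -> a. The Rc cycle leaks, which is acceptable for a demo.
fn two_node_cycle() -> Link {
    let a = Rc::new(RefCell::new(Node { value: 1, next: None }));
    let b = Rc::new(RefCell::new(Node { value: 2, next: Some(Rc::clone(&a)) }));
    a.borrow_mut().next = Some(b);
    a
}

fn main() {
    println!("{:?}", encode(&two_node_cycle()));
}
```

A decoder would do the reverse: keep a table of objects by id and resolve each `BackRef` against it.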

tnorgaard 3 hours ago

I wish we would focus on making tooling better for W3C EXI (Binary XML encoding) instead of inventing new formats. Just being fast isn't enough; I don't see many people using Aeron/SBE. It needs an ecosystem, which XML does have.

stmw 3 hours ago

I am not sure if W3C EXI, or ASN.1 BER or something else is better, but agree that using DOP (rather than OOP) design principles is the right answer -- which means focusing on the encoding first, and working backwards towards the languages / clients.

stmw 4 hours ago

Regarding design tradeoffs: I am very skeptical that this can be made to work for the long run in a cross-language way without formalizing the on-the-wire contract via IDL or similar.

In my experience, while starting from a language to arrive at the serialization often feels more ergonomic (e.g. RPC style) in the start, it hides too much of what's going on from the users and over time suffers greatly from programming language / runtime changes - the latter multiplied by the number of languages or frameworks supported.

mlhamel 4 hours ago

I'm wondering how you share your types between languages if there's no schema?

kenhwang 4 hours ago

Looks like there's a type mapping chart for supported types: https://fory.apache.org/docs/docs/guide/xlang_type_mapping
Otherwise, the schema seems to be derived from the class being serialized for typed languages, or otherwise annotated in code. The serializer and deserializer code must be manually written to be compatible instead of both sides being codegen'd to match from a schema file. Here's the example I found for Python: https://fory.apache.org/docs/docs/guide/python_serialization...

athorax 4 hours ago

I am confused on this as well, they list polyglot teams[0] as their top use case and consider not needing schema files a feature
[0] https://fory.apache.org/blog/2025/10/29/fory_rust_versatile_...

stmw 3 hours ago

I am skeptical that it's possible to make this work in the long run.

fabiensanglard 4 hours ago

Not explaining this case makes me wonder how much this lib is actually used in production. This was also the first question I asked myself.

no_circuit 3 hours ago

Are the benchmarks actually fair? See:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

It seems if the serialization object is not a "Fory" struct, then it is forced to go through to/from conversion as part of the measured serialization work:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

The to/from type of work includes cloning Strings:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

reallocating growing arrays with collect:

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

I'd think that the to/from conversion for Fory types shouldn't be part of the tests.

Also, when used in an actual system, tonic would be providing an 8KB buffer to write into, not just a Vec::default() that may need to be resized multiple times:

https://github.com/hyperium/tonic/blob/147c94cd661c0015af2e5...
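
To make the buffer point concrete, here is a rough sketch using only the standard library. It is not code from tonic or from the Fory benchmarks: `write_counting_growth` is a hypothetical helper, and the 8 KiB size is simply the figure from the comment above. It counts how often a `Vec<u8>` has to grow while a payload is appended in small chunks, once starting from an empty vector and once from a pre-sized one.

```rust
// Hypothetical helper: append `payload` in 64-byte chunks and count how
// many times the vector's capacity changed (each change is a reallocation).
fn write_counting_growth(buf: &mut Vec<u8>, payload: &[u8]) -> usize {
    let mut growths = 0;
    for chunk in payload.chunks(64) {
        let before = buf.capacity();
        buf.extend_from_slice(chunk);
        if buf.capacity() != before {
            growths += 1;
        }
    }
    growths
}

fn main() {
    let payload = vec![0u8; 8000];

    // Starts with no allocation and grows as data arrives.
    let mut empty: Vec<u8> = Vec::default();
    let grown = write_counting_growth(&mut empty, &payload);

    // Pre-sized to 8 KiB, so an 8000-byte payload fits without growing.
    let mut presized: Vec<u8> = Vec::with_capacity(8 * 1024);
    let stable = write_counting_growth(&mut presized, &payload);

    println!("Vec::default(): {grown} growths, with_capacity(8192): {stable} growths");
}
```

The exact number of growths for the empty vector depends on the standard library's growth strategy, so only the pre-sized case has a guaranteed count (zero).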

wiseowise 3 hours ago

Still mad they had to change the name. "Fury" was a really fitting name for a fast serialization framework, "fory" is just bogus. Should've renamed it to "foray" or something.

shinypenguin 4 hours ago

Benchmark link gives me 404, but I found this link that seems to show the proper benchmarks:

https://fory.apache.org/docs/docs/introduction/benchmark

no_circuit 3 hours ago

Is 4096 types enough for everyone?

https://github.com/apache/fory/blob/fd1d53bd0fbbc5e0ce6d53ef...

nitwit005 3 hours ago

These binary protocols generally also try to keep the data size small. Protobuf is essentially compressing its integers (varint or zigzag encoding), for example.
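
For anyone who hasn't seen the varint and zigzag encodings mentioned above, here is a small standalone sketch of the scheme as Protobuf documents it. It is an illustration, not Protobuf's or Fory's actual code, and the function names are made up. A varint stores 7 bits per byte with the high bit as a "more bytes follow" flag, and zigzag maps signed values to unsigned ones so that small negative numbers stay short.

```rust
// Map signed to unsigned so small magnitudes stay small:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
fn zigzag_encode(n: i64) -> u64 {
    ((n << 1) ^ (n >> 63)) as u64
}

fn zigzag_decode(z: u64) -> i64 {
    ((z >> 1) as i64) ^ -((z & 1) as i64)
}

// Append `value` as a base-128 varint: 7 payload bits per byte,
// high bit set on every byte except the last.
fn write_varint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

// Read one varint from the front of `input`.
// Returns the value and the number of bytes consumed.
fn read_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None // truncated input or more than 10 bytes
}

fn main() {
    let mut plain = Vec::new();
    write_varint(-3i64 as u64, &mut plain); // two's complement: all 64 bits used
    let mut zigzag = Vec::new();
    write_varint(zigzag_encode(-3), &mut zigzag);
    println!("-3 as plain varint: {} bytes, zigzag varint: {} bytes", plain.len(), zigzag.len());
}
```

This is why encoded size depends on the values and not only on the schema: a fixed-width `i64` always costs 8 bytes, while a zigzag varint costs 1 byte for -3 and up to 10 for very large magnitudes.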

It'd be helpful to see a plot of serialization costs vs data size. If you only display serialization TPS, you're always going to lose to the "do nothing" option of just writing your C structs directly to the wire, which is essentially zero cost.

stmw 3 hours ago

It appears there are two schema compatibility modes and no guarantee of minor version binary compatibility.

lsb 3 hours ago

Curious about comparisons with Apache Arrow, which uses flatbuffers to avoid memory copying during deserialization, which is well supported by the Pandas ecosystem, and which allows users to serialize arrays as lists of numbers that have hardware support from a GPU (int8-64, float)

dietr1ch 4 hours ago

I get a 404 on https://fory.apache.org/docs/benchmarks/rust

You can browse https://fory.apache.org/docs/, but I didn't find any benchmarks directory

Brian_K_White 4 hours ago

Guessing one of these
https://fory.apache.org/docs/docs/introduction/benchmark
https://fory.apache.org/docs/docs/guide/rust_serialization

jasonjmcghee 4 hours ago

Would love to see how it compares to Flatbuffers - was surprised to not see it in the benchmarks!

jasonjmcghee 2 hours ago

Maybe I'm missing it, but they mention Flatbuffers a lot here, then don't show benchmarks:
https://fory.apache.org/blog/fury_blazing_fast_multiple_lang...
But flatbuffers is _much_ faster than protobuf/json:
https://flatbuffers.dev/benchmarks/

paddy_m 4 hours ago

What's the story for JS? I see that there is a javascript directory, but it only mentions nodejs. I don't see an npm package. So does this work in web browsers?

binary132 35 minutes ago

The prevalence of AI slop in the landing page doc does not inspire confidence.

paddy_m 4 hours ago

How does this deal with numeric types like NaN, Infinity...?

OptionOfT 2 hours ago

    use fory::{Fory, ForyObject};

    #[derive(ForyObject, Debug, PartialEq)]
    struct Struct {
        nan: f32,
        inf: f32,
    }

    fn main() {
        let mut fory = Fory::default();
        fory.register::<Struct>(1).unwrap();

        let original = Struct {
            nan: f32::NAN,
            inf: f32::INFINITY,
        };
        dbg!(&original);

        let serialized = fory.serialize(&original).unwrap();

        let back: Struct = fory.deserialize(&serialized).unwrap();
        dbg!(&back);
    }

Yields

     cargo run
       Compiling rust-seed v0.0.0-development (/home/random-code/fory-nan-inf)
        Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.28s
         Running `target/debug/fory-nan-inf`
    [src/main.rs:17:9] &original = Struct {
        nan: NaN,
        inf: inf,
    }
    [src/main.rs:22:9] &back = Struct {
        nan: NaN,
        inf: inf,
    }

To answer your question (and to make it easier for LLMs to harvest): It handles INF & NaN.
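
One caveat if you turn the snippet above into a test: the struct derives `PartialEq`, but under IEEE 754 NaN is not equal to itself, so `assert_eq!(original, back)` would fail even after a perfect round trip. A minimal sketch of a bitwise comparison follows, using only the standard library. The `encode`/`decode` pair is a hypothetical stand-in built on `to_le_bytes`, not a call into Fory, and the thread does not show whether Fory preserves the exact NaN bit pattern.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
struct Sample {
    nan: f32,
    inf: f32,
}

// Hypothetical stand-in for a serializer: two little-endian f32 values.
fn encode(s: &Sample) -> [u8; 8] {
    let mut out = [0u8; 8];
    out[..4].copy_from_slice(&s.nan.to_le_bytes());
    out[4..].copy_from_slice(&s.inf.to_le_bytes());
    out
}

fn decode(bytes: [u8; 8]) -> Sample {
    Sample {
        nan: f32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]),
        inf: f32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]),
    }
}

// Compare raw bit patterns, which treats identical NaNs as equal.
fn bitwise_eq(a: &Sample, b: &Sample) -> bool {
    a.nan.to_bits() == b.nan.to_bits() && a.inf.to_bits() == b.inf.to_bits()
}

fn main() {
    let original = Sample { nan: f32::NAN, inf: f32::INFINITY };
    let back = decode(encode(&original));
    println!("PartialEq says equal: {}", original == back);
    println!("bitwise says equal:   {}", bitwise_eq(&original, &back));
}
```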

seg_lol 4 hours ago

Why this over serialization free formats like CapnProto and Flatbuffers? If you want it to be compact, send it through zstd (with a custom dictionary).

I do really like that it has broad support out of the box and looks easy to use.

For Python I still prefer using dill since it handles code objects.

https://github.com/uqfoundation/dill

fritzo 5 hours ago

link is 404 for me