Lazaro Fernandes Lima Suleiman

head of technology at @Hondana

Protobuf - An alternative approach to JSON or XML serialization in C#

In a recent project at Lambda3, we developed a data synchronization tool in C# that relies heavily on serialization and deserialization.

Optimizing those steps was one of the biggest challenges in improving overall performance, so we decided to use Protocol Buffers (protobuf) to handle them.

Protobuf

Protobuf is a language-neutral, extensible mechanism for serializing structured data, developed by Google as a smaller and faster alternative to JSON and XML.

How does it work?

Let’s discuss it in more detail.

First, you create a .proto file, as in the example below:

syntax = "proto3";

message Person {
    string Name = 1;
    int32 Id = 2;
    string Email = 3;

    enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
    }

    message PhoneNumber {
        string Number = 1;
        PhoneType Type = 2;
    }

    repeated PhoneNumber Phone = 4;
}

The structure itself is quite simple to create and understand, much like a C# class. In the example above it contains typed fields, each with a unique field number, a nested enum, a nested message, and a repeated field.

For more details, see the .proto Language Guide.

After creating the .proto file, the next step is to compile it into a file your application can use. In this case, protoc, the protobuf compiler, reads the definition and generates a C# class.

Where to download the protoc compiler?

At the time of writing, 3.6.1 is the latest stable version.

You can get it from several sources, including:

dotnet global tool - protoc

dotnet tool install --global protoc --version 3.6.1

Chocolatey - protoc

choco install protoc --version 3.6.1

If you prefer, download protoc.exe from the releases page of the official project repository.

Download the protoc-3.6.1-win32.zip file and extract bin/protoc.exe to a local folder. Don’t forget to add the protoc.exe path to the user’s PATH environment variable so that it is accessible from the terminal, just as it is with the installers above.

Generating C# classes from .proto files

Now, just open a terminal and run the compilation command:

protoc --proto_path=<protos_path> --csharp_out=<out_folder> name.proto

For example:

cd c:/repos/project

protoc --proto_path=protos/ --csharp_out=Sync/Models/ Product.proto

The generated file looks something like this sample gist.

This C# file should not be edited by hand. To change it, update the .proto file and regenerate it with protoc ;)

How to use generated C# classes

Add the generated class to your project. It depends on the Google.Protobuf package, so add a reference to that package as well.

Install-Package Google.Protobuf -Version 3.6.1

The serialization process creates a byte array, using its own encoding format.

// Requires: using System.IO; using Google.Protobuf;
// Types nested in the .proto message are generated under Person.Types.

// PhoneNumber and Person instances
var phoneNumber = new Person.Types.PhoneNumber {
    Number = "(11) 12345-6789",
    Type = Person.Types.PhoneType.Mobile
};
var person = new Person {
    Name = "Lazaro",
    Id = 123,
    Email = "email@email.com",
    Phone = { phoneNumber } // repeated fields are read-only collections
};

// Serialization process
File.WriteAllBytes("person123_file", person.ToByteArray());

// Deserialization process
var person_fromfile = new Person();
person_fromfile.MergeFrom(File.ReadAllBytes("person123_file"));

Here is an example of what the serialized output looks like for the following definition.

message Test1 {
    optional int32 a = 1;
}

// The .proto field "a" becomes the C# property "A"
var message = new Test1 {
    A = 150
};

File.WriteAllBytes("test1", message.ToByteArray());

The test1 file will contain only 3 bytes:

08 96 01
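To see where those three bytes come from, here is a minimal sketch that encodes the field by hand, without the Google.Protobuf library. The helper names EncodeVarint and EncodeInt32Field are hypothetical and exist only for this illustration. The sketch assumes a non-negative int32 value and is written with top-level statements (C# 9 or later). The first byte is the tag, (field number << 3) | wire type, and the remaining bytes are the value 150 as a base 128 varint.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Base 128 varint: 7 payload bits per byte, least significant group first.
// The high bit of a byte is set when more bytes follow.
static byte[] EncodeVarint(ulong value)
{
    var bytes = new List<byte>();
    while (value >= 0x80UL)
    {
        bytes.Add((byte)((value & 0x7FUL) | 0x80UL));
        value >>= 7;
    }
    bytes.Add((byte)value);
    return bytes.ToArray();
}

// Hypothetical helper: encodes a single int32 field as tag + value.
// Assumes a non-negative value; wire type 0 means "varint".
static byte[] EncodeInt32Field(int fieldNumber, int value)
{
    const int wireTypeVarint = 0;
    var output = new List<byte>();
    output.AddRange(EncodeVarint((ulong)((fieldNumber << 3) | wireTypeVarint)));
    output.AddRange(EncodeVarint((ulong)value));
    return output.ToArray();
}

byte[] encoded = EncodeInt32Field(1, 150);
Console.WriteLine(BitConverter.ToString(encoded));             // prints 08-96-01
Console.WriteLine(Encoding.UTF8.GetByteCount("{\"a\":150}")); // prints 9
```

The last line counts the bytes of the equivalent JSON text {"a":150}, which needs 9 bytes to carry the same information.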

Why not use only XML or JSON?

Both XML and JSON are excellent formats. They’re platform and language agnostic and, in most cases, easy to read and understand.

However, I have listed a few points below, also taken from the official documentation, so you can evaluate whether protobuf fits your case.

Text formats such as JSON and XML pass values around as strings that must be parsed, which can be costly in processing time and memory. Protobuf instead encodes only the content of each property, identified by the field number defined in the .proto file, and applies further optimizations such as base 128 varints and ZigZag encoding for signed integers.
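As an illustration of the signed integer optimization mentioned above, the sketch below shows ZigZag encoding, which protobuf uses for sint32 fields so that small negative numbers also fit in a short varint. The function names ZigZagEncode32 and ZigZagDecode32 are hypothetical and chosen for this example only. It is a sketch of the mapping itself, not of the library’s internals.

```csharp
using System;

// ZigZag interleaves negative and positive values so that small magnitudes
// map to small unsigned numbers: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
static uint ZigZagEncode32(int n) => (uint)((n << 1) ^ (n >> 31));

// Inverse mapping: recovers the original signed value.
static int ZigZagDecode32(uint z) => (int)(z >> 1) ^ -(int)(z & 1);

foreach (int n in new[] { 0, -1, 1, -2, int.MaxValue, int.MinValue })
{
    uint z = ZigZagEncode32(n);
    Console.WriteLine($"{n} -> {z} -> {ZigZagDecode32(z)}");
}
```

Without this mapping, a negative int32 is sign-extended and always takes ten bytes as a varint, while -1 encoded as a sint32 takes a single byte.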

JSON and protobuf comparison

I’ve created a simple project to compare protobuf and JSON serialization. You can download and run it locally.

The results showed that protobuf serialization performs considerably better than JSON serialization, even though both formats serialize to a byte array. See the chart with the compiled results.

[Chart: protobuf vs JSON]

I hope this was useful. If you have already used a protobuf implementation, leave a comment on the blog.

See you.
