Protocol Buffers - Google's data interchange format

What is protobuf?

Protocol Buffers are a language-neutral, platform-neutral, extensible mechanism for serializing structured data.

  • Basically, Protocol Buffers (often referred to as Protobuf) is a data format developed by Google used for serializing data, similar to XML or JSON.

  • You define how you want your data to be structured in a .proto file, and then the protocol buffer compiler generates code to interface with the data.

Replacement For JSON?

Protobuf can be used as an alternative to JSON in certain scenarios, but it's not a complete replacement. JSON is often preferred for its human-readable format and ease of use in web development, while Protobuf excels in performance and data compactness for high-throughput internal services where data size and processing speed are critical.

  • Protobuf is better suited for situations where performance is prioritized over human readability, making it a complementary option to JSON rather than a direct replacement.

  • JSON is ideal for web APIs and client-side data exchange because of its easy parsing by web browsers, while Protobuf is better for high-volume data exchange between backend services.
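To make the compactness point concrete, here is a minimal sketch that compares the JSON encoding of a small record with a hand-encoded Protobuf payload. It assumes a message with three fields (a string numbered 1, an int32 numbered 2, and a bool numbered 3). The helpers `encode_varint` and `encode_person` are hypothetical, written by hand for illustration; they are not what the Protobuf compiler generates, and they only handle non-negative integers.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_person(name, person_id, has_ponycopter):
    """Hand-encode the three fields in Protobuf wire format (illustration only)."""
    name_bytes = name.encode("utf-8")
    return (
        b"\x0a" + encode_varint(len(name_bytes)) + name_bytes  # field 1, length-delimited
        + b"\x10" + encode_varint(person_id)                    # field 2, varint
        + b"\x18" + encode_varint(int(has_ponycopter))          # field 3, varint
    )


record = {"name": "John Doe", "id": 123, "has_ponycopter": True}
json_payload = json.dumps(record, separators=(",", ":")).encode("utf-8")
proto_payload = encode_person(record["name"], record["id"], record["has_ponycopter"])

print(len(json_payload), "bytes as JSON")
print(len(proto_payload), "bytes as Protobuf")
```

For this record the sketch reports 50 bytes of compact JSON against 14 bytes of Protobuf. Most of the difference comes from JSON repeating the field names in every message, while Protobuf sends only small numeric tags.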

Okay, but how fast is Protobuf?

Credit: Arpit Bhayani's X profile

Special thanks to Arpit Bhayani, whose insightful performance testing results illuminated this article. The above test image was sourced from his X profile.

LinkedIn transitions from JSON to Protocol Buffers!

LinkedIn's decision to move from JSON to Protocol Buffers showed how impactful the right data format can be for efficiency, reducing their service latency by up to 60%. This real-world example highlights why organizations should carefully consider their data handling methods to achieve better performance as they scale.

Following LinkedIn's success, many firms are increasingly considering the switch from JSON to Protobuf. The advantages of Protobuf, such as reduced payload sizes and improved serialization efficiency, are compelling organizations to explore this transition.

If you're curious about the switch from JSON to Protocol Buffers, definitely check out how LinkedIn did it; the write-up breaks everything down nicely.

You can also check out this post on the same topic.

Working with Protobuf ✨

  1. Download and install the protocol buffer compiler.

  2. Protobuf supports multiple programming languages, allowing for efficient implementation in various language environments.

  3. Select the programming language of your choice and follow its language guide.
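Before moving on, it can help to confirm that the compiler from step 1 is actually installed. The snippet below is a small sketch of one way to check from Python; `find_protoc` is a hypothetical helper name, and it assumes the compiler is installed under its usual name `protoc` and is reachable on your PATH.

```python
import shutil
import subprocess


def find_protoc():
    """Return the output of `protoc --version`, or None if protoc is not on PATH."""
    path = shutil.which("protoc")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = find_protoc()
    print(version or "protoc not found; install the compiler first")
```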

Detailed steps for using Protobuf:
  1. Define Schema: Protocol Buffers (Protobuf) work by defining a structured schema for data using an interface description language.

    • Create a .proto file to define the structure of your data using the Protobuf schema definition language.

        syntax = "proto3";

        message Person {
          string name = 1;
          int32 id = 2;
          bool has_ponycopter = 3;
        }
      
  2. Compile Schema: The .proto file is compiled using the Protobuf compiler (`protoc`).

    • The compiler generates language-specific code (Java, Python, and so on) based on the defined schema. Example command for generating Python code:

        protoc --python_out=. your_proto_file.proto
      
  3. Generate Code: The generated code includes a class for each message type, as well as methods for serialization and deserialization. These methods are optimized for efficiency and convert data between its in-memory representation and the serialized binary format. For Python, the compile step writes a module named your_proto_file_pb2.py. Conceptually, the generated Person class behaves like the simplified sketch below; the real generated module looks different, but it exposes the same fields along with SerializeToString() and ParseFromString().

        # Conceptual sketch only, not the actual output of protoc.
        class Person:
            def __init__(self):
                self.name = ""
                self.id = 0
                self.has_ponycopter = False

            def SerializeToString(self):
                ...  # serialization logic, returns bytes

            def ParseFromString(self, data):
                ...  # deserialization logic, fills in the fields

  4. Serialize Data: Create instances of the generated classes, set values for the fields, and use the provided serialization methods to convert the data into a binary format.

        import your_proto_file_pb2  # module generated by protoc in step 2

        person = your_proto_file_pb2.Person()
        person.name = "John Doe"
        person.id = 123
        person.has_ponycopter = True
        serialized_data = person.SerializeToString()

  5. Deserialize data on the receiving side: The binary data can be transmitted over a network, stored in a file, or sent between systems.

    • On the receiving end, the binary data is deserialized using the generated code, reconstructing the original in-memory data structure.

        new_person = your_proto_file_pb2.Person()
        new_person.ParseFromString(serialized_data)
        print(new_person.name)  # Output: John Doe
        print(new_person.id)  # Output: 123
        print(new_person.has_ponycopter)  # Output: True
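To show roughly what the deserialization step does under the hood, here is a minimal hand-written decoder for the Person message. It is a sketch under stated assumptions, not the generated code: `read_varint`, `decode_person`, and the `PERSON_FIELDS` table are hypothetical names, the payload literal is the byte sequence the example message is expected to serialize to, and only the two wire types this schema needs (varint and length-delimited) are handled, with non-negative integers only.

```python
def read_varint(data, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


# Hypothetical lookup table mirroring the Person schema: field number -> name.
PERSON_FIELDS = {1: "name", 2: "id", 3: "has_ponycopter"}


def decode_person(data):
    """Walk the wire format and rebuild the Person fields (illustration only)."""
    person = {"name": "", "id": 0, "has_ponycopter": False}
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint: used here for int32 and bool
            value, pos = read_varint(data, pos)
        elif wire_type == 2:  # length-delimited: used here for string
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        name = PERSON_FIELDS.get(field_number)
        if name == "name":
            person["name"] = value.decode("utf-8")
        elif name == "id":
            person["id"] = value
        elif name == "has_ponycopter":
            person["has_ponycopter"] = bool(value)
        # Unknown field numbers are simply ignored in this sketch.
    return person


payload = b"\x0a\x08John Doe\x10\x7b\x18\x01"
print(decode_person(payload))
```

Each field on the wire starts with a tag that packs the field number from the .proto file together with a wire type. The decoder only needs the schema to turn those numbers back into names, which is why the payload itself never carries the field names.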
      

At a glance …

Protobuf is a binary format.
- It is not human-readable and is optimized for serialization and efficient data interchange.
- It's used for high-performance, low-latency API calls.

Thanks for reading ✨ I hope you found these insights valuable. If you learned something new today, don't forget to share this knowledge with others who might benefit from it.

Let's connect on Twitter and LinkedIn. Find all the other important links on Linktree.

Until next time! Goodbye


Written by

Anirudh Singh Bhadauria

Hello world! I am a software engineer. I integrate data engineering and software engineering to build software solutions that actually solve problems.