Protocol buffers: the early reviews are in
Google (my current employer) has finally open sourced protocol buffers, the data interchange format we use for internal server-to-server communication. The blogosphere’s response? “No wireless. Less space than a Nomad. Lame.”
Aaaaanyway…
Protocol buffers are “just” cross-platform data structures. All you have to write is the schema (a .proto file), then generate bindings in C++, Java, or Python. (Or Haskell. Or Perl.) The .proto file is just a schema; it doesn’t contain any data except default values. All getting and setting is done in code. The serialized over-the-wire format is designed to minimize network traffic, and deserialization (especially in C++) is designed to maximize performance. I can’t begin to describe how much effort Google spends maximizing performance at every level. We would tear down our data centers and rewire them with $500 ethernet cables if you could prove that it would reduce latency by 1%.
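To give you the flavor, here’s a toy example of my own (not from Google’s docs, just something I made up):

    // person.proto -- a made-up schema, purely for illustration
    message Person {
      optional string name  = 1;
      optional int32  id    = 2;
      repeated string email = 3;
    }

Feed that to the compiler (protoc --python_out=. person.proto) and you get a generated person_pb2 module; all the getting, setting, and serializing happens in the generated code:

    import person_pb2

    p = person_pb2.Person()
    p.name = "Mark"
    p.id = 1
    p.email.append("mark@example.com")

    wire = p.SerializeToString()    # compact binary, ready for the wire

    q = person_pb2.Person()
    q.ParseFromString(wire)         # round-trips back to the same values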
Besides being blindingly fast, protocol buffers have lots of neat features. A zero-size PB returns default values. You can nest PBs inside each other. And most importantly, PBs are both backward and forward compatible, which means you can upgrade servers gradually and they can still talk to each other in the interim. (When you have as many machines as Google has, it’s always the interim somewhere.)
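To make that concrete with my made-up Person schema from above: a newer server can add a field under a fresh tag number, and an older binary that has never heard of tag 4 just skips those bytes on the wire instead of choking.

    // person.proto, as a newer server might extend it
    message Person {
      optional string name  = 1;
      optional int32  id    = 2;
      repeated string email = 3;
      optional string phone = 4;   // new field, new tag; old readers skip it
    }

And the zero-size trick falls out of the defaults:

    p = person_pb2.Person()
    print(p.id)                          # 0 -- unset fields read back as defaults
    print(len(p.SerializeToString()))    # 0 -- an all-default message is zero bytes on the wire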
Comparisons to other data formats were, I suppose, inevitable. Old-timers may remember ASN.1 or IIOP. Kids these days seem to compare everything to XML or JSON. Protocol buffers are actually closer to Facebook’s Thrift (written by ex-Googlers) or SQL Server’s TDS. They won’t kill XML (no matter how much you wish they would), nor will they replace JSON, ASN.1, or carrier pigeon. But they’re simple and they’re fast and they scale like crazy, and that’s the way Google likes it.