Bootstrap

【杂谈】protobuf详解

概述

我们在日常开发过程中进行网络通信和数据交换等应用场景中经常使用的技术是json或xml,最近接触了Google的Protobuf。

在查阅相关资料学习 ProtoBuf 以及研读其源码之后,发现其在效率、兼容性等方面非常出色。在以后的项目技术选型中,尤其是网络通信、通用数据交换等场景应该会优先选择 ProtoBuf。

下面详细的看下protobuf相关的内容

Protobuf简介

protobuf (protocol buffer) 是谷歌内部的混合语言数据标准。通过将结构化的数据进行序列化,用于通讯协议、数据存储等领域和语言无关、平台无关、可扩展的序列化结构数据格式。

官方网站:https://developers.google.cn/protocol-buffers
推荐:https://zhuanlan.zhihu.com/p/141415216

protobuf的几个特点

多语言,多平台。自定义源文件,存储类
高效。二进制数据交互格式,可以安装编译器protoc.proto编译成各种文件 变成可以使用的类
扩展性好,兼容性好。可以在源文件更新数据结构

Protobuf使用

工作流程
在介绍如何使用protobuf之前,我们先看看它的工作流程,
Linux技术宅 出处bilibili
对于序列化协议来说,使用方只需要关注业务对象本身,即 idl 定义,序列化和反序列化的代码只需要通过工具生成即可

创建 .proto文件

定义数据结构,参考官方的实例

syntax = "proto2";

package tutorial;

message Person {
  optional string name = 1;
  optional int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    optional string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phones = 4;
}

message AddressBook {
  repeated Person people = 1;
}

这里边涉及到一些protobuf的一些语法规则,其中message是定义一个消息类型的关键字,相当于在一些语言中定义类class。每个message都会生成一个名字与之对应的类,该类公开继承自google::protobuf::Message。具体数据定义的语法规则,我们后面详细展开介绍(proto中2和3有些语法的区别)。

编译proto文件

这里涉及到protoc编译器的下载和安装

下载地址:https://github.com/protocolbuffers/protobuf/releases

# window
https://blog.csdn.net/weixin_43440680/article/details/122178381

# linux
tar -xzf protobuf-2.1.0.tar.gz
cd protobuf-2.1.0
./configure --prefix=$INSTALL_DIR
make
make check
make install

安装完成之后,可以使用protoc命令对.proto文件进行编译,生成对应的文件

protoc -I=$src_dir --python_out=$dst_dir xxx.proto
  • $src_dir 所在的源目录
  • –python_out 生成python代码
  • $dst_dir 生成代码的目录
  • xxx.proto 针对那个proto文件生成接口
class Person(message.Message):
  __metaclass__ = reflection.GeneratedProtocolMessageType

  class PhoneNumber(message.Message):
    __metaclass__ = reflection.GeneratedProtocolMessageType
    DESCRIPTOR = _PERSON_PHONENUMBER
  DESCRIPTOR = _PERSON

class AddressBook(message.Message):
  __metaclass__ = reflection.GeneratedProtocolMessageType
  DESCRIPTOR = _ADDRESSBOOK

编写writer和reader

Writing A Message

#! /usr/bin/python

import addressbook_pb2
import sys

# This function fills in a Person message based on user input.
def PromptForAddress(person):
  person.id = int(raw_input("Enter person ID number: "))
  person.name = raw_input("Enter name: ")

  email = raw_input("Enter email address (blank for none): ")
  if email != "":
    person.email = email

  while True:
    number = raw_input("Enter a phone number (or leave blank to finish): ")
    if number == "":
      break

    phone_number = person.phones.add()
    phone_number.number = number

    type = raw_input("Is this a mobile, home, or work phone? ")
    if type == "mobile":
      phone_number.type = addressbook_pb2.Person.PhoneType.MOBILE
    elif type == "home":
      phone_number.type = addressbook_pb2.Person.PhoneType.HOME
    elif type == "work":
      phone_number.type = addressbook_pb2.Person.PhoneType.WORK
    else:
      print "Unknown phone type; leaving as default value."

# Main procedure:  Reads the entire address book from a file,
#   adds one person based on user input, then writes it back out to the same
#   file.
if len(sys.argv) != 2:
  print "Usage:", sys.argv[0], "ADDRESS_BOOK_FILE"
  sys.exit(-1)

address_book = addressbook_pb2.AddressBook()

# Read the existing address book.
try:
  f = open(sys.argv[1], "rb")
  address_book.ParseFromString(f.read())
  f.close()
except IOError:
  print sys.argv[1] + ": Could not open file.  Creating a new one."

# Add an address.
PromptForAddress(address_book.people.add())

# Write the new address book back to disk.
f = open(sys.argv[1], "wb")
f.write(address_book.SerializeToString())
f.close()

Reading A Message

#! /usr/bin/python

import addressbook_pb2
import sys

# Iterates though all people in the AddressBook and prints info about them.
def ListPeople(address_book):
  for person in address_book.people:
    print "Person ID:", person.id
    print "  Name:", person.name
    if person.HasField('email'):
      print "  E-mail address:", person.email

    for phone_number in person.phones:
      if phone_number.type == addressbook_pb2.Person.PhoneType.MOBILE:
        print "  Mobile phone #: ",
      elif phone_number.type == addressbook_pb2.Person.PhoneType.HOME:
        print "  Home phone #: ",
      elif phone_number.type == addressbook_pb2.Person.PhoneType.WORK:
        print "  Work phone #: ",
      print phone_number.number

# Main procedure:  Reads the entire address book from a file and prints all
#   the information inside.
if len(sys.argv) != 2:
  print "Usage:", sys.argv[0], "ADDRESS_BOOK_FILE"
  sys.exit(-1)

address_book = addressbook_pb2.AddressBook()

# Read the existing address book.
f = open(sys.argv[1], "rb")
address_book.ParseFromString(f.read())
f.close()

ListPeople(address_book)

Protobuf总结

性能方面

  • 数据压缩小
  • 序列化速度快
  • 传输速度快

使用方面

  • 使用简单:proto编译器自动进行序列化和反序列化
  • 维护成本低:多平台只需要维护一套对象协议文件,即.proto文件
  • 可扩展性好:不必破坏旧的数据格式,就能对数据结构进行更新
  • 加密性好:http传输内容抓包只能抓到字节数据

使用范围

  • 跨平台、跨语言、可扩展性强
;