Communication Protocol

This document describes the transport and serialization protocols used across the Terragraph E2E stack.

Transport Layer

About ZeroMQ

Terragraph uses ZeroMQ (or ZMQ) for all inter- and intra-process message passing at the application layer. ZMQ offers a protocol-agnostic abstraction layer for socket communication, as well as a framework for building lock-free concurrent applications. E2E specifically uses fbzmq, a C++ wrapper over libzmq, which provides some helpful abstractions: an async framework with event loops (fbzmq::ZmqEventLoop) and timeouts (fbzmq::ZmqTimeout), along with methods to easily send and receive Thrift objects over sockets (fbzmq::Socket).

Sockets are the core abstraction of ZMQ, and act as an asynchronous message queue rather than a synchronous interface. From a programming perspective, this means that "sending" a message only enqueues it, with no indication as to whether or when the message was actually delivered or dropped. Additionally, ZMQ defines various socket types which enable different messaging patterns (e.g. request-reply, publish-subscribe). Unlike conventional sockets, ZMQ sockets allow many-to-many connections depending on the socket type.

Specific behaviors of ZMQ sockets can be configured via a lengthy list of socket options. An important option is the high watermark, which is a hard limit on the number of messages queued in memory; once the limit is reached, further messages are either dropped or cause the send to block, depending on the socket type. This value is configured separately for outbound (ZMQ_SNDHWM) and inbound (ZMQ_RCVHWM) messages. ZMQ does not guarantee that a socket will accept the full configured count: the actual limit may be as much as 60-70% lower, depending on the flow of messages on the socket.
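The enqueue-only semantics and the high watermark can be illustrated without ZMQ itself. The following is a minimal sketch that uses Python's standard queue.Queue as a stand-in for a socket's outbound queue; the class name BoundedSendQueue and its parameters are hypothetical, and unlike ZMQ the limit here is exact.

```python
import queue


class BoundedSendQueue:
    """Toy stand-in for a socket's outbound queue with a high watermark."""

    def __init__(self, hwm, drop_when_full):
        self._queue = queue.Queue(maxsize=hwm)  # hwm plays the role of ZMQ_SNDHWM
        self._drop_when_full = drop_when_full
        self.dropped = 0

    def send(self, message):
        """Enqueue a message. True means "queued", not "delivered"."""
        if not self._drop_when_full:
            self._queue.put(message)  # blocks while the queue is full
            return True
        try:
            self._queue.put_nowait(message)
            return True
        except queue.Full:
            self.dropped += 1  # the sender gets no other notification
            return False

    def pending(self):
        return self._queue.qsize()
```

With hwm=2 and drop_when_full=True, the third consecutive send() returns False and the message is discarded; in blocking mode the same call would wait until a consumer drains the queue.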

Transport Architecture

E2E heavily uses the ZMQ request-reply pattern, exposing router sockets (ZMQ_ROUTER) externally and using dealer sockets (ZMQ_DEALER) internally for intra-process communication. Both socket types are bidirectional. Router sockets accept multiple client connections from dealer sockets, each with a unique socket identity (the ZMQ_IDENTITY socket option). When sending on a router socket, the first message part must contain the destination's identity; the router socket strips this part and uses it to route the rest of the message to the matching client. When receiving, the router socket prepends the sender's identity as the first message part.
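The identity handling of a router socket can be sketched with a small in-memory model. This is an illustration only, with no networking involved: ToyRouter and its methods are hypothetical names, and the peers are plain Python lists rather than dealer sockets.

```python
class ToyRouter:
    """Mimics how a ROUTER socket treats identity frames (no networking)."""

    def __init__(self):
        self._peers = {}  # identity -> list acting as that peer's inbox

    def connect(self, identity):
        """Register a peer under a unique identity and return its inbox."""
        if identity in self._peers:
            # Toy simplification: refuse duplicates outright.
            raise ValueError(f"identity already in use: {identity!r}")
        inbox = []
        self._peers[identity] = inbox
        return inbox

    def recv_from(self, sender_identity, frames):
        """What the router's owner reads when a peer sends `frames`."""
        return [sender_identity] + list(frames)

    def send(self, frames):
        """Strip frames[0] and deliver the rest to the peer it names."""
        destination, *rest = frames
        inbox = self._peers.get(destination)
        if inbox is None:
            return False  # ZMQ drops unroutable messages silently by default
        inbox.append(rest)
        return True
```

A peer registered as b"app" that sends [b"hello"] is seen by the router's owner as [b"app", b"hello"], and a reply sent as [b"app", b"reply"] arrives at the peer as [b"reply"].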

The controller exposes two external router sockets: an "app socket" and "minion socket". These sockets are contained within the controller's Broker, which runs as a separate app (i.e. fbzmq::ZmqEventLoop thread). Every other controller app has its own dealer socket, contained within the CtrlApp base class, that connects to the broker's app socket. The app socket also accepts connections from external clients, such as TG CLI and API service.

Each minion connects to the controller on a dealer socket in the minion's Broker. In addition, the broker has a local router socket, which is analogous to the controller's app socket: all other minion apps connect to the local router socket via dealer sockets in the MinionApp base class. The router socket also accepts connections from other local clients, which must connect using a ZMQ identity with the prefix :FWD: (kAppSockForwardPrefix) to receive replies; otherwise, the replies are routed to the controller by default. The minion also exposes a publish socket (ZMQ_PUB) on which it broadcasts certain periodic or asynchronous messages (e.g. heartbeats, link status) to other clients.
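One way to read the forwarding rule above is as a routing decision keyed on the destination identity. The sketch below is an interpretation, not the broker's actual code: pick_route and the example identities are hypothetical, and only the :FWD: prefix comes from the text.

```python
APP_SOCK_FORWARD_PREFIX = ":FWD:"  # value of kAppSockForwardPrefix per the text


def pick_route(destination_id, local_app_ids):
    """Return "local" or "controller" for a message leaving the minion broker.

    Assumption: identities of local minion apps are known to the broker, and
    anything else without the forward prefix defaults to the controller.
    """
    if destination_id in local_app_ids:
        return "local"
    if destination_id.startswith(APP_SOCK_FORWARD_PREFIX):
        return "local"
    return "controller"
```

Under this reading, a local client that connects as ":FWD:cli-1234" receives its replies directly, while the same client connecting as "cli-1234" would see its replies sent upstream to the controller.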

All socket identities of apps are fixed strings (defined in E2EConsts). The socket identity of each minion is its MAC address. External clients connect using arbitrary, non-conflicting identity strings; these are typically randomized for sending one-off requests.

Implementation Details

ZMQ socket details are generally abstracted away from E2E apps through the controller/minion base classes and brokers. When an app's dealer socket receives a message, it simply passes the message up to a processMessage() virtual function for the app to handle. This function will never be called concurrently.

The controller's app and minion sockets expect multi-part messages, with all non-final message parts flagged with ZMQ_SNDMORE. The first part is the destination's identity, as required by router sockets. The second part is the sender's app, and the third is the actual message contents. All message contents are Thrift structures serialized using the Thrift compact protocol.
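The three-part framing described above can be sketched as a list of (frame, more) pairs, where the boolean mirrors whether ZMQ_SNDMORE would be set on that part. The function name build_frames is hypothetical, and the payload is assumed to be bytes that were already compact-serialized elsewhere.

```python
def build_frames(destination_id, sender_app_id, payload):
    """Return (frame, more) pairs for one multi-part message.

    `more` is True for every part except the last, mirroring ZMQ_SNDMORE.
    """
    parts = [
        destination_id.encode(),  # part 1: destination identity
        sender_app_id.encode(),   # part 2: sender's app
        payload,                  # part 3: serialized message contents
    ]
    last = len(parts) - 1
    return [(part, index != last) for index, part in enumerate(parts)]
```

Keeping the flag next to each frame makes it hard to forget ZMQ_SNDMORE on a non-final part, which would otherwise split one logical message into several.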

The minion finds the controller's minion socket address (i.e. IP and port) by reading the e2e-ctrl-url key in the Open/R KvStore, a process described in other documents. The minion's broker automatically disconnects from and reconnects to the controller upon a URL change or a timeout.

The controller uses the ZeroMQ Authentication Protocol (ZAP), but currently only for debug purposes to log and associate peer IP addresses with their socket connections. If enabled, the controller will spawn a thread to receive and respond to authentication requests via a ZMQ_REP socket on inproc://zeromq.zap.01. The app and/or minion sockets will be marked with an arbitrary, non-empty ZMQ_ZAP_DOMAIN socket option, causing ZMQ to forward connection details to the ZAP handler. The handler simply echoes received peer IPs into the metadata in its response as the Ip-Address property. This metadata becomes associated with the ZMQ socket, and can be queried whenever messages arrive.
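The echo behavior of such a handler can be sketched from the ZAP frame layout alone. This is not the controller's implementation: build_zap_reply and encode_zap_metadata are hypothetical helpers that follow the request and reply frame order of the ZAP specification (ZeroMQ RFC 27) and accept every connection.

```python
import struct


def encode_zap_metadata(properties):
    """Encode properties in the ZMTP metadata format used by ZAP replies.

    Each property is: 1-byte name length, name, 4-byte big-endian value
    length, value.
    """
    out = bytearray()
    for name, value in properties.items():
        name_bytes = name.encode("ascii")
        if not 0 < len(name_bytes) <= 255:
            raise ValueError("property name must be 1 to 255 bytes")
        out += struct.pack("!B", len(name_bytes)) + name_bytes
        out += struct.pack("!I", len(value)) + value
    return bytes(out)


def build_zap_reply(request_frames):
    """Accept the request and echo the peer address as Ip-Address.

    Request frames: version, request id, domain, address, identity,
    mechanism, then any credentials.
    Reply frames: version, request id, status code, status text, user id,
    metadata.
    """
    version, request_id, _domain, address = request_frames[:4]
    metadata = encode_zap_metadata({"Ip-Address": address})
    return [version, request_id, b"200", b"OK", b"", metadata]
```

A real handler would read the request frames from the ZMQ_REP socket bound to inproc://zeromq.zap.01 and send the reply frames back on the same socket.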

Each app in both the controller and minion will bump a unique "socketMonitor" counter once per minute (by default) to indicate that its dealer socket is healthy and the thread itself is alive. These stats are published via the local ZmqMonitor instance (refer to Stats, Events, Logs for further details).

Global Objects

Apart from the intra-process communication architecture, E2E uses a small number of globally-shared objects across apps. These objects, defined in SharedObjects, are accessible only by acquiring read-write locks through the folly::Synchronized abstraction.
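The idea of data that can only be reached while holding its lock can be sketched as follows. This is a loose analogue rather than a port of folly::Synchronized: the Synchronized class here is hypothetical and uses a single mutex, so it does not distinguish readers from writers as a read-write lock would.

```python
import threading
from contextlib import contextmanager


class Synchronized:
    """Wraps a value so callers can only reach it while holding a lock."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # plain mutex, not a reader-writer lock

    @contextmanager
    def locked(self):
        """Yield the wrapped value for the duration of the with-block."""
        with self._lock:
            yield self._value
```

For example, `with shared.locked() as data: data["nodes"] += 1` updates the shared value, and the lock is released when the block exits, even on an exception.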

Serialization Layer

About Thrift

Terragraph E2E serializes all messages using Thrift, specifically the fbthrift branch. Thrift includes an interface definition language (IDL) with a cross-language code generator, as well as a serialization framework for the generated structures. Terragraph does not use Thrift's RPC framework; it uses ZMQ for transport instead.

Thrift Interfaces

Thrift structures are defined within *.thrift files, located inside various if/ directories. All Thrift files used in E2E are listed below.

File | Description
Controller.thrift | Core structures used by the controller
Aggregator.thrift | Core structures used by the aggregator
Topology.thrift | Topology structures
NodeConfig.thrift | Node configuration structures
FwOptParams.thrift | Firmware-specific node configuration structures
Event.thrift | Event structures
PassThru.thrift | Firmware pass-through message structures
DriverMessage.thrift | Driver message structures
BWAllocation.thrift | Bandwidth and airtime allocation structures

Implementation Details

Terragraph exclusively uses the Thrift compact protocol for messages transported over ZMQ. When writing Thrift structures to disk, the JSON serializer is used instead.

For consistency at the ZMQ layer, the outermost Thrift structure for transport is always thrift::Message (shown below). This structure must include a message type and compact-serialized binary value, which the receiver can then deserialize into another Thrift structure.

struct Message {
  1: MessageType mType;
  2: binary value;
  3: optional bool compressed;
  4: optional CompressionFormat compressionFormat;
}

Optionally, messages can be compressed using any supported format. The receiver should first decompress the binary value before deserializing it. Currently, only a handful of message types are compressed, and only the Snappy format is supported.
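The wrap and unwrap order described above (serialize, optionally compress, then decompress before deserializing) can be sketched with standard-library stand-ins. Everything here is an assumption for illustration: JSON replaces the Thrift compact protocol, zlib replaces Snappy, and the names wrap, unwrap, and "ZLIB_STAND_IN" are hypothetical.

```python
import json
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Python mirror of the envelope fields; not generated Thrift code."""
    m_type: str
    value: bytes
    compressed: Optional[bool] = None
    compression_format: Optional[str] = None


def wrap(m_type, body, compress=False):
    """Serialize `body`, optionally compress it, and place it in an envelope."""
    raw = json.dumps(body, sort_keys=True).encode()  # stand-in for Thrift compact
    if compress:
        # zlib is a stand-in for Snappy, which is not in the standard library.
        return Message(m_type, zlib.compress(raw), True, "ZLIB_STAND_IN")
    return Message(m_type, raw)


def unwrap(message):
    """Decompress first if flagged, then deserialize the inner structure."""
    raw = message.value
    if message.compressed:
        if message.compression_format != "ZLIB_STAND_IN":
            raise ValueError(f"unsupported format: {message.compression_format}")
        raw = zlib.decompress(raw)
    return json.loads(raw)
```

Because the compression fields are optional, an envelope built without them is treated as uncompressed, so unwrap() handles both kinds of message.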
