Deployment and Installation
This document describes the installation procedure for the E2E controller and configuring nodes for communication with the controller.
This section describes the installation and initial configuration of the full Terragraph cloud suite, including E2E services and the NMS backend. The Terragraph cloud suite is deployed as a Docker Swarm.
Docker Swarm Installation
Docker Swarm recommends at least 3 (Docker) hosts for redundancy. If redundancy is not required, the cloud suite can be run on a single host. To support a network composed of roughly 512 sectors, each Docker host must meet the following specifications.
- Ubuntu 18.04
- 4 vCPU
- 16GB of RAM
- 200GB of disk space
- Globally addressable IPv6 and private (or global) IPv4
- A unique and static hostname for each Docker node
Below is a suggested filesystem partitioning scheme for the Docker hosts. By
default, all of the Terragraph-specific data is stored in
|130GB||Storage for all Terragraph data|
|50GB||Storage for all Docker data|
Terragraph comes with an installer that deploys and configures the Terragraph cloud suite. The installer is a PEX file which packages together Ansible, a Python CLI, and all of their dependencies into a single executable. An installation host that has SSH access to all the Docker hosts is necessary to run the installer. The installation host can be one of the Docker hosts.
For up-to-date installation instructions, see the Terragraph NMS repository.
- Download the installer (nms) on the installation host.
$ wget https://github.com/terragraph/tgnms/releases/latest/download/nms
- Make the installer executable.
$ chmod +x nms
- Install Python 3.8 (python3.8) on the installation host, and optionally install ssh-pass if password-based SSH is used to access the Docker hosts.
$ sudo apt-get install python3-distutils-extra
$ sudo apt-get install python3.8
- Create the config file.
$ ./nms show-defaults > config.yml
Open and edit config.yml. The config options are documented within the config file. Replace the "NMS Web Login" credentials and change ansible_user to be the username on the installation host. If identity and access management is desired (recommended), follow the steps in the section Identity and Access Management Configuration.
Run the installer.
tg-docker-host-XX are placeholders for the IP addresses or hostnames of the Docker hosts. All Docker hosts must be provided to the install command. Include the private key and SSL certificate file for the Nginx server as arguments and set ext_nms_hostname to configure the location of the certs in the Docker path.
$ ./nms install -f config.yml -k <ssl-key-file> -C <ssl-cert-file> \
-h tg-docker-host-01 [-h tg-docker-host-02] [-h tg-docker-host-03]
- Open a web browser and navigate to the IP address of any of the Docker hosts
to load the NMS UI. Select the "Network Config" menu, then select the
"Controller" tab and change the configuration fields listed in the table
below. Be sure to change <network> to the name of the network (under
nms installer config file).
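When the list of Docker hosts is long or generated by other tooling, the installer invocation from the steps above can be assembled programmatically. The sketch below is illustrative only: it uses the flags shown in the install command (-f, -k, -C, -h), while the function name and the key and certificate file names are hypothetical.

```python
from typing import List


def build_install_command(
    config_file: str,
    ssl_key_file: str,
    ssl_cert_file: str,
    docker_hosts: List[str],
    installer: str = "./nms",
) -> List[str]:
    """Return the argument list for the installer's install command."""
    if not docker_hosts:
        raise ValueError("at least one Docker host is required")
    argv = [
        installer, "install",
        "-f", config_file,
        "-k", ssl_key_file,
        "-C", ssl_cert_file,
    ]
    # Every Docker host must be passed, each with its own -h flag.
    for host in docker_hosts:
        argv += ["-h", host]
    return argv


# Hypothetical file names; the host names are the placeholders used above.
command = build_install_command(
    "config.yml",
    "nginx.key",
    "nginx.crt",
    ["tg-docker-host-01", "tg-docker-host-02", "tg-docker-host-03"],
)
print(" ".join(command))
```

Returning a list rather than a single string lets the result be passed directly to subprocess.run without shell quoting concerns.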
Deploying Additional Networks
To add additional networks to a Docker Swarm deployment (i.e. networks managed
by the same NMS instance), edit the
config.yml file and add new entries to
controllers_list, then re-run the
install command above. Note that this will
cause some downtime to running services, including those for existing networks.
Legacy Terragraph deployments make use of systemd for running the E2E stack. The NMS stack can no longer be installed with this method.
The installation steps for services contained within the E2E software image are described below.
The following specifications support a network with roughly 512 sectors.
- CentOS 7 or Ubuntu 17+
- 2 vCPU
- 12GB of RAM
- 40GB of disk space
The cloud services use the following ports by default to communicate with each other and the Terragraph nodes:
|Yes||Used by libtorrent to seed upgrade images.|
|Yes||Opentracker serves the torrent tracker on this port.|
|Yes||Management communication port between E2E controller and nodes.|
|Yes||NMS aggregator listens on this port for stats from nodes.|
|Yes||API service runs on this port.|
|Yes||Used by primary and backup controller to send and receive HA messages.|
|No||Used for communication between various E2E controller apps.|
|No||Used for streaming from the controller to API service, currently disabled.|
|No||Used for communication between various NMS aggregator apps.|
|No||Used by controller to publish stats to cloud-based stats agent.|
|No||Used by controller to publish stats to cloud-based stats agent.|
The external column indicates whether or not that port must be externally
accessible. Note that
18100 may have to be externally accessible
if API service is not in the same network.
Have the following information available before installing the E2E controller:
- An IPv6 network prefix that will be used to address the Terragraph nodes. This prefix will need to be reachable from the E2E controller when it is installed.
- A route (possibly via a default) from a PoP node to the E2E controller machine(s). This can be static or learned from BGP.
- A copy of the Terragraph E2E image (x86).
Note: The following installation steps have been tested on Ubuntu 17.10. Some commands and paths may differ for CentOS.
- Copy the Terragraph E2E image to the server and extract it to /opt/rootfs (or another directory).
$ scp e2e-image-tgx86.tar.gz my-e2e-controller01:/tmp/
$ mkdir /opt/rootfs
$ tar xvzf e2e-image-tgx86.tar.gz -C /opt/rootfs
- Create the environment file in /etc/default/tg_services. See /etc/tg_systemd_config/README.md for additional details.
$ cat /etc/default/tg_services
# Relative to DATA_DIR
# Additional CLI flags
- Create the external "data" directory (DATA_DIR in the environment file).
$ mkdir /root/data
- Install and enable the systemd service scripts.
$ cp /opt/rootfs/etc/tg_systemd_config/*.service /lib/systemd/system/
$ systemctl daemon-reload
# E2E controller
$ systemctl enable e2e_controller
$ systemctl enable opentracker
# NMS aggregator
$ systemctl enable nms_aggregator
# API Service
$ systemctl enable api_service
# Stats agent
$ systemctl enable stats_agent
- Create an empty topology file (E2E_TOPOLOGY_FILE in the environment file) within the "data" directory.
$ touch /root/data/e2e-topology.conf
- Copy the default E2E controller configuration file into the actual file (E2E_CONFIG_FILE in the environment file). If needed, change statsAgentParams.endpointParams.kafkaParams.config.brokerEndpointList to the list of Kafka broker URLs. See /etc/e2e_config/controller_config_metadata.json for additional details.
$ cp /opt/rootfs/etc/e2e_config/controller_config_default.json \
- Copy the default NMS aggregator configuration file into the actual file (NMS_CONFIG_FILE in the environment file). If needed, change dataEndpoints.nms.host to the URL prefix of the desired data endpoint. See /etc/stats_config/aggregator_config_metadata.json for additional details.
$ cp /opt/rootfs/etc/stats_config/aggregator_config_default.json \
The E2E controller can be installed in High Availability (HA) mode, where a backup E2E controller runs passively on a separate machine and takes over if the primary controller fails. HA mode is strongly recommended, as Terragraph heavily relies on the E2E controller for normal network operation.
To enable HA mode, follow the installation steps above on both machines, with the following changes:
- In the E2E controller configuration file (E2E_CONFIG_FILE), additional flags must be set. bstar_primary should be "true" on the primary controller and "false" on the backup controller.
"bstar_peer_host": "<hostname or IPv6 address of the other controller>",
- In the environment file (/etc/default/tg_services), set a different value for AGENT_MAC on the backup controller, as shown below.
- Do not enable the nms_aggregator systemd service on the backup machine.
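The two HA flags differ between the primary and backup controllers only in their values, so they can be generated from one helper. This is a minimal sketch: the flag names bstar_primary and bstar_peer_host come from the steps above, but the helper name and the example hostnames are hypothetical, and where these flags sit inside the controller configuration file is not shown here.

```python
import json
from typing import Dict


def ha_flags(is_primary: bool, peer_host: str) -> Dict[str, str]:
    """Build the HA flags for one controller of a primary/backup pair."""
    if not peer_host:
        raise ValueError("peer_host must not be empty")
    return {
        # The flag value is the string "true" or "false", as in the text above.
        "bstar_primary": "true" if is_primary else "false",
        # Hostname or IPv6 address of the other controller.
        "bstar_peer_host": peer_host,
    }


# Hypothetical hostnames: each controller points at its peer.
primary_flags = ha_flags(True, "e2e-backup.example.com")
backup_flags = ha_flags(False, "e2e-primary.example.com")
print(json.dumps(primary_flags, indent=2))
print(json.dumps(backup_flags, indent=2))
```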
Post-installation configuration steps for the Terragraph NMS are described below.
NMS parameters can be configured by editing the file
/opt/terragraph/gfs/nms/env/nms_custom.env. All available parameters are
described in the table below.
|integer||When NMS makes an external API request, the request will be canceled if the response takes longer than |
|string||Commit date for the NMS build. Use this to show the user the release date of the NMS. This is normally set automatically. This can be in any format as it's merely displayed in the UI.|
|string||Commit hash for the NMS build. Use this to show the user the exact source control commit that this build was generated from. This is normally set automatically. This can be in any format as it's merely displayed in the UI.|
|string||The URL sent to the E2E controller to download node images uploaded via the UI. The format is |
|string||Send external HTTP requests to the following URL. Note that this environment variable is lowercase due to the hosting environment.|
|string||URL to a bug tracker for logging issues with the NMS.|
|enum (debug, info, warning, error)||NMS logs information to the console with varying degrees of verbosity. Higher verbosity log levels also show messages from less verbose levels. The most verbose is "debug", the least verbose is "error". |
|enum (development, production)||Which mode to run the NMS process in. Development mode will reload the server when files change and enable logging to the console. Production mode will generate a static build and will log JSON to standard out. |
|boolean||Enables the unified alarm configuration UI. |
|string||URL of the Alertmanager configuration service. |
|string||URL of the Prometheus Alertmanager. |
|string||URL of the Prometheus configuration service. |
|string||Hostname of the Terragraph event alarms service. |
|string||Base URL of the NMS. This should be the user-visible URL to access the NMS homepage (i.e. what a user would see in their browser address bar).|
|string||Keycloak client ID to use. This must be retrieved from Keycloak if set manually.|
|string||Keycloak client secret to use. This must be retrieved from Keycloak if set manually.|
|string||URL of the Keycloak server to authenticate with. Full URLs are allowed here for cases where Keycloak is not served under the root hostname.|
|string||Send all Keycloak-related HTTP requests through this proxy. There are occasions where most services are hosted within an internal network behind a proxy, but the Keycloak authorization server may be on the public internet. In this case, Keycloak related requests must be sent through a different proxy than the default HTTP proxy. If this variable is not set, this falls back to http_proxy. If neither is set, this is n/a.|
|string||Keycloak realm to authenticate the user with. This must be retrieved from Keycloak if set manually.|
|boolean||Enables authentication for NMS. Authentication is handled by integrating with Keycloak. |
|integer||How long a user's login cookie will last. |
|boolean||Enables Single Sign-On (SSO) functionality through Keycloak. |
|string||Access token for the Mapbox account.|
|comma separated strings||A comma separated list of tile names and URLs to display in the UI in the format |
|MySQL (or MariaDB)|
|string||The MySQL database to connect to. |
|string||The hostname or IP address of the MySQL server to use. |
|string||The MySQL password to connect to the database with.|
|integer||The TCP port of the MySQL server to use. |
|string||The MySQL user to connect to the database as. |
|string||ID of the Terragraph software portal API token. This will be generated when creating the API token. |
|string||API token for the Terragraph software portal.|
|string||URL to the central Terragraph software portal. This is where official Terragraph releases are stored. Setting this variable will enable the software portal feature. |
|string||Absolute or relative URL to the Grafana UI. This is used to link to certain Grafana dashboards. By default Grafana is hosted under CLIENT_ROOT_URL/grafana. |
|string||URL of the Prometheus service. |
|integer||Allowed delay in stats pipeline when showing availability. This is used for stats derived from the Kafka stats stream. |
|comma separated strings||Comma separated list of Kafka hosts to pull node events from (see |
|boolean||Enables the service availability display in the overview panel. Service availability represents the link's ability to carry real traffic at layer 4. |
|boolean||Enables the notification menu for streaming events from Kafka in realtime. |
|boolean||Enables the default routes history feature. |
|string||Hostname of the Terragraph Network Test service.|
Self-Hosting Map Tiles Using OpenStreetMap
If fetching remote map data in the NMS UI is not possible, the OpenStreetMap tiles can be generated locally and hosted in a Docker tile server container.
- Follow the OpenMapTiles quick-start instructions.
$ git clone https://github.com/openmaptiles/openmaptiles.git
$ cd openmaptiles
- Edit the .env file and change
14. Then run the quickstart.sh script to generate tiles. Generating tiles for a specific region is possible by adding the region name as the first parameter.
$ ./quickstart.sh <REGION NAME>
$ make start-tileserver
- The tile server will now be running on port 8080 via docker-compose.
- Specify the TILE_STYLE parameter by following the Setup Options for the UI.
Identity and Access Management Configuration
Access management is handled by Keycloak, an open source Identity and Access Management (IAM) system. Keycloak is capable of integrating with many identity providers such as Facebook and Google, as well as storing users inside its own database.
Installing and Enabling Keycloak
The following variables must be set in config.yml:
nms_username - The username of the default NMS administrator.
nms_password - The password of the default NMS administrator.
ext_nms_hostname - Must be set to the externally visible NMS hostname (whatever a user would type into their browser's address bar). An IP address is also acceptable.
keycloak_root_user - The username of the default Keycloak administrator.
keycloak_root_password - The password of the default Keycloak administrator.
keycloak_db_user - The username of the Keycloak MySQL user.
keycloak_db_password - The password of the Keycloak MySQL user.
It is recommended to change
keycloak_db_password from their defaults.
Logging into the NMS for the first time
After successfully running the Ansible installer, visit the URL set in
ext_nms_hostname in a web browser. The user will be shown a login page.
Log in using the nms_username and nms_password variables from
config.yml as the username and password.
User access to Terragraph NMS may be configured by following the instructions
in the section below.
Configuring User Access to Terragraph NMS
User authentication may be handled directly by Keycloak or by an external Identity Provider such as Facebook or Google. This guide will detail how to add a new user to Keycloak directly. Please consult the Keycloak documentation for instructions on configuring an external Identity Provider.
First, the user must sign in to the Keycloak administration dashboard by
/auth in a browser.
Click on the "Administration Console" link. Enter the username and password used
keycloak_root_password variables in
Once logged in to the Keycloak administration dashboard, select the "TGNMS" realm from the dropdown at the upper-left of the screen. Click the "Users" link on the left sidebar. To view the users in the "TGNMS" realm, search for a user using the search bar or click the "View all users" button.
To give a new user access to Terragraph NMS, click the "Add user" button at the upper-right corner of the users table. Input a username for the user. It is also recommended to set a valid email address so the user may reset their own password using Keycloak. Optionally, select some "Required User Actions" for the user to perform when they sign in. For example, the administrator may require a user to set a new password and verify their email.
Click the "Save" button to create the user. The newly created user must now be given authorization to access the Terragraph NMS. User authorization is configured by assigning Keycloak Roles to a user. Click on the "Role Mappings" tab. To assign roles to a user, select the desired role in the "Available Roles" menu and click the "Add selected" button. The user's new roles will take effect after their next sign-in.
Note that the "tg_all_read" and "tg_all_write" roles are called "Composite Roles". These are roles which contain multiple other roles. For example, if a user needs full read-access to the NMS, assign them the "tg_all_read" role and all corresponding read roles will be assigned to them. This can be demonstrated by assigning either composite role to a user and then looking at the "Effective Roles" menu on the right side of the "Role Mappings" screen.
Authorization and Roles
Roles are used to protect E2E API endpoints, as well as hide UI components in the Terragraph NMS. Descriptions of each role can be viewed by selecting the "Roles" link on the left sidebar. The roles required to access each E2E API endpoint are listed in the E2E API Documentation.
A role's name is broken up into two main parts: category and level. For example,
the tg_ignition_read role has a category of "ignition" and a level of "read".
If a user has the
tg_ignition_write role, they are also able to access APIs
in the same category that require the "read" level (write access implies
read access). A user with the composite role
tg_all_write has full
read-write access to all endpoints and UI components.
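The role rules above (category plus level, write implies read, and tg_all_* covering every category) can be expressed as a small check. This is a sketch of the stated rules only, not the NMS or Keycloak implementation. It assumes role names follow tg_<category>_<level> with "read" and "write" as the only levels, and the function names are hypothetical.

```python
from typing import Iterable, Tuple

# Assumed ordering of levels: write access implies read access.
LEVELS = {"read": 1, "write": 2}


def parse_role(role: str) -> Tuple[str, str]:
    """Split a role such as 'tg_ignition_read' into (category, level)."""
    if not role.startswith("tg_"):
        raise ValueError(f"not a Terragraph role: {role!r}")
    category, sep, level = role[len("tg_"):].rpartition("_")
    if not sep or not category or level not in LEVELS:
        raise ValueError(f"unexpected role format: {role!r}")
    return category, level


def has_access(user_roles: Iterable[str], required_role: str) -> bool:
    """Return True if any of the user's roles satisfies the required role."""
    required_category, required_level = parse_role(required_role)
    for role in user_roles:
        try:
            category, level = parse_role(role)
        except ValueError:
            continue  # ignore roles that do not follow the tg_ naming scheme
        same_scope = category in (required_category, "all")
        if same_scope and LEVELS[level] >= LEVELS[required_level]:
            return True
    return False


print(has_access(["tg_ignition_write"], "tg_ignition_read"))  # True
print(has_access(["tg_all_read"], "tg_ignition_write"))       # False
print(has_access(["tg_all_write"], "tg_ignition_read"))       # True
```

In practice Keycloak expands composite roles itself, so the "Effective Roles" list already contains the individual roles; the "all" shortcut here only mirrors that behavior for illustration.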
Some important node deployment steps are highlighted in the sections below.
On Puma hardware, the MAC address and serial number of each node are encoded in a QR code on the back of the device. This is parsed as follows:
SN:<serial number>:M60:<MAC address>
Serial number => TG100P05241700057
MAC address => 74:6F:F7:CA:1A:76
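When many nodes are scanned during deployment, the QR payload can be split into its two fields in code. The sketch below follows the SN:<serial number>:M60:<MAC address> layout given above. Whether the MAC address is encoded with or without colon separators is an assumption, so the parser accepts both forms; the sample payload is constructed from the example values and is hypothetical.

```python
import re
from typing import NamedTuple


class NodeLabel(NamedTuple):
    serial_number: str
    mac_address: str


# Layout from the text above: SN:<serial number>:M60:<MAC address>
_QR_PATTERN = re.compile(r"^SN:(?P<serial>[^:]+):M60:(?P<mac>.+)$")


def parse_qr(payload: str) -> NodeLabel:
    """Extract the serial number and MAC address from a QR payload."""
    match = _QR_PATTERN.match(payload.strip())
    if match is None:
        raise ValueError(f"unrecognized QR payload: {payload!r}")
    # Accept the MAC with or without separators, then normalize it.
    digits = match["mac"].replace(":", "").replace("-", "")
    if re.fullmatch(r"[0-9A-Fa-f]{12}", digits) is None:
        raise ValueError(f"invalid MAC address in payload: {payload!r}")
    mac = ":".join(digits[i:i + 2] for i in range(0, 12, 2)).upper()
    return NodeLabel(match["serial"], mac)


label = parse_qr("SN:TG100P05241700057:M60:746FF7CA1A76")
print(label.serial_number)  # TG100P05241700057
print(label.mac_address)    # 74:6F:F7:CA:1A:76
```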
First PoP Configuration
The configuration of Terragraph nodes is handled automatically by the E2E service. However, the first PoP node must be configured manually since connectivity to the E2E controller has not yet been established.
Add the PoP site and node to the network topology (refer to Monitoring and Alerting for details).
Create the PoP node configuration (refer to PoP Node Config for details).
Generate the full PoP node configuration JSON using one of these methods:
Via NMS: Select the Node Config icon from the left-hand toolbar, click on the "Node" tab on the top bar, then select the PoP node in the left-hand column. Click Show Full Configuration on the bottom-left, make sure the correct versions are selected, then click Copy.
Via TG CLI: Run the following command (be sure to substitute the correct version details):
$ tg config get node -n pop-node-name full \
--version RELEASE_M61 \
--hardware NXP_LS1048A_PUMA \
Deploy the configuration to the PoP node using one of these methods:
- Via SSH (requires a signed SSH key): Log into the node and write the
full configuration JSON (obtained above) to the path
- Via web portal (must be enabled through
envParams.WEBUI_ENABLED): On hardware that supports Wi-Fi for administrative access, connect to the node's Wi-Fi network and submit the full configuration JSON (obtained above) using the web portal.
Reboot the PoP node. Connectivity to the E2E controller should be established when the node comes back up.
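Because the first PoP node is configured by hand, a copy-and-paste error in the full configuration JSON is easy to miss. The sketch below checks that the copied text parses as a JSON object before writing it to a file. The function name is hypothetical, the destination path is left as a parameter because it depends on the node, and the sample value for envParams.WEBUI_ENABLED is an assumption.

```python
import json
from pathlib import Path


def write_node_config(config_json: str, dest: Path) -> dict:
    """Validate the full node configuration JSON and write it to dest."""
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    config = json.loads(config_json)
    if not isinstance(config, dict):
        raise ValueError("node configuration must be a JSON object")
    dest.write_text(json.dumps(config, indent=2, sort_keys=True) + "\n")
    return config


# Hypothetical usage with a tiny stand-in for the full configuration.
sample = '{"envParams": {"WEBUI_ENABLED": "1"}}'
written = write_node_config(sample, Path("pop_node_config.json"))
print(sorted(written))
```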
When deploying a test network with poor GPS visibility (e.g. indoors), nodes
must be configured to disable the GPS synchronization check during link
ignition. If this is applicable, add a network override for the
radioParamsBase.fwParams.forceGpsDisable config to
Maintenance and Configuration).
SSH access to Terragraph nodes is controlled through SSH certificate authorities (CAs). The CAs that Terragraph nodes trust are managed via node configuration (see Maintenance and Configuration). These CAs can be used to sign user SSH keys.
- To generate a CA key, use ssh-keygen. It is recommended to use the ed25519 key type when generating CA and user keys.
$ ssh-keygen -f ~/.ssh/tg-CA -t ed25519
- Configure nodes to trust the newly generated CA by adding the contents of
$ cat ~/.ssh/tg-CA.pub
- Optionally, disable the default Terragraph CA on the nodes
/etc/ssh/tg-CA.bak) by setting
Stats, events, and other data can be pushed from nodes to a Kafka broker. To
enable this, add a network override for the
to the list of Kafka broker URLs.
When running in a Docker environment, please note that the node configuration must use port 9096 (by default), which will differ from the controller configuration's port 9092 (by default).
The NTP servers that Terragraph nodes use for system clock synchronization are
managed via node configuration (see
Maintenance and Configuration). By default,
nodes will connect to
time.facebook.com. To use custom NTP servers instead, set
a network override for the
sysParams.ntpServers config. Note that each server
must be reachable over IPv6 (unless NAT64 or IPv4 is supported).
The timezone on Terragraph nodes is set using
/etc/localtime. This can point
at any valid tz database entry. By default, Terragraph nodes use the
timezone. To change this for all nodes, set a network override for the
envParams.TIMEZONE config (see
Maintenance and Configuration).
The DNS servers that Terragraph nodes use for DNS resolution are managed via
node configuration (see
Maintenance and Configuration). By default,
nodes will use Google's public DNS servers (
2001:4860:4860::8844). To use additional DNS servers, add their IPv6 addresses
WPA-Enterprise (802.1X) Configuration
Terragraph supports 802.1X for link authentication for enterprise-grade
security. This requires a RADIUS server to be available and reachable from the
nodes, and all Terragraph nodes must contain signed certificates. To enable
802.1X, set a network override for the config keys
eapolParams, as shown below:
By default, 802.1X is disabled, and links are authenticated using WPA-PSK.
The wireless channel can be manually configured for each node or assigned
across the entire network. For automatic assignment, only channel 2 is enabled
by default but the set of enabled channels can be configured via the
topologyParams.enabledChannels controller configuration field.
Disabling Linux Serial Console
The Linux serial console (/dev/ttyS0) is enabled on nodes by default. To
disable it (e.g. for security reasons), set the
envParams.SERIAL_CONSOLE_DISABLE config to
1 (see Maintenance and
Configuration). Please be aware this may
make it impossible to recover a node.
DHCP, PD, and DS-Lite
Please note these features are only supported on the Rev5 platform.
Dual-Stack Lite (or DS-Lite) allows IPv4 to work natively in an IPv6 access network. Terragraph supports DS-Lite as a solution for dual-stack deployments. DS-Lite involves two network services: AFTR (Address Family Transition Router) and B4 (Basic Bridging BroadBand). The IPv6 address of the AFTR is required for setup. The CPE will run the B4 function as long as it supports DS-Lite.
DS-Lite requires DHCP prefix delegation. Terragraph Rev5 nodes run Kea as a DHCP server with all the required configuration options.
The figure below shows an example Terragraph network with DS-Lite.
The following section describes the node configuration options for setting up DHCP prefix delegation and DS-Lite.
Nodes with CPEs (DHCP)
The table below lists the configuration options in
|Interface for DHCP (usually same as CPE interface)|
|Length of prefix to be delegated to CPEs|
|Network pool from which prefixes will be delegated (must be larger than |
|DHCP lease preferred lifetime; once this expires, the lease should not be used for new communications|
|DHCP renew timer (T1), usually set to 50% of the preferred lease time|
|DHCP rebind timer (T2), usually set to 87.5% of the preferred lease time|
|Length of time that an address remains in the valid state, during which the address can be used for new or existing communications; when the valid-lifetime expires, the address becomes invalid and can no longer be used|
|Start address for regular leases|
|Max address for regular leases|
|Boolean flag to enable or disable Kea (enabling Kea disables isc-dhcpd)|
|Map of DHCP options, with names as the keys and data as the values (refer to the DHCPv6 options list)|
|If enabled, splits prefix down to /64 before pushing to hardware (please only enable on Marvell hardware)|
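The renew (T1) and rebind (T2) timers in the table are described as fractions of the preferred lease time, so they can be derived rather than chosen independently. The following sketch computes them using the 50% and 87.5% ratios stated above. The function and key names are hypothetical and do not correspond to actual configuration keys; the check that the preferred lifetime does not exceed the valid lifetime reflects standard DHCPv6 behavior.

```python
from typing import Dict


def dhcp_timers(preferred_lifetime: int, valid_lifetime: int) -> Dict[str, int]:
    """Derive T1 and T2 (in seconds) from the preferred lease lifetime."""
    if preferred_lifetime <= 0:
        raise ValueError("preferred_lifetime must be positive")
    if preferred_lifetime > valid_lifetime:
        raise ValueError("preferred_lifetime must not exceed valid_lifetime")
    return {
        # T1: 50% of the preferred lifetime.
        "renew_timer": preferred_lifetime // 2,
        # T2: 87.5% (7/8) of the preferred lifetime.
        "rebind_timer": preferred_lifetime * 7 // 8,
    }


print(dhcp_timers(3600, 7200))  # {'renew_timer': 1800, 'rebind_timer': 3150}
```

With a preferred lifetime of one hour, clients would attempt renewal after 30 minutes and rebinding after 52.5 minutes.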
dhcpPdPool is an optional parameter. If it is not set, Kea will try
to split the address allocated to the node for delegation, if possible.
DS-Lite auto-configuration will not work without prefix delegation. To disable
prefix delegation entirely, leave both
unset. Please note that the DHCP server is only supported in the Rev5 platform.
On the newer VPP-based platforms, CPE provisioning with DHCPv6 is enabled via a
PoP Nodes (BGP)
The table below lists the configuration options in
|Entire prefix used for prefix delegation in a deployment|
|Comma-separated list of specific prefixes desired to be routed from nearby CPEs (defined in |
Terragraph Network Security
It is essential to implement proper access control list (ACL) rules to keep the
Terragraph network infrastructure secure. Running Terragraph without any ACL
rules upstream may expose security vulnerabilities. As such, it is critical that
the e2e-network-prefix used by the nodes be protected from the
Internet by the application of ACL rules upstream.
Under regular operation, no connection needs to be initiated from the Internet towards a Terragraph node.
A Terragraph node exposed to inbound connections from the Internet is open to vulnerabilities such as:
- SSH access by unauthorized users (by default, trusts the CA in
- NTP amplification attacks
- Route hijacking via BGP (if node is a PoP)
Terragraph nodes may require the following outbound connections (depending on their configuration):
- DNS -> Google DNS6
- NTP ->
- E2E minion -> E2E controller on TCP port 7007
- Fluent Bit -> configured Fluentd endpoint (TCP)
- Stats agent -> configured Kafka endpoint (TCP)
- Stats agent -> NMS aggregator on TCP port 8002 (off by default)
- HTTPS -> https://graph.facebook.com (for logging stats, off by default)
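The ACL policy described in this section reduces to one rule: new connections initiated from outside the node prefix towards an address inside it are dropped, while connections initiated by the nodes are allowed. The sketch below models that rule for illustration only. It is not a firewall configuration, the function name is hypothetical, and the prefix and addresses are taken from the IPv6 documentation range rather than from a real deployment.

```python
import ipaddress


def allow_new_connection(src: str, dst: str, protected_prefix: str) -> bool:
    """Decide whether a newly initiated connection should be allowed.

    protected_prefix stands in for the e2e-network-prefix used by the nodes.
    """
    network = ipaddress.ip_network(protected_prefix)
    src_inside = ipaddress.ip_address(src) in network
    dst_inside = ipaddress.ip_address(dst) in network
    # Drop connections initiated from outside towards the protected prefix.
    if dst_inside and not src_inside:
        return False
    return True


PREFIX = "2001:db8:1::/48"  # hypothetical e2e-network-prefix
# Inbound from outside the prefix to a node: denied.
print(allow_new_connection("2001:db8:ffff::1", "2001:db8:1::10", PREFIX))  # False
# Outbound from a node (e.g. E2E minion to controller): allowed.
print(allow_new_connection("2001:db8:1::10", "2001:db8:ffff::1", PREFIX))  # True
```

A real deployment would typically add explicit exceptions for trusted management networks and rely on stateful filtering so that replies to node-initiated connections are still accepted.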