akvorado/web/data/docs/02-configuration.md
2022-03-31 20:56:14 +02:00

Configuration

Akvorado can be configured through a YAML file. You can get the default configuration with ./akvorado --dump --check. Durations can be written in seconds or using strings like 10h20m.

It is also possible to override configuration settings using environment variables. Remove any - from key names, use _ to separate nesting levels, and add AKVORADO_ as a prefix. As the example below shows, names are written in uppercase and list items are separated by commas. For example, consider the following configuration file:

kafka:
  topic: test-topic
  topic-configuration:
    num-partitions: 1
  brokers:
    - 192.0.2.1:9092
    - 192.0.2.2:9092

It can be translated to:

AKVORADO_KAFKA_TOPIC=test-topic
AKVORADO_KAFKA_TOPICCONFIGURATION_NUMPARTITIONS=1
AKVORADO_KAFKA_BROKERS=192.0.2.1:9092,192.0.2.2:9092

Flow

The flow component handles incoming flows. It only accepts the inputs key to define the list of inputs to receive incoming flows.

Each input has a type and a decoder. For decoder, only netflow is currently supported. As for the type, both udp and file are supported.

For the UDP input, the supported keys are listen to set the listening endpoint, workers to set the number of workers to listen to the socket, receive-buffer to set the size of the kernel's incoming buffer for each listening socket, and queue-size to define the number of messages to buffer inside each worker. For example:

flow:
  inputs:
    - type: udp
      decoder: netflow
      listen: 0.0.0.0:2055
      workers: 3
  workers: 2
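
The example above leaves out two of the keys described for the UDP input. A sketch that sets them as well might look like the following. The values are illustrative assumptions rather than recommendations, and the unit of receive-buffer is assumed here to be bytes.

```yaml
flow:
  inputs:
    - type: udp
      decoder: netflow
      listen: 0.0.0.0:2055
      workers: 3
      # Assumed value: kernel receive buffer for each listening socket (bytes assumed)
      receive-buffer: 10485760
      # Assumed value: number of messages buffered inside each worker
      queue-size: 1000
```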

The file input should only be used for testing. It supports a paths key to define the files to read from. These files are injected continuously into the pipeline. For example:

flow:
  inputs:
    - type: file
      decoder: netflow
      paths:
       - /tmp/flow1.raw
       - /tmp/flow2.raw
  workers: 2

Without configuration, Akvorado listens for incoming NetFlow/IPFIX flows on port 2055.

Kafka

Received flows are exported to a Kafka topic using the protocol buffers format. The definition file is flow/flow-*.proto. Each flow is written in the length-delimited format.

The following keys are accepted:

  • topic tells which topic to use to write messages
  • topic-configuration contains the topic configuration
  • brokers specifies the list of brokers to use to bootstrap the connection to the Kafka cluster
  • version tells which minimal version of Kafka to expect
  • usetls tells whether to use TLS to connect to the brokers (authentication is not supported)
  • flush-interval defines the maximum flush interval to send received flows to Kafka
  • flush-bytes defines the maximum number of bytes to store before flushing flows to Kafka
  • max-message-bytes defines the maximum size of a message (it should be less than or equal to the same setting in the broker configuration)
  • compression-codec defines the codec used to compress messages (none, gzip, snappy, lz4, or zstd)
  • queue-size defines the size of the internal queues used to send messages to Kafka. Increasing this value improves performance, at the cost of losing messages if a problem occurs.

The topic name is suffixed by the version of the schema. For example, if the configured topic is flows and the current schema version is 0, the topic used to send received flows will be flows-v0.

If no topic configuration is provided, the topic should already exist in Kafka. If a configuration is provided, the topic is created if it does not exist or updated if it does. Currently, updating the number of partitions or the replication factor is not possible. The following keys are accepted for the topic configuration:

  • num-partitions for the number of partitions
  • replication-factor for the replication factor
  • config-entries is a mapping from configuration names to their values

For example:

kafka:
  topic: test-topic
  topic-configuration:
    num-partitions: 1
    replication-factor: 1
    config-entries:
      segment.bytes: 1073741824
      retention.ms: 86400000
      cleanup.policy: delete
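
The example above covers only the topic configuration. As a rough sketch, the other keys listed earlier could be combined as follows. All values are illustrative assumptions, including the format of the version string. Durations use the string form described at the top of this page.

```yaml
kafka:
  topic: flows
  brokers:
    - 192.0.2.1:9092
    - 192.0.2.2:9092
  # Assumed format: a Kafka version string
  version: 2.8.1
  usetls: false
  # Assumed value: flush at least every 10 seconds
  flush-interval: 10s
  # Assumed value: flush once about 1 MiB is buffered
  flush-bytes: 1048576
  # Should not exceed the same setting on the brokers
  max-message-bytes: 1000000
  compression-codec: zstd
  # Assumed value: larger queues trade safety for throughput
  queue-size: 32
```

With this configuration and a schema version of 0, flows would be sent to the topic flows-v0. As no topic-configuration is given, that topic would have to exist already.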

Core

The core component enriches flows with information from the GeoIP databases and the SNMP poller, and pushes the resulting flows to Kafka. It can also classify exporters and interfaces into groups.

The following configuration keys are accepted:

  • workers defines how many workers should be spawned to process incoming flows
  • exporter-classifiers is a list of classifier rules to define a group for exporters
  • interface-classifiers is a list of classifier rules to define connectivity type, network boundary and provider for an interface
  • classifier-cache-size defines the size of the classifier cache. As classifiers are pure, their results are cached. The metrics should tell whether the cache is big enough. It should be set to at least twice the number of the busiest interfaces.

Classifier rules are written using expr.

Exporter classifiers get the exporter IP address and its hostname. If they can make a decision, they should invoke one of the Classify() functions with the target group as an argument. Calling this function makes the exporter part of the provided group. Evaluation of rules stops on the first match. The accessible variables and functions are:

  • Exporter.IP for the exporter IP address
  • Exporter.Name for the exporter name
  • Classify() to classify exporter to a group

Interface classifiers get the following information and, like exporter classifiers, should invoke one of the Classify() functions to make a decision:

  • Exporter.IP for the exporter IP address
  • Exporter.Name for the exporter name
  • Interface.Name for the interface name
  • Interface.Description for the interface description
  • Interface.Speed for the interface speed
  • ClassifyConnectivity() to classify for a connectivity type (transit, PNI, PPNI, IX, customer, core, ...)
  • ClassifyProvider() to classify for a provider (Cogent, Telia, ...)
  • ClassifyExternal() to classify the interface as external
  • ClassifyInternal() to classify the interface as internal

Once an interface is classified for a given criterion, the result cannot be changed by a later rule. Once an interface is classified for all criteria, the remaining rules are skipped. Connectivity and provider are somewhat normalized (converted to lowercase).

Each Classify() function, with the exception of ClassifyExternal() and ClassifyInternal(), has a variant ending with Regex. This variant takes a string and a regex before the original string argument and performs a regex match. The original string is expanded using the matching parts of the regex. The regex syntax is the one from Go.

Here is an example:

Interface.Description startsWith "Transit:" &&
ClassifyConnectivity("transit") &&
ClassifyExternal() &&
ClassifyProviderRegex(Interface.Description, "^Transit: ([^ ]+)", "$1")
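
To show where such rules would live, here is a sketch of a configuration section using the keys listed earlier. It assumes that the section is named core and that each rule is given as a YAML string. The exporter rule and the numeric values are hypothetical. Rules are quoted or written as folded scalars because they contain characters such as ": " that YAML would otherwise interpret.

```yaml
core:
  # Assumed values
  workers: 2
  classifier-cache-size: 1000
  exporter-classifiers:
    # Hypothetical rule: exporters whose name starts with "edge" join the "edge" group
    - 'Exporter.Name startsWith "edge" && Classify("edge")'
  interface-classifiers:
    # Same rule as the example above, folded into a single line by ">-"
    - >-
      Interface.Description startsWith "Transit:" &&
      ClassifyConnectivity("transit") &&
      ClassifyExternal() &&
      ClassifyProviderRegex(Interface.Description, "^Transit: ([^ ]+)", "$1")
```

With the interface rule, a description such as "Transit: Cogent 1-3834938493" would be classified as external transit with provider cogent, given the lowercase normalization described above.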

GeoIP

The GeoIP component adds source and destination country, as well as the AS number of the source and destination IP if they are not present in the received flows. It needs two databases using the MaxMind DB file format, one for AS numbers, one for countries. If no database is provided, the component is inactive. It accepts the following keys:

  • asn-database tells the path to the ASN database
  • country-database tells the path to the country database

If the files are updated while Akvorado is running, they are automatically refreshed.
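
A minimal sketch, assuming the section is named geoip. The paths are hypothetical and should point to your own MaxMind DB files.

```yaml
geoip:
  # Hypothetical path to the database providing AS numbers
  asn-database: /usr/share/GeoIP/GeoLite2-ASN.mmdb
  # Hypothetical path to the database providing countries
  country-database: /usr/share/GeoIP/GeoLite2-Country.mmdb
```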

SNMP

Flows only include interface indexes. To associate them with an interface name and description, SNMP is used to poll the exporter sending each flow. A cache is maintained to avoid continuously polling the exporters. The following keys are accepted:

  • cache-duration tells how much time to keep data in the cache
  • cache-refresh tells how much time to wait before updating an entry by polling it
  • cache-check-interval tells how often to check if cached data is about to expire or needs an update
  • cache-persist-file tells where to store cached data on shutdown and read them back on startup
  • default-community tells which community to use when polling exporters
  • communities is a map from an exporter IP address to the community to use for that exporter, overriding the default value set above
  • poller-retries is the number of retries on unsuccessful SNMP requests
  • poller-timeout tells how long the poller should wait for an answer
  • workers tells how many workers to spawn to handle SNMP polling

As flows missing interface information are discarded, persisting the cache helps Akvorado handle incoming flows quickly after startup. By default, no persistent cache is configured.
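
As an illustration, the keys above might be combined as follows. This sketch assumes the section is named snmp. All values, the file path, and the community names are hypothetical.

```yaml
snmp:
  # Keep entries for 30 minutes, refresh them once they are 5 minutes old
  cache-duration: 30m
  cache-refresh: 5m
  cache-check-interval: 2m
  # Hypothetical path: lets the cache survive a restart
  cache-persist-file: /var/lib/akvorado/snmp_cache
  default-community: public
  communities:
    # Hypothetical exporter using a different community
    192.0.2.10: private-community
  poller-retries: 3
  poller-timeout: 1s
  workers: 10
```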

HTTP

The built-in HTTP server serves various pages. Its configuration supports only the listen key, which specifies the address and port to listen on. For example:

http:
  listen: 0.0.0.0:8000

Web

The web interface presents the landing page of Akvorado. It also embeds the documentation. It accepts only the following key:

  • grafanaurl specifies the URL of Grafana, which is then exposed as /grafana
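
For instance, assuming the section is named web and that Grafana is reachable at a hypothetical internal address:

```yaml
web:
  # Hypothetical URL of a Grafana instance
  grafanaurl: http://grafana.example.com:3000
```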

ClickHouse

The ClickHouse component exposes some useful HTTP endpoints to configure a ClickHouse database. Optionally, it will also provision and keep up-to-date a ClickHouse database. In this case, the following keys should be provided:

  • servers defines the list of ClickHouse servers to connect to
  • username is the username to use for authentication
  • password is the password to use for authentication
  • database defines the database to use to create tables
  • akvorado-url defines the URL of Akvorado to be used by ClickHouse (autodetected when not specified)
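
A sketch of such a configuration, assuming the section is named clickhouse and that servers are given as host:port pairs. Host names, credentials, and URLs are placeholders.

```yaml
clickhouse:
  servers:
    # Assumed host:port format
    - clickhouse.example.com:9000
  # Placeholder credentials
  username: default
  password: secret
  database: default
  # Optional: autodetected when not specified
  akvorado-url: http://akvorado.example.com:8000
```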

Reporting

Reporting encompasses logging and metrics. Currently, as Akvorado is expected to be run inside Docker, logging is done on the standard output and is not configurable. As for metrics, they are reported by the HTTP component on the /api/v0/metrics endpoint and there is nothing to configure either.