
Configuration

The orchestrator service is configured through YAML files (provided in the config/ directory) and includes the configuration of the other services.

Tip

Other services query the orchestrator through HTTP on startup to get their configuration. By default, the orchestrator restarts automatically if it detects a configuration change, but this may fail if there is a configuration error. Look at the logs of the orchestrator service or restart it if you think a configuration change is not applied.

You can get the default configuration with docker compose run --rm --no-deps akvorado-orchestrator orchestrator --dump --check /dev/null. Note that some sections are generated from the configuration of other sections. It is better not to use the generated configuration as a base for your own. Write durations as strings, like 10h20m or 5s. Valid time units are ms, s, m, and h.

You can also override configuration settings with environment variables. Remove any - from key names and use _ for nesting. Then, add the prefix AKVORADO_CFG_ORCHESTRATOR_. Let's consider this configuration file:

http:
  listen: 127.0.0.1:8081
kafka:
  topic: test-topic
  brokers:
    - 192.0.2.1:9092
    - 192.0.2.2:9092

It can be translated to:

AKVORADO_CFG_ORCHESTRATOR_HTTP_LISTEN=127.0.0.1:8081
AKVORADO_CFG_ORCHESTRATOR_KAFKA_TOPIC=test-topic
AKVORADO_CFG_ORCHESTRATOR_KAFKA_BROKERS=192.0.2.1:9092,192.0.2.2:9092

The orchestrator service has its own configuration and the configuration for the other services. The configuration for each service is under a key with the same name as the service (inlet, outlet, and console). For each service, you can provide a list of configurations. A service can request a specific configuration by adding an index to the configuration URL. If the index does not match a configuration, the first configuration is used.

Each service has several functional components. Each component has a section in the configuration file with the same name.

Inlet service

Configure this service under the inlet key. The inlet service receives NetFlow/IPFIX/sFlow packets and sends them to Kafka. Its main components are flow and kafka.

Flow

The flow component handles incoming flows. Use the inputs key to define a list of inputs for incoming flows. The flows are put into protobuf messages and sent to Kafka without being parsed.

Each input has a type and a decoder. For decoder, netflow and sflow are supported. For type, udp and file are supported.

For the UDP input, you can use the following keys:

  • listen: set the listening endpoint.
  • workers: set the number of workers to listen to the socket.
  • receive-buffer: set the size of the kernel's incoming buffer for each listening socket.

If you set use-src-addr-for-exporter-addr to true, the source IP of the received flow packet is used as the exporter address. You can also choose how to extract the timestamp for each packet with timestamp-source:

  • udp: use the receive time of the UDP packet (the default).
  • netflow-packet: extract the timestamp from the NetFlow/IPFIX header.
  • netflow-first-switched: use the “first switched” field from NetFlow/IPFIX.

For example:

flow:
  inputs:
    - type: udp
      decoder: netflow
      listen: :2055
      workers: 3
      use-src-addr-for-exporter-addr: true
    - type: udp
      decoder: sflow
      listen: :6343
      workers: 3

Use the file input for testing only. It has a paths key to define the files to read. These files are continuously added to the processing pipeline. For example:

flow:
  inputs:
    - type: file
      decoder: netflow
      paths:
        - /tmp/flow1.raw
        - /tmp/flow2.raw
    - type: file
      decoder: sflow
      paths:
        - /tmp/flow1.raw
        - /tmp/flow2.raw

Without configuration, Akvorado listens for incoming NetFlow/IPFIX and sFlow flows on a random port. Check the logs to see which port is used.

Kafka

The inlet service sends received flows to a Kafka topic using the protocol buffers format.

The following keys are accepted:

  • topic, brokers, and tls are described in the configuration for the orchestrator service. Their values are copied from the orchestrator configuration, unless you set brokers explicitly.
  • compression-codec defines the compression codec for messages: none, gzip, snappy, lz4 (default), or zstd.
  • queue-size defines the maximum number of messages to buffer for Kafka.
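
For example, a minimal sketch (the codec and queue size below are illustrative values, not documented defaults):

kafka:
  compression-codec: zstd
  queue-size: 32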

A version number is automatically added to the topic name. This is to prevent problems if the protobuf schema changes in a way that is not backward-compatible.

Outlet service

Configure this service under the outlet key. The outlet service takes flows from Kafka, parses them, adds metadata and routing information, and sends them to ClickHouse. Its main components are kafka, metadata, routing, and core.

Kafka

The outlet's Kafka component takes flows from the Kafka topic. The following keys are accepted:

  • topic, brokers, and tls are described in the configuration for the orchestrator service. Their values are copied from the orchestrator configuration, unless you set brokers explicitly.
  • consumer-group defines the consumer group ID for Kafka consumption.
  • fetch-min-bytes defines the minimum number of bytes to fetch from Kafka.
  • fetch-max-wait-time defines the maximum time to wait for the minimum number of bytes to become available.
  • min-workers defines the minimum number of Kafka workers to use.
  • max-workers defines the maximum number of Kafka workers to use. It should not exceed the number of partitions of the topic, as set by num-partitions in the orchestrator's Kafka topic configuration.
  • worker-increase-rate-limit defines the duration before increasing the number of workers.
  • worker-decrease-rate-limit defines the duration before decreasing the number of workers.

The number of running workers depends on the load of the ClickHouse component: it is adjusted so batches stay below maximum-batch-size (see the ClickHouse component below). Do not set max-workers too high, as it can increase the load on ClickHouse. The default value of 8 is usually fine.
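
For example, a minimal sketch (the consumer group name is an arbitrary choice; max-workers matches the stated default):

kafka:
  consumer-group: akvorado-outlet
  min-workers: 1
  max-workers: 8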

Routing

The routing component can get the source and destination AS numbers, AS paths, and communities. Not all exporters provide this information. Currently, the default provider is BMP. Akvorado tries to select the best route using the next hop from the flow. If it is not found, it will use any other next hop.

The component has a provider key that defines the provider configuration. Inside the provider configuration, the type key defines the provider type. bmp and bioris are currently supported. The remaining keys are specific to the provider.

BMP provider

For the BMP provider, the following keys are accepted:

  • listen specifies the IP address and port to listen for incoming connections (default port is 10179).
  • rds is a list of route distinguishers to accept. Use 0 to accept routes without a route distinguisher.
  • collect-asns defines if origin AS numbers should be collected.
  • collect-aspaths defines if AS paths should be collected.
  • collect-communities defines if communities should be collected. It supports regular and large communities, but not extended communities.
  • keep defines how long to keep routes from a terminated BMP connection.
  • receive-buffer is the size of the kernel receive buffer in bytes for each established BMP connection.

If you do not need AS paths and communities, you can disable them to save memory and disk space in ClickHouse.

Akvorado supports receiving AdjRIB-in, with or without filtering. It can also work with a LocRIB.

For example:

routing:
  provider:
    type: bmp
    listen: 0.0.0.0:10179
    collect-asns: true
    collect-aspaths: true
    collect-communities: false

Note

With many routes, BMP can have performance issues when a peer disconnects. If you do not need full accuracy, limit the number of BMP peers and export the LocRIB. These issues will be fixed in a future release.

BioRIS provider

As an alternative to the internal BMP, you can connect to an existing bio-rd RIS instance. It accepts the following keys:

  • ris-instances is a list of instances.
  • timeout defines how long to wait for an answer from a RIS instance.
  • refresh defines how long to wait between refreshing the list of routers.

Each instance accepts the following keys:

  • grpc-addr is the address and port of a RIS instance.
  • grpc-secure tells if a connection should be set using TLS.
  • vrf (as a string) or vrf-id (as an ID) defines which VRF to look up.

This is configured as follows:

routing:
  provider:
    type: bioris
    ris-instances:
      - grpc-addr: 192.0.2.15:4321
        grpc-secure: true
        vrf: 0:0

BioRIS queries the RIB of the router that sent the flow. If this router's RIB is not available in any of the known RIS instances, another router is used as a fallback. After the router ID is determined, BioRIS queries one of the RIS instances that has the RIB.

BioRIS can set the prefix, AS, AS Path, and communities for the flow.

Metadata

Flows only include interface indexes. To associate them with an interface name and description, metadata is retrieved from the exporting routers. A cache is used. Several providers are available to poll metadata. The following keys are accepted:

  • cache-duration defines how long to keep data in the cache.
  • cache-refresh defines how long to wait before updating an entry by polling it.
  • cache-check-interval defines how often to check if cached data is about to expire or needs an update.
  • cache-persist-file defines where to store cached data on shutdown and read it back on startup.
  • query-timeout defines how long to wait for a provider to answer a query.
  • initial-delay defines how long to wait after starting before applying the standard query timeout.
  • providers defines the provider configurations.

Because flows missing any interface information are discarded, persisting the cache is useful to quickly handle incoming flows.
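
For example, a sketch of the cache settings (all durations and the file path are illustrative):

metadata:
  cache-duration: 30m
  cache-refresh: 1h
  cache-check-interval: 2m
  cache-persist-file: /var/lib/akvorado/metadata.cache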

The providers key contains the provider configurations. For each, the provider type is defined by the type key. When using several providers, they are queried in order and the process stops on the first one that accepts the query. Currently, only the static provider can skip a query. Therefore, you should put it first.

SNMP provider

The snmp provider accepts these configuration keys:

  • credentials is a map from exporter subnets to credentials. Use ::/0 to set the default value.
    • For SNMPv2, it accepts the communities key: either a single community or a list of communities. In the latter case, each community is tried in order for all requests.
    • For SNMPv3, it accepts the following keys: user-name, authentication-protocol (none, MD5, SHA, SHA224, SHA256, SHA384, and SHA512 are accepted), authentication-passphrase (if the previous value was set), privacy-protocol (none, DES, AES, AES192, AES256, AES192-C, and AES256-C are accepted, the last two being the Cisco variants), privacy-passphrase (if the previous value was set), and context-name.
  • ports is a map from exporter subnets to the SNMP port to use for polling exporters in the provided subnet.
  • agents is a map from exporter IPs to agent IPs. When there is no match, the exporter IP is used. Other options still use the exporter IP as a key, not the agent IP.
  • poller-retries is the number of retries for unsuccessful SNMP requests.
  • poller-timeout defines how long the poller should wait for an answer.

Akvorado uses SNMPv2 if communities is present and SNMPv3 if user-name is present. You need one of them.

For example, with SNMPv2, you can try both private and @private SNMPv2 communities:

metadata:
  workers: 10
  providers:
    - type: snmp
      credentials:
        ::/0:
          communities:
            - private
            - "@private"

And with SNMPv3:

metadata:
  workers: 10
  providers:
    - type: snmp
      credentials:
        ::/0:
          user-name: monitoring
          authentication-protocol: SHA
          authentication-passphrase: "d$rkSec"
          privacy-protocol: AES192
          privacy-passphrase: "Cl0se"

gNMI provider

The gnmi provider polls an exporter using gNMI. It accepts these keys:

  • targets is a map from exporter subnets to target IPs. When there is no match, the exporter IP is used. Other options still use the exporter IP as a key, not the target IP.
  • ports is a map from exporter subnets to the gNMI port to use for polling exporters in the provided subnet.
  • set-target is a map from exporter subnets to a boolean that specifies if the target name should be set in the gNMI path prefix. In this case, it is set to the exporter IP address. This is useful if the selected target is a gNMI gateway.
  • authentication-parameters is a map from exporter subnets to authentication parameters for gNMI targets. Authentication parameters accept these keys: username, password, and tls (which takes the same keys as for Kafka).
  • models is the list of models to use to get information from a target. Each model is tried, and if a target supports all the paths, it is selected. The models are tried in the order they are declared. If you want to keep the built-in models, use the special string defaults.
  • timeout defines how long to wait for an answer from a target.
  • minimal-refresh-interval is the minimum time a collector will wait before polling a target again.

For example:

metadata:
  providers:
    type: gnmi
    authentication-parameters:
      ::/0:
        username: admin
        password: NokiaSrl1!
        skip-verify: true

Unlike SNMP, a single metadata worker is sufficient for gNMI.

The gNMI provider uses "subscribe once" to poll for information from the target. This should be compatible with most targets.

A model accepts these keys:

  • name for the model name (e.g., Nokia SR Linux).
  • system-name-paths is a list of paths to get the system name (e.g., /system/name/host-name).
  • if-index-paths is a list of paths to get interface indexes.
  • if-name-keys is a list of keys where you can find the name of an interface in the paths returned for interface indexes (e.g., name or port-id).
  • if-name-paths is a list of paths to get interface names. These paths take precedence over the previous key if found.
  • if-description-paths is a list of paths to get interface descriptions.
  • if-speed-paths is a list of paths to get interface speeds. For this key, a path is defined by two keys: path for the gNMI path and unit for the unit on how to interpret the value. A unit can be bps (bits per second), mbps (megabits per second), ethernet (OpenConfig ETHERNET_SPEED like SPEED_100GB), or human (human-readable format like 10G or 100M).

The currently supported models are:

  • Nokia SR OS
  • Nokia SR Linux
  • OpenConfig
  • IETF

Static provider

The static provider accepts an exporters key that maps exporter subnets to an exporter configuration. An exporter configuration is a map:

  • name is the name of the exporter.
  • default is the default interface when no match is found.
  • ifindexes is a map from interface indexes to an interface.
  • skip-missing-interfaces defines whether the exporter should process only the interfaces defined in the configuration and leave the rest to the next provider. This conflicts with the default key.

An interface has a name, a description, and a speed.

For example, to add an exception for 2001:db8:1::1 and then use SNMP for other exporters:

metadata:
  providers:
    - type: static
      exporters:
        2001:db8:1::1:
          name: exporter1
          skip-missing-interfaces: true
          ifindexes:
            10:
              name: Gi0/0/10
              description: PNI Netflix
              speed: 1000
            11:
              name: Gi0/0/15
              description: PNI Google
              speed: 1000
    - type: snmp
      credentials:
        ::/0:
          communities: private

The static provider also accepts an exporter-sources key, which fetches a remote source that maps subnets to attributes. This is similar to exporters, but the definition is fetched through HTTP. It accepts a map from source names to sources. Each source accepts these attributes:

  • url is the URL to fetch.
  • tls defines the TLS configuration to connect to the source (it uses the same configuration as for Kafka, be sure to set enable to true)
  • method is the method to use (GET or POST).
  • headers is a map of header names to values to add to the request.
  • proxy defines if a proxy should be used (defined with environment variables like http_proxy).
  • timeout defines the timeout for fetching and parsing.
  • interval is the interval at which the source should be refreshed.
  • transform is a jq expression that transforms the received JSON into a set of attributes represented as objects. Each object should have these keys: exporter-subnet, default (with the same structure as a static configuration), and interfaces. The latter is a list of interfaces, where each interface has an ifindex, a name, a description, and a speed.

For example:

metadata:
  providers:
    type: static
    exporter-sources:
      gostatic:
        url: http://gostatic:8043/my-exporters.json
        interval: 10m
        transform: .exporters[]
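
With the .exporters[] transform above, the fetched JSON could look like this (a hypothetical payload matching the attributes expected after the transform):

{
  "exporters": [
    {
      "exporter-subnet": "2001:db8:1::1/128",
      "default": {"name": "exporter1", "description": "default", "speed": 1000},
      "interfaces": [
        {"ifindex": 10, "name": "Gi0/0/10", "description": "PNI Netflix", "speed": 1000}
      ]
    }
  ]
}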

Core

The core component processes flows from Kafka, queries the metadata component to enrich the flows with additional information, and classifies exporters and interfaces into groups with a set of classification rules. It also handles flow rate limiting.

The following configuration keys are accepted:

  • exporter-classifiers is a list of classifier rules to define a group for exporters.
  • interface-classifiers is a list of classifier rules to define the connectivity type, network boundary, and provider for an interface.
  • classifier-cache-duration defines how long to keep the result of a previous classification in memory to reduce CPU usage.
  • default-sampling-rate defines the default sampling rate to use when the information is missing. If not defined, flows without a sampling rate will be rejected. Use this option only if your hardware is unable to advertise a sampling rate. This can either be a single value or a map from subnets to sampling rates.
  • override-sampling-rate defines the sampling rate to use instead of the one received in the flows. This is useful if a device lies about its sampling rate. This is a map from subnets to sampling rates (but it also accepts a single value).
  • asn-providers defines the source list for AS numbers. The available sources are flow, flow-except-private (use information from the flow except if the ASN is private), routing, routing-except-private, and geo-ip. The default value is flow, routing, geo-ip. geo-ip should only be used last, as no fallback is possible after it.
  • net-providers defines the sources for prefix lengths and next hop. flow uses the value provided by the flow message (if any), while routing looks it up using the BMP component. If multiple sources are provided, the value of the first source providing a non-default route is taken. The default value is flow and routing.
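
For example, a sketch combining a per-subnet default sampling rate with an explicit provider order (the subnets and rates are illustrative):

core:
  default-sampling-rate:
    ::/0: 1000
    192.0.2.0/24: 10000
  asn-providers:
    - flow
    - routing
    - geo-ip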

Classification

Classifier rules are written in a language called Expr.

Interface classifiers get exporter- and interface-related information as input. If they can make a decision, they should invoke one of the Classify() functions with the target element as an argument. The following items and functions are available:

  • Exporter.IP for the exporter IP address
  • Exporter.Name for the exporter name
  • Interface.Index for the interface index
  • Interface.Name for the interface name
  • Interface.Description for the interface description
  • Interface.Speed for the interface speed
  • Interface.VLAN for VLAN number (you need to enable SrcVlan and DstVlan in schema)
  • ClassifyConnectivity() to classify for a connectivity type (transit, PNI, PPNI, IX, customer, core, ...)
  • ClassifyProvider() to classify for a provider (Cogent, Telia, ...)
  • ClassifyExternal() to classify the interface as external
  • ClassifyInternal() to classify the interface as internal
  • SetName() to change the interface name
  • SetDescription() to change the interface description
  • Reject() to reject the flow
  • Format() to format a string: Format("name: %s", Interface.Name)

Once an interface is classified for a given criterion, it cannot be changed by a later rule. Once an interface is classified for all criteria, the remaining rules are skipped. Connectivity and provider are normalized (lower case, special characters removed).

Each Classify() function, with the exception of ClassifyExternal() and ClassifyInternal(), has a variant ending in Regex which takes a string and a regex before the original string and does a regex match. The original string is expanded using the matching parts of the regex. The syntax is the one from Go. If you want to use Perl character classes, such as \d or \w, you need to escape the backslash character: \\d and \\w. To test your regex, you can use a site like regular expressions 101. Be sure to use the "Golang" flavor. You can use the substitution function. In this case, append .* to your regex to get the expected result (you can keep it in the final regex if you prefer).

Here is an example, assuming interface descriptions for external-facing interfaces look like Transit: Cogent 1-3834938493 or PNI: Netflix (WL6-1190).

interface-classifiers:
  - |
    ClassifyConnectivityRegex(Interface.Description, "^(?i)(transit|pni|ppni|ix):? ", "$1") &&
    ClassifyProviderRegex(Interface.Description, "^[^ ]+? ([^ ]+)", "$1") &&
    ClassifyExternal()
  - ClassifyInternal()

The first rule says “extract the connectivity (transit, pni, ppni, or ix) from the interface description; if successful, use the second part of the description as the provider; and if successful, consider the interface an external one”. The second rule says “if an interface was not classified as external or internal, consider it an internal one”.

Exporter classifiers get the exporter IP address and its hostname as input. Like the interface classifiers, they should invoke one of the Classify() functions to make a decision:

  • Exporter.IP for the exporter IP address
  • Exporter.Name for the exporter name
  • ClassifyGroup() to classify the exporter to a group
  • ClassifyRole() to classify the exporter for a role (edge, core)
  • ClassifySite() to classify the exporter to a site (paris, berlin, newyork)
  • ClassifyRegion() to classify the exporter to a region (france, italy, caraibes)
  • ClassifyTenant() to classify the exporter to a tenant (team-a, team-b)
  • Reject() to reject the flow
  • Format() to format a string: Format("name: %s", Exporter.Name)

For compatibility, Classify() is an alias for ClassifyGroup(). Here is an example, assuming routers are named th2-ncs55a1-1.example.fr or milan-ncs5k8-2.example.it:

exporter-classifiers:
  - ClassifySiteRegex(Exporter.Name, "^([^-]+)-", "$1")
  - Exporter.Name endsWith ".it" && ClassifyRegion("italy")
  - Exporter.Name matches "^(washington|newyork).*" && ClassifyRegion("usa")
  - Exporter.Name endsWith ".fr" && ClassifyRegion("france")

ClickHouse

The ClickHouse component pushes data to ClickHouse. There are three settings that are configurable:

  • maximum-batch-size defines how many flows to send to ClickHouse in a single batch at most
  • minimum-wait-time defines how long to wait before sending an incomplete batch
  • grace-period defines how long to wait when flushing data to ClickHouse on shutdown

These numbers are per worker (as defined in the Kafka component). A worker sends a batch of at most maximum-batch-size flows, and sends an incomplete batch once minimum-wait-time has elapsed. ClickHouse is more efficient when the batch size is large. The default value of 100 000 allows ClickHouse to handle incoming flows efficiently.
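
For example, a sketch with the default batch size spelled out (the wait times are illustrative):

clickhouse:
  maximum-batch-size: 100000
  minimum-wait-time: 5s
  grace-period: 10s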

Flow

The flow component decodes flows received from Kafka. There is only one setting:

  • state-persist-file defines the location of the file to save the state of the flow decoders and read it back on startup. It is used to store IPFIX/NetFlow templates and options.
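
A minimal sketch (the path is illustrative):

flow:
  state-persist-file: /var/lib/akvorado/flow-state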

Orchestrator service

The three main components of the orchestrator service are schema, clickhouse, and kafka. The automatic-restart directive tells the orchestrator to watch for configuration changes and restart if there are any. It is enabled by default.

Schema

It is possible to alter the data schema used by Akvorado by adding and removing columns. For example, to materialize the SrcNetPrefix and DstNetPrefix columns, disable SrcCountry and DstCountry, and enable SrcVlan and DstVlan, one can use:

schema:
  materialize:
    - SrcNetPrefix
    - DstNetPrefix
  disabled:
    - SrcCountry
    - DstCountry
  enabled:
    - SrcVlan
    - DstVlan

With materialize, you can control whether a dimension computed from other dimensions (e.g. SrcNetPrefix and DstNetPrefix) is computed at query time (the default) or materialized at ingest time. This reduces the query time, but increases the storage needs.

You can get the list of columns you can enable or disable with akvorado version -d. Disabling a column won't delete existing data.

It is also possible to make some columns available on the main table only or on all tables with main-table-only and not-main-table-only. For example:

schema:
  enabled:
    - SrcMAC
    - DstMAC
  main-table-only:
    - SrcMAC
    - DstMAC
  not-main-table-only:
    - SrcAddr
    - DstAddr

For ICMP, you get ICMPv4Type, ICMPv4Code, ICMPv6Type, ICMPv6Code, ICMPv4, and ICMPv6. The last two are displayed as strings in the console (like echo-reply or frag-needed).

Custom dictionaries

You can add custom dimensions to be looked up via a dictionary. This is useful to enrich your flows with additional information that cannot be obtained in the classifiers. This works by providing the database with a CSV file containing the values.

schema:
  custom-dictionaries:
    ips:
      layout: complex_key_hashed
      keys:
        - name: addr
          type: String
      attributes:
        - name: role
          type: String
          default: DefaultRole
          label: IPRole
      source: /etc/akvorado/ips_annotation.csv
      dimensions:
        - SrcAddr
        - DstAddr

This example expects a CSV file named ips_annotation.csv (when using Docker, put it in the config/ directory) with the following format:

addr,role
2001:db8::1,ExampleRole

If SrcAddr has the value 2001:db8::1 (matches the key), the dimension SrcAddrIPRole will be set to ExampleRole.

Independently, if DstAddr has the value 2001:db8::1, the dimension DstAddrIPRole will be set to ExampleRole.

All other IPs will get "DefaultRole" in their SrcAddrIPRole/DstAddrIPRole dimension.

The label and default keys are optional.

It is possible to add the same dictionary to multiple dimensions, usually for the input and output directions.

By default, the value of the key tries to match a dimension. With multiple keys, it is necessary to explicitly specify the dimension name to match by specifying either match-dimension or match-dimension-suffix:

schema:
  custom-dictionaries:
    interfaces:
      layout: complex_key_hashed
      dimensions:
        - OutIf
        - InIf
      keys:
        - name: agent
          type: String
          # CSV column “agent” matches the ExporterAddress dimension
          match-dimension: ExporterAddress
        - name: interface
          type: String
          # CSV column “interface” matches either OutIfName or InIfName
          match-dimension-suffix: Name
      attributes:
        - name: information # OutIfInformation/InIfInformation
          type: String
          # No default: if agent and interface do not both match, the dimension is empty
      source: /etc/akvorado/interfaces.csv

Kafka

The Kafka component creates or updates the Kafka topic to receive flows. It accepts the following keys:

  • brokers specifies the list of brokers to use to bootstrap the connection to the Kafka cluster
  • tls defines the TLS configuration to connect to the cluster
  • sasl defines the SASL configuration to connect to the cluster
  • topic defines the base topic name
  • manage-topic controls whether the orchestrator should create/update the Kafka topic (default: true). Can be set to false when Kafka is not needed (e.g., UI-only setup) or managed externally.
  • topic-configuration describes how the topic should be configured

The following keys are accepted for the TLS configuration:

  • enable should be set to true to enable TLS.
  • skip-verify can be set to true to skip checking server certificate (not recommended).
  • ca-file gives the location of the file containing the CA certificate in PEM format to check the server certificate. If not provided, the system certificates are used instead.
  • cert-file and key-file define the location of the client certificate pair in PEM format to authenticate to the broker. If the first one is empty, no client certificate is used. If the second one is empty, the key is expected to be in the certificate file.

The following keys are accepted for SASL configuration:

  • username and password enable SASL authentication with the provided user and password.
  • algorithm tells which SASL mechanism to use for authentication. This can be none, plain, scram-sha256, scram-sha512, or oauth. This should not be set to none when SASL is used.
  • oauth-token-url defines the URL to query to get a valid OAuth token (in this case, username and password are used as client credentials).
  • oauth-scopes defines the list of scopes to request for the OAuth token.

The following keys are accepted for the topic configuration:

  • num-partitions for the number of partitions
  • replication-factor for the replication factor
  • config-entries is a mapping from configuration names to their values
  • config-entries-strict-sync for the configuration in-sync policy

For example:

kafka:
  topic: test-topic
  topic-configuration:
    num-partitions: 1
    replication-factor: 1
    config-entries:
      segment.bytes: 1073741824
      retention.ms: 86400000
      cleanup.policy: delete
    config-entries-strict-sync: true

Another useful setting is retention.bytes, which limits the size of each partition in bytes (divide your target topic size by the number of partitions to get the per-partition value).
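
For example, to cap a topic at roughly 100 GiB spread over 10 partitions, each partition would be limited to 10 GiB (a sketch to merge with the example above):

kafka:
  topic-configuration:
    num-partitions: 10
    config-entries:
      retention.bytes: 10737418240  # 100 GiB / 10 partitions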

Currently, the orchestrator service won't update the replication factor. By default, the configuration entries are kept in sync with the content of the configuration file. If you disable config-entries-strict-sync, existing overrides that are not listed in the configuration file are not removed from the topic configuration entries.

ClickHouse database

The ClickHouse database component contains the settings to connect to the ClickHouse database. The following keys should be provided inside clickhousedb:

  • servers defines the list of ClickHouse servers to connect to
  • username is the username to use for authentication
  • password is the password to use for authentication
  • database defines the database to use to create tables
  • cluster defines the cluster for replicated and distributed tables, see the next section for more information
  • tls defines the TLS configuration to connect to the database (it uses the same configuration as for Kafka)
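
For example, a minimal sketch (the server address and credentials are placeholders):

clickhousedb:
  servers:
    - clickhouse:9000
  username: default
  password: akvorado
  database: default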

ClickHouse

The ClickHouse component exposes some useful HTTP endpoints to configure a ClickHouse database. It also provisions a ClickHouse database and keeps it up to date. The following keys can be provided inside clickhouse:

  • resolutions defines the various resolutions to keep data
  • max-partitions defines the number of partitions to use when creating consolidated tables
  • networks maps subnets to attributes. Attributes are name, role, site, region, and tenant. They are exposed as SrcNetName, DstNetName, SrcNetRole, DstNetRole, etc. It is also possible to override GeoIP attributes city, state, country, and ASN.
  • network-sources fetches remote sources mapping subnets to attributes. This is similar to networks but the definitions are fetched through HTTP. It accepts a map from source names to sources. Each source accepts the following attributes:
    • url is the URL to fetch
    • tls defines the TLS configuration to connect to the source (it uses the same configuration as for Kafka, be sure to set enable to true)
    • method is the method to use (GET or POST)
    • headers is a map from header names to values to add to the request
    • proxy says if we should use a proxy (defined through environment variables like http_proxy)
    • timeout defines the timeout for fetching and parsing
    • interval is the interval at which the source should be refreshed
    • transform is a jq expression to transform the received JSON into a set of network attributes represented as objects. Each object must have a prefix attribute and, optionally, name, role, site, region, tenant, city, state, country, and asn. See the example provided in the shipped akvorado.yaml configuration file.
  • asns maps AS numbers to names (overriding the built-in ones)
  • orchestrator-url defines the URL of the orchestrator to be used by ClickHouse (autodetection when not specified)
  • orchestrator-basic-auth enables basic authentication to access the orchestrator URL. It takes two attributes: username and password.
  • skip-migrations controls whether to skip ClickHouse schema management (default: false). Can be set to true when the schema is managed externally or by another orchestrator. The outlet requires the schema to match the expected structure; schema mismatches may cause write errors.

The resolutions setting contains a list of resolutions. Each resolution has two keys: interval and ttl. The first one is the consolidation interval. The second one is how long to keep the data in the database. If ttl is 0, the data is kept forever. If interval is 0, it applies to the raw data (the one in the flows table). For each resolution, a materialized view flows_DDDD is created with the specified interval. Note that consolidated tables do not contain information about source/destination IP addresses and ports; that's why you may want to keep the interval-0 data a bit longer. For performance reasons, Akvorado still uses the consolidated tables when a query does not require the raw table.

Here is the default configuration:

resolutions:
  - interval: 0
    ttl: 360h  # 15 days
  - interval: 1m
    ttl: 168h  # 1 week
  - interval: 5m
    ttl: 2160h # 3 months
  - interval: 1h
    ttl: 8760h # 1 year

If you want to tweak the values, start from the default configuration. Most of the disk space is taken by the main table (interval: 0) and you can reduce its TTL if it's too big for your usage. Check the operational documentation for information on how to check disk usage. If you remove an existing interval, it is not removed from the ClickHouse database and will continue to be populated.

It is mandatory to specify a configuration for interval: 0.

When specifying a cluster name with cluster, the orchestrator will manage a set of replicated and distributed tables. No migration is done between the cluster and the non-cluster modes, so you shouldn't change this setting without also changing the database. If you already have an existing setup, this means you need to start from scratch and copy the data. There is currently no documented procedure for that, but it's mostly a matter of copying the flows table to flows_local, and the flows_DDDD tables (where DDDD is an interval) to flows_DDDD_local.

GeoIP

The geoip directive allows one to configure two databases using the MaxMind DB file format, one for AS numbers, one for countries/cities. It accepts the following keys:

  • asn-database gives the paths to the ASN database
  • geo-database gives the paths to the geo database (country or city)
  • optional makes the presence of the databases optional on start (when not present on start, the component is just disabled)

If the files are updated while Akvorado is running, they are automatically refreshed. For a given database, the latest paths override the earlier ones.
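
For example, a sketch assuming MaxMind GeoLite2 databases (the paths are placeholders):

geoip:
  asn-database:
    - /usr/share/GeoIP/GeoLite2-ASN.mmdb
  geo-database:
    - /usr/share/GeoIP/GeoLite2-Country.mmdb
  optional: true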

Console service

The main components of the console service are console, authentication and database.

The console itself accepts the following keys:

  • default-visualize-options to define default options for the "visualize" tab. It takes the following keys: graph-type (one of stacked, stacked100, lines, grid, or sankey), start, end, filter, dimensions (a list), limit, limitType, bidirectional (a bool), previous-period (a bool)
  • homepage-top-widgets to define the widgets to display on the home page (among src-as, dst-as, src-country, dst-country, exporter, protocol, etype, src-port, and dst-port)
  • dimensions-limit to set the upper limit of the number of returned dimensions
  • cache-ttl sets the time costly requests are kept in cache
  • homepage-graph-filter sets the filter for the graph on the homepage (default: InIfBoundary = 'external'). This is a SQL expression, passed directly into the ClickHouse query. It can also be empty, in which case the sum of all captured flows is displayed.
  • homepage-graph-timerange sets the time range to use for the graph on the homepage. It defaults to 24 hours.

It also takes a clickhouse key, accepting the same configuration as for the orchestrator service. These keys are copied from the orchestrator, unless servers is set explicitly.

Here is an example:

console:
  homepage-top-widgets: [src-as, src-country, etype]
  default-visualize-options:
    start: 1 day ago
    end: now
    filter: InIfBoundary = external
    dimensions:
      - ExporterName

Authentication

The console does not store user identities and is unable to authenticate them. It expects an authenticating proxy to add some headers to the API endpoints:

  • Remote-User is the user login,
  • Remote-Name is the user display name,
  • Remote-Email is the user email address,
  • X-Logout-URL is the logout URL,
  • X-Avatar-URL is a link to the avatar image.

Only the first header is mandatory. The name of the headers can be changed by providing a different mapping under the headers key. It is also possible to modify the default user (when no header is present) by tweaking the default-user key. If logout URL or avatar URL is not provided in the headers, it is possible to provide them as logout-url and avatar-url. In this case, they can be templated with .Login, .Name, .Email, .LogoutURL, and .AvatarURL.

auth:
  headers:
    login: Remote-User
    name: Remote-Name
    email: Remote-Email
    logout-url: X-Logout-URL
  default-user:
    login: default
    name: Default User
  avatar-url: "https://avatars.githubusercontent.com/{{ .Login }}?s=80"
  logout-url: "{{ if .LogoutURL }}{{ .LogoutURL }}{{ else }}/logout{{ end }}"

To prevent access when not authenticated, the login field for the default-user key should be empty.

There are several systems providing user management with all the bells and whistles, including OAuth2 support, multi-factor authentication, and API tokens, that can act as an authenticating reverse proxy for Akvorado. There also exist simpler solutions only providing authentication. Traefik can also be configured to forward authentication requests to another service, including OAuth2 Proxy or Traefik Forward Auth. Some examples are present in docker/docker-compose-local.yml.

Database

The console stores some data, like per-user filters, into a relational database. When the database is not configured, data is only stored in memory and will be lost on restart. Supported drivers are sqlite, mysql, and postgresql.

database:
  driver: sqlite
  dsn: /var/lib/akvorado/console.sqlite

The dsn field for sqlite should be the path to the database. For mysql, the format is user:pass@tcp(hostname:3306)/dbname?charset=utf8mb4. Check the documentation of the SQL driver for more details. For postgresql, the format is host=hostname port=5432 user=user password=pass dbname=dbname sslmode=disable. Check the documentation of libpq for more details.
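
For example, a sketch for the mysql driver (hostname, credentials, and database name are placeholders):

database:
  driver: mysql
  dsn: akvorado:akvorado@tcp(mysql:3306)/akvorado?charset=utf8mb4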

Important

With the Docker Compose setup, SQLite is configured by default with the DSN /run/akvorado/console.sqlite through an environment variable. To override this, uncomment the appropriate configuration snippet in docker/docker-compose-local.yml.

The database configuration also accepts a saved-filters key to populate the database with the provided filters. Each filter should have a description and a content:

database:
  saved-filters:
    - description: From Netflix
      content: InIfBoundary = external AND SrcAS = AS2906

Demo exporter service

For testing purposes, it is possible to generate flows using the demo exporter service. It features a NetFlow generator, a simple SNMP agent, and a BMP exporter.

snmp:
  name: exporter1.example.com
  interfaces:
    10: "Transit: Telia"
    11: "IX: AMSIX"
    20: "core"
    21: "core"
  listen: :161
bmp:
  target: 127.0.0.1:10179
  routes:
    - prefixes: 192.0.2.0/24,2a01:db8:cafe:1::/64
      aspath: 64501
      communities: 65401:10,65401:12
flows:
  samplingrate: 50000
  target: 127.0.0.1:2055
  flows:
    - per-second: 0.2
      in-if-index: 10
      out-if-index: 20
      peak-hour: 16h
      multiplier: 3
      src-port: 0
      dst-port: 80
      protocol: tcp
      size: 1300
      dst-net: 192.0.2.0/24
      dst-as: 64501
      src-net: 198.38.120.0/23
      src-as: 2906

In the snmp section, all fields are mandatory. The interfaces section maps interface indexes to their descriptions. In the bmp section, for each set of prefixes, the aspath is mandatory, but the communities are optional. In the flows section, all fields are mandatory. Have a look at the provided akvorado.yaml configuration file for a more complete example. As generating many flows is quite verbose, it may be useful to rely on YAML anchors to avoid repetition.

Common configuration settings

All services also embed an HTTP component and a reporting component.

HTTP

The built-in HTTP server serves various pages. Its configuration supports the following keys:

  • listen defines the address and port to listen to.
  • profiler enables the Go profiler HTTP interface. Check the troubleshooting section for details. It is enabled by default.
  • cache defines the cache backend to use for some HTTP requests. It accepts a type key which can be either memory (the default value) or redis. When using the Redis backend, the following additional keys are also accepted: protocol (tcp or unix), server (host and port), username, password, and db (an integer to specify which database to use).

For example:
http:
  listen: :8000
  cache:
    type: redis
    username: akvorado
    password: akvorado

Note that the cache backend is currently only useful for the console. You need to define the cache in the http key of the console section for it to have an effect.

Reporting

Reporting encompasses logging and metrics. Currently, as Akvorado is expected to be run inside Docker, logging is done on the standard output and is not configurable. As for metrics, they are reported by the HTTP component on the /api/v0/XXX/metrics endpoint (where XXX is the service name) and there is nothing to configure either.