Configuration
Akvorado can be configured through a YAML file. Each aspect is configured through a different section:
reporting: Log and metric reporting
http: Builtin HTTP server
web: Web interface
flow: Flow ingestion
snmp: SNMP poller
geoip: GeoIP database
kafka: Kafka broker
clickhouse: Clickhouse helper
core: Core
You can get the default configuration with ./akvorado --dump --check.
Durations can be written in seconds or using strings like 10h20m.
Reporting
Reporting encompasses logging and metrics. Currently, as Akvorado is
expected to be run inside Docker, logging is done on the standard
output and is not configurable. As for metrics, they are reported by
the HTTP component on the /api/v0/metrics endpoint and there is
nothing to configure either.
HTTP
The builtin HTTP server serves various pages. Its configuration
supports only the listen key to specify the address and port to
listen on. For example:
http:
  listen: 0.0.0.0:8000
Web
The web interface presents the landing page of Akvorado. It also embeds the documentation. It accepts only the following key:
grafanaurl to specify the URL to Grafana and expose it as /grafana.
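As a minimal sketch, the web section could look like the fragment below. The Grafana address is a hypothetical placeholder, not a default shipped with Akvorado.

```yaml
web:
  # Hypothetical address of a Grafana instance reachable from Akvorado.
  grafanaurl: http://grafana.example.com:3000
```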
Flow
The flow component handles flow ingestion. It supports the following configuration keys:
listen to specify the IP and UDP port to listen on for new flows
workers to specify the number of workers to spawn to handle incoming flows
bufferlength to specify the number of flows to buffer when pushing them to the core component
For example:
flow:
  listen: 0.0.0.0:2055
  workers: 2
SNMP
Flows only include interface indexes. To associate them with an interface name and description, SNMP is used to poll the sampler sending each flow. A cache is maintained to avoid continuously polling the samplers. The following keys are accepted:
cache-duration tells how much time to keep data in the cache before polling again
cache-refresh tells how long before they expire existing data should be polled again
cache-refresh-interval tells how often to check if cached data is about to expire
cache-persist-file tells where to store cached data on shutdown and read them back on startup
default-community tells which community to use when polling samplers
communities is a map from a sampler IP address to the community to use for a sampler, overriding the default value set above
workers tells how many workers to spawn to handle SNMP polling
As flows missing interface information are discarded, persisting the cache is useful to quickly be able to handle incoming flows. By default, no persistent cache is configured.
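The keys above could be combined as in the following sketch. All values are illustrative assumptions rather than defaults: the durations, the cache file path, the community strings and the sampler address are hypothetical, and the layout of the communities map is inferred from its description.

```yaml
snmp:
  # Durations use the string form described earlier; values are illustrative.
  cache-duration: 30m
  cache-refresh: 5m
  cache-refresh-interval: 2m
  # Hypothetical path; enables the persistent cache, which is off by default.
  cache-persist-file: /var/lib/akvorado/snmp_cache
  default-community: public
  communities:
    # Hypothetical sampler using a community different from the default.
    203.0.113.10: private-community
  workers: 2
```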
GeoIP
The GeoIP component adds source and destination country, as well as the AS number of the source and destination IP if they are not present in the received flows. It needs two databases using the MaxMind DB file format, one for AS numbers, one for countries. If no database is provided, the component is inactive. It accepts the following keys:
asn-database tells the path to the ASN database
country-database tells the path to the country database
If the files are updated while Akvorado is running, they are automatically refreshed.
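A minimal sketch of the geoip section follows. The paths are hypothetical and only assume that two MaxMind DB files are present on the local filesystem.

```yaml
geoip:
  # Hypothetical locations of the MaxMind DB files.
  asn-database: /usr/share/GeoIP/GeoLite2-ASN.mmdb
  country-database: /usr/share/GeoIP/GeoLite2-Country.mmdb
```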
Kafka
Received flows are exported to a Kafka topic using the protocol
buffers format. The definition file is flow/flow.proto. It is
also available through the /api/v0/flow.proto
HTTP endpoint. Each flow is written in the length-delimited
format.
The following keys are accepted:
topic tells which topic to use to write messages
topic-configuration contains the topic configuration
brokers specifies the list of brokers to use to bootstrap the connection to the Kafka cluster
version tells which minimal version of Kafka to expect
usetls tells if we should use TLS to connect (authentication is not supported)
flush-interval defines the maximum flush interval to send received flows to Kafka
flush-bytes defines the maximum number of bytes to store before flushing flows to Kafka
max-message-bytes defines the maximum size of a message (it should be equal to or smaller than the same setting in the broker configuration)
compression-codec defines the compression codec to use to compress messages (none, gzip, snappy, lz4 and zstd)
If no topic configuration is provided, the topic should already exist in Kafka. If a configuration is provided, the topic is created if it does not exist or updated if it does. Currently, updating the number of partitions or the replication factor is not possible. The following keys are accepted for the topic configuration:
num-partitions for the number of partitions
replication-factor for the replication factor
config-entries is a mapping from configuration names to their values
For example:
kafka:
  topic: test-topic
  topic-configuration:
    num-partitions: 1
    replication-factor: 1
    config-entries:
      segment.bytes: 1073741824
      retention.ms: 86400000
      cleanup.policy: delete
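The example above only covers the topic. The connection and batching keys might be set as in the sketch below. The broker names and all numeric values are illustrative assumptions, not defaults. The version key is left out because its expected format is not described here.

```yaml
kafka:
  topic: test-topic
  brokers:
    # Hypothetical broker addresses used to bootstrap the connection.
    - kafka-1.example.com:9092
    - kafka-2.example.com:9092
  usetls: false
  # Illustrative batching limits: flush at least every 10 seconds.
  flush-interval: 10s
  flush-bytes: 10485760
  # Keep this equal to or smaller than the broker-side limit.
  max-message-bytes: 1000000
  compression-codec: zstd
```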
Clickhouse
The Clickhouse component exposes some useful HTTP endpoints to configure a Clickhouse database. It takes no configuration.
Core
The core orchestrates the remaining components. It receives the flows from the flow component, adds some information using the GeoIP databases and the SNMP poller, and pushes the resulting flow to Kafka.
The following keys are accepted:
workers defines how many workers should be spawned to process incoming flows
sampler-classifiers is a list of classifier rules to define a group for samplers
interface-classifiers is a list of classifier rules to define connectivity type, network boundary and provider for an interface
classifier-cache-size defines the size of the classifier cache. As classifiers are pure, their results are cached. The metrics should tell if the cache is big enough. It should be set to at least twice the number of the busiest interfaces.
Classifier rules are written using expr.
Sampler classifiers get the sampler IP address and its hostname.
If they can make a decision, they should invoke one of the
Classify() functions with the target group as an argument. Calling
this function makes the sampler part of the provided group. Evaluation
of rules stops on the first match. The accessible variables and functions
are:
Sampler.IP for the sampler IP address
Sampler.Name for the sampler name
Classify() to classify sampler to a group
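As a sketch of how such rules might appear in the configuration, the fragment below lists two sampler classifiers. The naming scheme and group names are hypothetical, and writing each rule as a quoted YAML string is an assumption. Since evaluation stops on the first match, the order of the rules matters.

```yaml
core:
  sampler-classifiers:
    # Hypothetical naming scheme: samplers named "edge..." join the "edge" group.
    - 'Sampler.Name startsWith "edge" && Classify("edge")'
    # Only evaluated when the previous rule did not match.
    - 'Sampler.Name endsWith ".lab.example.com" && Classify("lab")'
```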
Interface classifiers get the following information and, like sampler
classifiers, should invoke one of the Classify() functions to make a
decision:
Sampler.IP for the sampler IP address
Sampler.Name for the sampler name
Interface.Name for the interface name
Interface.Description for the interface description
Interface.Speed for the interface speed
ClassifyConnectivity() to classify for a connectivity type (transit, PNI, PPNI, IX, customer, core, ...)
ClassifyProvider() to classify for a provider (Cogent, Telia, ...)
ClassifyExternal() to classify the interface as external
ClassifyInternal() to classify the interface as internal
Once an interface is classified for a given criterion, it cannot be changed by a later rule. Once an interface is classified for all criteria, remaining rules are skipped. Connectivity and provider are somewhat normalized (lowercased).
Each Classify() function, with the exception of ClassifyExternal()
and ClassifyInternal(), has a variant ending with Regex which
takes a string and a regex before the original string and does a regex
match. The original string is expanded using the matching parts of the
regex. The syntax is the one from Go.
Here is an example:
Interface.Description startsWith "Transit:" &&
ClassifyConnectivity("transit") &&
ClassifyExternal() &&
ClassifyProviderRegex(Interface.Description, "^Transit: ([^ ]+)", "$1")
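Placed in the configuration file, the rule above might look like the following sketch of a core section. The worker count, cache size, sampler naming scheme and the use of YAML block scalars for multi-line rules are assumptions made for illustration. The sampler rule uses the ClassifyRegex() variant to derive the group from the part of the name before the first dash.

```yaml
core:
  workers: 2
  # Illustrative size; should be at least twice the number of the busiest interfaces.
  classifier-cache-size: 1000
  sampler-classifiers:
    # Hypothetical naming scheme: "edge-1.example.com" is put in group "edge".
    - 'ClassifyRegex(Sampler.Name, "^([a-z]+)-", "$1")'
  interface-classifiers:
    # A block scalar avoids YAML quoting issues with the colon in "Transit:".
    - |
      Interface.Description startsWith "Transit:" &&
      ClassifyConnectivity("transit") &&
      ClassifyExternal() &&
      ClassifyProviderRegex(Interface.Description, "^Transit: ([^ ]+)", "$1")
```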