# Configuration

*Akvorado* can be configured through a YAML file. Each aspect is
configured through a different section:

- `reporting`: [Log and metric reporting](#reporting)
- `http`: [Builtin HTTP server](#http)
- `web`: [Web interface](#web)
- `flow`: [Flow ingestion](#flow)
- `snmp`: [SNMP poller](#snmp)
- `geoip`: [GeoIP database](#geoip)
- `kafka`: [Kafka broker](#kafka)
- `clickhouse`: [ClickHouse helper](#clickhouse)
- `core`: [Core](#core)

You can get the default configuration with `./akvorado --dump
--check`. Durations can be written in seconds or using strings like
`10h20m`.

It is also possible to override configuration settings using
environment variables. You need to remove any `-` from key names and
use `_` to handle nesting. Then, put `AKVORADO_` as a prefix. For
example, let's consider the following configuration file:

```yaml
kafka:
  topic: test-topic
  topic-configuration:
    num-partitions: 1
  brokers:
    - 192.0.2.1:9092
    - 192.0.2.2:9092
```

It can be translated to:

```sh
AKVORADO_KAFKA_TOPIC=test-topic
AKVORADO_KAFKA_TOPICCONFIGURATION_NUMPARTITIONS=1
AKVORADO_KAFKA_BROKERS=192.0.2.1:9092,192.0.2.2:9092
```

## Reporting

Reporting encompasses logging and metrics. Currently, as *Akvorado* is
expected to be run inside Docker, logging is done on the standard
output and is not configurable. As for metrics, they are reported by
the HTTP component on the `/api/v0/metrics` endpoint and there is
nothing to configure either.

## HTTP

The builtin HTTP server serves various pages. Its configuration
supports only the `listen` key to specify the address and port to
listen on. For example:

```yaml
http:
  listen: 0.0.0.0:8000
```

## Web

The web interface presents the landing page of *Akvorado*. It also
embeds the documentation. It accepts only the following key:

- `grafanaurl` to specify the URL to Grafana, exposed as
  [`/grafana`](/grafana)
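
For example, with a hypothetical Grafana instance:

```yaml
web:
  grafanaurl: http://grafana.example.net:3000
```
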
## Flow

The flow component handles flow ingestion. It supports the following
configuration keys:

- `inputs` to specify the list of inputs

Each input should define a `type` and `decoder`. For `decoder`, only
`netflow` is currently supported. As for the `type`, both `udp` and
`file` are supported.

For the UDP input, the supported keys are `listen` to set the
listening endpoint, `workers` to set the number of workers listening
to the socket, `receive-buffer` to set the size of the kernel's
incoming buffer for each listening socket, and `queue-size` to define
the number of messages to buffer inside each worker. For example:

```yaml
flow:
  inputs:
    - type: udp
      decoder: netflow
      listen: 0.0.0.0:2055
      workers: 3
```

The `file` input should only be used for testing. It supports a
`paths` key to define the files to read from. These files are injected
continuously in the pipeline. For example:

```yaml
flow:
  inputs:
    - type: file
      decoder: netflow
      paths:
        - /tmp/flow1.raw
        - /tmp/flow2.raw
      workers: 2
```
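
Since `inputs` is a list, several inputs can be declared at once. Here
is a sketch with two UDP listeners on hypothetical ports:

```yaml
flow:
  inputs:
    - type: udp
      decoder: netflow
      listen: 0.0.0.0:2055
      workers: 3
    - type: udp
      decoder: netflow
      listen: 0.0.0.0:2056
      workers: 3
```
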
## SNMP

Flows only include interface indexes. To associate them with an
interface name and description, SNMP is used to poll the exporter
sending each flow. (This is the terminology for IPFIX; with NetFlow,
"sampler" is ambiguous as it can mean the exporter or the sampler
inside the exporter.) A cache is maintained to avoid continuously
polling the exporters. The following keys are accepted:

- `cache-duration` tells how much time to keep data in the cache
- `cache-refresh` tells how much time to wait before updating an entry
  by polling it
- `cache-check-interval` tells how often to check if cached data is
  about to expire or needs an update
- `cache-persist-file` tells where to store cached data on shutdown and
  read it back on startup
- `default-community` tells which community to use when polling exporters
- `communities` is a map from an exporter IP address to the community to
  use for that exporter, overriding the default value set above
- `poller-retries` is the number of retries on unsuccessful SNMP requests
- `poller-timeout` tells how much time the poller should wait for an answer
- `workers` tells how many workers to spawn to handle SNMP polling

As flows missing interface information are discarded, persisting the
cache is useful to quickly be able to handle incoming flows. By
default, no persistent cache is configured.
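
For example, here is a sketch using only the keys documented above
(addresses, communities, durations and paths are hypothetical):

```yaml
snmp:
  cache-duration: 2h
  cache-refresh: 1h
  cache-persist-file: /var/lib/akvorado/snmp.cache
  default-community: public
  communities:
    192.0.2.1: mysecretcommunity
  workers: 10
```
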
## GeoIP

The GeoIP component adds source and destination country, as well as
the AS number of the source and destination IP if they are not present
in the received flows. It needs two databases using the [MaxMind DB
file format][], one for AS numbers, one for countries. If no database
is provided, the component is inactive. It accepts the following keys:

- `asn-database` tells the path to the ASN database
- `country-database` tells the path to the country database

[MaxMind DB file format]: https://maxmind.github.io/MaxMind-DB/

If the files are updated while *Akvorado* is running, they are
automatically refreshed.
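
For example, with hypothetical database paths:

```yaml
geoip:
  asn-database: /usr/share/GeoIP/GeoLite2-ASN.mmdb
  country-database: /usr/share/GeoIP/GeoLite2-Country.mmdb
```
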
## Kafka

Received flows are exported to a Kafka topic using the [protocol
buffers format][]. The definition file is `flow/flow-*.proto`. Each
flow is written in the [length-delimited format][].

[protocol buffers format]: https://developers.google.com/protocol-buffers
[length-delimited format]: https://cwiki.apache.org/confluence/display/GEODE/Delimiting+Protobuf+Messages

The following keys are accepted:

- `topic` tells which topic to use to write messages
- `topic-configuration` contains the topic configuration
- `brokers` specifies the list of brokers to use to bootstrap the
  connection to the Kafka cluster
- `version` tells which minimal version of Kafka to expect
- `usetls` tells if TLS should be used for the connection
  (authentication is not supported)
- `flush-interval` defines the maximum flush interval to send received
  flows to Kafka
- `flush-bytes` defines the maximum number of bytes to store before
  flushing flows to Kafka
- `max-message-bytes` defines the maximum size of a message (it should
  be equal to or smaller than the same setting in the broker
  configuration)
- `compression-codec` defines the compression codec to use to compress
  messages (`none`, `gzip`, `snappy`, `lz4` or `zstd`)
- `queue-size` defines the size of the internal queues to send
  messages to Kafka. Increasing this value will improve performance,
  at the cost of losing messages in case of problems.
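
For example, here is a sketch focusing on the connection-related keys
(broker addresses and values are hypothetical):

```yaml
kafka:
  topic: flows
  brokers:
    - 192.0.2.1:9092
    - 192.0.2.2:9092
  version: 2.8.1
  compression-codec: zstd
  flush-interval: 10s
```
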
The topic name is suffixed by the version of the schema. For example,
if the configured topic is `flows` and the current schema version is
0, the topic used to send received flows will be `flows-v0`.

If no topic configuration is provided, the topic should already exist
in Kafka. If a configuration is provided, the topic is created if it
does not exist or updated if it does. Currently, updating the number
of partitions or the replication factor is not possible. The following
keys are accepted for the topic configuration:

- `num-partitions` for the number of partitions
- `replication-factor` for the replication factor
- `config-entries` is a mapping from configuration names to their values

For example:

```yaml
kafka:
  topic: test-topic
  topic-configuration:
    num-partitions: 1
    replication-factor: 1
    config-entries:
      segment.bytes: 1073741824
      retention.ms: 86400000
      cleanup.policy: delete
```

## ClickHouse

The ClickHouse component exposes some useful HTTP endpoints to
configure a ClickHouse database. Optionally, it will also provision
and keep up-to-date a ClickHouse database. In this case, the following
keys should be provided:

- `servers` defines the list of ClickHouse servers to connect to
- `username` is the username to use for authentication
- `password` is the password to use for authentication
- `database` defines the database to use to create tables
- `akvorado-url` defines the URL of Akvorado to be used by ClickHouse
  (autodetection when not specified)
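
For example, a sketch with hypothetical connection settings:

```yaml
clickhouse:
  servers:
    - 192.0.2.10:9000
  username: default
  password: changeme
  database: default
```
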
## Core

The core orchestrates the remaining components. It receives the flows
from the flow component, adds some information using the GeoIP
databases and the SNMP poller, and pushes the resulting flows to
Kafka.

The following keys are accepted:

- `workers` defines how many workers should be spawned to process
  incoming flows
- `exporter-classifiers` is a list of classifier rules to define a group
  for exporters
- `interface-classifiers` is a list of classifier rules to define
  connectivity type, network boundary and provider for an interface
- `classifier-cache-size` defines the size of the classifier cache. As
  classifiers are pure, their results are cached. The metrics should
  tell if the cache is big enough. It should be set to at least twice
  the number of the busiest interfaces.

Classifier rules are written using [expr][].

Exporter classifiers get the exporter IP address and its hostname. If
they can make a decision, they should invoke one of the `Classify()`
functions with the target group as an argument. Calling this function
makes the exporter part of the provided group. Evaluation of rules
stops on the first match. The accessible variables and functions are:

- `Exporter.IP` for the exporter IP address
- `Exporter.Name` for the exporter name
- `Classify()` to classify the exporter to a group
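
For example, here is a sketch assigning a group based on a
hypothetical hostname convention, with a catch-all rule:

```yaml
core:
  exporter-classifiers:
    - Exporter.Name endsWith ".paris" && Classify("paris")
    - Classify("other")
```
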
Interface classifiers get the following information and, like exporter
classifiers, should invoke one of the `Classify()` functions to make a
decision:

- `Exporter.IP` for the exporter IP address
- `Exporter.Name` for the exporter name
- `Interface.Name` for the interface name
- `Interface.Description` for the interface description
- `Interface.Speed` for the interface speed
- `ClassifyConnectivity()` to classify for a connectivity type
  (transit, PNI, PPNI, IX, customer, core, ...)
- `ClassifyProvider()` to classify for a provider (Cogent, Telia, ...)
- `ClassifyExternal()` to classify the interface as external
- `ClassifyInternal()` to classify the interface as internal

Once an interface is classified for a given criterion, it cannot be
changed by a later rule. Once an interface is classified for all
criteria, the remaining rules are skipped. Connectivity and provider
values are somewhat normalized (lowercased).

Each `Classify()` function, with the exception of `ClassifyExternal()`
and `ClassifyInternal()`, has a variant ending with `Regex`, which
takes a string and a regex before the original string and does a regex
match. The original string is expanded using the matching parts of the
regex. The syntax is the one [from Go][].

Here is an example:

```
Interface.Description startsWith "Transit:" &&
ClassifyConnectivity("transit") &&
ClassifyExternal() &&
ClassifyProviderRegex(Interface.Description, "^Transit: ([^ ]+)", "$1")
```
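
In the configuration file, such a rule is written as a list item under
`interface-classifiers`. Here is a sketch reusing the rule above (the
`workers` value is hypothetical):

```yaml
core:
  workers: 4
  interface-classifiers:
    - >-
      Interface.Description startsWith "Transit:" &&
      ClassifyConnectivity("transit") &&
      ClassifyExternal() &&
      ClassifyProviderRegex(Interface.Description, "^Transit: ([^ ]+)", "$1")
```
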
[expr]: https://github.com/antonmedv/expr/blob/master/docs/Language-Definition.md
[from Go]: https://pkg.go.dev/regexp#Regexp.Expand