Commit Graph

178 Commits

Author SHA1 Message Date
Vincent Bernat
621f9fd414 inlet/flow: enforce compatibility with Linux 4.19
The last two fields of `sk_reuseport_md` were added in Linux 5.14. We
don't use them, so it shouldn't matter. I remove them from `vmlinux.h`
to ensure compatibility. Also, adding
`__attribute__((preserve_access_index))` should make the program more
portable (BPF CO-RE).
2025-11-11 15:04:15 +01:00
Vincent Bernat
49b42f6055 inlet/flow: avoid repeating common socket options for Linux and others
Also, fix the description for `SO_REUSEADDR`.
2025-11-02 15:16:18 +01:00
Vincent Bernat
04d7af2e00 inlet/flow: display a better error message for eBPF EPERM
When we have a permission error, display a message about the missing BPF
capability instead of the misleading MEMLOCK one. Mention both.
2025-10-29 04:25:47 +01:00
Vincent Bernat
00264ed226 inlet/flow: remove dead check around eBPF
If err is nil, reuseportEBPFProgram cannot be nil.
2025-10-28 10:39:15 +01:00
Vincent Bernat
43ae8c8f35 build: ship eBPF programs for people without clang
Notably, in GitHub Actions, macOS does not have eBPF support compiled
in.
2025-10-28 09:45:51 +01:00
Vincent Bernat
90a46761af inlet/flow: use counter to load-balance incoming UDP flows with eBPF
The counter is per-CPU and it should be more performant than using a
random number. The test may be flaky if the test process migrates from
one CPU to another. Let's see how it goes.
2025-10-28 09:45:51 +01:00
Vincent Bernat
1fdf0c3f9f inlet/flow: use eBPF for per-packet load-balancing of incoming flows
By default, the 5-tuple is used to load balance flows. Exporters with
many flows are bound to a specific worker. Use eBPF to do a per-packet
load-balancing.

Currently, this is done randomly, but we will use a percpu counter in
the next commit. This will make the test easier too, maybe?

This should also enable graceful restart, but not with the current
Docker Compose setup: we would need to use host mode or spawn a new
inlet in the same network namespace as the old one. This does not look
very complex:

- spawn a new inlet in the same network namespace, but listening to a
  different HTTP port
- stop the previous inlet
- spawn a new inlet in the same network namespace
- stop the previous inlet

Alternatively, we could use SO_REUSEPORT for the HTTP socket too!
2025-10-28 09:45:51 +01:00
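The difference between 5-tuple balancing and the per-packet balancing described in 1fdf0c3f9f and 90a46761af can be illustrated with a small user-space simulation. This is only a sketch: the real selection happens in an eBPF program attached to the `SO_REUSEPORT` socket group, and `pickByHash`, `roundRobin`, and `distribute` are hypothetical names. FNV-1a stands in for the kernel's own hash, and a single counter stands in for the per-CPU one.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickByHash mimics the default behaviour: the worker is derived from a hash
// of the flow key, so every packet of a given flow reaches the same worker.
// FNV-1a is an assumption used for illustration, not the kernel's hash.
func pickByHash(flowKey string, workers int) int {
	h := fnv.New32a()
	h.Write([]byte(flowKey))
	return int(h.Sum32() % uint32(workers))
}

// roundRobin mimics per-packet balancing driven by a counter: each packet
// goes to the next worker, whatever flow it belongs to.
type roundRobin struct{ next uint32 }

func (r *roundRobin) pick(workers int) int {
	w := int(r.next % uint32(workers))
	r.next++
	return w
}

// distribute dispatches n packets using pick and returns the number of
// packets received by each worker.
func distribute(n, workers int, pick func() int) []int {
	counts := make([]int, workers)
	for i := 0; i < n; i++ {
		counts[pick()]++
	}
	return counts
}

func main() {
	const workers = 4
	flow := "udp 192.0.2.1:50000 -> 203.0.113.1:2055"
	rr := &roundRobin{}
	fmt.Println("hash:       ", distribute(8, workers, func() int { return pickByHash(flow, workers) }))
	fmt.Println("round-robin:", distribute(8, workers, func() int { return rr.pick(workers) }))
}
```

With a single busy exporter, the hash variant puts all eight packets on one worker, while the counter variant gives two packets to each of the four workers.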
Vincent Bernat
9d0574a64a inlet/flow: remove unused configuration setting queue-size for UDP 2025-10-27 23:52:55 +01:00
Vincent Bernat
10bf6a956b inlet/flow: add a test checking balancing between workers 2025-10-24 08:39:30 +02:00
Vincent Bernat
3603948433 inlet/flow: flow_input_udp_in_dropped_packets_total metric
`SO_RXQ_OVFL` returns the number of packets dropped by the socket since
its creation. Therefore, we should use a gauge, not a counter.

Fix #2017.
2025-10-10 20:39:16 +02:00
Vincent Bernat
9f594ef66c inlet/flow: make some of the socket options not fatal
If the kernel is too old for timestamping, it should not be fatal. I
prefer to not accept SO_TIMESTAMP_OLD as the size of the timestamp is
arch-dependent.

Fix #1978
2025-09-22 21:33:21 +02:00
Vincent Bernat
ecfa6fb373 inlet/flow: wait a bit longer to get back flows with UDP
There is no harm in waiting more as when the test works, the wait is
minimal.
2025-09-08 22:40:41 +02:00
Vincent Bernat
fc167f052c inlet/flow: use unsafe to cast data from the kernel
We don't need to use NativeEndian, we can just cast. The alignment is
ensured by CMSG_DATA macro, so it's safe even on archs not allowing
unaligned data access.

This way of doing things was one of the main reasons Go took so much time
to get binary.NativeEndian.
2025-09-07 11:21:28 +02:00
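As an illustration of the two approaches mentioned in fc167f052c and 93ae69ad9e, the sketch below reads a host-order `uint32` from a byte slice, once with an `unsafe` cast and once with `binary.NativeEndian` (Go 1.21+). The function names and `nativeBytes` are hypothetical; the helper only stands in for properly aligned control message data handed over by the kernel.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// uint32FromCast reinterprets the first four bytes of data as a uint32 in
// host byte order. The caller must guarantee that data is suitably aligned.
func uint32FromCast(data []byte) (uint32, bool) {
	if len(data) < 4 {
		return 0, false
	}
	return *(*uint32)(unsafe.Pointer(&data[0])), true
}

// uint32FromNativeEndian decodes the same value without unsafe.
func uint32FromNativeEndian(data []byte) (uint32, bool) {
	if len(data) < 4 {
		return 0, false
	}
	return binary.NativeEndian.Uint32(data), true
}

// nativeBytes returns the in-memory representation of v. The slice is backed
// by a uint32, so it is aligned for a 4-byte access on every architecture.
func nativeBytes(v uint32) []byte {
	backing := [1]uint32{v}
	return unsafe.Slice((*byte)(unsafe.Pointer(&backing[0])), 4)
}

func main() {
	data := nativeBytes(42)
	a, _ := uint32FromCast(data)
	b, _ := uint32FromNativeEndian(data)
	fmt.Println(a, b) // prints: 42 42
}
```

Both functions return the same value on any architecture; the cast avoids the decoding call at the price of relying on the alignment guarantee.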
Vincent Bernat
9aa267f1bc inlet/flow: ensure we request 64-bit timestamps from the kernel
This requires Linux 5.0+. On older kernels, we would just get no timestamp. This is
more correct this way, even if most people would run that on 64-bit
Linux and already get 64-bit timestamp.

We also don't use the nanosecond part as it is "long long" and should be
virtually 64-bit on all archs, this is not totally correct.
2025-09-07 11:14:12 +02:00
Vincent Bernat
93ae69ad9e inlet/flow: switch to binary.NativeEndian to get native endianness
This is available since Go 1.21 and it is better than enumerating
architectures. We were not up-to-date. See for example:

https://cs.opensource.google/go/go/+/refs/tags/go1.25.1:src/encoding/binary/native_endian_big.go
2025-09-07 10:18:26 +02:00
Vincent Bernat
c155d34a43 inlet/outlet: add compile-time interface implementation check
For "plugins" only.
2025-09-02 22:28:28 +02:00
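A compile-time interface implementation check in Go is usually written as an assignment to the blank identifier. The sketch below uses a hypothetical `Input` interface and `udpInput` type; the actual interfaces checked in c155d34a43 are not shown in this log.

```go
package main

import "fmt"

// Input is a hypothetical plugin interface.
type Input interface {
	Name() string
}

// udpInput is a hypothetical plugin implementing Input.
type udpInput struct{}

func (*udpInput) Name() string { return "udp" }

// Compile-time check: the build fails if *udpInput stops implementing Input.
// The assignment uses a typed nil pointer, so nothing is allocated at run time.
var _ Input = (*udpInput)(nil)

func main() {
	var in Input = &udpInput{}
	fmt.Println(in.Name()) // prints: udp
}
```

The check matters most for plugins that are only ever referenced through a registry or a configuration map, where a missing method would otherwise surface far from the type definition.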
Vincent Bernat
e2f1df9add tests: replace godebug by go-cmp for structure diffs
go-cmp is stricter and allows catching more problems. Moreover, the
output is a bit nicer.
2025-08-23 16:03:09 +02:00
Vincent Bernat
5f7de0a16c docs: document the metric about buffer size 2025-08-17 16:16:20 +02:00
Vincent Bernat
08f64a9cd3 inlet/flow: test and report UDP buffer sizes 2025-08-17 15:41:44 +02:00
Vincent Bernat
736c4da8a0 outlet/routing: add an option to tune TCP receive buffer for BMP
The default value is quite low. This is a bit of a stop gap. The
alternative would be to maintain a circular buffer of the same size
inside the outlet for each connection and ensure there is no lock in the
path. But doing it in the kernel means almost no code, even if it is a
bit complex for the user.

Fix #1461
2025-08-17 15:13:49 +02:00
Vincent Bernat
98eb1bdba5 chore: make a run of gofumpt 2025-08-05 06:21:34 +02:00
Vincent Bernat
bde9cb3b64 chore: Netflow → NetFlow
But like for ClickHouse/clickhouse, we keep using netflow when not
capitalized.
2025-07-31 09:14:02 +02:00
Vincent Bernat
18beb310ee chore: replace interface{} with any 2025-07-29 07:42:49 +02:00
Vincent Bernat
4c0b15e1cd inlet/outlet: rename a few metrics
For example:

```
 17:35 ❱ curl -s 127.0.0.1:8080/api/v0/outlet/metrics | promtool check metrics
akvorado_outlet_core_classifier_exporter_cache_size_items counter metrics should have "_total" suffix
akvorado_outlet_core_classifier_interface_cache_size_items counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_records_sum counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_records_sum non-histogram and non-summary metrics should not have "_sum" suffix
akvorado_outlet_flow_decoder_netflow_flowset_sum counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_sum non-histogram and non-summary metrics should not have "_sum" suffix
akvorado_outlet_kafka_buffered_fetch_records_total non-counter metrics should not have "_total" suffix
akvorado_outlet_kafka_buffered_produce_records_total non-counter metrics should not have "_total" suffix
akvorado_outlet_metadata_cache_refreshs counter metrics should have "_total" suffix
akvorado_outlet_routing_provider_bmp_peers_total non-counter metrics should not have "_total" suffix
akvorado_outlet_routing_provider_bmp_routes_total non-counter metrics should not have "_total" suffix
```

Also ensure metrics using errors as a label don't have too high a
cardinality by using constants for the error messages.
2025-07-27 21:44:28 +02:00
Vincent Bernat
756e4a8fbd */kafka: switch to franz-go
The concurrency of this library is easier to handle than Sarama.
Notably, it is more compatible with the new model of "almost share
nothing" we use for the inlet and the outlet. The lock for workers in
outlet is removed. We can now use sync.Pool to allocate byte slices
in inlet.

It may also be more performant.

In the future, we may want to commit only when pushing data to
ClickHouse. However, this does not seem easy when there is a rebalance.
In case of rebalance, we need to do something when a partition is
revoked to avoid duplicating data. For example, we could flush the
current batch to ClickHouse. Have a look at the
`example/mark_offsets/main.go` file in franz-go repository for a
possible approach. In the meantime, we rely on autocommit.

Another contender could be https://github.com/segmentio/kafka-go. Also
see https://github.com/twmb/franz-go/pull/1064.
2025-07-27 21:44:28 +02:00
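The use of `sync.Pool` for byte slices mentioned in 756e4a8fbd could look like the sketch below. It is not the code from the commit: `payloadPool`, `copyPayload`, `release`, and the 9000-byte capacity are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// maxPayload is a hypothetical upper bound for a received datagram.
const maxPayload = 9000

// payloadPool hands out reusable buffers. Pointers to slices are stored so
// that putting a buffer back does not allocate.
var payloadPool = sync.Pool{
	New: func() any {
		buf := make([]byte, 0, maxPayload)
		return &buf
	},
}

// copyPayload borrows a buffer from the pool and copies data into it.
func copyPayload(data []byte) *[]byte {
	buf := payloadPool.Get().(*[]byte)
	*buf = append((*buf)[:0], data...)
	return buf
}

// release truncates the buffer and returns it to the pool. The caller must
// not use the buffer afterwards.
func release(buf *[]byte) {
	*buf = (*buf)[:0]
	payloadPool.Put(buf)
}

func main() {
	buf := copyPayload([]byte("raw flow datagram"))
	fmt.Printf("%d bytes, capacity %d\n", len(*buf), cap(*buf))
	release(buf)
}
```

A pool like this only works when the code knows exactly when a buffer is no longer referenced, which is why the concurrency model of the Kafka client matters here.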
Vincent Bernat
e49a744a6d build: use vtprotobuf to speedup protobuf marshal/unmarshal
There is still room for improvement. For inlet, it would require knowing
when Kafka has sent the message (so enabling successes return). For
outlet, it should be possible to reuse the same flow (with a ResetVT
between each use).
2025-07-27 21:44:28 +02:00
Vincent Bernat
ac68c5970e inlet: split inlet into new inlet and outlet
This change splits the inlet component into a simpler inlet and a new
outlet component. The new inlet component receives flows and puts them in
Kafka, unparsed. The outlet component takes them from Kafka, resumes
the processing from there (flow parsing, enrichment), and puts them in
ClickHouse.

The main goal is to ensure the inlet does minimal work so it does not fall
behind when processing packets (and restarts faster). It also brings some
simplification as the number of knobs to tune everything is reduced: for
inlet, we only need to tune the queue size for UDP, the number of
workers and a few Kafka parameters; for outlet, we need to tune a few
Kafka parameters, the number of workers and a few ClickHouse parameters.

The outlet component features a simple Kafka input component. The core
component becomes just a callback function. There is also a new
ClickHouse component to push data to ClickHouse using the low-level
ch-go library with batch inserts.

This processing has an impact on the internal representation of a
FlowMessage. Previously, it was tailored to dynamically build the
protobuf message to be put in Kafka. Now, it builds the batch request to
be sent to ClickHouse. This makes the FlowMessage structure hide the
content of the next batch request and therefore, it should be reused.
This also changes the way we decode flows as they don't output
FlowMessage anymore, they reuse one that is provided to each worker.

The ClickHouse tables are slightly updated. The Null engine is used
instead of the Kafka engine.

Fix #1122
2025-07-27 21:44:28 +02:00
Vincent Bernat
7be4a5f424 inlet/flow: add a test to decode IPFIX NAT fields
Unfortunately, they are discrete events. 0 packets, 0 bytes. We can't
use them much.
2025-06-09 08:02:56 +02:00
Vincent Bernat
3ee5aea894 tests: use b.Loop() instead of range b.N for benchmarks
See https://go.dev/blog/testing-b-loop
2025-05-25 15:16:23 +02:00
Vincent Bernat
dd78410c75 inlet/flow: fix description for input/file
It is not reading pcap files!
2025-05-16 20:31:59 +02:00
Vincent Bernat
233865cfe0 inlet/flow: fix description of Start for file input 2025-05-16 20:27:09 +02:00
Vincent Bernat
c0671ca2fe inlet/flow: keep the inner VLAN as the one to use for SrcVlan
2025-04-29 07:33:31 +02:00
Vincent Bernat
ae1686abd6 inlet/flow: don't override flow-provided VLANs with VLAN from Ethernet 2025-04-29 07:19:16 +02:00
Vincent Bernat
88087809dd inlet/flow: decode destination BGP communities in sFlow packets 2025-01-18 19:29:55 +01:00
Vincent Bernat
84d51e0ca9 inlet/flow: use enumer for TimestampSource 2024-11-23 23:48:02 +01:00
Vincent Bernat
e578942d1e inlet/flow: do not increase decoding error when template is missing
This is confusing.
2024-11-16 14:34:27 +01:00
Vincent Bernat
46af028c0c inlet/flow: fix decoding of QinQ in Ethernet packets
Inner VLAN is the definitive one.
2024-10-31 08:19:37 +01:00
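The rule "inner VLAN is the definitive one" from 46af028c0c can be sketched as a small Ethernet header walk. This is a minimal illustration, not the decoder from the repository; `innerVLAN` is a hypothetical name and only the 0x8100 and 0x88a8 tag types are handled.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	etherTypeVLAN = 0x8100 // 802.1Q tag
	etherTypeQinQ = 0x88a8 // 802.1ad service tag
)

// innerVLAN walks the VLAN tags of an Ethernet frame and returns the VLAN ID
// of the innermost tag together with the EtherType of the payload. The VLAN
// ID is 0 for an untagged frame. ok is false when the frame is truncated.
func innerVLAN(frame []byte) (vlan uint16, etherType uint16, ok bool) {
	if len(frame) < 14 {
		return 0, 0, false
	}
	offset := 12 // skip destination and source MAC addresses
	etherType = binary.BigEndian.Uint16(frame[offset:])
	offset += 2
	for etherType == etherTypeVLAN || etherType == etherTypeQinQ {
		if len(frame) < offset+4 {
			return 0, 0, false
		}
		// Each tag overrides the previous one: the last one seen wins.
		vlan = binary.BigEndian.Uint16(frame[offset:]) & 0x0fff
		etherType = binary.BigEndian.Uint16(frame[offset+2:])
		offset += 4
	}
	return vlan, etherType, true
}

func main() {
	frame := []byte{
		0, 0, 0, 0, 0, 1, // destination MAC
		0, 0, 0, 0, 0, 2, // source MAC
		0x88, 0xa8, 0x00, 0x64, // outer tag, VLAN 100
		0x81, 0x00, 0x00, 0xc8, // inner tag, VLAN 200
		0x08, 0x00, // IPv4
	}
	fmt.Println(innerVLAN(frame)) // prints: 200 2048 true
}
```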
Vincent Bernat
fc80080931 inlet/flow: fix test failing after latest commit
One of the tests was not getting an AS path in the extended gateway
section of the sFlow packet.
2024-08-30 10:25:37 +02:00
Vincent Bernat
1f7471a926 inlet/flow: don't consider we got an AS path if it's empty
Some equipment may send an empty AS path there. See:
 https://github.com/akvorado/akvorado/discussions/1369
2024-08-29 09:25:42 +02:00
Vincent Bernat
a449736a62 build: use Go 1.22 range over ints
Done with:

```
git grep -l 'for.*:= 0.*++' \
  | xargs sed -i -E 's/for (.*) := 0; \1 < (.*); \1\+\+/for \1 := range \2/'
```

And a few manual fixes due to unused variables. There is something fishy
in BMP rib test. Add a comment about that. This is not equivalent (as
with range, random is evaluated once, while in the original loop, it is
evaluated at each iteration). I believe the intent was to behave like
with range.
2024-08-14 10:11:35 +02:00
Vincent Bernat
0239cd0a9f common: remove MarshalJSON helpers for mapstructure
They are not needed anymore since we don't exchange configuration files
using JSON, since baac495b9c.
2024-07-20 14:51:40 +02:00
Vincent Bernat
83a46e36c7 inlet/flow: remove old schemas 2024-06-09 20:08:49 +02:00
Vincent Bernat
51404d5d11 inlet/flow: do not ask for hardware timestamps for UDP input
This is useless as this also needs to be enabled with the SIOCSHWTSTAMP
ioctl. This requires CAP_NET_ADMIN and we would need to guess the
physical interface. Too much trouble.
2024-06-04 14:08:43 +02:00
Vincent Bernat
638f8ebe21 inlet/flow: add a test for decoding without a template 2024-05-19 08:58:55 +02:00
Vincent Bernat
9d7e0637c1 inlet/flow: support for NetFlow v5 2024-05-18 19:30:56 +02:00
Vincent Bernat
8f829e95f1 inlet/flow: reorganize a bit decoding
This should enable adding support for NetFlow v5.
2024-05-18 16:23:39 +02:00
Vincent Bernat
52d79153c4 inlet/flow: don't log missing template as an error
It's still available as metrics for people reading the documentation. At
least, it will reduce the number of support requests around that.
2024-05-18 12:06:21 +02:00
Saku Ytti
459e51396f Fix #1189 2024-04-24 14:07:20 +02:00
Vincent Bernat
7601a28515 inlet/flow: fix parsing of sampling rate with "packet interval"
Fix #1189
2024-04-24 09:44:19 +02:00
Vincent Bernat
00e8989500 inlet/flow: only use IPFIX_FIELD_* instead of NFV9_FIELD_* consts
There is backward compatibility for anything in the 1-127 range and it
is clearer to only use the IPFIX names.
2024-04-24 09:15:29 +02:00