Commit Graph

162 Commits

Author SHA1 Message Date
Vincent Bernat
e2f1df9add tests: replace godebug by go-cmp for structure diffs
go-cmp is stricter and catches more problems. Moreover, the
output is a bit nicer.
2025-08-23 16:03:09 +02:00
Vincent Bernat
5f7de0a16c docs: document the metric about buffer size 2025-08-17 16:16:20 +02:00
Vincent Bernat
08f64a9cd3 inlet/flow: test and report UDP buffer sizes 2025-08-17 15:41:44 +02:00
Vincent Bernat
736c4da8a0 outlet/routing: add an option to tune TCP receive buffer for BMP
The default value is quite low. This is a bit of a stopgap. The
alternative would be to maintain a circular buffer of the same size
inside the outlet for each connection and ensure there is no lock in the
path. But doing it in the kernel means almost no code, even if it is a
bit complex for the user.

Fix #1461
2025-08-17 15:13:49 +02:00
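To illustrate the kernel-side approach described in 736c4da8a0, here is a minimal sketch of requesting a larger TCP receive buffer on an accepted connection with the standard library's `(*net.TCPConn).SetReadBuffer`. The helper names `acceptWithReadBuffer` and `demo` and the 1 MiB size are assumptions made for the example; this is not the code from the commit.

```go
package main

import (
	"fmt"
	"net"
)

// acceptWithReadBuffer accepts one connection and asks the kernel for a
// receive buffer of the given size in bytes (SO_RCVBUF on Linux).
// Hypothetical helper, not taken from the commit.
func acceptWithReadBuffer(ln *net.TCPListener, size int) (*net.TCPConn, error) {
	conn, err := ln.AcceptTCP()
	if err != nil {
		return nil, err
	}
	if err := conn.SetReadBuffer(size); err != nil {
		conn.Close()
		return nil, err
	}
	return conn, nil
}

// demo opens a loopback listener on an ephemeral port, connects to it and
// tunes the accepted side of the connection.
func demo(size int) error {
	ln, err := net.ListenTCP("tcp", &net.TCPAddr{IP: net.IPv4(127, 0, 0, 1)})
	if err != nil {
		return err
	}
	defer ln.Close()
	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return err
	}
	defer client.Close()
	conn, err := acceptWithReadBuffer(ln, size)
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	if err := demo(1 << 20); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("receive buffer requested")
}
```

On Linux, the kernel may cap the requested size (see the `net.core.rmem_max` sysctl), which is part of what makes this knob a bit complex for the user.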
Vincent Bernat
98eb1bdba5 chore: make a run of gofumpt 2025-08-05 06:21:34 +02:00
Vincent Bernat
bde9cb3b64 chore: Netflow → NetFlow
As with ClickHouse/clickhouse, we keep using netflow when it is not
capitalized.
2025-07-31 09:14:02 +02:00
Vincent Bernat
18beb310ee chore: replace interface{} with any 2025-07-29 07:42:49 +02:00
Vincent Bernat
4c0b15e1cd inlet/outlet: rename a few metrics
For example:

```
 17:35 ❱ curl -s 127.0.0.1:8080/api/v0/outlet/metrics | promtool check metrics
akvorado_outlet_core_classifier_exporter_cache_size_items counter metrics should have "_total" suffix
akvorado_outlet_core_classifier_interface_cache_size_items counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_records_sum counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_records_sum non-histogram and non-summary metrics should not have "_sum" suffix
akvorado_outlet_flow_decoder_netflow_flowset_sum counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_sum non-histogram and non-summary metrics should not have "_sum" suffix
akvorado_outlet_kafka_buffered_fetch_records_total non-counter metrics should not have "_total" suffix
akvorado_outlet_kafka_buffered_produce_records_total non-counter metrics should not have "_total" suffix
akvorado_outlet_metadata_cache_refreshs counter metrics should have "_total" suffix
akvorado_outlet_routing_provider_bmp_peers_total non-counter metrics should not have "_total" suffix
akvorado_outlet_routing_provider_bmp_routes_total non-counter metrics should not have "_total" suffix
```

Also ensure that metrics using errors as labels do not have too high a
cardinality by using constants for the error messages.
2025-07-27 21:44:28 +02:00
Vincent Bernat
756e4a8fbd */kafka: switch to franz-go
The concurrency of this library is easier to handle than Sarama.
Notably, it is more compatible with the new model of "almost share
nothing" we use for the inlet and the outlet. The lock for workers in
the outlet is removed. We can now use sync.Pool to allocate slices of
bytes in the inlet.

It may also be more performant.

In the future, we may want to commit only when pushing data to
ClickHouse. However, this does not seem easy when there is a rebalance.
In case of rebalance, we need to do something when a partition is
revoked to avoid duplicating data. For example, we could flush the
current batch to ClickHouse. Have a look at the
`example/mark_offsets/main.go` file in the franz-go repository for a
possible approach. In the meantime, we rely on autocommit.

Another contender could be https://github.com/segmentio/kafka-go. Also
see https://github.com/twmb/franz-go/pull/1064.
2025-07-27 21:44:28 +02:00
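As a rough illustration of the `sync.Pool` usage mentioned in 756e4a8fbd, the sketch below recycles byte slices used to hold raw datagrams. The names `payloadPool`, `copyPayload` and `release`, as well as the 9000-byte capacity, are assumptions made for the example and are not taken from the inlet code.

```go
package main

import (
	"fmt"
	"sync"
)

// payloadPool hands out reusable byte slices. Pointers are stored to avoid an
// extra allocation when the slice header is converted to an interface.
var payloadPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 9000) // hypothetical capacity
		return &b
	},
}

// copyPayload copies a datagram into a pooled buffer.
func copyPayload(datagram []byte) *[]byte {
	buf := payloadPool.Get().(*[]byte)
	*buf = append((*buf)[:0], datagram...)
	return buf
}

// release returns the buffer to the pool. It must only be called once
// nothing references the buffer content anymore.
func release(buf *[]byte) {
	payloadPool.Put(buf)
}

func main() {
	buf := copyPayload([]byte("raw flow datagram"))
	fmt.Printf("%d bytes buffered\n", len(*buf))
	release(buf)
}
```

The delicate part of such a design is knowing when the buffer can be released: the producer must be done with the bytes before they go back to the pool.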
Vincent Bernat
e49a744a6d build: use vtprotobuf to speed up protobuf marshal/unmarshal
There is still room for improvement. For the inlet, it would require
knowing when Kafka has sent the message (which means enabling the return
of successes). For the outlet, it should be possible to reuse the same
flow (with a ResetVT between each use).
2025-07-27 21:44:28 +02:00
Vincent Bernat
ac68c5970e inlet: split inlet into new inlet and outlet
This change splits the inlet component into a simpler inlet and a new
outlet component. The new inlet component receives flows and puts them in
Kafka, unparsed. The outlet component takes them from Kafka, resumes
the processing from there (flow parsing, enrichment) and puts them in
ClickHouse.

The main goal is to ensure the inlet does minimal work so that it is not
late when processing packets (and restarts faster). It also brings some
simplification as the number of knobs to tune everything is reduced: for
inlet, we only need to tune the queue size for UDP, the number of
workers and a few Kafka parameters; for outlet, we need to tune a few
Kafka parameters, the number of workers and a few ClickHouse parameters.

The outlet component features a simple Kafka input component. The core
component becomes just a callback function. There is also a new
ClickHouse component to push data to ClickHouse using the low-level
ch-go library with batch inserts.

This processing has an impact on the internal representation of a
FlowMessage. Previously, it was tailored to dynamically build the
protobuf message to be put in Kafka. Now, it builds the batch request to
be sent to ClickHouse. This makes the FlowMessage structure hide the
content of the next batch request and therefore, it should be reused.
This also changes the way we decode flows, as decoders don't output a
FlowMessage anymore: they reuse one that is provided to each worker.

The ClickHouse tables are slightly updated. Instead of the Kafka
engine, the Null engine is used.

Fix #1122
2025-07-27 21:44:28 +02:00
Vincent Bernat
7be4a5f424 inlet/flow: add a test to decode IPFIX NAT fields
Unfortunately, they are discrete events. 0 packets, 0 bytes. We can't
use them much.
2025-06-09 08:02:56 +02:00
Vincent Bernat
3ee5aea894 tests: use b.Loop() instead of range b.N for benchmarks
See https://go.dev/blog/testing-b-loop
2025-05-25 15:16:23 +02:00
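For readers unfamiliar with the change in 3ee5aea894, here is a small sketch of a benchmark written with `b.Loop()`, which requires Go 1.24 or later. The benchmark body (`strings.Join`) and the name `benchmarkJoin` are placeholders chosen for the example; `testing.Benchmark` is only used so that the snippet runs outside of `go test`.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// benchmarkJoin uses b.Loop() instead of "for range b.N". Setup done before
// the loop is excluded from the timing without calling b.ResetTimer().
func benchmarkJoin(b *testing.B) {
	parts := strings.Split(strings.Repeat("a,", 100), ",") // setup
	for b.Loop() {
		_ = strings.Join(parts, ",")
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	result := testing.Benchmark(benchmarkJoin)
	fmt.Println("iterations:", result.N)
}
```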
Vincent Bernat
dd78410c75 inlet/flow: fix description for input/file
It is not reading pcap files!
2025-05-16 20:31:59 +02:00
Vincent Bernat
233865cfe0 inlet/flow: fix description of Start for file input 2025-05-16 20:27:09 +02:00
Vincent Bernat
c0671ca2fe inlet/flow: keep the inner VLAN as the one to use for SrcVlan
2025-04-29 07:33:31 +02:00
Vincent Bernat
ae1686abd6 inlet/flow: don't override flow-provided VLANs with VLAN from Ethernet 2025-04-29 07:19:16 +02:00
Vincent Bernat
88087809dd inlet/flow: decode destination BGP communities in sFlow packets 2025-01-18 19:29:55 +01:00
Vincent Bernat
84d51e0ca9 inlet/flow: use enumer for TimestampSource 2024-11-23 23:48:02 +01:00
Vincent Bernat
e578942d1e inlet/flow: do not increase decoding error when template is missing
This is confusing.
2024-11-16 14:34:27 +01:00
Vincent Bernat
46af028c0c inlet/flow: fix decoding of QinQ in Ethernet packets
Inner VLAN is the definitive one.
2024-10-31 08:19:37 +01:00
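The rule stated in 46af028c0c ("Inner VLAN is the definitive one") can be sketched as follows. This is an illustration only, assuming a raw Ethernet frame with 802.1ad/802.1Q tags; the function name `innerVLAN` and the constants are made up for the example and do not come from the decoder.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	etherTypeDot1Q  = 0x8100 // 802.1Q tag
	etherTypeDot1AD = 0x88a8 // 802.1ad service tag (QinQ outer tag)
)

// innerVLAN walks the VLAN tags that follow the MAC addresses of an Ethernet
// frame and returns the VLAN ID of the last (innermost) tag, if any.
func innerVLAN(frame []byte) (vlan uint16, ok bool) {
	offset := 12 // skip destination and source MAC addresses
	for offset+4 <= len(frame) {
		etherType := binary.BigEndian.Uint16(frame[offset:])
		if etherType != etherTypeDot1Q && etherType != etherTypeDot1AD {
			break
		}
		// The low 12 bits of the tag control information are the VLAN ID.
		vlan = binary.BigEndian.Uint16(frame[offset+2:]) & 0x0fff
		ok = true
		offset += 4
	}
	return vlan, ok
}

func main() {
	frame := make([]byte, 12, 22)                 // zeroed MAC addresses
	frame = append(frame, 0x88, 0xa8, 0x00, 0x64) // outer tag, VLAN 100
	frame = append(frame, 0x81, 0x00, 0x00, 0xc8) // inner tag, VLAN 200
	frame = append(frame, 0x08, 0x00)             // IPv4 EtherType
	fmt.Println(innerVLAN(frame))                 // prints "200 true"
}
```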
Vincent Bernat
fc80080931 inlet/flow: fix test failing after latest commit
One of the tests was not getting an AS path in the extended gateway
section of the sFlow packet.
2024-08-30 10:25:37 +02:00
Vincent Bernat
1f7471a926 inlet/flow: don't consider we got an AS path if it's empty
Some equipment may send an empty AS path there. See:
 https://github.com/akvorado/akvorado/discussions/1369
2024-08-29 09:25:42 +02:00
Vincent Bernat
a449736a62 build: use Go 1.22 range over ints
Done with:

```
git grep -l 'for.*:= 0.*++' \
  | xargs sed -i -E 's/for (.*) := 0; \1 < (.*); \1\+\+/for \1 := range \2/'
```

And a few manual fixes due to unused variables. There is something fishy
in the BMP RIB test. Add a comment about that. The rewrite is not
equivalent there (with range, random is evaluated once, while in the
original loop, it is evaluated at each iteration). I believe the intent
was to behave as with range.
2024-08-14 10:11:35 +02:00
Vincent Bernat
0239cd0a9f common: remove MarshalJSON helpers for mapstructure
They are not needed anymore since we don't exchange configuration files
using JSON, since baac495b9c.
2024-07-20 14:51:40 +02:00
Vincent Bernat
83a46e36c7 inlet/flow: remove old schemas 2024-06-09 20:08:49 +02:00
Vincent Bernat
51404d5d11 inlet/flow: do not ask for hardware timestamps for UDP input
This is useless as this also needs to be enabled with the SIOCSHWTSTAMP
ioctl. This requires CAP_NET_ADMIN and we would need to guess the
physical interface. Too much trouble.
2024-06-04 14:08:43 +02:00
Vincent Bernat
638f8ebe21 inlet/flow: add a test for decoding without a template 2024-05-19 08:58:55 +02:00
Vincent Bernat
9d7e0637c1 inlet/flow: support for NetFlow v5 2024-05-18 19:30:56 +02:00
Vincent Bernat
8f829e95f1 inlet/flow: reorganize decoding a bit
This should enable adding support for NetFlow v5.
2024-05-18 16:23:39 +02:00
Vincent Bernat
52d79153c4 inlet/flow: don't log missing template as an error
It's still available as a metric for people reading the documentation. At
least, it will reduce the number of support requests about that.
2024-05-18 12:06:21 +02:00
Saku Ytti
459e51396f Fix #1189 2024-04-24 14:07:20 +02:00
Vincent Bernat
7601a28515 inlet/flow: fix parsing of sampling rate with "packet interval"
Fix #1189
2024-04-24 09:44:19 +02:00
Vincent Bernat
00e8989500 inlet/flow: only use IPFIX_FIELD_* instead of NFV9_FIELD_* consts
There is a backward compatibility for anything in the 1-127 range and it
is clearer to only use the IPFIX names.
2024-04-24 09:15:29 +02:00
Vincent Bernat
217a3c488c docs: add missing license for some files 2024-04-05 07:43:51 +02:00
Jordan Barnartt
10f062beed Look at IPFIX octets field for ColumnBytes 2024-04-04 23:28:42 +02:00
Vincent Bernat
7977704e3a inlet/flow: run go fmt 2024-03-31 09:10:34 +02:00
Vincent Bernat
0bd259bdd6 docs: document latest change 2024-03-30 22:13:01 +01:00
netixx
c2b3cae237 Allow using fields of the netflow packet to set the flow TimeReceived
Today, the timestamp can only be the one put on the UDP packet by the
kernel.

I propose to add 2 alternative methods of getting the timestamp for NetFlow/IPFIX packets:
- TimestampSourceNetflowPacket: use the timestamp field in the NetFlow packet itself
- TimestampSourceNetflowFirstSwitched: use the FirstSwitched field from each flow
(the field is actually in uptime, so we need to shift it according to sysUptime)

Using those fields requires the router to have accurate time (probably NTP),
but it allows for architectures where a UDP packet is not immediately
received by the collector, e.g. if there is a Kafka in between.
That in turn allows doing maintenance on the collector
without messing up the statistics.
2024-03-30 22:01:40 +01:00
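The shift "according to sysUptime" mentioned in c2b3cae237 can be sketched as below, assuming a NetFlow v9 style header that carries an export time in Unix seconds and a sysUptime in milliseconds, with FirstSwitched expressed on the same uptime clock. The function name `firstSwitchedTime` and its parameters are hypothetical and this is not the code from the commit.

```go
package main

import (
	"fmt"
	"time"
)

// firstSwitchedTime converts a FirstSwitched value (milliseconds of uptime)
// into wall-clock time using the export time and sysUptime of the packet.
// The unsigned subtraction also gives the right age if the uptime counter
// wrapped between the start of the flow and the export.
func firstSwitchedTime(exportUnixSecs, sysUptimeMs, firstSwitchedMs uint32) time.Time {
	exported := time.Unix(int64(exportUnixSecs), 0).UTC()
	age := time.Duration(sysUptimeMs-firstSwitchedMs) * time.Millisecond
	return exported.Add(-age)
}

func main() {
	// The flow started 15 seconds before the packet was exported.
	t := firstSwitchedTime(1700000000, 60000, 45000)
	fmt.Println(t.Unix()) // prints 1699999985
}
```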
Vincent Bernat
b3a9f6ab2e chore: remove unused parameters
They were not detected by revive in function literals.
2024-02-08 08:30:33 +01:00
Vincent Bernat
a5521ac052 inlet/flow: remove 1 allocation when walking flowsets 2023-12-20 08:38:11 +01:00
Vincent Bernat
2ee77f264a inlet/flow: don't embed error message in label
Instead, log at most 3 errors per 30 seconds.

Fix #999
2023-12-16 20:18:14 +01:00
Vincent Bernat
2fe74ad43a inlet/flow: handle templates containing multiple sampling rates
Fix #997
2023-12-16 20:18:04 +01:00
Vincent Bernat
5e982db8d5 inlet/flow: use per-version observation domain IDs 2023-12-16 20:18:04 +01:00
Vincent Bernat
acc48c0094 inlet/flow: add test for per-flow sampling rates 2023-12-16 20:18:04 +01:00
Vincent Bernat
71b20f3d26 inlet/flow: update to latest version of GoFlow2
There is no change in performance:

```
goos: linux
goarch: amd64
pkg: akvorado/inlet/flow
cpu: AMD Ryzen 5 5600X 6-Core Processor
BenchmarkDecodeEncodeNetflow/with_encoding-12             155505              8059 ns/op            8178 B/op        130 allocs/op
BenchmarkDecodeEncodeNetflow/without_encoding-12          147974              7554 ns/op            8178 B/op        130 allocs/op
BenchmarkDecodeEncodeSflow/with_encoding-12               126746              9463 ns/op            7200 B/op         90 allocs/op
BenchmarkDecodeEncodeSflow/without_encoding-12            140703              8686 ns/op            7200 B/op         90 allocs/op
```
2023-12-12 23:29:21 +01:00
Vincent Bernat
4f043c5822 inlet/flow: do not decode L4 if IP packet is fragmented 2023-12-12 20:45:04 +01:00
Vincent Bernat
8ecc6b9570 inlet: add some test coverage around MPLS parsing 2023-11-25 20:34:45 +01:00
Vincent Bernat
82051b552f inlet: decode MPLS labels
They are stored in an array and there are some aliases to get the first,
second and third labels. Support for sFlow would need a test to ensure it
works as expected.

Fix #960
2023-11-25 20:34:45 +01:00
Marvin Gaube
f55a097968 feat: add decoded flows/queue ingest metric for inlet/udp 2023-11-17 13:11:12 +01:00