Commit Graph

278 Commits

Author SHA1 Message Date
Vincent Bernat
e68b2de72c common/helpers: migrate from verify to skip-verify in TLS config
Otherwise, the default is "false" for verify. This is a breaking change.

Fix #2055.
2025-10-30 08:31:27 +01:00
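One reading of the motivation for this rename, sketched in Go: with a boolean named `verify`, the zero value means "do not verify", whereas with `skip-verify` the zero value keeps verification on. The `TLSOptions` struct and `toTLSConfig` helper below are hypothetical names for illustration, not the project's actual types.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// TLSOptions is a hypothetical configuration struct (assumption, not the
// project's real type).
type TLSOptions struct {
	// SkipVerify disables certificate verification when true.
	// Its zero value (false) keeps verification enabled.
	SkipVerify bool
}

// toTLSConfig converts the hypothetical options to a crypto/tls configuration.
func toTLSConfig(o TLSOptions) *tls.Config {
	return &tls.Config{InsecureSkipVerify: o.SkipVerify}
}

func main() {
	var unset TLSOptions // nothing configured by the user
	fmt.Println(toTLSConfig(unset).InsecureSkipVerify)
	fmt.Println(toTLSConfig(TLSOptions{SkipVerify: true}).InsecureSkipVerify)
}
```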
Vincent Bernat
a2339312ac common/remotedatasource: accept specific TLS configuration
2025-10-29 22:34:38 +01:00
Vincent Bernat
ea3d5a9f28 common/remotedatasource: accept empty transform query 2025-10-29 22:04:37 +01:00
Gregor Düster
73d005d229 outlet/flow: implement RFC 5103 support 2025-09-29 05:37:19 +02:00
Vincent Bernat
a1e29071a9 common/schema: add ability to reverse flow direction 2025-09-29 05:37:19 +02:00
Vincent Bernat
801f3f1676 common/kafka: also log output of kfake cluster 2025-09-23 07:06:58 +02:00
oliverpool
f3ebff6ce5 refactor: use sync.OnceValue 2025-09-22 14:03:44 +02:00
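For readers unfamiliar with `sync.OnceValue` (available since Go 1.21), here is a minimal sketch of the pattern. The `loadConfig` initializer and the `calls` counter are hypothetical and only serve to show that the wrapped function runs once.

```go
package main

import (
	"fmt"
	"sync"
)

// calls counts how many times the wrapped function actually runs.
var calls int

// loadConfig is a hypothetical expensive initializer. sync.OnceValue
// runs it on first use and caches the result for later calls.
var loadConfig = sync.OnceValue(func() string {
	calls++
	return "loaded"
})

func main() {
	a, b := loadConfig(), loadConfig()
	fmt.Println(a, b, calls)
}
```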
Vincent Bernat
86f2c9b1fa Revert "common/daemon: test if sending SIGINT terminates the component"
This reverts commit 7859cb0019. This seems
to break randomly when the process is encapsulated (during coverage?).
2025-09-19 21:53:01 +02:00
Vincent Bernat
75f6908f71 Revert "common/daemon: add a method to reexec itself"
This reverts commit 216786d40e. We don't
really need that if we move logic to cmd package.
2025-09-19 21:53:01 +02:00
Vincent Bernat
9f34f9caf9 common/helpers: return files parsed by yaml.UnmarshalWithInclude 2025-09-19 21:53:01 +02:00
Vincent Bernat
fa9315b9e1 common/daemon: add a method to reexec itself 2025-09-19 21:53:01 +02:00
Vincent Bernat
f735e46665 common/daemon: test if sending SIGINT terminates the component 2025-09-19 21:53:01 +02:00
Vincent Bernat
0db8f43b10 common/httpserver: do not connect to Redis before starting component
We need to be able to dump the configuration without redis being present.
2025-09-16 21:50:09 +02:00
Vincent Bernat
970fad2e47 common/schema: fix comment about ConsoleTruncateIP 2025-09-13 07:20:48 +02:00
Vincent Bernat
31b6591e0e build: update revive
And remove some unused variables.
2025-09-09 07:39:00 +02:00
Vincent Bernat
8a38e2a912 common/helpers: switch to go.yaml.in/yaml/v3
gopkg.in/yaml.v3 is now unmaintained.
2025-09-09 07:24:42 +02:00
Vincent Bernat
93ae69ad9e inlet/flow: switch to binary.NativeEndian to get native endianness
This is available since Go 1.21 and it is better than enumerating
architectures. We were not up-to-date. See for example:

https://cs.opensource.google/go/go/+/refs/tags/go1.25.1:src/encoding/binary/native_endian_big.go
2025-09-07 10:18:26 +02:00
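A minimal sketch of what `binary.NativeEndian` offers: it reads and writes values in the host byte order without per-architecture build tags. The `encodePort` and `decodePort` helpers are hypothetical names for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodePort stores a 16-bit value in the host's native byte order.
func encodePort(v uint16) []byte {
	buf := make([]byte, 2)
	binary.NativeEndian.PutUint16(buf, v)
	return buf
}

// decodePort reads a 16-bit value stored in the host's native byte order.
func decodePort(b []byte) uint16 {
	return binary.NativeEndian.Uint16(b)
}

func main() {
	fmt.Println(decodePort(encodePort(8080)))
}
```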
Vincent Bernat
e21e612259 common/helpers: use Modify() for subnet maps as well
Replace `table.Update()` with `table.Modify()`.
2025-09-06 20:01:19 +02:00
Vincent Bernat
fdb65c93a5 outlet/routing: store v4 routes into a v4 tree
This improves performance significantly:

```
goos: linux
goarch: amd64
pkg: akvorado/outlet/routing/provider/bmp
cpu: AMD Ryzen 5 5600X 6-Core Processor
                                       │      1       │                  2                   │
                                       │  sec/route   │  sec/route    vs base                │
RIBInsertion/1000_routes,_1_peers-12     466.6n ± ∞ ¹   413.6n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/1000_routes,_2_peers-12     468.5n ± ∞ ¹   424.6n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/1000_routes,_5_peers-12     475.0n ± ∞ ¹   419.6n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/10000_routes,_1_peers-12    485.3n ± ∞ ¹   434.1n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/10000_routes,_2_peers-12    532.6n ± ∞ ¹   477.0n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/10000_routes,_5_peers-12    585.6n ± ∞ ¹   551.9n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/100000_routes,_1_peers-12   623.8n ± ∞ ¹   587.7n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/100000_routes,_2_peers-12   682.1n ± ∞ ¹   637.8n ± ∞ ¹       ~ (p=1.000 n=1) ²
RIBInsertion/100000_routes,_5_peers-12   804.9n ± ∞ ¹   740.8n ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                  559.6n         510.1n        -8.85%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                    │       1       │                   2                   │
                                    │    sec/op     │    sec/op     vs base                 │
RIBLookup/1000_routes,_1_peers-12      82.87n ± ∞ ¹   14.59n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/1000_routes,_2_peers-12      82.86n ± ∞ ¹   14.68n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/1000_routes,_5_peers-12      83.24n ± ∞ ¹   14.56n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/10000_routes,_1_peers-12     87.27n ± ∞ ¹   14.69n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/10000_routes,_2_peers-12     89.92n ± ∞ ¹   14.62n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/10000_routes,_5_peers-12     99.67n ± ∞ ¹   14.74n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/100000_routes,_1_peers-12   129.60n ± ∞ ¹   14.68n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/100000_routes,_2_peers-12   121.50n ± ∞ ¹   14.71n ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBLookup/100000_routes,_5_peers-12   122.90n ± ∞ ¹   14.69n ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                98.40n         14.66n        -85.10%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                   │      1       │                   2                   │
                                   │    ms/op     │    ms/op      vs base                 │
RIBFlush/1000_routes,_1_peers-12     268.9m ± ∞ ¹   214.4m ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/1000_routes,_2_peers-12     457.2m ± ∞ ¹   357.8m ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/1000_routes,_5_peers-12     954.7m ± ∞ ¹   697.6m ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/10000_routes,_1_peers-12     2.832 ± ∞ ¹    2.157 ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/10000_routes,_2_peers-12     5.660 ± ∞ ¹    4.247 ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/10000_routes,_5_peers-12     14.00 ± ∞ ¹    10.48 ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/100000_routes,_1_peers-12    48.33 ± ∞ ¹    41.31 ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/100000_routes,_2_peers-12    86.33 ± ∞ ¹    75.51 ± ∞ ¹        ~ (p=1.000 n=1) ²
RIBFlush/100000_routes,_5_peers-12    197.5 ± ∞ ¹    155.7 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                               6.534          5.138        -21.36%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
```

Suggested in https://github.com/gaissmai/bart/issues/247#issuecomment-3257156436.
2025-09-05 20:49:16 +02:00
Vincent Bernat
b1d6382585 common/embed: replace all go:embed use by an embedded archive
Some of the files were quite big:

- asns.csv ~ 3 MB
- index.js ~ 1.5 MB
- *.svg ~ 2 MB

Use a ZIP archive to hold them all and embed it. This reduces the binary
size from 89 MB to 82 MB. 🤯

This also brings in some code modernization (use of http.ServeFileFS).
2025-09-03 00:00:05 +02:00
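The general idea can be sketched as follows, assuming the archive bytes are already in memory: `*zip.Reader` implements `fs.FS`, so files can be read from it with the standard `io/fs` helpers. The archive is built at run time here only to keep the sketch self-contained; in a real program it would presumably come from a `//go:embed` directive. `buildArchive` and `readFromArchive` are hypothetical helpers, not the project's code.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"io/fs"
)

// buildArchive creates a ZIP archive in memory from name/content pairs.
func buildArchive(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, content := range files {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, content); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readFromArchive opens the archive as an fs.FS and reads one file.
func readFromArchive(archive []byte, name string) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return "", err
	}
	data, err := fs.ReadFile(zr, name)
	return string(data), err
}

func main() {
	archive, err := buildArchive(map[string]string{"index.js": "console.log(1)"})
	if err != nil {
		panic(err)
	}
	content, err := readFromArchive(archive, "index.js")
	fmt.Println(content, err)
}
```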
Vincent Bernat
413f923dcc docker: update ClickHouse to 25.8 2025-08-30 23:46:17 +02:00
Vincent Bernat
bd787e654f common/daemon: experiment with the test/synctest package 2025-08-30 19:48:52 +02:00
Vincent Bernat
77306ddcee build: bump go.mod to Go 1.25 to use wg.Go() 2025-08-30 19:15:26 +02:00
Vincent Bernat
866658bc70 outlet/kafka: fix crash when scaling down and up the workers
The same metrics cannot be registered twice. Introduce a new method in
reporter to unregister a previously registered collector.

Fix #1908
2025-08-27 08:28:14 +02:00
Vincent Bernat
fa11e7de6d common/reporter: simplify interface for collecting metrics
Remove unused methods and always collect scoped metrics. As a
side-effect, BioRIS gRPC metrics are now correctly scoped.
2025-08-27 07:37:38 +02:00
Vincent Bernat
e2f1df9add tests: replace godebug by go-cmp for structure diffs
go-cmp is stricter and allows catching more problems. Moreover, the
output is a bit nicer.
2025-08-23 16:03:09 +02:00
Vincent Bernat
59215899fc common/reporter: when running benchmarks, set log level to warning 2025-08-17 11:06:07 +02:00
Vincent Bernat
f7cc5e3dbc orchestrator/clickhouse: add a benchmark for networks.csv
```
goos: linux
goarch: amd64
pkg: akvorado/orchestrator/clickhouse
cpu: AMD Ryzen 5 5600X 6-Core Processor
BenchmarkNetworks-12                 482                 2.447 ms/op
```
2025-08-17 11:05:58 +02:00
Vincent Bernat
6b2af58a64 orchestrator/geoip: add a benchmark for Iterate*Databases()
Now:

```
goos: linux
goarch: amd64
pkg: akvorado/orchestrator/geoip
cpu: AMD Ryzen 7 PRO 6850U with Radeon Graphics
BenchmarkIterDatabase/ASN-16                3376               457.0 ns/entry
BenchmarkIterDatabase/GeoIP-16              2410               754.5 ns/entry
```

Before 0a10764cc9:

```
goos: linux
goarch: amd64
pkg: akvorado/orchestrator/geoip
cpu: AMD Ryzen 7 PRO 6850U with Radeon Graphics
BenchmarkIterDatabase/ASN-16                2863               609.3 ns/entry
BenchmarkIterDatabase/GeoIP-16              3286               719.3 ns/entry
```

I was hoping for a bit more!
2025-08-17 08:48:25 +02:00
Vincent Bernat
6118bb7aac common/helpers: convert SubnetMap to github.com/gaissmai/bart
I did not benchmark it myself, but it was benchmarked here:
 https://github.com/osrg/gobgp/issues/1414#issuecomment-3067255941

Of course, no guarantee that this benchmark matches our use cases.
Moreover, SubnetMap has been optimized to avoid parsing keys all
the time.

Also, the interface is a bit nicer and it uses netip.Prefix directly.

The next step is to convert outlet/routing/provider/bmp.
2025-08-16 09:38:44 +02:00
Vincent Bernat
3e68a41f57 docker: for dev, separate standalone ClickHouse setup from cluster
This way, there is no need to start a whole cluster just to work on a
single ClickHouse. Also add some hints in CONTRIBUTING.md.
2025-08-08 08:55:29 +02:00
Vincent Bernat
98eb1bdba5 chore: make a run of gofumpt 2025-08-05 06:21:34 +02:00
Vincent Bernat
a248997454 chore: more staticcheck fixes 2025-08-02 21:10:06 +02:00
Vincent Bernat
03b947e3c5 chore: fix many staticcheck warnings
The most important ones were fixed in the two previous commits.
2025-08-02 20:54:49 +02:00
Vincent Bernat
75b2d4821a common/reporter: avoid allocating on the stack with sync.Pool
Always return a pointer.
2025-08-02 20:18:12 +02:00
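As general background on "always return a pointer": storing a non-pointer value in a `sync.Pool` boxes it into an interface, which typically costs an allocation on each `Put`. A minimal sketch of the pointer-based pattern, with a hypothetical `render` function:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool stores *bytes.Buffer values. New returns a pointer so that
// Get and Put never box a struct value into an interface.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render formats a message using a pooled buffer (hypothetical example).
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf)
	buf.Reset() // a pooled buffer may hold data from a previous use
	buf.WriteString("hello ")
	buf.WriteString(name)
	return buf.String()
}

func main() {
	fmt.Println(render("akvorado"))
}
```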
Vincent Bernat
3d01a68bcb common/helpers: cache skip decision when requiring external services 2025-07-30 08:19:00 +02:00
Vincent Bernat
a70029a4cd orchestrator/clickhouse: also guess the port when guessing HTTP URL 2025-07-30 08:11:28 +02:00
Vincent Bernat
8c85d54b3b common/remotedatasource: ensure we have at least one goroutine
Otherwise, Stop() will block.
2025-07-29 09:24:32 +02:00
Vincent Bernat
0aef1503a8 common/remotedatasource: disable the regular ticker on failure 2025-07-29 08:37:50 +02:00
Vincent Bernat
19d07d350c common/remotedatasource: add a Stop() method
This is cleaner. We can't use it for the static provider as we
cannot stop a provider.
2025-07-29 08:36:16 +02:00
Vincent Bernat
1a160c83b5 common/remotedatasource: move errors higher in the file
Otherwise, I am always confused about where the New() function is.
2025-07-29 08:35:47 +02:00
Vincent Bernat
aeb102c748 outlet/metadata: do not start fetcher for static until first query
We don't want initialization to spawn goroutines, all the more so
since we don't stop them.
2025-07-29 08:29:57 +02:00
Vincent Bernat
239bf33f3a common/remotedatasource: make the test a bit more robust
We may have a 404 if the test is too slow.
2025-07-29 08:00:28 +02:00
Vincent Bernat
5e669db4b3 chore: use errors.New() instead of fmt.Errorf() 2025-07-29 07:42:49 +02:00
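A minimal sketch of the distinction behind this kind of cleanup: `errors.New` suits constant messages, while `fmt.Errorf` remains useful when the message has formatting verbs or wraps another error. `ErrNotReady`, `fetch` and `isNotReady` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotReady is a hypothetical sentinel error with a constant message,
// so errors.New is enough.
var ErrNotReady = errors.New("source not ready")

// fetch wraps the sentinel with context. fmt.Errorf is appropriate here
// because the message uses formatting verbs.
func fetch(name string) error {
	return fmt.Errorf("fetch %q: %w", name, ErrNotReady)
}

// isNotReady reports whether err wraps ErrNotReady.
func isNotReady(err error) bool {
	return errors.Is(err, ErrNotReady)
}

func main() {
	err := fetch("asns")
	fmt.Println(err, isNotReady(err))
}
```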
Vincent Bernat
18beb310ee chore: replace interface{} with any 2025-07-29 07:42:49 +02:00
Vincent Bernat
fa7e4745b8 common/remotedatasource: be stricter on results from remote sources
Also:
 - don't return partial results (not used)
 - fix tests
 - add more tests
2025-07-29 07:25:42 +02:00
Vincent Bernat
cce61cb0d6 common/remotedatasource: rename from remotedatasourcefetcher
Also rename RemoteDataSource to Source.
2025-07-28 18:41:50 +02:00
Vincent Bernat
e20645c92e outlet/metadata: synchronous fetching of metadata
As we are not constrained by time that much in the outlet, we can
simplify the fetching of metadata by doing it synchronously. We still
keep the breaker design to avoid continuously polling a source that is
not responsive, so we can still lose some data if we are not able to
poll metadata. We also keep the background cache refresh. We also
introduce a grace time of 1 minute to avoid losing data during start.

For the static provider, we wait for the remote data sources to be
ready. For the gNMI provider, there are target windows of availability
during which the cached data can be polled. The SNMP provider is losing
its ability to coalesce requests.
2025-07-27 21:44:28 +02:00
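The breaker idea mentioned in this commit can be illustrated with a generic circuit breaker around a synchronous call. This is a sketch under stated assumptions, not the project's implementation: the `breaker` type, its thresholds, and the `demo` function are all hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errOpen = errors.New("breaker open")

// breaker is a hypothetical, single-goroutine circuit breaker.
type breaker struct {
	maxFailures int              // consecutive failures before opening
	cooldown    time.Duration    // how long to skip calls once open
	failures    int              // current consecutive failures
	openedAt    time.Time        // when the breaker last opened
	now         func() time.Time // injectable clock
}

// call runs fn synchronously unless the breaker is open.
func (b *breaker) call(fn func() error) error {
	if b.failures >= b.maxFailures {
		if b.now().Sub(b.openedAt) < b.cooldown {
			return errOpen // skip polling an unresponsive source
		}
		b.failures = b.maxFailures - 1 // half-open: allow one attempt
	}
	if err := fn(); err != nil {
		b.failures++
		if b.failures >= b.maxFailures {
			b.openedAt = b.now()
		}
		return err
	}
	b.failures = 0
	return nil
}

// demo exercises the breaker with a fake clock and returns the outcomes.
func demo() []string {
	clock := time.Unix(0, 0)
	b := &breaker{maxFailures: 2, cooldown: time.Minute,
		now: func() time.Time { return clock }}
	failing := func() error { return errors.New("timeout") }
	var out []string
	record := func(err error) {
		if err == nil {
			out = append(out, "ok")
		} else {
			out = append(out, err.Error())
		}
	}
	record(b.call(failing)) // first failure
	record(b.call(failing)) // second failure opens the breaker
	record(b.call(failing)) // skipped: breaker is open
	clock = clock.Add(2 * time.Minute)
	record(b.call(func() error { return nil })) // cooldown elapsed
	return out
}

func main() {
	fmt.Println(demo())
}
```

While the breaker is open, the caller gets an immediate error instead of waiting on the source, which is where data can still be lost.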
Vincent Bernat
4c0b15e1cd inlet/outlet: rename a few metrics
For example:

```
 17:35 ❱ curl -s 127.0.0.1:8080/api/v0/outlet/metrics | promtool check metrics
akvorado_outlet_core_classifier_exporter_cache_size_items counter metrics should have "_total" suffix
akvorado_outlet_core_classifier_interface_cache_size_items counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_records_sum counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_records_sum non-histogram and non-summary metrics should not have "_sum" suffix
akvorado_outlet_flow_decoder_netflow_flowset_sum counter metrics should have "_total" suffix
akvorado_outlet_flow_decoder_netflow_flowset_sum non-histogram and non-summary metrics should not have "_sum" suffix
akvorado_outlet_kafka_buffered_fetch_records_total non-counter metrics should not have "_total" suffix
akvorado_outlet_kafka_buffered_produce_records_total non-counter metrics should not have "_total" suffix
akvorado_outlet_metadata_cache_refreshs counter metrics should have "_total" suffix
akvorado_outlet_routing_provider_bmp_peers_total non-counter metrics should not have "_total" suffix
akvorado_outlet_routing_provider_bmp_routes_total non-counter metrics should not have "_total" suffix
```

Also ensure metrics using errors as labels don't have too high a
cardinality by using constants for the error messages.
2025-07-27 21:44:28 +02:00
Vincent Bernat
76151bea66 common/helpers: make some mapstructure hooks work with embedded structs
When using `mapstructure:",squash"`, most structure-specific hooks don't
dive into the structure as they are provided with the parent structure.
Add a helper to make them work on the embedded structure as well and
use it for the generic "deprecated fields" hook, but also for the hook
for the common Kafka configuration.

This is a bit brittle. There are other use cases, but they may not need
this change.
2025-07-27 21:44:28 +02:00