Commit Graph

9 Commits

Author SHA1 Message Date
Vincent Bernat
3291abe680 outlet/kafka: delay shutdown when stopping worker after closing client
Some checks failed
CI / 🤖 Check dependabot status (push) Has been cancelled
CI / 🐧 Test on Linux (${{ github.ref_type == 'tag' }}, misc) (push) Has been cancelled
CI / 🐧 Test on Linux (coverage) (push) Has been cancelled
CI / 🐧 Test on Linux (regular) (push) Has been cancelled
CI / ❄️ Build on Nix (push) Has been cancelled
CI / 🍏 Build and test on macOS (push) Has been cancelled
CI / 🧪 End-to-end testing (push) Has been cancelled
CI / 🔍 Upload code coverage (push) Has been cancelled
CI / 🔬 Test only Go (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 20) (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 22) (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 24) (push) Has been cancelled
CI / ⚖️ Check licenses (push) Has been cancelled
CI / 🐋 Build Docker images (push) Has been cancelled
CI / 🐋 Tag Docker images (push) Has been cancelled
CI / 🚀 Publish release (push) Has been cancelled
Flushing can take some time and we have an heartbeat to respect to
commit offsets.
2025-11-13 22:53:34 +01:00
Vincent Bernat
f4875ed7b3 outlet/kafka: execute shutdown before committing work
Some checks failed
CI / 🤖 Check dependabot status (push) Has been cancelled
CI / 🐧 Test on Linux (${{ github.ref_type == 'tag' }}, misc) (push) Has been cancelled
CI / 🐧 Test on Linux (coverage) (push) Has been cancelled
CI / 🐧 Test on Linux (regular) (push) Has been cancelled
CI / ❄️ Build on Nix (push) Has been cancelled
CI / 🍏 Build and test on macOS (push) Has been cancelled
CI / 🧪 End-to-end testing (push) Has been cancelled
CI / 🔍 Upload code coverage (push) Has been cancelled
CI / 🔬 Test only Go (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 20) (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 22) (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 24) (push) Has been cancelled
CI / ⚖️ Check licenses (push) Has been cancelled
CI / 🐋 Build Docker images (push) Has been cancelled
CI / 🐋 Tag Docker images (push) Has been cancelled
CI / 🚀 Publish release (push) Has been cancelled
Update Nix dependency hashes / Update dependency hashes (push) Has been cancelled
And add a bit more logging to understand what happens on shutdown.
2025-11-13 20:39:59 +01:00
Vincent Bernat
3a34495b70 outlet/kafka: be more aggressive when scaling up/down workers
Some checks failed
CI / 🤖 Check dependabot status (push) Has been cancelled
CI / 🐧 Test on Linux (${{ github.ref_type == 'tag' }}, misc) (push) Has been cancelled
CI / 🐧 Test on Linux (coverage) (push) Has been cancelled
CI / 🐧 Test on Linux (regular) (push) Has been cancelled
CI / ❄️ Build on Nix (push) Has been cancelled
CI / 🍏 Build and test on macOS (push) Has been cancelled
CI / 🧪 End-to-end testing (push) Has been cancelled
CI / 🔍 Upload code coverage (push) Has been cancelled
CI / 🔬 Test only Go (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 20) (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 22) (push) Has been cancelled
CI / 🔬 Test only JS (${{ needs.dependabot.outputs.package-ecosystem }}, 24) (push) Has been cancelled
CI / ⚖️ Check licenses (push) Has been cancelled
CI / 🐋 Build Docker images (push) Has been cancelled
CI / 🐋 Tag Docker images (push) Has been cancelled
CI / 🚀 Publish release (push) Has been cancelled
Update Nix dependency hashes / Update dependency hashes (push) Has been cancelled
Update Go toolchain / Update Go toolchain (push) Has been cancelled
Update Nix flake.lock / Update Nix lockfile (asn2org) (push) Has been cancelled
Update Nix flake.lock / Update Nix lockfile (nixpkgs) (push) Has been cancelled
Use dichotomy to quickly reach the optimal. This avoid too much Kafka
rebalances on big setups.
2025-10-19 21:27:42 +02:00
Vincent Bernat
3374da6693 outlet/kafka: fix off-by-one scaling logic
And add a test for it.
2025-10-19 19:26:54 +02:00
Vincent Bernat
866658bc70 outlet/kafka: fix crash when scaling down and up the workers
The same metrics cannot be registered twice. Introduce a new method in
reporter to unregister a previously registered collector.

Fix #1908
2025-08-27 08:28:14 +02:00
Vincent Bernat
fa11e7de6d common/reporter: simplify interface for collecting metrics
Remove unused methods and always collect scoped metrics. As a
side-effect, BioRIS gRPC metrics are now correctly scoped.
2025-08-27 07:37:38 +02:00
Vincent Bernat
ea97eb1b9b outlet/kafka: fix spelling in a log message 2025-08-16 21:36:03 +02:00
Vincent Bernat
b71ed907e8 outlet/kafka: collect metrics have creating new client
We cannot do that before, otherwise, we run into a race condition as the
list of metrics is not protected inside kprom franz-go plugin.

Also do that for inlet/kafka, even if it is less important there, since
there is only one client.
2025-08-09 15:58:25 +02:00
Vincent Bernat
e5a625aecf outlet: make the number of Kafka workers dynamic
Inserting into ClickHouse should be done in large batches to minimize
the number of parts created. This would require the user to tune the
number of Kafka workers to match a target of around 50k-100k rows. Instead,
we dynamically tune the number of workers depending on the load to reach
this target.

We keep using async if we are too low in number of flows.

It is still possible to do better by consolidating batches from various
workers, but that's something I wanted to avoid.

Also, increase the maximum wait time to 5 seconds. It should be good
enough for most people.

Fix #1885
2025-08-09 15:58:25 +02:00