This works a bit like Traefik: we set `metrics.port` on each
container we want to scrape metrics from (and optionally
`metrics.path`).
Semi-related: we also rely on the exposed port for Traefik, and we
override it for every container to be sure the right one is selected.
This is less error-prone, as we need at least one exposed port and
some containers may not have any. The rule is simple: always set an
exposed port on a container that has metrics or Traefik rules.
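For illustration, a service definition could look roughly like this
(the service name, image, port, and path values are made up; only the
`metrics.port`/`metrics.path` labels and the exposed port are the
point):

```
services:
  some-service:
    image: example/some-service:latest  # illustrative image
    expose:
      - "8080"                 # the one port Traefik should select
    labels:
      metrics.port: "8080"     # scrape metrics from this port
      metrics.path: "/metrics" # optional scrape path
```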
The idea is that Alloy can also be used for more. For example, we
could introduce Loki (with a `docker-compose-loki.yml`) and it would
use Alloy too. The Alloy configuration needs to be split into several
parts: both `docker-compose-prometheus.yml` and
`docker-compose-loki.yml` would define the service, each adding a
volume for its specific part of the configuration (using the
`extends` mechanism).
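As a sketch of the mechanism (the volume paths are made up, the file
and service names follow the ones above), `docker-compose-loki.yml`
could extend the Alloy service from `docker-compose-prometheus.yml`
and only add its own configuration fragment:

```
services:
  alloy:
    extends:
      file: docker-compose-prometheus.yml
      service: alloy
    volumes:
      # Loki-specific fragment of the Alloy configuration; the path
      # is illustrative. Compose merges this list with the volumes
      # from the extended service.
      - ./config/alloy/loki.alloy:/etc/alloy/loki.alloy:ro
```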
However, we don't use the bundled Node Exporter, nor the bundled
cAdvisor. It is better to have individual components to avoid reduce the
amount of code with elevated privileges (both Node Exporter and cAdvisor
need specific privileges). Also, we keep Prometheus instead of switching
to the full Grafana stack with Mimir as it is a more common setup and
this is not a goal to provide something universally scalable.
Also, Prometheus is now behind the private endpoint, as it is
possible to send metrics to it.
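As a hedged sketch (the router name, the rule, and the `private`
entrypoint name are assumptions, not the actual setup), this amounts
to Traefik labels along these lines:

```
services:
  prometheus:
    image: prom/prometheus:latest
    labels:
      # Route Prometheus through the private entrypoint only; names
      # and rule are illustrative.
      traefik.enable: "true"
      traefik.http.routers.prometheus.rule: "PathPrefix(`/prometheus`)"
      traefik.http.routers.prometheus.entrypoints: "private"
```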
Docker can easily break the firewall rules in a way that makes
masquerading happen for internal traffic:
```
ip saddr 247.16.12.0/24 oifname != "br-65eaa81ed142" counter packets 812 bytes 132030 masquerade
ip saddr 247.16.12.0/24 oifname != "br-fa3db0ecc1de" counter packets 0 bytes 0 masquerade
ip saddr 247.16.12.0/24 oifname != "br-c7a7788478c5" counter packets 0 bytes 0 masquerade
```
When the "current" bridge is the second one, inter-container
communication gets masqueraded. I didn't find an associated issue.
Also, fix the configuration of node-exporter to really monitor the
host, and fix the Prometheus configuration, which had been broken
since we tried to monitor Traefik (in 8f73f70050).
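For reference, monitoring the host from a container requires settings
along these lines (this mirrors the upstream node-exporter Docker
example, not necessarily our exact compose file):

```
services:
  node-exporter:
    image: quay.io/prometheus/node-exporter:latest
    command:
      - '--path.rootfs=/host'
    pid: host             # see host processes, not the container's
    volumes:
      - /:/host:ro,rslave # expose the host filesystem read-only
```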
The apache/kafka image defines a volume under /var/lib/kafka/data,
which is created as an anonymous volume by Docker unless Docker
Compose mounts something at exactly that path.
This is unfortunately a breaking change.
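A minimal sketch of the fix, assuming a named volume (the volume name
and image tag are illustrative): the mount target must be exactly the
path declared by the image.

```
services:
  kafka:
    image: apache/kafka:latest  # illustrative tag
    volumes:
      # Must match the VOLUME path declared by the image, otherwise
      # Docker creates an anonymous volume there.
      - akvorado-kafka:/var/lib/kafka/data
volumes:
  akvorado-kafka:
```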
The default value is quite low. This is a bit of a stopgap: the
alternative would be to maintain a circular buffer of the same size
inside the outlet for each connection and ensure there is no lock in
the path. But doing it in the kernel means almost no code, even if it
is a bit complex for the user.
Fix #1461
Also, don't decode IPv4/IPv6 addresses when they are 0 (some
templates will include both). Decode dot1qVlanId and postDot1qVlanId
as well, but prefer vlanId and postVlanId when they are present.
Fix #1621
Inserting into ClickHouse should be done in large batches to minimize
the number of parts created. This would require the user to tune the
number of Kafka workers to match a target of around 50k-100k rows. Instead,
we dynamically tune the number of workers depending on the load to reach
this target.
We keep using asynchronous inserts if the number of flows is too low.
It is still possible to do better by consolidating batches from various
workers, but that's something I wanted to avoid.
Also, increase the maximum wait time to 5 seconds. It should be good
enough for most people.
Fix #1885
Also add documentation on how to use IPv6. The proposed setup relies
on NAT66, which is not ideal, but it works on any host with IPv6
connectivity. The documentation also explains how to configure routed
IPv6.
By using an IPv4 subnet in class E (240.0.0.0/4), we ensure that it
is very unlikely users will have an overlap between their Docker
setup and their production network. This way, there is no need to
change the Docker daemon configuration.
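Concretely, this is just a fixed subnet in the compose network
definition (the subnet below matches the one in the nftables output
above; any class E prefix would do):

```
networks:
  default:
    ipam:
      config:
        - subnet: 247.16.12.0/24  # class E, unlikely to collide
```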
The advice was not true: an active part is not one that should be
actively merged, it's one that is in use (and not about to be
deleted). ClickHouse copes fine with more than 10k parts.