This change splits the inlet component into a simpler inlet and a new
outlet component. The new inlet component receives flows and puts them
into Kafka, unparsed. The outlet component takes them from Kafka,
resumes the processing from there (flow parsing, enrichment), and puts
them into ClickHouse.
The main goal is to ensure the inlet does minimal work so it does not
fall behind when processing packets (and restarts faster). It also
brings some simplification, as the number of knobs to tune everything is
reduced: for the inlet, we only need to tune the UDP queue size, the
number of workers, and a few Kafka parameters; for the outlet, we need
to tune a few Kafka parameters, the number of workers, and a few
ClickHouse parameters.
The outlet component features a simple Kafka input component. The core
component becomes just a callback function. There is also a new
ClickHouse component to push data to ClickHouse using the low-level
ch-go library with batch inserts.
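For illustration, here is a minimal sketch of such a batch insert with
ch-go, assuming a hypothetical `flows_raw` table with only two columns
(the real schema has many more):

```go
package main

import (
	"context"
	"time"

	"github.com/ClickHouse/ch-go"
	"github.com/ClickHouse/ch-go/proto"
)

func main() {
	ctx := context.Background()
	conn, err := ch.Dial(ctx, ch.Options{Address: "127.0.0.1:9000"})
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Column buffers are filled by the workers, then sent as a single block.
	var (
		timeReceived proto.ColDateTime
		bytes        proto.ColUInt64
	)
	timeReceived.Append(time.Now())
	bytes.Append(1500)

	input := proto.Input{
		{Name: "TimeReceived", Data: &timeReceived},
		{Name: "Bytes", Data: &bytes},
	}
	if err := conn.Do(ctx, ch.Query{
		Body:  input.Into("flows_raw"), // INSERT INTO flows_raw (...) VALUES
		Input: input,
	}); err != nil {
		panic(err)
	}
}
```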
This processing has an impact on the internal representation of a
FlowMessage. Previously, it was tailored to dynamically building the
protobuf message to be put in Kafka. Now, it builds the batch request to
be sent to ClickHouse. This makes the FlowMessage structure hide the
content of the next batch request and, therefore, it should be reused.
This also changes the way we decode flows: decoders no longer output a
FlowMessage, they reuse the one provided to each worker.
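A rough sketch of the idea, with made-up names (not the actual akvorado
types): the structure hides the column buffers of the next batch request
and is reset and reused by each worker instead of being allocated per
flow.

```go
package outlet

import (
	"time"

	"github.com/ClickHouse/ch-go/proto"
)

// FlowMessage is a hypothetical, simplified stand-in for the real structure.
type FlowMessage struct {
	timeReceived proto.ColDateTime
	bytes        proto.ColUInt64
	exporter     proto.ColStr
}

// AppendFlow is called by a decoder for each decoded flow; the same
// FlowMessage is reused across flows by a given worker.
func (fm *FlowMessage) AppendFlow(t time.Time, bytes uint64, exporter string) {
	fm.timeReceived.Append(t)
	fm.bytes.Append(bytes)
	fm.exporter.Append(exporter)
}

// Input exposes the accumulated columns as the next batch request.
func (fm *FlowMessage) Input() proto.Input {
	return proto.Input{
		{Name: "TimeReceived", Data: &fm.timeReceived},
		{Name: "Bytes", Data: &fm.bytes},
		{Name: "ExporterName", Data: &fm.exporter},
	}
}

// Reset clears the buffers once the batch has been sent, so the same
// structure can be reused for the next one.
func (fm *FlowMessage) Reset() {
	fm.timeReceived.Reset()
	fm.bytes.Reset()
	fm.exporter.Reset()
}
```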
The ClickHouse tables are slightly updated. Instead of the Kafka engine,
the Null engine is used.
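A typical pattern with the Null engine, sketched with placeholder table
and column names (the actual tables differ): the outlet inserts into a
Null-engine table, which stores nothing by itself, and a materialized
view forwards the rows to the real table.

```go
package migrations

const (
	// Receives the inserts from the outlet; does not store anything.
	createRawTable = `
CREATE TABLE flows_raw (
  TimeReceived DateTime,
  Bytes UInt64
) ENGINE = Null`
	// Forwards every inserted row to the target table.
	createRawConsumer = `
CREATE MATERIALIZED VIEW flows_raw_consumer TO flows
AS SELECT * FROM flows_raw`
)
```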
Fix #1122
* fix: generation of protocols.csv file
* feat: generation of ports-tcp.csv and ports-udp.csv files
* build: add rules for creating udp and tcp csv files
* feat: create TCP and UDP dictionaries
* refactor: add replaceRegexpOne
* test: transform src port and dest port columns in SQL
* test: add TCP and UDP dictionaries for migration testing
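The generated dictionaries and the column transform listed above could
look roughly like the sketch below; the dictionary name, CSV path,
layout, and the `dictGetOrDefault` expression are assumptions for
illustration, not the actual migration code.

```go
package migrations

const (
	// TCP port-name dictionary backed by the generated CSV; UDP is analogous.
	createTCPDictionary = `
CREATE DICTIONARY tcp (
  port UInt64,
  name String
)
PRIMARY KEY port
SOURCE(FILE(path '/var/lib/clickhouse/user_files/ports-tcp.csv' format 'CSVWithNames'))
LIFETIME(0)
LAYOUT(HASHED())`
	// Possible transform for a source-port column in a SELECT.
	srcPortExpression = `dictGetOrDefault('tcp', 'name', toUInt64(SrcPort), toString(SrcPort))`
)
```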
Fix #605
All MergeTree tables are now replicated.
For some tables, a `_local` variant is added and the non-`_local`
variant is now distributed. The distributed tables are the `flows`
table, the `flows_DDDD` tables (where `DDDD` is a duration), as well as
the `flows_raw_errors` table. The `exporters` table is not distributed
and stays local.
The data follows this schema:
- data is coming from `flows_HHHH_raw` table, using the Kafka engine
- the `flows_HHHH_raw_consumer` reads data from `flows_HHHH_raw` (local)
and sends it to `flows` (distributed) when there is no error
- the `flows_raw_errors_consumer` reads data from
`flows_HHHH_raw` (local) and sends it to
`flows_raw_errors` (distributed)
- the `flows_DDDD_consumer` reads data from `flows_local` (local) and
  sends it to `flows_DDDD_local` (local)
- the `exporters_consumer` reads data from `flows` (distributed) and
sends it to `exporters` (local)
The reason for `flows_HHHH_raw_consumer` to send data to the distributed
`flows` table, and not the local one, is to ensure flows are
balanced (for example, when there are not enough Kafka partitions).
Sending them to `flows_local` would have been possible though.
On the other hand, it is important for `flows_DDDD_consumer` to read
from the local table to avoid duplication. It could have sent data to
the distributed table, but the data is already balanced correctly at
this point, so we just send it to the local one for better performance.
The `exporters_consumer` is allowed to read from the distributed `flows`
table because it writes the result to the local `exporters` table.
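A minimal sketch of how a distributed table wraps its replicated
`_local` variant; the cluster name, ZooKeeper path, macros, and columns
below are placeholders, not the actual akvorado schema.

```go
package migrations

const (
	// Replicated local table: each shard stores and replicates its own data.
	createFlowsLocal = `
CREATE TABLE flows_local ON CLUSTER akvorado (
  TimeReceived DateTime,
  Bytes UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/flows_local', '{replica}')
PARTITION BY toDate(TimeReceived)
ORDER BY TimeReceived`
	// Distributed table: same structure, spreads inserts across flows_local.
	createFlows = `
CREATE TABLE flows ON CLUSTER akvorado AS flows_local
ENGINE = Distributed(akvorado, currentDatabase(), flows_local, rand())`
)
```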
This does not seem to survive a restart. There is no indication in the
documentation that this is the right way: one should modify the settings
directly. I need to investigate how to do this properly with Docker.
We introduce a leaky abstraction for the flow schema and use it for
migrations as a first step.
For views and dictionaries, we stop relying on a hash to know if they
need to be recreated; instead, we compare the current SELECT statements
with our target statement. This is a bit fragile, but strictly better
than the hash.
For data tables, we add the missing columns.
We give up on the abstraction of a migration step and just rely on
helper functions to get the same result. The migration code is now
shorter and we don't need to update it when adding new columns.
This is preparatory work for #211, allowing a user to specify
additional fields to collect.
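A rough sketch of the SELECT comparison described above, assuming a
plain database/sql connection to ClickHouse and a naive whitespace
normalization (the helper and its behaviour are made up, not the actual
implementation):

```go
package migrations

import (
	"context"
	"database/sql"
	"strings"
)

// viewNeedsUpdate reports whether a materialized view has to be recreated by
// comparing its current SELECT statement with the target one.
func viewNeedsUpdate(ctx context.Context, db *sql.DB, view, targetSelect string) (bool, error) {
	row := db.QueryRowContext(ctx,
		"SELECT as_select FROM system.tables WHERE database = currentDatabase() AND name = ?",
		view)
	var currentSelect string
	if err := row.Scan(&currentSelect); err != nil {
		if err == sql.ErrNoRows {
			return true, nil // the view does not exist yet
		}
		return false, err
	}
	// Naive normalization: collapse all whitespace before comparing.
	normalize := func(s string) string { return strings.Join(strings.Fields(s), " ") }
	return normalize(currentSelect) != normalize(targetSelect), nil
}
```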
In aggregated tables, these columns were missing from the ORDER BY
clause. This means they were set to some random values. It is not
possible to fix that after their creation (see #60 for an attempt);
therefore, we have to drop and recreate the columns. This only affects
aggregated tables, not the main table, but nonetheless, unless you only
look at the last hour, the data is lost.
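A heavily hedged sketch of the drop-and-recreate step on one aggregated
table; the table name, column name, and existing sorting key are
invented for illustration.

```go
package migrations

const (
	// Drop the misplaced column first (it is not part of the sorting key yet).
	dropColumn = `ALTER TABLE flows_1h0m0s DROP COLUMN DstNetName`
	// Re-add it and extend the sorting key in the same ALTER: MODIFY ORDER BY
	// only accepts columns added by the same query. Existing rows get the
	// default value, hence the data loss mentioned above.
	addColumn = `
ALTER TABLE flows_1h0m0s
  ADD COLUMN DstNetName LowCardinality(String),
  MODIFY ORDER BY (TimeReceived, ExporterAddress, DstNetName)`
)
```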
ClickHouse does not allow more consumers than the number of physical
CPUs. Unless configured otherwise, the number of threads matches the
number of physical CPUs. We bound the number of consumers to this
number.
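A sketch of the bound, assuming the limit is read from the server's
max_threads setting via system.settings (how the real code obtains it
may differ):

```go
package migrations

import (
	"context"
	"database/sql"
	"strconv"
)

// maxKafkaConsumers caps the configured number of Kafka-engine consumers to
// the server's max_threads setting, which defaults to the number of physical
// CPUs.
func maxKafkaConsumers(ctx context.Context, db *sql.DB, configured int) (int, error) {
	var value string
	err := db.QueryRowContext(ctx,
		"SELECT value FROM system.settings WHERE name = 'max_threads'").Scan(&value)
	if err != nil {
		return 0, err
	}
	limit, err := strconv.Atoi(value)
	if err != nil || limit <= 0 {
		return configured, nil // "auto" or unparsable: keep the configured value
	}
	if configured > limit {
		return limit, nil
	}
	return configured, nil
}
```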
Fix #13
This is an attempt to downsample flows. However, we can only group by
a prefix of the primary key. Therefore, all downsampling intervals
would have to be encoded in the ORDER BY clause, which is not what we
want.
The idea is to not query the flows table unless absolutely necessary.
It would have been nice not to have this Date field, but rebuilding
the table is costly; we'll do that later when the table is smaller. We
will also need to use a small PARTITION BY.
Also remove some migrations not needed anymore.