ID: S202606091028
Status: imported
Tags: avans 2-4 LU1
Metrics Overview
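Application, .NET runtime, and HTTP client metrics flow through the OpenTelemetry Collector; RabbitMQ, MariaDB, container, and host metrics come from their own exporters. Prometheus scrapes all of them, and they are visible in Grafana at http://86.48.5.113:3000.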
Application Metrics (custom, via OpenTelemetry)
Emitted by the MessageForwardingService and exposed through the OTel Collector's Prometheus endpoint (otel-collector:8889).
Naming in Prometheus/Grafana: The OTel Collector adds a communicatie_ namespace prefix, converts "." to "_", appends _total to counters, and appends the unit (_milliseconds) to histograms.
Note: Counters only appear in Prometheus after they are first incremented. Send a test message through RabbitMQ to populate them.
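The naming rule above can be sketched as a small Python helper. This is an illustration of the rule as described in this note, not the Collector's actual implementation; the function name and its parameters are hypothetical. The check that avoids a doubled _total suffix is inferred from the mfs.db.writes.total row in the tables below.

```python
def to_prometheus_name(otel_name: str, kind: str, unit: str = "",
                       namespace: str = "communicatie") -> str:
    """Map an OTel instrument name to the Prometheus name listed in this note.

    kind is "counter" or "histogram"; unit is the histogram unit suffix
    (for example "milliseconds"). Both parameters are illustrative.
    """
    # Dots are not used in these Prometheus names, so they become underscores.
    name = otel_name.replace(".", "_")
    if kind == "histogram" and unit:
        # Histograms get the unit appended, e.g. _milliseconds.
        name = f"{name}_{unit}"
    elif kind == "counter" and not name.endswith("_total"):
        # Counters get _total, unless the name already ends with it.
        name = f"{name}_total"
    # The namespace prefix is added last.
    return f"{namespace}_{name}"


print(to_prometheus_name("mfs.messages.processed", "counter"))
print(to_prometheus_name("mfs.db.writes.total", "counter"))
print(to_prometheus_name("mfs.db.write.duration", "histogram", unit="milliseconds"))
```

The helper is handy when looking up a metric in Grafana from the name used in the service code.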
Message Processing
| OTel name | Prometheus name | Type | Tags | Description |
|---|---|---|---|---|
| mfs.messages.processed | communicatie_mfs_messages_processed_total | Counter | - | Total messages successfully processed end-to-end |
| mfs.messages.rejected | communicatie_mfs_messages_rejected_total | Counter | - | Total messages rejected after a processing failure |
Database Writes (mfs-1)
| OTel name | Prometheus name | Type | Tags | Description |
|---|---|---|---|---|
| mfs.db.writes.total | communicatie_mfs_db_writes_total | Counter | operation: upsert / cancel | Total DB write attempts |
| mfs.db.writes.failed | communicatie_mfs_db_writes_failed_total | Counter | operation: upsert / cancel | Total DB write failures |
| mfs.db.write.duration | communicatie_mfs_db_write_duration_milliseconds | Histogram | operation: upsert / cancel | Time taken for each DB write (_bucket, _count, _sum suffixes) |
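To show how the _bucket, _count, and _sum series of a histogram fit together, here is a minimal Python sketch. It computes the mean duration from _sum and _count and estimates a quantile from cumulative buckets by linear interpolation, which is the same idea PromQL's histogram_quantile uses for classic histograms. The bucket bounds and sample values are made up for illustration and are not the service's real configuration.

```python
import math


def mean_duration(sum_ms: float, count: int) -> float:
    """Mean duration in ms: the _sum series divided by the _count series."""
    return sum_ms / count if count else math.nan


def estimate_quantile(q: float, buckets: list[tuple[float, int]]) -> float:
    """Estimate quantile q (0..1) from cumulative (upper_bound_ms, count) pairs.

    buckets must be sorted by upper bound and end with the +Inf bucket,
    mirroring the le label on the _bucket series.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, cumulative in buckets:
        if cumulative >= rank:
            in_bucket = cumulative - prev_count
            if math.isinf(bound) or in_bucket == 0:
                # No finite upper bound to interpolate towards.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, cumulative
    return prev_bound


# Hypothetical sample: 100 writes, 1500 ms in total.
sample_buckets = [(5.0, 10), (10.0, 40), (25.0, 90), (50.0, 100), (math.inf, 100)]
print(mean_duration(1500.0, 100))
print(estimate_quantile(0.5, sample_buckets))
print(estimate_quantile(0.95, sample_buckets))
```

The estimate is only as precise as the bucket layout: a quantile that lands in a wide bucket can be off by most of that bucket's width.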
Notification Jobs (mfs-2)
| OTel name | Prometheus name | Type | Tags | Description |
|---|---|---|---|---|
| mfs.notification.jobs.published | communicatie_mfs_notification_jobs_published_total | Counter | - | Notification jobs successfully published to the internal queue |
| mfs.notification.jobs.failed | communicatie_mfs_notification_jobs_failed_total | Counter | - | Notification jobs that failed to publish |
Provider Calls (mfs-3)
| OTel name | Prometheus name | Type | Tags | Description |
|---|---|---|---|---|
| mfs.provider.calls.succeeded | communicatie_mfs_provider_calls_succeeded_total | Counter | provider, type (24H/1H) | Successful provider API calls |
| mfs.provider.calls.failed | communicatie_mfs_provider_calls_failed_total | Counter | provider, type (24H/1H) | Provider calls that failed after all retries |
| mfs.provider.call.duration | communicatie_mfs_provider_call_duration_milliseconds | Histogram | provider, status (success/failed) | Total time including retries (_bucket, _count, _sum suffixes) |
| mfs.provider.retries | communicatie_mfs_provider_retries_total | Counter | provider, reason (rate_limited/error/exception) | Individual retry attempts |
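The provider counters are most useful as rates. The Python sketch below builds two PromQL expressions as strings: a per-provider failure ratio and a retry rate split by reason. The helper names and the 5m window are assumptions chosen for illustration; the metric names and labels come from the table above.

```python
SUCCEEDED = "communicatie_mfs_provider_calls_succeeded_total"
FAILED = "communicatie_mfs_provider_calls_failed_total"
RETRIES = "communicatie_mfs_provider_retries_total"


def failure_ratio_expr(succeeded: str, failed: str,
                       by: str = "provider", window: str = "5m") -> str:
    """PromQL for failed / (failed + succeeded), grouped by one label."""
    ok = f"sum by ({by}) (rate({succeeded}[{window}]))"
    bad = f"sum by ({by}) (rate({failed}[{window}]))"
    return f"{bad} / ({bad} + {ok})"


def retries_by_reason_expr(metric: str, window: str = "5m") -> str:
    """PromQL for the retry rate per provider and retry reason."""
    return f"sum by (provider, reason) (rate({metric}[{window}]))"


print(failure_ratio_expr(SUCCEEDED, FAILED))
print(retries_by_reason_expr(RETRIES))
```

Because counters only appear after their first increment, the ratio returns no data for a provider until both its succeeded and failed series exist.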
.NET Runtime Metrics (via OpenTelemetry.Instrumentation.Runtime)
Automatically collected from the .NET runtime. No tagging required.
| Metric group | Examples |
|---|---|
| Garbage Collector | process.runtime.dotnet.gc.collections.count, process.runtime.dotnet.gc.heap.size |
| Thread Pool | process.runtime.dotnet.thread_pool.queue.length, process.runtime.dotnet.thread_pool.threads.count |
| Exceptions | process.runtime.dotnet.exceptions.count |
| JIT | process.runtime.dotnet.jit.il_compiled.size |
HTTP Client Metrics (via OpenTelemetry.Instrumentation.Http)
Automatically collected for all outbound HTTP calls (e.g. provider API calls to fakecomworld).
| Metric | Type | Tags |
|---|---|---|
| http.client.request.duration | Histogram (s) | http.request.method, http.response.status_code, server.address |
| http.client.open_connections | UpDownCounter | network.protocol.version, server.address |
RabbitMQ Metrics (via Prometheus plugin, port 15692)
Scraped from both rabbitmq-inbound and rabbitmq-internal.
| Metric group | Examples |
|---|---|
| Queue depth | rabbitmq_queue_messages, rabbitmq_queue_messages_ready, rabbitmq_queue_messages_unacked |
| Message rates | rabbitmq_queue_messages_published_total, rabbitmq_queue_messages_delivered_total |
| Connections | rabbitmq_connections, rabbitmq_channels |
| Node health | rabbitmq_process_resident_memory_bytes, rabbitmq_disk_space_available_bytes |
MariaDB Metrics (via mysqld-exporter, port 9104)
| Metric group | Examples |
|---|---|
| Queries | mysql_global_status_queries, mysql_global_status_slow_queries |
| Connections | mysql_global_status_threads_connected, mysql_global_status_max_used_connections |
| InnoDB | mysql_global_status_innodb_buffer_pool_reads, mysql_global_status_innodb_row_lock_waits |
| Replication | mysql_slave_status_seconds_behind_master (if applicable) |
Container Metrics (via cAdvisor, port 8080)
Per-container resource usage for all 14 Docker containers.
| Metric group | Examples |
|---|---|
| CPU | container_cpu_usage_seconds_total |
| Memory | container_memory_usage_bytes, container_memory_rss |
| Network | container_network_receive_bytes_total, container_network_transmit_bytes_total |
| Disk I/O | container_fs_reads_bytes_total, container_fs_writes_bytes_total |
Host / VPS Metrics (via node-exporter, port 9100)
System-level metrics for the VPS itself (86.48.5.113).
| Metric group | Examples |
|---|---|
| CPU | node_cpu_seconds_total (by mode: user, system, idle, iowait) |
| Memory | node_memory_MemAvailable_bytes, node_memory_MemTotal_bytes |
| Disk | node_disk_read_bytes_total, node_disk_written_bytes_total, node_filesystem_avail_bytes |
| Network | node_network_receive_bytes_total, node_network_transmit_bytes_total |
| Load | node_load1, node_load5, node_load15 |
| System | node_boot_time_seconds, node_time_seconds |
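Any metric listed in this note can also be read outside Grafana through the Prometheus HTTP API (the /api/v1/query endpoint). The Python sketch below builds an instant-query URL and parses the returned vector. The base URL http://prometheus:9090 is an assumption, since this note does not state where Prometheus listens, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Fraction of host memory still available, from the node-exporter metrics above.
MEM_AVAILABLE_RATIO = "node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes"


def instant_query_url(base_url: str, expr: str) -> str:
    """Build the URL for an instant query against /api/v1/query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def parse_vector(payload: dict) -> dict[str, float]:
    """Turn an instant-vector response into {instance label: value}."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for sample in payload["data"]["result"]:
        # Each sample carries its labels and a [timestamp, "value"] pair.
        _timestamp, value = sample["value"]
        values[sample["metric"].get("instance", "")] = float(value)
    return values


def query(base_url: str, expr: str, timeout: float = 5.0) -> dict[str, float]:
    """Run an instant query; base_url is an assumed Prometheus address."""
    with urlopen(instant_query_url(base_url, expr), timeout=timeout) as response:
        return parse_vector(json.load(response))
```

With a reachable Prometheus, query("http://prometheus:9090", MEM_AVAILABLE_RATIO) would return one value per scraped instance.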