ID: S202606091028
Status: imported

Tags: avans 2-4 LU1

Metrics Overview

All metrics flow through the OpenTelemetry Collector and are scraped by Prometheus. They are visible in Grafana at http://86.48.5.113:3000.


Application Metrics (custom, via OpenTelemetry)

Emitted by the MessageForwardingService and exposed through the OTel Collector’s Prometheus endpoint (otel-collector:8889).

Naming in Prometheus/Grafana: The OTel Collector adds the communicatie_ namespace prefix, replaces . with _, appends _total to counters (unless the name already ends in total, as with mfs.db.writes.total), and appends the unit (_milliseconds) to histograms.
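
The naming rules above can be sketched as a small helper for looking up a metric in Grafana. This is an approximation of the behaviour described in this note, not the Collector's actual implementation; the function name and its parameters are hypothetical.

```python
def otel_to_prometheus_name(otel_name, kind, unit_suffix=None, namespace="communicatie"):
    """Approximate the Prometheus name for an OTel metric (hypothetical helper).

    kind is "counter" or "histogram"; unit_suffix is e.g. "milliseconds".
    """
    # Namespace prefix, and dots become underscores.
    name = f"{namespace}_{otel_name.replace('.', '_')}"
    # Histograms get the unit appended, unless it is already there.
    if kind == "histogram" and unit_suffix and not name.endswith(f"_{unit_suffix}"):
        name += f"_{unit_suffix}"
    # Counters get _total, unless the name already ends in it.
    if kind == "counter" and not name.endswith("_total"):
        name += "_total"
    return name


print(otel_to_prometheus_name("mfs.messages.processed", "counter"))
print(otel_to_prometheus_name("mfs.db.write.duration", "histogram", "milliseconds"))
```

Histogram names produced this way are base names; Prometheus exposes them with the _bucket, _count, and _sum suffixes.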

Note: Counters only appear in Prometheus after they are first incremented. Send a test message through RabbitMQ to populate them.

Message Processing

OTel name | Prometheus name | Type | Tags | Description
mfs.messages.processed | communicatie_mfs_messages_processed_total | Counter | - | Total messages successfully processed end-to-end
mfs.messages.rejected | communicatie_mfs_messages_rejected_total | Counter | - | Total messages rejected after a processing failure

Database Writes (mfs-1)

OTel name | Prometheus name | Type | Tags | Description
mfs.db.writes.total | communicatie_mfs_db_writes_total | Counter | operation: upsert / cancel | Total DB write attempts
mfs.db.writes.failed | communicatie_mfs_db_writes_failed_total | Counter | operation: upsert / cancel | Total DB write failures
mfs.db.write.duration | communicatie_mfs_db_write_duration_milliseconds | Histogram | operation: upsert / cancel | Time taken for each DB write (_bucket, _count, _sum suffixes)
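
As a rough illustration of how these three series combine, the sketch below derives a failure ratio and a mean write duration from values read off Prometheus. The function names and sample numbers are hypothetical; the inputs are assumed to be increases over the same time window and for the same operation tag.

```python
def write_failure_ratio(writes_total, writes_failed):
    """Fraction of DB write attempts that failed, or None if there were no writes."""
    if writes_total <= 0:
        return None
    return writes_failed / writes_total


def mean_write_duration_ms(duration_sum_ms, duration_count):
    """Mean write time from the histogram's _sum and _count series, or None if empty."""
    if duration_count <= 0:
        return None
    return duration_sum_ms / duration_count


# Hypothetical sample values, not measurements from this system.
print(write_failure_ratio(writes_total=200, writes_failed=5))
print(mean_write_duration_ms(duration_sum_ms=1250.0, duration_count=50))
```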

Notification Jobs (mfs-2)

OTel name | Prometheus name | Type | Tags | Description
mfs.notification.jobs.published | communicatie_mfs_notification_jobs_published_total | Counter | - | Notification jobs successfully published to the internal queue
mfs.notification.jobs.failed | communicatie_mfs_notification_jobs_failed_total | Counter | - | Notification jobs that failed to publish

Provider Calls (mfs-3)

OTel name | Prometheus name | Type | Tags | Description
mfs.provider.calls.succeeded | communicatie_mfs_provider_calls_succeeded_total | Counter | provider, type (24H/1H) | Successful provider API calls
mfs.provider.calls.failed | communicatie_mfs_provider_calls_failed_total | Counter | provider, type (24H/1H) | Provider calls that failed after all retries
mfs.provider.call.duration | communicatie_mfs_provider_call_duration_milliseconds | Histogram | provider, status (success/failed) | Total time including retries (_bucket, _count, _sum suffixes)
mfs.provider.retries | communicatie_mfs_provider_retries_total | Counter | provider, reason (rate_limited/error/exception) | Individual retry attempts
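
The _bucket series of a histogram such as communicatie_mfs_provider_call_duration_milliseconds hold cumulative counts per upper bound, from which a latency percentile can be estimated. The sketch below shows the usual linear-interpolation approach in simplified form; it is not Prometheus's own code, and the bucket boundaries and counts are hypothetical, since this note does not list the configured buckets.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs, including
    the (math.inf, total_count) bucket.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf or buckets[-1][1] == 0:
        return math.nan
    rank = q * buckets[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            # Rank falls in the open-ended bucket (or an empty one):
            # the best available estimate is the last finite bound.
            if bound == math.inf or count == prev_count:
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical bucket bounds (ms) and cumulative counts.
sample = [(50, 10), (100, 30), (250, 40), (math.inf, 40)]
print(estimate_quantile(0.5, sample))
```

The estimate can only be as precise as the bucket boundaries allow, so a percentile that lands in a wide bucket is a coarse figure.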

.NET Runtime Metrics (via OpenTelemetry.Instrumentation.Runtime)

Automatically collected from the .NET runtime. No tagging required.

Metric group | Examples
Garbage Collector | process.runtime.dotnet.gc.collections.count, process.runtime.dotnet.gc.heap.size
Thread Pool | process.runtime.dotnet.thread_pool.queue.length, process.runtime.dotnet.thread_pool.threads.count
Exceptions | process.runtime.dotnet.exceptions.count
JIT | process.runtime.dotnet.jit.il_compiled.size

HTTP Client Metrics (via OpenTelemetry.Instrumentation.Http)

Automatically collected for all outbound HTTP calls (e.g. provider API calls to fakecomworld).

Metric | Type | Tags
http.client.request.duration | Histogram (s) | http.request.method, http.response.status_code, server.address
http.client.open_connections | UpDownCounter | network.protocol.version, server.address

RabbitMQ Metrics (via Prometheus plugin, port 15692)

Scraped from both rabbitmq-inbound and rabbitmq-internal.

Metric group | Examples
Queue depth | rabbitmq_queue_messages, rabbitmq_queue_messages_ready, rabbitmq_queue_messages_unacked
Message rates | rabbitmq_queue_messages_published_total, rabbitmq_queue_messages_delivered_total
Connections | rabbitmq_connections, rabbitmq_channels
Node health | rabbitmq_process_resident_memory_bytes, rabbitmq_disk_space_available_bytes

MariaDB Metrics (via mysqld-exporter, port 9104)

Metric group | Examples
Queries | mysql_global_status_queries, mysql_global_status_slow_queries
Connections | mysql_global_status_threads_connected, mysql_global_status_max_used_connections
InnoDB | mysql_global_status_innodb_buffer_pool_reads, mysql_global_status_innodb_row_lock_waits
Replication | mysql_slave_status_seconds_behind_master (if applicable)

Container Metrics (via cAdvisor, port 8080)

Per-container resource usage for all 14 Docker containers.

Metric group | Examples
CPU | container_cpu_usage_seconds_total
Memory | container_memory_usage_bytes, container_memory_rss
Network | container_network_receive_bytes_total, container_network_transmit_bytes_total
Disk I/O | container_fs_reads_bytes_total, container_fs_writes_bytes_total
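
The _total metrics in this table are cumulative counters, so dashboards usually show their rate of change rather than the raw value. Below is a minimal sketch of a per-second rate between two samples, assuming a counter that restarts from zero when its container restarts. The function name and sample values are hypothetical.

```python
def per_second_rate(earlier, later):
    """Per-second increase between two (unix_time_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = earlier, later
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("later sample must be newer than earlier sample")
    # A drop in value means the counter was reset; assume it restarted at 0.
    increase = v1 - v0 if v1 >= v0 else v1
    return increase / elapsed


# Hypothetical container_cpu_usage_seconds_total samples taken 60 s apart:
# 30 CPU-seconds used in 60 s of wall time is half a core on average.
print(per_second_rate((0, 100.0), (60, 130.0)))
```

Applied to container_network_receive_bytes_total the same calculation gives bytes per second.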

Host / VPS Metrics (via node-exporter, port 9100)

System-level metrics for the VPS itself (86.48.5.113).

Metric group | Examples
CPU | node_cpu_seconds_total (by mode: user, system, idle, iowait)
Memory | node_memory_MemAvailable_bytes, node_memory_MemTotal_bytes
Disk | node_disk_read_bytes_total, node_disk_written_bytes_total, node_filesystem_avail_bytes
Network | node_network_receive_bytes_total, node_network_transmit_bytes_total
Load | node_load1, node_load5, node_load15
System | node_boot_time_seconds, node_time_seconds
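
The CPU and memory metrics above are typically combined into utilisation figures. The sketch below shows one way to do that from raw sample values. Treating iowait as idle time is a choice made here, not something this note prescribes, and the function names and numbers are hypothetical.

```python
def cpu_utilisation(before, after):
    """Busy fraction (0..1) between two node_cpu_seconds_total samples.

    before/after map a CPU mode (user, system, idle, iowait, ...) to its
    cumulative seconds, summed over all cores.
    """
    deltas = {mode: after[mode] - before.get(mode, 0.0) for mode in after}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0
    # Assumption: time spent in iowait is counted as not busy.
    not_busy = deltas.get("idle", 0.0) + deltas.get("iowait", 0.0)
    return 1.0 - not_busy / total


def memory_used_fraction(mem_available_bytes, mem_total_bytes):
    """Used memory fraction from node_memory_MemAvailable_bytes and MemTotal_bytes."""
    if mem_total_bytes <= 0:
        raise ValueError("mem_total_bytes must be positive")
    return 1.0 - mem_available_bytes / mem_total_bytes


# Hypothetical samples, not readings from the VPS.
earlier = {"user": 100.0, "system": 50.0, "idle": 800.0, "iowait": 50.0}
later = {"user": 130.0, "system": 60.0, "idle": 850.0, "iowait": 60.0}
print(cpu_utilisation(earlier, later))
print(memory_used_fraction(2 * 1024**3, 8 * 1024**3))
```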