Introduction
Kubernetes clusters are complex systems with hundreds of moving parts. Your Azure Kubernetes Service (AKS) deployment generates telemetry from the control plane, worker nodes, containers, and applications—but raw data isn't actionable. Without comprehensive monitoring, you're flying blind.
The challenge intensifies when you manage hybrid infrastructure spanning AKS, OpenShift, and other Kubernetes distributions. You need visibility across all clusters, unified alerting, and centralized dashboards. The good news? Azure Monitor combined with Red Hat's monitoring solutions provides exactly that.
This guide walks you through AKS monitoring capabilities, Red Hat integration options, and real-world deployment patterns that help you achieve enterprise-grade observability.
Understanding AKS Monitoring Fundamentals
Built-in Azure Monitor Integration
AKS doesn't require external monitoring tools to get started. Azure Monitor provides native integration with diagnostic settings that route telemetry to Log Analytics workspaces. This foundation gives you:
- Automatic data collection from clusters without agent complexity
- 50+ performance metrics automatically gathered through Container Insights
- Persistent storage with configurable retention from 30 to 730 days in Log Analytics
- Native Azure integration across your broader cloud infrastructure
Container Insights is the cornerstone of AKS monitoring. It collects granular performance data including CPU utilization, memory consumption, disk I/O, and network metrics across nodes and containers.
Control Plane Visibility
Most organizations overlook the control plane—the cluster's operational brain. AKS exposes five critical control plane log categories:
- kube-apiserver: Handles all Kubernetes API requests
- kube-controller-manager: Runs cluster controllers
- kube-scheduler: Assigns pods to nodes
- cloud-controller-manager: Manages cloud-specific operations
- Audit logs: Tracks all API activities for compliance
By enabling diagnostic settings at cluster creation time, these logs stream directly to Log Analytics where you can query them with Kusto Query Language (KQL).
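Once diagnostic settings are enabled, these logs typically land in the AzureDiagnostics table of the workspace. A minimal KQL sketch for browsing recent API server entries (the log_s column and AzureDiagnostics destination assume the default table mode; resource-specific table mode uses different table names):

```kusto
// Browse the latest kube-apiserver log lines from the last hour
AzureDiagnostics
| where Category == "kube-apiserver"
| where TimeGenerated > ago(1h)
| project TimeGenerated, log_s
| order by TimeGenerated desc
| take 50
```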
Metrics Collection and Visualization
The Metrics Server Foundation
AKS includes a Metrics Server that's essential for scaling operations. This component provides resource metrics that power:
- Horizontal Pod Autoscaling (HPA): Automatically scale pods based on CPU/memory
- Vertical Pod Autoscaling (VPA): Right-size container resource requests
- Cluster monitoring dashboards: Real-time resource visibility
Without proper metrics collection, your autoscaling policies operate blind, leading to over-provisioned or under-resourced workloads.
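The resource metrics served by the Metrics Server are exactly what an HPA consumes. As a minimal sketch (deployment name and thresholds are placeholders):

```yaml
# HPA that scales a Deployment between 2 and 10 replicas,
# targeting 70% average CPU utilization across pods
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```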
Real-Time Observability with Live Logs
AKS's Live Logs feature streams container output in real-time. Unlike traditional logging where you query historical data, Live Logs show you what's happening now—crucial for troubleshooting issues as they occur.
Application logs are automatically collected from all containers, while Kubelet logs provide node-level insights into pod scheduling and resource management.
Visualization and Querying
Azure Monitor Workbooks provide pre-built dashboards for cluster health, while Grafana integration lets you build custom dashboards. AKS also supports Prometheus metrics natively, enabling you to:
- Export metrics in Prometheus format
- Deploy Prometheus scrapers alongside Azure Monitor
- Build hybrid monitoring stacks combining multiple tools
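When you run your own Prometheus scrapers alongside Azure Monitor, a common pattern is to mark pods for scraping with annotations. Note these prometheus.io annotations are a community convention, not a Kubernetes standard—they only work if your scrape config has matching relabel rules:

```yaml
# Pod annotated for discovery by a conventional Prometheus scrape config
apiVersion: v1
kind: Pod
metadata:
  name: sample-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  containers:
  - name: app
    image: example/app:latest
    ports:
    - containerPort: 8080
```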
Application Insights for Application-Level Monitoring
Container Insights focuses on infrastructure metrics, but Application Insights adds application-level visibility. This integration tracks:
- Request latency and error rates
- Dependency calls to databases and APIs
- Custom business metrics
- User session information
By correlating Application Insights data with Container Insights metrics, you see the complete picture from request arrival through container execution.
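One way to pull both data sets into a single query is a cross-resource KQL query from Log Analytics into an Application Insights resource. A sketch, where myAppInsights is a placeholder resource name:

```kusto
// Hypothetical sketch: request latency and error counts from an
// App Insights resource, bucketed into the same 5-minute windows
// you might use for Container Insights metrics
app("myAppInsights").requests
| where timestamp > ago(1h)
| summarize AvgDurationMs = avg(duration), Errors = countif(success == false)
    by bin(timestamp, 5m)
| order by timestamp asc
```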
Red Hat Monitoring Solutions: A Different Approach
OpenShift's Native Monitoring Stack
Red Hat's OpenShift includes a built-in monitoring stack that takes a fundamentally different approach from AKS. OpenShift clusters ship with:
- Prometheus for metrics collection (integrated, not bolted-on)
- Alertmanager for alert routing and notification
- Grafana for visualization
- Operator-based architecture where monitoring components are Kubernetes-native resources
This approach means monitoring isn't an afterthought—it's part of the platform.
Two-Tier Monitoring Model
OpenShift implements a sophisticated two-tier monitoring approach:
- Cluster monitoring: System metrics from control plane and infrastructure
- User workload monitoring: Application and custom metrics from your workloads
This separation enables multi-tenancy where different teams see only their project's metrics while the platform team manages cluster health.
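In practice, turning on the user workload tier is a one-line change to the cluster monitoring ConfigMap:

```yaml
# Enables OpenShift's user workload monitoring tier
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```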
Prometheus Operator for Flexible Metrics Management
OpenShift uses the Prometheus Operator, which abstracts Prometheus complexity behind Kubernetes Custom Resource Definitions (CRDs). Instead of managing Prometheus configuration files, you define:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: application-alerts
spec:
  groups:
  - name: application
    interval: 30s
    rules:
    - alert: HighErrorRate
      expr: rate(requests_total{job="app"}[5m]) > 0.05
This Kubernetes-native approach means monitoring rules follow the same GitOps patterns as your applications.
Red Hat Advanced Cluster Management: Unified Monitoring Across Hybrid Infrastructure
The Multi-Cluster Challenge
Managing AKS, OpenShift, EKS, and GKE side by side typically means juggling a different monitoring tool for each. Red Hat Advanced Cluster Management (ACM) solves this through centralized, unified monitoring.
ACM treats AKS clusters as managed clusters alongside OpenShift, EKS, and GKE. This enables:
- Single dashboard showing health across 100+ clusters
- Unified alerting where alerts from any cluster route through Alertmanager
- Consistent RBAC controlling who sees metrics from which clusters
- Multi-tenancy supporting project-level metric isolation
How ACM Works with AKS
ACM doesn't replace Azure Monitor—it augments it. Here's the integration pattern:
- Deploy Prometheus Operator on your AKS cluster (ACM can automate this)
- Configure ServiceMonitor resources to define what to scrape
- Forward metrics to ACM's central hub through Prometheus federation
- Access unified dashboards showing AKS metrics alongside OpenShift metrics
This approach gives you the best of both worlds: Azure's native integrations for AKS-specific features like Application Insights, plus Red Hat's cross-platform consistency.
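The "configure ServiceMonitor resources" step above is expressed as a small CRD. A minimal sketch with placeholder names and labels:

```yaml
# ServiceMonitor telling the Prometheus Operator to scrape any Service
# labeled app=my-app on its "metrics" port every 30 seconds
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: my-app
  endpoints:
  - port: metrics
    interval: 30s
```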
Scale and Performance
ACM manages 15-day metric retention by default and handles scale to 100+ clusters. This is sufficient for operational troubleshooting while keeping storage costs manageable. For longer retention, metrics can be exported to external time-series databases.
Comparison: Azure Monitor vs. Red Hat Monitoring
| Feature | Azure Monitor | Red Hat ACM + Prometheus |
|---|---|---|
| Native Integration | AKS built-in, simple enablement | Requires operator deployment |
| Multi-Cloud Support | Azure-focused | AKS, OpenShift, EKS, GKE |
| Visualization | Workbooks + optional Grafana | Grafana built-in |
| Cost Model | Per-GB ingestion pricing | Infrastructure-based |
| Metrics Retention | 30-730 days configurable | 15 days (federation for longer) |
| Query Language | Kusto Query Language (KQL) | PromQL |
| Application Integration | Application Insights native | Custom instrumentation |
| Alerting | Azure Alerts | Alertmanager |
| Setup Complexity | Minimal for AKS | Moderate with operators |
| Kubernetes-Native | Partial (via workbooks) | Full (CRDs, GitOps) |
When to choose Azure Monitor alone: You're AKS-focused with minimal hybrid infrastructure. You want the simplest setup with native Azure integration.
When to add Red Hat ACM: You manage multiple Kubernetes distributions. You want consistent monitoring across OpenShift and AKS. Your teams prefer PromQL and Kubernetes-native configurations.
Best Practices for AKS Monitoring
1. Enable Container Insights at Cluster Creation
Don't enable monitoring post-deployment. Create AKS clusters with Container Insights enabled from the start:
az aks create \
  --resource-group myRG \
  --name myAKS \
  --enable-managed-identity \
  --enable-addons monitoring
This ensures complete baseline metrics collection without gaps.
2. Configure Diagnostic Settings Immediately
Route all control plane logs to Log Analytics:
az monitor diagnostic-settings create \
  --name "AKS-Diagnostics" \
  --resource /subscriptions/{subId}/resourcegroups/{rg}/providers/Microsoft.ContainerService/managedClusters/{cluster} \
  --workspace /subscriptions/{subId}/resourcegroups/{rg}/providers/microsoft.operationalinsights/workspaces/{workspace} \
  --logs '[{"category": "kube-apiserver", "enabled": true}, {"category": "kube-audit", "enabled": true}, {"category": "kube-controller-manager", "enabled": true}, {"category": "kube-scheduler", "enabled": true}]'
3. Implement Alert Rules for Critical Conditions
Create alerts for:
- Node pressure: KubeNodeNotReady lasting more than 5 minutes
- Pod crashes: Containers restarting repeatedly
- Resource exhaustion: Nodes approaching CPU/memory limits
- Control plane latency: API server response times
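For clusters where you also run the Prometheus Operator, the node-pressure alert above can be written as a PrometheusRule. This sketch assumes kube-state-metrics is deployed, since it exposes the kube_node_status_condition series:

```yaml
# Fires when a node's Ready condition has been false for 5 minutes
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
spec:
  groups:
  - name: nodes
    rules:
    - alert: KubeNodeNotReady
      expr: kube_node_status_condition{condition="Ready",status="true"} == 0
      for: 5m
      labels:
        severity: critical
```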
4. Use KQL for Custom Queries
Log Analytics supports powerful KQL queries. Example: find containers consuming excessive memory over the last hour (Container Insights records container memory in the Perf table):

// Containers whose average memory working set exceeded 1 GB
Perf
| where TimeGenerated > ago(1h)
| where ObjectName == "K8SContainer" and CounterName == "memoryWorkingSetBytes"
| summarize AvgMemoryGB = avg(CounterValue) / (1024 * 1024 * 1024) by InstanceName
| where AvgMemoryGB > 1
| order by AvgMemoryGB desc
5. Implement Multi-Cluster Monitoring Early
If you manage multiple clusters, deploy Red Hat ACM or establish Prometheus federation from day one. Retrofitting monitoring is disruptive.
6. Secure Metrics Access with Private Link
Use Azure Private Link to route monitoring data through your virtual network, preventing metrics from traversing the internet.
Real-World Deployment Scenarios
Scenario 1: AKS-Only Enterprise (100 developers)
Setup: Azure Monitor + Container Insights + Application Insights
- Enable Container Insights on all AKS clusters
- Configure Application Insights SDKs in application code
- Create Azure Monitor Workbooks for team dashboards
- Set up alerts routed to PagerDuty via webhooks
- Implement 90-day log retention for compliance
Why: Minimal complexity, tight Azure integration, native Azure billing and RBAC.
Scenario 2: Hybrid OpenShift + AKS (Multi-cloud strategy)
Setup: Azure Monitor for AKS + Red Hat ACM for unified monitoring
- AKS: Keep Azure Monitor + Container Insights for Azure-specific features
- Deploy Prometheus Operator on AKS via ACM
- Configure ServiceMonitors on both OpenShift and AKS
- Route all metrics to ACM's central Prometheus
- Use ACM's Grafana for cross-platform dashboards
- Route alerts from both platforms through ACM's Alertmanager
Why: Leverages each platform's strengths while maintaining consistency.
Scenario 3: Cost-Optimized Large-Scale Deployment (500+ nodes)
Setup: Hybrid approach with selective monitoring
- Use Container Insights for critical infrastructure clusters
- Deploy Prometheus on Kubernetes for application monitoring
- Implement 15-day Prometheus retention with external time-series database for longer-term data
- Reduce log ingestion in Azure Monitor (filter noisy namespaces and streams at the agent)
- Use metrics that feed directly into autoscaling (essential metrics only)
- Implement alert aggregation to reduce notification noise
Why: Balances observability with cost efficiency at scale.
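One concrete lever for cutting log volume at scale is the Container Insights agent ConfigMap. A sketch—the namespace list is illustrative, and the setting excludes whole namespaces from stdout collection rather than sampling by percentage:

```yaml
# container-azm-ms-agentconfig controls what the Container Insights
# agent collects; here stdout logs from noisy namespaces are dropped
apiVersion: v1
kind: ConfigMap
metadata:
  name: container-azm-ms-agentconfig
  namespace: kube-system
data:
  schema-version: "v1"
  config-version: "ver1"
  log-data-collection-settings: |-
    [log_collection_settings]
      [log_collection_settings.stdout]
        enabled = true
        exclude_namespaces = ["kube-system", "dev-sandbox"]
```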
Conclusion
AKS monitoring isn't a binary choice between Azure Monitor and Red Hat solutions—it's about building a layered strategy that fits your infrastructure and team expertise.
Start here:
- Enable Container Insights at AKS cluster creation
- Configure diagnostic settings to capture control plane logs
- Deploy Application Insights SDKs in your applications
- Create alerts for critical conditions
- Evaluate Red Hat ACM if you manage multiple Kubernetes distributions
Azure Monitor provides an excellent foundation for AKS-focused organizations. Red Hat's monitoring solutions shine when you need consistency across hybrid Kubernetes environments. Many enterprises use both—leveraging Azure's native capabilities while standardizing on Red Hat's cross-platform approach for complex multi-cluster scenarios.
The key insight: modern Kubernetes requires observability by design, not by retrofit. Your monitoring strategy should be as important as your deployment strategy. Start with comprehensive metrics collection, add purposeful alerts, and iterate based on what your teams actually need to see.
Your clusters generate immense data—make sure you're capturing it, visualizing it, and acting on it.
Ready to implement comprehensive AKS monitoring? Start with Container Insights enabled at cluster creation, then layer in Red Hat solutions as your infrastructure grows.