CI/CD optimization starts with visibility. Building a successful DevOps platform at enterprise scale requires understanding pipeline performance, job execution patterns, and quantifiable operational insights, especially for organizations running GitLab self-managed instances.
To help GitLab customers maximize their platform investments, we developed the GitLab CI/CD Observability solution as part of our Platform Excellence program, which transforms raw pipeline metrics into actionable operational insights.
A leading financial services organization partnered with GitLab's customer success architect to gain visibility into their GitLab self-managed deployment. Together, we implemented a containerized observability solution combining the open-source gitlab-ci-pipelines-exporter with enterprise-grade Prometheus and Grafana infrastructure.
In this article, you'll learn the challenges they faced managing pipelines at scale and how GitLab CI/CD Observability addressed them with a practical, end-to-end implementation.
The challenge: Measuring CI/CD performance
Before implementing any observability solution, define your measurement landscape:
- What metrics matter? Pipeline duration, job success rates, queue times, runner utilization
- Who needs visibility? Developers, DevOps engineers, platform teams, leadership
- What decisions will this drive? Infrastructure investment, bottleneck remediation, capacity planning
Solution architecture: A full set of dashboards for observability
Once deployed, the observability stack provides a set of Grafana dashboards that give real-time and historical visibility into your CI/CD platform. A typical deployment includes:
- Pipeline Overview Dashboard: A top-level view showing total pipeline runs, success/failure rates over time (as stacked bar or time-series charts), and average pipeline duration trends. Panels use color-coded status indicators (green for success, red for failure, amber for cancelled) so platform teams can spot degradation at a glance.
- Job Performance Dashboard: Drill-down panels showing individual job duration distributions (histogram), the top 10 slowest jobs by average duration, and job failure heatmaps by project and stage. This is where teams identify specific bottleneck jobs worth optimizing.
- Runner & Infrastructure Dashboard: Combines Node Exporter host metrics (CPU, memory, disk) with pipeline queue-time data to correlate infrastructure saturation with pipeline wait times. Useful for capacity planning decisions such as scaling runner pools or upgrading instance sizes.
- Deployment Frequency Dashboard: Tracks deployment count and deployment duration over time per environment, aligned with DORA metrics. Helps engineering leadership assess delivery throughput and environment drift (commits behind main).
Each dashboard is provisioned automatically via Grafana's file-based provisioning, so it deploys consistently across environments. The dashboards can be further customized with Grafana variables to filter by project, ref/branch, or time range.

The solution requires two exporters:
- Pipeline Exporter: Collects CI/CD metrics via GitLab API (pipeline duration, job status, deployments)
- Node Exporter: Collects host-level metrics (CPU, memory, disk) for infrastructure correlation
Prerequisites:
- GitLab Self-Managed Version 18.1+
- Container orchestration platform: A Kubernetes cluster (recommended for enterprise deployments) or a container runtime such as Docker/Podman for smaller scale or proof-of-concept environments. The primary deployment guide below targets Kubernetes; a Docker Compose alternative is provided in the appendix for local testing and evaluation
- GitLab Personal Access Token (read_api scope)
Kubernetes deployment (recommended)
For enterprise environments, deploy each component as a separate Deployment within a dedicated namespace. This approach integrates with existing cluster infrastructure, secrets management, and network policies.
1. Create namespace and secret
kubectl create namespace gitlab-observability
# Create the GitLab token secret (see Secrets Management section below
# for enterprise-grade approaches using external secret operators)
kubectl create secret generic gitlab-token \
  --from-literal=token=glpat-xxxxxxxxxxxx \
  -n gitlab-observability
2. Deploy the Pipeline Exporter
# exporter-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gitlab-ci-pipelines-exporter
  namespace: gitlab-observability
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gitlab-ci-pipelines-exporter
  template:
    metadata:
      labels:
        app: gitlab-ci-pipelines-exporter
    spec:
      containers:
        - name: exporter
          image: mvisonneau/gitlab-ci-pipelines-exporter:latest
          ports:
            - containerPort: 8080
          env:
            - name: GCPE_GITLAB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: gitlab-token
                  key: token
            - name: GCPE_CONFIG
              value: /etc/gcpe/config.yml
          volumeMounts:
            - name: config
              mountPath: /etc/gcpe
      volumes:
        - name: config
          configMap:
            name: gcpe-config
---
apiVersion: v1
kind: Service
metadata:
  name: gitlab-ci-pipelines-exporter
  namespace: gitlab-observability
spec:
  selector:
    app: gitlab-ci-pipelines-exporter
  ports:
    - port: 8080
      targetPort: 8080
3. Deploy Node Exporter (DaemonSet)
# node-exporter-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: gitlab-observability
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: prom/node-exporter:latest
          ports:
            - containerPort: 9100
---
apiVersion: v1
kind: Service
metadata:
  name: node-exporter
  namespace: gitlab-observability
spec:
  selector:
    app: node-exporter
  ports:
    - port: 9100
      targetPort: 9100
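As written, node_exporter sees the pod's own filesystem and network namespaces, so some host-level metrics may describe the container rather than the node. A common pattern is to share the host network and PID namespaces and mount the host root read-only, pointing node_exporter's `--path.rootfs` flag at the mount. The fragment below is a minimal sketch of the pod template changes. The `/host` mount path and the `host-root` volume name are illustrative choices, and clusters with strict Pod Security settings may reject `hostPath` volumes.

```yaml
# Sketch: fragment to merge into the node-exporter DaemonSet.
# The /host mount path and volume name are illustrative assumptions.
spec:
  template:
    spec:
      hostNetwork: true
      hostPID: true
      containers:
        - name: node-exporter
          image: prom/node-exporter:latest
          args:
            # Tell node_exporter where the host root filesystem is mounted
            - --path.rootfs=/host
          ports:
            - containerPort: 9100
          volumeMounts:
            - name: host-root
              mountPath: /host
              readOnly: true
              mountPropagation: HostToContainer
      volumes:
        - name: host-root
          hostPath:
            path: /
```

With `hostNetwork: true`, the exporter listens on the node's own IP address, and many network plugins do not apply NetworkPolicy rules to such pods, so it is worth checking how your cluster handles that case.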
4. Deploy Prometheus
# prometheus-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: gitlab-observability
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-config
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: gitlab-observability
spec:
  selector:
    app: prometheus
  ports:
    - port: 9090
      targetPort: 9090
5. Deploy Grafana
The Grafana deployment below starts with anonymous access enabled (GF_AUTH_ANONYMOUS_ENABLED: true) for initial setup convenience.
This setting allows anyone with network access to view all dashboards without logging in. For production deployments, remove this variable or set it to false and configure a proper authentication provider (LDAP, SAML/SSO, or OAuth) to restrict access to authorized users.
# grafana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: gitlab-observability
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:10.0.0
          ports:
            - containerPort: 3000
          env:
            # REMOVE or set to 'false' for production.
            # When 'true', any user with network access can
            # view dashboards without authentication.
            - name: GF_AUTH_ANONYMOUS_ENABLED
              value: 'true'
          volumeMounts:
            - name: dashboards-provider
              mountPath: /etc/grafana/provisioning/dashboards
            - name: datasources
              mountPath: /etc/grafana/provisioning/datasources
            - name: dashboards
              mountPath: /var/lib/grafana/dashboards
      volumes:
        - name: dashboards-provider
          configMap:
            name: grafana-dashboards-provider
        - name: datasources
          configMap:
            name: grafana-datasources
        - name: dashboards
          configMap:
            name: grafana-dashboards
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: gitlab-observability
spec:
  selector:
    app: grafana
  ports:
    - port: 3000
      targetPort: 3000
6. Set network policy
Restrict inter-pod traffic to only the required communication paths:
# network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: observability-policy
  namespace: gitlab-observability
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    # Prometheus scrapes exporter and node-exporter
    - from:
        - podSelector:
            matchLabels:
              app: prometheus
      ports:
        - port: 8080
        - port: 9100
    # Grafana queries Prometheus
    - from:
        - podSelector:
            matchLabels:
              app: grafana
      ports:
        - port: 9090
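Because this policy selects every pod in the namespace and only lists ports 8080, 9100, and 9090, no rule admits inbound traffic to Grafana on port 3000. If Grafana is exposed through an ingress controller, an additional policy along the following lines may be needed. This is a sketch: the policy name, the file name, and the `ingress-nginx` controller namespace are assumptions to adapt to your cluster.

```yaml
# grafana-ingress-policy.yaml (hypothetical file name)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: grafana-ingress
  namespace: gitlab-observability
spec:
  podSelector:
    matchLabels:
      app: grafana
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Kubernetes sets this label on every namespace automatically;
              # "ingress-nginx" is an assumed controller namespace.
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 3000
```

NetworkPolicy rules are additive, so this policy widens access to the Grafana pods only and leaves the restrictions on the other components in place.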
7. Validate
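# Confirm that all pods are Running
kubectl get pods -n gitlab-observability
# Forward Grafana to localhost (this command blocks; leave it running)
kubectl port-forward svc/grafana 3000:3000 -n gitlab-observability
# From a second terminal, check Grafana's health endpoint
curl http://localhost:3000/api/health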
Configuration reference
Exporter configuration
# gitlab-ci-pipelines-exporter.yml
# (ConfigMap: gcpe-config, stored under the key config.yml so that it
# matches the GCPE_CONFIG path /etc/gcpe/config.yml)
log:
  level: info
gitlab:
  url: https://gitlab.your-domain.com
  maximum_requests_per_second: 10
project_defaults:
  pull:
    pipeline:
      jobs:
        enabled: true
wildcards:
  - owner:
      name: your-group-name
      kind: group
    archived: false
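The Deployments reference ConfigMaps such as gcpe-config by name, so these need to exist in the namespace before the pods can start. One way to create them is a manifest like the sketch below, which wraps the exporter configuration shown above. The file name is hypothetical. The data key matters because it becomes the file name inside the mounted directory.

```yaml
# gcpe-configmap.yaml (hypothetical file name)
apiVersion: v1
kind: ConfigMap
metadata:
  name: gcpe-config
  namespace: gitlab-observability
data:
  # The key becomes the file name under the /etc/gcpe mount,
  # so it has to match the GCPE_CONFIG path (/etc/gcpe/config.yml).
  config.yml: |
    log:
      level: info
    gitlab:
      url: https://gitlab.your-domain.com
      maximum_requests_per_second: 10
    project_defaults:
      pull:
        pipeline:
          jobs:
            enabled: true
    wildcards:
      - owner:
          name: your-group-name
          kind: group
        archived: false
```

The same pattern should apply to the other ConfigMaps. For example, prometheus-config would typically use the key prometheus.yml, since the prom/prometheus image reads /etc/prometheus/prometheus.yml by default.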
Prometheus configuration
# prometheus.yml (ConfigMap: prometheus-config)
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'gitlab-ci-pipelines-exporter'
    static_configs:
      - targets: ['gitlab-ci-pipelines-exporter:8080']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
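One caveat with the static node-exporter target: the Service name resolves to a single address in front of all DaemonSet pods, so each scrape reaches only one of them. On a multi-node cluster, Prometheus's Kubernetes service discovery can scrape every pod individually. The following is a minimal sketch of a replacement for the node-exporter job. It assumes Prometheus runs under a ServiceAccount that is allowed to list and watch pods in the namespace (RBAC objects not shown), and the `node` target label is an illustrative choice.

```yaml
scrape_configs:
  - job_name: 'node-exporter'
    kubernetes_sd_configs:
      # Discover pods directly instead of going through the Service
      - role: pod
        namespaces:
          names:
            - gitlab-observability
    relabel_configs:
      # Keep only pods labeled app=node-exporter
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: node-exporter
      # Attach the node name to every series for per-node dashboards
      - source_labels: [__meta_kubernetes_pod_node_name]
        target_label: node
```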
Grafana data sources
# datasources.yml (ConfigMap: grafana-datasources)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true

# dashboards.yml (ConfigMap: grafana-dashboards-provider)
apiVersion: 1
providers:
  - name: 'default'
    folder: 'GitLab CI/CD'
    type: file
    options:
      path: /var/lib/grafana/dashboards
Key metrics
Pipeline Exporter metrics
| Metric | Description |
|---|---|
| gitlab_ci_pipeline_duration_seconds | Pipeline execution time |
| gitlab_ci_pipeline_status | Pipeline success/failure by project |
| gitlab_ci_pipeline_job_duration_seconds | Individual job execution time |
| gitlab_ci_pipeline_job_status | Job success/failure status |
| gitlab_ci_pipeline_job_artifact_size_bytes | Artifact storage consumption |
| gitlab_ci_pipeline_coverage | Code coverage percentage |
| gitlab_ci_environment_deployment_count | Deployment frequency |
| gitlab_ci_environment_deployment_duration_seconds | Deployment execution time |
| gitlab_ci_environment_behind_commits_count | Environment drift from main |
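To show how these metrics might feed dashboards and alerts, here is a sketch of a Prometheus rules file. It assumes the exporter publishes `project`, `ref`, and `status` labels and reports a value of 1 for a pipeline's current status; confirm this against your exporter's `/metrics` output before relying on it. The file name, rule names, and the 30-minute threshold are illustrative, and the file would need to be listed under `rule_files` in prometheus.yml.

```yaml
# pipeline-rules.yml (hypothetical file name)
groups:
  - name: gitlab-ci-pipelines
    rules:
      # Per project, how many refs currently have a failed latest pipeline
      - record: project:gitlab_ci_pipeline_status_failed:count
        expr: 'count by (project) (gitlab_ci_pipeline_status{status="failed"} == 1)'
      # Warn when the latest pipeline on a ref stays above 30 minutes
      - alert: PipelineDurationHigh
        expr: 'gitlab_ci_pipeline_duration_seconds > 1800'
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Latest pipeline for {{ $labels.project }} ({{ $labels.ref }}) took longer than 30 minutes"
```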
Node Exporter metrics
| Metric | Description |
|---|---|
| node_cpu_seconds_total | Cumulative CPU time by mode (used to derive utilization) |
| node_memory_MemAvailable_bytes | Available memory |
| node_filesystem_avail_bytes | Disk space available |
| node_load1 | 1-minute load average |
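Most of these are raw counters and gauges, so dashboards typically derive ratios from them. The rules below are a sketch of how that derivation might look: CPU utilization from the idle-mode rate, memory utilization from available versus total memory, and an alert on low disk space. The group name, rule names, filesystem filter, and 10% threshold are illustrative assumptions.

```yaml
# node-rules.yml (hypothetical file name)
groups:
  - name: runner-hosts
    rules:
      # Fraction of CPU time not spent idle over the last 5 minutes
      - record: instance:node_cpu_utilization:ratio
        expr: '1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))'
      # Fraction of memory in use
      - record: instance:node_memory_utilization:ratio
        expr: '1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes'
      # Less than 10% free space on a non-ephemeral filesystem
      - alert: RunnerHostDiskLow
        expr: 'node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"} < 0.10'
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Less than 10% disk space left on {{ $labels.instance }} ({{ $labels.mountpoint }})"
```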
Troubleshooting
Air-gapped Grafana plugin installation
For offline environments, install plugins manually. Example for Kubernetes:
# Copy plugin zip into the Grafana pod
kubectl cp grafana-polystat-panel-2.1.16.zip \
  gitlab-observability/grafana-<pod-id>:/tmp/
# Extract plugin
kubectl exec -it -n gitlab-observability deploy/grafana -- \
  sh -c "unzip /tmp/grafana-polystat-panel-2.1.16.zip -d /var/lib/grafana/plugins/"
# Restart Grafana pod
# Note: the restart replaces the pod, so the extracted plugin survives only
# if /var/lib/grafana/plugins is backed by a persistent volume
kubectl rollout restart deployment/grafana -n gitlab-observability
# Verify installation
kubectl exec -it -n gitlab-observability deploy/grafana -- \
  ls -al /var/lib/grafana/plugins/
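A manually installed plugin lives on the pod's filesystem, so it needs persistent storage to survive a pod restart. The sketch below adds a PersistentVolumeClaim for the plugin directory and shows the fragment to merge into the Grafana Deployment. The claim name, the 1Gi size, and the `fsGroup` value are assumptions; `472` matches the grafana user ID commonly used by the official image, but verify it for your image version.

```yaml
# grafana-plugins-pvc.yaml (hypothetical file name)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-plugins
  namespace: gitlab-observability
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi   # illustrative size
---
# Fragment to merge into the grafana Deployment's pod template
spec:
  template:
    spec:
      securityContext:
        fsGroup: 472   # assumed ID of the grafana user in the official image
      containers:
        - name: grafana
          volumeMounts:
            - name: plugins
              mountPath: /var/lib/grafana/plugins
      volumes:
        - name: plugins
          persistentVolumeClaim:
            claimName: grafana-plugins
```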
Enterprise considerations
For regulated industries, ensure:
- Token security: Store GitLab Personal Access Tokens in a dedicated secrets manager rather than hardcoded in ConfigMaps. Enforce token rotation policies and limit scope to read_api only.
- Network segmentation: Deploy behind a reverse proxy with TLS termination. In Kubernetes, use an Ingress controller with automated certificate provisioning.
- Authentication: Configure Grafana with your organization's identity provider (SAML, LDAP, or OAuth/OIDC) to enforce role-based access control on dashboards.
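As one possible way to cover the authentication point, Grafana ships a GitLab OAuth provider that can be configured through environment variables. The sketch below shows `env` entries for the Grafana container. It assumes an OAuth application has been registered in GitLab, that a Secret named `grafana-oauth` (hypothetical) holds its credentials, and that Grafana is reachable at `grafana.your-domain.com` (also hypothetical). Depending on the Grafana version, the requested scopes may also need to be set explicitly.

```yaml
# Sketch: env entries for the grafana container (names of the Secret,
# its keys, and the Grafana URL are assumptions)
env:
  - name: GF_AUTH_ANONYMOUS_ENABLED
    value: 'false'
  - name: GF_SERVER_ROOT_URL
    value: https://grafana.your-domain.com
  - name: GF_AUTH_GITLAB_ENABLED
    value: 'true'
  - name: GF_AUTH_GITLAB_CLIENT_ID
    valueFrom:
      secretKeyRef:
        name: grafana-oauth
        key: client-id
  - name: GF_AUTH_GITLAB_CLIENT_SECRET
    valueFrom:
      secretKeyRef:
        name: grafana-oauth
        key: client-secret
  # Endpoints of the self-managed GitLab instance
  - name: GF_AUTH_GITLAB_AUTH_URL
    value: https://gitlab.your-domain.com/oauth/authorize
  - name: GF_AUTH_GITLAB_TOKEN_URL
    value: https://gitlab.your-domain.com/oauth/token
  - name: GF_AUTH_GITLAB_API_URL
    value: https://gitlab.your-domain.com/api/v4
  # Restrict sign-in to members of this GitLab group
  - name: GF_AUTH_GITLAB_ALLOWED_GROUPS
    value: your-group-name
```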
Why GitLab?
GitLab's API-first design enables custom observability solutions that complement native capabilities like Value Stream Analytics and DORA metrics. The open architecture allows organizations to integrate proven open-source tooling — like the gitlab-ci-pipelines-exporter — directly with their existing enterprise infrastructure, without disrupting established workflows.
As your observability maturity grows, GitLab's built-in Observability capabilities provide a natural next step — offering deeper, integrated visibility without additional tooling. Learn more about what's available natively in the platform for GitLab Observability.




