Terraform is the go-to for infrastructure as code, but it has friction in a GitOps workflow. State files need to be stored somewhere, drift detection requires running terraform plan, and there's no natural integration with Kubernetes. Crossplane takes a different approach: infrastructure as Kubernetes resources, managed by Kubernetes controllers.
This post covers how I use Crossplane to manage Grafana alerting configuration as code, the issues I ran into, and why this approach makes sense for certain use cases.
Series context: This post kicks off the Platform Engineering series. It builds on concepts from the Homelab Kubernetes Series (especially GitOps with ArgoCD) and the Observability Series. If you've read Alerting Done Right, this post goes deeper into the Crossplane side of that setup.
What Crossplane Does
Crossplane extends Kubernetes with Custom Resource Definitions (CRDs) that represent external resources. Instead of running terraform apply to create a cloud database, you apply a Kubernetes manifest. A Crossplane controller watches the resource and reconciles reality with the desired state.
The architecture has three layers:
- Crossplane Core - The control plane that manages providers and resources
- Providers - Plugins that know how to manage specific external systems (AWS, GCP, Grafana, etc.)
- Managed Resources - Your actual infrastructure defined as Kubernetes manifests
The promise: declarative infrastructure with the same tools you use for applications. GitOps for everything.
Crossplane vs Terraform
Both tools solve the same problem. The difference is how they fit into your workflow:
| Aspect | Terraform | Crossplane |
|---|---|---|
| State | External file (S3, Terraform Cloud) | Kubernetes etcd |
| Drift detection | Manual terraform plan | Continuous reconciliation |
| Execution | CLI or CI/CD runner | Kubernetes controller |
| Access control | Terraform Cloud, Vault | Kubernetes RBAC |
| Dependencies | HCL modules, providers | Compositions, XRDs |
Use Terraform when:
- Managing cloud infrastructure before Kubernetes exists
- Team is already proficient with HCL
- Resources don't benefit from continuous reconciliation
Use Crossplane when:
- Resources should be managed alongside applications
- You want GitOps for infrastructure
- Continuous drift correction is valuable
- Infrastructure is consumed by Kubernetes workloads
For this homelab, I use Terraform to bootstrap the cluster (External Secrets Operator, initial ArgoCD config), then Crossplane takes over for resources that benefit from Kubernetes-native management.
Installing Crossplane
Crossplane installs via Helm. I deploy it through ArgoCD like everything else (see GitOps All The Things for the app-of-apps pattern):
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: crossplane-core
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  source:
    repoURL: https://charts.crossplane.io/stable
    chart: crossplane
    targetRevision: 2.0.2
    helm:
      values: |
        # Resources for the core Crossplane pod
        resourcesCrossplane:
          limits:
            cpu: 500m
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 256Mi
        leaderElection: true
        packageCache:
          sizeLimit: 1Gi
        args:
          - --debug
        metrics:
          enabled: true
        rbacManager:
          leaderElection: true
        # Resources for the RBAC manager pod
        resourcesRBACManager:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 50m
            memory: 128Mi
  destination:
    server: https://kubernetes.default.svc
    namespace: crossplane-system
```

The packageCache setting is worth noting - Crossplane downloads provider packages, and caching them speeds up restarts. The debug flag is useful when troubleshooting provider issues.
The Grafana Provider
For this homelab, I use Crossplane to manage Grafana alerting. The LGTM stack provides the monitoring, and I want alert rules, contact points, and notification policies defined as code in Git.
First, install the provider:
```yaml
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-grafana
  annotations:
    argocd.argoproj.io/sync-wave: "2"
spec:
  package: xpkg.upbound.io/grafana/provider-grafana:v0.34.0
  packagePullPolicy: IfNotPresent
  revisionActivationPolicy: Automatic
  revisionHistoryLimit: 3
```

Providers are Crossplane's plugin system. The Grafana provider knows how to talk to Grafana's API and manage resources like dashboards, folders, data sources, and alerting configuration.
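Before moving on, it helps to confirm the provider actually installed. As a rough sketch, the Provider object reports its state through status conditions, and a healthy provider shows both of the conditions below as True. The excerpt is illustrative: other status fields are omitted, and reason strings vary between Crossplane versions.

```yaml
# Excerpt of `kubectl get provider provider-grafana -o yaml` (illustrative)
status:
  conditions:
    - type: Installed   # the package revision was unpacked and activated
      status: "True"
    - type: Healthy     # the provider pod is running and its CRDs are established
      status: "True"
```

Until Healthy is True, the provider's CRDs (including ProviderConfig) may not exist yet, which matters for the ordering problems described later.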
Provider Configuration
The provider needs credentials. For Grafana, that's the admin password and URL. I pull this from Infisical via External Secrets (see Secrets Management with Infisical and External Secrets for the full setup):
```yaml
apiVersion: external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: grafana-credentials
  namespace: crossplane-system
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  refreshInterval: 15m
  secretStoreRef:
    name: infisical-cluster-secretstore
    kind: ClusterSecretStore
  target:
    name: grafana-credentials
    creationPolicy: Owner
    template:
      type: Opaque
      data:
        credentials: |
          {
            "auth": "admin:{{ .GF_SECURITY_ADMIN_PASSWORD }}",
            "url": "http://lgtm-simple.monitoring.svc.cluster.local:3000",
            "org_id": "1"
          }
  data:
    - secretKey: GF_SECURITY_ADMIN_PASSWORD
      remoteRef:
        key: "/lgtm/GF_SECURITY_ADMIN_PASSWORD"
```

The credentials format is specific to the Grafana provider - a JSON object with auth (basic auth format), url, and org_id.
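For orientation, the Secret that External Secrets renders from this template should look roughly like the sketch below. It is written with stringData for readability (the real object stores the value base64-encoded under data), and the password is a placeholder rather than a real value.

```yaml
# Approximate shape of the rendered Secret (illustrative, not copied from the cluster)
apiVersion: v1
kind: Secret
metadata:
  name: grafana-credentials
  namespace: crossplane-system
type: Opaque
stringData:
  credentials: |
    {
      "auth": "admin:<password-from-infisical>",
      "url": "http://lgtm-simple.monitoring.svc.cluster.local:3000",
      "org_id": "1"
    }
```

Decoding the credentials key of the live Secret and comparing it against this shape is a quick check when the provider refuses to connect.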
Then configure the provider to use this secret:
```yaml
apiVersion: grafana.crossplane.io/v1beta1
kind: ProviderConfig
metadata:
  name: grafana-config
  annotations:
    argocd.argoproj.io/sync-wave: "5"
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: grafana-credentials
      key: credentials
```

All managed resources reference this ProviderConfig to know how to connect to Grafana.
Managing Grafana Resources
With the provider configured, I can create Grafana resources as Kubernetes manifests.
Folder for organising alerts:
```yaml
apiVersion: oss.grafana.crossplane.io/v1alpha1
kind: Folder
metadata:
  name: homelab-alerts-folder
  annotations:
    argocd.argoproj.io/sync-wave: "5"
spec:
  forProvider:
    title: "Homelab Alerts"
    uid: "homelab-alerts"
    preventDestroyIfNotEmpty: true
  providerConfigRef:
    name: grafana-config
```

Contact point for Discord notifications:
```yaml
apiVersion: alerting.grafana.crossplane.io/v1alpha1
kind: ContactPoint
metadata:
  name: discord-homelab-alerts
  annotations:
    argocd.argoproj.io/sync-wave: "10"
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
spec:
  forProvider:
    name: discord-homelab-alerts
    discord:
      - title: "Homelab Alert"
        message: |
          HOMELAB ALERT
          Alert: {{ .GroupLabels.alertname }}
          Status: {{ .Status }}
          Severity: {{ .GroupLabels.severity }}
          {{ range .Alerts }}
          Summary: {{ .Annotations.summary }}
          Description: {{ .Annotations.description }}
          {{ end }}
        urlSecretRef:
          name: discord-webhook-secret
          namespace: monitoring
          key: DISCORD_WEBHOOK_URL
        disableResolveMessage: false
  providerConfigRef:
    name: grafana-config
```

Notification policy for routing:
```yaml
apiVersion: alerting.grafana.crossplane.io/v1alpha1
kind: NotificationPolicy
metadata:
  name: discord-routing-policy
  annotations:
    argocd.argoproj.io/sync-wave: "11"
    argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
spec:
  forProvider:
    contactPoint: discord-homelab-alerts
    groupBy: ["grafana_folder", "alertname"]
    groupWait: "10s"
    groupInterval: "5m"
    repeatInterval: "12h"
    policy:
      - contactPoint: discord-homelab-alerts
        matcher:
          - label: grafana_folder
            match: "="
            value: "Homelab Alerts"
        groupWait: "5s"
        repeatInterval: "30m"
  providerConfigRef:
    name: grafana-config
```

Alert rules:
```yaml
apiVersion: alerting.grafana.crossplane.io/v1alpha1
kind: RuleGroup
metadata:
  name: ztunnel-alerts
  annotations:
    argocd.argoproj.io/sync-wave: "12"
spec:
  forProvider:
    name: ztunnel-alerts
    intervalSeconds: 60
    folderUid: "homelab-alerts"
    rule:
      - name: ZtunnelCertificateExpired
        for: "2m"
        condition: B
        noDataState: OK
        execErrState: Alerting
        data:
          - refId: A
            datasourceUid: loki
            model: |
              {
                "expr": "sum(count_over_time({namespace=\"istio-system\", app=\"ztunnel\"} |~ \"certificate.*[Ee]xpired\" [5m]))",
                "refId": "A"
              }
          - refId: B
            datasourceUid: __expr__
            model: |
              {
                "conditions": [
                  {
                    "evaluator": { "params": [5], "type": "gt" },
                    "query": { "params": ["A"] },
                    "reducer": { "type": "last" },
                    "type": "query"
                  }
                ],
                "type": "classic_conditions"
              }
        annotations:
          summary: "Istio ztunnel workload certificates have expired"
          description: "Ztunnel is reporting certificate expiry errors. Run: kubectl rollout restart daemonset/ztunnel -n istio-system"
        labels:
          severity: critical
          team: homelab
  providerConfigRef:
    name: grafana-config
```

All of this lives in Git. Push a change, ArgoCD syncs it, Crossplane reconciles Grafana. Alert rules as code.
The Issues I Hit
Getting this working wasn't smooth. Three issues in particular:
Issue 1: ExternalSecret API Version
The first manifests used external-secrets.io/v1beta1, but my cluster had a newer version of External Secrets Operator that required v1:
```yaml
# Wrong
apiVersion: external-secrets.io/v1beta1
# Right
apiVersion: external-secrets.io/v1
```

The symptom was the ExternalSecret staying in a pending state, with ArgoCD showing it as out of sync. Check your ESO version and use the matching API version.
Issue 2: org_id Type Mismatch
The Grafana provider expects org_id as a string in the credentials JSON, not an integer:
```json
// Wrong - causes provider authentication failures
{
  "auth": "admin:password",
  "url": "http://grafana:3000",
  "org_id": 1
}
// Right
{
  "auth": "admin:password",
  "url": "http://grafana:3000",
  "org_id": "1"
}
```

This one took time to track down. The provider would fail to authenticate with no obvious error message. The fix was changing "org_id": 1 to "org_id": "1" in the ExternalSecret template.
Issue 3: Sync Wave Timing
The ProviderConfig depends on two things: the Provider being installed (the ProviderConfig CRD doesn't exist until then) and the credentials Secret existing. Without proper ordering, ArgoCD would try to create the ProviderConfig before the provider was installed, failing dry-run validation.
The solution was sync waves (see GitOps All The Things for more on sync wave ordering):
```text
Wave 0: External Secrets (credentials)
Wave 1: Crossplane Core
Wave 2: Provider installation
Wave 5: ProviderConfig
Wave 10: Contact Points
Wave 11: Notification Policies
Wave 12: Rule Groups
```

Plus the SkipDryRunOnMissingResource=true annotation on resources that reference CRDs from providers. Without this, ArgoCD's dry-run fails because it can't validate resources for CRDs that don't exist yet.
```yaml
annotations:
  argocd.argoproj.io/sync-wave: "5"
  argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
```

The Provider → ProviderConfig → Resource Pattern
Understanding Crossplane's resource hierarchy helps avoid issues:
```text
Provider (pkg.crossplane.io/v1)
    └── Installs provider pod + CRDs
    └── Creates new API types (grafana.crossplane.io/*)
ProviderConfig (grafana.crossplane.io/v1beta1)
    └── Credentials + connection settings
    └── One per external system instance
Managed Resource (e.g., Folder, ContactPoint)
    └── References ProviderConfig
    └── Controller manages external resource
```

[Diagram: the full ecosystem, from the Upbound Marketplace through to your external systems]
Common mistakes:
- Creating ProviderConfig before Provider is ready
- Missing providerConfigRef on managed resources
- Wrong namespace for secrets (providers run in crossplane-system)
What Crossplane Gives You
Once working, the benefits are real:
GitOps for infrastructure: Alert rules are in Git. Review them in merge requests. Roll back with git revert. Same workflow as application code.
Continuous reconciliation: Someone manually changes an alert rule in Grafana? Crossplane reverts it. Desired state wins.
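Sometimes a manual change in Grafana is intentional, for example while testing a threshold. Crossplane has a pause annotation that stops reconciliation for a single managed resource. A minimal sketch, assuming the ztunnel-alerts RuleGroup from earlier and showing only the metadata that changes:

```yaml
# Partial manifest: add the annotation to the existing RuleGroup in Git,
# then remove it again to resume reconciliation.
apiVersion: alerting.grafana.crossplane.io/v1alpha1
kind: RuleGroup
metadata:
  name: ztunnel-alerts
  annotations:
    argocd.argoproj.io/sync-wave: "12"
    crossplane.io/paused: "true"   # Crossplane skips reconciling this resource while set
```

Once the annotation is removed, the next reconcile puts Grafana back in line with the manifest, so any manual edits worth keeping need to be copied into Git first.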
Kubernetes-native access control: Who can create alert rules? Configure it with RBAC. Same tools as pod security.
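As a rough illustration of that, the sketch below lets one group edit alert rules while only reading contact points. It assumes the Grafana resources are cluster-scoped (the manifests above carry no namespace), the role and group names are hypothetical, and the plural resource names are worth confirming with `kubectl api-resources` before relying on them.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-alert-editor          # hypothetical name
rules:
  # Full control over alert rule groups
  - apiGroups: ["alerting.grafana.crossplane.io"]
    resources: ["rulegroups"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # Read-only access to contact points
  - apiGroups: ["alerting.grafana.crossplane.io"]
    resources: ["contactpoints"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-alert-editor          # hypothetical name
subjects:
  - kind: Group
    name: homelab-oncall              # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: grafana-alert-editor
  apiGroup: rbac.authorization.k8s.io
```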
Dependency management: Resources can reference secrets, which can come from External Secrets, which pull from Infisical. The Kubernetes ecosystem composes naturally.
What I'd Change
More providers: The Grafana provider is useful, but Crossplane really shines with cloud providers. AWS, GCP, Azure providers let you manage databases, queues, storage - all as Kubernetes resources. For a homelab with cloud resources, this would be more valuable.
Compositions: Crossplane supports Composite Resource Definitions (XRDs) - essentially custom resources that compose multiple managed resources. I'm not using this yet, but it's how you'd build platform abstractions like "give me a database" that creates the instance, user, password secret, and network configuration together.
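To give a feel for what that looks like, here is a minimal XRD sketch. I have not deployed this: the group, kind, and storageGB field are hypothetical, and it uses the v1 XRD API (Crossplane 2.x also offers a v2 API with an explicit scope field, so check the docs for your version). A matching Composition would still be needed to map this type onto real managed resources.

```yaml
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xdatabases.platform.example.org   # must be <plural>.<group>
spec:
  group: platform.example.org             # hypothetical API group
  names:
    kind: XDatabase                       # hypothetical composite kind
    plural: xdatabases
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                storageGB:                # hypothetical user-facing parameter
                  type: integer
              required:
                - storageGB
```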
Better error visibility: When something fails, debugging often means reading controller logs. More specific status conditions on resources would help.
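For reference, the conditions Crossplane sets on every managed resource are Synced and Ready. A failing resource typically looks something like the excerpt below; the message is a placeholder, and in my experience it is often too generic to identify the root cause without the logs.

```yaml
# Excerpt of a managed resource's status when reconciliation fails (illustrative)
status:
  conditions:
    - type: Synced        # did the last reconcile against the external API succeed?
      status: "False"
      reason: ReconcileError
      message: "<error text returned by the provider>"
    - type: Ready         # is the external resource available for use?
      status: "False"
```

Running `kubectl describe` on the resource shows these conditions alongside recent events, which is usually the first place to look before opening the provider pod's logs.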
Exploring Available Providers
The Grafana provider is just one of many. The Upbound Marketplace is the official registry for Crossplane packages, with providers for:
- Cloud platforms: AWS, Azure, GCP, Oracle Cloud - manage EC2 instances, S3 buckets, Cloud SQL databases, all as Kubernetes resources
- Kubernetes: Manage resources in remote clusters
- Databases: Direct providers for PostgreSQL, MySQL, MongoDB
- Observability: Grafana, Datadog, New Relic
- DNS & Networking: Cloudflare, Route53
- Git & CI/CD: GitHub, GitLab
- And many more: Terraform (yes, you can run Terraform from Crossplane), Helm, Vault
Providers are published at xpkg.upbound.io and can be installed by referencing the package in a Provider resource. The marketplace shows documentation, supported resources, and version history for each provider.
For homelabs, interesting options beyond Grafana include the Cloudflare provider (manage DNS records as code), the Kubernetes provider (multi-cluster management), and the Terraform provider (for services without native Crossplane support).
This is Part 1 of the Platform Engineering series. See also the Homelab Kubernetes Series and Observability Series for the foundational setup this builds upon.
Sources:
- Upbound Marketplace - Official Crossplane provider registry
- CNCF Crossplane Project
- Crossplane Documentation