Apache Superset on Kubernetes: A Production Reference Architecture
Running Apache Superset in production means handling concurrent users, query complexity, state management, and infrastructure reliability—all at once. Kubernetes gives you the orchestration layer to do this well, but only if you architect it correctly. This guide walks through a production-ready reference architecture that teams at D23 have refined through hundreds of deployments across scale-ups, mid-market companies, and portfolio firms.
We’ll cover Helm-based deployment, autoscaling patterns, monitoring that actually catches problems, and disaster recovery strategies that don’t require you to rebuild your entire analytics stack at 2 AM.
Why Kubernetes for Apache Superset?
Apache Superset is stateless by design—the metadata lives in a database, cached query results live in Redis, and the application containers themselves are ephemeral. This architecture is almost made for Kubernetes. But the why matters before the how.
Traditional single-server deployments hit walls fast. One server goes down, your dashboards vanish. You need to scale Superset manually by adding more hardware. Your metadata database becomes a bottleneck. Your caching layer isn’t distributed. Kubernetes solves these problems through declarative infrastructure, automatic scaling, and self-healing.
When you deploy Apache Superset on Kubernetes, you gain:
Horizontal scaling. Add more replicas of the Superset application pod without touching code or configuration. Kubernetes distributes traffic automatically.
Resource isolation. Each pod gets defined CPU and memory limits. A runaway query in one dashboard doesn’t starve other users.
Self-healing. If a pod crashes, Kubernetes restarts it. If a node fails, Kubernetes reschedules your pods to healthy nodes.
Rolling updates. Deploy new versions of Superset without downtime. Kubernetes manages the rollout and can roll back if something breaks.
Declarative infrastructure. Your entire stack—Superset, Redis, PostgreSQL metadata store—lives in version control. Reproducible, auditable, and portable across cloud providers.
For data teams at scale-ups and mid-market companies, this means you stop spending time on infrastructure firefighting and start focusing on analytics. For engineering teams embedding self-serve BI and dashboards into products, Kubernetes gives you the reliability your users expect.
The Architecture: Components and Data Flow
A production Apache Superset deployment on Kubernetes consists of several interconnected components. Understanding how they talk to each other is critical to troubleshooting and scaling.
The Core Components
Superset Web Pods. These run the Flask application behind Gunicorn. They handle dashboard requests, user authentication, and the UI. They’re stateless—any pod can handle any request. You typically run 3-5 replicas depending on user load.
Superset Worker Pods. These execute long-running queries asynchronously using Celery. A user clicks “Run Query,” the web pod enqueues the job, a worker pod picks it up, executes it against your data warehouse, and stores the result in Redis. This keeps the web UI responsive even during heavy query load.
PostgreSQL Metadata Database. This is where Superset stores everything: dashboard definitions, user permissions, saved queries, alert configurations, datasource metadata. It’s not your data warehouse—it’s Superset’s internal state. You need a managed PostgreSQL instance (AWS RDS, Azure Database for PostgreSQL, GCP Cloud SQL) or a highly available PostgreSQL cluster inside Kubernetes with persistent volumes and automated backups.
Redis Cache. Superset caches query results, user sessions, and computed metadata in Redis. It’s not optional in production—it’s what keeps your dashboard load times under a second. Use a managed Redis service (AWS ElastiCache, Azure Cache for Redis) or run Redis as a StatefulSet in Kubernetes with persistence.
Ingress Controller. This routes external traffic to your Superset web pods. It handles TLS termination, path-based routing, and rate limiting. NGINX Ingress or AWS ALB Ingress are standard choices.
Data flows like this: A user opens a dashboard in their browser. The request hits the Ingress, which load-balances it to a Superset web pod. That pod queries the PostgreSQL metadata database to fetch the dashboard definition. It checks Redis for cached query results. If the cache is warm, it returns results immediately. If not, it enqueues a Celery task to a worker pod. The worker executes the query against your data warehouse (Snowflake, BigQuery, Redshift, etc.), stores the result in Redis, and the web pod retrieves it and renders the dashboard.
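The cache-or-enqueue decision at the heart of this flow can be sketched in a few lines of Python. This is a simplified illustration, not Superset's actual code: the dict stands in for Redis, and the `executor` callable stands in for the Celery worker running the query against your warehouse.

```python
import hashlib

cache = {}  # stands in for Redis

def run_query(sql: str, executor):
    """Return ("hit", result) from cache, or ("miss", result) after executing."""
    key = hashlib.sha256(sql.encode()).hexdigest()  # cache key derived from the query
    if key in cache:
        return "hit", cache[key]   # warm cache: respond immediately
    result = executor(sql)         # cold cache: worker runs the query
    cache[key] = result            # store for subsequent dashboard loads
    return "miss", result

first, rows = run_query("SELECT 1", lambda sql: [[1]])
second, rows2 = run_query("SELECT 1", lambda sql: [[1]])
```

The first call misses and executes; the second call for the same SQL is served from cache, which is exactly why warm dashboards load in well under a second.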
Helm-Based Deployment: From Zero to Production
Helm is the package manager for Kubernetes. Instead of writing raw YAML manifests for every component, you use Helm charts—templated, parameterized bundles that handle the complexity.
The Apache Superset Helm chart is maintained by the Apache Superset project and provides a production-ready starting point. It handles Gunicorn configuration, environment variables, secrets management, service definitions, and ingress setup.
Setting Up Helm and Adding the Repository
First, install Helm on your local machine or CI/CD pipeline. Then add the Apache Superset Helm repository:
helm repo add apache https://apache.github.io/superset
helm repo update
This gives you access to the official Superset chart. You can inspect what’s available:
helm search repo apache/superset
Creating a Custom Values File
Helm charts use a values.yaml file to configure deployments. The default values work for development, but production requires customization. Create a file called superset-production.yaml:
```yaml
image:
  repository: apache/superset
  tag: "3.1.0"  # Pin to a specific version, never use 'latest'
  pullPolicy: IfNotPresent

replicaCount: 3  # Start with 3 web pods for redundancy

resources:
  limits:
    cpu: 1000m
    memory: 1Gi
  requests:
    cpu: 500m
    memory: 512Mi

env:
  SUPERSET_ENV: production
  SUPERSET_LOAD_EXAMPLES: "false"
  SUPERSET_SECRET_KEY: "${SECRET_KEY}"  # Injected from Kubernetes Secret
  SQLALCHEMY_DATABASE_URI: "postgresql://${DB_USER}:${DB_PASSWORD}@${DB_HOST}:5432/${DB_NAME}"
  REDIS_URL: "redis://${REDIS_HOST}:6379/0"
  CACHE_REDIS_URL: "redis://${REDIS_HOST}:6379/1"
  CELERY_BROKER_URL: "redis://${REDIS_HOST}:6379/2"
  CELERY_RESULT_BACKEND: "redis://${REDIS_HOST}:6379/3"

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
  hosts:
    - host: analytics.yourdomain.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: superset-tls
      hosts:
        - analytics.yourdomain.com

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

worker:
  enabled: true
  replicaCount: 2
  resources:
    limits:
      cpu: 2000m
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 1Gi
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 5
    targetCPUUtilizationPercentage: 75
```
This configuration:
- Pins the Superset image to version 3.1.0 (never use 'latest' in production)
- Runs 3 web pod replicas for redundancy
- Sets resource requests and limits so Kubernetes knows how much compute each pod needs
- Configures PostgreSQL and Redis connection strings via environment variables
- Enables HTTPS with Let’s Encrypt certificates via cert-manager
- Sets up horizontal pod autoscaling based on CPU and memory utilization
- Enables Celery workers for async query execution
Deploying with Helm
Create a Kubernetes namespace for Superset:
kubectl create namespace superset
Create a Kubernetes Secret for sensitive values:
kubectl create secret generic superset-secrets \
--from-literal=db-password='your-secure-password' \
--from-literal=redis-password='your-redis-password' \
--from-literal=secret-key='your-flask-secret-key' \
-n superset
Deploy Superset using Helm:
helm install superset apache/superset \
-f superset-production.yaml \
-n superset
Watch the rollout:
kubectl rollout status deployment/superset -n superset
Within a few minutes, your Superset instance should be running. Verify:
kubectl get pods -n superset
kubectl get svc -n superset
kubectl get ingress -n superset
You should see 3 web pods, 2 worker pods, and an Ingress resource with your domain.
Configuring Gunicorn for Concurrent Load
Superset's web tier runs on Gunicorn, a Python WSGI server that handles HTTP requests. By default, Gunicorn spawns a limited number of worker processes, and in production this becomes a bottleneck.
Gunicorn workers are OS-level processes. A sync worker handles one request at a time: with 4 workers and 100 concurrent requests, 96 of them sit in a queue. This is why tuning Gunicorn is critical.
In your Helm values, configure Gunicorn via environment variables:
```yaml
env:
  GUNICORN_WORKERS: "4"
  GUNICORN_THREADS: "2"
  GUNICORN_WORKER_CLASS: "gthread"
  GUNICORN_TIMEOUT: "120"
```
Break this down:
GUNICORN_WORKERS: “4”. The number of worker processes. For a pod with 1 CPU, use 2-4 workers. For a pod with 2 CPUs, use 4-8. Rule of thumb: (2 × CPU cores) + 1.
GUNICORN_THREADS: “2”. Each worker can spawn multiple threads. This is useful for I/O-bound operations like querying your database. With 4 workers × 2 threads, you can handle ~8 concurrent requests per pod.
GUNICORN_WORKER_CLASS: “gthread”. This tells Gunicorn to use the threaded worker class, which is better for handling concurrent requests than the default sync worker.
GUNICORN_TIMEOUT: “120”. Timeout in seconds. If a request takes longer than 120 seconds, Gunicorn kills the worker and starts a new one. This prevents hung requests from consuming resources indefinitely. Adjust based on your query patterns—if you have long-running dashboards, increase this.
With these settings, each Superset web pod can handle roughly 8-16 concurrent requests. If you have 100 concurrent users, you need 6-12 web pods. This is where autoscaling comes in.
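The sizing rules above can be expressed as a back-of-the-envelope helper. This is not part of Superset, just arithmetic; note that "concurrent requests" here means requests actually in flight, which is usually a fraction of logged-in users, so the pod count for 100 simultaneous requests is an upper bound.

```python
import math

def gunicorn_workers(cpu_cores: int) -> int:
    """Gunicorn's rule of thumb: (2 x CPU cores) + 1 worker processes."""
    return 2 * cpu_cores + 1

def pods_needed(in_flight_requests: int, workers: int, threads: int) -> int:
    """Each pod handles roughly workers x threads requests at once."""
    per_pod = workers * threads
    return math.ceil(in_flight_requests / per_pod)

print(gunicorn_workers(1))     # 1-CPU pod
print(pods_needed(100, 4, 2))  # pods for 100 in-flight requests at 4 workers x 2 threads
```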
Autoscaling: Horizontal Pod Autoscaler and Metrics
Kubernetes’ Horizontal Pod Autoscaler (HPA) watches metrics like CPU and memory usage and automatically scales your deployment up or down. This is what makes Kubernetes powerful for analytics workloads—your infrastructure adapts to demand.
The HPA in your Helm values configuration looks like this:
```yaml
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80
```
This says: Keep at least 3 pods running. If CPU usage exceeds 70% or memory exceeds 80%, add more pods. Don’t go above 10 pods.
For this to work, you need the Kubernetes Metrics Server installed. Most managed Kubernetes services (EKS, GKE, AKS) have it pre-installed. If not:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Monitor the HPA:
kubectl get hpa -n superset
You’ll see output like:
```
NAME       REFERENCE             TARGETS            MINPODS   MAXPODS   REPLICAS   AGE
superset   Deployment/superset   72%/70%, 65%/80%   3         10        5          2h
```
This shows that CPU is at 72% (above the 70% target), so the HPA is scaling up. It currently has 5 replicas running.
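The HPA's scaling decision follows a documented formula: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), taking the maximum across all configured metrics. (The real controller also applies a tolerance band and readiness checks, so small excursions like 72% against a 70% target may not trigger immediately.) Reproducing the numbers from the output above:

```python
import math

def hpa_desired(current_replicas: int, current_util: float, target_util: float) -> int:
    # Kubernetes HPA formula: ceil(replicas * current / target)
    return math.ceil(current_replicas * current_util / target_util)

cpu = hpa_desired(5, 72, 70)  # CPU slightly above target
mem = hpa_desired(5, 65, 80)  # memory below target
desired = max(cpu, mem)       # HPA takes the max across metrics
```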
Scaling Workers for Async Queries
Your Celery workers also need to autoscale. If you have a queue of 100 queries waiting to execute, but only 2 worker pods, users wait a long time for results. Configure worker autoscaling:
```yaml
worker:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 5
    targetCPUUtilizationPercentage: 75
```
Workers typically need more CPU and memory than web pods because they’re executing complex queries. Adjust resources accordingly:
```yaml
worker:
  resources:
    limits:
      cpu: 2000m
      memory: 2Gi
    requests:
      cpu: 1000m
      memory: 1Gi
```
Monitoring: Observability in Production
Deploying Superset is one thing. Knowing when something breaks is another. Production monitoring requires three layers: application metrics, infrastructure metrics, and logs.
Application Metrics with Prometheus
Superset emits application metrics via StatsD rather than exposing a Prometheus endpoint natively; a common pattern is a statsd-exporter sidecar that translates them into a /metrics endpoint Prometheus can scrape. Point Prometheus at that endpoint:
```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: superset-metrics
  namespace: superset
spec:
  selector:
    app: superset
  ports:
    - port: 8888
      name: metrics
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: superset
  namespace: superset
spec:
  selector:
    matchLabels:
      app: superset
  endpoints:
    - port: metrics
      interval: 30s
```
Key metrics to alert on:
superset_requests_duration_seconds. How long dashboard requests take. Alert if p95 latency exceeds 2 seconds.
superset_database_queries_total. Total number of queries executed. A sudden drop indicates a problem.
superset_cache_hit_rate. Percentage of queries served from cache. Below 50% means your cache is cold or misconfigured.
superset_celery_queue_length. Number of queries waiting in the Celery queue. If this grows unbounded, workers are falling behind.
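The p95 alert above is just a nearest-rank percentile over request durations. A quick illustration of how such a check could be evaluated, using hypothetical sample data:

```python
import math

def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile: the value at or below which 95% of samples fall."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# 94 fast requests plus 6 slow ones: the slow tail pushes p95 past the 2s threshold
durations = [0.1] * 94 + [3.0] * 6
latency = p95(durations)
alert = latency > 2.0
```

This is why p95 is a better alerting signal than the average: the mean of the same sample is well under 2 seconds, but one user in twenty is waiting 3 seconds.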
Infrastructure Metrics
Kubernetes itself provides metrics via the Metrics Server. Use Prometheus to scrape kubelet metrics:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: superset-alerts
  namespace: superset
spec:
  groups:
    - name: superset
      rules:
        - alert: SupersetHighMemoryUsage
          expr: |
            container_memory_usage_bytes{pod=~"superset-.*"} /
            container_spec_memory_limit_bytes{pod=~"superset-.*"} > 0.9
          for: 5m
          annotations:
            summary: "Superset pod {{ $labels.pod }} memory usage above 90%"
        - alert: SupersetHighCPUUsage
          expr: |
            rate(container_cpu_usage_seconds_total{pod=~"superset-.*"}[5m]) > 0.8
          for: 5m
          annotations:
            summary: "Superset pod {{ $labels.pod }} CPU usage above 80%"
        - alert: SupersetPodCrashLooping
          expr: |
            rate(kube_pod_container_status_restarts_total{pod=~"superset-.*"}[15m]) > 0
          for: 5m
          annotations:
            summary: "Superset pod {{ $labels.pod }} is crash looping"
```
Centralized Logging
Logs are critical. When a user reports “my dashboard is broken,” you need to see what happened. Use a centralized logging stack:
Fluent Bit or Filebeat collects logs from all Superset pods and ships them to a central store.
Elasticsearch indexes and stores the logs.
Kibana lets you search and visualize logs.
Deploy Fluent Bit as a DaemonSet (one pod per Kubernetes node):
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: superset
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        5
        Log_Level    info

    [INPUT]
        Name    tail
        Path    /var/log/containers/*superset*.log
        Parser  docker
        Tag     superset.*

    [OUTPUT]
        Name    es
        Match   superset.*
        Host    elasticsearch.logging.svc.cluster.local
        Port    9200
        # Logstash_Format produces daily indices (superset-YYYY.MM.DD);
        # the es output's Index parameter does not expand date patterns
        Logstash_Format  On
        Logstash_Prefix  superset
```
When a dashboard breaks, search Kibana for error logs:
namespace:superset AND level:ERROR
You’ll see stack traces, database connection errors, and query failures immediately.
Disaster Recovery: Backups and Failover
Your metadata database is your single point of failure. If PostgreSQL goes down, Superset can’t run. If you lose data, you lose all dashboard definitions.
Automated PostgreSQL Backups
Use a managed database service (AWS RDS, GCP Cloud SQL, Azure Database for PostgreSQL). These provide automated backups, point-in-time recovery, and read replicas.
Configure your RDS instance:
- Backup retention: 30 days (longer if you can afford it)
- Backup window: Off-peak hours (e.g., 2-4 AM UTC)
- Multi-AZ deployment: Yes. This gives you automatic failover if the primary database goes down.
- Encryption at rest: Yes
- Encryption in transit: Yes (require SSL connections)
Test recovery regularly. Once a month, restore a backup to a test environment and verify that Superset starts up correctly.
Redis Persistence
Redis is in-memory, so if the pod crashes, you lose all cached data. This isn’t catastrophic—the cache will rebuild as users access dashboards—but it causes a temporary performance hit.
Enable Redis persistence:
```yaml
redis:
  persistence:
    enabled: true
    size: 10Gi
    storageClassName: gp3  # AWS EBS, adjust for your cloud provider
```
This writes Redis data to disk periodically (RDB snapshots) and, if AOF is enabled, logs every write (append-only file). If Redis crashes, it can recover its dataset from disk.
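The corresponding Redis settings look roughly like this (a sketch with illustrative values; tune the snapshot intervals to your workload):

```
save 900 1            # RDB snapshot if at least 1 key changed in 15 minutes
save 300 10           # snapshot sooner under heavier write load
appendonly yes        # AOF: log every write operation
appendfsync everysec  # fsync the AOF once per second (good durability/latency balance)
```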
Superset Metadata Exports
Superset has a built-in export feature. Periodically export your dashboards, datasets, and charts as JSON:
kubectl exec -it superset-web-0 -n superset -- \
superset export-dashboards -f /tmp/dashboards.json
kubectl cp superset/superset-web-0:/tmp/dashboards.json ./dashboards.json
Store these exports in S3 or GCS. If you need to rebuild Superset, you can import them:
kubectl cp ./dashboards.json superset/superset-web-0:/tmp/
kubectl exec -it superset-web-0 -n superset -- \
superset import-dashboards -p /tmp/dashboards.json
Automate this with a Kubernetes CronJob:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: superset-backup
  namespace: superset
spec:
  schedule: "0 3 * * *"  # 3 AM UTC daily
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: superset-backup
          containers:
            - name: backup
              # Note: the stock apache/superset image does not bundle the aws CLI;
              # extend the image to include it for the s3 cp step below
              image: apache/superset:3.1.0
              command:
                - /bin/bash
                - -c
                - |
                  superset export-dashboards -f /tmp/dashboards.json
                  aws s3 cp /tmp/dashboards.json s3://your-backup-bucket/superset-$(date +%Y%m%d).json
              env:
                - name: SUPERSET_ENV
                  value: production
                - name: SQLALCHEMY_DATABASE_URI
                  valueFrom:
                    secretKeyRef:
                      name: superset-secrets
                      key: db-uri
          restartPolicy: OnFailure
```
Security: Secrets, RBAC, and Network Policies
Production deployments handle sensitive data. Implement security at multiple layers.
Secret Management
Never hardcode database passwords or API keys in your Helm values. Use Kubernetes Secrets:
kubectl create secret generic superset-db \
--from-literal=password='your-secure-password' \
-n superset
Reference the secret in your Helm values. Map the secret's password key to the DB_PASSWORD variable that the connection URI interpolates:
```yaml
env:
  DB_PASSWORD:
    valueFrom:
      secretKeyRef:
        name: superset-db
        key: password
```
For additional security, use an external secret manager (AWS Secrets Manager, HashiCorp Vault) and sync secrets to Kubernetes:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets
  namespace: superset
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: superset-db
  namespace: superset
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets
    kind: SecretStore
  target:
    name: superset-db
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: superset/db-password
```
Role-Based Access Control (RBAC)
Create a Kubernetes ServiceAccount for Superset with minimal permissions:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: superset
  namespace: superset
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: superset
  namespace: superset
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: superset
  namespace: superset
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: superset
subjects:
  - kind: ServiceAccount
    name: superset
    namespace: superset
```
This allows Superset to read ConfigMaps and Secrets in its own namespace, nothing more.
Network Policies
Restrict traffic between pods using Network Policies:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: superset-network-policy
  namespace: superset
spec:
  podSelector:
    matchLabels:
      app: superset
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8088
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
```
This policy says: Superset pods can only receive traffic from the Ingress controller and can only send traffic to Redis, PostgreSQL, and DNS. In practice you will also need egress rules for your data warehouse endpoints, or worker queries will be blocked.
Handling Datasource Connections and Query Execution
Superset connects to multiple data warehouses (Snowflake, BigQuery, Redshift, etc.). Managing these connections securely and efficiently is critical.
Datasource Credentials
Store datasource credentials in Kubernetes Secrets, not in Superset’s UI. Create a Secret for each datasource:
kubectl create secret generic snowflake-creds \
--from-literal=username='service_account' \
--from-literal=password='secure-password' \
--from-literal=account='xy12345.us-east-1' \
-n superset
Mount the secret as environment variables in the Superset pod:
```yaml
env:
  SNOWFLAKE_USERNAME:
    valueFrom:
      secretKeyRef:
        name: snowflake-creds
        key: username
  SNOWFLAKE_PASSWORD:
    valueFrom:
      secretKeyRef:
        name: snowflake-creds
        key: password
  SNOWFLAKE_ACCOUNT:
    valueFrom:
      secretKeyRef:
        name: snowflake-creds
        key: account
```
In the Superset UI, reference these environment variables when creating datasources.
Query Caching and Performance
Caching is what makes Superset fast. Without caching, every dashboard load queries your data warehouse, which is slow and expensive.
Configure cache settings in your Helm values:
```yaml
env:
  CACHE_DEFAULT_TIMEOUT: "3600"  # 1 hour
  SUPERSET_CACHE_VALUES_TIMEOUT: "3600"
  CACHE_TYPE: "RedisCache"
  CACHE_REDIS_URL: "redis://redis:6379/1"
```
This tells Superset to cache query results in Redis for 1 hour. When a user opens a dashboard, Superset checks the cache first. If the result is fresh, it returns it immediately. If it’s stale, it re-executes the query and updates the cache.
For dashboards that change frequently (real-time KPIs), reduce the timeout:
```yaml
env:
  SUPERSET_CACHE_VALUES_TIMEOUT: "300"  # 5 minutes
```
For dashboards that are static (historical reports), increase it:
```yaml
env:
  SUPERSET_CACHE_VALUES_TIMEOUT: "86400"  # 24 hours
```
Advanced Patterns: AI and API-First Analytics
D23 extends Superset with AI-powered analytics and API-first architecture. This is where modern analytics platforms differentiate.
Text-to-SQL with LLMs
Superset can integrate with language models to convert natural language questions into SQL queries. A user asks “What were our top 10 customers by revenue last month?” and Superset generates the SQL automatically.
This requires:
- An LLM endpoint (OpenAI API, Anthropic Claude, open-source models)
- A schema understanding layer that feeds the model table and column definitions
- A validation layer that checks generated SQL before execution
Deploy this as a sidecar service alongside Superset:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: text-to-sql
  namespace: superset
spec:
  selector:
    app: text-to-sql
  ports:
    - port: 8000
      name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: text-to-sql
  namespace: superset
spec:
  replicas: 2
  selector:
    matchLabels:
      app: text-to-sql
  template:
    metadata:
      labels:
        app: text-to-sql
    spec:
      containers:
        - name: text-to-sql
          image: your-org/text-to-sql:1.0.0
          ports:
            - containerPort: 8000
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: openai-creds
                  key: api-key
            - name: SUPERSET_API_URL
              value: "http://superset-api:8088"
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 1Gi
```
Superset calls this service via HTTP when a user types a natural language query.
API-First Architecture
Superset is increasingly used as an embedded analytics platform. Instead of users logging into Superset directly, they access dashboards embedded in your product via APIs.
Superset’s REST API lets you:
- Create and update dashboards programmatically
- Query data directly without the UI
- Manage users and permissions
- Export and import configurations
Secure the API with JWT tokens:
```yaml
env:
  SUPERSET_API_SECURITY_ENABLED: "true"
  JWT_SECRET_KEY: "your-jwt-secret"
```
Clients authenticate and get a JWT token:
curl -X POST https://analytics.yourdomain.com/api/v1/security/login \
-H "Content-Type: application/json" \
-d '{"username": "service_account", "password": "password"}'
They then use the token to query data:
curl -X GET https://analytics.yourdomain.com/api/v1/chart/123/data \
-H "Authorization: Bearer $JWT_TOKEN"
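In application code, the same two-step flow, logging in and then calling data endpoints with the bearer token, can be sketched as request builders. This is an illustrative helper, not an official client; the base URL, credentials, and chart ID are placeholders, and the endpoint paths follow Superset's REST API as shown in the curl examples above.

```python
def login_request(base_url: str, username: str, password: str) -> dict:
    """Arguments for POST /api/v1/security/login; the response carries access_token."""
    return {
        "url": f"{base_url}/api/v1/security/login",
        "json": {"username": username, "password": password, "provider": "db"},
    }

def chart_data_request(base_url: str, chart_id: int, token: str) -> dict:
    """Arguments for GET /api/v1/chart/<id>/data with the JWT as a bearer token."""
    return {
        "url": f"{base_url}/api/v1/chart/{chart_id}/data",
        "headers": {"Authorization": f"Bearer {token}"},
    }

base = "https://analytics.yourdomain.com"
login = login_request(base, "service_account", "password")
data = chart_data_request(base, 123, "token-from-login-response")
```

Pass these dicts to your HTTP client of choice (e.g. `requests.post(**login)`), refreshing the token when it expires.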
For product teams embedding analytics, this is critical. You can build custom UIs that call Superset’s API, giving you full control over the user experience while leveraging Superset’s query engine and caching layer.
Upgrading Superset in Production
New versions of Superset are released regularly with bug fixes, features, and security patches. Upgrading requires care to avoid downtime.
Kubernetes’ rolling updates handle this automatically. When you upgrade the Helm chart, Kubernetes:
- Starts a new pod with the new Superset version
- Waits for it to become healthy (readiness probe passes)
- Stops an old pod
- Repeats until all pods are updated
Configure readiness and liveness probes in your Helm values:
```yaml
readinessProbe:
  httpGet:
    path: /health
    port: 8088
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3

livenessProbe:
  httpGet:
    path: /health
    port: 8088
  initialDelaySeconds: 60
  periodSeconds: 30
  timeoutSeconds: 5
  failureThreshold: 3
```
To upgrade:
helm repo update apache
helm upgrade superset apache/superset \
-f superset-production.yaml \
-n superset
Watch the rollout:
kubectl rollout status deployment/superset -n superset
If something breaks, roll back immediately:
helm rollback superset -n superset
Always test upgrades in a staging environment first. Spin up a copy of your production Superset stack and upgrade it. Verify that dashboards load, queries execute, and users can authenticate. Only then upgrade production.
Cost Optimization and Capacity Planning
Running Superset on Kubernetes can be expensive if you’re not careful. Here’s how to optimize:
Right-Sizing Pods
Monitor actual resource usage and adjust requests and limits:
kubectl top pods -n superset
If your web pods consistently use 300m CPU and 256Mi memory, you’re over-provisioning. Reduce requests:
```yaml
resources:
  requests:
    cpu: 300m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
```
Smaller requests mean Kubernetes can pack more pods per node, reducing your overall cluster size.
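The packing effect is easy to quantify with a rough calculation that ignores system reservations and DaemonSet overhead (so real capacity is slightly lower):

```python
def pods_per_node(node_cpu_m: int, node_mem_mi: int,
                  req_cpu_m: int, req_mem_mi: int) -> int:
    """Pods that fit on one node, bounded by the tighter of CPU and memory."""
    return min(node_cpu_m // req_cpu_m, node_mem_mi // req_mem_mi)

# n2-standard-4: 4 vCPU (4000m), 16Gi (16384Mi)
before = pods_per_node(4000, 16384, 500, 512)  # original requests: 500m / 512Mi
after = pods_per_node(4000, 16384, 300, 256)   # right-sized: 300m / 256Mi
```

Right-sizing from 500m/512Mi down to 300m/256Mi fits noticeably more web pods per node, which translates directly into fewer nodes for the same replica count.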
Reserved Instances and Spot Instances
For stable, predictable workloads, use reserved instances (AWS Reserved Instances, GCP Committed Use Discounts). They’re 30-50% cheaper than on-demand.
For worker pods that tolerate interruptions, use spot instances (AWS Spot, GCP Preemptible/Spot). They're 70-90% cheaper but can be reclaimed on short notice: a two-minute warning on AWS, 30 seconds on GCP.
Configure your Kubernetes nodes:
```yaml
nodePool:
  preemptible: true           # Use spot instances
  machineType: n2-standard-4  # 4 CPUs, 16GB RAM
  minNodes: 2
  maxNodes: 10
```
Reserved Capacity for Critical Workloads
Use Pod Disruption Budgets to ensure critical pods aren’t evicted during node maintenance:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: superset-web-pdb
  namespace: superset
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: superset
      tier: web
```
This ensures at least 2 web pods are always running, even during node drains.
Conclusion: From Deployment to Operations
Deploying Apache Superset on Kubernetes is the foundation for a scalable, reliable analytics platform. But deployment is just the beginning.
Production operations require:
- Monitoring that catches problems before users do
- Runbooks for common issues (high latency, query failures, pod crashes)
- Capacity planning to ensure you have enough resources as usage grows
- Security reviews to keep credentials and data safe
- Disaster recovery drills to verify backups work
Teams at D23 have refined these patterns across hundreds of deployments. We provide managed Apache Superset hosting with expert data consulting, so you don’t have to manage this infrastructure yourself. But whether you manage it or we do, the architecture remains the same: stateless web pods, async workers, centralized metadata storage, distributed caching, and observability at every layer.
Start with this reference architecture. Monitor closely. Iterate based on your actual workload. Scale horizontally, not vertically. And always test changes in staging before production.
Your analytics platform will be faster, more reliable, and easier to operate because of it.