Apache Superset on AWS: A Production Deployment Guide
Deploy Apache Superset on AWS with ECS, RDS, ElastiCache, and ALB. Production-grade architecture, security, and scaling patterns for analytics at scale.
Deploying Apache Superset to production on AWS requires more than spinning up an EC2 instance and pointing a load balancer at it. You need a hardened, scalable architecture that handles concurrent users, query latency, state management, and security at the level your organization demands. This guide walks you through the production deployment patterns that teams at scale-ups and mid-market companies use to run analytics infrastructure without the operational overhead of Looker or Tableau.
We’ll cover the AWS stack—ECS, RDS, ElastiCache, Application Load Balancer (ALB), and IAM—and explain the architectural decisions that separate a weekend project from a production system that scales to hundreds of concurrent dashboard users.
Why AWS for Apache Superset?
AWS is the natural home for Superset deployments at scale. Unlike managed SaaS platforms, you retain full control over your deployment, data residency, and customization. You avoid the per-seat licensing and vendor lock-in that comes with Looker or Tableau. At the same time, AWS services like ECS, RDS, and ElastiCache abstract away infrastructure management, letting your team focus on analytics and dashboards rather than server patching.
The cost model is transparent and predictable. You pay for compute (ECS tasks), database capacity (RDS), caching (ElastiCache), and data transfer—not per user or per dashboard. For teams embedding analytics or running internal BI platforms, this often costs 30-50% less than enterprise BI vendors while giving you the flexibility to customize every layer of the stack.
Apache Superset itself is battle-tested. It powers analytics at companies like Airbnb, where it was born, and thousands of organizations worldwide. When you deploy Superset on AWS, you’re building on a proven, open-source foundation rather than betting on a proprietary platform’s roadmap.
Core Architecture: The Three-Tier Pattern
A production Superset deployment on AWS follows a three-tier architecture: a stateless application layer (ECS), a persistent data layer (RDS), and a caching layer (ElastiCache). This separation of concerns is critical.
Application Layer (ECS): Superset runs as a containerized workload on Amazon ECS (Elastic Container Service). ECS is AWS’s managed container orchestration service—simpler than Kubernetes if you’re not already running it, and deeply integrated with other AWS services. You define a task definition that specifies the Docker image, CPU and memory allocation, environment variables, and secrets. ECS launches multiple copies of this task across availability zones, and an Application Load Balancer distributes traffic across them. If a task crashes, ECS automatically replaces it. If traffic spikes, you can scale up the task count in seconds.
Data Layer (RDS): Superset’s metadata—dashboards, charts, user accounts, permissions, and query history—lives in a relational database. In production, this is Amazon RDS (Relational Database Service) running PostgreSQL. RDS handles automated backups, multi-AZ failover, and point-in-time recovery. You don’t manage patches or replication; AWS does. The database is not publicly accessible; it lives in a private subnet and communicates with ECS tasks via a security group.
Caching Layer (ElastiCache): Superset uses Redis or Memcached for session storage, query result caching, and background job queues. ElastiCache is AWS’s managed Redis/Memcached service. It gives you high-performance, in-memory storage without running your own Redis cluster. For production, you’ll use ElastiCache Redis with multi-AZ replication and automatic failover. This ensures that if a cache node fails, your sessions don’t disappear and your queries don’t restart.
These three layers communicate over private networks. Users reach Superset through an Application Load Balancer (ALB), which terminates TLS, routes traffic to healthy ECS tasks, and handles SSL/TLS offloading. Everything is secured with security groups and IAM roles—no task has direct internet access, and no database credentials are hardcoded in your Docker image.
Containerizing Superset: Docker Image and Task Definition
Before you can run Superset on ECS, you need a Docker image. You have two options: use the official Apache Superset image from Docker Hub, or build your own.
For most teams, starting with the official image is the right choice. You can pull apache/superset from Docker Hub (pin a specific version tag rather than latest in production so builds are reproducible). However, you’ll likely want to customize it—adding Python packages for database drivers, installing custom fonts for dashboards, or baking in your own configuration.
Here’s a minimal Dockerfile that extends the official image:
FROM apache/superset:latest
# Install additional database drivers
RUN pip install psycopg2-binary snowflake-sqlalchemy "PyAthena[SQLAlchemy]"
# Copy custom configuration
COPY superset_config.py /app/superset_config.py
ENV SUPERSET_CONFIG_PATH=/app/superset_config.py
This image includes PostgreSQL drivers (psycopg2), Snowflake support, and Amazon Athena connectivity—common requirements for mid-market analytics teams. (Superset connects to databases through SQLAlchemy dialects, so each driver you install must ship one.) You build it, tag it with your AWS account ID and ECR (Elastic Container Registry) URI, and push it to ECR. ECS pulls from ECR during task launch.
Next, you define an ECS task definition. This is a JSON template that tells ECS how to run your Superset container. Key fields include:
Container Definition:
- Image URI (your ECR repository)
- CPU and memory allocation (e.g., 512 CPU units, 1024 MB memory per task)
- Port mapping (Superset runs on port 8088 internally)
- Environment variables (database connection strings, Redis URLs, Flask configuration)
- Secrets (database passwords, secret keys—pulled from AWS Secrets Manager, not hardcoded)
- Log configuration (CloudWatch Logs for debugging and monitoring)
Task Role and Execution Role:
- Execution role: allows ECS to pull the image from ECR, write logs to CloudWatch, and retrieve secrets from Secrets Manager
- Task role: allows the running Superset container to access AWS services (e.g., S3 for exporting dashboards, STS for cross-account access if needed)
Here’s a simplified task definition snippet:
{
"family": "superset-prod",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"executionRoleArn": "arn:aws:iam::ACCOUNT_ID:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::ACCOUNT_ID:role/ecsTaskRole",
"containerDefinitions": [
{
"name": "superset",
"image": "ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com/superset:latest",
"portMappings": [
{
"containerPort": 8088,
"protocol": "tcp"
}
],
"environment": [
{
"name": "SUPERSET_ENV",
"value": "production"
},
{
"name": "REDIS_URL",
"value": "redis://superset-cache.abc123.ng.0001.use1.cache.amazonaws.com:6379/0"
}
],
"secrets": [
{
"name": "SQLALCHEMY_DATABASE_URI",
"valueFrom": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:superset-db-uri"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/superset",
"awslogs-region": "REGION",
"awslogs-stream-prefix": "ecs"
}
}
}
]
}
This tells ECS to run Superset with 512 CPU units (0.5 vCPU) and 1024 MB of memory. The container pulls its database URI and other secrets from AWS Secrets Manager at runtime. Logs go to CloudWatch, where you can search, filter, and alert on them. For detailed guidance on task definition parameters, the Amazon ECS Task Definition Parameters documentation covers every field.
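Before registering a definition like the one above, a quick local lint catches the most common mistakes. The helper below is illustrative (the function name and checks are this guide's, not part of any AWS SDK): it flags secret-looking values placed in plain environment variables, missing log configuration, and a network mode incompatible with Fargate.

```python
# Illustrative pre-deploy lint for an ECS task definition dict.
# The checks and helper name are this guide's, not an AWS API.
def lint_task_definition(task_def: dict) -> list[str]:
    problems = []
    for container in task_def.get("containerDefinitions", []):
        # Credentials belong in "secrets" (Secrets Manager), never "environment"
        for env in container.get("environment", []):
            if any(w in env["name"].upper() for w in ("PASSWORD", "SECRET", "URI")):
                problems.append(f"plaintext secret-like env var: {env['name']}")
        if "logConfiguration" not in container:
            problems.append(f"{container['name']}: no log configuration")
    if task_def.get("networkMode") != "awsvpc":
        problems.append("Fargate requires networkMode=awsvpc")
    return problems

task_def = {
    "networkMode": "awsvpc",
    "containerDefinitions": [{
        "name": "superset",
        "environment": [{"name": "SUPERSET_ENV", "value": "production"}],
        "logConfiguration": {"logDriver": "awslogs"},
    }],
}
print(lint_task_definition(task_def))  # → []
```

Running this in CI before `aws ecs register-task-definition` is a cheap guardrail against the hardcoded-secrets pitfall discussed later.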
Database Setup: RDS PostgreSQL
Superset’s metadata database is its source of truth. Every dashboard, chart, user, and permission is stored here. In production, you absolutely cannot use SQLite (the default development database). You need a managed, backed-up, replicated database.
Amazon RDS PostgreSQL is the standard choice. Here’s what a production RDS instance looks like:
Instance Configuration:
- Engine: PostgreSQL 15 or later
- Instance class: db.t3.small or larger (t3.small handles ~50 concurrent users; scale up for more)
- Multi-AZ: enabled (automatic failover to a standby replica in another availability zone)
- Storage: 100 GB gp3 (general-purpose SSD), with automatic scaling enabled
- Backup retention: 30 days
- Encryption: enabled (at rest with AWS KMS, in transit with SSL)
Network Configuration:
- VPC: same VPC as your ECS tasks
- Subnet group: private subnets only (no public accessibility)
- Security group: allows inbound traffic on port 5432 from the ECS security group only
Before your first ECS task starts, you need to initialize the RDS database. Superset includes a CLI command to set up the schema:
superset db upgrade
superset fab create-admin --username admin --firstname Admin --lastname User --email admin@example.com --password yourpassword
superset load_examples # optional: loads sample dashboards
superset init # creates default roles and permissions
You run these commands once, either in a separate ECS task or locally with network access to RDS. After initialization, your ECS tasks connect to the pre-initialized database at startup.
Store your RDS endpoint and credentials in AWS Secrets Manager. Your task definition pulls them as secrets, never exposing them in environment variables or logs. The connection string looks like:
postgresql://superset_user:password@superset-db.abc123.us-east-1.rds.amazonaws.com:5432/superset
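When RDS manages the secret, Secrets Manager stores it as a JSON payload with username, password, host, port, and dbname fields rather than a ready-made URI. A small sketch of assembling the connection string from that payload (the values below are placeholders; note that the password must be URL-quoted):

```python
import json
from urllib.parse import quote_plus

# Sample payload in the shape Secrets Manager uses for RDS-managed
# credentials; all values here are placeholders.
secret_json = json.dumps({
    "username": "superset_user",
    "password": "p@ss/word",
    "host": "superset-db.abc123.us-east-1.rds.amazonaws.com",
    "port": 5432,
    "dbname": "superset",
})

def build_db_uri(payload: str) -> str:
    s = json.loads(payload)
    # quote_plus keeps special characters in the password URI-safe
    return (f"postgresql://{s['username']}:{quote_plus(s['password'])}"
            f"@{s['host']}:{s['port']}/{s['dbname']}")

print(build_db_uri(secret_json))
# → postgresql://superset_user:p%40ss%2Fword@superset-db.abc123.us-east-1.rds.amazonaws.com:5432/superset
```

Whether you store the assembled URI as a single secret (as the task definition above does) or assemble it at startup is a matter of taste; the single-secret approach keeps the container entrypoint simpler.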
Caching and Session Management: ElastiCache Redis
Superset uses Redis for three critical functions: session storage, query result caching, and background job queues. Without Redis, sessions are lost when a task restarts, cached query results are discarded, and long-running queries have nowhere to queue.
Amazon ElastiCache for Redis is a managed Redis service that handles replication, failover, and scaling. For production Superset, configure it like this:
Cluster Configuration:
- Engine: Redis 7.0 or later
- Node type: cache.t3.micro or larger (t3.micro is sufficient for most deployments; scale up if you have heavy caching traffic)
- Number of nodes: 2 (primary + replica for automatic failover)
- Multi-AZ: enabled
- Automatic failover: enabled
- Parameter group: default (Superset works with default Redis parameters)
- Encryption: enabled (at rest and in transit)
Network Configuration:
- VPC: same VPC as ECS and RDS
- Subnet group: private subnets
- Security group: allows inbound traffic on port 6379 from the ECS security group only
ElastiCache gives you a Redis endpoint like superset-cache.abc123.ng.0001.use1.cache.amazonaws.com. Your ECS task connects via the REDIS_URL environment variable:
redis://superset-cache.abc123.ng.0001.use1.cache.amazonaws.com:6379/0
Superset automatically uses Redis for:
- Sessions: User login state, CSRF tokens, and temporary data
- Query caching: Results of expensive database queries (with TTL)
- Background jobs: Celery task queue for async operations like email exports and scheduled refreshes
If Redis becomes unavailable, Superset degrades gracefully but loses some functionality. Sessions are lost on task restart, cached results expire immediately, and async jobs fail. This is why multi-AZ with automatic failover is non-negotiable in production.
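The query-result caching above follows the classic cache-aside pattern: serve from Redis when a fresh entry exists, otherwise run the query and store the result with a TTL. A minimal local sketch of that pattern (a dict stands in for Redis; this is illustrative, not Superset's internal code):

```python
import time

# Cache-aside sketch: a dict stands in for Redis. Illustrative only,
# not Superset's internal implementation.
_cache: dict[str, tuple[float, object]] = {}

def cached_query(key: str, run_query, ttl_seconds: int = 3600):
    now = time.monotonic()
    hit = _cache.get(key)
    if hit is not None and now - hit[0] < ttl_seconds:
        return hit[1]            # cache hit: skip the database
    result = run_query()         # cache miss: hit the database
    _cache[key] = (now, result)  # store with timestamp for TTL checks
    return result

calls = 0
def expensive_query():
    global calls
    calls += 1
    return [("2024-01-01", 42)]

cached_query("daily_revenue", expensive_query)
cached_query("daily_revenue", expensive_query)
print(calls)  # → 1 (second call served from cache)
```

The TTL is the knob that trades freshness for database load, which is why the cache-invalidation pitfall later in this guide matters.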
Load Balancing and Traffic Routing: Application Load Balancer
Users don’t connect directly to ECS tasks. Instead, an Application Load Balancer (ALB) sits in front of them, distributing traffic, terminating TLS, and providing a stable endpoint.
Here’s the ALB setup:
Listener Configuration:
- Protocol: HTTPS (TLS 1.2 or later)
- Port: 443
- Certificate: AWS Certificate Manager (ACM) certificate for your domain (e.g., analytics.yourcompany.com)
- Default action: forward to target group
Target Group:
- Protocol: HTTP (traffic between ALB and ECS tasks is internal; TLS is unnecessary)
- Port: 8088 (Superset’s default port)
- VPC: same VPC as ECS tasks
- Health check: HTTP GET to /health with 30-second interval, 2 consecutive successes to mark healthy
- Stickiness: enabled with 1-day cookie duration (ensures a user’s requests go to the same task, preserving session state across requests)
Security Group:
- Inbound: allow HTTPS (443) from 0.0.0.0/0 (the internet)
- Outbound: allow HTTP (8088) to the ECS security group
The ALB also handles HTTP-to-HTTPS redirects automatically. If a user visits http://analytics.yourcompany.com, the ALB redirects them to HTTPS.
When you launch an ECS service, you attach it to the target group. ECS automatically registers new tasks and deregisters old ones as you deploy updates. The ALB continuously health-checks tasks and removes unhealthy ones from rotation.
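The "consecutive successes" semantics of the health check are worth internalizing: a target flips state only after a streak of probes agrees, which prevents one slow response from yanking a healthy task out of rotation. A toy model of that logic (thresholds mirror the configuration above; this is not ALB code):

```python
# Toy model of target-group health evaluation: a target becomes healthy
# after `healthy_threshold` consecutive passing probes and unhealthy
# after `unhealthy_threshold` consecutive failures. Illustrative only.
def final_state(probes, healthy_threshold=2, unhealthy_threshold=2):
    """Fold a sequence of probe outcomes (True = HTTP 200) into a state."""
    state = "unhealthy"
    ok_streak = fail_streak = 0
    for ok in probes:
        if ok:
            ok_streak, fail_streak = ok_streak + 1, 0
            if ok_streak >= healthy_threshold:
                state = "healthy"
        else:
            fail_streak, ok_streak = fail_streak + 1, 0
            if fail_streak >= unhealthy_threshold:
                state = "unhealthy"
    return state

# A single failed probe between successes does not deregister the target
print(final_state([True, True, False, True, True]))  # → healthy
```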
IAM Roles and Permissions: Least Privilege Access
Every AWS resource needs permissions. ECS tasks need to pull Docker images, write logs, and retrieve secrets. Superset itself might need to access S3 or assume roles for cross-account database access. IAM roles enforce least-privilege access—each component gets only the permissions it needs.
ECS Task Execution Role: Allows ECS to manage the task on your behalf:
- ecr:GetAuthorizationToken: pull Docker images from ECR
- ecr:BatchGetImage and ecr:GetDownloadUrlForLayer: download image layers
- logs:CreateLogStream and logs:PutLogEvents: write logs to CloudWatch
- secretsmanager:GetSecretValue: retrieve database credentials and other secrets
ECS Task Role: Allows the running Superset container to access AWS services:
- s3:GetObject and s3:PutObject: export dashboards to S3
- sts:AssumeRole: assume roles in other AWS accounts (for cross-account database access)
- cloudwatch:PutMetricData: publish custom metrics
Create these roles with IAM, attach the appropriate policies, and reference them in your task definition. Never hardcode AWS credentials in your Docker image or environment variables.
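To make "least privilege" concrete, here is a sketch of an execution-role policy document matching the action list above, plus a check that no statement grants blanket wildcards. The resource ARNs are placeholders, and the `wildcard_actions` helper is this guide's, not an AWS SDK function:

```python
# Hypothetical least-privilege policy for the ECS task execution role.
# Actions mirror the list above; resource ARNs are placeholders.
EXECUTION_ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ecr:GetAuthorizationToken"],
            "Resource": "*",  # this action does not support resource-level scoping
        },
        {
            "Effect": "Allow",
            "Action": ["ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer"],
            "Resource": "arn:aws:ecr:REGION:ACCOUNT_ID:repository/superset",
        },
        {
            "Effect": "Allow",
            "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:REGION:ACCOUNT_ID:log-group:/ecs/superset:*",
        },
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:superset-*",
        },
    ],
}

def wildcard_actions(policy: dict) -> list[str]:
    """Return any actions that grant blanket '*' or 'service:*' permissions."""
    return [
        action
        for stmt in policy["Statement"]
        for action in stmt["Action"]
        if action == "*" or action.endswith(":*")
    ]

print(wildcard_actions(EXECUTION_ROLE_POLICY))  # → []
```

A check like this in CI keeps over-broad grants from creeping into the role as the deployment evolves.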
Superset Configuration for Production
Superset’s behavior is controlled by a Python configuration file. Create a superset_config.py file and bake it into your Docker image (or mount it from a ConfigMap if using Kubernetes).
Key production settings:
import os
from urllib.parse import urlparse

from cachelib.redis import RedisCache

# Security
SECRET_KEY = os.environ.get('SECRET_KEY')  # from Secrets Manager
SESSION_COOKIE_SECURE = True
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_SAMESITE = 'Lax'

# Database
SQLALCHEMY_DATABASE_URI = os.environ.get('SQLALCHEMY_DATABASE_URI')  # from Secrets Manager
SQLALCHEMY_TRACK_MODIFICATIONS = False
SQLALCHEMY_POOL_SIZE = 10
SQLALCHEMY_POOL_RECYCLE = 3600

# Redis
REDIS_URL = os.environ.get('REDIS_URL', 'redis://localhost:6379/0')
CACHE_DEFAULT_TIMEOUT = 86400  # 1 day
CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_REDIS_URL': REDIS_URL,
}

# Query results backend (a cachelib cache instance, not a plain string)
_redis = urlparse(REDIS_URL)
RESULTS_BACKEND = RedisCache(
    host=_redis.hostname,
    port=_redis.port or 6379,
    key_prefix='superset_results',
)

# Celery (background jobs)
class CeleryConfig:
    broker_url = REDIS_URL
    result_backend = REDIS_URL

CELERY_CONFIG = CeleryConfig

# Feature flags
FEATURE_FLAGS = {
    'ENABLE_TEMPLATE_PROCESSING': True,
    'VERSIONED_EXPORT': True,
    'DASHBOARD_RBAC': True,  # role-based access control
}

# Logging
LOG_LEVEL = 'INFO'
This configuration:
- Secures cookies (HTTPS-only, HttpOnly, SameSite)
- Pools database connections (10 connections per task, recycled every hour)
- Caches query results in Redis for 1 day
- Uses Redis as the Celery broker for async jobs
- Enables role-based dashboard access control
For detailed security best practices, the Securing Your Superset Installation for Production guide covers HTTPS, reverse proxies, and authentication patterns.
Deployment and Scaling Patterns
Once your infrastructure is in place, deploying Superset is straightforward.
Initial Deployment:
- Build your Docker image and push to ECR
- Create the ECS task definition
- Create an ECS service (specifies desired task count, load balancer attachment, auto-scaling rules)
- ECS launches tasks, registers them with the ALB, and begins serving traffic
Scaling: ECS has two scaling mechanisms:
- Horizontal scaling: increase the desired task count (e.g., from 2 to 4 tasks). ECS launches new tasks, the ALB adds them to the target group, and traffic is distributed across more instances. Use auto-scaling policies to scale based on CPU or memory utilization.
- Vertical scaling: increase CPU/memory per task. This requires updating the task definition and rolling out new tasks.
For a typical mid-market deployment:
- Light usage (10-50 concurrent users): 2 tasks (512 CPU units, 1024 MB memory each)
- Medium usage (50-200 concurrent users): 4 tasks (1024 CPU units, 2048 MB memory each)
- Heavy usage (200+ concurrent users): 6-8 tasks (2048 CPU units, 4096 MB memory each), with auto-scaling
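The tiers above reduce to a simple step function. A sizing helper like this can feed your infrastructure-as-code; the breakpoints are this guide's rules of thumb, not ECS limits:

```python
# Illustrative sizing helper based on the usage tiers above.
# Breakpoints are rules of thumb, not ECS constraints.
def recommended_task_count(concurrent_users: int) -> tuple[int, str]:
    """Map concurrent dashboard users to a task count and task size."""
    if concurrent_users <= 50:
        return 2, "512 CPU / 1024 MB"
    if concurrent_users <= 200:
        return 4, "1024 CPU / 2048 MB"
    return 8, "2048 CPU / 4096 MB"

print(recommended_task_count(120))  # → (4, '1024 CPU / 2048 MB')
```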
Blue-Green Deployments: For zero-downtime updates, use ECS blue-green deployments (via the AWS CodeDeploy deployment controller). Create a new task definition, update the service to point to it, and traffic shifts gradually from old tasks to new ones. If something breaks, you can roll back immediately.
Monitoring, Logging, and Alerting
You can’t operate what you can’t see. Set up comprehensive monitoring from day one.
CloudWatch Logs: ECS tasks write logs to CloudWatch. Create log groups for application logs, database logs, and ALB access logs. Use CloudWatch Insights to query logs:
fields @timestamp, @message
| filter @message like /ERROR/
| stats count() by bin(5m)
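What that Insights query computes can be reproduced locally, which is handy for testing alert logic against exported logs. This sketch counts ERROR lines in 5-minute bins over made-up sample data:

```python
from collections import Counter
from datetime import datetime, timedelta

# Local sketch of the Insights query above: count ERROR lines in
# 5-minute bins. Timestamps and messages are made-up sample data.
logs = [
    (datetime(2024, 1, 1, 12, 1), "ERROR connection refused"),
    (datetime(2024, 1, 1, 12, 3), "INFO request served"),
    (datetime(2024, 1, 1, 12, 4), "ERROR connection refused"),
    (datetime(2024, 1, 1, 12, 7), "ERROR timeout"),
]

def error_counts_by_5m(entries):
    bins = Counter()
    for ts, message in entries:
        if "ERROR" in message:
            # Truncate the timestamp down to its 5-minute bucket
            bucket = ts - timedelta(minutes=ts.minute % 5,
                                    seconds=ts.second,
                                    microseconds=ts.microsecond)
            bins[bucket] += 1
    return dict(bins)

print(error_counts_by_5m(logs))
# two errors land in the 12:00 bin, one in the 12:05 bin
```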
CloudWatch Metrics: Monitor:
- ECS: task count, CPU utilization, memory utilization
- RDS: database connections, query latency, CPU, storage
- ElastiCache: cache hits/misses, evictions, memory utilization
- ALB: request count, latency, HTTP 5xx errors
Alarms: Set up alarms for:
- High ECS CPU (>80% for 5 minutes)
- RDS CPU (>80%)
- ALB HTTP 5xx errors (>5 per minute)
- ElastiCache evictions (>1000 per minute indicates cache is too small)
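The alarm thresholds above are just comparisons against metric samples. A toy evaluator makes the logic explicit (real alarms live in CloudWatch; the metric names here are this guide's labels, not CloudWatch metric identifiers):

```python
# Toy alarm evaluator mirroring the thresholds above. Real alarms are
# CloudWatch resources; metric names here are illustrative labels.
THRESHOLDS = {
    "ecs_cpu_percent": 80,
    "rds_cpu_percent": 80,
    "alb_5xx_per_minute": 5,
    "elasticache_evictions_per_minute": 1000,
}

def breached(metrics: dict) -> list[str]:
    """Return the names of metrics exceeding their alarm thresholds."""
    return [name for name, value in metrics.items()
            if value > THRESHOLDS.get(name, float("inf"))]

sample = {"ecs_cpu_percent": 91, "alb_5xx_per_minute": 2}
print(breached(sample))  # → ['ecs_cpu_percent']
```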
When alarms trigger, send notifications to PagerDuty, Slack, or email. For detailed monitoring setup, D23 provides expert data consulting to optimize your infrastructure and alert strategies.
Security Hardening: Best Practices
Production Superset deployments handle sensitive business data. Security isn’t optional.
Network Security:
- Use VPCs and security groups to isolate resources
- RDS and ElastiCache in private subnets (no internet access)
- ALB in public subnets, ECS tasks in private subnets
- Use VPC endpoints for AWS services (ECR, Secrets Manager) to avoid internet gateway traffic
Secrets Management:
- Store database passwords, API keys, and SECRET_KEY in AWS Secrets Manager
- Rotate secrets every 90 days
- Audit secret access with CloudTrail
- Never commit secrets to Git
Authentication and Authorization:
- Integrate with your identity provider (Okta, Azure AD, Google Workspace) via LDAP or OAuth
- Enable multi-factor authentication (MFA) for admin accounts
- Use role-based access control (RBAC) to limit dashboard access
- Audit user activity with Superset’s audit logs
Data Encryption:
- Enable encryption at rest for RDS and ElastiCache (AWS KMS)
- Use TLS 1.2+ for all communication (ALB to internet, ALB to ECS, ECS to RDS, ECS to ElastiCache)
- Enable SSL verification for database connections
Compliance:
- Enable CloudTrail to audit API calls
- Use AWS Config to monitor security group and IAM changes
- If you handle PII, enable VPC Flow Logs to track network traffic
- Document your security posture for audits (SOC 2, ISO 27001, etc.)
Cost Optimization
Running Superset on AWS is cost-effective, but you can optimize further.
Compute:
- Use Fargate Spot for non-critical tasks (up to 70% savings, with interruption risk)
- Right-size instances (t3.small often suffices; don’t default to t3.large)
- Scale down during off-hours (e.g., 1 task at night, 4 during business hours)
Database:
- Use RDS Savings Plans or Reserved Instances (1-year commitment, 30-40% discount)
- Tune queries and indexes to reduce CPU time
- Archive old query logs and audit data
Caching:
- Use ElastiCache Reserved Nodes (1-year commitment, 40-50% discount)
- Tune cache TTLs to balance memory usage and hit rate
Data Transfer:
- Colocate Superset and data sources in the same AWS region (no cross-region data transfer charges)
- Use VPC endpoints to avoid internet gateway charges
Typical monthly cost for a mid-market deployment:
- ECS (2-4 tasks): $50-150
- RDS (db.t3.small): $100-200
- ElastiCache (cache.t3.micro): $30-50
- ALB: $20-30
- Data transfer: $10-50
- Total: $200-500/month (vs. $3000-10000/month for Looker or Tableau)
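Summing the component ranges shows how the total above is derived (the figures are the estimates from this guide, not AWS pricing output):

```python
# Monthly cost ranges from the breakdown above (USD/month); the
# figures are this guide's estimates, not AWS pricing API output.
monthly = {
    "ECS tasks": (50, 150),
    "RDS db.t3.small": (100, 200),
    "ElastiCache cache.t3.micro": (30, 50),
    "ALB": (20, 30),
    "Data transfer": (10, 50),
}

low = sum(lo for lo, hi in monthly.values())
high = sum(hi for lo, hi in monthly.values())
print(f"${low}-{high}/month")  # → $210-480/month, i.e. roughly $200-500
```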
Common Pitfalls and Solutions
Pitfall 1: Stateful ECS Tasks. If you store files or session data on the task’s local filesystem, they’re lost when the task restarts. Always use external storage (S3, EBS volumes, or Redis).
Pitfall 2: Hardcoded Secrets. Never put database passwords in your Dockerfile or environment variables. Use Secrets Manager, and rotate regularly.
Pitfall 3: Undersized RDS. A db.t3.micro works for development but not production. Start with db.t3.small and monitor CPU/connections. If you see frequent connection pool exhaustion, scale up.
Pitfall 4: No Cache Invalidation. If you cache query results for too long, dashboards show stale data. Set appropriate TTLs (e.g., 1 hour for hourly reports, 1 day for slower-moving data).
Pitfall 5: Ignoring Logs. Without CloudWatch Logs and alarms, you won’t know when something breaks. Set up monitoring before you need it.
Comparing to Managed Alternatives
You might wonder: why not use Preset, which is Superset-as-a-Service? Or stick with Looker or Tableau?
Preset is convenient but expensive (per-seat pricing, starting at $200/user/month). You lose control over infrastructure, data residency, and customization.
Looker and Tableau are powerful but proprietary, expensive ($2000-10000/month for typical teams), and lock you into their ecosystem. You can’t customize the query engine or embed dashboards without premium licensing.
D23, a managed Apache Superset platform, sits in the middle. You get the operational simplicity of Preset with the control and cost-effectiveness of self-hosted Superset. D23 provides managed hosting, AI-powered analytics, and expert consulting to accelerate your deployment and optimize your infrastructure.
For teams that need:
- Full control over data and infrastructure: self-hosted Superset on AWS is the right choice
- Operational simplicity without vendor lock-in: D23’s managed Superset service handles infrastructure, scaling, and updates
- Enterprise features and support: Looker or Tableau (at higher cost)
Conclusion: From Development to Production
Deploying Apache Superset to production on AWS is well within reach for engineering and data teams. The architecture is straightforward: stateless ECS tasks, managed RDS, ElastiCache for caching, and an ALB for load balancing. Security, monitoring, and cost optimization are built in from the start.
The key is treating infrastructure as code, automating deployments, and monitoring relentlessly. Start with a 2-task deployment, scale horizontally as needed, and use auto-scaling to handle traffic spikes. Your total cost will be a fraction of enterprise BI platforms, and you’ll have the flexibility to customize every layer.
For teams that want to skip the operational overhead, D23 offers managed Apache Superset with AI analytics, API-first architecture, and expert data consulting. Whether you deploy on AWS yourself or use a managed service, Apache Superset gives you production-grade analytics without the platform overhead.
Start small, monitor closely, and iterate. Your analytics infrastructure will scale with your business.