Migrating from Looker to Apache Superset on BigQuery
Step-by-step guide to migrating from Looker to Apache Superset while keeping BigQuery as your data warehouse. Cost savings, architecture, and best practices.
Why Teams Are Moving from Looker to Apache Superset
Looker has been a default BI platform for many enterprise teams for roughly a decade. It’s powerful, feature-rich, and deeply integrated into the modern data stack. But it’s also expensive, often running $50,000 to $500,000+ annually depending on user count and deployment model. More importantly, Looker locks you into a proprietary platform. Your dashboards, data models, and configuration live inside Looker’s ecosystem with limited portability.
Apache Superset changes the equation. It’s open-source, runs on your infrastructure or a managed service, and costs a fraction of Looker. For teams already running BigQuery as their data warehouse, Superset offers a compelling alternative: keep your data in BigQuery, move your BI layer to an open-source platform you control, and reduce licensing costs by 60–80%.
This guide walks through the migration step-by-step—from assessment through cutover—so you can move your Looker dashboards, metrics, and workflows to Superset without losing data continuity or analytics capability.
Understanding the Architecture Differences
Before you migrate, you need to understand how Looker and Superset differ architecturally. These differences shape your migration strategy.
Looker’s Model Layer vs. Superset’s Query-First Approach
Looker is built around a semantic layer. You define LookML models that sit between your warehouse and dashboards. These models specify dimensions, measures, filters, and relationships. When a user interacts with a dashboard, Looker translates that into SQL and sends it to BigQuery.
Superset takes a more direct approach. You connect Superset to BigQuery, write SQL queries or use the visual query builder, and create dashboards. Superset doesn’t enforce a modeling layer—though you can layer one on top using tools like dbt or Cube if you want that abstraction.
This difference has implications:
- Governance: Looker’s semantic layer enforces consistent metric definitions across the org. Superset requires discipline—you define metrics in dashboards, SQL, or external tools.
- Performance: Superset queries BigQuery directly. If your Looker instance was caching heavily or using PDTs (Persistent Derived Tables), you may need to replicate that caching behavior in Superset or optimize your BigQuery data model.
- Flexibility: Superset gives you more control over SQL and query optimization. Looker abstracts away query generation, which can be safer but less flexible.
BigQuery Connectivity
Both platforms connect to BigQuery, but the setup differs slightly. With Looker, you configure a connection in the admin UI, and Looker manages the authentication and query execution. With Superset, you install the BigQuery SQLAlchemy driver (the sqlalchemy-bigquery Python package), configure a database connection with service account credentials, and Superset handles the rest.
The good news: BigQuery connections in Superset are rock-solid. The Google BigQuery documentation for Apache Superset is comprehensive, and the community has documented adding BigQuery databases to Superset on GKE for teams running Kubernetes.
Pre-Migration Assessment: What You’re Moving
A successful migration starts with a clear inventory of what you’re moving.
Audit Your Looker Instance
Export a list of all Looker assets:
- Dashboards: Count, complexity (number of tiles), refresh rates, filter logic.
- Looks (saved queries): Identify which ones are embedded in dashboards vs. standalone.
- Models and explores: Document LookML structure, dimensions, measures, and derived tables.
- User roles and permissions: Export RBAC (role-based access control) settings.
- Scheduled reports and alerts: List any automated jobs or email exports.
- Custom fields and table calculations: Note any business logic defined in Looker.
- Integrations: Slack alerts, webhooks, embedded dashboards in your app.
You can do this manually by navigating Looker’s admin panel, or use the Looker API to programmatically export metadata. The Looker API endpoint /api/4.0/dashboards returns dashboard metadata in JSON; /api/4.0/looks returns saved looks.
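A minimal sketch of the API route, assuming you have Looker API credentials (a client ID and secret) and a hypothetical instance URL. The login flow and the two endpoints come from Looker API 4.0; the fields picked out in `summarize` (`id`, `title`, `folder.name`) are assumptions about what you want in the inventory, so adjust them to what your instance returns.

```python
import json
import urllib.parse
import urllib.request


def looker_login(base_url, client_id, client_secret):
    """Exchange API credentials for a short-lived access token."""
    body = urllib.parse.urlencode(
        {"client_id": client_id, "client_secret": client_secret}
    ).encode()
    req = urllib.request.Request(f"{base_url}/api/4.0/login", data=body, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]


def looker_get(base_url, token, resource):
    """GET /api/4.0/<resource>, for example 'dashboards' or 'looks'."""
    req = urllib.request.Request(
        f"{base_url}/api/4.0/{resource}",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def summarize(assets):
    """Reduce raw asset metadata to inventory rows sorted by folder, then title."""
    rows = [
        {
            "id": a.get("id"),
            "title": a.get("title") or "",
            "folder": (a.get("folder") or {}).get("name") or "",
        }
        for a in assets
    ]
    return sorted(rows, key=lambda r: (r["folder"], r["title"]))
```

With a token from `looker_login("https://looker.example.com", client_id, client_secret)`, calling `summarize(looker_get(base_url, token, "dashboards"))` yields rows you can paste into a migration tracking sheet.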
Assess BigQuery Readiness
Since you’re keeping BigQuery as your warehouse, confirm:
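- Service account permissions: The service account Superset will use needs the BigQuery Data Viewer and BigQuery Job User roles at minimum. Grant BigQuery Data Editor only if Superset must write to BigQuery, for example for CSV uploads or CREATE TABLE AS from SQL Lab.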
- Data organization: Are your datasets and tables logically organized? If Looker was abstracting complexity, you may need to reorganize tables or create views in BigQuery for Superset.
- Query performance: Run a sample of your most-used Looker queries directly in BigQuery. Note execution time and cost. This baseline helps you optimize for Superset.
- Caching strategy: If Looker was using PDTs or materialized views, plan how to replicate this in BigQuery (materialized views, scheduled queries, or snapshots).
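For the cost side of the baseline, a dry run reports how many bytes a query would scan without executing it. The sketch below assumes the `google-cloud-bigquery` client is installed and authenticated, and the default price of $6.25 per TiB is an assumed on-demand list price that you should replace with your own rate. A dry run does not measure execution time, so timing still requires running the query.

```python
TIB = 1024 ** 4


def estimate_cost_usd(bytes_processed, usd_per_tib=6.25):
    """On-demand cost estimate; usd_per_tib is an assumed list price."""
    return bytes_processed / TIB * usd_per_tib


def dry_run_bytes(sql, project_id):
    """Ask BigQuery how many bytes a query would scan, without running it."""
    # Imported here so the cost helper works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed
```

Running `estimate_cost_usd(dry_run_bytes(sql, "my-project"))` over the SQL behind your most-used Looker tiles gives a per-query cost to compare against after migration.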
Estimate Effort and Timeline
Migration effort depends on complexity:
- Simple: 5–10 dashboards, basic filters, no custom LookML. Timeline: 2–4 weeks.
- Moderate: 20–50 dashboards, complex filters, some derived tables. Timeline: 6–12 weeks.
- Complex: 100+ dashboards, extensive LookML, embedded dashboards, heavy PDT usage. Timeline: 3–6 months.
Factor in testing, training, and a parallel-run period where both systems are live.
Setting Up Superset with BigQuery
Once you’ve assessed what you’re moving, set up your Superset environment.
Deployment Options
You have three main choices:
- Managed Superset (like D23): Someone else handles infrastructure, updates, and scaling. You focus on dashboards and data. Best for teams that want Looker-like simplicity without Looker’s cost.
- Self-hosted on Kubernetes: Deploy Superset on your own K8s cluster. Full control, but you manage updates, scaling, and operations.
- Self-hosted on a VM or Docker: Simpler than K8s, but less scalable. Good for smaller deployments or proof-of-concept.
For a Looker-scale migration, managed or Kubernetes-hosted Superset makes sense. You’ll need HA (high availability), backup/restore, and version management.
Configuring BigQuery Connection
Regardless of deployment, configuring BigQuery is straightforward:
- Create a Google Cloud service account with BigQuery permissions.
- Generate a JSON key for that service account.
- In Superset, go to Settings > Database Connections > + Database (Data > Databases > + Database in older versions).
- Select Google BigQuery from the driver list.
- Paste the service account JSON key and specify your GCP project ID.
- Test the connection and save.
Superset will introspect your BigQuery datasets and tables, making them available for querying. The official BigQuery documentation for Superset has detailed setup instructions, and if you’re running on GKE, the GitHub discussion on adding BigQuery to Superset on GKE covers Kubernetes-specific configuration.
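If you script the connection instead of using the UI, the same two pieces of information are needed: a SQLAlchemy URI and the service account key. A minimal sketch, assuming the `bigquery://<project>` URI form used by the sqlalchemy-bigquery driver and the `credentials_info` wrapper that Superset expects in the database's secure extra field; the project ID shown in the usage note is hypothetical.

```python
import json


def bigquery_connection(project_id, service_account_info):
    """Return the URI and secure-extra JSON for a BigQuery database in Superset."""
    return {
        "sqlalchemy_uri": f"bigquery://{project_id}",
        # Superset stores this field encrypted in its metadata database.
        "encrypted_extra": json.dumps({"credentials_info": service_account_info}),
    }
```

Load the downloaded key with `json.load` and pass the resulting dict as `service_account_info`, for example `bigquery_connection("my-gcp-project", key)`. Keep the key out of version control.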
Setting Up Caching and Query Performance
Superset’s query cache is critical for performance. Configure it:
- Cache timeout: One hour is a reasonable starting point; adjust it based on your data freshness requirements.
- Database query cache: Enable caching at the database level so repeated queries don’t hit BigQuery every time.
- Async query execution: For long-running queries, enable async mode so dashboards don’t time out.
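The cache settings live in `superset_config.py`. The fragment below is a sketch that assumes a Redis instance at a hypothetical local URL; the dictionary keys are Flask-Caching options, and the one-hour timeout mirrors the starting point suggested above.

```python
# superset_config.py (fragment)
REDIS_URL = "redis://localhost:6379/1"  # hypothetical Redis instance

ONE_HOUR = 60 * 60  # seconds

# Cache for chart data, i.e. the results of queries sent to BigQuery.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": ONE_HOUR,
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": REDIS_URL,
}

# Cache for Superset's own metadata.
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": ONE_HOUR,
    "CACHE_KEY_PREFIX": "superset_meta_",
    "CACHE_REDIS_URL": REDIS_URL,
}
```

Individual databases, datasets, and charts can override the default timeout, which is useful when a few dashboards need fresher data than the rest.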
If you were using Looker PDTs, replicate that behavior in BigQuery using:
- Materialized views: Automatically refreshed views that Superset can query like tables.
- Scheduled queries: BigQuery jobs that run on a schedule and write results to tables.
- dbt models: If you use dbt, build your transformations there and query the output in Superset.
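As an illustration of the materialized view option, the helper below renders BigQuery DDL with automatic refresh enabled. It is a sketch: the view name and the SELECT statement in the usage note are hypothetical, and BigQuery restricts which SQL constructs a materialized view may contain, so check the query against those limits before relying on it.

```python
def materialized_view_ddl(view, select_sql, refresh_minutes=60):
    """Render BigQuery DDL for an auto-refreshing materialized view."""
    return (
        f"CREATE MATERIALIZED VIEW `{view}`\n"
        f"OPTIONS (enable_refresh = true, refresh_interval_minutes = {refresh_minutes})\n"
        f"AS\n{select_sql.strip()}"
    )
```

For example, `materialized_view_ddl("analytics.daily_revenue", "SELECT order_date, SUM(revenue) AS revenue FROM analytics.orders GROUP BY order_date", 30)` produces a statement you can run in the BigQuery console and then register as a Superset dataset.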
Migrating Dashboards and Queries
Now for the core migration: moving your dashboards from Looker to Superset.
Exporting Looker Dashboards
Unfortunately, there’s no automated Looker-to-Superset converter. You’ll need to recreate dashboards manually or semi-manually. Here’s the process:
- Export dashboard metadata from Looker using the API or by navigating each dashboard and noting its structure.
- Identify the underlying queries: For each tile on a dashboard, note the Explore and filters used.
- Recreate the query in Superset: Use Superset’s SQL Lab to write the equivalent SQL, or use the visual query builder.
- Build the dashboard in Superset: Add charts to a new dashboard, configure filters, and set layout.
This is manual work, but it’s also an opportunity to simplify. Many Looker dashboards accumulate clutter over time. Use the migration as a chance to audit and rebuild only the dashboards that drive decisions.
Translating LookML to SQL
If you have complex LookML logic—custom dimensions, measures, or table calculations—you’ll need to translate that to SQL.
For example, a Looker dimension might be:
dimension: revenue_bucket {
  type: tier
  tiers: [0, 1000, 5000, 10000]
  sql: ${TABLE}.revenue ;;
}
In Superset, you’d write this as a SQL expression:
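CASE
  WHEN revenue < 0 THEN 'Below 0'
  WHEN revenue < 1000 THEN '0-1000'
  WHEN revenue < 5000 THEN '1000-5000'
  WHEN revenue < 10000 THEN '5000-10000'
  WHEN revenue >= 10000 THEN '10000+'
  -- NULL revenue matches no branch and stays NULL
END AS revenue_bucket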
Complex LookML—especially derived tables and filters—requires careful translation. Document your LookML logic before you start, and test each translated query in BigQuery to ensure it produces the same results.
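When many tier dimensions need translating, generating the CASE expression avoids copy-paste mistakes. The function below is a sketch, not a LookML parser: it takes the column name and tier bounds you read from the LookML by hand, and the bucket labels are an arbitrary choice rather than Looker's own label format. It adds a bucket for values below the first tier and leaves NULL values as NULL, since no branch matches them.

```python
def tier_case_sql(column, tiers, alias):
    """Build a CASE expression that buckets `column` at the given tier bounds."""
    bounds = sorted(tiers)
    lines = ["CASE", f"  WHEN {column} < {bounds[0]} THEN 'Below {bounds[0]}'"]
    # Each branch only needs an upper bound because earlier branches
    # have already claimed the smaller values.
    for low, high in zip(bounds, bounds[1:]):
        lines.append(f"  WHEN {column} < {high} THEN '{low}-{high}'")
    lines.append(f"  WHEN {column} >= {bounds[-1]} THEN '{bounds[-1]}+'")
    lines.append(f"END AS {alias}")
    return "\n".join(lines)
```

Calling `tier_case_sql("revenue", [0, 1000, 5000, 10000], "revenue_bucket")` returns a seven-line expression that can be pasted into a Superset calculated column or a BigQuery view.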
Using dbt for Semantic Consistency
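If you want to replicate Looker’s semantic layer, consider pairing dbt with Superset. dbt lets you define models, tests, and metric logic in version-controlled SQL and YAML. Open-source Superset has no built-in dbt integration, so the usual pattern is to build the modeled tables or views in BigQuery with dbt and register them as Superset datasets, optionally syncing metadata with external tooling.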
This approach:
- Centralizes metric definitions: Define metrics once in dbt, use them across Superset dashboards.
- Maintains consistency: Everyone uses the same business logic.
- Improves governance: dbt models are version-controlled and tested.
If you were using Looker’s semantic layer heavily, dbt + Superset replicates that pattern while keeping you in open-source tooling.
Handling Permissions and Access Control
Looker’s RBAC (role-based access control) is granular. Superset’s is simpler but still functional.
Mapping Looker Roles to Superset
Looker roles map roughly to Superset roles as follows:
| Looker Role | Superset Equivalent | Notes |
|---|---|---|
| Admin | Admin | Full system access. |
| Developer | Alpha | Can create and edit dashboards, charts, and datasets. |
| Analyst | Alpha | Same as Developer in Superset. |
| Viewer | Gamma | Read-only access to the dashboards and datasets it has been granted. |
| Explore User | Gamma + sql_lab | Can run ad-hoc queries in SQL Lab. |
Superset doesn’t have Looker’s fine-grained “Can see user’s dashboard” permissions. Instead, it uses dataset and dashboard-level access. You can restrict which datasets a user can query and which dashboards they can view.
Setting Up Row-Level Security (RLS)
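If you need row-level security (for example, sales reps only see their own region’s data), Superset supports it through SQL-based RLS rules. You define a filter clause for a dataset, assign it to one or more roles, and Superset adds it to the WHERE condition of every query those roles run against that dataset.
For example, a rule assigned to an EMEA sales role:
region = 'EMEA'
The clause is written without the WHERE keyword. This requires your BigQuery data to have a region column and a Superset role for each region. For per-user rules, the clause can use Jinja macros such as {{ current_username() }} when template processing is enabled. It’s less elegant than Looker’s LookML-based RLS, but it works.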
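If you have many regions, creating the rules by hand in the UI gets tedious. The sketch below builds the body for one rule; the field names are assumed to match the `POST /api/v1/rowlevelsecurity/` endpoint available in recent Superset versions, so verify them against your instance's API documentation. The dataset and role IDs are hypothetical, and the helper rejects a leading WHERE because Superset expects only the condition itself.

```python
def rls_rule(name, dataset_ids, role_ids, clause):
    """Build the body for a regular (restrictive) row-level security rule."""
    if clause.strip().upper().startswith("WHERE"):
        raise ValueError("RLS clauses are written without the WHERE keyword")
    return {
        "name": name,
        "filter_type": "Regular",
        "tables": list(dataset_ids),
        "roles": list(role_ids),
        "clause": clause,
    }
```

A loop over your regions, such as `rls_rule(f"{r.lower()}_only", [12], [role_ids[r]], f"region = '{r}'")`, then yields one payload per role.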
SAML and SSO Integration
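Looker supports SAML natively. Superset delegates authentication to Flask-AppBuilder, which supports OAuth (including OpenID Connect providers), LDAP, and REMOTE_USER out of the box; SAML typically requires a custom security manager or an identity-aware proxy in front of Superset. When migrating, configure Superset to use your existing SSO provider (Okta, Azure AD, etc.) so users log in with the same credentials.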
This reduces friction during cutover and simplifies user management.
Managing the Cutover
Moving from Looker to Superset is a business event, not just a technical one. Plan the cutover carefully.
Parallel Running
Run both systems in parallel for 2–4 weeks:
- Week 1–2: Users access both Looker and Superset. They validate that Superset dashboards match Looker dashboards.
- Week 3: Looker becomes read-only. New queries and dashboards go to Superset.
- Week 4: Looker is decommissioned.
This approach lets you catch discrepancies and build confidence before fully switching.
Validating Dashboard Accuracy
For each migrated dashboard, compare key metrics between Looker and Superset:
- Run the same query in both systems.
- Compare results: Are the numbers identical? If not, debug the difference (usually a filter or calculation mismatch).
- Check performance: How long does the query take in each system? Superset should be comparable or faster.
- Validate filters: Ensure dashboard filters work the same way in Superset.
This validation step is tedious but essential. It’s where you catch bugs before users do.
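Part of this check can be automated once you have the key metrics from each system as name-to-value pairs. The sketch below assumes you have already fetched those values, for example from exported CSVs or API calls; the function name and the default tolerance are illustrative choices.

```python
import math


def compare_metrics(looker, superset, rel_tol=1e-6):
    """Return {metric: (looker_value, superset_value)} for every mismatch."""
    mismatches = {}
    for name in sorted(set(looker) | set(superset)):
        a, b = looker.get(name), superset.get(name)
        # A metric missing on either side counts as a mismatch.
        if a is None or b is None or not math.isclose(a, b, rel_tol=rel_tol):
            mismatches[name] = (a, b)
    return mismatches
```

An empty result means the dashboard matches within tolerance. Anything else lists the metric alongside both values, which is usually enough to spot a filter or calculation difference.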
Training and Communication
Superset’s UI differs from Looker’s. Users need training:
- Dashboard navigation: Where to find dashboards, how to search.
- Filtering: How to apply and combine filters.
- Exporting: How to download data or schedule reports.
- Creating new dashboards: If users were building in Looker, teach them Superset’s interface.
Hold a kickoff meeting, create documentation, and offer office hours for questions. The smoother the transition, the faster the adoption.
Cost and Performance Comparison
One of the biggest wins of migrating to Superset is cost. Let’s quantify it.
Licensing Costs
Looker: Typically $50–100 per user per month for cloud-hosted Looker, or $100K–500K+ annually for large enterprises.
Superset: Open-source (free) + infrastructure costs. If using a managed Superset service like D23, expect $500–5,000 per month depending on query volume and user count. If self-hosting, you pay for compute (Kubernetes cluster, database, etc.)—often $2K–10K per month.
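Savings: These depend on your contract and deployment. As an illustration using the figures above, replacing a $150K annual Looker contract with a self-hosted deployment costing $4K per month ($48K per year) saves about $102K annually, or 68%.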
Query Performance
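Superset queries BigQuery directly, with minimal overhead. In both tools, BigQuery execution time dominates latency, so differences come mostly from the SQL each tool generates and from caching rather than from LookML translation alone. As rough, workload-dependent figures, query latency is typically: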
- Looker: 2–10 seconds (includes LookML translation).
- Superset: 1–5 seconds (direct SQL).
For cached queries, both systems are sub-second.
Infrastructure Costs
If self-hosting Superset on Kubernetes:
- Compute: $2K–5K/month for a HA cluster.
- Storage: $100–500/month for logs and metadata.
- BigQuery: Unchanged (you’re keeping BigQuery).
Total: ~$2.1K–5.5K/month, compared with roughly $4K–42K/month for Looker at the $50K–500K annual range cited earlier.
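The arithmetic is simple enough to rerun with your own contract figures. The sketch below sums the compute and storage lines above into a monthly range and computes annual savings; the $150K Looker contract in the usage note is a hypothetical input, not a quoted price.

```python
def annual_savings(looker_annual, superset_monthly):
    """Return (absolute savings, savings as a fraction of the Looker cost)."""
    superset_annual = superset_monthly * 12
    saved = looker_annual - superset_annual
    return saved, saved / looker_annual


# Self-hosted monthly range from the compute and storage lines above.
low_monthly = 2_000 + 100    # $2.1K
high_monthly = 5_000 + 500   # $5.5K
```

For a $150K contract, `annual_savings(150_000, high_monthly)` returns savings of $84,000, or 56%. The percentage shrinks quickly for smaller contracts, so run the numbers before committing to a target.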
Advanced Features: AI and Embedded Analytics
Once you’ve migrated the basics, Superset offers capabilities that go beyond Looker.
Text-to-SQL with AI
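Superset can be paired with LLMs to convert natural language to SQL. Ask “What’s my revenue by region?” and the assistant generates and runs the query. This is not a built-in feature of open-source Superset; it comes from managed offerings or extensions layered on top. For ad-hoc analysis, it can be faster than clicking through Looker’s explore UI.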
Managed Superset services like D23 offer this out-of-the-box, configured for your data schema.
Embedding Analytics in Your Product
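Superset dashboards can be embedded in your product through an iframe, typically using the Superset Embedded SDK together with short-lived guest tokens issued by your backend. If you’re building a SaaS product and want to embed analytics for customers, Superset gives you a flexible option without per-viewer licensing.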
D23’s embedded analytics capabilities make this especially straightforward—configure once, embed across your product.
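On self-hosted Superset, embedding usually means your backend requests a guest token and hands it to the frontend. The sketch below only builds the request body for `POST /api/v1/security/guest_token/`; it assumes embedding is enabled on your instance, and the dashboard UUID, username, and RLS clause in the usage note are hypothetical.

```python
def guest_token_request(dashboard_uuid, username, rls_clauses=()):
    """Build the body for a guest token request scoped to one dashboard."""
    return {
        "user": {"username": username},
        "resources": [{"type": "dashboard", "id": dashboard_uuid}],
        # Optional per-token row filters, written without the WHERE keyword.
        "rls": [{"clause": clause} for clause in rls_clauses],
    }
```

For a customer-facing embed, `guest_token_request(dashboard_uuid, "customer_42", ["customer_id = 42"])` scopes the token to one dashboard and one customer's rows.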
API-First BI
Superset’s REST API lets you programmatically create dashboards, run queries, and manage users. This is useful for:
- Automated report generation: Build dashboards on-the-fly based on user input.
- Data app integration: Embed Superset queries in your app’s backend.
- Workflow automation: Trigger queries based on external events.
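A minimal sketch of talking to the REST API with only the standard library, assuming database authentication (`provider: "db"`) and a hypothetical base URL. The helpers build the requests separately from sending them, which keeps them easy to inspect and test.

```python
import json
import urllib.request


def login_request(base_url, username, password):
    """Build the request that exchanges credentials for a JWT access token."""
    body = json.dumps(
        {"username": username, "password": password, "provider": "db", "refresh": True}
    ).encode()
    return urllib.request.Request(
        f"{base_url}/api/v1/security/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_dashboards_request(base_url, access_token):
    """Build the request that lists dashboards visible to the token's user."""
    return urllib.request.Request(
        f"{base_url}/api/v1/dashboard/",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

In practice, `token = send(login_request(base_url, user, password))["access_token"]` followed by `send(list_dashboards_request(base_url, token))` returns the dashboard list. Write operations may also require a CSRF token, depending on your configuration.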
Common Pitfalls and How to Avoid Them
Pitfall 1: Underestimating Migration Effort
Translating LookML and recreating dashboards takes longer than expected. Solution: Audit your Looker instance upfront, prioritize dashboards by business impact, and don’t try to migrate everything at once.
Pitfall 2: Ignoring Performance Optimization
Superset queries BigQuery directly, so slow queries become visible. Solution: Optimize your BigQuery schema (partitioning, clustering, materialized views) before migrating. Test query performance in BigQuery before building dashboards.
Pitfall 3: Losing Governance
Without Looker’s semantic layer, teams may create inconsistent metrics. Solution: Use dbt or Superset’s native datasets to define canonical metrics. Document your metric definitions.
Pitfall 4: Inadequate Testing
Small filter or calculation mismatches can go unnoticed until users complain. Solution: Systematically validate each migrated dashboard against Looker. Automate this with a test suite if possible.
Pitfall 5: Insufficient User Training
Users familiar with Looker may struggle with Superset’s different UI. Solution: Provide hands-on training, create documentation, and offer support during the transition.
Comparing Superset to Other Looker Alternatives
If you’re evaluating Superset vs. other Looker alternatives, here’s how it stacks up:
Superset vs. Tableau: Tableau is more polished and has stronger data governance. But it’s licensed per user, so cost grows with headcount. Superset is cheaper and more flexible for custom SQL.
Superset vs. Power BI: Power BI integrates tightly with Microsoft products (Excel, Azure, etc.). But if you’re BigQuery-centric, Superset is simpler. Apache Superset is increasingly recognized as a Looker alternative for teams prioritizing cost and flexibility.
Superset vs. Metabase: Metabase is simpler and better for ad-hoc exploration. Superset is more powerful for production dashboards and custom SQL. For Looker-scale deployments, Superset is the better choice.
Superset vs. Preset: Preset is a managed Superset offering. It’s convenient but more expensive than self-hosted Superset. D23 is another managed option with added features like AI-assisted queries and MCP integration.
Post-Migration: Maintaining and Scaling
After cutover, focus on maintaining and scaling your Superset deployment.
Monitoring and Observability
Set up alerts for:
- Query failures: If a dashboard query fails, notify the owner.
- Slow queries: If a query takes >30 seconds, log it for optimization.
- System health: Monitor Superset uptime, database connectivity, and resource usage.
If using a managed service like D23, these are typically handled for you.
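If you self-host, the slow-query alert can start as a small script over whatever query log you collect. The sketch below assumes each log entry is a dict with a `duration_seconds` field; that shape is hypothetical, so map it to the fields your logging setup actually provides.

```python
def slow_queries(query_log, threshold_seconds=30.0):
    """Return log entries slower than the threshold, slowest first."""
    slow = [q for q in query_log if q["duration_seconds"] > threshold_seconds]
    return sorted(slow, key=lambda q: q["duration_seconds"], reverse=True)
```

Feeding the result into your existing alerting channel gives dashboard owners a ranked list to optimize.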
Continuous Optimization
- Review slow queries quarterly: Identify bottlenecks and optimize BigQuery schema or queries.
- Archive unused dashboards: Remove dashboards that no one uses.
- Update data freshness: Adjust cache timeouts based on user needs.
- Upgrade Superset regularly: New versions bring performance improvements and features.
Governance and Standards
- Document dashboard owners: Know who’s responsible for each dashboard.
- Define naming conventions: Use consistent naming for datasets, dashboards, and charts.
- Establish SLAs: Define query latency and uptime targets.
- Review access quarterly: Ensure users only have access to data they need.
Conclusion: Why Superset Makes Sense for Looker Customers
Migrating from Looker to Apache Superset on BigQuery is achievable and rewarding. You’ll reduce costs by 60–80%, gain flexibility, and maintain or improve performance.
The migration requires planning and effort, especially for complex Looker instances, but the payoff is substantial. You’re moving from a proprietary, expensive platform to an open-source, community-driven one that’s increasingly recognized as a Looker alternative for modern BI.
Start with a clear audit of your Looker instance, set up Superset with BigQuery, and migrate dashboards in phases. Run both systems in parallel, validate accuracy, and train users. After cutover, focus on optimization and governance.
If you want to accelerate the migration and avoid infrastructure overhead, D23 offers managed Superset with BigQuery integration, AI-powered text-to-SQL, and expert data consulting. Whether you self-host or use a managed service, Superset is a proven alternative to Looker that works at scale.
Your data stays in BigQuery. Your BI platform becomes open-source, cost-effective, and under your control. That’s the Superset advantage.