Apache Superset SSO with Okta, Azure AD, and Google Workspace
Step-by-step guide to configure SAML and OAuth SSO for Apache Superset with Okta, Azure AD, and Google Workspace. Enterprise identity integration.
Understanding Apache Superset SSO Architecture
Single Sign-On (SSO) is no longer optional for enterprise analytics platforms. When you’re managing Apache Superset at scale—especially across distributed teams or as an embedded analytics solution—forcing users to maintain separate credentials becomes a friction point that kills adoption. SSO eliminates that friction by letting users authenticate through their existing identity provider (IdP), whether that’s Okta, Azure Active Directory, or Google Workspace.
Apache Superset handles authentication through Flask-AppBuilder, its underlying web framework. Flask-AppBuilder supports multiple authentication backends, including SAML 2.0 and OAuth 2.0, which means you can connect Superset directly to your enterprise identity infrastructure without building custom integrations. The beauty of this approach is that it’s declarative—you configure a few settings in Superset’s config file, and authentication flows through your IdP automatically.
The core benefit isn’t just convenience. SSO gives you centralized access control, audit trails for who accessed what dashboards and when, and the ability to revoke access instantly across all applications by disabling a user in your IdP. For teams running D23’s managed Apache Superset platform, this integration happens out of the box, but understanding the mechanics helps you make informed decisions about your deployment.
SAML 2.0 vs. OAuth 2.0: Which Protocol Does Superset Need?
Before diving into configuration, you need to understand which protocol your identity provider supports and which one Superset expects. These aren’t interchangeable.
SAML 2.0 (Security Assertion Markup Language) is an XML-based protocol designed specifically for enterprise SSO. Okta and Azure AD both support SAML natively. When you use SAML, your identity provider generates digitally signed XML assertions containing user identity information. Superset validates the signature, extracts user attributes (email, name, groups), and logs the user in. SAML is stateful—the IdP maintains knowledge of the authentication session.
OAuth 2.0 is a delegation protocol originally built for third-party authorization; sign-in is layered on top of it through OpenID Connect. Google Workspace primarily uses OAuth 2.0 for authentication. Instead of the IdP posting identity data directly to Superset, OAuth redirects users to Google’s login page, verifies their credentials, and returns an authorization code. Superset exchanges that code for an access token and then uses the token to fetch user information via an API call. What Superset holds at the end is a token, not a signed assertion.
Apache Superset’s Flask-AppBuilder framework selects the protocol through the AUTH_TYPE setting. You’ll use AUTH_SAML for SAML-based providers and AUTH_OAUTH for OAuth-based providers. Confirm that the Flask-AppBuilder version bundled with your Superset build actually exports AUTH_SAML before relying on it, since older releases ship only database, LDAP, OpenID, remote-user, and OAuth backends. The configuration approach differs, but the end result (authenticated users in Superset) is identical.
For most enterprises, the choice is straightforward: if your IdP is Okta or Azure AD, use SAML. If you’re using Google Workspace and want to avoid managing a separate SAML configuration, use OAuth. Some organizations run both simultaneously for different user populations.
Setting Up SAML Authentication with Okta
Okta is one of the most widely deployed enterprise identity providers, and configuring SAML SSO between Okta and Apache Superset is a two-sided process: you configure Okta as the identity provider, then configure Superset as the service provider.
Step 1: Create an Okta Application
Log into your Okta admin dashboard and navigate to Applications > Create App Integration. Select SAML 2.0 as the sign-on method. Give your application a name—something like “Apache Superset” or “D23 Analytics Platform” makes it clear in your application inventory.
On the SAML Settings page, you’ll need to provide critical URLs. The Single Sign-On URL (also called the Assertion Consumer Service URL or ACS URL) tells Okta where to send the SAML assertion after authentication. This is typically https://your-superset-domain.com/auth/login or, if Superset is behind a reverse proxy, the proxy URL with /auth/login appended.
The Audience URI (Entity ID) uniquely identifies your Superset instance. This is usually https://your-superset-domain.com/metadata/. This value must match exactly in your Superset configuration—a single character mismatch causes silent authentication failures.
Set the Name ID Format to EmailAddress and configure attribute statements to map Okta user attributes to Superset attributes:
email → http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress
name → http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname
groups → http://schemas.xmlsoap.org/ws/2005/05/identity/claims/groups (optional, for role-based access)
Once you’ve saved the application, Okta generates an Identity Provider Metadata URL. This is a single XML file containing everything Superset needs to validate SAML assertions: signing certificates, endpoint URLs, and supported protocols. Copy this URL—you’ll need it in Superset’s configuration.
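To make the contents of that metadata file concrete, the sketch below parses a trimmed-down, hypothetical metadata document with Python’s standard library and pulls out the entity ID, the sign-on endpoints, and the signing certificate. The sample XML, the placeholder certificate, and the function name summarize_idp_metadata are assumptions for illustration; real Okta metadata contains more elements.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 metadata and XML-signature namespaces.
MD = "urn:oasis:names:tc:SAML:2.0:metadata"
DS = "http://www.w3.org/2000/09/xmldsig#"

# Hypothetical, heavily trimmed metadata; the certificate is a placeholder.
SAMPLE_METADATA = """\
<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
    entityID="http://www.okta.com/example-entity">
  <md:IDPSSODescriptor
      protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:KeyDescriptor use="signing">
      <ds:KeyInfo>
        <ds:X509Data>
          <ds:X509Certificate>PLACEHOLDER-CERT</ds:X509Certificate>
        </ds:X509Data>
      </ds:KeyInfo>
    </md:KeyDescriptor>
    <md:SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="https://your-okta-domain.okta.com/app/123456/sso/saml"/>
  </md:IDPSSODescriptor>
</md:EntityDescriptor>
"""


def summarize_idp_metadata(xml_text):
    """Return the entity ID, SSO endpoints keyed by binding, and signing certs."""
    root = ET.fromstring(xml_text)
    idp = root.find(f"{{{MD}}}IDPSSODescriptor")
    if idp is None:
        raise ValueError("no IDPSSODescriptor element in metadata")
    endpoints = {
        svc.get("Binding"): svc.get("Location")
        for svc in idp.findall(f"{{{MD}}}SingleSignOnService")
    }
    certs = []
    for key in idp.findall(f"{{{MD}}}KeyDescriptor"):
        # A KeyDescriptor without a "use" attribute applies to signing as well.
        if key.get("use") in (None, "signing"):
            for cert in key.iter(f"{{{DS}}}X509Certificate"):
                certs.append((cert.text or "").strip())
    return {
        "entity_id": root.get("entityID"),
        "sso_endpoints": endpoints,
        "signing_certs": certs,
    }


summary = summarize_idp_metadata(SAMPLE_METADATA)
print(summary["entity_id"])
print(summary["sso_endpoints"])
```

This is only an inspection aid for checking what the IdP publishes. The standard-library parser is not hardened against hostile XML, so metadata from an untrusted source calls for a hardened parser instead.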
Step 2: Configure Superset for SAML
Superset’s configuration lives in superset_config.py, a Python file that overrides default settings. If you’re running Superset via Docker (the recommended approach), this file is typically mounted as a volume. According to the Apache Superset Official Documentation - Installation Guide, you can customize authentication by modifying the configuration file.
Add the following configuration block to superset_config.py:
from flask_appbuilder.security.manager import AUTH_SAML

AUTH_TYPE = AUTH_SAML
SAML_METADATA_URL = "https://your-okta-domain.okta.com/app/123456/sso/saml/metadata"

SAML_SETTINGS = {
    "sp": {
        "entityID": "https://your-superset-domain.com/metadata/",
        "assertionConsumerService": {
            "url": "https://your-superset-domain.com/auth/login",
            "binding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST",
        },
        "singleLogoutService": {
            "url": "https://your-superset-domain.com/auth/logout",
            "binding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect",
        },
    },
}

SAML_ATTRIBUTE_MAPPING = {
    "email": ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress", None),
    "name": ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname", None),
    "groups": ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/groups", None),
}
The SAML_METADATA_URL tells Superset where to fetch Okta’s metadata. Superset downloads this file, caches it, and uses the certificate inside to validate incoming SAML assertions. When Okta’s signing certificate is rotated, Superset picks up the new one the next time it refreshes the metadata, so no certificate has to be copied by hand.
The SAML_ATTRIBUTE_MAPPING dictionary tells Superset which SAML attributes correspond to Superset user fields. The key (e.g., email) is the Superset field; the value is a tuple containing the SAML attribute name and an optional default value.
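The mapping semantics described above can be illustrated with a few lines of plain Python. This is a sketch of the idea, not Superset’s internal code: the function map_attributes, the MULTI_VALUED set, and the sample assertion are hypothetical, and it assumes the assertion’s attributes arrive as a dictionary of value lists.

```python
CLAIMS = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/"

# Same shape as the mapping described above: field -> (SAML attribute, default).
ATTRIBUTE_MAPPING = {
    "email": (CLAIMS + "emailaddress", None),
    "name": (CLAIMS + "givenname", "Unknown"),
    "groups": (CLAIMS + "groups", None),
}

# Fields that should keep every value instead of only the first one.
MULTI_VALUED = {"groups"}


def map_attributes(assertion_attrs, mapping=ATTRIBUTE_MAPPING):
    """Translate {saml_attribute: [values]} into {user_field: value}."""
    user = {}
    for field, (attr_name, default) in mapping.items():
        values = assertion_attrs.get(attr_name) or []
        if field in MULTI_VALUED:
            user[field] = list(values)
        else:
            # Fall back to the default when the IdP did not send the attribute.
            user[field] = values[0] if values else default
    return user


# Hypothetical attributes from an assertion that omits the given name.
sample = {
    CLAIMS + "emailaddress": ["alice@example.com"],
    CLAIMS + "groups": ["okta-superset-admins", "okta-superset-viewers"],
}
print(map_attributes(sample))
```

Running the mapping against a captured assertion before going live shows quickly whether a field would end up empty or fall back to its default.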
If you’re using role-based access control (RBAC) in Superset, you can map Okta groups to Superset roles:
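# Keys are group names sent by the IdP; values are lists of Superset role names.
# Superset's built-in roles are Admin, Alpha, Gamma, sql_lab, and Public, so
# "Viewer" and "Editor" must exist as custom roles before they can be mapped.
AUTH_ROLES_MAPPING = {
    "okta-superset-admins": ["Admin"],
    "okta-superset-viewers": ["Viewer"],
    "okta-superset-editors": ["Editor"],
}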
After updating superset_config.py, restart your Superset container. The next time a user navigates to your Superset instance, they’ll be redirected to Okta’s login page. After authenticating with Okta, they’ll be redirected back to Superset with a signed SAML assertion, which Superset validates and uses to create or update the user account.
Step 3: Troubleshooting SAML Configuration
SAML authentication is finicky. The most common failure modes are:
Certificate validation errors: Okta’s certificate must match the one in the metadata URL. If you’ve pinned a certificate in Superset’s configuration, it becomes stale when Okta rotates certificates. Always use SAML_METADATA_URL instead of hardcoding certificates.
URL mismatch: The Audience URI in Okta must match entityID in Superset’s configuration exactly. If Okta expects https://superset.example.com/metadata/ but Superset is configured with https://superset.example.com/metadata, authentication fails silently.
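Binding mismatch: SAML can use HTTP-Redirect or HTTP-POST for the authentication request sent to Okta, but the response carrying the assertion is delivered to the ACS URL with HTTP-POST, because a signed assertion is too large for a URL query string. Superset defaults to HTTP-POST, which is correct for most deployments. If a proxy in front of Superset strips or truncates POST bodies, fix the proxy rather than changing the binding.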
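Because exact-match failures are easy to miss by eye, a small comparison helper can flag them before the first login attempt. The sketch below is a hypothetical utility (the function name and dictionary keys are assumptions); you would fill the two dictionaries by hand from the Okta admin page and from superset_config.py.

```python
def find_mismatches(idp_values, superset_values):
    """Compare values configured on both sides and describe each difference."""
    problems = []
    for key in sorted(set(idp_values) | set(superset_values)):
        idp, sp = idp_values.get(key), superset_values.get(key)
        if idp == sp:
            continue
        if idp is None or sp is None:
            problems.append(f"{key}: set on only one side")
        elif idp.rstrip("/") == sp.rstrip("/"):
            problems.append(f"{key}: differs only by a trailing slash")
        elif idp.lower() == sp.lower():
            problems.append(f"{key}: differs only by letter case")
        else:
            problems.append(f"{key}: {idp!r} != {sp!r}")
    return problems


okta_side = {
    "entity_id": "https://superset.example.com/metadata/",
    "acs_url": "https://superset.example.com/auth/login",
}
superset_side = {
    "entity_id": "https://superset.example.com/metadata",
    "acs_url": "https://superset.example.com/auth/login",
}
for problem in find_mismatches(okta_side, superset_side):
    print(problem)
```

With the sample values, the only reported problem is the trailing slash on the entity ID, the same case described in the URL mismatch item above.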
To debug SAML issues, enable verbose logging in Superset:
import logging
logging.getLogger("onelogin.saml2").setLevel(logging.DEBUG)
Then check Superset’s logs for SAML parsing errors. Okta’s system log (in the admin dashboard) also shows authentication attempts and failures, which helps identify whether the problem is on Okta’s side or Superset’s.
Configuring Azure AD (Entra ID) with SAML
Microsoft’s Azure Active Directory (now called Entra ID) is another enterprise standard. The configuration approach is similar to Okta, but the Azure AD interface and metadata format differ slightly.
Step 1: Register Superset in Azure AD
In the Azure portal, SAML single sign-on is configured on an enterprise application rather than directly on an app registration. Azure AD has no gallery template for Superset, so navigate to Azure Active Directory > Enterprise applications > New application > Create your own application, select Integrate any other application you don’t find in the gallery, and name it “Apache Superset”. Azure AD creates the matching app registration for you.
Once created, navigate to the Single sign-on section and select SAML. Azure AD will prompt you to upload or configure SAML settings.
Step 2: Configure SAML Settings in Azure AD
You’ll need to provide:
Identifier (Entity ID): https://your-superset-domain.com/metadata/
Reply URL (Assertion Consumer Service URL): https://your-superset-domain.com/auth/login
Sign on URL: https://your-superset-domain.com/
In the Attributes & Claims section, configure the following claim mappings:
email → user.mail
name → user.givenname
groups → user.groups (if using group-based RBAC)
Azure AD will generate a Federation Metadata URL. This URL serves the same purpose as Okta’s metadata URL—it contains the certificate and endpoints Superset needs.
According to the Microsoft Azure AD Application Registration Guide, you can configure SAML applications with just these basic settings. Azure AD’s metadata endpoint is typically https://login.microsoftonline.com/{tenant-id}/federationmetadata/2007-06/federationmetadata.xml.
Step 3: Configure Superset for Azure AD
The Superset configuration for Azure AD is nearly identical to Okta:
from flask_appbuilder.security.manager import AUTH_SAML

AUTH_TYPE = AUTH_SAML
SAML_METADATA_URL = "https://login.microsoftonline.com/{tenant-id}/federationmetadata/2007-06/federationmetadata.xml"

SAML_SETTINGS = {
    "sp": {
        "entityID": "https://your-superset-domain.com/metadata/",
        "assertionConsumerService": {
            "url": "https://your-superset-domain.com/auth/login",
            "binding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST",
        },
        "singleLogoutService": {
            "url": "https://your-superset-domain.com/auth/logout",
            "binding": "urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect",
        },
    },
}

SAML_ATTRIBUTE_MAPPING = {
    "email": ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress", None),
    "name": ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname", None),
    "groups": ("http://schemas.xmlsoap.org/ws/2005/05/identity/claims/groups", None),
}
Replace {tenant-id} with your actual Azure AD tenant ID. You can find this in the Azure portal under Azure Active Directory > Properties.
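One key difference with Azure AD: the groups claim requires additional configuration. By default, Azure AD doesn’t include group memberships in SAML assertions. To enable this, add a groups claim (under Token configuration > Add groups claim, or from the Attributes & Claims section of the enterprise application) and select the appropriate option (security groups, all groups, or groups assigned to the application). Azure AD emits group object IDs (GUIDs) by default rather than display names, so the keys in AUTH_ROLES_MAPPING must use whichever form the claim actually carries.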
OAuth 2.0 Authentication with Google Workspace
Google Workspace uses OAuth 2.0, not SAML. The authentication flow is different: instead of Superset validating a signed assertion, users are redirected to Google’s login page, and Google returns an authorization code that Superset exchanges for user information.
Step 1: Create a Google Cloud Project and OAuth Credentials
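Head to the Google Cloud Console and create a new project. Configure the OAuth consent screen for the project; choosing the Internal user type limits sign-in to accounts in your Workspace organization. The email and profile scopes used below are served by Google’s standard userinfo endpoint, so no additional API needs to be enabled (the Google+ API has been retired, and the Gmail API is not needed for profile information).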
Navigate to Credentials > Create Credentials > OAuth 2.0 Client ID. Select Web application as the application type.
Add authorized redirect URIs:
https://your-superset-domain.com/auth/login
https://your-superset-domain.com/oauth-authorized/google
Google will generate a Client ID and Client Secret. Store these securely—the Client Secret should never be committed to version control or exposed in logs.
Step 2: Configure Superset for OAuth 2.0
Superset’s OAuth configuration uses a different authentication manager than SAML. Add this to superset_config.py:
import os

from flask_appbuilder.security.manager import AUTH_OAUTH

AUTH_TYPE = AUTH_OAUTH

OAUTH_PROVIDERS = [
    {
        "name": "google",
        "icon": "fa-google",
        "remote_app": {
            # Credentials are read from the environment, never hardcoded.
            "client_id": os.environ.get("GOOGLE_OAUTH_CLIENT_ID"),
            "client_secret": os.environ.get("GOOGLE_OAUTH_CLIENT_SECRET"),
            "api_base_url": "https://www.googleapis.com/oauth2/v1/",
            "client_kwargs": {"scope": "email profile"},
            "request_token_url": None,
            "access_token_url": "https://accounts.google.com/o/oauth2/token",
            "authorize_url": "https://accounts.google.com/o/oauth2/auth",
            "authorize_params": {"hd": "your-domain.com"},
        },
    },
]
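from superset.security import SupersetSecurityManager

class GoogleSecurityManager(SupersetSecurityManager):
    # Map Google's userinfo response onto the fields Flask-AppBuilder expects.
    def oauth_user_info(self, provider, response=None):
        me = self.appbuilder.sm.oauth_remotes[provider].get("userinfo").json()
        return {"username": me["email"], "email": me["email"],
                "first_name": me.get("given_name", ""), "last_name": me.get("family_name", "")}

CUSTOM_SECURITY_MANAGER = GoogleSecurityManager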
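The hd parameter in authorize_params asks Google to show only accounts from your Google Workspace domain on the sign-in screen. It is a hint on the request, not an enforcement mechanism, so Superset should also confirm that the returned profile belongs to your domain; otherwise anyone with a Google account may be able to log in.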
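Since request parameters can be altered by the client, Google’s guidance is to check the hosted-domain value on the server as well. A minimal sketch of such a check follows; the function name belongs_to_domain and the sample profiles are hypothetical, and it assumes the profile dictionary is the userinfo response, which reports the verified flag as either verified_email or email_verified depending on the endpoint version.

```python
def belongs_to_domain(userinfo, allowed_domain):
    """Return True only for a verified address in the allowed Workspace domain."""
    allowed = allowed_domain.lower()
    hosted = str(userinfo.get("hd", "")).lower()
    email = str(userinfo.get("email", "")).lower()
    # The flag name differs between userinfo endpoint versions.
    verified = userinfo.get("email_verified", userinfo.get("verified_email", False))
    return verified is True and hosted == allowed and email.endswith("@" + allowed)


workspace_user = {
    "email": "alice@your-domain.com",
    "verified_email": True,
    "hd": "your-domain.com",
}
personal_account = {"email": "bob@gmail.com", "verified_email": True}

print(belongs_to_domain(workspace_user, "your-domain.com"))
print(belongs_to_domain(personal_account, "your-domain.com"))
```

A check like this would sit in whatever code turns the provider response into a Superset user, rejecting the login when it returns False.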
Store the Client ID and Client Secret as environment variables rather than hardcoding them. This is a security best practice—if your config file is ever exposed, your OAuth credentials remain protected.
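One caveat with os.environ.get is that a missing variable silently becomes None, and the failure only surfaces later as an opaque OAuth error. A small fail-fast helper, sketched here with a hypothetical name (require_env) and a demo variable name, makes the misconfiguration visible at startup.

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo only: set a throwaway variable so the example runs end to end.
os.environ["DEMO_OAUTH_CLIENT_ID"] = "example-client-id"
print(require_env("DEMO_OAUTH_CLIENT_ID"))
```

In superset_config.py the same helper could wrap the two Google variable names, so that a container started without its secrets refuses to boot instead of serving a broken login page.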
According to the Google OAuth 2.0 Official Documentation, you can customize the OAuth flow by modifying the remote_app configuration. Superset uses Authlib under the hood, which implements the OAuth 2.0 specification.
Step 3: Testing OAuth Authentication
After restarting Superset, the login page should display a “Login with Google” button. Clicking it redirects to Google’s login page. After authenticating, Google redirects back to Superset with an authorization code, which Superset exchanges for user information.
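The first leg of that flow is just a redirect to a URL with a handful of query parameters. Authlib builds it for you inside Superset, so the sketch below is purely illustrative; the function name build_authorize_url and the sample client ID are assumptions. It shows what to expect in the browser’s address bar when debugging, including the random state value that ties the callback to the original request.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Same authorize endpoint as in the configuration above.
AUTHORIZE_URL = "https://accounts.google.com/o/oauth2/auth"


def build_authorize_url(client_id, redirect_uri, hosted_domain):
    """Return (url, state); the caller keeps state to compare on the callback."""
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",  # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "email profile",
        "state": state,
        "hd": hosted_domain,
    })
    return f"{AUTHORIZE_URL}?{query}", state


url, state = build_authorize_url(
    "example-client-id",
    "https://your-superset-domain.com/oauth-authorized/google",
    "your-domain.com",
)
params = parse_qs(urlparse(url).query)
print(params["redirect_uri"][0])
```

If the redirect_uri seen here does not match one of the authorized redirect URIs registered in the Google Cloud Console character for character, Google rejects the request before the user ever sees a login form.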
One thing to watch: OAuth doesn’t automatically create users in Superset. You need to configure auto-user registration. Add this to superset_config.py:
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = "Viewer"
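This creates new users automatically with the “Viewer” role. Superset’s built-in roles are Admin, Alpha, Gamma, sql_lab, and Public, so a role named “Viewer” must be created first; otherwise use a built-in role such as Gamma. You can adjust the default role based on your security posture.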
Advanced: Multi-Provider SSO and Role-Based Access Control
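For organizations with heterogeneous identity infrastructure (some teams on Okta, others on Google Workspace), you may want to offer more than one login option. Flask-AppBuilder’s AUTH_TYPE holds a single value, and OAUTH_PROVIDERS is only consulted when AUTH_TYPE is AUTH_OAUTH, so the most dependable way to offer several providers is to list each one in OAUTH_PROVIDERS; Okta and Azure AD both support OAuth 2.0/OpenID Connect alongside SAML. Treat the mixed SAML-plus-OAuth block below as a sketch to verify against your installed version.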
from flask_appbuilder.security.manager import AUTH_SAML, AUTH_OAUTH

AUTH_TYPE = AUTH_SAML  # Primary auth method
SAML_METADATA_URL = "https://your-okta-domain.okta.com/app/123456/sso/saml/metadata"

# Enable OAuth as a secondary method
OAUTH_PROVIDERS = [
    {
        "name": "google",
        "icon": "fa-google",
        "remote_app": {...},
    },
]
When more than one provider is available, the login page shows one option per provider. Users choose their provider, and Superset creates or updates their account accordingly.
For role-based access control, map IdP groups to Superset roles. In Okta, this means configuring group claims in the SAML assertion. In Azure AD, it’s the token configuration. Google’s userinfo response does not carry group memberships, so for Google Workspace you’d need a separate lookup (a directory API call or a third-party service) to map Google Groups to application roles.
AUTH_ROLES_MAPPING = {
    "data-admins": ["Admin"],
    "analytics-engineers": ["Editor"],
    "business-users": ["Viewer"],
}
When a user logs in via SSO, Superset extracts their group memberships from the IdP and assigns the corresponding Superset roles. Set AUTH_ROLES_SYNC_AT_LOGIN = True if roles should be recalculated on every login rather than only when the account is first registered. This eliminates manual user provisioning and keeps access control synchronized with your identity provider.
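The lookup itself is simple enough to sketch in a few lines. The function below is an illustration of the mapping logic, not Flask-AppBuilder’s implementation; resolve_roles and its default_role parameter are hypothetical names.

```python
ROLES_MAPPING = {
    "data-admins": ["Admin"],
    "analytics-engineers": ["Editor"],
    "business-users": ["Viewer"],
}


def resolve_roles(user_groups, roles_mapping, default_role=None):
    """Collect the roles for every mapped group, without duplicates."""
    roles = []
    for group in user_groups:
        for role in roles_mapping.get(group, []):
            if role not in roles:
                roles.append(role)
    if not roles and default_role is not None:
        # No group matched: fall back to the registration role, if any.
        roles.append(default_role)
    return roles


print(resolve_roles(["analytics-engineers", "business-users"], ROLES_MAPPING))
print(resolve_roles(["finance"], ROLES_MAPPING, default_role="Viewer"))
```

Note that the match is exact: a group the IdP sends as "Data-Admins" would not hit the "data-admins" key in this sketch.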
Security Best Practices for Superset SSO
SSO is only as secure as your implementation. Here are critical practices:
Use HTTPS everywhere: All SSO communication must be encrypted. Self-signed certificates are acceptable for testing, but production deployments require valid, trusted certificates.
Validate metadata signatures: If you’re not using SAML_METADATA_URL (which Superset refreshes automatically), you’re responsible for validating that metadata hasn’t been tampered with. This is another reason to prefer dynamic metadata URLs over static configurations.
Implement session timeouts: SSO sessions can persist longer than local sessions. Configure Superset to log users out after a period of inactivity:
PERMANENT_SESSION_LIFETIME = 1800 # 30 minutes in seconds
Audit all authentication events: Enable Superset’s audit logging to track who logged in, when, and from where. This is essential for compliance and incident response.
Rotate OAuth secrets regularly: If you’re using OAuth, rotate your Client Secret every 90 days. This limits the window of exposure if credentials are compromised.
Enforce MFA at the IdP level: Rather than implementing MFA in Superset, enforce it in your identity provider. This centralizes security policy and reduces configuration complexity.
According to the Okta Security and SSO Best Practices, organizations should implement conditional access policies that require MFA for sensitive operations, restrict login by geography, and monitor for anomalous authentication patterns.
Troubleshooting Common SSO Issues
SSO implementations often encounter predictable problems. Here’s how to diagnose and fix them:
Users can’t log in via SSO but local authentication works: This indicates a configuration mismatch between your IdP and Superset. Check that the Entity ID, ACS URL, and attribute mappings match exactly on both sides. Enable debug logging in Superset to see the actual SAML assertion or OAuth response.
Users log in but are missing roles or permissions: The IdP is authenticating users, but group claims aren’t being passed to Superset. Verify that your IdP is configured to include group claims in SAML assertions or OAuth tokens. Then confirm that AUTH_ROLES_MAPPING in Superset matches the exact group names from your IdP.
Users are stuck on the login page after SSO authentication: This usually means Superset can’t find or create the user account. Check that the email attribute mapping is correct and that the email attribute actually exists in your IdP’s SAML assertion or OAuth response.
SSO works for some users but not others: This typically happens when group-based access control is misconfigured. Some users belong to groups that map to Superset roles, while others don’t. Verify that all users belong to at least one group that’s configured in AUTH_ROLES_MAPPING.
Metadata refresh fails, and SSO stops working: If you’re using SAML_METADATA_URL, Superset caches the metadata. If the IdP rotates or retires its signing certificate before Superset’s cache is refreshed, authentication fails. Certificate lifetimes vary by IdP and tenant, so check the expiry date of the active signing certificate and monitor it rather than assuming a fixed schedule.
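For the missing-roles cases above, it often helps to compare the group names the IdP actually sent with the keys in AUTH_ROLES_MAPPING. The following sketch is a hypothetical diagnostic (explain_unmapped_groups is not part of Superset) that separates near misses, such as differences in case or stray whitespace, from groups that are simply absent from the mapping.

```python
def explain_unmapped_groups(user_groups, roles_mapping):
    """Describe every group that does not exactly match a mapping key."""
    by_folded = {key.strip().casefold(): key for key in roles_mapping}
    hints = []
    for group in user_groups:
        if group in roles_mapping:
            continue  # exact match, nothing to report
        near = by_folded.get(group.strip().casefold())
        if near is not None:
            hints.append(f"{group!r} almost matches mapping key {near!r}")
        else:
            hints.append(f"{group!r} has no entry in the mapping")
    return hints


mapping = {"data-admins": ["Admin"], "business-users": ["Viewer"]}
for hint in explain_unmapped_groups(["Data-Admins", "finance", "business-users"], mapping):
    print(hint)
```

The group list would come from the debug log of a real login attempt; the values here are made up.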
Integrating SSO with Embedded Analytics
If you’re embedding Superset dashboards in your product (a common pattern for SaaS platforms), SSO introduces additional complexity. Users need to authenticate to your application, then access embedded Superset dashboards without re-authenticating.
The solution is to use Superset’s API to generate guest tokens. Your backend authenticates the user via your own authentication system, then requests a guest token from Superset’s API:
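import requests

def fetch_guest_token(user, dashboard_id, superset_api_token):
    # Ask Superset for a guest token scoped to a single dashboard.
    response = requests.post(
        "https://your-superset-domain.com/api/v1/security/guest_token",
        json={
            "user": {
                "username": user.email,
                "first_name": user.first_name,
                "last_name": user.last_name,
            },
            "resources": [
                {"type": "dashboard", "id": dashboard_id},
            ],
            # Row-level security clause; replace the literal with the caller's real filter.
            "rls": [{"clause": "user_id = 123"}],
        },
        headers={"Authorization": f"Bearer {superset_api_token}"},
        timeout=10,
    )
    response.raise_for_status()  # Surface HTTP errors instead of a KeyError below
    return response.json()["token"]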
Your frontend then embeds the dashboard with this guest token, allowing users to view the dashboard without logging into Superset separately. This pattern works regardless of whether Superset uses SSO or local authentication.
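The row-level security clause is the part of the request most likely to go wrong, because it is a SQL fragment assembled by your backend. One way to keep it safe, sketched below under the assumption that rows are filtered by an integer user_id column, is to build the payload in a helper that refuses anything but an integer. The function name and parameters are hypothetical.

```python
def build_guest_token_payload(username, first_name, last_name, dashboard_id, row_owner_id):
    """Build the guest-token request body with a per-user RLS clause."""
    # Only plain integers are interpolated, so no user-supplied SQL can slip in.
    if isinstance(row_owner_id, bool) or not isinstance(row_owner_id, int):
        raise TypeError("row_owner_id must be an integer")
    return {
        "user": {
            "username": username,
            "first_name": first_name,
            "last_name": last_name,
        },
        "resources": [{"type": "dashboard", "id": dashboard_id}],
        "rls": [{"clause": f"user_id = {row_owner_id}"}],
    }


payload = build_guest_token_payload("alice@example.com", "Alice", "Ng", "example-dashboard-id", 123)
print(payload["rls"])
```

The resulting dictionary has the same shape as the json argument in the request above, with the hardcoded 123 replaced by the authenticated user’s own identifier.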
For platforms using D23’s managed Superset, guest token generation and embedded analytics are preconfigured, eliminating the need for custom API integration.
Migration Path: From Local Authentication to SSO
If you’re running Superset with local authentication and want to migrate to SSO, the process is straightforward but requires planning:
Phase 1: Parallel authentication: Stand up SSO while local logins still work, so that users can be moved gradually. AUTH_TYPE holds a single value in superset_config.py, so confirm how your Flask-AppBuilder version can expose both AUTH_SAML and local login before planning around it.
Phase 2: User migration: Encourage users to log in via SSO. Superset matches users by email address, so if a user’s email in the IdP matches their email in Superset, they’ll access the same account.
Phase 3: Disable local authentication: Once all users have migrated to SSO, disable local login:
AUTH_TYPE = AUTH_SAML # Remove local auth entirely
This approach minimizes disruption and gives users time to adjust to the new authentication method.
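Before Phase 2, it is worth checking which existing accounts will actually line up with the IdP. The sketch below assumes you have exported two plain lists of email addresses, one from Superset and one from the identity provider; compare_accounts is a hypothetical helper, and whether near matches are treated as the same account depends on your Superset version, so they are reported separately.

```python
def compare_accounts(local_emails, idp_emails):
    """Report local accounts with no exact counterpart in the IdP."""
    idp_exact = set(idp_emails)
    idp_folded = {email.strip().casefold() for email in idp_emails}
    report = {"case_or_whitespace_only": [], "missing_in_idp": []}
    for email in local_emails:
        if email in idp_exact:
            continue  # exact match, will map to the same account
        if email.strip().casefold() in idp_folded:
            report["case_or_whitespace_only"].append(email)
        else:
            report["missing_in_idp"].append(email)
    return report


local = ["alice@example.com", "Bob@Example.com", "carol@old-domain.com"]
idp = ["alice@example.com", "bob@example.com"]
print(compare_accounts(local, idp))
```

Accounts in either list are candidates for cleanup before the cutover, since a mismatch can leave a user with a fresh, empty account instead of their existing dashboards.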
Compliance and Audit Considerations
Enterprise deployments often require audit trails for compliance reasons (SOC 2, ISO 27001, HIPAA, etc.). SSO integration should include comprehensive logging:
Authentication logs: Record every login attempt, including timestamp, user, IdP, and success/failure status.
Authorization logs: Record role changes, permission grants, and access to sensitive dashboards.
Token logs: Record OAuth token generation and expiration.
Superset’s built-in audit logging captures some of this information, but you may need to supplement it with application-level logging or SIEM integration.
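If you do add application-level logging, emitting one JSON object per event keeps the records easy to ship to a SIEM. The following is a minimal sketch using only the standard library; the logger name sso.audit, the field names, and the auth_event helper are assumptions rather than anything Superset defines.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")
audit_log = logging.getLogger("sso.audit")  # hypothetical logger name


def auth_event(user, idp, success, source_ip, now=None):
    """Serialize one login attempt as a single JSON line."""
    now = now or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": now.isoformat(),
            "event": "login",
            "user": user,
            "idp": idp,
            "outcome": "success" if success else "failure",
            "source_ip": source_ip,
        },
        sort_keys=True,
    )


audit_log.info(auth_event("alice@example.com", "okta", True, "203.0.113.7"))
```

Each line covers the fields named in the authentication-log item above (timestamp, user, IdP, and outcome), plus the source address.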
According to the Auth0 Authentication and Authorization Flow Documentation, modern identity platforms provide audit logs and reporting dashboards that show authentication patterns, failed login attempts, and anomalous behavior.
For regulated industries, consider implementing conditional access policies: require MFA for administrative actions, restrict dashboard access by IP range, and flag unusual authentication patterns (e.g., login from a new country).
Conclusion: SSO as a Foundation for Secure Analytics
Configuring Apache Superset SSO with Okta, Azure AD, or Google Workspace is a one-time investment that pays dividends in security, user experience, and operational efficiency. SAML-based providers like Okta and Azure AD offer centralized authentication with group-based access control. OAuth 2.0 providers like Google Workspace offer simplicity and integration with existing Google Workspace deployments.
The technical configuration is straightforward—a few lines in superset_config.py and some metadata setup in your IdP—but the security implications are significant. SSO centralizes access control, enables audit trails, and allows you to enforce security policies (MFA, conditional access, session timeouts) at the identity provider level rather than in Superset itself.
For teams deploying Superset at scale, SSO is non-negotiable. It’s the difference between managing authentication across dozens of applications and managing it once, in your identity provider. Whether you’re self-hosting Superset or using a managed platform like D23, SSO integration should be one of your first configuration steps.
As you implement SSO, remember that security is ongoing. Monitor authentication logs, rotate credentials regularly, update your IdP’s certificate configurations, and audit access patterns. SSO is a tool that makes security easier, but only if you use it correctly.