Google Cloud Identity for Apache Superset SSO
Set up Google Cloud Identity SSO with Apache Superset via SAML. Enterprise authentication guide for Superset deployments.
Understanding Google Cloud Identity and Apache Superset Integration
Google Cloud Identity serves as a centralized identity and access management (IAM) platform for enterprise organizations. When integrated with Apache Superset, it enables single sign-on (SSO) capabilities that streamline user authentication, reduce credential management overhead, and enforce consistent security policies across your analytics platform. This integration is particularly valuable for teams running Superset in production environments where you need to synchronize user access with your existing enterprise directory.
Apache Superset, the modern open-source business intelligence platform, supports multiple authentication mechanisms. OAuth 2.0 and OIDC (OpenID Connect) are available through its built-in Flask-AppBuilder configuration, while SAML 2.0 requires a custom security manager, as described later in this guide. The choice of protocol depends on your organization’s infrastructure and security requirements. Google Cloud Identity, as a SAML 2.0 identity provider, integrates cleanly with Superset’s security framework, allowing you to manage dashboard access, role-based permissions, and user provisioning through a single source of truth.
The fundamental advantage of this integration is operational efficiency. Instead of maintaining separate user lists in Superset, you leverage your existing Google Cloud Identity directory. When an employee joins or leaves your organization, their ability to sign in to Superset follows their status in the directory, because every login is authenticated by Google. This approach eliminates the manual user management burden that plagues analytics teams at scale-ups and mid-market companies.
Prerequisites and Planning
Before you begin configuring Google Cloud Identity SSO for Apache Superset, verify that your environment meets the following requirements:
Infrastructure Requirements:
- Apache Superset version 1.5 or later (ideally 2.0+)
- A Google Cloud organization with active billing enabled
- Admin access to both your Google Cloud Identity console and your Superset deployment
- Network connectivity between your Superset instance and Google’s identity servers
- HTTPS enabled on your Superset instance (required for SAML redirects)
Organizational Prerequisites:
- A dedicated Google Cloud Identity domain (typically your company domain)
- User accounts already created in Google Cloud Identity
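- Clear understanding of which Superset roles map to your organizational structure. Superset’s built-in roles are Admin, Alpha, Gamma, sql_lab, and Public; the Editor and Viewer roles used in this guide must be created as custom roles (or replaced with Alpha and Gamma) before you reference them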
- SSL/TLS certificate installed on your Superset deployment
The planning phase is critical. Document your current user roles in Superset and how they align with Google Cloud Identity groups. If you’re migrating from a different SSO provider, plan your cutover carefully to avoid disrupting active dashboard usage.
For organizations evaluating managed alternatives to self-hosted Superset, D23 provides production-grade Apache Superset hosting with enterprise authentication pre-configured, eliminating much of this setup complexity. However, understanding the underlying SAML integration mechanics helps you evaluate any managed platform’s capabilities.
Google Cloud Identity Configuration
The first step in establishing SSO is configuring Google Cloud Identity as a SAML identity provider. This involves creating a custom SAML application, downloading Google’s SAML metadata, and establishing trust between Google’s identity system and your Superset instance.
Enabling Cloud Identity:
Start by accessing your Google Cloud Console and navigating to the Cloud Identity section. According to Google’s official guide for enabling Cloud Identity, you need to ensure that Cloud Identity is activated for your organization. If you’re using Google Workspace, Cloud Identity features are typically already available.
Once Cloud Identity is enabled, open the Google Admin console (admin.google.com) and go to Apps > Web and mobile apps. Here you’ll configure a custom SAML application for Apache Superset.
Creating a SAML Application:
In the Cloud Identity Admin console, click “Add a custom SAML app” and provide basic details:
- Application name: “Apache Superset” (or your deployment name)
- Description: “Business intelligence platform for self-serve analytics”
Google will generate an IdP entity ID, an SSO URL, and a signing certificate for the application. At this stage, you’re creating the identity provider side of the SAML handshake. The ACS (Assertion Consumer Service) URL, which you supply in the next step, is the Superset endpoint where Google will POST SAML assertions after user authentication.
Next, you’ll need to configure the service provider (Superset) details. The critical fields here are:
- ACS URL: https://your-superset-domain.com/auth/login (or your actual Superset URL with the SAML callback path)
- Entity ID: https://your-superset-domain.com (should match your Superset instance’s base URL)
- Name ID format: Use “Email address” for simplicity and compatibility
Google will generate SAML metadata containing the IdP certificate and SSO endpoints. Download this metadata file—you’ll need it when configuring Superset.
Configuring Attribute Mapping:
SAML attribute mapping determines which Google Cloud Identity user properties flow into Superset. At minimum, you need to map:
- Email (Google attribute: email) → Superset username field
- First Name (Google attribute: givenName) → Superset first_name field
- Last Name (Google attribute: familyName) → Superset last_name field
You can optionally map custom attributes from Google Cloud Identity to Superset. For instance, if you maintain a department field in Cloud Identity, you could map it to a custom Superset attribute for role-based access control (RBAC) logic.
According to Google Cloud’s single sign-on architecture documentation, this attribute mapping is essential for maintaining user context across systems. The attributes you map here will be included in the SAML assertion that Google sends to Superset during authentication.
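As a rough illustration of the mapping above, the following sketch converts the attribute dictionary that a SAML library typically exposes (attribute name to a list of values) into the fields Superset stores for a user. The function name `map_saml_attributes`, the `ATTRIBUTE_MAP` constant, and the fallback to empty strings are assumptions made for this example; they are not part of Superset or of Google’s API.

```python
from typing import Dict, List

# Google attribute name -> Superset user field, as listed above.
ATTRIBUTE_MAP = {
    "email": "username",
    "givenName": "first_name",
    "familyName": "last_name",
}


def map_saml_attributes(attributes: Dict[str, List[str]]) -> Dict[str, str]:
    """Translate SAML assertion attributes into Superset user fields.

    SAML attributes are multi-valued, so each one arrives as a list;
    this sketch keeps only the first value of each mapped attribute.
    """
    user_fields = {}
    for saml_name, superset_field in ATTRIBUTE_MAP.items():
        values = attributes.get(saml_name) or []
        user_fields[superset_field] = values[0].strip() if values else ""
    # The email doubles as the username, so expose it under both keys.
    user_fields["email"] = user_fields["username"]
    return user_fields
```

Keeping the mapping in one dictionary means that adding a custom attribute, such as a department field, is a one-line change.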
User Assignment:
Back in the Cloud Identity Admin console, assign users or groups to the Apache Superset SAML application. Only assigned users can authenticate to Superset via SSO. This provides a security boundary—users not explicitly assigned cannot access your analytics platform through this authentication method.
For larger organizations, assign entire organizational units or security groups to Superset rather than individual users. This scales better as your team grows and reduces administrative overhead when onboarding new team members.
Apache Superset SAML Configuration
Now that Google Cloud Identity is configured as a SAML identity provider, you need to configure Superset to accept and validate SAML assertions from Google.
Superset Security Manager Setup:
Apache Superset’s authentication system is built around a pluggable security manager. To enable SAML, you’ll extend Superset’s default security manager with SAML-specific logic. This requires modifying Superset’s configuration file, typically located at superset_config.py or within your deployment’s environment variables.
First, install the required Python dependency:
pip install python3-saml
This library provides SAML 2.0 support for Python applications. It handles the cryptographic validation of SAML assertions and manages the authentication flow.
Next, create a custom security manager class. Here’s a foundational example:
from onelogin.saml2.auth import OneLogin_Saml2_Auth
from superset.security import SupersetSecurityManager


class SAMLSecurityManager(SupersetSecurityManager):
    def __init__(self, appbuilder):
        super().__init__(appbuilder)
        self.saml_settings = self.load_saml_settings()

    def load_saml_settings(self):
        # Load SAML settings from your Google Cloud Identity metadata
        return {
            # Enforce strict validation of incoming SAML messages
            'strict': True,
            'sp': {
                'entityID': 'https://your-superset-domain.com',
                'assertionConsumerService': {
                    'url': 'https://your-superset-domain.com/auth/login',
                    'binding': 'urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST'
                },
                'singleLogoutService': {
                    'url': 'https://your-superset-domain.com/auth/logout',
                    'binding': 'urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect'
                }
            },
            'idp': {
                # These values come from Google's SAML metadata
                'entityID': 'https://accounts.google.com/o/saml2/idp?idpid=YOUR_IDP_ID',
                'singleSignOnService': {
                    'url': 'https://accounts.google.com/o/saml2/idp?idpid=YOUR_IDP_ID',
                    'binding': 'urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect'
                },
                'x509cert': 'YOUR_GOOGLE_CERTIFICATE_HERE'
            }
        }

    def auth_user_saml(self, request_data):
        # request_data is the dict python3-saml uses to describe the HTTP
        # request (https, http_host, script_name, get_data, post_data);
        # the POSTed SAMLResponse is read from request_data['post_data']
        auth = OneLogin_Saml2_Auth(request_data, old_settings=self.saml_settings)
        auth.process_response()
        if auth.get_errors() or not auth.is_authenticated():
            return None

        # get_attribute() returns None when an attribute is absent
        def first_value(name):
            values = auth.get_attribute(name)
            return values[0] if values else ''

        # Extract user attributes from the SAML assertion; fall back to the
        # NameID, which carries the email when the Name ID format is email
        email = first_value('email') or auth.get_nameid()
        if not email:
            return None
        first_name = first_value('givenName')
        last_name = first_value('familyName')

        # Find or create user in Superset
        user = self.find_user(email=email)
        if not user:
            user = self.add_user(
                username=email,
                first_name=first_name,
                last_name=last_name,
                email=email,
                # 'Viewer' must already exist as a role in your deployment
                role=[self.find_role('Viewer')]  # Default role
            )
        return user
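This custom security manager validates SAML assertions from Google using the certificate provided in Google’s metadata, and either finds an existing Superset user or creates a new one based on the assertion attributes. Superset does not call auth_user_saml on its own: you also need a login view at the ACS URL that passes the incoming request to it and then logs the returned user in.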
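python3-saml does not read the web request itself; it expects a plain dictionary describing it. The helper below is a minimal sketch of that translation. The function name `build_saml_request_data` and its parameters are assumptions for this example, and the dictionary keys follow the layout shown in the python3-saml Flask demo. In a Flask view you would typically pass `request.scheme`, `request.host`, `request.path`, `request.args`, and `request.form`.

```python
from typing import Dict, Mapping


def build_saml_request_data(
    scheme: str,
    host: str,
    path: str,
    query_args: Mapping[str, str],
    form_data: Mapping[str, str],
) -> Dict[str, object]:
    """Describe an incoming HTTP request in the dict shape python3-saml expects."""
    return {
        # python3-saml uses the strings 'on'/'off' rather than a boolean.
        "https": "on" if scheme == "https" else "off",
        "http_host": host,
        "script_name": path,
        "get_data": dict(query_args),
        # The SAMLResponse posted by Google arrives in the form body.
        "post_data": dict(form_data),
    }
```

If Superset sits behind a TLS-terminating proxy, the scheme seen by the application may be http even though the browser used HTTPS; in that case the value should be derived from the forwarded-protocol header, otherwise URL validation is likely to fail.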
Superset Configuration File Updates:
In your superset_config.py, add the following configuration:
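from flask_appbuilder.security.manager import AUTH_REMOTE_USER
# Hypothetical module name: import the class from wherever you saved it
from custom_security_manager import SAMLSecurityManager

# Enable SAML authentication
# AUTH_SAML is not defined in the Flask-AppBuilder releases bundled with
# Superset 1.5 and 2.x; AUTH_REMOTE_USER is the auth type commonly paired
# with a custom security manager that performs the login itself
AUTH_TYPE = AUTH_REMOTE_USER
CUSTOM_SECURITY_MANAGER = SAMLSecurityManager

# SAML settings
# These two keys are read by your custom security manager, not by Superset
SAML_METADATA_PATH = '/path/to/google-saml-metadata.xml'
SAML_SETTINGS = {
    # Configuration from the custom security manager above
}

# User auto-provisioning
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = 'Viewer'

# Session configuration
SESSION_COOKIE_SECURE = True
SESSION_COOKIE_HTTPONLY = True
SESSION_COOKIE_SAMESITE = 'Lax'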
The AUTH_USER_REGISTRATION = True setting allows Superset to automatically create user accounts for anyone who successfully authenticates via Google Cloud Identity SAML. The AUTH_USER_REGISTRATION_ROLE determines what default role new users receive (typically “Viewer” for least-privilege access).
According to the official Apache Superset security documentation, Superset’s pluggable authentication system supports exactly this type of custom integration. The security manager pattern allows you to implement any authentication flow compatible with your identity provider.
Downloading and Configuring SAML Metadata
Google Cloud Identity provides SAML metadata in XML format. This file contains the identity provider’s certificate, SSO endpoints, and other configuration details that Superset needs to validate SAML assertions.
Obtaining Google’s SAML Metadata:
In the Cloud Identity Admin console, navigate back to your Apache Superset SAML application. Under “Service provider details,” you’ll find a link to download the SAML metadata. The file typically looks like:
<?xml version="1.0" encoding="UTF-8"?>
<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" entityID="https://accounts.google.com/o/saml2/idp?idpid=YOUR_IDP_ID">
  <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <KeyDescriptor use="signing">
      <KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
        <X509Data>
          <X509Certificate>CERTIFICATE_DATA_HERE</X509Certificate>
        </X509Data>
      </KeyInfo>
    </KeyDescriptor>
    <SingleSignOnService Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect" Location="https://accounts.google.com/o/saml2/idp?idpid=YOUR_IDP_ID"/>
  </IDPSSODescriptor>
</EntityDescriptor>
Download this file and store it securely in your Superset deployment directory. The path you specify in SAML_METADATA_PATH in your config should point to this file.
Extracting Critical Values:
From the metadata file, extract:
- Entity ID: The entityID attribute of the EntityDescriptor element
- SSO Service URL: The Location attribute of the SingleSignOnService element
- X509 Certificate: The certificate data between the <X509Certificate> tags (remove line breaks for use in configuration)
These values go directly into your Superset SAML settings configuration. The certificate is used to cryptographically verify that SAML assertions actually came from Google and haven’t been tampered with.
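The extraction can also be scripted instead of copied by hand. The sketch below uses only the Python standard library and assumes the metadata has the shape shown earlier; the function name `extract_idp_values` and the keys of the returned dictionary are choices made for this example. Because the standard XML parser is not hardened against hostile input, it should only be run on the file you downloaded from Google.

```python
import xml.etree.ElementTree as ET
from typing import Dict

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
DS_NS = "http://www.w3.org/2000/09/xmldsig#"


def extract_idp_values(metadata_xml: str) -> Dict[str, str]:
    """Pull the entity ID, SSO URL, and signing certificate from IdP metadata."""
    root = ET.fromstring(metadata_xml)
    sso = root.find(f".//{{{MD_NS}}}SingleSignOnService")
    cert = root.find(f".//{{{DS_NS}}}X509Certificate")
    if sso is None or cert is None or cert.text is None:
        raise ValueError("Metadata is missing the SSO service or certificate")
    return {
        "entityID": root.attrib["entityID"],
        "sso_url": sso.attrib["Location"],
        # Splitting on whitespace removes line breaks inside the certificate.
        "x509cert": "".join(cert.text.split()),
    }
```

The three returned values correspond to the idp entityID, singleSignOnService url, and x509cert entries of the SAML settings.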
Testing the SAML Integration
Before rolling out SSO to your entire organization, thoroughly test the integration in a staging environment.
Initial Login Test:
- Clear any existing Superset session cookies from your browser
- Navigate to your Superset login page
- If SAML is properly configured, you should see a “Login with Google Cloud Identity” button or similar
- Click the button—you’ll be redirected to Google’s login page
- Authenticate with a test user account from your Google Cloud Identity directory
- You should be redirected back to Superset and logged in automatically
If you encounter issues at this stage, check your Superset logs for SAML validation errors. Common problems include:
- Certificate validation failures: Ensure the X509 certificate in your config exactly matches the one in Google’s metadata
- URL mismatches: Verify that your ACS URL in Google’s configuration matches the SAML callback path in Superset
- Assertion attribute missing: Confirm that the email attribute (or whichever you’re using as the username) is being sent in the SAML assertion
User Provisioning Verification:
After successful login, verify that:
- The user account was created in Superset with correct email, first name, and last name
- The user was assigned the default role (typically “Viewer”)
- The user can access dashboards and datasets appropriate for their role
Log in to Superset’s admin panel and check the Users list to confirm the new user appears with the correct attributes.
Role-Based Access Control Testing:
Test that users with different roles in Google Cloud Identity receive appropriate permissions in Superset. If you’ve configured custom attribute mapping (e.g., a department field), verify that Superset’s role assignment logic correctly interprets these attributes.
For example, if you’ve mapped a “department” attribute from Google Cloud Identity and configured Superset to grant “Editor” role to users with department=analytics, create a test user with that department value and verify they receive Editor permissions.
Advanced Configuration: Group-Based Access Control
For larger organizations, managing individual user permissions becomes unwieldy. Instead, use Google Cloud Identity groups to manage Superset access at scale.
Mapping Google Groups to Superset Roles:
Modify your custom security manager to extract group information from the SAML assertion and map groups to Superset roles:
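def auth_user_saml(self, request_data):
    auth = OneLogin_Saml2_Auth(request_data, old_settings=self.saml_settings)
    auth.process_response()
    if auth.get_errors() or not auth.is_authenticated():
        return None

    # get_attribute() takes only the attribute name and returns None if absent
    email = (auth.get_attribute('email') or [auth.get_nameid()])[0]
    first_name = (auth.get_attribute('givenName') or [''])[0]
    last_name = (auth.get_attribute('familyName') or [''])[0]
    groups = auth.get_attribute('groups') or []

    user = self.find_user(email=email)
    if not user:
        user = self.add_user(
            username=email,
            first_name=first_name,
            last_name=last_name,
            email=email,
            role=[self.find_role('Viewer')]
        )

    # Map Google groups to Superset roles
    role_mapping = {
        'superset-admins@company.com': 'Admin',
        'superset-editors@company.com': 'Editor',
        'superset-viewers@company.com': 'Viewer'
    }
    for group in groups:
        if group in role_mapping:
            role = self.find_role(role_mapping[group])
            # find_role() returns None if the role has not been created
            if role and role not in user.roles:
                user.roles.append(role)
    # Persist any role changes
    self.update_user(user)
    return user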
This approach requires that you configure Google Cloud Identity to send group membership information in the SAML assertion. In the Cloud Identity Admin console, add a custom attribute mapping for groups before deploying this code.
Creating Google Groups:
In Google Workspace or Cloud Identity, create groups corresponding to your Superset roles:
- superset-admins@company.com → Full platform administration
- superset-editors@company.com → Can create and edit dashboards
- superset-viewers@company.com → Read-only access
Assign users to these groups based on their analytics needs. When they authenticate via SAML, their group membership automatically determines their Superset permissions.
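The group-to-role decision is easier to test when it is separated from the SAML and database calls. The following is a minimal sketch under that assumption; `roles_for_groups` is a hypothetical helper, it expects the mapping keys in lowercase, and the default role name is taken from this guide rather than from Superset’s built-in roles.

```python
from typing import Dict, Iterable, List


def roles_for_groups(
    groups: Iterable[str],
    role_mapping: Dict[str, str],
    default_role: str = "Viewer",
) -> List[str]:
    """Return the Superset role names implied by a user's group memberships."""
    roles: List[str] = []
    for group in groups:
        # Group addresses are compared case-insensitively.
        role = role_mapping.get(group.lower())
        if role and role not in roles:
            roles.append(role)
    # Fall back to the least-privileged role when no group matches.
    return roles or [default_role]
```

Because the function returns the complete role list, a caller can assign it to the user rather than only appending to the existing roles, so removing someone from a Google group also removes the corresponding Superset role at their next login.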
Troubleshooting Common Issues
Even with careful configuration, SAML integrations sometimes encounter issues. Here’s how to diagnose and resolve common problems.
“Invalid SAML Response” Errors:
This typically indicates that Superset cannot validate the SAML assertion from Google. Common causes:
- Clock skew: If your Superset server’s clock is significantly out of sync with Google’s servers, SAML assertions may be rejected as expired. Synchronize your server time using NTP.
- Certificate mismatch: Ensure the X509 certificate in your Superset config exactly matches Google’s metadata. Even whitespace differences will cause validation to fail.
- Incorrect entity ID: Verify that the entityID in your Superset config matches the one in Google’s metadata.
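To make the clock-skew cause concrete, the sketch below shows the kind of time-window check a SAML library performs on an assertion’s NotBefore and NotOnOrAfter conditions. It illustrates the idea only and is not python3-saml’s actual implementation; the function name and the 60-second tolerance are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def assertion_time_valid(
    not_before: datetime,
    not_on_or_after: datetime,
    now: Optional[datetime] = None,
    allowed_skew: timedelta = timedelta(seconds=60),
) -> bool:
    """Check an assertion's validity window, tolerating a small clock skew."""
    current = now if now is not None else datetime.now(timezone.utc)
    # Widen the window on both sides by the tolerated skew.
    return not_before - allowed_skew <= current < not_on_or_after + allowed_skew
```

A server whose clock runs several minutes fast or slow falls outside this window even for a freshly issued assertion, which is why NTP synchronization resolves the error.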
Users Not Being Created:
If SAML authentication succeeds but users aren’t being created in Superset:
- Verify AUTH_USER_REGISTRATION = True is set in your config
- Check that the email attribute is being sent in the SAML assertion (enable debug logging in the SAML library)
- Ensure Superset’s database has write permissions for the user table
Redirect Loop Issues:
If you’re stuck in a redirect loop between Superset and Google:
- Verify that your ACS URL in Google’s configuration exactly matches your Superset callback endpoint
- Check that SESSION_COOKIE_SECURE = True is set (required for SAML over HTTPS)
- Ensure your Superset instance is accessible via the exact domain specified in the entity ID
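Since an exact match is required, a component-by-component comparison can reveal differences that are easy to miss by eye, such as a trailing slash or an http scheme. This is a small sketch using the standard library; the function name `acs_url_differences` is hypothetical.

```python
from typing import List
from urllib.parse import urlsplit


def acs_url_differences(google_acs_url: str, superset_acs_url: str) -> List[str]:
    """List the URL components that differ between the two ACS URLs."""
    google, superset = urlsplit(google_acs_url), urlsplit(superset_acs_url)
    differences = []
    for part in ("scheme", "netloc", "path", "query"):
        left, right = getattr(google, part), getattr(superset, part)
        if left != right:
            differences.append(f"{part}: {left!r} != {right!r}")
    return differences
```

An empty list means the two URLs agree on every component that matters for the SAML redirect.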
For deeper debugging, enable SAML logging in your Superset deployment. The python3-saml library provides detailed logging that shows exactly what’s happening during the authentication flow. According to the GitHub discussion on SSO implementation in Superset, enabling debug logs is often the fastest way to identify configuration issues.
Production Deployment Considerations
Once you’ve validated SAML integration in staging, plan your production rollout carefully.
Gradual User Migration:
If users currently authenticate to Superset via local passwords or a different SSO provider, migrate them gradually:
- Create SAML accounts for a pilot group of users
- Have them test SAML authentication while their old credentials remain active
- Once comfortable, disable the old authentication method
- Migrate remaining users in batches
This approach minimizes disruption and allows you to address issues with a smaller user group first.
Backup Authentication Method:
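Maintain a backup authentication method (such as local admin accounts) in case Google Cloud Identity becomes temporarily unavailable. Note that Flask-AppBuilder’s AUTH_TYPE setting accepts a single authentication type rather than an ordered list, so a value such as [AUTH_SAML, AUTH_DB] does not create a fallback chain. Instead, implement the fallback in your custom security manager, for example by keeping a separate, restricted login route that still checks database credentials for a small set of emergency admin accounts.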
Monitoring and Alerting:
Set up monitoring for SAML authentication failures. If your organization’s Google Cloud Identity service experiences issues, you want to know immediately. Log all authentication attempts and configure alerts for:
- Failed SAML assertion validations
- Certificate expiration warnings (Google rotates certificates periodically)
- Unusual authentication patterns that might indicate compromise
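One simple way to turn failed validations into an alert is a sliding-window counter. The class below is a sketch under assumed thresholds (five failures in five minutes by default); `FailureMonitor` is a hypothetical name, and wiring the result to your alerting system is left out.

```python
import time
from collections import deque
from typing import Deque, Optional


class FailureMonitor:
    """Track recent SAML failures and flag when they exceed a threshold."""

    def __init__(self, threshold: int = 5, window_seconds: float = 300.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures: Deque[float] = deque()

    def record_failure(self, timestamp: Optional[float] = None) -> bool:
        """Record one failure; return True when an alert should fire."""
        now = time.time() if timestamp is None else timestamp
        self._failures.append(now)
        # Drop failures that have aged out of the sliding window.
        while self._failures and self._failures[0] <= now - self.window_seconds:
            self._failures.popleft()
        return len(self._failures) >= self.threshold
```

Calling `record_failure()` wherever the security manager rejects an assertion keeps the check cheap, since only timestamps inside the window are retained.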
Session Management:
Configure appropriate session timeouts and idle session termination. SAML sessions in Superset inherit the timeout settings you specify in the config:
PERMANENT_SESSION_LIFETIME = 3600 # 1 hour in seconds
SESSION_REFRESH_EACH_REQUEST = True
This ensures that users are automatically logged out after a period of inactivity, improving security in shared environments.
Comparing SAML, OAuth, and OIDC for Superset
Google Cloud Identity supports multiple authentication protocols. Understanding the differences helps you choose the right approach for your organization.
SAML 2.0 (What We’ve Covered):
- XML-based assertion protocol
- Excellent for enterprise environments with established identity providers
- Supports complex attribute mapping and group-based access control
- Slightly more complex configuration than OAuth
- Widely supported by enterprise applications
OAuth 2.0: OAuth is an authorization framework rather than an authentication protocol. Google provides OAuth 2.0 support, and according to Google’s OAuth 2.0 developer documentation, it’s simpler to implement than SAML for many use cases. However, OAuth alone doesn’t provide user attribute information—you typically need to call Google’s user info endpoint separately.
For Superset, OAuth works well if you only need basic authentication and don’t require complex attribute mapping or group-based access control.
OpenID Connect (OIDC): OIDC is built on top of OAuth 2.0 and adds an identity layer. It provides user attributes in a standardized way through ID tokens. According to Google Cloud Identity’s OIDC configuration guide, OIDC offers a middle ground between OAuth’s simplicity and SAML’s enterprise features.
For new Superset deployments without legacy constraints, OIDC is often the best choice. It’s simpler than SAML but more feature-rich than OAuth.
Choosing the Right Protocol:
- Use SAML if you already have a SAML infrastructure or need complex attribute mapping
- Use OAuth if you want the simplest possible integration and don’t need user attributes
- Use OIDC if you’re building a new system and want modern, standards-based authentication
For organizations using D23’s managed Apache Superset platform, enterprise authentication including Google Cloud Identity SSO is pre-configured and managed as part of the service, eliminating the complexity of custom SAML configuration.
Security Best Practices
When implementing SAML authentication, security considerations extend beyond just getting the integration working.
Certificate Management: Google rotates SAML certificates periodically. Implement a process to regularly update your Superset configuration with new certificates from Google’s metadata. Set calendar reminders to check Google’s SAML metadata monthly.
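A periodic check can be automated by comparing fingerprints of the certificate in your Superset configuration and the one in freshly downloaded metadata. The sketch below uses only the standard library and hashes the decoded certificate bytes; the function names are hypothetical, and it does not parse the certificate or read its expiry date.

```python
import base64
import hashlib


def certificate_fingerprint(cert_text: str) -> str:
    """Return the SHA-256 fingerprint of a PEM or bare base64 certificate."""
    body_lines = []
    for raw_line in cert_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and the BEGIN/END CERTIFICATE markers.
        if line and not line.startswith("-----"):
            body_lines.append(line)
    der_bytes = base64.b64decode("".join(body_lines))
    return hashlib.sha256(der_bytes).hexdigest()


def certificate_changed(configured_cert: str, metadata_cert: str) -> bool:
    """True when the configured certificate no longer matches the metadata."""
    return certificate_fingerprint(configured_cert) != certificate_fingerprint(metadata_cert)
```

Comparing fingerprints rather than raw strings avoids false alarms caused by line wrapping or PEM headers.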
Attribute Validation: Never trust SAML attributes blindly. Even though SAML assertions are cryptographically signed, validate that attributes contain expected values:
def validate_saml_attributes(self, attributes):
    # Guard against a missing attribute or an empty value list
    email = (attributes.get('email') or [''])[0]
    if not email or '@' not in email:
        raise ValueError('Invalid email in SAML assertion')
    # Optionally restrict to your organization's domain
    if not email.lower().endswith('@company.com'):
        raise ValueError('Email domain not authorized')
    return email
Audit Logging: Log all SAML authentication events, including successful logins, failed assertions, and user provisioning actions. This creates an audit trail for compliance and security investigations.
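A structured record per event makes the audit trail easy to search. The following is a minimal sketch using Python’s logging module; the logger name `superset.saml.audit`, the function name, and the field names are assumptions for this example rather than Superset conventions.

```python
import json
import logging
from datetime import datetime, timezone

audit_logger = logging.getLogger("superset.saml.audit")


def log_saml_event(event: str, email: str, success: bool, detail: str = "") -> str:
    """Emit one structured audit record and return the JSON that was logged."""
    record = json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": event,
            "email": email,
            "success": success,
            "detail": detail,
        },
        sort_keys=True,
    )
    # Failures are logged at WARNING so they stand out in aggregated logs.
    audit_logger.log(logging.INFO if success else logging.WARNING, record)
    return record
```

For example, `log_saml_event("login", email, False, "signature invalid")` could be called wherever the security manager rejects an assertion.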
HTTPS Enforcement: SAML absolutely requires HTTPS. Any SAML endpoint must be served over TLS with a valid certificate. Superset should reject any SAML authentication attempts over plain HTTP.
Maintenance and Ongoing Management
After deploying Google Cloud Identity SSO for Superset, ongoing maintenance is minimal but important.
Regular Testing: Monthly, test the SAML flow with a test user to ensure everything still works. Authenticate, verify user creation/updates, and check that role assignments are correct.
User Offboarding: When employees leave your organization, remove them from Google Cloud Identity. They will no longer be able to sign in to Superset because Google will not issue a SAML assertion for them, although any session they already hold remains valid until it expires. To revoke access immediately and clean up the database, deactivate or delete their Superset user account.
Attribute Updates: If you change attribute mappings in Google Cloud Identity (e.g., adding a new custom field), update your Superset custom security manager to handle the new attributes.
Documentation: Maintain clear documentation of your SAML configuration, including:
- Entity ID and ACS URL
- Attribute mappings
- Group-to-role mappings
- Contact information for Google Cloud Identity and Superset administrators
This documentation is invaluable when troubleshooting issues or onboarding new team members to the infrastructure.
Scaling SAML for Enterprise Deployments
As your organization grows, your SAML implementation needs to scale gracefully.
Performance Considerations: SAML validation adds minimal overhead to each authentication request. The cryptographic operations involved are fast, typically completing in milliseconds. However, if you’re mapping user groups to roles, ensure that group lookups are efficient. Cache group membership information when possible.
Directory Synchronization: For large organizations, consider implementing directory synchronization that periodically pulls user and group information from Google Cloud Identity and updates Superset’s database. This is more efficient than validating every attribute on every login.
Tools like Apache Superset’s provisioning capabilities can be extended to support scheduled synchronization with Google Cloud Identity.
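The core of such a synchronization job is a comparison between two user lists. The sketch below covers only that step: how the lists are fetched (for instance from Google’s Admin SDK Directory API and from Superset’s user table) is left out, and the function name `plan_directory_sync` and the result keys are hypothetical.

```python
from typing import Dict, Iterable, Set


def plan_directory_sync(
    directory_emails: Iterable[str],
    superset_emails: Iterable[str],
) -> Dict[str, Set[str]]:
    """Compare two user lists and report which accounts need attention."""
    # Normalize case so the same address is not treated as two users.
    directory = {email.lower() for email in directory_emails}
    superset = {email.lower() for email in superset_emails}
    return {
        "to_create": directory - superset,
        "to_deactivate": superset - directory,
    }
```

Running the comparison on a schedule also catches departed employees whose Superset accounts would otherwise linger until someone notices them.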
Multi-Tenant Scenarios: If you’re running multiple Superset instances or serving multiple organizations, each can have its own SAML configuration. Configure separate SAML applications in Google Cloud Identity for each Superset instance.
Conclusion
Integrating Google Cloud Identity with Apache Superset via SAML provides enterprise-grade authentication that scales with your organization. The configuration requires careful attention to detail—certificate validation, URL matching, and attribute mapping must all be precise—but the payoff is significant: centralized user management, automatic provisioning, and role-based access control without maintaining separate credential systems.
The process breaks down into clear phases: configuring Google Cloud Identity as a SAML identity provider, implementing a custom security manager in Superset, testing thoroughly in staging, and finally rolling out to production with appropriate safeguards.
For organizations that prefer not to manage this infrastructure themselves, D23 provides managed Apache Superset with enterprise authentication pre-configured, allowing you to focus on analytics rather than infrastructure. Regardless of your deployment model, understanding the underlying SAML mechanics ensures you can make informed decisions about your analytics platform’s authentication architecture.
Once deployed, the integration requires minimal maintenance—just regular testing and certificate updates—while providing significant operational benefits through automated user management and consistent security policies across your analytics platform.