Guide April 18, 2026 · 18 mins · The D23 Team

Apache Superset User Provisioning: SCIM and Just-in-Time Patterns

Master Apache Superset user provisioning with SCIM and JIT SAML. Automate identity sync, reduce overhead, and scale securely for enterprise teams.

Understanding User Provisioning in Apache Superset

User provisioning is the process of creating, updating, and removing user accounts and their access rights within an application. In Apache Superset, this traditionally meant manual intervention—an admin logs in, creates users one by one, assigns roles, and manages permissions through the UI. For teams managing hundreds or thousands of users across multiple data products, this approach becomes a bottleneck.

Automated user provisioning solves this by connecting your identity provider (Okta, Azure AD, Auth0, or another SAML/OIDC system) directly to Superset. When a new employee joins, their account appears in Superset automatically. When they leave, their access is revoked without an admin touching the Superset UI. This removes manual user management, reduces security risk, and scales without adding operational burden.

The two primary patterns for automating this workflow are SCIM (System for Cross-Domain Identity Management) and Just-in-Time (JIT) provisioning via SAML. Both solve the same problem: keeping identity state in Superset synchronized with your identity provider. They work differently in practice, and understanding when and how to use each is critical for teams building production analytics infrastructure.

At D23, we’ve implemented both patterns for customers managing Superset at scale. This guide walks through the mechanics, trade-offs, and concrete implementation steps so you can choose the right approach for your organization.

What Is SCIM and How Does It Work?

SCIM is a standardized protocol for automating user and group provisioning across cloud applications. It defines a REST API contract that identity providers use to push user data to applications. Instead of each app building its own provisioning integration, SCIM creates a common language.

Here’s how SCIM works in practice:

The SCIM Flow:

  1. A user is created in your identity provider (e.g., Okta)
  2. Okta detects this change and sends an HTTP POST request to your Superset SCIM endpoint with the user’s details (email, name, groups, attributes)
  3. Superset receives the request, validates it, and creates the user in its database
  4. When the user’s attributes change (name, group membership, email), Okta sends a PATCH request
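  5. When the user is deactivated in Okta, Okta notifies Superset so the account can be deactivated or removed. Okta typically sends a PATCH or PUT that sets active to false; other providers may send a DELETE request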

This is fundamentally different from JIT provisioning. SCIM is push-based: your identity provider proactively sends user data to Superset. JIT is pull-based: Superset creates users on first login.
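To make the update and deactivation steps concrete, here is a minimal sketch of the kind of PatchOp body an identity provider might push, together with a helper that applies it to an in-memory record. The apply_patch helper and the dictionary-based user record are illustrative assumptions, not part of Superset or of any SCIM library, and only simple top-level replace operations are handled.

```python
PATCH_OP_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"

# Example body pushed by the identity provider to deactivate a user
deactivate_request = {
    "schemas": [PATCH_OP_SCHEMA],
    "Operations": [{"op": "replace", "path": "active", "value": False}],
}


def apply_patch(user: dict, patch: dict) -> dict:
    """Return a copy of `user` with simple top-level 'replace' operations applied."""
    updated = dict(user)
    for operation in patch.get("Operations", []):
        if str(operation.get("op", "")).lower() != "replace":
            continue  # this sketch ignores 'add' and 'remove'
        path = operation.get("path")
        value = operation.get("value")
        if path:
            updated[path] = value
        elif isinstance(value, dict):
            # With no path, the value carries attribute/value pairs
            updated.update(value)
    return updated


alice = {"userName": "alice.chen@company.com", "active": True}
print(apply_patch(alice, deactivate_request))
```

Providers differ in whether they include a path, so the helper accepts both shapes.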

For detailed protocol specifications and how major providers implement SCIM, see the SCIM Protocol Documentation and SCIM User Provisioning with Okta. These resources cover the technical structure of SCIM 2.0 requests and responses.

SCIM Request Example:

When Okta provisions a user, it sends something like this:

POST /scim/v2/Users HTTP/1.1
Host: superset.yourcompany.com
Authorization: Bearer your-scim-token
Content-Type: application/scim+json

{
  "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
  "userName": "alice.chen@company.com",
  "name": {
    "givenName": "Alice",
    "familyName": "Chen"
  },
  "emails": [{
    "value": "alice.chen@company.com",
    "primary": true
  }],
  "groups": [{
    "value": "analytics-team"
  }],
  "active": true
}

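Your Superset instance receives this payload, parses it, and creates a user record. If that user is part of the “analytics-team” group, you can automatically assign them to a Superset role with the appropriate dataset and dashboard permissions. Note that SCIM 2.0 marks the groups attribute on a User as read-only, so many identity providers send membership through the /Groups endpoints rather than in the user payload. A production endpoint should handle both.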
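The group-to-role step can be sketched as a small lookup. The GROUP_TO_ROLE table and the roles_for_groups helper below are hypothetical names chosen for illustration; the role names are assumed to already exist in Superset.

```python
# Hypothetical mapping from identity-provider group names to Superset role names
GROUP_TO_ROLE = {
    "analytics-team": "Analytics",
    "data-engineers": "Data Engineer",
}


def roles_for_groups(group_names, mapping=None) -> list:
    """Return the sorted, de-duplicated role names for a list of group names."""
    mapping = GROUP_TO_ROLE if mapping is None else mapping
    return sorted({mapping[name] for name in group_names if name in mapping})


# Group names may come from the user payload or from /Groups membership updates
payload_groups = [{"value": "analytics-team"}, {"value": "unmapped-group"}]
print(roles_for_groups([g["value"] for g in payload_groups]))
```

Unmapped groups are ignored rather than rejected, which keeps provisioning from failing when the identity provider sends groups that Superset does not care about.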

Key advantages of SCIM:

  • Continuous sync: Group and attribute changes in your identity provider are pushed to Superset automatically
  • Deprovisioning: Removing a user from your identity provider revokes Superset access as soon as the provider pushes the change
  • Scalability: Handles hundreds or thousands of users without manual work
  • Compliance: Audit trails show exactly when users were added, modified, or removed
  • Group-based access: Assign permissions based on group membership, not individual users

For comprehensive security guidance on implementing SCIM in a production Superset environment, consult the Securing Your Superset Installation for Production guide and the Tutorial - Develop a SCIM endpoint for user provisioning from Microsoft, which covers SCIM 2.0 endpoint development patterns.

Just-in-Time (JIT) Provisioning via SAML Attribute Mapping

Just-in-Time provisioning takes a different approach. Instead of your identity provider pushing user data, Superset creates users on their first login. The identity provider (via SAML assertions) tells Superset who the user is and what groups they belong to, and Superset creates them on the fly.

The JIT Flow:

  1. A user tries to log into Superset
  2. Superset redirects them to your SAML identity provider
  3. The user authenticates (with their password, MFA, etc.)
  4. The identity provider sends back a SAML assertion containing the user’s email, name, and group memberships
  5. Superset parses the SAML assertion, checks if the user exists, and creates them if they don’t
  6. The user is logged in and can immediately access dashboards
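Steps 5 and 6 amount to a get-or-create on the user record. The sketch below uses an in-memory dictionary in place of Superset’s user table and assumes the SAML assertion has already been validated and reduced to a plain attributes dictionary. The User dataclass and jit_login function are illustrative names, not Superset APIs.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    email: str
    first_name: str = ""
    last_name: str = ""
    roles: list = field(default_factory=list)


def jit_login(attributes: dict, users: dict) -> tuple:
    """Look the user up by email and create the record on first login.

    `attributes` stands in for the validated SAML assertion contents and
    `users` stands in for the user table. Returns (user, created).
    """
    email = attributes["email"].lower()
    if email in users:
        return users[email], False
    user = User(
        email=email,
        first_name=attributes.get("first_name", ""),
        last_name=attributes.get("last_name", ""),
    )
    users[email] = user
    return user, True


users = {}
first = jit_login({"email": "Alice.Chen@company.com", "first_name": "Alice"}, users)
second = jit_login({"email": "alice.chen@company.com"}, users)
```

Normalizing the email before the lookup avoids creating a second account when the identity provider changes the casing of an address.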

This is simpler to set up than SCIM because SAML is already widely supported in Superset. You don’t need to build a SCIM endpoint. You just configure SAML and tell Superset how to map SAML attributes to user fields and groups.

SAML Attribute Mapping in Superset:

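Superset’s authentication is built on Flask AppBuilder’s security model, and attribute mapping is configured in superset_config.py. The SAML-specific keys below (AUTH_TYPE = 5, SAML_METADATA_URL, SAML_ATTRIBUTE_MAPPING, SAML_GROUP_MAPPING) depend on the Flask AppBuilder version and SAML integration in your deployment, so verify each one against your installed versions:

# Superset uses SupersetSecurityManager by default. Set CUSTOM_SECURITY_MANAGER
# only if you subclass it to customize SAML handling.

AUTH_TYPE = 5  # SAML auth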
SAML_METADATA_URL = "https://idp.yourcompany.com/metadata.xml"

# Map SAML attributes to Superset user fields
SAML_ATTRIBUTE_MAPPING = {
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress": "email",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname": "first_name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/surname": "last_name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/groups": "groups"
}

# Map SAML groups to Superset roles
SAML_GROUP_MAPPING = {
    "analytics-team": "Analytics",
    "data-engineers": "Data Engineer",
    "executives": "Admin"
}

When a user logs in with this configuration, Superset automatically creates them and assigns them to the appropriate roles based on their SAML group membership. No manual provisioning needed.
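As a rough illustration of what that mapping does, the sketch below translates raw assertion attributes into user fields and roles. It assumes attributes arrive as lists of strings keyed by claim name, which is how many SAML toolkits expose them. The map_assertion function and the shortened mapping tables are hypothetical and do not reflect how Superset implements this internally.

```python
EMAIL_CLAIM = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"
GROUPS_CLAIM = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/groups"

ATTRIBUTE_MAPPING = {EMAIL_CLAIM: "email", GROUPS_CLAIM: "groups"}
GROUP_MAPPING = {"analytics-team": "Analytics", "executives": "Admin"}
MULTI_VALUED = {"groups"}  # fields that keep every value instead of the first


def map_assertion(attributes: dict) -> dict:
    """Translate raw assertion attributes into user fields plus resolved roles."""
    user = {}
    for claim, field_name in ATTRIBUTE_MAPPING.items():
        values = attributes.get(claim) or []
        if field_name in MULTI_VALUED:
            user[field_name] = list(values)
        elif values:
            user[field_name] = values[0]
    user["roles"] = [
        GROUP_MAPPING[group] for group in user.get("groups", []) if group in GROUP_MAPPING
    ]
    return user


assertion = {
    EMAIL_CLAIM: ["alice.chen@company.com"],
    GROUPS_CLAIM: ["analytics-team", "book-club"],
}
print(map_assertion(assertion))
```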

For enterprise identity providers like Auth0, detailed integration guides are available. See SCIM Provisioning with Auth0 for Auth0-specific patterns, and Provision Users Using SCIM with AWS IAM Identity Center for AWS-based deployments.

Key advantages of JIT provisioning:

  • Simpler setup: No SCIM endpoint to build; just configure SAML
  • Lower operational overhead: Users are created on demand, not pre-provisioned
  • No external dependencies: Superset doesn’t call out to your identity provider; it just reads SAML assertions
  • Immediate group assignment: Groups from your identity provider map directly to Superset roles
  • Works with any SAML provider: Okta, Azure AD, Auth0, Keycloak, etc.

Key limitations of JIT provisioning:

  • No deprovisioning: When a user is removed from your identity provider, their Superset account still exists until manually deleted
  • No group sync: If a user changes groups in your identity provider, Superset won’t update their role until they log out and log back in
  • No audit trail for creation: You won’t know exactly when users were created unless you log SAML assertions

SCIM vs. JIT: When to Use Each

Both patterns solve the user provisioning problem, but they’re optimized for different scenarios.

Use SCIM when:

  • You need automatic deprovisioning. When employees leave, you want their Superset access revoked immediately, not when they log out
  • You have high user churn or large teams. Manually managing hundreds of users is expensive; SCIM scales automatically
  • You need group sync. If users frequently move between teams, SCIM keeps group memberships in sync in real-time
  • You’re building embedded analytics where users are managed programmatically. SCIM integrates cleanly with your identity infrastructure
  • You need audit compliance. SCIM provides a detailed log of who was provisioned, when, and by whom
  • You’re managing multiple applications. If Superset is one of many apps in your stack, SCIM is a standard way to provision across all of them

Use JIT provisioning when:

  • You have a small, stable team. If you have 20 users and low turnover, JIT is simpler
  • You want minimal operational overhead. JIT requires no SCIM endpoint; just configure SAML
  • Your identity provider doesn’t support SCIM. Some on-premise or legacy systems only support SAML
  • You’re evaluating Superset. JIT gets you up and running quickly without infrastructure investment
  • You don’t need automatic deprovisioning. If users stay in Superset after leaving your organization, JIT is acceptable
  • You’re using embedded analytics with a low number of static users. If your users don’t change frequently, JIT reduces complexity

In practice, many organizations use both. They might use SCIM for internal employees and JIT for partner/customer access, or they use JIT initially and migrate to SCIM as they scale.

Building a SCIM Endpoint for Apache Superset

If you decide SCIM is right for your organization, you’ll need to implement a SCIM endpoint in your Superset instance. This is an HTTP API that your identity provider calls to provision users.

Superset doesn’t ship with a built-in SCIM endpoint, so you’ll need to build one. Here’s the architecture:

SCIM Endpoint Structure:

A minimal SCIM service implements these routes:

  • POST /scim/v2/Users — Create a new user
  • GET /scim/v2/Users/{id} — Retrieve a user
  • PATCH /scim/v2/Users/{id} — Update a user
  • DELETE /scim/v2/Users/{id} — Delete a user
  • GET /scim/v2/Groups — List groups
  • POST /scim/v2/Groups — Create a group
  • PATCH /scim/v2/Groups/{id} — Update a group

You’ll also need the discovery endpoints defined by SCIM 2.0 (RFC 7644): /ServiceProviderConfig, /ResourceTypes, and /Schemas. These tell identity providers which SCIM capabilities your service supports.
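SCIM 2.0 describes capability discovery through a ServiceProviderConfig resource. A minimal sketch of its body for a deliberately small feature set might look like the following. The service_provider_config function name and the chosen feature flags are assumptions for illustration, and a real service would return this from GET /scim/v2/ServiceProviderConfig.

```python
def service_provider_config(max_results: int = 100) -> dict:
    """Build a ServiceProviderConfig body advertising a small feature set."""
    return {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig"],
        "patch": {"supported": True},
        "bulk": {"supported": False, "maxOperations": 0, "maxPayloadSize": 0},
        "filter": {"supported": True, "maxResults": max_results},
        "changePassword": {"supported": False},
        "sort": {"supported": False},
        "etag": {"supported": False},
        "authenticationSchemes": [
            {
                "type": "oauthbearertoken",
                "name": "OAuth Bearer Token",
                "description": "Static bearer token sent in the Authorization header",
            }
        ],
    }


config = service_provider_config(max_results=50)
print(config["filter"])
```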

Implementation in Flask (Superset’s Web Framework):

Here’s a skeleton of a SCIM endpoint for Superset:

from flask import Blueprint, request
# Superset's user and role models come from Flask AppBuilder
from flask_appbuilder.security.sqla.models import Role, User
from superset import db

scim_bp = Blueprint('scim', __name__, url_prefix='/scim/v2')

USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"
ERROR_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:Error"

# SCIM attribute path -> Flask AppBuilder User column
PATCHABLE_FIELDS = {
    'active': 'active',
    'name.givenName': 'first_name',
    'name.familyName': 'last_name',
}


def scim_error(detail, status):
    # SCIM error body; the status is repeated in the body as a string
    return {"schemas": [ERROR_SCHEMA], "detail": detail, "status": str(status)}, status


def to_scim(user):
    # Serialize a Flask AppBuilder user as a SCIM User resource
    return {
        "schemas": [USER_SCHEMA],
        "id": str(user.id),
        "userName": user.username,
        "emails": [{"value": user.email, "primary": True}],
        "name": {"givenName": user.first_name, "familyName": user.last_name},
        "active": user.active,
    }


@scim_bp.route('/Users', methods=['POST'])
def create_user():
    data = request.get_json(silent=True) or {}

    # Validate SCIM request
    username = data.get('userName')
    if not username:
        return scim_error("userName required", 400)

    # Extract user details; the email column is required, so fall back to
    # userName when no email is supplied (an assumption of this sketch)
    emails = data.get('emails') or [{}]
    email = emails[0].get('value') or username
    name = data.get('name') or {}

    # Check if user exists
    if db.session.query(User).filter_by(username=username).first():
        return scim_error("User already exists", 409)

    # Create user
    user = User(
        username=username,
        email=email,
        first_name=name.get('givenName', ''),
        last_name=name.get('familyName', ''),
        active=data.get('active', True),
    )

    # Assign roles whose names match the supplied group values
    for group in data.get('groups') or []:
        role = db.session.query(Role).filter_by(name=group.get('value')).first()
        if role:
            user.roles.append(role)

    db.session.add(user)
    db.session.commit()
    return to_scim(user), 201


@scim_bp.route('/Users/<int:user_id>', methods=['PATCH'])
def update_user(user_id):
    user = db.session.query(User).filter_by(id=user_id).first()
    if not user:
        return scim_error("User not found", 404)

    data = request.get_json(silent=True) or {}

    # Handle attribute updates
    for operation in data.get('Operations', []):
        # Operation names are case-insensitive ("replace" vs "Replace")
        if str(operation.get('op', '')).lower() != 'replace':
            continue
        path = operation.get('path')
        value = operation.get('value')
        # Without a path, the value is an object of attribute/value pairs
        changes = {path: value} if path else (value if isinstance(value, dict) else {})
        for attr, new_value in changes.items():
            if attr in PATCHABLE_FIELDS:
                setattr(user, PATCHABLE_FIELDS[attr], new_value)

    db.session.commit()
    return to_scim(user), 200


@scim_bp.route('/Users/<int:user_id>', methods=['DELETE'])
def delete_user(user_id):
    user = db.session.query(User).filter_by(id=user_id).first()
    if not user:
        return scim_error("User not found", 404)

    # Hard delete. Setting user.active = False instead keeps the row, which
    # may be preferable when dashboards or charts still reference the user.
    db.session.delete(user)
    db.session.commit()

    # 204 responses carry no body
    return "", 204

This is a simplified example. Production implementations need:

  • Authentication: Validate that requests include a valid SCIM bearer token
  • Error handling: Return proper SCIM error responses
  • Pagination: Support startIndex and count for listing users
  • Filtering: Support SCIM filter syntax for queries
  • Logging: Log all provisioning operations for audit trails
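Pagination and filtering are the two items identity providers most often exercise when checking whether a user already exists. The sketch below builds a SCIM ListResponse from an in-memory list and supports only the simplest filter form, userName eq "value". The list_users function is an illustrative assumption; a real endpoint would translate the filter into a database query and return a 400 error for filters it cannot handle.

```python
import re

LIST_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:ListResponse"
# Only the simplest filter form is handled: userName eq "value"
USERNAME_EQ = re.compile(r'^\s*userName\s+eq\s+"([^"]*)"\s*$', re.IGNORECASE)


def list_users(users: list, filter_expr: str = "", start_index: int = 1, count: int = 100) -> dict:
    """Return a SCIM ListResponse for `users`, a list of SCIM User dicts."""
    if filter_expr:
        match = USERNAME_EQ.match(filter_expr)
        if not match:
            raise ValueError("unsupported filter")
        wanted = match.group(1).lower()
        users = [u for u in users if u["userName"].lower() == wanted]

    start_index = max(start_index, 1)  # SCIM indexes are 1-based
    count = max(count, 0)
    page = users[start_index - 1:start_index - 1 + count]
    return {
        "schemas": [LIST_SCHEMA],
        "totalResults": len(users),
        "startIndex": start_index,
        "itemsPerPage": len(page),
        "Resources": page,
    }


directory = [{"userName": f"user{i}@company.com"} for i in range(5)]
second_page = list_users(directory, start_index=3, count=2)
```

Note that totalResults reports the number of matches before paging, so the client can tell how many more pages to request.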

For detailed specifications on SCIM endpoint implementation, refer to the Tutorial - Develop a SCIM endpoint for user provisioning and Security — Apache Superset documentation for Superset-specific security considerations.

At D23, we provide managed SCIM endpoints as part of our Superset hosting service, so you don’t have to build this yourself. But understanding the mechanics helps you evaluate whether SCIM is right for your organization.

Configuring JIT Provisioning with SAML Attribute Mapping

JIT provisioning is simpler to set up than SCIM. You just need to configure SAML in Superset and define how SAML attributes map to Superset user fields.

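Step 1: Install a SAML Library

The configuration below assumes a SAML library is available to Superset. This guide installs flask-saml2; confirm which SAML package your Superset version or custom security manager actually expects before installing:

pip install flask-saml2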

Step 2: Configure SAML in superset_config.py

import os

# Enable SAML authentication
AUTH_TYPE = 5  # SAML

# SAML metadata URL (provided by your identity provider)
SAML_METADATA_URL = "https://idp.yourcompany.com/metadata.xml"

# Or use a local metadata file
# SAML_METADATA_FILE = "/path/to/metadata.xml"

# Entity ID (must match what's configured in your identity provider)
SAML_ENTITY_ID = "https://superset.yourcompany.com/metadata/"

# Assertion Consumer Service URL (where the identity provider sends assertions)
SAML_ASSERTION_CONSUMER_SERVICE_URL = "https://superset.yourcompany.com/acs"

# Single Logout Service URL
SAML_SINGLE_LOGOUT_SERVICE_URL = "https://superset.yourcompany.com/sls"

# Map SAML attributes to Superset user fields
SAML_ATTRIBUTE_MAPPING = {
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress": "email",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/givenname": "first_name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/surname": "last_name",
    "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/groups": "groups"
}

# Map SAML groups to Superset roles
SAML_GROUP_MAPPING = {
    "analytics-team": "Analytics",
    "data-engineers": "Data Engineer",
    "finance-team": "Finance",
    "executives": "Admin"
}

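# Allow users to be created on first login
AUTH_USER_REGISTRATION = True

# Names of the admin and public roles
AUTH_ROLE_ADMIN = "Admin"
AUTH_ROLE_PUBLIC = "Public"

# Optional: Set default role for new users. "Viewer" is not a built-in
# Superset role, so create it first or use a built-in role such as "Gamma"
AUTH_USER_REGISTRATION_ROLE = "Viewer"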

Step 3: Restart Superset

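Restart the Superset web server (and any Celery workers) so the new configuration is loaded. The exact command depends on how you deploy Superset (systemd, Docker, Kubernetes). If you upgraded Superset or changed role definitions as part of this work, apply migrations and sync permissions first:

superset db upgrade
superset init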

Step 4: Configure Your Identity Provider

In your identity provider (Okta, Azure AD, etc.), add Superset as an application:

  • Single Sign-On URL: https://superset.yourcompany.com/acs
  • Audience URI: https://superset.yourcompany.com/metadata/
  • Name ID Format: Email Address
  • Attribute Statements: Map your identity provider’s attributes to the SAML attribute names you configured in Superset

For Okta specifically, see SCIM User Provisioning with Okta for both SAML and SCIM configuration.

Step 5: Test the Configuration

Have a test user log into Superset. They should be redirected to your identity provider, authenticate, and be automatically created in Superset with the appropriate role.

If users aren’t being created, check the Superset logs for SAML errors:

tail -f /var/log/superset/superset.log | grep -i saml

Handling Group Membership and Role Assignment

Both SCIM and JIT provisioning support automatic role assignment based on group membership. This is critical for scaling access control without manual intervention.

How It Works:

  1. Your identity provider defines groups (e.g., “analytics-team”, “data-engineers”)
  2. Users are assigned to one or more groups
  3. When provisioning (SCIM or JIT), Superset reads the group membership
  4. Superset maps groups to roles using a configuration table
  5. Users are automatically assigned to the corresponding roles

Example: Analytics Team Access

Let’s say you want all members of the “analytics-team” group to have access to a specific set of datasets and dashboards.

In your identity provider:

Group: analytics-team
  Members: alice@company.com, bob@company.com, carol@company.com

In Superset configuration:

SAML_GROUP_MAPPING = {
    "analytics-team": "Analytics"
}

In Superset’s role management:

The “Analytics” role is configured with permissions to view specific datasets and dashboards. When users are provisioned, they’re automatically assigned this role.

Nested Groups:

Some identity providers support nested groups (e.g., “company:analytics-team”). You can handle these with regex or explicit mapping:

SAML_GROUP_MAPPING = {
    "company:analytics-team": "Analytics",
    "company:data-engineers": "Data Engineer",
    "company:executives": "Admin"
}
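For the regex option, one possible approach is to strip namespace prefixes and map only the final segment. The "namespace:group" convention, the GROUP_PATTERN expression, and the role_for_group helper below are assumptions for illustration. Because any namespace will match, this sketch deliberately leaves the Admin role out of the lookup table.

```python
import re

# Hypothetical convention: zero or more "namespace:" prefixes, then the group
GROUP_PATTERN = re.compile(r"^(?:[\w-]+:)*(?P<leaf>[\w-]+)$")
LEAF_TO_ROLE = {"analytics-team": "Analytics", "data-engineers": "Data Engineer"}


def role_for_group(group: str):
    """Map 'company:analytics-team' or 'analytics-team' to a role name, else None."""
    match = GROUP_PATTERN.match(group)
    return LEAF_TO_ROLE.get(match.group("leaf")) if match else None


print(role_for_group("company:analytics-team"))
```

Explicit mapping remains the safer choice for privileged roles, since a regex that ignores the namespace cannot distinguish "company:executives" from "partner:executives".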

Dynamic Role Assignment:

For more complex scenarios, you might want to assign roles based on multiple attributes. For example, assign the “Finance” role only if a user is in the “finance-team” group AND has the “analyst” job title attribute.

This requires custom logic in Superset’s security manager or in your SCIM endpoint. D23 handles this through custom attribute mapping and role assignment rules.
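The Finance example above can be sketched as a small rule function. The attribute name job_title and the assign_roles helper are hypothetical; where this logic lives (a custom security manager or a SCIM endpoint) depends on your deployment.

```python
def assign_roles(groups: set, attributes: dict) -> set:
    """Apply role rules that may combine group membership with other attributes."""
    roles = set()
    if "analytics-team" in groups:
        roles.add("Analytics")
    # Finance requires BOTH the group and the job title
    job_title = (attributes.get("job_title") or "").lower()
    if "finance-team" in groups and job_title == "analyst":
        roles.add("Finance")
    return roles


print(assign_roles({"finance-team"}, {"job_title": "Analyst"}))
```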

Security Considerations for User Provisioning

Automating user provisioning introduces new security considerations. Here are the key things to think about:

SCIM Token Security:

Your SCIM endpoint must be protected by a strong authentication mechanism. Use:

  • Bearer tokens: Generate a long, random token that your identity provider includes in the Authorization header
  • Mutual TLS: Require your identity provider to present a valid certificate
  • Rate limiting: Prevent brute-force attacks on the SCIM endpoint
  • IP whitelisting: Only allow requests from your identity provider’s IP addresses

Example token validation:

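import hmac
import os

@scim_bp.before_request
def validate_scim_token():
    expected = os.getenv('SCIM_TOKEN', '')
    scheme, _, token = request.headers.get('Authorization', '').partition(' ')

    # Reject when no token is configured; compare in constant time otherwise
    if not (expected and scheme.lower() == 'bearer'
            and hmac.compare_digest(token.encode(), expected.encode())):
        return {"error": "Unauthorized"}, 401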

Deprovisioning Delays:

When a user is deprovisioned (removed from your identity provider), there’s often a delay before Superset is notified. During this window, the user can still access Superset.

Mitigation:

  • Use SCIM instead of JIT for automatic deprovisioning
  • Implement a daily sync job that checks if users in Superset still exist in your identity provider
  • Monitor for suspicious activity from deprovisioned users
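The daily sync job reduces to a set difference between the accounts Superset considers active and the accounts the identity provider considers active. In the sketch below, both inputs are plain Python collections; how you fetch them (Superset’s database, your identity provider’s API) is left out, and find_stale_users is an illustrative name.

```python
def find_stale_users(superset_users: dict, idp_active: set) -> list:
    """Return usernames active in Superset but missing from the IdP's active set.

    `superset_users` maps username -> active flag; `idp_active` is the set of
    usernames the identity provider currently reports as active.
    """
    active_in_idp = {name.lower() for name in idp_active}
    return sorted(
        name
        for name, is_active in superset_users.items()
        if is_active and name.lower() not in active_in_idp
    )


superset_users = {
    "alice@company.com": True,
    "bob@company.com": True,
    "carol@company.com": False,  # already deactivated, nothing to do
}
print(find_stale_users(superset_users, {"Alice@company.com"}))
```

Deactivating the returned accounts, rather than deleting them, keeps ownership of existing dashboards and charts intact.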

Attribute Injection:

If your identity provider is compromised, an attacker could provision themselves with admin privileges. Protect against this by:

  • Validating group membership: Only allow specific groups to be assigned to admin roles
  • Whitelisting groups: Explicitly define which groups can be provisioned, reject others
  • Logging all provisioning: Audit every user creation and role assignment
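These checks can be combined in one filter that runs before any role is assigned. The group names, the ALLOWED_GROUPS and ADMIN_GROUPS sets, and the safe_roles helper are hypothetical values for illustration.

```python
ALLOWED_GROUPS = {"analytics-team", "data-engineers", "executives"}
ADMIN_GROUPS = {"executives"}  # the only groups permitted to yield Admin
GROUP_TO_ROLE = {
    "analytics-team": "Analytics",
    "data-engineers": "Data Engineer",
    "executives": "Admin",
}


def safe_roles(groups, mapping=None) -> tuple:
    """Return (roles, rejected_groups) after applying the allowlist checks."""
    mapping = GROUP_TO_ROLE if mapping is None else mapping
    roles, rejected = set(), []
    for group in groups:
        role = mapping.get(group)
        if group not in ALLOWED_GROUPS:
            rejected.append(group)
        elif role == "Admin" and group not in ADMIN_GROUPS:
            # Guards against a mapping edit that quietly grants Admin
            rejected.append(group)
        elif role:
            roles.add(role)
    return roles, rejected


print(safe_roles(["analytics-team", "contractors", "executives"]))
```

Returning the rejected groups alongside the roles makes it easy to log them, which ties this check to the audit requirement above.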

For comprehensive security guidance, see Securing Your Superset Installation for Production.

Audit Logging:

Maintain detailed logs of all provisioning operations:

  • Who was provisioned and when
  • What groups and roles were assigned
  • What attributes were changed
  • Who deprovisioned users

This is critical for compliance audits and incident investigation.
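One lightweight way to capture these fields is a JSON line per operation written through Python’s standard logging module. The logger name, field names, and audit_event helper below are assumptions; adapt them to whatever log pipeline you already run.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("provisioning.audit")


def audit_event(action: str, target: str, actor: str, **details) -> str:
    """Emit one JSON line describing a provisioning operation and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,    # e.g. "user.create", "user.deactivate"
        "target": target,    # the account that was changed
        "actor": actor,      # who or what initiated the change
        "details": details,  # roles assigned, attributes changed, and so on
    }
    line = json.dumps(record, sort_keys=True)
    audit_log.info(line)
    return line


line = audit_event("user.create", "alice.chen@company.com", "okta-scim", roles=["Analytics"])
```

The logger only produces output once a handler at INFO level is attached, so wire it to a file or log shipper in your Superset logging configuration.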

Operational Patterns: Monitoring and Troubleshooting

Once user provisioning is live, you need to monitor it and handle failures gracefully.

Common Issues:

Issue 1: Users Not Being Created

  • Check that SAML metadata URL is correct and accessible
  • Verify SAML attribute names match your identity provider’s output
  • Check Superset logs for SAML parsing errors
  • Confirm the identity provider is sending the expected SAML assertion

Issue 2: Groups Not Mapping to Roles

  • Verify group names in SAML_GROUP_MAPPING exactly match the identity provider’s group names (case-sensitive)
  • Check that the roles exist in Superset
  • Confirm the identity provider is including groups in the SAML assertion

Issue 3: SCIM Token Failures

  • Verify the bearer token is correct and hasn’t expired
  • Check that the SCIM endpoint is accessible from the identity provider’s IP addresses
  • Review firewall and network security group rules
  • Check for rate limiting that might be blocking requests

Monitoring Best Practices:

  1. Set up alerts for SCIM/SAML failures
  2. Monitor user creation rate for anomalies (sudden spikes might indicate misconfiguration)
  3. Log all provisioning operations for audit trails
  4. Test provisioning regularly with test users
  5. Review role assignments periodically to catch permission drift

Health Check Endpoint:

Implement a health check endpoint that your monitoring system can poll:

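import logging
from sqlalchemy import text

logger = logging.getLogger(__name__)

@scim_bp.route('/health', methods=['GET'])
def health_check():
    try:
        # Check database connectivity (raw SQL must be wrapped in text())
        db.session.execute(text('SELECT 1'))
        return {"status": "healthy"}, 200
    except Exception:
        # Keep error details in the server log instead of the response body
        logger.exception("SCIM health check failed")
        return {"status": "unhealthy"}, 500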

Scaling User Provisioning Across Multiple Superset Instances

If you’re running multiple Superset instances (for high availability or multi-region deployments), user provisioning becomes more complex.

Challenge: Consistency Across Instances

When a user is provisioned via SCIM, only one Superset instance receives the request. The other instances don’t know about the new user until they query the shared database.

If you’re using a shared PostgreSQL database (recommended for production), this isn’t a problem—all instances read from the same user table. But if you’re using local SQLite databases per instance, you need a synchronization mechanism.

Solution 1: Shared Database (Recommended)

Use a single PostgreSQL database shared by all Superset instances. This is the simplest approach and ensures consistency:

# In superset_config.py on every instance: connect to the same metadata database
SQLALCHEMY_DATABASE_URI = "postgresql://user:password@postgres.company.com/superset"

Solution 2: Event-Driven Sync

If you must use local databases, implement event-driven synchronization:

  1. When a user is provisioned, emit an event (to Kafka, SNS, etc.)
  2. All Superset instances subscribe to the event and update their local database
  3. This ensures consistency across instances

This is more complex and introduces operational overhead, so we recommend the shared database approach.
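If you do go the event-driven route, each instance needs to apply events idempotently, because most message buses can deliver the same event more than once. The event shape (event_id, type, username, attributes) and the apply_event helper below are hypothetical, and the sketch does not address out-of-order delivery.

```python
def apply_event(local_users: dict, event: dict, seen: set) -> bool:
    """Apply a provisioning event once; return False for duplicate deliveries."""
    if event["event_id"] in seen:
        return False
    seen.add(event["event_id"])
    if event["type"] == "user.upserted":
        local_users[event["username"]] = event["attributes"]
    elif event["type"] == "user.deleted":
        local_users.pop(event["username"], None)
    return True


local_users, seen = {}, set()
created = {
    "event_id": "evt-1",
    "type": "user.upserted",
    "username": "alice.chen@company.com",
    "attributes": {"active": True},
}
apply_event(local_users, created, seen)
apply_event(local_users, created, seen)  # duplicate delivery is ignored
```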

Comparing SCIM and JIT: A Decision Matrix

To help you choose the right pattern, here’s a comparison:

Criterion                 | SCIM                            | JIT
--------------------------|---------------------------------|------------------------------
Setup Complexity          | High (requires SCIM endpoint)   | Low (configure SAML)
Deprovisioning            | Automatic                       | Manual
Group Sync                | Real-time                       | On next login
Operational Overhead      | Medium                          | Low
Audit Trail               | Excellent                       | Good
Scalability               | Excellent (for large teams)     | Good (for small teams)
Identity Provider Support | Limited (major providers only)  | Universal (any SAML provider)
Cost                      | Higher (more infrastructure)    | Lower (simpler setup)

Integration with D23’s Managed Superset Service

Building and maintaining user provisioning infrastructure is complex. At D23, we handle this for you. Our managed Superset platform includes:

  • Pre-built SCIM endpoints that integrate with Okta, Azure AD, Auth0, and other major identity providers
  • JIT provisioning configured and tested with your SAML provider
  • Group-based access control with automatic role assignment
  • Audit logging for compliance and security investigations
  • Multi-instance synchronization across high-availability deployments
  • Expert support for complex provisioning scenarios

This means you can focus on building dashboards and insights, not managing identity infrastructure. We handle the operational complexity, security hardening, and scaling.

If you’re evaluating Superset for embedded analytics or self-serve BI, user provisioning is a critical part of the decision. Contact our team to discuss your specific requirements and see how we can accelerate your Superset deployment.

Conclusion: Choosing Your Provisioning Strategy

User provisioning is a foundational part of any production Superset deployment. Whether you choose SCIM or JIT depends on your organization’s size, complexity, and requirements:

  • Start with JIT if you have a small team, want to minimize complexity, or are evaluating Superset
  • Move to SCIM as you scale, need automatic deprovisioning, or require real-time group synchronization
  • Use both in hybrid scenarios (SCIM for employees, JIT for partners)

Regardless of which pattern you choose, focus on:

  1. Security: Protect your SCIM tokens and SAML assertions
  2. Audit logging: Track all provisioning operations
  3. Testing: Validate provisioning with test users before rolling out to your organization
  4. Monitoring: Alert on provisioning failures
  5. Documentation: Document your configuration so future team members understand it

The investment in automated user provisioning pays dividends as your organization grows. No more manual user management, faster onboarding, and better security. That’s the promise of SCIM and JIT provisioning in Superset.

For more guidance on securing your Superset deployment and managing users at scale, reach out to learn how D23 simplifies Superset operations for data teams.