The D23 Discovery Workshop: How We Scope Engagements
Learn how D23's discovery workshop scopes fixed-fee analytics engagements. A technical deep-dive into planning managed Superset implementations.
Understanding the Discovery Workshop Framework
When a data leader or CTO reaches out to D23 about managed Apache Superset, they’re usually facing a familiar problem: they know they need production-grade analytics, but they don’t know exactly what that looks like for their specific team, data stack, and business outcomes. A discovery workshop is where we translate that uncertainty into a concrete, scoped engagement.
Unlike vendor demos or generic consulting kickoffs, a D23 discovery workshop is a structured, time-boxed investigation designed to answer three critical questions: What does your analytics problem actually look like? What does a solution cost in time and resources? What’s the right engagement model for your situation?
This isn’t a sales pitch disguised as discovery. It’s a working session where we dig into your data architecture, your team’s skill mix, your business metrics, and your timeline. We come out the other side with a clear statement of work, a fixed fee, a delivery schedule, and mutual clarity on what success looks like. For teams evaluating managed Apache Superset hosting as an alternative to Looker, Tableau, or Power BI, this process removes ambiguity and lets you make a confident decision.
The workshop typically spans 2–4 hours across one or two sessions, involves 3–5 people from your team, and produces a detailed scope document that becomes the foundation of your engagement.
Why Traditional Scoping Fails for Analytics Projects
Most analytics consulting engagements fail at scoping. Here’s why: analytics is not a linear project. You don’t build a dashboard the way you build a bridge. The requirements emerge as you learn your data, test hypotheses, and watch how your team actually uses the tools.
Conventional consulting approaches ask you to define success upfront: “We want five dashboards covering sales, marketing, and finance.” Then they estimate hours, lock in a contract, and 12 weeks later you get something that doesn’t quite fit because the underlying data was messier than expected, or your CFO’s definition of “pipeline” changed, or the team needed real-time alerts instead of static reports.
This is especially true with self-serve BI implementations. The promise of self-serve analytics is that your team gets faster, more accurate answers. But building that capability requires more than spinning up a Superset instance. You need to understand your data governance model, your team’s SQL literacy, your query performance constraints, and how analytics decisions actually flow through your organization.
A discovery workshop sidesteps these traps by making the unknowns visible before you commit. You’re not guessing at scope—you’re measuring it.
The Pre-Workshop Preparation Phase
Before we sit down together, we ask you to do some homework. This isn’t busywork. It’s a filter that ensures the workshop is productive and that we’re working with decision-makers.
You’ll provide a brief intake form covering:
- Your current analytics stack: Where does your data live? What tools are you using today? Are you on Postgres, Snowflake, BigQuery, Redshift? Are you already using Superset, or evaluating it? Do you have a data warehouse, or are you querying operational databases?
- Your team: How many people will use analytics? What’s their SQL skill level? Do you have a dedicated analytics engineer, or are you asking product and data engineers to own reporting?
- Your business metrics: What are the top 5–10 KPIs your leadership team cares about? Where do those metrics live in your data? How often do they need to be refreshed?
- Your pain points: What’s broken about your current setup? Is it speed, governance, cost, or something else?
- Your timeline and budget band: Are you thinking weeks or months? Are you funded for this, or still in evaluation?
This pre-work typically takes 30 minutes. We also ask you to pull a few sample data tables or schemas—nothing sensitive, just enough for us to understand your data structure.
Why this matters: we come into the workshop already oriented. We’re not spending the first hour asking questions we could have asked asynchronously. We can dive straight into the technical and organizational details that actually determine scope.
The Workshop Itself: Four Core Modules
Module 1: Data Architecture and Source System Mapping
The first module is about understanding your data landscape in detail. This is where we map source systems, identify latency requirements, and flag data quality issues that will affect dashboard design.
We walk through your tech stack with your data engineer or analytics engineer. The questions are concrete:
- How often does your data land in the warehouse? Is it real-time streaming, hourly batch, or daily overnight load?
- What’s the freshness requirement for your dashboards? Do your sales dashboards need to show deals from the last 30 minutes, or is yesterday’s data acceptable?
- How large are your fact tables? If you have a billion-row events table, what’s the query latency when you filter to the last 30 days?
- What’s your current data quality baseline? Are there known issues—missing values, late-arriving facts, dimension table inconsistencies—that we need to work around?
- Do you have a data catalog or lineage tool? Can we trace a metric from the source system back to the warehouse table?
These details matter because they directly affect how you architect dashboards in Apache Superset. A query that scans a billion rows and takes 45 seconds is a bad user experience. If that’s your baseline, we need to either optimize the underlying query, pre-aggregate the data, or redesign the dashboard to use smaller, filtered datasets.
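One common fix is pre-aggregating raw events into a daily rollup so dashboards scan thousands of rows instead of billions. A minimal sketch of the idea in pure Python, with hypothetical table shape and column names (in practice this would be a scheduled SQL job or dbt model):

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw events: (event_date, account_id, revenue_cents).
# In the warehouse this would be the billion-row events table.
events = [
    (date(2024, 1, 1), "acct_a", 500),
    (date(2024, 1, 1), "acct_b", 1200),
    (date(2024, 1, 2), "acct_a", 700),
]

def daily_rollup(rows):
    """Pre-aggregate raw events into one row per (day, account).

    A dashboard charting revenue by day then queries this small
    rollup instead of scanning the full events table.
    """
    totals = defaultdict(int)
    for event_date, account_id, revenue_cents in rows:
        totals[(event_date, account_id)] += revenue_cents
    return dict(totals)

rollup = daily_rollup(events)
```

The same logic in the warehouse is a `GROUP BY event_date, account_id` materialized on a schedule; the design choice is simply moving the expensive scan out of the dashboard's query path.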
We also map integrations. Are you pulling data from Salesforce, Mixpanel, Stripe, Segment? Each integration is a potential failure point. We want to know if you have automated syncs, who owns them, and how they’re monitored.
By the end of this module, we have a data architecture diagram and a list of “known unknowns”—things we’ll need to investigate more deeply during the engagement.
Module 2: Analytics Capability and Team Readiness
The second module is about your team and their readiness for self-serve analytics. This is the hardest part to scope because it’s not purely technical.
We assess:
- SQL literacy: Can your product engineers write queries? Do you have analysts who can? Or are you starting from scratch?
- BI tool experience: Has anyone used Looker, Tableau, or Power BI? Have you tried Superset? What did they like or dislike?
- Analytics culture: Who asks for reports today? Is it ad-hoc requests to a single person, or do teams have access to a shared dashboard?
- Governance model: Do you have data governance policies? Who owns the “single source of truth” for a metric like MRR or churn?
- Training capacity: Can your team absorb training, or do you need hand-holding for the first few months?
These factors determine whether you need a lightweight implementation (just get Superset running and let the team explore) or a more structured engagement (we build curated dashboards, define metrics, and train your team on how to maintain them).
For example, if you have strong analytics engineers and a healthy data culture, you might benefit from D23’s API-first BI approach—embedding analytics programmatically and letting your team build on top of it. If you’re earlier in your analytics journey, you might need us to build initial dashboards that become templates for your team.
We also discuss change management. Introducing a new BI tool is a change. Some teams will adopt it immediately. Others will resist. We want to identify champions within your organization who can evangelize the tool and help drive adoption.
Module 3: Use Case Prioritization and Metric Definition
This module is where we get specific about what you’re actually building.
We start by listing all the dashboards, reports, and analytics use cases you think you need. This list is usually long—20, 30, sometimes 50 items. Our job is to ruthlessly prioritize.
We use a simple framework:
- Impact: How many people use this? How often? How does it affect business decisions?
- Complexity: How many data sources? How much transformation? How much SQL?
- Feasibility: Do we have the data? Is the data quality good enough? Can we get it fresh enough?
We then sort by impact/complexity ratio. The goal is to identify the first 3–5 use cases that give you the most value with the least complexity. These become your MVP (minimum viable product).
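The impact/complexity sort is simple enough to sketch in a few lines of Python; the use-case names and scores below are invented for illustration:

```python
# Hypothetical use cases scored 1-5 during the workshop.
# Higher impact is better; lower complexity is better.
use_cases = [
    {"name": "Sales pipeline dashboard", "impact": 5, "complexity": 2},
    {"name": "Marketing attribution",    "impact": 4, "complexity": 5},
    {"name": "Finance close report",     "impact": 3, "complexity": 1},
]

def prioritize(cases):
    """Sort use cases by impact-to-complexity ratio, highest first."""
    return sorted(cases,
                  key=lambda c: c["impact"] / c["complexity"],
                  reverse=True)

mvp = prioritize(use_cases)[:2]  # the top slice becomes the MVP
```

The scoring itself is a conversation, not a formula; the code just makes the tiebreaks explicit once the room has agreed on the numbers.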
For each MVP use case, we define the metrics explicitly. This is critical. “Revenue” means different things to different people. Is it gross revenue, net revenue, revenue recognized this month, or revenue that landed in the bank account? We write down the exact definition, the source table, the business logic, and the refresh frequency.
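Writing the definition down in a structured form keeps it unambiguous. A sketch of what one registry entry might look like; the field names and table are our invention for illustration, not a Superset feature:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MetricDefinition:
    """An explicit, written-down metric definition from the workshop."""
    name: str
    source_table: str
    business_logic: str   # the exact expression everyone agreed on
    refresh: str          # how often the dashboard needs fresh data

# Example: "net revenue" pinned to one table and one expression,
# so sales and finance are reading the same number.
net_revenue = MetricDefinition(
    name="net_revenue",
    source_table="analytics.fct_invoices",
    business_logic="SUM(amount_cents - refund_cents) / 100.0",
    refresh="daily",
)
```
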
We also identify dependencies. If your sales dashboard depends on a clean customer dimension table, and that table is currently messy, we flag it as a blocker. We might decide to fix it as part of the engagement, or we might decide to work around it.
By the end of this module, you have a prioritized roadmap with 3–5 initial use cases, explicit metric definitions, and a clear understanding of what’s a blocker versus what’s a nice-to-have.
Module 4: Engagement Model, Timeline, and Delivery
The final module is about translating everything we’ve learned into a concrete engagement.
We discuss delivery models:
- Managed hosting: We run your Superset instance. You focus on dashboards and analytics.
- Consulting only: We help you scope and design, but your team runs Superset in your infrastructure.
- Hybrid: We consult on architecture and dashboards while you handle hosting, or the other way around.
- Embedded analytics: You want to embed dashboards in your product. We help design and implement the architecture.
Each model has different cost and timeline implications. Managed hosting is simpler operationally but involves a monthly fee. Consulting-only is lower ongoing cost but requires more investment from your team.
We also discuss the role of AI-powered analytics and text-to-SQL capabilities. If you’re interested in letting your team ask questions in natural language and get SQL queries back, that’s a specific architecture choice that affects scope and timeline.
Then we estimate effort. Based on the MVP scope, the data architecture, and your team’s readiness, we estimate:
- Weeks to first dashboard: Typically 2–4 weeks from kickoff to your first production dashboard live in Superset.
- Total MVP effort: How many weeks of consulting, or how many months of managed hosting, to get all 3–5 initial use cases live and documented?
- Team capacity needed: How much time does your team need to invest? Are we doing 80% of the work, or 50/50?
We also discuss risk and unknowns. Maybe your data quality is worse than expected. Maybe you discover you need a new data pipeline. We build in contingency and define what happens if we hit a blocker.
Finally, we talk about success metrics. How will we know the engagement worked? Is it “all three dashboards are live and accurate”? Is it “your team can self-serve and answer 80% of ad-hoc questions without asking us”? Is it “we reduced reporting time from 2 days to 30 minutes”? We define this upfront so there’s no ambiguity at the end.
The Scope Document: From Workshop to Engagement
Within a week of the workshop, we produce a detailed scope document. This becomes your statement of work.
The document includes:
- Executive summary: One page covering the engagement overview, timeline, and cost.
- Current state assessment: Your data architecture, team readiness, and pain points as we understand them.
- Target state and success criteria: The 3–5 MVP use cases, metric definitions, and what success looks like.
- Deliverables: What you’re getting. Managed Superset instance? Dashboards? Documentation? Training? Data pipelines?
- Timeline: Weeks 1–2 (setup and data exploration), weeks 3–4 (first dashboards), weeks 5–6 (refinement and training), etc.
- Roles and responsibilities: What we’re doing, what your team is doing, what decisions you need to make and when.
- Assumptions and constraints: The things we’re assuming are true. “We assume you have a Snowflake instance with clean customer and transaction tables.” “We assume your team has 10 hours per week available for training and feedback.”
- Risk register: Known unknowns and how we’ll handle them.
- Pricing: The fixed fee for the engagement, what’s included, and what’s out of scope (e.g., “building dashboards for use cases outside the MVP is out of scope and will be quoted separately”).
This document is not a contract yet. It’s a detailed proposal. You review it, ask questions, and negotiate if needed. Once you sign off, it becomes the basis of your statement of work.
Why This Approach Works for Fixed-Fee Engagements
Fixed-fee consulting is risky for both sides. If the vendor underestimates, they lose money. If they overestimate, the client feels ripped off. A rigorous discovery workshop mitigates this risk by making scope concrete.
We’re not estimating in the abstract. We’ve looked at your data, talked to your team, and identified the actual work. We know what’s a blocker and what’s straightforward. We’ve built in contingency for real risks, not imaginary ones.
For you, the benefit is predictability. You know what you’re paying, when you’ll get it, and what you’re getting. No surprise invoices. No scope creep. If something unexpected comes up—your data is messier than we thought, or you need a new integration—we discuss it explicitly and decide together whether it’s in scope or out of scope.
This also means we’re motivated to scope conservatively. We’d rather deliver more than expected and build goodwill than promise the world and disappoint. If we think an engagement is too risky for a fixed fee, we’ll say so and propose an hourly or retainer model instead.
Common Discoveries and How We Handle Them
In hundreds of discovery workshops, patterns emerge.
Discovery 1: Your data is messier than you thought. You think you have a clean customer dimension table, but it has duplicates, late-arriving facts, and inconsistent IDs. This is the most common discovery. We don’t panic. We either fix it as part of the engagement (if it’s blocking your MVP), or we design dashboards that work around it. Either way, we’re transparent about the effort and cost.
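This often surfaces from a quick check during the workshop. A sketch of the kind of duplicate-key probe we run against a dimension table; the rows and key name below are hypothetical:

```python
from collections import Counter

# Hypothetical rows pulled from a customer dimension table.
customers = [
    {"customer_id": "C001", "email": "a@example.com"},
    {"customer_id": "C002", "email": "b@example.com"},
    {"customer_id": "C001", "email": "a@example.org"},  # duplicate key
]

def duplicate_keys(rows, key):
    """Return business keys that appear more than once."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)

dupes = duplicate_keys(customers, "customer_id")
```

A non-empty result here turns a vague "the data might be messy" into a named blocker with a row count attached.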
Discovery 2: Your team’s SQL literacy is lower than you hoped. You thought your engineers could maintain dashboards, but they’re not comfortable writing complex queries. This means you need more consulting support or more training. We adjust the engagement accordingly.
Discovery 3: You need real-time data, not batch. You thought daily refresh was fine, but during the workshop you realize your sales team needs to see deals as they close. This is a material change. We discuss whether real-time is actually necessary (sometimes teams think they need it but don’t), and if so, we scope the data pipeline work.
Discovery 4: You have more use cases than you thought. You came in thinking you needed three dashboards. By the end of the workshop, you realize you need ten. We don’t try to fit ten into the scope of three. We prioritize, scope the first three, and talk about a follow-on engagement for the rest.
Discovery 5: Your stakeholders disagree on metrics. Sales thinks MRR is calculated one way, finance thinks another. This is a governance issue, not a tool issue. We identify it, but we don’t solve it in the workshop. We flag it as a pre-engagement decision your leadership team needs to make.
In each case, the workshop gives us the visibility to handle these issues proactively, not reactively halfway through the engagement.
Preparing for Your Discovery Workshop
If you’re considering a discovery workshop with D23, here’s how to prepare:
Assemble the right team. You need someone who understands your data (data engineer or analytics engineer), someone who understands your business (product manager or analyst), and someone who can make decisions (director or VP). Three to five people is ideal. More than that, and the meeting becomes unwieldy.
Get your data documentation ready. You don’t need a perfect data dictionary, but have something. A screenshot of your warehouse schema, a list of your main tables, or a data lineage diagram. If you use dbt, share its generated docs; if you have a data catalog tool, an export from that is ideal. If not, a simple spreadsheet works.
Think about your use cases. Don’t overthink this. Just brain-dump: what reports do you need? What questions do you ask repeatedly? What decisions would be faster if you had better data? We’ll help you prioritize.
Be honest about your constraints. If you’re on a tight timeline, say so. If you have a limited budget, say so. If your data is a mess, say so. We’re not judging. We’re scoping. Honesty makes the scope accurate.
Clear your calendar. The workshop is 2–4 hours. Don’t try to do it between other meetings. Block the time, focus, and engage fully. The quality of the workshop depends on the quality of the conversation.
After the Workshop: Next Steps
Once the scope document is signed, the engagement begins. Typically, the first week is setup: we configure your Superset instance (if managed hosting), connect your data sources, and run your first queries to validate data freshness and quality.
Weeks 2–3 are usually the heaviest lift. We’re building dashboards, writing queries, and collaborating closely with your team. You’ll see drafts, give feedback, and iterate. This is where the real value is created.
Weeks 4–6 are refinement and knowledge transfer. We’re documenting dashboards, training your team on how to maintain and extend them, and making sure everything is production-ready.
Throughout, we’re using D23’s managed Superset platform or your self-hosted instance. We’re leveraging Superset’s API capabilities for programmatic dashboard creation and updates. If you’re interested in text-to-SQL or AI-assisted analytics, we integrate that into the workflow.
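As one example of that programmatic surface, Superset exposes a REST API: you authenticate against `/api/v1/security/login` to get a bearer token, then call endpoints like `/api/v1/dashboard/`. A hedged stdlib sketch of building those requests (the host and credentials are placeholders, and error handling is omitted):

```python
import json
import urllib.request

SUPERSET_URL = "https://superset.example.com"  # placeholder host

def login_request(username, password):
    """Build the login call for Superset's REST API.

    POST /api/v1/security/login returns a JSON body containing an
    access token that later calls pass as a Bearer header.
    """
    body = json.dumps({
        "username": username,
        "password": password,
        "provider": "db",
        "refresh": True,
    }).encode()
    return urllib.request.Request(
        f"{SUPERSET_URL}/api/v1/security/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def list_dashboards_request(access_token):
    """Build an authenticated GET for /api/v1/dashboard/."""
    return urllib.request.Request(
        f"{SUPERSET_URL}/api/v1/dashboard/",
        headers={"Authorization": f"Bearer {access_token}"},
    )

# Actually sending these (urllib.request.urlopen(...)) is left to the
# engagement's own scripts; this sketch only shows the API shape.
```
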
We also stay in touch with your team. Regular check-ins, quick feedback loops, and a clear escalation path if something isn’t working. The engagement is collaborative, not transactional.
Comparing Discovery Approaches Across Vendors
When you’re evaluating vendors like Preset, Looker, Tableau, or Power BI, ask them about their scoping process. You’ll notice differences.
Some vendors rush to a contract without deep discovery. They’re optimizing for sales velocity, not engagement success. Others do discovery but don’t commit to fixed fees—they want flexibility to bill more if things get complicated. Some have good discovery but poor follow-through on delivery.
D23’s discovery workshop is designed to be thorough, collaborative, and honest. We’re not trying to upsell you or lock you in. We’re trying to understand your problem and propose a solution you can execute with confidence.
This approach works because we’re focused on your success, not our revenue. If we scope conservatively and deliver more than expected, you’re happy and you’ll recommend us. If we overcommit and underdeliver, you’re unhappy and you’ll tell everyone. The incentives are aligned.
The Role of Consulting in Your Analytics Journey
It’s worth noting that the discovery workshop is just the beginning of your analytics journey. The engagement we scope might be 4–6 weeks of intensive work, but your analytics needs will evolve.
After the MVP is live, you’ll want to add more dashboards, integrate new data sources, and optimize performance. You might want to embed analytics in your product. You might want to adopt AI-powered analytics to let your team ask natural language questions.
Some teams transition to self-service after the initial engagement and rarely need us again. Others prefer an ongoing retainer or partnership model. D23 offers both: we can help you get to a state where you’re self-sufficient, or we can be your ongoing analytics partner.
The discovery workshop gives us the foundation to support either path. We understand your data, your team, and your business. We’ve built institutional knowledge about your stack. That’s valuable whether we’re deeply involved or lightly involved going forward.
Key Takeaways
The D23 discovery workshop is a structured, honest approach to scoping analytics engagements. Here’s what makes it work:
- Visibility into complexity: We don’t guess at scope. We look at your data, talk to your team, and measure the work.
- Prioritization: We help you focus on the highest-impact use cases first, not try to boil the ocean.
- Explicit metrics: We define metrics clearly so there’s no ambiguity about what success looks like.
- Risk awareness: We identify blockers and unknowns upfront so we can plan for them.
- Fixed fees with confidence: Because we’ve done the work to understand scope, we can commit to fixed fees without fear.
- Collaborative engagement: The workshop is not a sales pitch. It’s a working session where we’re learning together.
If you’re a data leader evaluating managed Apache Superset or considering a consulting engagement to build analytics capability, a discovery workshop is the right starting point. It’s a small investment of time that pays dividends in clarity, confidence, and successful outcomes.
Ready to get started? Reach out to D23 and let’s schedule your discovery workshop. We’ll come prepared, ask hard questions, and help you build a roadmap for analytics success.