Why Your Text-to-SQL Project Will Fail Without a Semantic Layer
You’ve seen the demos. A user types a natural language question into a chat interface, and within seconds, the system returns a perfectly formatted SQL query that answers their question. The LLM magic feels real, the promise feels inevitable, and your team is already imagining the productivity gains.
Then you deploy it to production.
Within days, you’re drowning in hallucinated metrics, incorrect aggregations, and queries that run for minutes before timing out. Your finance team is getting revenue numbers that don’t match the general ledger. Your product team is seeing user counts that contradict your warehouse. The executive dashboard breaks because the system confused “monthly active users” with “monthly user signups.” Your data team is now spending more time validating AI-generated queries than they would have spent answering questions manually.
This isn’t a failure of the LLM. It’s a failure of architecture.
Text-to-SQL systems without a semantic layer are like asking someone to navigate a foreign city without a map, street signs, or a shared language. The LLM is incredibly powerful, but it’s operating in a vacuum—without explicit definitions of what your metrics mean, how your tables relate to each other, or what the business rules are. The result is a system that feels intelligent but produces garbage.
A semantic layer fixes this. It’s the bridge between natural language and your data warehouse. It’s the Rosetta Stone that lets an LLM understand not just the structure of your database, but the meaning behind it.
Let’s walk through why text-to-SQL without a semantic layer fails, what a semantic layer actually does, and how to build one that works in production.
The Core Problem: LLMs Don’t Understand Your Data
Large language models are pattern-matching machines trained on vast amounts of text. They’re phenomenally good at predicting what comes next in a sequence. But they have no inherent understanding of your business, your data model, or what your metrics actually mean.
When you ask an LLM to write SQL against your database, you’re asking it to:
- Parse your natural language question
- Identify which tables contain the relevant data
- Determine how those tables should be joined
- Apply the correct aggregations and filters
- Handle edge cases and business logic
- Return syntactically correct SQL that actually runs
Without explicit guidance, the LLM will make assumptions. It will hallucinate table names that don’t exist. It will guess at join logic. It will invent column names. It will apply the wrong aggregation function because the training data had a similar-sounding example.
Consider a simple question: “What was our revenue last month?”
Without a semantic layer, the LLM has to figure out:
- Which table contains revenue? Is it `transactions`, `orders`, `sales`, `invoices`, or something else?
- How is revenue calculated? Is it the `amount` column, or do you need to sum `unit_price * quantity`?
- What counts as “last month”? Do you want calendar month, fiscal month, or rolling 30 days?
- Should you include refunds, cancellations, or partial orders?
- What’s the currency? Do you need to convert?
- Are there any business rules about what counts as “revenue” for reporting purposes?
The LLM might guess correctly 60% of the time. The other 40%, it will produce plausible-sounding SQL that runs without errors but returns completely wrong numbers. This is worse than an error—errors force you to investigate. Wrong numbers silently corrupt your decision-making.
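The “last month” ambiguity alone is enough to produce two defensible but different answers. A minimal sketch of the two most common interpretations:

```python
from datetime import date, timedelta

def calendar_last_month(today: date) -> tuple[date, date]:
    """'Last month' as the previous calendar month: (first day, last day)."""
    first_of_this_month = today.replace(day=1)
    last_of_prev = first_of_this_month - timedelta(days=1)
    return last_of_prev.replace(day=1), last_of_prev

def rolling_30_days(today: date) -> tuple[date, date]:
    """'Last month' as a rolling 30-day window ending yesterday."""
    end = today - timedelta(days=1)
    return end - timedelta(days=29), end

today = date(2024, 3, 15)
calendar_last_month(today)  # -> (date(2024, 2, 1), date(2024, 2, 29))
rolling_30_days(today)      # -> (date(2024, 2, 14), date(2024, 3, 14))
```

The two windows barely overlap, yet both are reasonable readings of the same question. Without a semantic layer pinning down which one “last month” means, the LLM picks one silently.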
Why Traditional Documentation Doesn’t Work
You might think you can solve this by providing the LLM with better documentation. You document your schema, write out your business logic, include example queries, and feed it all to the prompt.
This works until it doesn’t.
Documentation is unstructured. It’s prose. It’s ambiguous. When you write “revenue is the sum of all completed orders,” you’re creating a semantic artifact that a human can interpret in multiple ways. Is an order “completed” when it ships, when the customer receives it, or when payment clears? Does “all” include refunded orders? What about test orders?
An LLM reading this documentation will make a guess. If the guess is wrong, there’s no systematic way to correct it. You can’t debug documentation the way you can debug code.
Moreover, documentation rots. Your data model evolves, your business logic changes, and your documentation falls out of sync. The LLM is now operating on stale information.
A semantic layer is executable documentation. It’s code that defines your metrics, dimensions, and relationships in a machine-readable format that the LLM can actually use.
What a Semantic Layer Actually Is
A semantic layer is a structured, governed definition of your business metrics and data relationships. It sits between your raw database and your applications (including text-to-SQL systems). Instead of having an LLM guess at your data model, the semantic layer explicitly defines:
Metrics: Calculated values like revenue, customer count, or churn rate. A metric has a clear definition, a calculation method, and documented business logic.
Dimensions: Attributes that you filter or group by, like date, customer segment, or product category. Dimensions have standardized names and clearly defined values.
Relationships: How tables connect to each other. Instead of the LLM guessing at join logic, the semantic layer explicitly defines the foreign key relationships.
Filters and Business Rules: Constraints that should apply to certain queries. For example, “revenue should exclude test orders” or “active users are those who logged in within the last 30 days.”
When an LLM has access to a semantic layer, it doesn’t have to figure out the data model. It can focus on understanding the user’s intent and translating that intent into a query that uses the semantic layer’s pre-defined metrics and dimensions.
This is the difference between asking someone to navigate a city with a map versus without one. The map doesn’t eliminate the need for navigation, but it makes the task vastly more tractable.
Real-World Failure Modes
Let’s walk through some concrete examples of how text-to-SQL fails without a semantic layer, and how a semantic layer prevents these failures.
Example 1: The Hallucinated Metric
Your CFO asks: “What’s our gross margin by product line?”
Without a semantic layer, the LLM searches your database schema and finds a column called `margin`. It generates a query that selects `product_line` and `margin`, grouped by product line.
The query runs successfully. It returns numbers. The CFO is satisfied.
Six weeks later, your accounting team discovers that the “gross margin” numbers you’ve been reporting are actually net margin, and they’re off by 15% because the LLM picked the wrong column. The entire forecast is now wrong.
With a semantic layer, you’ve explicitly defined “gross_margin” as a metric: (revenue - cogs) / revenue. The LLM can’t confuse it with net margin because the semantic layer has already codified the correct calculation. If the CFO asks for gross margin, the LLM generates a query that uses the defined metric. If the LLM tries to invent its own margin calculation, the semantic layer’s type system catches the error.
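The failure mode here is silent substitution, and the fix is a hard lookup. A minimal sketch of a metric registry (names and structure are illustrative, not any particular product’s API):

```python
# Governed metric definitions: one place, one calculation each.
METRICS = {
    "gross_margin": {
        "expression": "(SUM(revenue) - SUM(cogs)) / SUM(revenue)",
        "description": "Revenue minus cost of goods sold, as a share of revenue.",
    },
    "net_margin": {
        "expression": "SUM(net_income) / SUM(revenue)",
        "description": "Bottom-line profit as a share of revenue.",
    },
}

def resolve_metric(name: str) -> str:
    """Return the governed SQL expression for a metric, or fail loudly."""
    if name not in METRICS:
        raise KeyError(f"Unknown metric {name!r}; the LLM may not invent its own.")
    return METRICS[name]["expression"]

resolve_metric("gross_margin")  # the one governed calculation
# resolve_metric("margin") raises KeyError instead of silently
# picking whichever column looked closest.
```

The point is the failure behavior: an undefined metric produces an error you can see, not a plausible number you can’t.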
Example 2: The Silent Join Failure
Your product team asks: “How many customers did we acquire last month, broken down by acquisition channel?”
Without a semantic layer, the LLM has to figure out how to join the customers table with the acquisition_channels table. It guesses at the join key. Maybe it assumes customer_id exists in both tables. Maybe it assumes there’s a channel_id foreign key.
The query runs. It returns numbers. But the join is wrong. Some customers have multiple acquisition channels, and the query is double-counting them. Other customers are missing because the join is too restrictive. The numbers are plausible, but they’re wrong.
With a semantic layer, the relationship between customers and acquisition channels is explicitly defined. The LLM doesn’t guess at join logic—it uses the pre-defined relationship. If there are edge cases (like customers with multiple channels), the semantic layer has already decided how to handle them. The query is correct by design.
Example 3: The Aggregation Trap
Your operations team asks: “What’s our average order value by region?”
Without a semantic layer, the LLM generates a query that calculates AVG(order_value) grouped by region.
But here’s the problem: average order value should be calculated as SUM(revenue) / COUNT(distinct orders), not AVG(order_value). If you have orders of varying sizes, the two calculations give different results. The LLM picked the simpler aggregation without understanding the business meaning.
With a semantic layer, “average order value” is a defined metric with a specific calculation. The LLM doesn’t get to choose the aggregation function—it uses the one that’s defined in the semantic layer. The query is correct by definition.
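The gap between the two calculations is easy to see with toy numbers. A common way the naive version goes wrong is when the `order_value` column lives at line-item grain, so `AVG()` weights multi-line orders more heavily (column names here are illustrative):

```python
# Rows from an order_items-style table; order_value is repeated on
# every line item of the same order.
rows = [
    {"order_id": "A", "order_value": 100.0},  # order A: 1 line item
    {"order_id": "B", "order_value": 300.0},  # order B: 3 line items
    {"order_id": "B", "order_value": 300.0},
    {"order_id": "B", "order_value": 300.0},
]

# Naive AVG(order_value): multi-line orders are counted once per row.
naive = sum(r["order_value"] for r in rows) / len(rows)   # 250.0 -- wrong

# Governed definition: SUM(revenue) / COUNT(DISTINCT order_id).
per_order = {r["order_id"]: r["order_value"] for r in rows}
correct = sum(per_order.values()) / len(per_order)        # 200.0
```

Both numbers are plausible; only one matches the business definition. That choice belongs in the semantic layer, not in the LLM’s sampling.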
How a Semantic Layer Fixes Text-to-SQL
A semantic layer transforms text-to-SQL from a risky guessing game into a reliable, governed process. Here’s how:
1. Explicit Metric Definitions
Instead of having the LLM invent metrics, you define them once in the semantic layer. Each metric has:
- A clear name and description
- The calculation logic (SQL or reference to other metrics)
- The underlying tables and columns
- Filters and business rules that should apply
- Data type and expected range
When the LLM encounters a user question about a metric, it doesn’t generate SQL from scratch. It looks up the metric in the semantic layer and uses the pre-defined calculation. This eliminates entire categories of errors.
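The properties above can be sketched as a small data structure. This is a hedged illustration only; real semantic layers (dbt, Cube, and others) each have their own schemas, and every field name here is made up:

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    """One governed metric definition; field names are illustrative."""
    name: str
    description: str
    expression: str                                    # calculation logic, as SQL
    tables: list[str]                                  # underlying tables
    filters: list[str] = field(default_factory=list)   # business rules that always apply
    dtype: str = "numeric"

revenue = Metric(
    name="revenue",
    description="Completed-order revenue, excluding test orders.",
    expression="SUM(orders.amount)",
    tables=["orders"],
    filters=["orders.status = 'completed'", "orders.is_test = FALSE"],
)
```

The LLM’s job shrinks to selecting `revenue` by name; the expression, tables, and filters come along for free.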
2. Standardized Dimension Definitions
Dimensions are the attributes you filter and group by. By defining them in the semantic layer, you ensure consistency:
- “Date” always means the same thing across all queries
- “Customer segment” is calculated the same way every time
- “Product category” uses the same categorization logic
- “Region” is standardized (not a mix of sales regions, support regions, and shipping regions)
The LLM can reference these standardized dimensions without ambiguity. When a user asks for “revenue by region,” the LLM knows exactly which region dimension to use.
3. Governed Relationships
Instead of the LLM guessing at join logic, the semantic layer explicitly defines how tables relate:
- One-to-many relationships (one customer has many orders)
- Many-to-many relationships (orders have many products through line items)
- Slowly changing dimensions (how to handle product category changes over time)
- Edge cases and special handling
The LLM uses these relationships to construct correct joins. It doesn’t have to understand the intricacies of your data model—the semantic layer handles that.
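One way to picture this: relationships are declared once, and joins are derived from the declarations rather than guessed. Table and column names below are invented for illustration:

```python
# Governed relationships, declared once per table pair.
RELATIONSHIPS = {
    ("orders", "customers"): {
        "type": "many_to_one",
        "on": "orders.customer_id = customers.id",
    },
    ("orders", "products"): {
        "type": "many_to_one",
        "on": "orders.product_id = products.id",
    },
}

def join_clause(left: str, right: str) -> str:
    """Build a JOIN from the governed relationship, never from a guess."""
    rel = RELATIONSHIPS.get((left, right)) or RELATIONSHIPS.get((right, left))
    if rel is None:
        raise ValueError(f"No governed relationship between {left} and {right}")
    return f"JOIN {right} ON {rel['on']}"

join_clause("orders", "products")
# -> "JOIN products ON orders.product_id = products.id"
```

An undeclared table pair fails loudly instead of producing a double-counting join that runs without errors.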
4. Enforced Business Logic
Business rules that should apply to every query are baked into the semantic layer:
- “Revenue should exclude test orders”
- “Active users are those who logged in within the last 30 days”
- “We only report on data from the current fiscal year forward”
These rules are applied automatically, consistently, across all queries. The LLM doesn’t have to remember them or apply them correctly—they’re enforced by the semantic layer.
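Mechanically, this can be as simple as merging governed predicates into every query that touches a metric. A minimal sketch, with made-up filter text:

```python
# Rules attached to each metric; appended to every query that uses it,
# so the LLM cannot forget them. Predicates are illustrative.
MANDATORY_FILTERS = {
    "revenue": ["orders.is_test = FALSE"],
    "active_users": ["users.last_login >= CURRENT_DATE - INTERVAL '30 days'"],
}

def apply_rules(metric: str, where: list[str]) -> list[str]:
    """Merge user-supplied predicates with the metric's governed rules."""
    return where + [f for f in MANDATORY_FILTERS.get(metric, []) if f not in where]

apply_rules("revenue", ["orders.created_date >= '2024-01-01'"])
# -> ["orders.created_date >= '2024-01-01'", "orders.is_test = FALSE"]
```

The user’s filters and the business rules compose; neither can silently drop the other.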
Building a Semantic Layer for Text-to-SQL
So how do you actually build a semantic layer that works with text-to-SQL? There are several approaches, each with trade-offs.
dbt Metrics (Now dbt Semantic Layer)
dbt’s semantic layer approach provides a structured way to define metrics and dimensions in YAML. You define your metrics once, and they’re available to any tool that integrates with dbt—including text-to-SQL systems.
The advantages:
- Metrics are version-controlled alongside your data models
- Definitions are DRY (don’t repeat yourself)
- Changes to metric definitions propagate automatically
- Integration with dbt’s lineage and documentation
- Strong type safety
The disadvantages:
- Requires adoption of dbt (though this is increasingly standard)
- YAML-based definitions can become complex for sophisticated metrics
- Learning curve for teams new to dbt
Custom Semantic Layer (Wren AI / Cube / Others)
Specialized semantic layer tools like Wren AI provide a dedicated platform for defining and governing your semantic model. These tools often include:
- Web-based UI for defining metrics and dimensions
- Automatic schema introspection
- Validation and testing frameworks
- Integration with text-to-SQL systems
- Governance and change management
The advantages:
- Purpose-built for semantic layer governance
- Often includes testing and validation
- Can handle complex business logic
- Better UX for non-technical stakeholders
The disadvantages:
- Another tool to operate and maintain
- May require custom integration with your text-to-SQL system
- Potential vendor lock-in
Hybrid Approach (dbt + Custom Layer)
Many teams use dbt as the foundation (defining metrics and dimensions in dbt) and layer additional governance on top. For example, you might:
- Define base metrics in dbt
- Add business rules and filters in a custom application layer
- Implement validation and testing in your text-to-SQL pipeline
- Use D23’s managed Apache Superset to serve these metrics through a governed BI interface
This approach gives you the best of both worlds: the version control and structure of dbt, plus custom governance tailored to your needs.
Text-to-SQL With a Semantic Layer: How It Works
Once you have a semantic layer in place, here’s how text-to-SQL actually works:
1. User Input: A user asks a natural language question: “What was our revenue last month by product category?”

2. Intent Recognition: The LLM parses the question and identifies:
   - The metric being requested: revenue
   - The time period: last month
   - The grouping dimension: product category

3. Semantic Layer Lookup: The system looks up these elements in the semantic layer:
   - Revenue metric definition: `SUM(orders.amount) WHERE orders.status = 'completed'`
   - Time dimension: `orders.created_date`
   - Product category dimension: `products.category` (with the join logic already defined)

4. Query Generation: The LLM generates SQL using the semantic layer’s definitions:

   ```sql
   SELECT
     products.category,
     SUM(orders.amount) AS revenue
   FROM orders
   JOIN products ON orders.product_id = products.id
   WHERE orders.status = 'completed'
     AND DATE_TRUNC('month', orders.created_date) =
         DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
   GROUP BY products.category
   ORDER BY revenue DESC
   ```

5. Validation: The system validates the query against the semantic layer:
   - Are all referenced metrics and dimensions valid?
   - Are the joins correct?
   - Are all required filters applied?
   - Is the aggregation appropriate?

6. Execution: If validation passes, the query is executed against the warehouse.

7. Result Formatting: The results are formatted and returned to the user.
The key difference: the LLM isn’t inventing the metric definition or the join logic. It’s using pre-defined, governed definitions from the semantic layer. This dramatically reduces the surface area for errors.
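The whole flow can be sketched in a few lines. Everything here is an illustrative placeholder rather than a real product’s API, and a production system would use an LLM for the intent step rather than fixed arguments:

```python
# A toy semantic layer: metrics, dimensions, joins, and mandatory rules.
SEMANTIC_LAYER = {
    "metrics": {"revenue": "SUM(orders.amount)"},
    "dimensions": {"product category": "products.category"},
    "joins": {"products": "JOIN products ON orders.product_id = products.id"},
    "rules": ["orders.status = 'completed'"],
}

def generate_sql(metric: str, dimension: str) -> str:
    """Assemble SQL purely from governed definitions; unknown names fail fast."""
    expr = SEMANTIC_LAYER["metrics"][metric]           # KeyError => invalid metric
    dim = SEMANTIC_LAYER["dimensions"][dimension]      # KeyError => invalid dimension
    join = SEMANTIC_LAYER["joins"][dim.split(".")[0]]  # join for the dimension's table
    where = " AND ".join(SEMANTIC_LAYER["rules"])
    return (f"SELECT {dim}, {expr} AS {metric} "
            f"FROM orders {join} WHERE {where} GROUP BY {dim}")

generate_sql("revenue", "product category")
```

Every fragment of the output traces back to a governed definition; the only free choice the model made was which metric and dimension the user asked for.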
Advanced: Agentic Semantic Model Improvement
Once you have a semantic layer in place, you can go further. Snowflake’s research on agentic semantic model improvement shows how to use LLMs to continuously improve your semantic model.
The idea: when a text-to-SQL query returns unexpected results or when a user corrects an AI-generated query, that feedback can be used to improve the semantic layer itself. An agentic system can:
- Detect when metrics are being used incorrectly
- Suggest improvements to metric definitions
- Identify missing dimensions or relationships
- Refine business rules based on observed patterns
This creates a virtuous cycle: as your semantic layer improves, your text-to-SQL accuracy improves. As your text-to-SQL accuracy improves, you get better feedback to improve the semantic layer further.
Practical Implementation: Starting Small
You don’t need to boil the ocean. Here’s a pragmatic approach to implementing text-to-SQL with a semantic layer:
Phase 1: Identify Your Core Metrics
Start with the metrics that matter most:
- What does your CFO ask about most?
- What metrics appear in your executive dashboard?
- What calculations are most frequently wrong or misunderstood?
Define these metrics explicitly in your semantic layer. Don’t try to cover everything—focus on the 80/20 rule.
Phase 2: Add Dimensions and Relationships
For each metric, identify the dimensions that matter:
- What are the most common ways to slice and dice this metric?
- What dimensions do people filter by?
- What relationships need to be explicit?
Add these to the semantic layer and test the relationships.
Phase 3: Implement Text-to-SQL
Once you have a solid semantic layer foundation, implement text-to-SQL:
- Start with a small pilot group
- Monitor queries for accuracy
- Collect feedback on what’s working and what’s not
- Iterate on the semantic layer based on real usage
Phase 4: Expand and Govern
As usage grows, add governance:
- Version control for semantic layer changes
- Approval workflows for new metrics
- Testing and validation frameworks
- Documentation and discovery tools
The Role of Managed Platforms
Building and maintaining a semantic layer is non-trivial. It requires:
- Data engineering expertise to define metrics and relationships correctly
- Ongoing maintenance as your data model evolves
- Governance and change management processes
- Integration with your text-to-SQL system
- Monitoring and validation to catch errors
This is where managed platforms like D23 come in. D23 is built on Apache Superset with integrated semantic layer governance, AI-powered text-to-SQL, and expert data consulting.
Instead of building and maintaining these components yourself, D23 provides:
- A managed Apache Superset instance with semantic layer integration
- API-first architecture for embedding analytics in your products
- Text-to-SQL powered by LLMs, grounded in your semantic layer
- MCP (Model Context Protocol) integration for AI-assisted query generation
- Expert consulting to help you define and govern your semantic layer
- Self-serve BI that actually works because it’s built on a governed semantic foundation
For data and engineering leaders at scale-ups and mid-market companies, this eliminates the operational burden of building and maintaining these systems yourself. You get the benefits of a semantic layer and text-to-SQL without the infrastructure overhead.
Common Pitfalls to Avoid
Even with a semantic layer, text-to-SQL projects can fail. Here are the most common pitfalls:
1. Incomplete Semantic Layer
If your semantic layer doesn’t cover the metrics and dimensions that users actually care about, they’ll work around it. The LLM will revert to generating raw SQL, and you’re back to the original problem.
Solution: Start with your most important metrics and expand iteratively. Don’t try to cover everything at once.
2. Stale Definitions
If your semantic layer definitions drift out of sync with your actual data model, the LLM will generate incorrect queries.
Solution: Treat your semantic layer as code. Version control it, test it, and maintain it as part of your data infrastructure.
3. Poor Documentation
If your semantic layer definitions are cryptic or poorly documented, even humans can’t understand them. The LLM certainly won’t.
Solution: Document every metric and dimension. Explain the business logic, the calculation method, and the intended use cases.
4. Inadequate Testing
If you don’t test your text-to-SQL system, you won’t catch errors until they corrupt your decision-making.
Solution: Implement automated testing. Validate that text-to-SQL queries return expected results. Monitor query accuracy in production.
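One lightweight shape for such tests: a set of canonical questions with fragments the generated SQL must contain (the governed metric expression, the mandatory filters). `generate_sql_for` below is a hypothetical entry point into your text-to-SQL system, stubbed here for illustration:

```python
# Golden cases: each question lists SQL fragments that must appear in
# the generated query. Fragments are illustrative.
GOLDEN_CASES = [
    {
        "question": "revenue last month by product category",
        "must_contain": ["SUM(orders.amount)", "orders.status = 'completed'"],
    },
]

def check_case(generate_sql_for, case: dict) -> list[str]:
    """Return the required fragments missing from the generated SQL."""
    sql = generate_sql_for(case["question"])
    return [frag for frag in case["must_contain"] if frag not in sql]

def stub(question: str) -> str:
    """Stand-in for a real text-to-SQL system."""
    return ("SELECT products.category, SUM(orders.amount) FROM orders "
            "WHERE orders.status = 'completed' GROUP BY 1")

assert check_case(stub, GOLDEN_CASES[0]) == []   # nothing missing
```

Run this in CI whenever the semantic layer or the prompt changes, so regressions surface as failing assertions rather than as wrong numbers on a dashboard.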
5. Over-Reliance on LLM Magic
The LLM is a tool, not a replacement for data governance. It can’t fix a broken semantic layer.
Solution: Invest in your semantic layer. Make it accurate, complete, and well-governed. The LLM will then work reliably.
The Business Case for Semantic Layers
Implementing a semantic layer requires upfront investment. Why is it worth it?
Reduced Time-to-Dashboard
Without a semantic layer, every new metric requires custom SQL, testing, and validation. With a semantic layer, new metrics are defined once and reused everywhere. This dramatically reduces the time it takes to build new dashboards and reports.
Improved Accuracy
Metrics defined in a semantic layer are consistent across all tools and use cases. No more discrepancies between the executive dashboard and the data warehouse. No more debates about which number is “right.”
Lower Operational Burden
With a semantic layer, your data team spends less time answering ad-hoc questions and more time building infrastructure. Text-to-SQL handles the routine questions; your team focuses on the complex ones.
Faster Decision-Making
When anyone can ask questions in natural language and get accurate answers immediately, decisions happen faster. No more waiting for the data team to write a query. No more wrong numbers corrupting your analysis.
Scalable Self-Service BI
Without a semantic layer, self-service BI leads to chaos—everyone writing their own SQL, generating their own metrics, and arriving at different answers. With a semantic layer, self-service BI is governed and reliable. Users can explore data without breaking things.
Conclusion: Semantic Layer First
Text-to-SQL is genuinely powerful. It can democratize data access and accelerate decision-making. But only if it’s built on a solid foundation.
That foundation is a semantic layer.
Without a semantic layer, text-to-SQL is a risky experiment that will fail spectacularly. With one, it’s a reliable system that scales.
If you’re considering a text-to-SQL project, don’t start with the LLM. Start with your semantic layer. Define your metrics. Govern your dimensions. Make your relationships explicit. Build your semantic layer first, and the text-to-SQL system will work.
For organizations evaluating managed alternatives to Looker, Tableau, and Power BI—or for teams embedding self-serve analytics into their products—D23 provides a managed Apache Superset platform with semantic layer governance built in. You get text-to-SQL that actually works, self-serve BI that’s reliable, and embedded analytics that scale, all without building the infrastructure yourself.
The future of analytics isn’t about smarter LLMs. It’s about better semantic layers. Get that right, and everything else falls into place.