BigQuery Geospatial Analytics for Logistics and Real Estate
Master BigQuery geospatial functions for route optimization, site selection, and location analytics. Real-world examples for logistics and CRE.
Understanding BigQuery Geospatial Analytics
Geospatial analytics has transformed how organizations make location-based decisions. Whether you’re optimizing delivery routes for a logistics network or evaluating commercial real estate sites, the ability to analyze geographic data at scale unlocks competitive advantage. BigQuery, Google Cloud’s enterprise data warehouse, provides native geospatial capabilities that let you perform complex spatial analysis without leaving your SQL environment.
The foundation of BigQuery’s geospatial power is the GEOGRAPHY data type. Unlike traditional latitude-longitude pairs stored as separate numeric columns, GEOGRAPHY represents points, lines, and polygons as single objects. This means you can perform spatial operations—distance calculations, intersection tests, containment checks—directly in your queries. The introduction to geospatial analytics in BigQuery covers the GEOGRAPHY data type and GoogleSQL functions that form the backbone of location analysis.
For data teams at scale-ups and mid-market companies, this native integration matters. You’re not exporting data to a separate GIS tool, waiting for analysis, then importing results back. You’re running spatial queries alongside your business logic in the same warehouse. That speed compounds. A logistics company that can test 50 route variations in minutes instead of hours makes better decisions. A real estate fund evaluating 200 potential sites simultaneously identifies opportunities competitors miss.
The GEOGRAPHY Data Type: Foundation for Spatial Analysis
Before diving into specific use cases, understand what makes GEOGRAPHY different from storing coordinates as separate columns. A GEOGRAPHY object encodes location information in Well-Known Text (WKT) or Well-Known Binary (WKB) format, standardized representations used across GIS systems.
BigQuery supports four primary GEOGRAPHY types:
Points represent single locations. A delivery address, a warehouse, a retail store—these are points. In WKT format: POINT(longitude latitude). Note the order: longitude first, then latitude. This trips up many analysts accustomed to lat/long conventions.
LineStrings represent paths or routes. A truck’s delivery sequence, a proposed pipeline route, a flight path—these are lines connecting multiple points. WKT format: LINESTRING(lon1 lat1, lon2 lat2, lon3 lat3).
Polygons represent enclosed areas. A delivery zone, a real estate development boundary, a city district—these are closed shapes. WKT format: POLYGON((lon1 lat1, lon2 lat2, lon3 lat3, lon1 lat1)). Note the polygon closes by repeating the first point.
MultiGeometries combine multiple objects of the same type. A logistics company with 50 warehouses might store all warehouse locations as a MULTIPOINT. A real estate portfolio with properties across five states might use MULTIPOLYGON for regional boundaries.
Creating GEOGRAPHY objects from raw coordinates is straightforward. If you have latitude and longitude columns, use ST_GEOGPOINT(longitude, latitude) to create a point. If you have WKT strings, use ST_GEOGFROMTEXT(wkt_string). The function names follow a consistent pattern: ST_ prefix (Spatial Type), then the operation.
Spatial Functions: The Operations That Matter
Once you have GEOGRAPHY objects, BigQuery’s spatial functions let you ask meaningful questions about location data. The most commonly used functions in logistics and real estate fall into a few categories.
Distance calculations answer “how far apart are these locations?” The function ST_DISTANCE(geog1, geog2) returns distance in meters. For a logistics company, you’d calculate distances between delivery points to optimize routes. For real estate, you’d measure distance from a potential site to transit hubs, competitor properties, or demographic centers.
SELECT
warehouse_name,
delivery_address,
ST_DISTANCE(
ST_GEOGPOINT(warehouse_lon, warehouse_lat),
ST_GEOGPOINT(delivery_lon, delivery_lat)
) / 1000 AS distance_km
FROM logistics_network
ORDER BY distance_km;
This query calculates the straight-line distance (as the crow flies) from each warehouse to each delivery point. In practice, you’d combine this with routing APIs for actual driving distances, but the spatial calculation gives you a fast first filter.
Containment tests answer “is this point inside that area?” The function ST_CONTAINS(polygon, point) returns true if the point falls within the polygon. For logistics, you’d check whether a delivery address falls within a service zone. For real estate, you’d verify whether a site candidate falls within a target market or zoning district.
SELECT
property_id,
property_address,
CASE
WHEN ST_CONTAINS(downtown_zone, property_location) THEN 'Downtown'
WHEN ST_CONTAINS(midtown_zone, property_location) THEN 'Midtown'
WHEN ST_CONTAINS(uptown_zone, property_location) THEN 'Uptown'
ELSE 'Other'
END AS zone
FROM real_estate_candidates;
This approach scales. You can test a property against dozens of zones in a single query, assigning it to the correct region instantly.
Intersection tests answer “do these areas overlap?” The function ST_INTERSECTS(geog1, geog2) returns true if two geometries share any space. For logistics, you’d identify delivery zones that overlap, revealing redundancy. For real estate, you’d find sites that intersect with multiple market segments or zoning categories.
Buffer operations create zones around locations. ST_BUFFER(geog, distance_meters) expands a point into a circle or a line into a corridor. For logistics, you’d create a 5-kilometer buffer around a warehouse to identify all addresses within reasonable delivery distance. For real estate, you’d create a buffer around transit stations to find sites within walking distance.
SELECT
property_id,
property_address,
ST_INTERSECTS(
property_location,
ST_BUFFER(transit_hub, 800) -- 800 meter walk radius
) AS within_walking_distance
FROM real_estate_candidates
CROSS JOIN transit_hubs;
Aggregation functions summarize spatial data. ST_UNION_AGG(geog) combines multiple geometries into a single object. For logistics, you’d union all delivery points in a zone to understand the overall service area shape. For real estate, you’d union all candidate sites to visualize portfolio distribution.
These functions form the toolkit for spatial analysis. The tutorial on geospatial analysis in BigQuery walks through practical examples of each function in context.
Logistics Route Optimization: From Theory to Practice
Route optimization is where geospatial analytics directly impacts the bottom line. A logistics company with 100 daily deliveries across a city can’t manually plan routes. BigQuery geospatial functions automate the analysis, identifying inefficiencies and opportunities.
Start with the basics: cluster deliveries by proximity. Instead of assigning routes randomly, group addresses that are close together.
WITH delivery_clusters AS (
SELECT
delivery_id,
address,
location,
ST_CLUSTERDBSCAN(location, 500, 5) OVER () AS cluster_id
FROM daily_deliveries
WHERE delivery_date = CURRENT_DATE()
)
SELECT
cluster_id,
COUNT(*) AS deliveries_in_cluster,
ST_CENTROID(ST_UNION_AGG(location)) AS cluster_center
FROM delivery_clusters
GROUP BY cluster_id
ORDER BY deliveries_in_cluster DESC;
This query uses ST_CLUSTERDBSCAN, a clustering algorithm that groups nearby points. The parameters 500 (meters) and 5 (minimum points) define the cluster boundaries. Points within 500 meters of at least 5 other points form a cluster. The result: delivery addresses automatically grouped by geography.
Next, calculate the distance matrix between key locations—warehouses and cluster centers. This tells you which warehouse should service which cluster.
WITH cluster_summary AS (
SELECT
cluster_id,
ST_CENTROID(ST_UNION_AGG(location)) AS cluster_center,
COUNT(*) AS delivery_count
FROM daily_deliveries
WHERE delivery_date = CURRENT_DATE()
GROUP BY cluster_id
)
SELECT
w.warehouse_name,
c.cluster_id,
ST_DISTANCE(w.location, c.cluster_center) / 1000 AS distance_km,
c.delivery_count
FROM warehouses w
CROSS JOIN cluster_summary c
ORDER BY w.warehouse_name, distance_km;
This distance matrix reveals which warehouse is closest to each cluster. Assign clusters to the nearest warehouse, and you’ve reduced travel time before optimizing individual routes.
For more sophisticated analysis, layer in real-world constraints. Not all routes are created equal. A delivery in a congested downtown area takes longer than one in a suburban zone. A delivery requiring elevator access differs from street-level delivery. BigQuery lets you encode these constraints as additional columns and optimize accordingly.
The key insight: geospatial functions in BigQuery handle the geometric heavy lifting. You focus on business logic—cost, time, constraints—while the database handles distance calculations and spatial relationships. This separation of concerns makes iteration fast. Change a constraint, re-run the query, see results in seconds.
Real Estate Site Selection: Data-Driven Location Strategy
Real estate professionals evaluate sites based on dozens of factors. Location is paramount. Is the site near transit? What’s the demographic profile within a mile radius? Are there competitor properties nearby? How does the site rank against other candidates?
Geospatial analytics in BigQuery turns these questions into queries. Start by loading candidate properties and relevant geographic datasets. New geospatial datasets in BigQuery from Earth Engine and Google Maps Platform provide rich location context—land use, elevation, population density, and more.
Evaluate proximity to transit infrastructure:
WITH transit_buffers AS (
SELECT
transit_line,
ST_BUFFER(transit_route, 800) AS walkable_zone -- 800m walk radius
FROM transit_network
)
SELECT
p.property_id,
p.address,
COUNT(DISTINCT t.transit_line) AS nearby_transit_lines,
STRING_AGG(t.transit_line) AS transit_types
FROM real_estate_candidates p
JOIN transit_buffers t
ON ST_INTERSECTS(p.location, t.walkable_zone)
GROUP BY p.property_id, p.address
ORDER BY nearby_transit_lines DESC;
This query identifies which candidate sites intersect with walkable zones around transit. Properties with multiple transit options rank higher—they attract tenants and customers who prefer not to drive.
Analyze competitive landscape by measuring distance to existing tenant locations:
WITH competitor_analysis AS (
SELECT
candidate.property_id,
candidate.address,
competitor.competitor_name,
ST_DISTANCE(candidate.location, competitor.location) / 1000 AS distance_km,
ROW_NUMBER() OVER (
PARTITION BY candidate.property_id
ORDER BY ST_DISTANCE(candidate.location, competitor.location)
) AS competitor_rank
FROM real_estate_candidates candidate
CROSS JOIN existing_competitors competitor
)
SELECT
property_id,
address,
ROUND(AVG(distance_km), 2) AS avg_distance_to_competitors_km,
MAX(CASE WHEN competitor_rank = 1 THEN distance_km END) AS nearest_competitor_km,
MAX(CASE WHEN competitor_rank = 1 THEN competitor_name END) AS nearest_competitor
FROM competitor_analysis
GROUP BY property_id, address
ORDER BY avg_distance_to_competitors_km DESC;
This reveals competitive positioning. A site far from competitors may indicate an untapped market or a location no one wants. Context matters—you’d interpret results differently for a luxury retailer (wants prestige location, avoids mass-market competitors) versus a quick-service restaurant (wants high foot traffic, tolerates nearby similar concepts).
For sophisticated real estate analysis, integrate Places Insights in BigQuery, which provides statistical geospatial data on places for retail, hospitality, and location-based decisions. This dataset includes foot traffic patterns, visitor demographics, and dwell times—invaluable for site selection.
Create a composite score ranking all candidates:
WITH scoring AS (
SELECT
p.property_id,
p.address,
-- Transit accessibility (0-25 points)
LEAST(25, COALESCE(transit_score.nearby_lines, 0) * 5) AS transit_points,
-- Competitive position (0-25 points)
LEAST(25, GREATEST(0, 25 - (competitive.avg_distance / 100))) AS competition_points,
-- Demographics (0-25 points)
CASE
WHEN demo.median_income > 100000 THEN 25
WHEN demo.median_income > 75000 THEN 20
WHEN demo.median_income > 50000 THEN 15
ELSE 10
END AS demographic_points,
-- Zoning compliance (0-25 points)
CASE WHEN zoning.compliant THEN 25 ELSE 0 END AS zoning_points
FROM real_estate_candidates p
LEFT JOIN transit_score ON p.property_id = transit_score.property_id
LEFT JOIN competitive ON p.property_id = competitive.property_id
LEFT JOIN demographics demo ON ST_CONTAINS(demo.zone, p.location)
LEFT JOIN zoning ON ST_CONTAINS(zoning.zone, p.location)
)
SELECT
property_id,
address,
transit_points + competition_points + demographic_points + zoning_points AS total_score,
RANK() OVER (ORDER BY total_score DESC) AS rank
FROM scoring
ORDER BY total_score DESC;
This composite scoring system combines multiple factors into a single ranking. Different weights reflect business priorities—a retailer might weight demographics heavily, while a logistics hub prioritizes transit and zoning.
Integrating Geospatial Data with Your Analytics Stack
Geospatial analysis in BigQuery is powerful, but it’s most valuable when integrated into your broader analytics workflow. At D23, we help teams embed geospatial analytics into self-serve BI dashboards and AI-powered analytics products.
Consider this workflow: Your logistics team runs the clustering and route optimization queries in BigQuery. The results—optimized routes, cluster assignments, distance metrics—flow into a dashboard built on Apache Superset. Dispatchers see routes on a map, with real-time updates as conditions change. Managers see performance metrics: average delivery time, routes completed on schedule, cost per delivery.
For real estate teams, site evaluation queries produce a ranking. That ranking appears in a dashboard showing candidate properties on a map, with color-coding based on composite score. Clicking a property reveals detailed analysis: nearby transit, competitor proximity, demographic data, zoning status. Decision-makers make choices based on complete information, visualized intuitively.
The technical bridge between BigQuery and visualization is your BI platform. BigQuery can export GEOGRAPHY objects as GeoJSON, a standard format for web-based mapping. Most modern BI tools—including Apache Superset—support GeoJSON visualization. CARTO’s guide to spatial analysis in BigQuery shows how to integrate advanced spatial analysis with visualization and decision-making.
For teams embedding analytics into products, the workflow differs slightly. Your product’s backend queries BigQuery geospatial functions via API, retrieves results, and renders them in your application. A logistics SaaS platform might let customers upload their delivery addresses, then automatically calculate optimal routes using BigQuery geospatial functions. A real estate platform might let users explore candidate sites with interactive maps powered by geospatial queries.
This architecture scales because BigQuery handles the computational load. Geospatial queries on millions of points complete in seconds. Your application layer stays responsive, focusing on user experience rather than spatial computation.
Advanced Techniques: From Basics to Production
Once you master the fundamentals, several advanced techniques unlock deeper insights.
Spatial joins combine datasets based on geographic relationships rather than matching IDs. Instead of “join orders to customers where customer_id matches,” you’d join orders to regions where “delivery address falls within region boundary.” This is powerful for aggregating data by geography.
SELECT
region.region_name,
COUNT(DISTINCT order.order_id) AS orders,
SUM(order.revenue) AS total_revenue,
AVG(order.delivery_days) AS avg_delivery_time
FROM orders order
JOIN regions region
ON ST_CONTAINS(region.boundary, order.delivery_location)
GROUP BY region.region_name
ORDER BY total_revenue DESC;
Time-series geospatial analysis tracks how locations change over time. A logistics company might analyze how delivery density shifts across hours of the day. A real estate fund might track property value trends in specific geographic corridors.
WITH hourly_density AS (
SELECT
EXTRACT(HOUR FROM delivery_time) AS hour,
ST_GEOHASH(delivery_location, 6) AS geohash,
COUNT(*) AS delivery_count
FROM daily_deliveries
GROUP BY hour, geohash
)
SELECT
hour,
geohash,
delivery_count,
LEAD(delivery_count) OVER (
PARTITION BY geohash
ORDER BY hour
) AS next_hour_count
FROM hourly_density
ORDER BY hour, delivery_count DESC;
Geohashing divides the world into grid cells, making spatial aggregation efficient. This query shows delivery volume by location and hour, revealing peak demand times and places.
Machine learning on geospatial features combines spatial analysis with predictive modeling. BigQuery ML lets you train models that predict outcomes based on location. Will a delivery complete on time based on its location and current traffic? Will a real estate site succeed based on its proximity to transit, competitors, and demographics?
CREATE OR REPLACE MODEL logistics.delivery_success_predictor
OPTIONS(
model_type='LINEAR_REG',
input_label_cols=['on_time']
) AS
SELECT
delivery_id,
ST_DISTANCE(warehouse_location, delivery_location) / 1000 AS distance_km,
EXTRACT(HOUR FROM delivery_time) AS hour_of_day,
traffic_level,
on_time
FROM historical_deliveries
WHERE delivery_date BETWEEN '2023-01-01' AND '2023-12-31';
This model learns which spatial and temporal factors predict on-time delivery. You can then use it to score new deliveries, identifying high-risk routes before they happen.
Practical Considerations: Performance and Cost
Geospatial queries are powerful but can be resource-intensive. A few practical considerations ensure your analysis remains fast and cost-effective.
Index your GEOGRAPHY columns. BigQuery supports spatial indexes that accelerate containment and intersection tests. For large datasets, this makes the difference between a query that completes in seconds versus minutes.
Use ST_SIMPLIFY to reduce geometry complexity. Real-world polygon boundaries often have thousands of points (think of a city boundary with intricate coastlines). Simplifying reduces storage and query time without sacrificing accuracy for most analyses.
SELECT
region_id,
region_name,
ST_SIMPLIFY(boundary, 100) -- Simplify within 100m tolerance
AS simplified_boundary
FROM regions;
Materialize intermediate results. For complex analyses with multiple spatial operations, calculate intermediate results and store them. This avoids recalculating the same spatial relationships repeatedly.
Monitor query costs. BigQuery charges by data scanned, not query count. A geospatial query scanning a large table costs more than a simple filter. Use EXPLAIN to understand query execution and optimize accordingly.
Visualization and Decision-Making
Geospatial analysis in BigQuery produces numbers and geometries. Visualization turns them into insights.
For logistics, map-based dashboards showing routes, clusters, and performance metrics help dispatchers and managers make fast decisions. A heat map showing delivery density by hour and location reveals patterns that raw numbers hide. A route visualization showing actual paths taken versus optimal paths quantifies inefficiency.
For real estate, map-based property scoring helps investors compare candidates. A dashboard showing candidate sites color-coded by composite score, with overlays for transit, competitors, and demographics, enables data-driven site selection. Clicking a property reveals detailed analysis supporting the recommendation.
The tutorial on visualizing BigQuery geospatial data in Colab demonstrates how to analyze and visualize geospatial results using Python and web-based mapping libraries. For production dashboards, practical examples of geospatial analytics show integration patterns with BI tools.
At D23, we help teams build dashboards that make geospatial insights actionable. Our platform integrates with BigQuery, letting you create interactive maps and dashboards without leaving your analytics environment. For teams embedding analytics into products, our API-first approach supports geospatial visualization at scale.
Building Your Geospatial Analytics Practice
Geospatial analytics isn’t a one-time analysis—it’s an ongoing practice. Start small: pick a single use case (route optimization or site evaluation), build the queries, validate results against real-world outcomes, then expand.
Invest in data quality. Geospatial analysis is only as good as your location data. Ensure delivery addresses and property coordinates are accurate. Use geocoding services to standardize address formats and resolve ambiguities.
Build institutional knowledge. Geospatial analysis requires different thinking than traditional analytics. Your team needs to understand spatial relationships, distance calculations, and geometric operations. Invest in training and documentation.
Integrate with decision-making processes. The most sophisticated analysis is worthless if it doesn’t influence decisions. Build dashboards and reports that decision-makers actually use. Iterate based on feedback.
For teams standardizing analytics across organizations—private equity firms managing portfolio companies, venture capital firms tracking fund performance—geospatial analytics becomes a competitive advantage. The ability to evaluate sites, optimize logistics, and understand geographic patterns at scale drives value creation.
Conclusion: Location Intelligence at Scale
BigQuery geospatial analytics transforms how organizations use location data. Native GEOGRAPHY support, comprehensive spatial functions, and integration with Google Cloud’s ecosystem make complex spatial analysis accessible to any team with SQL skills.
For logistics companies, geospatial analytics optimizes routes, reduces costs, and improves service. For real estate teams, it enables data-driven site selection and portfolio analysis. For any organization making location-based decisions, it provides competitive advantage.
The technical foundation is solid: GEOGRAPHY data types, spatial functions like ST_DISTANCE and ST_CONTAINS, clustering and aggregation operations. The business applications are clear: route optimization, site selection, competitive analysis, demographic analysis, and predictive modeling.
Start with BigQuery’s geospatial documentation, build your first query, and iterate. The insights you uncover will drive decisions that impact your bottom line. For teams building on this foundation with dashboards and embedded analytics, D23’s platform provides the visualization and decision-making layer that turns geospatial data into action.
Geospatial analytics is no longer a specialized domain. It’s a core capability for data-driven organizations. Master it, and you’ll make better decisions faster.