This tool helps you audit your GA4 BigQuery data using pre-built SQL queries. The export schema is consistent for all GA4 users, but you can refer to this official documentation to understand how the data is structured.
Click โโก Update Queriesโ after entering your project details to generate ready-to-use queries.
Below you'll find a collection of ready-to-use GA4 BigQuery audit queries โ organised by theme. Each one is designed to help you investigate tracking issues, spot unusual patterns, and surface strategic insights. Click a section to expand and copy any query to your clipboard.
Identifies sessions with unusually short (<2s) or long (>30min) durations that may indicate bots or tracking issues.
WITH session_data AS ( SELECT user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS INT64) AS ga_session_id, event_timestamp FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ) SELECT CONCAT(user_pseudo_id, '-', CAST(ga_session_id AS STRING)) AS session_id, user_pseudo_id, MIN(event_timestamp) AS session_start, MAX(event_timestamp) AS session_end, COUNT(*) AS event_count, ROUND((MAX(event_timestamp) - MIN(event_timestamp)) / 1000000, 2) AS session_duration_seconds FROM session_data GROUP BY user_pseudo_id, ga_session_id HAVING event_count > 1 AND (session_duration_seconds < 2 OR session_duration_seconds > 1800) ORDER BY session_duration_seconds
Flags users with over 100 distinct sessions in a single day, a pattern often associated with non-human activity.
SELECT user_pseudo_id, event_date, COUNT(DISTINCT CONCAT(user_pseudo_id, '-', CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING))) AS session_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY user_pseudo_id, event_date HAVING session_count > 100 ORDER BY session_count DESC
Detects sessions with only one repeated event name occurring more than 50 times โ a typical sign of automation or malfunction.
WITH session_data AS ( SELECT user_pseudo_id, event_name, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS INT64) AS ga_session_id FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ), session_summary AS ( SELECT user_pseudo_id, ga_session_id, CONCAT(user_pseudo_id, '-', CAST(ga_session_id AS STRING)) AS session_id, COUNT(*) AS total_events, COUNT(DISTINCT event_name) AS unique_event_names, ARRAY_AGG(event_name ORDER BY event_name LIMIT 1)[OFFSET(0)] AS repeated_event FROM session_data GROUP BY user_pseudo_id, ga_session_id ), event_counts AS ( SELECT user_pseudo_id, ga_session_id, event_name, COUNT(*) AS event_name_count FROM session_data GROUP BY user_pseudo_id, ga_session_id, event_name ) SELECT s.session_id, s.user_pseudo_id, s.total_events, s.unique_event_names, s.repeated_event, ec.event_name_count AS repeated_event_count FROM session_summary s LEFT JOIN event_counts ec ON s.user_pseudo_id = ec.user_pseudo_id AND s.ga_session_id = ec.ga_session_id AND s.repeated_event = ec.event_name WHERE s.unique_event_names = 1 AND s.total_events > 50 ORDER BY s.total_events DESC
Finds the most common pages users visited immediately before landing on a 404 error page. Helps identify broken internal links or navigation issues.
WITH page_views AS ( SELECT user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS INT64) AS ga_session_id, event_timestamp, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location') AS page_location FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'page_view' ), broken_pages AS ( SELECT user_pseudo_id, ga_session_id, event_timestamp AS broken_time, page_location AS broken_page FROM page_views WHERE page_location LIKE '%/404%' ), valid_pages AS ( SELECT user_pseudo_id, ga_session_id, event_timestamp AS valid_time, page_location AS previous_page FROM page_views WHERE page_location NOT LIKE '%/404%' ) SELECT CONCAT(bp.user_pseudo_id, '-', CAST(bp.ga_session_id AS STRING)) AS session_id, bp.broken_page, vp.previous_page, COUNT(*) AS hits FROM broken_pages bp JOIN valid_pages vp ON bp.user_pseudo_id = vp.user_pseudo_id AND bp.ga_session_id = vp.ga_session_id AND vp.valid_time < bp.broken_time WHERE NOT EXISTS ( SELECT 1 FROM valid_pages vp2 WHERE vp2.user_pseudo_id = vp.user_pseudo_id AND vp2.ga_session_id = vp.ga_session_id AND vp2.valid_time < bp.broken_time AND vp2.valid_time > vp.valid_time ) GROUP BY session_id, bp.broken_page, vp.previous_page ORDER BY hits DESC
Tracks the daily volume of page views to 404 pages, allowing you to monitor spikes in broken traffic over time.
SELECT PARSE_DATE('%Y%m%d', event_date) AS date, COUNT(*) AS hits_404 FROM `{{PROJECT_ID}}.{{DATASET}}.events_*`, UNNEST(event_params) AS event_params WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'page_view' AND event_params.key = 'page_location' AND event_params.value.string_value LIKE '%/404%' GROUP BY date ORDER BY date
Shows which marketing sources and mediums are driving users to 404 pages, helping isolate issues caused by campaign links.
SELECT traffic_source.source, traffic_source.medium, event_params.value.string_value AS broken_page, COUNT(*) AS hits FROM `{{PROJECT_ID}}.{{DATASET}}.events_*`, UNNEST(event_params) AS event_params WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'page_view' AND event_params.key = 'page_location' AND event_params.value.string_value LIKE '%/404%' GROUP BY traffic_source.source, traffic_source.medium, broken_page ORDER BY hits DESC
Identifies sessions where source or medium values are missing or marked as '(not set)', which can indicate tracking gaps or misconfigured campaign links.
SELECT DISTINCT traffic_source.source, traffic_source.medium, traffic_source.name AS campaign FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND ( traffic_source.medium IS NULL OR traffic_source.medium = '(not set)' OR traffic_source.source IS NULL OR traffic_source.source = '(not set)' )
Counts the number of unique campaign names, sources, and mediums in your data. A very high count could suggest tag pollution or inconsistent naming conventions.
SELECT COUNT(DISTINCT traffic_source.name) AS unique_campaigns, COUNT(DISTINCT traffic_source.source) AS unique_sources, COUNT(DISTINCT traffic_source.medium) AS unique_mediums FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}'
Finds sessions that include a gclid
parameter (indicating Google Ads traffic) but don't list Google as the source, suggesting broken auto-tagging or incorrect attribution.
SELECT traffic_source.source, traffic_source.medium, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'gclid') AS gclid, COUNT(*) AS sessions FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'gclid') IS NOT NULL AND LOWER(traffic_source.source) NOT LIKE '%google%' GROUP BY traffic_source.source, traffic_source.medium, gclid ORDER BY sessions DESC
Counts daily sessions where traffic_source.source
contains 'chatgpt'
, helping you track AIโdriven traffic trends over time.
WITH sessions AS ( SELECT CAST( (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' LIMIT 1 ) AS INT64 ) AS ga_session_id, DATE(TIMESTAMP_MICROS(MIN(event_timestamp))) AS session_date FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE LOWER(traffic_source.source) LIKE '%chatgpt%' AND _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY ga_session_id ) SELECT session_date, COUNT(*) AS sessions_count FROM sessions GROUP BY session_date ORDER BY session_date DESC;
This query compares the volume of sessions by consent status (ad_storage / analytics_storage) with the number of purchases in each group. It also highlights completely anonymous pings with no user or session identifiersโoften resulting from users declining consent or browser limitations.
WITH all_events AS ( SELECT user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS INT64) AS ga_session_id, event_name, privacy_info.analytics_storage AS analytics_storage, privacy_info.ads_storage AS ads_storage FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ), sessions AS ( SELECT CONCAT(user_pseudo_id, '-', CAST(ga_session_id AS STRING)) AS session_id, user_pseudo_id, ga_session_id, analytics_storage, ads_storage FROM all_events WHERE user_pseudo_id IS NOT NULL AND ga_session_id IS NOT NULL GROUP BY session_id, user_pseudo_id, ga_session_id, analytics_storage, ads_storage ), purchases AS ( SELECT DISTINCT CONCAT(user_pseudo_id, '-', CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) AS session_id FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'purchase' ), session_summary AS ( SELECT 'session_group' AS type, analytics_storage, ads_storage, COUNT(*) AS count, COUNT(*) AS total_sessions, COUNTIF(session_id IN (SELECT session_id FROM purchases)) AS total_purchases FROM sessions GROUP BY analytics_storage, ads_storage ), cookieless_summary AS ( SELECT 'cookieless_pings' AS type, CAST(NULL AS STRING) AS analytics_storage, CAST(NULL AS STRING) AS ads_storage, COUNT(*) AS count, CAST(NULL AS INT64) AS total_sessions, CAST(NULL AS INT64) AS total_purchases FROM all_events WHERE user_pseudo_id IS NULL AND ga_session_id IS NULL ) SELECT * FROM session_summary UNION ALL SELECT * FROM cookieless_summary ORDER BY count DESC
Identifies events with a high average number of parameters, which may cause row bloat, higher processing costs, or tracking inefficiencies. Useful for flagging events that may be over-engineered.
SELECT event_name, ROUND(AVG(param_count), 2) AS avg_params, MAX(param_count) AS max_params, COUNT(*) AS total_events FROM ( SELECT event_name, ARRAY_LENGTH(event_params) AS param_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ) GROUP BY event_name HAVING avg_params > 10 ORDER BY avg_params DESC
Returns the most frequent non-standard GA4 events, excluding built-in ones like page_view
or session_start
. Helps you review your most active custom tracking points.
SELECT event_name, COUNT(*) AS event_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name NOT IN ( 'page_view', 'session_start', 'user_engagement', 'scroll', 'click', 'view_search_results', 'purchase', 'add_to_cart', 'begin_checkout' ) GROUP BY event_name ORDER BY event_count DESC LIMIT 20
Audits key eCommerce events like purchase
, add_to_cart
, etc., for missing or invalid required parameters. It also checks if items
arrays are missing entirely from purchase events.
-- eCommerce parameter audit and empty item detection [See query above]
Finds parameters with placeholder, null, or otherwise suspicious values like 'undefined'
or 'nan'
. Useful for identifying tagging errors or poorly populated fields.
SELECT event_name, param.key AS parameter_name, param.value.string_value AS suspicious_value, COUNT(*) AS occurrences FROM `{{PROJECT_ID}}.{{DATASET}}.events_*`, UNNEST(event_params) AS param WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND LOWER(param.value.string_value) IN ('null', 'undefined', 'nan', '') GROUP BY event_name, parameter_name, suspicious_value ORDER BY occurrences DESC
Extracts all page_view
events where any parameter key is a common PII name (email, phone, etc.) or the value matches an email pattern.
SELECT event_timestamp, user_pseudo_id, (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location' ) AS page_location, ep.key AS param_key, ep.value.string_value AS param_value FROM `{{PROJECT_ID}}.{{DATASET}}.events_*`, UNNEST(event_params) AS ep WHERE event_name = 'page_view' AND ( LOWER(ep.key) IN ('email', 'phone', 'name', 'address') OR REGEXP_CONTAINS(ep.value.string_value, r'[\w\.-]+@[\w\.-]+\.\w+') ) LIMIT 1000;
Shows session counts, purchases, conversion rate, and traffic share for each device/platform/browser/OS combination. Use it to compare how well different devices convert and which contribute most to your traffic.
WITH sessions AS ( SELECT user_pseudo_id, CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, platform, device.web_info.browser AS browser, device.operating_system AS os, device.category AS device_type, COUNTIF(event_name = 'purchase') AS purchases FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY user_pseudo_id, session_id, platform, browser, os, device_type ) SELECT platform, browser, os, device_type, COUNT(*) AS sessions, SUM(purchases) AS conversions, ROUND(SAFE_DIVIDE(SUM(purchases), COUNT(*)) * 100, 2) AS conversion_rate, ROUND(SAFE_DIVIDE(COUNT(*), SUM(COUNT(*)) OVER ()) * 100, 2) AS traffic_share_pct FROM sessions GROUP BY platform, browser, os, device_type ORDER BY traffic_share_pct DESC
Flags unusual or suspicious combinations of browsers and operating systems that don't typically appear together. Helpful for bot detection or debugging client-side implementation errors.
WITH event_device_data AS ( SELECT device.web_info.browser AS browser, device.operating_system AS operating_system, COUNT(*) AS event_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY browser, operating_system ), known_valid_pairs AS ( SELECT 'Chrome' AS browser, 'Windows' AS operating_system UNION ALL SELECT 'Chrome', 'macOS' UNION ALL SELECT 'Chrome', 'Android' UNION ALL SELECT 'Chrome', 'iOS' UNION ALL SELECT 'Chrome', 'Chrome OS' UNION ALL SELECT 'Safari', 'iOS' UNION ALL SELECT 'Safari', 'macOS' UNION ALL SELECT 'Safari', 'Macintosh' UNION ALL SELECT 'Safari (in-app)', 'iOS' UNION ALL SELECT 'Edge', 'Windows' UNION ALL SELECT 'Firefox', 'Windows' UNION ALL SELECT 'Firefox', 'macOS' UNION ALL SELECT 'Samsung Internet', 'Android' UNION ALL SELECT 'Internet Explorer', 'Windows' UNION ALL SELECT 'Android Webview', 'Android' UNION ALL SELECT 'Android Browser', 'Android' ) SELECT e.browser, e.operating_system, e.event_count, IF(k.browser IS NULL, 'โ ๏ธ Unusual Combination', 'Valid') AS validity FROM event_device_data e LEFT JOIN known_valid_pairs k ON e.browser = k.browser AND e.operating_system = k.operating_system ORDER BY validity DESC, e.event_count DESC
Identifies transaction IDs that appear in more than one session or on multiple dates โ a potential sign of deduplication issues or tracking bugs.
SELECT (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'transaction_id') AS transaction_id, COUNT(*) AS transaction_count, COUNT(DISTINCT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING))) AS unique_sessions, COUNT(DISTINCT event_date) AS unique_days FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'purchase' GROUP BY transaction_id HAVING transaction_count > 1 ORDER BY transaction_count DESC
Compares GA4 revenue values (based on the 'value' parameter in purchase events) against your backend data to identify discrepancies.
SELECT ga.transaction_id, ga.ga4_revenue, backend.revenue AS backend_revenue, ga.ga4_revenue - backend.revenue AS variance, ROUND(SAFE_DIVIDE(ga.ga4_revenue - backend.revenue, backend.revenue) * 100, 2) AS variance_pct FROM ( SELECT (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'transaction_id') AS transaction_id, SUM((SELECT value.double_value FROM UNNEST(event_params) WHERE key = 'value')) AS ga4_revenue FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'purchase' GROUP BY transaction_id ) ga JOIN `your_project.your_dataset.backend_transactions` AS backend ON ga.transaction_id = backend.transaction_id ORDER BY ABS(variance_pct) DESC
Flags sessions where a purchase occurred but the user did not trigger an add_to_cart
or begin_checkout
event โ often caused by broken funnel instrumentation or third-party integrations.
WITH all_events_with_sessions AS ( SELECT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, user_pseudo_id, event_name FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ), session_event_counts AS ( SELECT session_id, COUNTIF(event_name = 'add_to_cart') AS add_to_cart_count, COUNTIF(event_name = 'begin_checkout') AS begin_checkout_count, COUNTIF(event_name = 'purchase') AS purchase_count FROM all_events_with_sessions GROUP BY session_id ), purchasing_sessions AS ( SELECT * FROM session_event_counts WHERE purchase_count > 0 ) SELECT COUNT(*) AS purchasing_sessions, COUNTIF(add_to_cart_count = 0) AS sessions_with_no_add_to_cart, COUNTIF(begin_checkout_count = 0) AS sessions_with_no_begin_checkout FROM purchasing_sessions
Checks for purchase events where the items
array is either missing or empty. These transactions will show up in GA4 totals but wonโt include product-level data.
SELECT COUNT(DISTINCT ( SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'transaction_id' )) AS transactions_without_items FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'purchase' AND (items IS NULL OR ARRAY_LENGTH(items) = 0)
Categorises sessions based on the number of events they contain. Useful for identifying sessions that are unusually light or heavy in event activity, which can indicate bots, passive sessions, or broken tracking.
WITH sessions AS ( SELECT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, COUNT(*) AS total_events FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY session_id, user_pseudo_id ), bucketed_sessions AS ( SELECT CASE WHEN total_events < 50 THEN '< 50' WHEN total_events BETWEEN 50 AND 99 THEN '50โ99' WHEN total_events BETWEEN 100 AND 149 THEN '100โ149' ELSE '150+' END AS event_bracket FROM sessions ) SELECT event_bracket, COUNT(*) AS session_count FROM bucketed_sessions GROUP BY event_bracket ORDER BY CASE event_bracket WHEN '< 50' THEN 1 WHEN '50โ99' THEN 2 WHEN '100โ149' THEN 3 WHEN '150+' THEN 4 END
Lists the first event triggered in each session and counts how often each event was the first interaction. Helpful for identifying unexpected session entry points or broken landing page logic.
WITH events_with_session_id AS ( SELECT user_pseudo_id, CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, event_name, event_timestamp FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ), first_event_per_session AS ( SELECT session_id, event_name AS first_event FROM ( SELECT session_id, event_name, ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY event_timestamp) AS rn FROM events_with_session_id ) WHERE rn = 1 ) SELECT first_event AS event_name, COUNT(DISTINCT session_id) AS sessions_starting_with_event FROM first_event_per_session GROUP BY first_event ORDER BY sessions_starting_with_event DESC
Calculates the percentage of sessions that lack key engagement events like user_engagement
, scroll
, or click
, and only have one or two total events. These may be bounces or sessions from passive users or bots.
WITH sessions AS ( SELECT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, ARRAY_AGG(DISTINCT event_name) AS event_list, COUNT(DISTINCT event_name) AS unique_events FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY session_id ), summary AS ( SELECT COUNT(*) AS total_sessions, COUNTIF( NOT 'user_engagement' IN UNNEST(event_list) AND NOT 'scroll' IN UNNEST(event_list) AND NOT 'click' IN UNNEST(event_list) AND unique_events <= 2 ) AS non_engaged_sessions FROM sessions ) SELECT total_sessions, non_engaged_sessions, ROUND(SAFE_DIVIDE(non_engaged_sessions, total_sessions) * 100, 2) AS non_engaged_pct FROM summary
Aggregates all yearsโ data by calendar month to compute average sessions and transactions per month, then derives seasonality multipliers (and conversionโrate index) rounded to two decimals.
WITH sessions_per_month_year AS ( SELECT FORMAT_TIMESTAMP('%Y-%m', TIMESTAMP_MICROS(event_timestamp)) AS month_year, EXTRACT(MONTH FROM TIMESTAMP_MICROS(event_timestamp)) AS month_num, FORMAT_TIMESTAMP('%B', TIMESTAMP_MICROS(event_timestamp)) AS month_name, COUNT(DISTINCT CAST( (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' LIMIT 1 ) AS INT64 )) AS sessions_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE event_name = 'session_start' GROUP BY month_year, month_num, month_name ), transactions_per_month_year AS ( SELECT FORMAT_TIMESTAMP('%Y-%m', TIMESTAMP_MICROS(event_timestamp)) AS month_year, EXTRACT(MONTH FROM TIMESTAMP_MICROS(event_timestamp)) AS month_num, FORMAT_TIMESTAMP('%B', TIMESTAMP_MICROS(event_timestamp)) AS month_name, COUNT(*) AS transactions_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE event_name = 'purchase' GROUP BY month_year, month_num, month_name ), combined_month_year AS ( SELECT s.month_year, s.month_num, s.month_name, s.sessions_count, COALESCE(t.transactions_count, 0) AS transactions_count FROM sessions_per_month_year s LEFT JOIN transactions_per_month_year t USING(month_year, month_num, month_name) ), stats AS ( SELECT COUNT(*) AS months_count, SUM(sessions_count) / COUNT(*) AS overall_avg_sessions, SUM(transactions_count) / COUNT(*) AS overall_avg_transactions, SAFE_DIVIDE(SUM(transactions_count), SUM(sessions_count)) AS overall_avg_conv_rate FROM combined_month_year ), agg_by_month AS ( SELECT month_num, month_name, AVG(sessions_count) AS avg_sessions_by_month, AVG(transactions_count) AS avg_transactions_by_month, SAFE_DIVIDE( AVG(transactions_count), AVG(sessions_count) ) AS avg_conv_rate_by_month FROM combined_month_year GROUP BY month_num, month_name ) SELECT month_name AS month, ROUND( SAFE_DIVIDE(avg_sessions_by_month, stats.overall_avg_sessions), 2 ) AS session_seasonality, ROUND( SAFE_DIVIDE(avg_transactions_by_month, stats.overall_avg_transactions), 2 ) AS transaction_seasonality, ROUND( SAFE_DIVIDE(avg_conv_rate_by_month, stats.overall_avg_conv_rate), 2 ) AS conversion_rate_seasonality FROM agg_by_month CROSS JOIN stats ORDER BY month_num;
Calculates how often sessions that include each event lead to a purchase. This is useful for identifying which events are most correlated with conversions and can help prioritise key interactions in the funnel.
-- Conversion Propensity by Event WITH session_purchases AS ( SELECT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, MAX(IF(event_name = 'purchase', 1, 0)) AS converted FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY session_id ), event_in_sessions AS ( SELECT event_name, CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ), event_conversion_propensity AS ( SELECT eis.event_name, COUNTIF(sp.converted = 1) AS conversions, COUNT(*) AS total_events, SAFE_DIVIDE(COUNTIF(sp.converted = 1), COUNT(*)) AS conversion_propensity FROM event_in_sessions eis JOIN session_purchases sp ON eis.session_id = sp.session_id GROUP BY eis.event_name ) SELECT * FROM event_conversion_propensity ORDER BY conversion_propensity DESC
Assigns fractional revenue to each event based on the number of events in a session. Helps reveal which events tend to appear in higher-value sessions and gives a proxy for their revenue contribution.
-- Revenue Contribution When Event Is Present WITH sessions_with_revenue AS ( SELECT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, SUM((SELECT value.double_value FROM UNNEST(event_params) WHERE key = 'value')) AS session_revenue FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name = 'purchase' GROUP BY session_id ), session_event_counts AS ( SELECT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, COUNT(*) AS event_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY session_id ), events_with_context AS ( SELECT event_name, CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ), combined AS ( SELECT e.event_name, s.session_revenue, c.event_count, SAFE_DIVIDE(s.session_revenue, c.event_count) AS fractional_revenue FROM events_with_context e LEFT JOIN sessions_with_revenue s ON e.session_id = s.session_id LEFT JOIN session_event_counts c ON e.session_id = c.session_id WHERE s.session_revenue IS NOT NULL AND c.event_count > 0 ) SELECT event_name, COUNT(*) AS event_count, ROUND(SUM(fractional_revenue), 2) AS total_revenue_when_present, ROUND(SAFE_DIVIDE(SUM(fractional_revenue), COUNT(*)), 2) AS avg_revenue_per_event_instance FROM combined GROUP BY event_name ORDER BY avg_revenue_per_event_instance DESC
Compares how frequently each event appears in purchasing vs non-purchasing sessions. Useful for spotting events that are common in successful user journeys versus those that arenโt contributing to conversions.
-- Event Appearance in Purchase vs Non-Purchase Sessions WITH session_event_map AS ( SELECT user_pseudo_id, CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, event_name, MAX(IF(event_name = 'purchase', 1, 0)) OVER ( PARTITION BY user_pseudo_id, CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) ) AS purchase_session FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' ) SELECT event_name, COUNTIF(purchase_session = 1) AS events_in_purchasing_sessions, COUNTIF(purchase_session = 0) AS events_in_non_purchasing_sessions, ROUND(SAFE_DIVIDE(COUNTIF(purchase_session = 1), COUNT(*)), 4) AS purchase_session_ratio FROM session_event_map GROUP BY event_name ORDER BY purchase_session_ratio DESC
Identifies how often key actions (e.g. purchases, sign-ups, form submissions) occur without a recognised user ID. Useful for diagnosing gaps in login or identity stitching.
SELECT event_name, COUNTIF(user_id IS NULL) AS anonymous_event_count, COUNTIF(user_id IS NOT NULL) AS identified_event_count, ROUND(SAFE_DIVIDE(COUNTIF(user_id IS NULL), COUNT(*)) * 100, 2) AS percent_anonymous FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' AND event_name IN ('purchase', 'generate_lead', 'login', 'sign_up', 'form_submit') GROUP BY event_name ORDER BY percent_anonymous DESC
Evaluates how often each event is missing key user identifiers (`user_id`, `user_pseudo_id`, or both). Helps identify where identity tracking may be incomplete or broken across specific events.
SELECT event_name, COUNTIF(user_id IS NULL) AS null_user_id, COUNTIF(user_id IS NOT NULL) AS valid_user_id, COUNTIF(user_pseudo_id IS NULL) AS null_user_pseudo_id, COUNTIF(user_pseudo_id IS NOT NULL) AS valid_user_pseudo_id, COUNTIF(user_id IS NULL AND user_pseudo_id IS NULL) AS null_both_ids FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY event_name ORDER BY null_both_ids DESC
Flags sessions where a user triggered the `login` event more than once. May indicate tracking duplication or issues with login/logout behaviour on your site.
SELECT CONCAT(user_pseudo_id, '-', CAST(( SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id' ) AS STRING)) AS session_id, user_pseudo_id, COUNTIF(event_name = 'login') AS login_count FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY session_id, user_pseudo_id HAVING login_count > 1 ORDER BY login_count DESC
Detects if a single anonymous user (`user_pseudo_id`) has been linked to multiple `user_id` valuesโcommon when login IDs change or if multiple users share a device.
SELECT user_pseudo_id, ARRAY_AGG(DISTINCT user_id IGNORE NULLS) AS user_ids FROM `{{PROJECT_ID}}.{{DATASET}}.events_*` WHERE _TABLE_SUFFIX BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY user_pseudo_id HAVING ARRAY_LENGTH(user_ids) > 1
Before accessing our queries, we ask for your email address. This helps us improve the tool.
Please enter your email below to continue and receive future GA4 insights. We respect your privacy โ read our Privacy Policy.