Leveraging Geo-Experiments for Accurate Incrementality Testing

What if you could turn complex measurement hurdles into stepping stones for growth? At Mouthpiece Marketing, we recently partnered with two innovative clients to do just that. By leveraging incrementality testing, we addressed challenges that were holding their marketing efforts back. To jump ahead, scroll to the bottom for a case study showing how a Geo-X test changed one client’s way of thinking about branded SEM traffic.

What is a Geo-Experiment?

A geo-experiment involves dividing geographic regions into treatment and control groups. The treatment group receives the marketing intervention (such as ads), while the control group does not, allowing for a direct comparison of the campaign’s impact.

These experiments are ideal for larger campaigns spread across multiple regions, and they help control for factors that often distort traditional testing, such as ad spillover, seasonality, or regional economic conditions.

Concepts to know

Match Markets

When designing a geo-experiment, it's important to carefully select match markets, where both treatment and control regions are chosen based on similar characteristics like market penetration, historical performance, or user behavior. This ensures that any observed differences can be attributed to the campaign, not pre-existing disparities between regions.
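
As a minimal sketch of how match markets might be paired, the Python below standardizes a few historical metrics per region and pairs each region with its nearest unpaired neighbor. The region labels, metric values, and the `standardize` and `pair_markets` helpers are hypothetical, and greedy nearest-neighbor pairing is only one of several reasonable matching approaches.

```python
import math
import statistics

# Hypothetical historical metrics per region:
# (weekly conversions, conversion rate, LTV)
history = {
    "A": (1200, 0.031, 84.0),
    "B": (1180, 0.030, 86.0),
    "C": (640, 0.022, 61.0),
    "D": (655, 0.023, 59.0),
    "E": (910, 0.027, 72.0),
    "F": (930, 0.026, 73.0),
}

def standardize(data):
    """Convert each metric to z-scores so no single metric dominates the distance."""
    columns = list(zip(*data.values()))
    means = [statistics.mean(c) for c in columns]
    stdevs = [statistics.pstdev(c) for c in columns]
    return {
        region: tuple((v - m) / s for v, m, s in zip(values, means, stdevs))
        for region, values in data.items()
    }

def pair_markets(data):
    """Greedily pair each region with its closest unpaired region (Euclidean distance)."""
    z = standardize(data)
    unpaired = sorted(z)
    pairs = []
    while len(unpaired) >= 2:
        first = unpaired.pop(0)
        nearest = min(unpaired, key=lambda r: math.dist(z[first], z[r]))
        unpaired.remove(nearest)
        pairs.append((first, nearest))
    return pairs

pairs = pair_markets(history)
print(pairs)
```

With these made-up numbers, the sketch pairs A with B, C with D, and E with F. In practice you would likely match on longer time series rather than a single snapshot per region.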

[Figure: list of test cells, including holdout groups, for incrementality and geo lift]

Holdout Groups

In every geo-experiment, a holdout group is used as the baseline. This control group remains "untouched" by the campaign, helping marketers measure what would have happened in the absence of the campaign. By comparing the performance of the treatment group to the holdout, you can accurately calculate the incremental lift generated by the campaign.

While match markets and holdout groups work together in geo-experiments, they serve different purposes—match markets ensure comparability between regions, while holdouts provide the baseline to measure true lift.

Why Use Geo-Experiments for Incrementality Testing?

Geo-experiments are uniquely suited for incrementality testing, offering several advantages:

  1. Precise Measurement: By comparing test regions to control (holdout) regions simultaneously, you gain a clear understanding of the incremental impact your campaign has. This approach isolates the effects of your marketing efforts from other external factors, providing more accurate results than traditional methods.

  2. Mitigation of Pre-Post Analysis Limitations: Geo-experiments reduce the reliance on pre-post analyses, which can be confounded by time-based trends, seasonal variations, and external events occurring between the pre-campaign and post-campaign periods. By running tests and controls concurrently, you control for temporal factors that might otherwise skew the results, leading to more reliable conclusions.

  3. Scalable Approach: Geo-experiments can be scaled to any number of regions, which lets you adjust the sample size to the precision you need. More regions support more robust statistical analysis and make the resulting insights more generalizable.

  4. Reduced Bias: Utilizing multiple matched markets helps mitigate biases arising from regional differences. By carefully selecting and pairing regions with similar characteristics—such as demographics, purchasing behavior, and market conditions—you ensure that observed differences are attributable to the marketing intervention rather than inherent disparities.

    By focusing on these advantages, geo-experiments provide a powerful framework for accurately measuring the true impact of your marketing campaigns, enabling data-driven decisions and optimized strategies without the pitfalls associated with pre-post analyses.
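
To make the pre-post limitation concrete, here is a small worked example in Python with invented numbers. It contrasts a naive pre-post estimate with a control-adjusted estimate that subtracts the change seen in concurrent control regions (a simple difference-in-differences on percentage changes). This is one common way to use a concurrent control, shown as an assumption rather than a prescribed estimator.

```python
# Invented weekly conversions, before and during the campaign.
test_pre, test_during = 1000, 1300        # regions that received the campaign
control_pre, control_during = 1000, 1150  # concurrent holdout regions

# Naive pre-post: attributes the entire change in test regions to the campaign.
pre_post_lift = (test_during - test_pre) / test_pre

# Control-adjusted: remove the change that also occurred in control regions
# (seasonality, external events), leaving the campaign's contribution.
control_change = (control_during - control_pre) / control_pre
did_lift = pre_post_lift - control_change

print(f"Pre-post estimate: {pre_post_lift:.0%}")
print(f"Control-adjusted estimate: {did_lift:.0%}")
```

Here the pre-post view reports a 30% gain, but half of that also appeared in regions that never saw the campaign, so the control-adjusted estimate is 15%.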

Steps to Designing a Successful Geo-Experiment

  1. Selecting Regions: Start by selecting regions that represent your target market. Avoid outlier geos like New York City. Choose areas that perform similarly on the metrics that matter to your business, such as overall conversions, paid conversions, conversion rate, and LTV. Using match markets here is crucial for reducing noise and ensuring accurate results.

  2. Random Assignment: Randomly assign the selected regions into test and control groups. This step is critical for minimizing any bias that could skew your results.

  3. Defining the Metrics: Identify the key performance indicators (KPIs) that you want to measure. These could include conversions, sales, or Cost Per Acquisition (CPA), depending on your campaign’s objectives.

  4. Setting Up a Holdout Group: Ensure that your control group includes a designated holdout. This will serve as your reference point for calculating incremental lift.

  5. Campaign Duration: Run your geo-experiment for a sufficient length of time—typically, aim for a minimum of 4 to 6 weeks. This allows you to capture enough data and account for weekly fluctuations.
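
Steps 1 and 2 can be combined by randomizing within matched pairs, so that each pair contributes one test region and one control region. The sketch below assumes the pairs have already been chosen; the region labels and the `assign_groups` helper are hypothetical.

```python
import random

# Hypothetical matched pairs produced during region selection.
matched_pairs = [("A", "B"), ("C", "D"), ("E", "F"), ("G", "H")]

def assign_groups(pairs, seed=42):
    """Randomly send one region of each pair to test and the other to control."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    test, control = [], []
    for pair in pairs:
        first, second = rng.sample(pair, 2)  # random order within the pair
        test.append(first)
        control.append(second)
    return test, control

test_regions, control_regions = assign_groups(matched_pairs)
print("Test:", test_regions)
print("Control:", control_regions)
```

Fixing the seed keeps the assignment auditable: anyone can rerun the script and confirm the split was not hand-picked.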

Running and Analyzing Your Geo-Experiment

  1. Launching the Campaign: Execute your campaign in the test regions while the control (holdout) regions remain unaffected. Carefully track performance in both groups, focusing on the KPIs you’ve defined.

  2. Collecting Data: Monitor conversions, sales, and other key metrics in both test and holdout regions throughout the campaign. Consistent data collection ensures that any lift you observe is attributable to the campaign itself.

  3. Measuring Incrementality: Once the campaign ends, analyze the data to calculate the incremental lift—the difference in performance between the test regions and the holdout group. A simple calculation involves measuring the percentage increase in your KPI (e.g., sales) in the test group compared to the holdout.

    Example: Lift (%) = (Performance in Test Group - Performance in Holdout Group) / Performance in Holdout Group * 100

  4. Statistical Analysis: To validate your results, apply statistical techniques like t-tests or Bayesian analysis. With a frequentist test such as a t-test, a p-value below 0.05 is the conventional threshold for calling the observed difference statistically significant, meaning a difference that large would be unlikely if the campaign had no effect.
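
The measurement and validation steps above can be sketched in a few lines of Python. The lift calculation follows the formula in step 3. For significance, this sketch swaps in a permutation test rather than a t-test, because it needs only the standard library and makes few distributional assumptions. The per-region numbers and the function names are invented for illustration.

```python
import random
import statistics

# Invented per-region conversions over the test window.
test = [132, 141, 128, 150, 137, 145]
holdout = [118, 121, 115, 126, 119, 123]

def lift_pct(test_values, holdout_values):
    """Lift (%) = (test - holdout) / holdout * 100, using group means."""
    t, h = statistics.mean(test_values), statistics.mean(holdout_values)
    return (t - h) / h * 100

def permutation_p_value(test_values, holdout_values, n_iter=10_000, seed=0):
    """Two-sided p-value: how often shuffled group labels produce a mean
    difference at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(test_values) - statistics.mean(holdout_values))
    pooled = list(test_values) + list(holdout_values)
    n_test = len(test_values)
    extreme = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_test]) - statistics.mean(pooled[n_test:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_iter

print(f"Lift: {lift_pct(test, holdout):.1f}%")
print(f"p-value: {permutation_p_value(test, holdout):.4f}")
```

With these invented numbers the lift is about 15.4%, and because every test region outperforms every holdout region, very few shuffles match the observed gap, so the p-value falls well below 0.05.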

Using Geo-Experiment Insights for Future Campaigns

The results of geo-experiments offer more than just a snapshot of your campaign’s success—they provide valuable insights that can inform future strategies.

  1. Budget Allocation: If certain regions in your test group perform exceptionally well, you can reallocate future budgets to those areas.

  2. Campaign Optimization: The insights from holdout group comparisons can help you refine campaign targeting and creative strategies.

  3. Regional Customization: Geo-experiments often highlight regional preferences, allowing for more tailored approaches in future campaigns.

Conclusion

Geo-experiments offer a scalable, reliable way to conduct incrementality testing across multiple regions. By incorporating match markets and using holdout groups effectively, businesses can confidently assess the true impact of their marketing campaigns. The insights gained from these experiments not only help optimize current campaigns but also inform long-term strategies for budget allocation and regional targeting.

Case Study

Using Geo-X to Measure Branded Incrementality on App Installs

The Challenge: Our client, a fintech app developer, had traditionally excluded branded search terms from their Google App Install Campaigns. The prevailing belief within the organization was that branded traffic did not contribute to incremental installs. Previous analyses seemed to support this assumption, but no definitive testing had been conducted to confirm it. To address this gap, we collaborated with the client's internal data science team to launch a Geo-Experiment (Geo-X Test) on Google App Install Campaigns. The goal was to quantify the incremental impact of branded search terms on app installs, encompassing both paid and organic search channels.

Methodology & Setup: We implemented a Geo-X Test by selecting six U.S. states with comparable user behavior metrics. These states were divided into three distinct groups:

  • Test Group (Branded + Non-Branded Keywords): Florida and Washington

  • Control Group (Non-Branded Keywords Only): California and Illinois

  • Holdout Group (No Marketing Intervention): Texas and Georgia

This setup allowed us to isolate the effects of including branded search terms in the campaigns.

Results:

The experiment yielded compelling results:

  • Significant Increase in Installs: The Test Group experienced a 204% lift in total installs compared to the Control Group during the same period.

  • Return on Investment (ROI): While branded search alone showed a negative ROI, the combination of branded and non-branded keywords resulted in a breakeven ROI. This indicates that branded search enhances overall campaign efficiency when paired with non-branded terms.

  • Revenue Growth: The campaign including branded keywords generated 2.6 times more revenue than the non-branded only campaign within the test period.

  • Adjusted Cost Per Install (CPI): The unadjusted CPI reported by Google was $2.86. After accounting for seasonality effects observed in the Holdout Group and adjusting for organic cannibalization in the Test Group, the true cost per incremental install was calculated to be $3.57.
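
As a quick arithmetic check, the gap between the two CPI figures quoted above can be expressed as a multiplier on platform-reported CPI. The sketch uses only the two rounded figures from this case study, so the result may differ slightly from any figure derived from the unrounded data; the variable names are illustrative.

```python
# CPI figures quoted in the results above (rounded).
reported_cpi = 2.86      # unadjusted CPI reported by Google
incremental_cpi = 3.57   # cost per incremental install after adjustments

# The ratio gives the multiplier to apply to platform-reported CPI.
adjustment_factor = incremental_cpi / reported_cpi
uplift_pct = (adjustment_factor - 1) * 100

print(f"Adjustment factor: {adjustment_factor:.2f}x")
print(f"Implied uplift: {uplift_pct:.1f}%")
```

On the rounded inputs this works out to roughly a 25% upward adjustment.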

Case Study Conclusion:

The inclusion of branded search terms in Google App Install campaigns substantially improves the ability to scale app installs and achieve revenue goals. Despite an initially negative ROI for branded search alone, its combination with non-branded search balances out the ROI and leverages the higher Lifetime Value (LTV) associated with users acquired through branded terms.

Recommendation:

Based on the empirical evidence from the Geo-X Test, we recommend the following:

  1. Reactivation of Branded Traffic: Incorporate branded search terms back into the Google App Install campaigns to capitalize on their significant impact on scaling installs and boosting revenue.

  2. Adjustment of CPI Targets: Implement a 26% upward adjustment to the platform-reported CPI to reflect the true cost per incremental install. This adjustment is supported by the higher LTV associated with users acquired through branded search traffic.

FAQs

  1. What is the ideal number of regions for a geo-experiment?
    While there is no set rule, at least 20 regions, divided between test and control, usually yield statistically reliable results.

  2. How long should a geo-experiment run?
    A geo-experiment should typically run for 4–6 weeks to gather enough data for accurate analysis.

  3. Can geo-experiments work with online-only campaigns?
    Absolutely! Geo-experiments can be applied across digital and offline channels, from paid search to TV advertising.


    By leveraging geo-experiments and optimizing the use of match markets and holdout groups, you can accurately measure the incremental value of your marketing efforts, empowering smarter, data-driven decisions across your campaigns.
