Resources
MMT Guidebook - Resource for understanding all things MMT, look at this first.
Weighted Controls Explanations - Deck explaining the new weighted control method for constructing control markets.
Implementation Guide - Guide for how to act on MMT results, how to effectively change strategy to lead to business success.
Launch Checklists - Before setting up
Introduction
A Matched Market Test is a geographic A/B Test
MMTs measure a media experiment at the geographic level, which reveals a more complete picture of the incremental impact of ad spend within different channels or tactics.
MMTs do not rely on any third-party data, they look at marketing’s impact on holistic revenue, and they are very easy to implement within ad platforms.
Basic Example
We paused spend in the test market. Ten days later, revenue declined, then stabilized.
From here we can compare the revenue and spend drop in the test market to the stable control market, where no changes were made.
This comparison shows the revenue that was lost due to decreasing the spend. When we divide the change in revenue by the change in spend, we see the true ROAS (what we call iROAS).
iROAS is different from in-platform ROAS: rather than focusing on the last click, iROAS shows the complete impact of spend on revenue.
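The division described above can be written out as a small worked example. The numbers are hypothetical and the function name `iroas` is only illustrative; the sketch assumes the revenue change has already been measured against the control market.

```python
# Hypothetical holdout test: none of these figures come from a real client.
revenue_change = -12_000.0  # revenue lost in the test market relative to control
spend_change = -4_000.0     # spend removed from the test market


def iroas(revenue_change: float, spend_change: float) -> float:
    """Change in revenue divided by change in spend."""
    if spend_change == 0:
        raise ValueError("spend did not change, so iROAS is undefined")
    return revenue_change / spend_change


print(iroas(revenue_change, spend_change))  # 3.0
```

In this example every dollar removed from the test market cost three dollars of revenue, so the iROAS is 3.0. Both changes are negative in a holdout test, so the signs cancel.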
Why This Matters
“Half of my advertising spend is wasted; the trouble is I don’t know which half.” - John Wanamaker
Incrementality is net new revenue that would not have occurred without the presence of marketing. The scenario above shows an example of non-incremental marketing on an individual customer level.
If the same outcome occurs whether someone sees an ad or not, we have not influenced a change in revenue with ad dollars, effectively wasting the spend.
The scenarios above show examples of a non-incremental and an incremental program when looking at a business’s high-level KPIs.
The non-incremental program shows attributed CPA (red line) getting more efficient over time, but total conversions are flat. Incremental conversions are not growing, causing incremental CPA (blue line) to spike.
The incremental program shows attributed CPA (red line) going up/flat, but total conversions are growing. Incremental conversions make up a larger share, reducing iCPA and leading to the topline growth of the company.
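The gap between attributed CPA and incremental CPA can be illustrated with a short calculation. This is a sketch with hypothetical monthly figures; in practice the incremental conversion count comes from a test rather than from the ad platform.

```python
def cpa(spend: float, conversions: float) -> float:
    """Cost per acquisition: spend divided by conversions."""
    if conversions <= 0:
        return float("inf")  # no conversions means an unbounded cost
    return spend / conversions


# Hypothetical non-incremental program
spend = 10_000.0
attributed_conversions = 500   # conversions the platform takes credit for
incremental_conversions = 50   # conversions that would not have happened without ads

attributed_cpa = cpa(spend, attributed_conversions)    # 20.0
incremental_cpa = cpa(spend, incremental_conversions)  # 200.0
print(attributed_cpa, incremental_cpa)
```

The attributed CPA looks efficient, while the incremental CPA is ten times higher because only a small share of the attributed conversions were caused by the spend.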
The MMT Process
Reference the guidebook for more information
Identify Questions
What matters most to the business? The first step in the MMT process is to identify the questions that you want to answer. These questions should be specific, measurable, achievable, relevant, and time-bound. For example, you might ask:
Examples
What is media’s impact to the overall business?
What is [channel]’s impact to the overall business?
What is the right mix of channels to maximize ROI?
What is [tactic]’s impact to the overall business?
Which creative concept is more impactful to sales?
Considerations
The more granular the test, the harder it is to measure
The number of concurrent tests is limited by viable markets
We will proactively bring ideas or questions
Prioritize highest investment or most “suspicious” elements
Comparison tests (e.g. this vs that, rather than on vs off) are harder to execute + measure
Relevant Resources
Risk Tolerance
What is the limit of lost sales? Once you have identified your questions, you need to assess your risk tolerance. This will help you to determine the size and scope of your test. For example, if you are willing to take on a higher level of risk, you may be able to test a larger number of markets. However, if you are more risk-averse, you may want to limit your test to a smaller number of markets.
Examples
What percentage of lost sales is tolerable during the test?
How many markets can we test in simultaneously?
Are we prioritizing learning or performance?
How accurate or stat sig does the test need to be?
How long can we run the test for; any limitations?
Considerations
Look at worst case scenario based on the number of tests
This is often a decision made or forced by financial points
We will outline the pros and cons of learning vs risk
Taking a short term loss can significantly help long term
If this is a first test, it is usually best to keep it small
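One way to frame the worst-case scenario is a simple upper bound on lost revenue. The sketch below is an assumption, not the team's official method: it supposes a holdout test and a guessed maximum share of test-market revenue that the paused channel could be driving (`channel_share`, a hypothetical input).

```python
def worst_case_loss(total_daily_revenue, test_share, channel_share, test_days):
    """Upper bound on revenue lost during a holdout test.

    total_daily_revenue: average daily revenue across all markets
    test_share: test markets' share of total revenue (0.05 means 5%)
    channel_share: ASSUMED maximum share of test-market revenue that the
        paused channel could be driving (hypothetical input)
    test_days: length of the test in days
    """
    return total_daily_revenue * test_share * channel_share * test_days


def within_tolerance(test_share, channel_share, tolerable_share):
    """True if the worst-case lost share of total revenue is tolerable."""
    return test_share * channel_share <= tolerable_share


# Hypothetical client: $100k/day, 5% of revenue in test markets,
# channel assumed to drive at most 30% of that revenue, 6-week test.
loss = worst_case_loss(100_000, 0.05, 0.30, 42)
print(round(loss))                         # roughly 63,000 dollars
print(within_tolerance(0.05, 0.30, 0.02))  # 1.5% worst case against a 2% cap
```

Running several tests at once adds their worst cases together, which is why the number of concurrent tests matters for this calculation.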
Relevant Resources
Identify Markets
The next step is to identify the markets that you will test. The markets that you choose should be representative of your overall business. You should also consider the following factors when selecting markets:
Criteria | Notes |
Questions: | How many? Which? Determines number of markets |
Risk: | If we have a cap (like a %) then that limits the size/number |
Predictability: | Do test and compare markets match CVR and seasonality? |
Representative: | Are the test markets representative of the program? (Spend, etc) |
Sample size: | Are the test markets big enough to measure? (10+ sales/day) |
Viability: | Are the test markets too important to risk experimentation? |
Wildcards: | Are there any external wildcards (billboards, events) in market? |
The Data Intelligence team will identify the market selections (make sure to request an Asana task here), but it is important for the client teams to make sure the market selections fit all of the criteria listed above.
The client team is closer to the client and has more information about their risk tolerance for testing.
If the questions above cannot be answered, it is crucial to set up a meeting with the client prior to launch to align on these questions.
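Some of the criteria above can be checked mechanically before the client conversation. The sketch below is illustrative only: the function names are hypothetical, and the thresholds are taken from this guide (10+ sales/day, correlation above 0.5, test markets at no more than 10% of revenue). Viability and wildcards still need human judgment.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no signal to correlate
    return cov / sqrt(var_x * var_y)


def screen_market(test_daily_sales, test_daily_revenue, control_daily_revenue,
                  test_revenue_share, min_daily_sales=10, min_corr=0.5,
                  max_share=0.10):
    """Return a pass/fail flag for each criterion that can be computed."""
    avg_sales = sum(test_daily_sales) / len(test_daily_sales)
    return {
        "sample_size": avg_sales >= min_daily_sales,
        "predictability": pearson(test_daily_revenue, control_daily_revenue) > min_corr,
        "risk": test_revenue_share <= max_share,
    }


# Hypothetical week of pre-period data
sales = [12, 15, 11, 14, 13, 16, 12]
test_rev = [1200, 1500, 1100, 1400, 1300, 1600, 1200]
control_rev = [2300, 3100, 2250, 2900, 2500, 3300, 2400]
print(screen_market(sales, test_rev, control_rev, test_revenue_share=0.06))
```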
Relevant Resources
Implement Test
Once you have identified your markets, you need to implement your test. This involves making changes to your marketing campaigns and tracking the results. The changes that you make should be based on the questions that you identified in the first step. For example, if you are testing the impact of your marketing campaigns on sales, you might increase your spending on certain channels for a growth test, or decrease spending on certain channels for a holdout test.
Put the experiment into the test markets (usually 5-10% of client business for 4-6 weeks).
If there are any questions regarding implementation please reference the guides below, then reach out to the Data Intelligence team.
Relevant Resources
Measure Results
The next step in the MMT process is to measure the results of your test. This involves collecting data on sales, conversions, spend, and other metrics. You can use this data to answer the questions that you identified in the first step. For example, if you are testing the impact of your marketing campaigns on sales, you can compare sales data from before and after the test.
The Data Intelligence team will measure results twice: once halfway through the test, and again at the end of the test.
If there are questions regarding the results, please reference the documents below. If there is still confusion, reach out to the Data Intelligence team.
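To make the measurement idea concrete, here is a deliberately simplified sketch. It is not the Causal Impact model the Data Intelligence team uses; it swaps in a plain pre-period ratio between test and control to build the counterfactual. All names and numbers are hypothetical.

```python
def measure(pre_test, pre_control, during_test, during_control, spend_change):
    """Estimate incremental response and iROAS from daily revenue series.

    Simplification: the counterfactual is the control's test-period revenue
    scaled by the pre-period test/control ratio.
    """
    ratio = sum(pre_test) / sum(pre_control)
    counterfactual = ratio * sum(during_control)
    incremental_response = sum(during_test) - counterfactual
    return incremental_response, incremental_response / spend_change


# Hypothetical holdout: test market normally runs at half the control's revenue
pre_test = [100] * 10
pre_control = [200] * 10
during_test = [80] * 5      # revenue dipped after spend was paused
during_control = [200] * 5  # control stayed stable

response, estimate = measure(pre_test, pre_control, during_test,
                             during_control, spend_change=-50)
print(response, estimate)  # -100.0 2.0
```

The test market was expected to earn 500 during the test but earned 400, so the incremental response is -100. Dividing by the -50 change in spend gives an iROAS of 2.0.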
Relevant Resources
Optimize Program
Incorporate insights from the test into the program (reallocate budget, etc).
Often the test results are clear; sometimes they raise further questions that must be tested.
Matched Market Tests take 4-6 weeks to complete. Given the amount of time, effort, and budget that goes into them, it is important to act on the results of the test. In general:
If results are incremental, we should share that with the client and recommend scaling spend in channels/tactics that work, suggest more granular tactic-level testing, and try to isolate the levers that led to success in that channel.
If results are non-incremental, we should remove spend from that channel/tactic or recommend changes to strategy to improve incrementality. See this slide here for which levers to pull in order to make improvements.
Relevant Resources
Dictionary
Terms | Definition |
Causal Impact | A time series model that uses pre-test period data to predict the test period; this prediction is known as the counterfactual. The Causal Impact model is the statistical model that is used in every matched market test. |
Confidence in Incrementality | The likelihood that the estimated iROAS is greater than the threshold. If the estimate is an iCPL, we want the estimate to be less than the threshold. |
Correlation/Correlation of Logs | Correlation is the relationship between two variables, in most cases daily revenue from the test and control markets. Correlation should be above 0.5 to indicate a strong relationship. Correlation of logs is the relationship between the logged numbers, leading to higher correlations when markets are more similar in size. |
Designated Market Area (DMA) | 210 geographic regions in the United States that are defined by Nielsen for radio and TV markets. |
Estimate | The iROAS or iCPA value returned by the MMT. Calculated using incremental response and incremental cost. |
Geo | Any geographic region; this can include DMA, state, or some other geographic region. Tests are only run at one geo level; 90% of tests occur at the DMA level. |
Global Web Index (GWI) | A tool that pulls audience data by market - this is used to determine the TAM in order to plan budgets for growth tests. |
Growth Test | A test that launches a net new channel or tactic in the test market with the aim of increasing spend in the test market. |
Holdout Test | A test that excludes the test market from a channel or tactic with the aim of decreasing spend in the test market. |
Incrementality | Net new revenue or conversions that would not have occurred without some intervention. Generally, a purchase from an existing customer that sees an email, then clicks on a brand search link is not considered incremental revenue. A net new customer that sees 4 TikTok Ads over a two-week period, conducts a non-brand search, then purchases would be more incremental than the first example. Each platform mentioned plays a different role in producing incrementality, and adjusting strategy can increase or decrease incrementality of marketing activity on platforms. Focusing on incrementality leads to 12% higher growth in our clients compared to our clients that do not focus on it. |
Incremental Cost | The budget that the channel being tested spent through the test period. Negative spend indicates a holdout test occurred. |
Incremental Response | The revenue or conversions that were influenced by the change in spend during the test. |
iCPL | Incremental cost divided by incremental conversions. |
iROAS | Incremental revenue divided by incremental cost. |
Market Selection | The process of pairing test and control markets that would be used for the matched market test. |
Matched Market Test (MMT) | The complete process from market selection, to implementing the test in platform, to reporting on results, and finally optimizing budget allocations based on the iROAS estimate. |
Model | The causal impact model is a TBR model. TBR means Time Based Regression. |
Percent Causal Impact | This measure is the likelihood that the control market can predict the test market. The best this number can be is 0.5, the higher the number gets the less accurate the prediction is, with 1 being the worst score. This number should be below 0.7. |
Percentage of Test/Control | The percentages of test and control indicate the size of the markets that are being used in the market selection. Each is the total revenue of the test or control divided by the total revenue in all markets. Generally, test markets should not exceed 5-10% of the total market. The test needs to be large enough to yield significant results, but not so large that it interferes with client revenue goals. |
Start Pre/Start Test/End Test | The relevant dates for the test, start pre is the start of the pre period (historic data), start test is the day that the test began (the day spend changes), and end test is the last day of the test that was measured. |
Target | The channel or tactic that is being tested, i.e. Facebook, Google Ads Non-Brand, etc. |
Test Market(s)/Control Market(s) | The markets that are being used in the MMT. The test market has its spend changed for the test. |
Threshold | The lower limit that is used to measure confidence in incrementality. For iCPL it becomes the upper limit. |
Total Addressable Market (TAM) | The relevant segment of the population that we are targeting in the test market during a growth test. We pull TAM from GWI and use it to calculate the budget; we also assume that reach is equal to TAM. |
Weighted Controls | The process of combining small, weighted pieces of non-test markets to construct the best control possible. |
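The Confidence in Incrementality and Threshold entries can be illustrated with a short sketch. It assumes the model's uncertainty is available as a list of simulated estimates (`draws`, a hypothetical input); this guide does not specify how the reporting tool computes the figure, so treat this as one possible reading of the definition.

```python
def confidence_in_incrementality(draws, threshold, lower_is_better=False):
    """Share of simulated estimates on the favorable side of the threshold.

    For iROAS the favorable side is above the threshold (lower limit).
    For iCPL, set lower_is_better=True so the threshold acts as an upper limit.
    """
    if not draws:
        raise ValueError("at least one draw is required")
    if lower_is_better:
        hits = sum(1 for d in draws if d < threshold)
    else:
        hits = sum(1 for d in draws if d > threshold)
    return hits / len(draws)


# Hypothetical simulated iROAS estimates from one test
iroas_draws = [0.8, 1.1, 1.4, 1.6, 1.9, 2.1, 2.3, 2.6, 3.0, 3.4]
print(confidence_in_incrementality(iroas_draws, threshold=1.0))  # 0.9

# Hypothetical simulated iCPL estimates, where lower is better
icpl_draws = [40, 45, 50, 55, 60]
print(confidence_in_incrementality(icpl_draws, threshold=52, lower_is_better=True))  # 0.6
```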