Current Match Marketing Testing Challenges
For all questions not covered here, reference the MMT guidebook
What is best practice for executing an MMT when the paid search campaign has PMAX active?
Follow this document here. Based on conversations I have had with Leah, #1 and #2 are our recommended strategies for getting the most consistent results. If the client has a substantial program (more than 30K per month), you can look at individual tactic-level testing. Follow #4 and #5; the main takeaway is to exclude PMAX from test and control markets when testing non-PMAX campaigns.
What is best practice for executing an MMT when the Meta campaign has Advantage+ active?
See this slide here. There is no way to add geo exclusions to Advantage+ campaigns, meaning we cannot test Advantage+ campaigns in an MMT. Additionally, running holdout tests on any non-Advantage+ Meta campaigns while Advantage+ campaigns are running will interfere with results. We recommend only running growth tests on Meta while Advantage+ campaigns are active.
Can a holdout MMT be executed on TikTok or does it have to be a growth MMT?
See this slide here. TikTok Ads MMTs have historically run as Growth tests due to platform exclusion limitations; however, a recent update allows Holdout tests to be run on TikTok. The TikTok team has developed this resource for their team to guide proper implementation.
Can TikTok campaigns optimizing towards conversions (sales) work with a growth MMT?
You can run a conversion-based TikTok campaign, but we do not recommend it. Historically, we have not seen conversion TikTok campaigns come back as incremental.
Intuitively this makes sense: the more the platform’s algorithm tries to optimize towards people who have purchased in the past, the more it targets people who are going to purchase no matter what we do (the definition of non-incrementality; see this slide here).
Rather than getting in front of people who are already going to purchase, reach and frequency focused TikTok campaigns with rigid frequency and audience targeting often come back incremental.
How, if at all, does the amount by which budget is increased within the growth market affect the validity of a growth MMT? For example, if the growth market receives 5x the budget during the test vs “normal” spend, what does that really tell us about how incremental that channel is at “normal” spend levels?
Growth tests are a balancing act: we need to increase the budgets enough that we can significantly detect the effect on revenue from increasing spend, but not so much that we run into diminishing returns. Budgets are recommended by the planning team based on the criteria here.
If you are worried that the platform is close to diminishing returns, you can try to adjust the setup of the test to be a holdout test (when possible). Growth tests are only recommended in two scenarios: when the channel is net new, and when holdout testing is not possible (think Adv+). We can always have a discussion around the budgets before the test launches if that is a concern.
What optimizations are the activation teams restricted from executing during an MMT?
In general, as long as the change affects both test and control markets evenly (creative refresh, nationwide sale, month-over-month budget changes, etc.), it should be fine. If the change will affect test and control differently (see these case studies), then we need to figure out how we can postpone the change or mitigate the negative impact on the test. When in doubt, reach out to the person who did the market selection to figure out whether the optimization will affect the test.
When is it best to use MMT vs LRA vs CIA?
See slides 1 and 3 here for a description of the three analyses.
MMT is used to validate channel/tactic incrementality in a rigorous geographic experiment, e.g. how incremental is PMAX?
CIA is used to look at how an individual decision impacted a variable, e.g. what was the impact of the iOS 14.5 update on revenue?
LRA is an analysis that can look at the relationship between multiple variables over a long period of time and is not bound by one start date, e.g. what happens to revenue as Facebook spend changes?
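To make the LRA example concrete, below is a minimal sketch of a long-run regression of revenue on weekly Facebook spend. The data, column names, and simple OLS approach are illustrative assumptions, not the DI team's actual methodology.

    # Minimal sketch of an LRA-style analysis: regress revenue on Facebook spend
    # over a long window of weekly data. All numbers are synthetic placeholders.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    weeks = pd.date_range("2023-01-01", periods=104, freq="W")
    facebook_spend = rng.uniform(5_000, 25_000, size=len(weeks))
    revenue = 40_000 + 1.8 * facebook_spend + rng.normal(0, 8_000, size=len(weeks))
    df = pd.DataFrame({"week": weeks, "facebook_spend": facebook_spend, "revenue": revenue})

    # The spend coefficient estimates how revenue moves per extra dollar of
    # Facebook spend across the whole window, with no single test start date.
    model = sm.OLS(df["revenue"], sm.add_constant(df["facebook_spend"])).fit()
    print(model.params["facebook_spend"], model.conf_int().loc["facebook_spend"].tolist())

Because LRA is observational rather than a controlled geographic experiment, the coefficient describes a relationship over time rather than proving incrementality; that validation is what MMT is for.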
Talk track on how to reach statistical significance and why it hasn’t been reached yet
See this slide here; the example is very helpful in understanding how the confidence is calculated.
If the iROAS estimate is close to 0 or the confidence interval around the estimate is too large, this will lead to low confidence. In that case, as long as we are confident in the implementation of the test, it is an indication that the channel or tactic being tested is not incremental.
Not every test ends in incrementality, and because incrementality is a prerequisite for confidence, not every test ends in confidence.
The talk track for reaching statistical significance is this: as we adjust spend, if revenue responds similarly to our changes in spend, we are rewarded with confident predictions. The more we change spend without revenue responding, the stronger the indication that there is no incrementality, which leads to less confidence.
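For intuition only, here is a rough sketch of how an iROAS estimate and a confidence interval around it could be computed from daily test-market results versus a counterfactual. The numbers, variable names, and simple bootstrap are assumptions for illustration, not the MMT team's actual calculation.

    # Rough sketch: iROAS = incremental revenue / incremental spend, with a
    # bootstrap interval over daily lift. All values are made up for illustration.
    import numpy as np

    rng = np.random.default_rng(1)
    days = 42
    actual_revenue = rng.normal(12_000, 1_500, size=days)          # observed in the test market
    counterfactual_revenue = rng.normal(11_200, 1_500, size=days)  # modeled "no spend change" estimate
    incremental_spend = 2_000.0 * days                             # extra spend pushed into the test market

    daily_lift = actual_revenue - counterfactual_revenue
    iroas = daily_lift.sum() / incremental_spend

    # Resample the daily lift; a wide interval or one that straddles zero is
    # what low confidence looks like when revenue does not respond to spend.
    boot = np.array([
        rng.choice(daily_lift, size=days, replace=True).sum() / incremental_spend
        for _ in range(5_000)
    ])
    low, high = np.percentile(boot, [2.5, 97.5])
    print(f"iROAS ~ {iroas:.2f}, 95% interval ~ ({low:.2f}, {high:.2f})")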
Request for setup QA to ensure everything is set up as it should be prior to launching, so we don’t get midway through the test and find out it is not usable/invalid. Similarly, ensuring the team executing the test understands the correct way to set it up and how it should be run.
The MMT team are not ad strategists and don’t have the background in Meta/Google/etc. to check platform setup. The best QAs I have seen in the past are when the person who added the test to the platform has someone from their department who has done an MMT before check their work to ensure proper setup.
I think it would also be great for more teams to make an internal resource like the TikTok team’s; I am working with Sarah M. to set those up with other departments.
Another item that helps ensure proper setup is a test brief developed by the ADs, which sets clear expectations internally and with the client.
The last note I will make is to encourage any team members running the platform who do not have much MMT experience to reach out to the DI team. The more we can educate people, give them resources like the MMT guidebook, and have them truly think about and understand what an MMT is, the fewer mistakes will happen.
What would make a test invalid? Product launches, promotions, increased spend, optimizations, etc.?
As mentioned above, anything that affects the test and control markets unevenly, or an improper test setup. Also see this slide here about outliers.
For clients where there’s a lag in conversions, what should the ‘ramp’ period be and when should measurement start/end? And what’s the correct period to compare to?
The lag in conversions is factored into the experiment based on the channel being tested. A Facebook ads test should only take 4 weeks to see results, while a programmatic test could take 6-8 weeks. As long as the test covers at least 2 consideration cycles (which for most of our clients is 4-8 weeks), it is set up properly.
Measurement starts the day that spend is changed in the test market; there is no other day when measurement can start. The period to compare to is determined by the MMT team: we normally look at the past year of data to construct the counterfactual (control) estimate, which we compare to the real test market results.
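As a rough illustration of that last step (not the MMT team's actual model), a counterfactual can be built by fitting the test market's pre-period revenue against control markets and projecting that relationship through the test window; the lift is the gap between actual and projected revenue. Everything below is synthetic, and the linear fit is an assumption.

    # Illustrative counterfactual: learn how the test market tracks control markets
    # over ~1 year of pre-period data, project that into the test window, and
    # compare against what the test market actually did. Data is synthetic.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(2)
    pre_days, test_days = 365, 42

    controls_pre = rng.normal(10_000, 1_000, size=(pre_days, 3))   # 3 control markets, daily revenue
    test_pre = controls_pre @ np.array([0.4, 0.35, 0.3]) + rng.normal(0, 500, size=pre_days)
    model = LinearRegression().fit(controls_pre, test_pre)

    controls_test = rng.normal(10_000, 1_000, size=(test_days, 3))
    counterfactual = model.predict(controls_test)                   # expected revenue with no spend change
    actual = counterfactual + 600 + rng.normal(0, 500, size=test_days)  # pretend the test drove a lift

    print(f"estimated incremental revenue: {(actual - counterfactual).sum():,.0f}")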
Specific Client Examples
Colgate TikTok Growth MMT - the strategist launched the Growth MMT in a conversion campaign per the account team’s instruction. Growth MMTs on TikTok should not be run in conversion campaigns, as results will be weak without a significant amount of spend and other prerequisites. When communicating Growth MMTs in awareness campaigns to clients, it is important to underscore that they will not see in-engine ROAS/Rev from a Growth MMT because Growth MMTs are run in Awareness campaign types. The budget was far too low to drive stat sig results in a conversion campaign.
It was established by the TikTok team during the middle of the test that reach and frequency campaigns led to incrementality, prompting the mid-test switch in order to see better results.
Aerogarden TikTok Growth MMT - same issues as the Colgate TikTok Growth MMT, including the mid-test switch to reach and frequency campaigns.
W&P TikTok Growth MMT - same issues as the Colgate TikTok Growth MMT, including the mid-test switch to reach and frequency campaigns.
Wild One TikTok Growth MMT - same issues as the Colgate TikTok Growth MMT, including the mid-test switch to reach and frequency campaigns.
Sunshine Sisters TikTok Growth MMT - same issues as the Colgate TikTok Growth MMT, including the mid-test switch to reach and frequency campaigns.
WP Engine Meta Growth MMT - we struggled with determining how long the ramp-up period should be, when the measurement period should be, and what period we should compare to, since there is often a lag between when an ad is seen and when a conversion happens. Additionally, with revenue as a KPI they often saw more deals close at EOM, so we always had to ensure the test and compare periods included similar days of the month.
We used unique MMT measurement frameworks and were able to see very compelling results, so in my eyes this test was a success despite many hurdles. We took a longer test approach and were able to account for the seasonality of more deals coming in at the end of the month.