Data collection is one of the most important preparatory stages in the Marketing Mix Modeling process. Collecting the right types of data starts with carefully planning your objectives, then choosing the KPIs most relevant to those objectives. We’ve put some key pointers together to help make your data collection process sensible and pain-free.
Analysis of data at different frequencies can bring different perspectives, so it’s important to choose the one that works best for you. Think of how you can zoom in on a map for a more detailed street view or zoom out for a more global view. Choosing your data frequency has a similar effect.
Monthly data is slower moving and captures “macro” trends. Often brand health measures like brand awareness will only be available monthly.
Daily data is faster and useful for things like monitoring website traffic volumes.
Weekly data is by far the most common frequency used by our clients. Aggregating to weeks smooths out any “day of the week” effects and often dampens short-lived weather effects. In daily data, something like a day of heavy rainfall can cause a very short-term dip in retail store sales and a boost in online sales.
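As a rough illustration of moving from daily to weekly data, here is a minimal pandas sketch. The sales figures, the column names, and the choice of Monday-to-Sunday weeks are all assumptions made for the example, not a recommendation for your data.

```python
import pandas as pd

# Hypothetical daily sales: 14 days starting on Monday 2024-01-01,
# with higher sales on Saturday and Sunday (a "day of the week" effect).
days = pd.date_range("2024-01-01", periods=14, freq="D")
daily = pd.DataFrame({
    "date": days,
    "sales": [100, 100, 100, 100, 100, 150, 150] * 2,
})

# Sum each Monday-to-Sunday week. "W-SUN" labels every week by the
# Sunday on which it ends, so the weekend pattern is absorbed into one row.
weekly = (
    daily.set_index("date")["sales"]
    .resample("W-SUN")
    .sum()
    .rename("weekly_sales")
    .reset_index()
)

print(weekly)
```

Whichever week definition you pick (for example, weeks ending on Sunday versus Saturday), it is worth applying the same one to every data source.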
At some point you’ll need to bring all this data together for modeling. You should first look at and prepare the KPI data you intend to model against (the dependent variable). The goal is to create what is known as a “flat” file or table, which allows for quick checks of your data, either via code (Python, R, etc.) or pivot tables.
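To make the idea of a flat table concrete, the sketch below builds a tiny one (one row per week and market) and runs two quick checks on it. The markets, values, and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical flat KPI table: one row per week and market.
kpi = pd.DataFrame({
    "week": pd.to_datetime(
        ["2024-01-07", "2024-01-07", "2024-01-14", "2024-01-14"]
    ),
    "market": ["North", "South", "North", "South"],
    "sales": [800, 600, 950, 640],
})

# Check 1: each week/market combination should appear exactly once.
duplicate_rows = int(kpi.duplicated(subset=["week", "market"]).sum())

# Check 2: pivot weeks against markets to spot gaps or odd values at a glance.
check = kpi.pivot_table(
    index="week", columns="market", values="sales", aggfunc="sum"
)
missing_cells = int(check.isna().sum().sum())

print(check)
print("duplicate rows:", duplicate_rows, "| missing cells:", missing_cells)
```

The same pivot can be built in a spreadsheet; the point is that a flat layout makes either route quick.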
Once you’ve decided on the format of your KPI data, you should then align all your other data sources to this same structure. The data should match the same frequency and any other differentiating features such as applicable product or market location/geography.
By having all data in the same format and aligned, you can quickly compare independent variables against the dependent variable to come up with hypotheses when you build your models.
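A minimal sketch of that alignment step, assuming a weekly KPI table keyed by week and market and a hypothetical TV spend table with the same keys. Treating a missing spend row as zero spend is an assumption for this example; in practice you would confirm that with whoever supplies the data.

```python
import pandas as pd

# Hypothetical KPI table (dependent variable), one row per week and market.
kpi = pd.DataFrame({
    "week": pd.to_datetime(
        ["2024-01-07", "2024-01-14", "2024-01-07", "2024-01-14"]
    ),
    "market": ["North", "North", "South", "South"],
    "sales": [800, 950, 600, 640],
})

# Hypothetical media table, already at the same weekly/market structure.
# Note there is no row for South in the week ending 2024-01-14.
tv = pd.DataFrame({
    "week": pd.to_datetime(["2024-01-07", "2024-01-14", "2024-01-07"]),
    "market": ["North", "North", "South"],
    "tv_spend": [10.0, 20.0, 5.0],
})

# A left join keeps every KPI row; validate= raises an error if either
# table has duplicate week/market keys.
flat = kpi.merge(tv, on=["week", "market"], how="left", validate="one_to_one")

# Assumption for this sketch: no row in the media table means no spend.
flat["tv_spend"] = flat["tv_spend"].fillna(0.0)

# A first, rough comparison of an independent variable against the KPI.
sales_tv_corr = flat["sales"].corr(flat["tv_spend"])
print(flat)
print("correlation of sales with tv_spend:", round(sales_tv_corr, 3))
```

A simple correlation like this is only a starting point for forming hypotheses, not evidence of a media effect.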
How you group your variables is important both for reporting results at a high level and for building an actionable model.
As an example, say you’re building a model that must include insights on most of the social media budget. You have 4 different campaigns, but you’re only able to get 1 of them into the model. This won’t work well in providing action items on what to do for social media in the future. If you create an aggregated or grouped variable that combines all 4 campaigns, however, you can test this in your model. If the grouped variable is retained in the model, it is more useful for planning and decision making than having just that 1 campaign.
By preparing these groupings during data collection, you can quickly pivot to them during modeling if needed, rather than having to rethink the data once modeling has begun.
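One way to prepare such groupings up front is to keep a simple mapping from group name to component columns, as in the sketch below. The campaign names and spend values are hypothetical.

```python
import pandas as pd

# Hypothetical weekly spend for four social media campaigns.
social = pd.DataFrame({
    "week": pd.to_datetime(["2024-01-07", "2024-01-14", "2024-01-21"]),
    "social_campaign_a": [100.0, 0.0, 50.0],
    "social_campaign_b": [20.0, 30.0, 0.0],
    "social_campaign_c": [0.0, 10.0, 10.0],
    "social_campaign_d": [5.0, 5.0, 5.0],
})

# Mapping of grouped variable -> the columns it combines.
# Further groupings can be added here without touching the loop below.
groups = {
    "social_total": [
        "social_campaign_a",
        "social_campaign_b",
        "social_campaign_c",
        "social_campaign_d",
    ],
}

# Add each grouped variable alongside the individual campaigns, so both
# levels are available to test during modeling.
for group_name, columns in groups.items():
    social[group_name] = social[columns].sum(axis=1)

print(social[["week", "social_total"]])
```

Keeping the individual campaign columns next to the grouped one means you can test either level without going back to the raw data.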
Data collection can be challenging, but it doesn’t have to be. With proper planning and understanding of the data, you can minimize the challenges of getting the right data for your model and head off any problems that may arise while modeling that stem from data collection.