{"id":2664,"date":"2025-03-12T20:40:23","date_gmt":"2025-03-12T13:40:23","guid":{"rendered":"https:\/\/mintea.blog\/?p=2664"},"modified":"2025-03-12T20:40:50","modified_gmt":"2025-03-12T13:40:50","slug":"2664","status":"publish","type":"post","link":"https:\/\/mintea.blog\/?p=2664","title":{"rendered":"How to investigate a spike in your data"},"content":{"rendered":"<h2>How to investigate a spike in your data<\/h2>\n<p>So, you\u2019ve just noticed a spike in your data. Maybe your trendline looks something like this?<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"367\" class=\"wp-image-2665\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-spike-your-data.png\" alt=\"A spike your data\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-spike-your-data.png 1429w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-spike-your-data-300x77.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-spike-your-data-1024x263.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-spike-your-data-768x197.png 768w\" sizes=\"auto, (max-width: 1429px) 100vw, 1429px\" \/><\/p>\n<p>Or if you\u2019ve caught it early, it might even look something like this:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"367\" class=\"wp-image-2666\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-sudden-trendline-spike.png\" alt=\"A sudden trendline spike\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-sudden-trendline-spike.png 1429w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-sudden-trendline-spike-300x77.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-sudden-trendline-spike-1024x263.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/a-sudden-trendline-spike-768x197.png 768w\" sizes=\"auto, (max-width: 1429px) 100vw, 1429px\" \/><\/p>\n<p>Now, obviously, I have no way of knowing if your y-axis represents something 
really good (like new customers) or something really bad (like system errors). And depending on which it is, you may be feeling a strong urge to pop the champagne or hit the panic button.<\/p>\n<p>But as we know, data can be deceptive, so don\u2019t jump to conclusions just yet. Instead, here are some simple steps to work through when you notice a spike in your data.<\/p>\n<p>1. Is this metric important?<\/p>\n<p>Before investigating any data anomaly, it\u2019s important to triage the issue. This makes sure we respond to the most important issues quickly, without unnecessarily distracting ourselves from our current priorities. Ask yourself, if this spike were genuine and represented a real-world change:<\/p>\n<ol>\n<li>Would the consequences be significant?<\/li>\n<li>How urgently would they require attention?<\/li>\n<\/ol>\n<p><strong>Scenario A: It\u2019s significant and needs urgent attention.<\/strong><br \/>\nIt\u2019s potentially a critical issue, so you may need to run two processes in parallel: respond to the issue as if it were genuine, and begin investigating at the same time. Remember to clearly communicate to others the extent to which you have been able to verify the data.<\/p>\n<p><strong>Scenario B: It\u2019s significant, but doesn\u2019t need urgent attention.<\/strong><br \/>\nIf you responded to every blip in your data immediately, you would never get anything else done, so it\u2019s okay to wait. Schedule a time to investigate the issue (and make sure you follow through).<\/p>\n<p><strong>Scenario C: There wouldn\u2019t be any significant consequences.<\/strong><br \/>\nRemember that not every anomaly requires detailed investigation. Of course, curiosity is a valuable quality for anyone working with data \u2013 but so is efficiency. And living in the \u2018era of big data\u2019 doesn\u2019t mean you have to be across every data point. Smart leaders choose to focus on the metrics that matter. 
Prioritize your investigations, or give them an appropriate time frame.<\/p>\n<p>2. Is the spike just a natural variation?<\/p>\n<p>Sometimes we see irregularities in data that are, in fact, completely normal. Ask yourself: how does the metric usually behave? Is it usually steady, or does it carry variance? (In other words, does it jump around a lot?)<\/p>\n<p>A simple way to check this is by looking at how the trend changes over time. Have there ever been spikes this extreme before? Is the current spike consistent with any wider trends?<\/p>\n<p>In the example below, a Customer Success Manager is looking at the average first response time over the past week. They\u2019ve noticed what looks like a spike in their data.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"489\" class=\"wp-image-2667\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-seven-day-period.png\" alt=\"Looking at a spike over a seven day period\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-seven-day-period.png 1429w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-seven-day-period-300x103.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-seven-day-period-1024x350.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-seven-day-period-768x263.png 768w\" sizes=\"auto, (max-width: 1429px) 100vw, 1429px\" \/><\/p>\n<p>But that\u2019s because this metric carries a lot of variance, especially when calculated each day. 
By looking at the trendline over time (in this case, over the past month), the manager can see that this spike is completely normal.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"430\" class=\"wp-image-2668\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-one-month-period.png\" alt=\"Looking at a spike over a one month period\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-one-month-period.png 1429w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-one-month-period-300x90.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-one-month-period-1024x308.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/looking-at-a-spike-over-a-one-month-period-768x231.png 768w\" sizes=\"auto, (max-width: 1429px) 100vw, 1429px\" \/><\/p>\n<p>Often, as in the example above, a quick look at the shape of the data over time will tell you if your spike is normal. However, if you want to be more rigorous, you can employ statistical process controls.<\/p>\n<p>Statistical process controls (SPCs)<\/p>\n<p>Statistical process controls are a more rigorous way of determining whether a metric is behaving normally. Typically, a control involves a series of tests: for example, does any data point fall outside a defined \u2018normal\u2019 range? SPCs are also useful because they can be automated.<\/p>\n<p>A common example of an SPC is an XmR chart, which sets that range from the average change between consecutive data points \u2013 we\u2019d recommend this guide to\u00a0<a href=\"https:\/\/www.staceybarr.com\/measure-up\/build-xmr-chart-kpi\/\">building your own XmR chart<\/a>\u00a0by Stacey Barr.<\/p>\n<p>3. Consider seasonal trends and natural cycles<\/p>\n<p>Some metrics vary according to natural cycles. Many sales patterns, for example, are tied to the calendar year. 
Think about consumer purchase behaviour around Black Friday and in the run-up to Christmas, new gym memberships in January or ice-cream sales in summer.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" class=\"wp-image-2669\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/homer-simpson-invested-in-pumpkins-in-october.jpeg\" alt=\"Homer Simpson invested in pumpkins in October\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/homer-simpson-invested-in-pumpkins-in-october.jpeg 640w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/homer-simpson-invested-in-pumpkins-in-october-300x225.jpeg 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/p>\n<p>When we correctly identify natural cycles, we can use them to add valuable context to our data. For example, this ecommerce site has added a yearly comparison to its trendline. It helps you to determine whether the spikes in December and May are anomalies, or the result of a seasonal trend.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"741\" class=\"wp-image-2670\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/trendline-with-seasonal-context-comparison.png\" alt=\"trendline with seasonal context comparison\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/trendline-with-seasonal-context-comparison.png 1429w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/trendline-with-seasonal-context-comparison-300x156.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/trendline-with-seasonal-context-comparison-1024x531.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/trendline-with-seasonal-context-comparison-768x398.png 768w\" sizes=\"auto, (max-width: 1429px) 100vw, 1429px\" \/><\/p>\n<p>Similarly, awareness of cycles can help us spot irregularities that would otherwise appear normal. 
In the example below, this product would usually experience an uptick in sales in the run-up to Christmas, but this hasn\u2019t happened this year. Without the comparison, we might incorrectly interpret the trendline as \u2018normal\u2019, with no cause for alarm.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"365\" class=\"wp-image-2671\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/seasonal-comparison-shows-missed-uptick.png\" alt=\"seasonal comparison shows missed uptick\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/seasonal-comparison-shows-missed-uptick.png 1429w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/seasonal-comparison-shows-missed-uptick-300x77.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/seasonal-comparison-shows-missed-uptick-1024x262.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/seasonal-comparison-shows-missed-uptick-768x196.png 768w\" sizes=\"auto, (max-width: 1429px) 100vw, 1429px\" \/><\/p>\n<p>Be careful when making cyclical or seasonal comparisons, particularly as they can be prone to our own biases. Find a way to test or verify these assumptions.<\/p>\n<p>4. Is there a data quality issue?<\/p>\n<p>Your spike could be caused by a data quality issue, as opposed to a real-world change.<\/p>\n<p>Data quality issues occur when there is a problem in the way data is collected or recorded, or because of non-genuine inputs. An example of a non-genuine input would be a member of your team testing a feature multiple times, without disabling the analytics, which then causes a usage spike.<\/p>\n<p>Consider the obvious causes of data quality issues, and try to rule them out. 
Common issues include:<\/p>\n<ul>\n<li>outages in tools<\/li>\n<li>internal testing<\/li>\n<li>broken code (such as event tracking)<\/li>\n<li>changes to your platform<\/li>\n<li>bots<\/li>\n<li>spam (such as spam signups)<\/li>\n<li>bugs that cause events to fire when they shouldn\u2019t, or not fire when they should<\/li>\n<li>human error, or manual steps not being completed<\/li>\n<li>SQL jobs not being run<\/li>\n<\/ul>\n<p>Once you\u2019ve ruled out some of the more common causes of data quality issues, you should aim to verify or falsify the spike.<\/p>\n<p>Use secondary data to verify or falsify the spike<\/p>\n<p>If your KPIs have genuinely spiked, then the cause of that spike will likely have influenced other related metrics.<\/p>\n<p>In the example below, our pageviews have spiked, but two closely related metrics (sessions and unique pageviews) have not. If the spike were genuine (and our website had genuinely attracted many more visitors that day), we would expect all three to be affected in a similar way. The fact that only one metric has spiked suggests a data quality issue.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1429\" height=\"1114\" class=\"wp-image-2672\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/comparing-related-metrics-to-investigate-spike.png\" alt=\"Comparing related metrics to investigate spike\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/comparing-related-metrics-to-investigate-spike.png 1429w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/comparing-related-metrics-to-investigate-spike-300x234.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/comparing-related-metrics-to-investigate-spike-1024x798.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/comparing-related-metrics-to-investigate-spike-768x599.png 768w\" sizes=\"auto, (max-width: 1429px) 100vw, 1429px\" \/><\/p>\n<p>Similarly, you may use different tools to track the same metrics. 
Discrepancies between data reported by different tools are a tell-tale sign of issues in data quality.<\/p>\n<p>For example, if your analytics tool recorded no website traffic to your payment pages, but your payment tool recorded consistent new payments, then it\u2019s likely there\u2019s an issue with your data quality. However, if both tools recorded a dip, then it\u2019s more likely something has genuinely happened within the user journey.<\/p>\n<p>5. Segment the data, follow the breadcrumbs<\/p>\n<p>Segmenting data is a great way of testing existing theories and prompting new ones.<\/p>\n<p>In the example below, we see a huge drop in traffic on September 12. If our hypothesis was that this was caused by a national holiday in one of the countries where we operate, we could easily test that theory by segmenting the data according to country. If it were true, we would see a drop in that country and normal activity in the segments for other countries.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1432\" height=\"610\" class=\"wp-image-2673\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/chart-line-chart-description-automatically-gener-3.png\" alt=\"Line chart showing a drop in traffic on September 12\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/chart-line-chart-description-automatically-gener-3.png 1432w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/chart-line-chart-description-automatically-gener-3-300x128.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/chart-line-chart-description-automatically-gener-3-1024x436.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/chart-line-chart-description-automatically-gener-3-768x327.png 768w\" sizes=\"auto, (max-width: 1432px) 100vw, 1432px\" \/><\/p>\n<p>If you don\u2019t have any working theories, then segmenting by different variables can present clues. Here, we\u2019ve segmented the data by acquisition source. 
We can see that all segments have stayed normal apart from one \u2013 traffic from Google Ads.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"2478\" height=\"2586\" class=\"wp-image-2674\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/two-line-charts-one-is-segmenting-session-data-by.png\" alt=\"Two line charts: one is segmenting session data by source and shows a dip in Google Ads traffic.\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/two-line-charts-one-is-segmenting-session-data-by.png 2478w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/two-line-charts-one-is-segmenting-session-data-by-287x300.png 287w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/two-line-charts-one-is-segmenting-session-data-by-981x1024.png 981w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/two-line-charts-one-is-segmenting-session-data-by-768x801.png 768w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/two-line-charts-one-is-segmenting-session-data-by-1472x1536.png 1472w, https:\/\/mintea.blog\/wp-content\/uploads\/2025\/03\/two-line-charts-one-is-segmenting-session-data-by-1962x2048.png 1962w\" sizes=\"auto, (max-width: 2478px) 100vw, 2478px\" \/><\/p>\n<p>Of course, on its own, this doesn\u2019t explain the underlying cause of the dip, but it does give us our next step (or breadcrumb) to investigate. You can repeat this process of segmentation as you home in on the specific causes of your spike. But be careful: repeated segmenting shrinks the sample size in each segment, which can produce trends that look irregular but are not actually statistically unusual.<\/p>\n<p>6. Has anything changed?<\/p>\n<p>Another way of generating theories, which you can then test, is by asking a very simple question: has anything changed?<\/p>\n<p>Have we changed anything?<\/p>\n<p>Start internally. Check for new updates, new features and new areas of work. And pay\u00a0<em>very<\/em>\u00a0close attention to dates and sequences of events. 
This will allow you to rule out working theories if the sequences of events don\u2019t match. It will also reveal\u00a0<a href=\"https:\/\/www.geckoboard.com\/best-practice\/statistical-fallacies\/\">coincidences<\/a>, such as two unusual but seemingly unconnected things happening at the same time.\u00a0<em>Always<\/em>\u00a0interrogate these coincidences.<\/p>\n<p>It\u2019s important to keep a completely open mind. The cause of your spike may be an unintended consequence of something seemingly unrelated. Normally, it\u2019s because something, somewhere, has changed.<\/p>\n<p>Has anything changed in the outside world?<\/p>\n<p>You can\u2019t investigate everything, of course, but do start with the likely suspects. If your spike relates to web traffic, check to see if Google has updated its algorithm. If your spike relates to an IT system failure, check to see if the same thing has happened to other users.<\/p>\n<p>7. Involve others<\/p>\n<p>If you haven\u2019t already, ask around.<\/p>\n<p>Has anyone noticed data spikes of their own? Does anyone know if anything has changed? (Again, pay very close attention to dates and sequences of events.)<\/p>\n<p>Your team is one of the best resources you have for troubleshooting data issues. They contribute valuable knowledge and perspective. Use them!<\/p>\n<p>8. Keep monitoring<\/p>\n<p>As frustrating as it may be, you should accept it\u2019s not always possible to get to the bottom of data spikes. At least not straight away.<\/p>\n<p>But the worst thing we can do is forget about them\u2026<\/p>\n<p>Why?<\/p>\n<p>Because it might happen again.<\/p>\n<p>And if it happens again, you now have a recurring issue, and that\u2019s far more significant to you and your business.<\/p>\n<p>Also, if it happens again, you have twice as many data points to investigate and compare.<\/p>\n<p>So really, the best thing we can do is to keep monitoring. 
Because the greater level of awareness you can create, the more likely it is we will eventually spot the patterns and crack the case.<\/p>\n<p>So be patient. Keep monitoring your KPIs. And good luck.<\/p>\n<p>Source: Geckoboard Blog<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to investigate a spike in your data So, you\u2019ve just noticed a spike in your data. Maybe your trendline looks something like this? Or if you\u2019ve caught it early, it might even look something like this: Now, obviously, I have no way of knowing if your y-axis represents something really good (like new customers) &hellip; <a href=\"https:\/\/mintea.blog\/?p=2664\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">How to investigate a spike in your data<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[108],"tags":[32,79,43,26,68,78],"class_list":["post-2664","post","type-post","status-publish","format-standard","hentry","category-articles","tag-analytic","tag-bi","tag-dashboard","tag-data","tag-kpi","tag-visualization"],"_links":{"self":[{"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts\/2664","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2664"}],"version-history":[{"count":3,"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts\/2664\/revisions"}],"predecessor-version":[{"id":2677,"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts\/2664\/revisions\/2677"}],"wp:attachment":[{
"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2664"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2664"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2664"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}