{"id":2285,"date":"2022-09-11T10:21:10","date_gmt":"2022-09-11T03:21:10","guid":{"rendered":"https:\/\/mintea.blog\/?p=2285"},"modified":"2022-09-11T11:09:55","modified_gmt":"2022-09-11T04:09:55","slug":"2285","status":"publish","type":"post","link":"https:\/\/mintea.blog\/?p=2285","title":{"rendered":"Data fallacies"},"content":{"rendered":"<p>Data fallacies<\/p>\n<p>Statistical fallacies are common tricks data can play on you, which lead to mistakes in data interpretation and analysis. Explore some common fallacies, with real-life examples, and find out how you can avoid them.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"700\" height=\"989\" class=\"wp-image-2286\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-fallcies-poster-preview.jpeg\" alt=\"Data fallacies poster preview\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-fallcies-poster-preview.jpeg 700w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-fallcies-poster-preview-212x300.jpeg 212w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/data-fallacies-to-avoid.pdf\">Download poster<\/a><\/p>\n<p>Cherry Picking<\/p>\n<p>The practice of selecting results that fit your claim and excluding those that don\u2019t. 
The worst and most harmful example of being dishonest with data.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2287\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cherry-picking.png\" alt=\"Cherry Picking\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cherry-picking.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cherry-picking-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cherry-picking-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cherry-picking-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/cherry-picking.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>When making a case, data adds weight \u2013 whether it comes from a study, an experiment or something you\u2019ve read. However, people often highlight only the data that backs their case, rather than the entire body of results. It\u2019s prevalent in public debate and politics, where two sides can both present data that backs their position. Cherry Picking can be deliberate or accidental. When you receive data second hand, whoever chose what to share had an opportunity to distort the truth to suit whatever opinion they\u2019re peddling. 
When on the receiving end of data, it\u2019s important to ask yourself: \u2018What am I not being told?\u2019.<\/p>\n<p>Related Reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.economicshelp.org\/blog\/21618\/economics\/cherry-picking-of-data\/\"><strong>Economics Help: Examples of Cherry Picking in action<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.huffingtonpost.com\/peter-h-gleick\/misrepresenting-climate-s_b_819367.html\"><strong>Misrepresenting climate science: Cherry Picking data to hide the disappearance of arctic ice<\/strong><\/a><\/li>\n<\/ul>\n<p>Data Dredging<\/p>\n<p>Data dredging is the failure to acknowledge that the correlation was in fact the result of chance.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2288\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-dredging.png\" alt=\"Data Dredging\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-dredging.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-dredging-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-dredging-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/data-dredging-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/data-dredging.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>Tests for statistical significance only work if you\u2019ve defined your hypothesis upfront. Historically, this has been a problem with clinical trials where researchers have \u2018data-dredged\u2019 their results and switched what they were testing for. It explains why so many results published in scientific journals have subsequently been proven to be wrong. 
To avoid this, it\u2019s now becoming standard practice to register clinical trials, stating in advance what your primary endpoint measure is.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/xkcd.com\/882\/\"><strong>xkcd cartoon: Do green jelly beans cause acne?<\/strong><\/a><\/li>\n<li><a href=\"http:\/\/www.nature.com\/news\/scientific-method-statistical-errors-1.14700\"><strong>Statistical errors and P values explained: Why P values aren\u2019t as reliable as many scientists assume<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.methodspace.com\/primer-p-hacking\/\"><strong>How to avoid P Hacking and false positives in research studies<\/strong><\/a><\/li>\n<\/ul>\n<p>Survivorship Bias<\/p>\n<p>Drawing conclusions from an incomplete set of data, because that data has \u2018survived\u2019 some selection criteria.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2289\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/survivorship-bias.png\" alt=\"Survivorship Bias\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/survivorship-bias.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/survivorship-bias-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/survivorship-bias-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/survivorship-bias-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/survivorship-bias.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>When analyzing data, it\u2019s important to ask yourself what data you don\u2019t have. Sometimes, the full picture is obscured because the data you\u2019ve got has survived a selection of some sort. For example, in WWII, a team was asked where the best place was to fit armour to a plane. 
The planes that came back from battle had bullet holes everywhere except the engine and cockpit. The team decided it was best to fit armour where there were no bullet holes, because planes shot in those places had not returned.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/medium.com\/@penguinpress\/an-excerpt-from-how-not-to-be-wrong-by-jordan-ellenberg-664e708cfc3d\"><strong>Abraham Wald and the Missing Bullet Holes: An excerpt from How Not To Be Wrong by Jordan Ellenberg<\/strong><\/a><\/li>\n<\/ul>\n<p>Cobra Effect<\/p>\n<p>When an incentive produces the opposite of the intended result. Also known as a Perverse Incentive.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2290\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cobra-effect.png\" alt=\"Cobra effect\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cobra-effect.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cobra-effect-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cobra-effect-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/cobra-effect-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/cobra-effect.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>Named after a historical legend, the Cobra Effect occurs when an incentive for solving a problem creates unintended negative consequences. It\u2019s said that in the 1800s, the British Empire wanted to reduce cobra bite deaths in India. They offered a financial reward for every cobra skin brought to them, to motivate cobra hunting. Instead, people began farming cobras. When the government realized the incentive wasn\u2019t working, they removed it, so cobra farmers released their snakes, increasing the population. 
When setting incentives or goals, make sure you\u2019re not accidentally encouraging the wrong behaviour.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.datasciencecentral.com\/profiles\/blogs\/unintended-consequences-of-the-wrong-measures\"><strong>Unintended Consequences of the Wrong Measures<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/medium.com\/geckoboard-under-the-hood\/the-cobra-effect-how-to-avoid-unintended-consequences-when-setting-goals-b5c2864276e1\"><strong>The Cobra Effect: How to avoid unintended consequences when setting goals<\/strong><\/a><\/li>\n<\/ul>\n<p>False Causality<\/p>\n<p>To falsely assume when two events occur together that one must have caused the other.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2291\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/false-causality.png\" alt=\"False causality\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/false-causality.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/false-causality-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/false-causality-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/false-causality-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/false-causality.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>Global temperatures have steadily risen over the past 150 years and the number of pirates has declined at a comparable rate. No one would reasonably claim that the reduction in pirates caused global warming or that more pirates would reverse it. But it\u2019s not usually this clear-cut. Often correlations between two things tempt us to believe that one caused the other. However, it\u2019s often a coincidence or there\u2019s a third factor causing both effects that you\u2019re seeing. 
In our pirates and global warming example, the cause of both is industrialization. Never assume causation because of correlation alone \u2013 always gather more evidence.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"http:\/\/www.tylervigen.com\/spurious-correlations\"><strong>Spurious Correlations<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=gxSUqr3ouYA&amp;feature=youtu.be\"><strong>This \u2260 That<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.kdnuggets.com\/2014\/06\/cardinal-sin-data-mining-data-science.html\"><strong>The Cardinal Sin of Data Mining and Data Science: Overfitting<\/strong><\/a><\/li>\n<\/ul>\n<p>Gerrymandering<\/p>\n<p>The practice of deliberately manipulating boundaries of political districts in order to sway the result of an election.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2292\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gerrymandering.png\" alt=\"Gerrymandering\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gerrymandering.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gerrymandering-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gerrymandering-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gerrymandering-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/gerrymandering.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>In many political systems, it\u2019s possible to manipulate the likelihood of one party being elected over another by redefining the political districts \u2013 include more rural areas in a district to disadvantage the party that\u2019s more popular in cities etc. A similar phenomenon known as the Modifiable Areal Unit Problem (MAUP) can occur when analyzing data. How you define the areas to aggregate your data \u2013 e.g. 
what you define as \u2018Northern counties\u2019 \u2013 can change the result. The scale used to group data can also have a big impact. Results can vary wildly whether using postcodes, counties or states.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.nytimes.com\/interactive\/2017\/10\/03\/upshot\/how-the-new-math-of-gerrymandering-works-supreme-court.html\"><strong>How the new math of Gerrymandering works: The New York Times covers a Gerrymandering case in Wisconsin<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.ed.ac.uk\/files\/imports\/fileManager\/RFlowerdew_Slides.pdf\"><strong>Understanding the Modifiable Areal Unit Problem [PDF]<\/strong><\/a><\/li>\n<li><a href=\"http:\/\/wiki.gis.com\/wiki\/index.php\/Modifiable_areal_unit_problem\"><strong>The Modifiable Areal Unit Problem (MAUP) explained [Wikipedia]<\/strong><\/a><\/li>\n<\/ul>\n<p>Sampling Bias<\/p>\n<p>Drawing conclusions from a set of data that isn\u2019t representative of the population you\u2019re trying to understand.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2293\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/sampling-bias.png\" alt=\"Sampling bias\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/sampling-bias.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/sampling-bias-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/sampling-bias-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/sampling-bias-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/sampling-bias.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>A classic problem in election polling where people taking part in a poll aren\u2019t representative of the total population, either due to self-selection or bias from the analysts. 
One famous example occurred in 1948 when The Chicago Tribune mistakenly predicted, based on a phone survey, that Thomas E. Dewey would become the next US president. They hadn\u2019t considered that only a certain demographic could afford telephones, excluding entire segments of the population from their survey. Make sure to consider whether your research participants are truly representative and not subject to some sampling bias.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.khanacademy.org\/math\/statistics-probability\/designing-studies\/sampling-and-surveys\/a\/identifying-bias-in-samples-and-surveys\"><strong>How to identify bias in samples and surveys<\/strong><\/a><\/li>\n<\/ul>\n<p>Gambler&#8217;s Fallacy<\/p>\n<p>The mistaken belief that because something has happened more frequently than usual, it\u2019s now less likely to happen in future and vice versa.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2294\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gamblers-fallacy.png\" alt=\"Gambler's Fallacy\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gamblers-fallacy.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gamblers-fallacy-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gamblers-fallacy-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/gamblers-fallacy-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/gamblers-fallacy.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>This is also known as the Monte Carlo Fallacy because of an infamous example that occurred at a roulette table there in 1913. The ball fell in black 26 times in a row and gamblers lost millions betting against black, assuming the streak had to end. 
However, the chance of black is always the same as the chance of red regardless of what\u2019s happened in the past, because the underlying probability is unchanged. A roulette table has no memory. When tempted by this fallacy, remind yourself that there\u2019s no rectifying force in the universe acting to \u2018balance things out\u2019!<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/rationalwiki.org\/wiki\/Gambler's_fallacy\"><strong>The Gambler\u2019s Fallacy (aka the Monte Carlo Fallacy) explained [RationalWiki]<\/strong><\/a><\/li>\n<\/ul>\n<p>Regression Toward the Mean<\/p>\n<p>When something happens that\u2019s unusually good or bad, it will tend to revert towards the average over time.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2295\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/regression-toward-the-mean.png\" alt=\"Regression Toward the Mean\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/regression-toward-the-mean.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/regression-toward-the-mean-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/regression-toward-the-mean-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/regression-toward-the-mean-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/regression-toward-the-mean.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>Anywhere that random chance plays a part in the outcome, you\u2019re likely to see regression toward the mean. For example, success in business is often a combination of both skill and luck. 
This means that the best-performing companies today are likely to be much closer to average in 10 years\u2019 time, not through incompetence but because today they\u2019re likely benefitting from a string of good luck \u2013 like rolling a double-six repeatedly.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/rationalwiki.org\/wiki\/Regression_to_the_mean\"><strong>Regression to the Mean explained [RationalWiki]<\/strong><\/a><\/li>\n<li><a href=\"http:\/\/onlinestatbook.com\/2\/regression\/regression_toward_mean.html\"><strong>A more in-depth look at Regression Toward the Mean<\/strong><\/a><\/li>\n<\/ul>\n<p>Hawthorne Effect<\/p>\n<p>When the act of monitoring someone can affect that person\u2019s behavior. Also known as the Observer Effect.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2296\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/hawthorne-effect.png\" alt=\"Hawthorne Effect\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/hawthorne-effect.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/hawthorne-effect-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/hawthorne-effect-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/hawthorne-effect-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/hawthorne-effect.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>In the 1920s at Hawthorne Works, an Illinois factory, a social sciences experiment hypothesised that workers would become more productive following various changes to their environment such as working hours, lighting levels and break times. However, it turned out that what actually motivated the workers\u2019 productivity was someone taking an interest in them. 
When using human research subjects, it\u2019s important to analyze the resulting data with consideration for the Hawthorne Effect.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"http:\/\/www.economist.com\/node\/12510632\"><strong>The Hawthorne Effect<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.library.hbs.edu\/hc\/hawthorne\/intro.html#i\"><strong>Harvard Business School and the Hawthorne Experiments (a landmark study of worker behavior at Western Electric; 1924-1933)<\/strong><\/a><\/li>\n<\/ul>\n<p>Simpson&#8217;s Paradox<\/p>\n<p>A phenomenon in which a trend appears in different groups of data but disappears or reverses when the groups are combined.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2297\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/simpsons-paradox.png\" alt=\"Simpson's Paradox\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/simpsons-paradox.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/simpsons-paradox-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/simpsons-paradox-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/simpsons-paradox-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/simpsons-paradox.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>In the 1970s, the University of California, Berkeley was accused of sexism because female applicants were less likely to be accepted than male ones. However, when trying to identify the source of the problem, investigators found that for individual subjects the acceptance rates were generally better for women than men. The paradox was caused by a difference in what subjects men and women were applying for. 
A greater proportion of the female applicants were applying to highly competitive subjects where acceptance rates were much lower for both genders.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=ebEkn-BiW5k\"><strong>Simpson\u2019s Paradox<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.brookings.edu\/blog\/social-mobility-memos\/2015\/07\/29\/when-average-isnt-good-enough-simpsons-paradox-in-education-and-earnings\/\"><strong>When average isn\u2019t good enough: Simpson\u2019s Paradox in education and earnings<\/strong><\/a><\/li>\n<\/ul>\n<p>McNamara Fallacy<\/p>\n<p>Relying solely on metrics in complex situations can cause you to lose sight of the bigger picture.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2298\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/mcnamara-fallacy.png\" alt=\"McNamara Fallacy\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/mcnamara-fallacy.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/mcnamara-fallacy-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/mcnamara-fallacy-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/mcnamara-fallacy-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/mcnamara-fallacy.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>Named after Robert McNamara, the U.S. Secretary of Defense (1961-1968), who believed truth could only be found in data and statistical rigor. The fallacy refers to his approach of taking enemy body count as the measure of success in the Vietnam War. Obsessing over it meant that other relevant insights like the shifting mood of the U.S. public and the feelings of the Vietnamese people were largely ignored. When analyzing complex phenomena, we\u2019re often forced to use a metric as proxy for success. 
However, dogmatically optimizing for this number and ignoring all other information is risky.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.technologyreview.com\/s\/514591\/the-dictatorship-of-data\/\"><strong>The Dictatorship of Data<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/hbr.org\/2010\/12\/robert-s-mcnamara-and-the-evolution-of-modern-management\"><strong>Robert S. McNamara and the Evolution of Modern Management<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.logicallyfallacious.com\/tools\/lp\/Bo\/LogicalFallacies\/237\/McNamara-Fallacy\"><strong>McNamara Fallacy<\/strong><\/a><\/li>\n<li><a href=\"http:\/\/modelsandrisk.org\/blog\/the-mcnamara-fallacy-in-financial-policymaking\/index.html\"><strong>The McNamara fallacy in financial policymaking<\/strong><\/a><\/li>\n<\/ul>\n<p>Overfitting<\/p>\n<p>A more complex explanation will often describe your data better than a simple one. However, a simpler explanation is usually more representative of the underlying relationship.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2299\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/overfitting.png\" alt=\"Overfitting\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/overfitting.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/overfitting-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/overfitting-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/overfitting-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/overfitting.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>When looking at data, you\u2019ll want to understand what the underlying relationships are. To do this, you create a model that describes them mathematically. The problem is that a more complex model will fit your initial data better than a simple one. 
However, complex models tend to be very brittle: they work well for the data you already have, but try too hard to explain random variations. Therefore, as soon as you add more data, they break down. Simpler models are usually more robust and better at predicting future trends.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.kdnuggets.com\/2017\/08\/understanding-overfitting-meme-supervised-learning.html\"><strong>Understanding overfitting: an inaccurate meme in Machine Learning<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.youtube.com\/watch?v=cJA5IHIIL30\"><strong>How good is your fit? &#8211; Ep. 21 (Deep Learning Simplified)<\/strong><\/a><\/li>\n<li><a href=\"http:\/\/hunch.net\/?p=22\"><strong>Clever Methods of Overfitting<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/elitedatascience.com\/overfitting-in-machine-learning\"><strong>Overfitting in Machine Learning: What It Is and How to Prevent It<\/strong><\/a><\/li>\n<\/ul>\n<p>Publication Bias<\/p>\n<p>How interesting a research finding is affects how likely it is to be published, distorting our impression of reality.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2300\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/publication-bias.png\" alt=\"Publication Bias\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/publication-bias.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/publication-bias-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/publication-bias-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/publication-bias-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/publication-bias.pdf\"><strong>Get the printable card<\/strong><\/a><\/p>\n<p>For every study that shows statistically significant results, there may have been many similar tests that were inconclusive. 
However, significant results are more interesting to read about and are therefore more likely to get published. Not knowing how many \u2018boring\u2019 studies were filed away impacts our ability to judge the validity of the results we read about. When a company claims a certain activity had a major positive impact on growth, other companies may have tried the same thing without success, so they don\u2019t talk about it.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC5655535\/\"><strong>The perceived feasibility of methods to reduce publication bias<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.vox.com\/2016\/7\/14\/12016710\/science-challeges-research-funding-peer-review-process\"><strong>The 7 biggest problems facing science, according to 270 scientists<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/www.ted.com\/talks\/ben_goldacre_what_doctors_don_t_know_about_the_drugs_they_prescribe\/up-next\"><strong>Ben Goldacre TED Talk: What doctors don\u2019t know about the drugs they prescribe<\/strong><\/a><\/li>\n<\/ul>\n<p>Danger of Summary Metrics<\/p>\n<p>It can be misleading to only look at the summary metrics of data sets.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" width=\"1200\" height=\"864\" class=\"wp-image-2301\" src=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/danger-of-summary-metrics.png\" alt=\"Danger of Summary Metrics\" srcset=\"https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/danger-of-summary-metrics.png 1200w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/danger-of-summary-metrics-300x216.png 300w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/danger-of-summary-metrics-1024x737.png 1024w, https:\/\/mintea.blog\/wp-content\/uploads\/2022\/09\/danger-of-summary-metrics-768x553.png 768w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/p>\n<p><a href=\"https:\/\/www.geckoboard.com\/uploads\/danger-of-summary-metrics.pdf\"><strong>Get the printable 
card<\/strong><\/a><\/p>\n<p>To demonstrate the effect, statistician Francis Anscombe put together four example data sets in the 1970s. Known as Anscombe\u2019s Quartet, the four data sets share nearly identical means, variances and correlations. However, when graphed, it\u2019s clear that the data sets are totally different. The point that Anscombe wanted to make is that the shape of the data is as important as the summary metrics and cannot be ignored in analysis.<\/p>\n<p>Related reading:<\/p>\n<ul>\n<li><a href=\"https:\/\/www.autodeskresearch.com\/publications\/samestats\"><strong>Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing<\/strong><\/a><\/li>\n<li><a href=\"http:\/\/www.thefunctionalart.com\/2016\/08\/download-datasaurus-never-trust-summary.html\"><strong>Download the Datasaurus: Never trust summary statistics alone; always visualize your data<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/flowingdata.com\/2017\/07\/07\/small-summary-stats\/\"><strong>Summary Statistics Tell You Little About the Big Picture<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/blog.heapanalytics.com\/anscombes-quartet-and-why-summary-statistics-dont-tell-the-whole-story\/\"><strong>Anscombe\u2019s Quartet, and Why Summary Statistics Don\u2019t Tell the Whole Story<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/fivethirtyeight.com\/features\/al-gores-new-movie-exposes-the-big-flaw-in-online-movie-ratings\/\"><strong>Al Gore\u2019s New Movie Exposes The Big Flaw In Online Movie Ratings<\/strong><\/a><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Data fallacies Statistical fallacies are common tricks data can play on you, which lead to mistakes in data interpretation and analysis. Explore some common fallacies, with real-life examples, and find out how you can avoid them. 
Download poster Cherry Picking The practice of selecting results that fit your claim and excluding those that don\u2019t. The &hellip; <a href=\"https:\/\/mintea.blog\/?p=2285\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Data fallacies<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":2379,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[25],"tags":[32,26,66,58,52,51,59],"class_list":["post-2285","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bookmarked-articles","tag-analytic","tag-data","tag-data-fallacies","tag-data-modelling","tag-machine-learning","tag-modelling","tag-statistic"],"_links":{"self":[{"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts\/2285","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2285"}],"version-history":[{"count":2,"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts\/2285\/revisions"}],"predecessor-version":[{"id":2303,"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/posts\/2285\/revisions\/2303"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=\/wp\/v2\/media\/2379"}],"wp:attachment":[{"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2285"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2285"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mintea.blog\/index.php?rest_route=%2Fwp%2Fv2%
2Ftags&post=2285"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}