Being comfortable with no assumptions
This article discusses one of the few things that I have learned as a data scientist. It does not involve any popular new methods or deep learning, and perhaps does not sound as cool. It is something much more fundamental, but absolutely crucial to any scientist’s career.
Whenever a problem is given to someone and their task is to find the best possible solution, one of the most important aspects is to efficiently search through the possible solutions. However, when presented with a problem, the first step people usually take is to quickly begin searching for a solution. This search happens in a virtual space of possible solutions that is created by one’s mind almost automatically. We are able to create this virtual search space very quickly by making many assumptions about the problem.
We have been pushed by evolution to make assumptions in order to survive and learn quickly. Making assumptions is a very useful tool: it allows us to skip many steps in understanding things so that we can incorporate new knowledge at a fast pace. In fact, we have been driven to make assumptions so strongly that the process is not only automatic but often completely invisible to us.
I would now like to prove to you that you do make assumptions. First, consider this: most solutions to puzzles and brainteasers are difficult to find because they lie outside the solution space you create to begin with. This happens because the problems are worded in such a way as to induce a certain automatic assumption in the listener. Look at the problems below and think about them for a bit before you continue reading this article.
Here is the first puzzle. Simply figure out which number is on the spot in which the car is parked.
In this second puzzle, imagine 4 toothpicks on a table forming a cross. You are only allowed to move 1 toothpick once, and your goal is to make a square. You are not allowed to use a toothpick to move the other toothpicks or to break a toothpick.
Both examples above have very simple solutions. They are only hard to find because your search space does not contain them. It is human nature to assume that a solution has not been found because one did not look carefully enough, rather than because one was looking in the wrong space. The same applies to work in data science and software engineering.
In my academic years, I spent my fair share of time coding and fixing bugs; I also competed in coding competitions where both quality and speed were being tested, so finding bugs at a fast pace was crucial. One thing that I learned was to always be aware of my assumptions. They allow you to search in the most probable places first and get to your solution more quickly. However, you need to be aware of these assumptions so that when you are not finding your solution you can revisit them.
A behavior that I have seen over and over again among software engineers in industry is to make assumptions about their code when looking for a bug and never question those assumptions. It goes something like this:
Coder 1 (the one with the bug) “Could you help me fix this bug?”
Coder 2 (friend helping out) “Where do you think the bug is?”
Coder 1 “Somewhere around here.”
Coder 2 “Did you look over here?”
Coder 1 “No, but I don’t have to. It is impossible for a bug to be there because …”
The explanations for why a bug cannot possibly exist in certain parts of the code are endless. I always repeat myself and say “question your assumptions,” and the bug is usually hiding exactly where people are not looking. If the bug were hiding where the person was looking, it wouldn’t really be hidden or hard to find in the first place. This is one of the hardest lessons for people to learn.
In part, this is because making no assumptions may make you look to others as if you do not know what you are doing. Checking parts of the code that "cannot" have a problem, for whatever reason, might make you seem foolish for testing them. Make no mistake: as a species we have shown time and again that the only foolish mistake is to think we know more than we actually do. There is no shame in being open to learning new things; there is shame, however, in refusing to learn because of posturing and wanting to appear smart.
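One way to make this habit concrete is to write assumptions down as explicit checks and run all of them, including the ones that "cannot" fail. The following is only a minimal Python sketch under stated assumptions, not a prescribed method: the function `failed_assumptions`, the `readings` data, and the three checks are hypothetical names and values invented purely for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical structure: each assumption is a (description, check) pair.
Assumption = Tuple[str, Callable[[], bool]]


def failed_assumptions(assumptions: List[Assumption]) -> List[str]:
    """Run every check and return the descriptions of those that do not hold."""
    failures = []
    for description, check in assumptions:
        try:
            holds = check()
        except Exception:
            # A check that crashes is treated as an assumption that does not hold.
            holds = False
        if not holds:
            failures.append(description)
    return failures


# Hypothetical data that downstream code "knows" is sorted.
readings = [3, 7, 7, 12, 9, 15]

assumptions = [
    ("readings is not empty", lambda: len(readings) > 0),
    ("readings contains only ints", lambda: all(isinstance(r, int) for r in readings)),
    ("readings is sorted ascending", lambda: readings == sorted(readings)),
]

if __name__ == "__main__":
    for description in failed_assumptions(assumptions):
        print("Assumption does not hold:", description)
```

In this made-up case the only check that fails is the one about ordering, which is the kind of assumption a coder might dismiss as impossible to violate. The value of the list is less in the code itself than in being forced to state each assumption so that it can be revisited.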
To conclude this article, I would like to tell you a story about a true scientist with no fear of questioning assumptions: Dr. Ignaz Semmelweis, who is credited with discovering the importance of hand-washing before medical procedures. First, he made use of statistics and noticed that women giving birth in one maternity clinic had a significantly higher mortality rate than those in another. What followed was a truly methodical and relentless pursuit of the reason why. He proceeded to test everything that was different between the two places, including the ringing of bells. When I tell this story, people think that it was silly to have tested bell ringing, that he must have been superstitious, and that a true scientist would have known that bell ringing could not possibly affect mortality rates. Does this sound familiar? This is another example of assumptions being reinforced by arrogance and fear of ridicule, which is unfortunately a very common behavior. Dr. Semmelweis was a true scientist. He was not going to let embarrassment or ridicule get in the way of science; he proceeded beautifully and methodically, held “no assumptions,” and eventually found the true answer to the problem at hand.
It is hard to follow the above story with any last words that drive the point home more poignantly. Still, remember two things. First, put some thought into the problem definition. Second, revisit that step if you can’t find the solution you are looking for. You might have to question some of the assumptions you made, or try to think of assumptions you might have made inadvertently, which are much trickier to spot.