Data science is a team sport.
In 2012, Tom Davenport, a well-known management and technology thinker, and D.J. Patil, a computer scientist who later served as chief data scientist in the Obama Administration's Office of Science and Technology Policy, penned a provocative Harvard Business Review essay whose title declared data scientist the "sexiest job of the twenty-first century."
However, Joe DosSantos, chief data and analytics officer at Qlik, disagrees with their assessment. "The essay established unrealistic expectations of what a data scientist can achieve," DosSantos argues, adding that these exaggerated expectations intensified the battle for talent by encouraging employers to seek out "unicorn" applicants who possess all of the skills Davenport and Patil described.
Rather than looking for the master-of-all-trades data scientists that Davenport and Patil described, companies should consider how to build teams in which data-literate analysts work alongside subject-matter specialists from multiple business divisions, DosSantos says. He describes data science as a "team sport." And data scientists aren't even the most crucial members of the team, he adds.
Creating efficient algorithms for businesses involves a number of steps. DosSantos says that aligning the organisation behind an important problem, and then figuring out how an algorithm might help solve that problem, are among the most crucial. Most data scientists, he claims, are "useless" at both of these tasks.
Other critical steps include determining what data is available and considering the ethical implications of applying that data to a specific business case. After that, a business must acquire the data, clean it, create the algorithm, and train it on the data. The algorithm must then be tested, and a deployment strategy must be devised. Finally, the business must monitor the algorithm's performance to ensure that it continues to function properly. According to DosSantos, only a few of those steps require a data scientist: developing the algorithm and training it. Most of the others can be handled by subject-matter experts or people with various engineering and analytic backgrounds.
It's also important, according to DosSantos, to distinguish between tasks that require a data scientist, such as designing an algorithm and developing ways to monitor it once it's in production, and those that require only a good data engineer, such as creating clean data sets to train the algorithm. He says, "The data scientist and the data engineer work together in a Batman and Robin kind of way." At Qlik, he says, there are two data engineers for every data scientist working on a project.