Back in the early 1990s, commercial data science was still in its infancy. In fact, even the term “data science” didn’t really exist at that point – not in the common lexicon, at least. At the time, the skill of extracting knowledge from large datasets and using it to solve high-dimensional problems was a niche one, reserved for a select few at the leading edge of mathematics and computing.
Over the course of the next two decades, the number of people in possession of these specialized skills continued to grow. As businesses started to become cognisant of the competitive advantage that data analytics could provide, commercial interest – and vacancies for talented professionals – soared. Very quickly, data science changed from an isolated pursuit to a vital tool, and companies of all kinds naturally rallied behind the people who could wield it effectively.
The awakening of the corporate world to the potential of data science wasn’t the only driving force during that time, however. Running in parallel was a significant improvement in our ability to cheaply house and process vast amounts of data, driven by huge advances in storage and computation. As our own capability to exploit data continued to grow, technology improved in tandem, allowing us to augment our skills further still.
Twenty years of rapid progress lead us to today, an age in which data science has developed an almost cult-like status. Greater access to tools and learning means that data science has now become a sought-after career path, and a prestigious one, too. At the start of the past decade, the Harvard Business Review dubbed data science “the sexiest job of the 21st Century”, and pondered a future in which data professionals might be so in demand that shortages would eventually loom.
That speculation was not, as it happens, unwarranted. In the past few years, we’ve begun to see a clear gap emerging between the supply of and demand for skilled workers: there are now three times as many job postings in the data field as there are job searches.
As demand for qualified data professionals has grown, however, we’ve also seen something of a democratization in what it means to be a data scientist. Most people who handle data today, whether in large volumes or small chunks, are involved in data science to at least some degree. Computational modelling for software development, financial or operational modelling, even sales forecasts – they all call for a measure of data science ability.
This has given rise to a group of people that can best be referred to as “citizen data scientists”.
These “citizens” aren’t data scientists by trade. While they might need to generate models using advanced diagnostic analytics or predictive and prescriptive capabilities, their actual role sits outside of the field of statistics and analytics. Instead, citizen data scientists can be found in areas such as finance, sales, operations, and more. In the same way that most of us now use computers at work but aren’t computer scientists, citizen data scientists use data science to get their job done, even though it’s not their full-time vocation.
As the number of people sitting in those roles continues to grow, some of the responsibilities that today’s data scientists bear are likely to be passed down to that secondary group. That’s in addition to automation, which will take on a significant portion of a data scientist’s traditional workload by itself.
If their work is being filtered down to other parties, then, does that mean that data scientists will disappear? Far from it. Instead, I believe that we’re on the cusp of an age in which we fundamentally change what it means to be a data scientist. Just as the role began with specialists and niche interests, the democratization of data science will free up highly skilled professionals to focus on delivering true value – something I’ll touch on more in my next post in this series.
Making this vision a reality, of course, means that we need to ensure that our citizen data scientists have everything they need to thrive. The key to that is data literacy, and the priority now should be on providing everyone in an organization with the capability to read, understand, and communicate through data.
I think that there are seven key areas that businesses need to address in order to make that happen.
- Strategy: data literacy should be part of a company’s organizational strategy, with well-defined and forward-thinking processes.
- Investment: businesses need to ensure that they’re investing in data literacy not just today, but in the longer term as well.
- Partnership: companies need partners who can help them drive the adoption of data science as a wider organizational skillset.
- Access: they need to systematically introduce data to people, providing them with the right access and tools that help them easily digest and use data.
- Customization: they should also provide users with no-code solutions and automated machine learning, again something that the right partner can assist with.
- Security: organizations should maintain centralized controls that are compliant with ever-evolving privacy and protection laws.
- Training: finally, they should deliver systematic training that can be reflected in expanded job descriptions as capabilities improve.
By the end of this decade, data science will become a universal skill – one that stands alongside the likes of mathematics, basic statistics, and computing in terms of its ubiquity. As we move into the decade of data literacy, we all have a responsibility to ensure that we’re speaking the same language.
Vijay Balaji Madheswaran is Director of Applied Data Science for dunnhumby APAC. Focused on investment, partnership, access, and customization, Vijay is passionate about helping Retailers and Brands realize the full value of their data.
As organizations around the world and across all fields look to capitalize on the growth potential of data science, data literacy is set to become an essential skill during the next decade. dunnhumby’s Vijay Balaji Madheswaran explores the history of data science, and looks towards a future in which everyone needs to talk the language of data.