Fulfilling big data’s promise with small and wide data

By Jim Hare, distinguished VP Analyst, Gartner

Disruptions such as the COVID-19 pandemic can quickly make historical data, which reflects past conditions, obsolete for organisations. As organisations run into the limitations of big data as a critical enabler of analytics and AI, new approaches known as ‘small data’ and ‘wide data’ are emerging.

The big data era achieved the ability to store and manage data but fell short of enabling organisations to derive value from it. This is where small and wide data come in to deliver on that promise.

The wide data approach enables the analysis and synergy of a variety of small and large, unstructured and structured data sources. The small data approach, on the other hand, is about the application of analytical techniques that require less data but still offer useful insights.

Both approaches enable more robust analytics and AI, reducing an organisation’s dependency on big data and enabling a richer, more complete situational awareness or 360-degree view. Organisations can then apply analytics for better decision making in the increasingly complex context of disruptions, market dynamics and demanding customers.

According to Gartner, 70% of organisations will be compelled to shift their focus from big to small and wide data by 2025, providing more context for analytics and making AI less data hungry.

Data and analytics (D&A) leaders must envisage a strategy that empowers their organisations to use small, wide and synthetic data to drive business transformation via analytics augmented with AI and machine learning (ML). This will help them tackle challenges such as the low availability of training data, and develop more robust models by using a wider variety of data.

Why are small and wide data important?

Analytics and AI need to be able to work with more recent and less voluminous data. In addition, collecting sufficiently large volumes of historical or labelled data for analytics and AI is a challenge for many organisations.

Data sourcing, data quality, bias and privacy protection are common challenges. But even if big data is available, the costs, time and energy to implement conventional supervised ML can still be prohibitive. In addition, decision making by humans and AI has become more complex and demanding, requiring a greater variety of data for better situational awareness.

Taken together, this means that there’s a growing need for analytical techniques that can leverage available data more effectively, either by reducing the required volume or by extracting more value from unstructured, diverse data sources.

What is the impact?

The wide data approach applies X analytics, with X standing for finding links between data sources, as well as for a diversity of data formats. These formats include tabular, text, image, video, audio, voice, temperature, or even smell and vibration. The data itself comes from an increasing range of internal and external data sources, such as data marketplaces, brokers, social media, IoT sensors and digital twins.
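As a rough illustration of finding links between data sources in different formats, the sketch below joins a structured source (unit sales) with an unstructured one (free-text reviews) on a shared key. Everything in it is an assumption made for illustration: the sample records, the `product_id` key, the `NEGATIVE_WORDS` list and the function names are hypothetical, and a simple keyword count stands in for whatever text analytics an organisation would actually use.

```python
from collections import defaultdict

# Hypothetical structured source: unit sales per product.
sales = [
    {"product_id": "A1", "units_sold": 40},
    {"product_id": "B2", "units_sold": 15},
]

# Hypothetical unstructured source: free-text customer reviews.
reviews = [
    {"product_id": "A1", "text": "Great value, arrived quickly"},
    {"product_id": "B2", "text": "Broken on arrival, poor quality"},
    {"product_id": "B2", "text": "Poor packaging"},
]

# Illustrative keyword list, not a real sentiment model.
NEGATIVE_WORDS = {"broken", "poor", "late"}


def negative_mentions(text):
    """Count words in the text that appear in NEGATIVE_WORDS."""
    words = (word.strip(".,!?").lower() for word in text.split())
    return sum(1 for word in words if word in NEGATIVE_WORDS)


def link_sources(sales_rows, review_rows):
    """Attach a text-derived feature to each sales row via product_id."""
    complaints = defaultdict(int)
    for review in review_rows:
        complaints[review["product_id"]] += negative_mentions(review["text"])
    return [
        {**row, "negative_mentions": complaints.get(row["product_id"], 0)}
        for row in sales_rows
    ]


wide_view = link_sources(sales, reviews)
for row in wide_view:
    print(row)
```

The point of the sketch is the join itself: each sales record gains context that the structured source alone does not contain.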

The small data approach includes the tailored use of less data-hungry models, such as certain time-series analysis techniques, rather than applying more data-hungry deep learning techniques in a one-size-fits-all manner. Other techniques include few-shot learning, synthetic data or self-supervised learning. The need for data can be reduced further through techniques such as collaborative or federated, adaptive, reinforcement and transfer learning.
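To make the idea of a less data-hungry time-series technique concrete, here is a minimal sketch of simple exponential smoothing, which can produce a one-step-ahead forecast from only a handful of recent observations. The article does not name this particular method; it is chosen here as one well-known example, and the `weekly_sales` figures, the function name and the `alpha` value are all hypothetical.

```python
def exponential_smoothing_forecast(history, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing.

    alpha close to 1 weights recent observations heavily;
    alpha close to 0 changes the estimate slowly.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in the interval (0, 1]")

    level = history[0]  # start from the first observation
    for value in history[1:]:
        # Blend the new observation with the running level.
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical small dataset: six weeks of unit sales.
weekly_sales = [120, 132, 128, 141, 150, 147]
forecast = exponential_smoothing_forecast(weekly_sales, alpha=0.5)
print(f"Forecast for next week: {forecast:.1f}")
```

Because the method keeps only a single running level, it needs no large training set and adapts quickly when recent data diverges from older history, which is the situation the article describes after a disruption.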

Potential areas for innovation with small and wide data include, but are not limited to, demand forecasting in retail, real-time behavioural and emotional intelligence in customer service applied to hyper-personalisation, and customer experience improvement.

Other areas include physical security or fraud detection and adaptive autonomous systems, such as robots, which constantly learn by analysing correlations in time and space between events in different sensory channels.

How to get started

Explore small and wide data approaches to lower the barrier to entry for advanced analytics and AI that a real or perceived lack of data creates, rather than relying too heavily on data-hungry deep learning approaches.

Extend the toolbox of your D&A teams with techniques to provide a richer context for more accurate business decision making, leveraging the growing availability of external data sources through data sharing and marketplaces.

Finally, enrich and improve the predictive power of data by incorporating a greater variety of structured and unstructured data sources.

About the author

Jim Hare is a distinguished vice president analyst at Gartner. His areas of specialisation include AI, data science, analytics and business intelligence (BI). Jim will be presenting on small and wide data at the Gartner Data & Analytics Summit 2021 for APAC, taking place virtually 8-9 June.

Source: PC and Associates Consulting