What is big data?
The term “big data” means large quantities of raw information – typically terabytes or more – collated from multiple sources both within and outside of an organisation. The term can also loosely refer to the process of analysing such large sets of data to discover insights.
What sort of information is included?
Big data can include everything that could conceivably be relevant to a company’s operations, such as sales data, website activity, CRM records, network logs and sensor information. External information may include details of social media activity or currency exchange rates. Big data sets often focus on “high-velocity” sources, which grow rapidly as new data is continually added.
Another hallmark of big data is the inclusion of unstructured or semi-structured content, such as social media posts or web pages. This contrasts with traditional database-driven approaches to business intelligence, although big-data sources generally still need to be translated into a consistent format for analysis.
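To illustrate what translating sources "into a consistent format" can look like, here is a minimal Python sketch. The field names (`sold_at`, `customer_id`, `total`, `user`, `posted`, `body`), the target schema and the rule that timestamps without a time zone are treated as UTC are all assumptions made for the example, not part of any particular product.

```python
import json
from datetime import datetime, timezone


def to_utc(ts):
    """Parse an ISO 8601 timestamp and return it in UTC.

    Assumption: timestamps without a time zone are already in UTC.
    """
    dt = datetime.fromisoformat(ts)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).isoformat()


def normalise_sale(row):
    """Map a structured sales row (hypothetical fields) to the common format."""
    return {
        "source": "sales",
        "timestamp": to_utc(row["sold_at"]),
        "customer": row["customer_id"].strip().lower(),
        "text": None,
        "amount": float(row["total"]),
    }


def normalise_post(raw_json):
    """Map a semi-structured social media post (hypothetical JSON) to the common format."""
    post = json.loads(raw_json)
    return {
        "source": "social",
        "timestamp": to_utc(post["posted"]),
        "customer": post["user"].strip().lower(),
        "text": post["body"],
        "amount": None,
    }


# Two records from very different sources end up with the same set of fields.
records = [
    normalise_sale({"sold_at": "2024-03-01T10:15:00+01:00",
                    "customer_id": "C-1001 ", "total": "49.99"}),
    normalise_post('{"user": " Alice42 ", "posted": "2024-03-01T09:30:00", '
                   '"body": "Love the new app!"}'),
]
```

Once every source is mapped to the same fields, the combined records can be sorted, joined and analysed together, whatever shape they arrived in.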
Why is big data useful?
Big data analysis uses powerful computing resources to process data sets that would be too large and diverse for a human to work with. Subtle trends and correlations can be spotted, and actionable insights can be generated – perhaps relating to customer behaviour, or to inefficiencies in the company’s workflow – that would be missed by traditional approaches, or uncovered much more slowly.
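As a small-scale illustration of spotting correlations, the Python sketch below computes the Pearson correlation coefficient for every pair of metrics and reports the pairs that move together strongly. The metric names, the figures and the 0.9 threshold are invented for the example; a real big data pipeline would run this kind of scan across far more columns and rows than a person could check by hand.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def strongest_pairs(columns, threshold):
    """Return (name_a, name_b, r) for every pair of columns with |r| >= threshold."""
    found = []
    for name_a, name_b in combinations(columns, 2):
        r = pearson(columns[name_a], columns[name_b])
        if abs(r) >= threshold:
            found.append((name_a, name_b, r))
    return found


# Hypothetical daily figures, invented for illustration.
metrics = {
    "load_time_s": [1.2, 1.9, 2.5, 3.1, 3.8],
    "abandoned_carts": [10, 14, 19, 24, 30],
    "newsletter_signups": [7, 3, 8, 4, 6],
}

for name_a, name_b, r in strongest_pairs(metrics, threshold=0.9):
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```

A strong correlation is only a prompt for further investigation: it shows that two measures move together, not that one causes the other.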
Does big data use AI?
Big data analysis doesn’t necessarily involve artificial intelligence. However, the task of finding patterns and connections in very large, unorganised data sets is a natural fit for machine learning. Machine-learning models can be used at multiple stages of a big data process, for example to standardise the data or to make predictions from incomplete information.
How is the data processed?
There is no single off-the-shelf tool for big data analysis: the process generally needs to be custom-coded to suit the available data sources and business parameters. Many solutions use the open-source Apache Hadoop framework, which provides distributed storage and processing for large data stores, with companion tools in the wider Hadoop ecosystem handling ingestion.
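Hadoop's classic processing model is MapReduce, in which a job is split into a "map" step that emits key-value pairs and a "reduce" step that combines all the values for each key. The Python sketch below imitates that pattern on a single machine to show the idea; it does not use Hadoop or its API, and the log format (a path followed by a status code) is a hypothetical one chosen for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (status_code, 1) pair for each well-formed log line."""
    parts = line.split()
    if len(parts) >= 2:  # skip blank or malformed lines
        yield parts[1], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key: here, a simple count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical log lines in "path status" form.
sample_lines = ["/home 200", "/cart 500", "/home 200", ""]
status_counts = run_job(sample_lines)
```

On a real cluster the map and reduce steps run in parallel across many machines, each working on its own slice of the data, which is what makes the approach practical for very large data stores.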
What sorts of organisation can make use of big data?
Big data is of particular interest to enterprise-scale businesses: these are the companies most likely to generate the huge quantities of data required for big data analysis. Large companies are also most likely to have the resources to invest in the necessary computing power, and can afford to hire professional developers and analysts to realise big data projects.
However, big data techniques are not limited to large enterprises. Hosted services such as Google BigQuery, IBM Cloud Pak for Data and Azure Databricks let businesses of any size assemble their own data analysis processes, using a variety of languages and frameworks, on a pay-as-you-go basis.
Summary
- Big data refers to very large collections of data, often including unstructured content, and the analysis that can be performed on them.
- Applying machine learning to big data stores can unearth insights that a human data worker would miss, or would take far longer to find.
- Big data processing normally entails some degree of custom coding, using a suitable framework such as Hadoop.
- Small businesses can take advantage of numerous cloud-based big data services.