In this article, we’ll explain what big data is, how it came to be, how technological advances have made it possible to utilize it, and how it’s changing the way we collect data. We’ll also discuss how today’s big data is just the beginning, and what the future holds for even bigger data.
For some time now, the once-unfamiliar term “big data” has been appearing across the media. It became popular only a few years ago, yet we have already gotten used to it, and phrases like “marketing with big data” have become commonplace. But what is it about big data and data mining that has garnered so much attention? To answer this question, we need to understand the concept of big data and its background.
Big data literally means huge data sets. Any data that can be stored on a storage medium, from simple numbers to complex CCTV footage, can be collected and aggregated to form big data, regardless of its format. In terms of format, then, there is little difference between traditional data and big data. However, if big data were simply a matter of data being large, we should have seen this trend already in the late 1990s or early 2000s, when computer technology was advancing rapidly. So why did big data only become a buzzword in the 2010s? The answer is closely related to three important technological advances.
The first and foremost is a paradigm shift in CPU evolution. The central processing unit (CPU) is the brain of a computer that performs computational tasks, and for decades it advanced so rapidly that Moore’s Law, the observation that the number of transistors on a chip doubles roughly every two years (often paraphrased as CPU performance doubling every 18 months), became widely accepted. However, around 2004, CPU advancements hit what has been called the “4 GHz wall.” Up until then, manufacturers had made a single core faster mainly by raising its clock speed while packing in more transistors (the chip’s basic switching elements). But as transistors became more densely packed and clock speeds rose, chips ran increasingly hot, and a new approach was needed. Instead of pushing a single core ever faster, CPU manufacturers developed multi-core CPUs by putting multiple cores into a single CPU, which allowed them to process data in parallel. Provided the work can be divided among the cores, this has made it possible to process massive amounts of data faster and more efficiently, which was previously difficult due to computational speed limitations.
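To make the idea of parallel processing on multiple cores concrete, here is a minimal Python sketch. It assumes a task that can be split into independent pieces (summing the squares of a list of numbers); the function names `chunk`, `sum_of_squares`, and `parallel_sum_of_squares` are invented for this illustration. It shows one common pattern, split the data, let each worker process handle one slice, then combine the partial results, and is not a description of how any particular big data system works.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
# ThreadPoolExecutor is imported only so the splitting and combining logic
# can be exercised without starting extra processes.


def chunk(data, parts):
    """Split data into at most `parts` contiguous slices of near-equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(values):
    """The per-worker job: a purely CPU-bound computation on one slice."""
    return sum(v * v for v in values)


def parallel_sum_of_squares(data, workers=4, executor_cls=ProcessPoolExecutor):
    """Hand each slice to a separate worker, then add up the partial results."""
    slices = chunk(data, workers)
    if not slices:
        return 0
    with executor_cls(max_workers=workers) as pool:
        # map() sends one slice to each worker; with the default
        # ProcessPoolExecutor the workers are separate processes that the
        # operating system can schedule on different cores.
        return sum(pool.map(sum_of_squares, slices))
```

When using the default process-based executor in a script, call `parallel_sum_of_squares(...)` from inside an `if __name__ == "__main__":` block, which Python's process-based executors require on some platforms. Whether the parallel version is actually faster depends on the size of the data and the number of cores available; for small inputs the cost of starting workers can outweigh the gain.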
Advances in storage media also played an important role in ushering in the era of big data. Storage media such as hard disks (HDDs) have increased dramatically in both capacity and speed. Hard disks with capacities of 8TB or more are now commonplace, and the advent of high-speed storage media such as solid state drives (SSDs) has greatly improved the speed at which large amounts of data can be stored and processed. These technological advances have made it easier to handle large amounts of data that were previously difficult to utilize due to storage space limitations.
While these advances in CPUs and storage media have made big data possible, changes in the way data is collected have further expanded its scope. The rapid proliferation of smart devices and social media in the 2010s changed the paradigm of data collection. Smart devices that are directly or indirectly connected to the network collect data from users through various sensors and interfaces such as cameras, global positioning systems (GPS), and near field communication (NFC), and upload this data to the network in real time. In addition, users of social networks such as Facebook and Twitter voluntarily share their personal information online, which also contributes to this vast amount of data. In the past, it was common for entities with a specific purpose to collect targeted data, but now data is gathered indiscriminately from the constant flow generated by smart devices and social networks. As advances in network technology connect more and more things to the Internet, the Internet of Things (IoT) is expanding the scope of data collection even further.
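The contrast between targeted and indiscriminate collection can be sketched in a few lines of Python. Everything in this example is hypothetical: the device names, the record layout, and the helper functions `count_by_sensor` and `select` are invented for illustration and do not correspond to any real device or service. The point is only that events of very different kinds arrive in one stream, and the question being asked is decided afterwards.

```python
from collections import Counter

# Hypothetical events as they might arrive from different devices.
# The field names and values are made up for this illustration.
events = [
    {"device": "phone-1", "sensor": "gps", "value": (37.57, 126.98)},
    {"device": "phone-1", "sensor": "camera", "value": "photo_001.jpg"},
    {"device": "watch-7", "sensor": "nfc", "value": "tag-scan"},
    {"device": "phone-2", "sensor": "gps", "value": (35.18, 129.08)},
]


def count_by_sensor(stream):
    """Count how many events each kind of sensor contributed."""
    return Counter(event["sensor"] for event in stream)


def select(stream, sensor):
    """Pick out, after the fact, only the events relevant to one question."""
    return [event for event in stream if event["sensor"] == sensor]
```

Here `select(events, "gps")` pulls location records out of a stream that was never collected with a location study in mind, which is the reverse of the older approach of deciding the purpose first and collecting only the matching data.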
The development of multi-core CPUs, advances in storage media, and the expanding scope of data collection have combined to create the concept of big data. Many companies and government agencies are now analyzing their big data and trying to find meaningful information in it, and the media constantly emphasizes its importance. But the most important reason to think about big data is that we’re just getting started. In the future, multi-core CPUs will evolve to perform even more computations simultaneously, and storage media will offer greater capacity and faster speeds. And as more and more things become networked, the amount of data collected will grow exponentially. Big data as we recognize it today may pale in comparison to the truly big data era of the future.