What is data structures, and why is an understanding of basic data structures like arrays and linked lists essential in modern computer science?

W

This article explains the need for data structures with the evolution of computer memory and covers the concepts, advantages, and disadvantages of basic data structures such as arrays and linked lists. It emphasizes that data structures play an important role in data management and processing.

 

A lot of time has passed since the personal computer first appeared on the scene. Back then, it was enough to have a single computer in the house, but today, not only computers, but also smartphones, tablets, and even home appliances play the role of computers. This is not just because the number of devices has increased, but also because the performance and functionality of each device has improved by leaps and bounds. As time has passed, the capacity of memory media has become vastly larger than it was in the early days, and processing speeds have become unimaginably faster. Moore’s Law, a law that describes the scale of these advances in performance, states that for every year and a half that passes, hardware becomes twice as dense, resulting in twice as much performance. This law has played an important role in predicting the development of the semiconductor industry, which has led to a remarkable growth in the rate of hardware performance improvements. It has allowed us to go from having less than 10 megabytes of memory in our personal computers to having terabytes of memory in our computers. The computers we use today can easily store and process hundreds of gigabytes of data, a development that scientists of the past could never have imagined. As these memory devices get larger and larger, we also need to consider the efficiency of how we use them, and this efficient use of memory is called data organization.
To understand data organization and its necessity, we first need to understand the structure of a memory device. A memory device is made up of a collection of small memory units that can work independently of each other. You can think of it as a collection of small lockers with serial numbers. One locker can store information as small as a few letters of the alphabet and one or two numbers, and several such lockers together make up the entire memory space. Once each of these lockers is uniquely numbered, you can refer to them by their number. As long as the user remembers how many times their information is in a locker, they can pull out or modify the information they want without having to go through all the other lockers. This structural understanding is the first step in realizing why organization is important.
The need for organization starts when you break down your memory into smaller units. Lockers aren’t infinite in size, so if you have a lot of information you want to store, a single locker won’t be enough. For example, you wouldn’t be able to store a list of all your science and technical writing instructors in a single locker with only a few letters of the alphabet. Instead, you’ll need to slice and dice the information across multiple lockers. When using multiple lockers, it’s much more efficient to be able to remember just one or two serial numbers and know the entire number of lockers you’ve used, rather than having to haphazardly pick an empty locker, put your information in it, and remember the serial numbers of all those lockers. To efficiently manage and process this complex information, it is essential to choose the right data structure. A good organizational structure is necessary for efficient use of multiple lockers. There are many different organizational structures that have been studied, and two of the most basic ones are arrays and linked lists.
An array is a structure that uses only consecutive numbers of lockers when using multiple lockers. In this structure, you only need to remember the number of the first locker and how many lockers you’re using to know all the lockers you’ve used. If you know that a user has used 10 lockers, and you know that the first locker was locker 2016, you can immediately see that the user has used 10 lockers: 2016, 2017, 2018, …, 2024, 2025. This structure requires very little additional information to be memorized since we only need to remember the first location and length, is space-efficient since there is no wasted space, and allows us to immediately know which locker the user is using. For example, in the previous example, we can immediately calculate that the 5th locker the user has used is locker 2016 + 4 = 2020. This advantage of the array structure is especially useful in situations where the data doesn’t take up a lot of space. However, if you want to clear information in the middle of a busy locker, you have to advance the information in all the lockers behind it by one space, which severely reduces the efficiency of clearing the information, and even if you have enough free lockers, you can’t use all the extra space unless they’re close together.
A linked list is another structure that can be used when you want to use multiple lockers. Unlike an array, a linked list doesn’t use consecutive lockers, but instead each locker stores the number of the locker that comes next. This way, you only need to know the number of the first locker to get a list of all the lockers you’ve used in order. If a user used lockers 100, 201, 43, and 1, in that order, then locker 100 contains the first information and 201, locker 201 contains the second information and 43, and locker 43 contains the third information and 1. This characteristic of a linked list is especially advantageous when you need to process data dynamically. The advantage of this structure is that as long as you have enough free lockers, you can hold all the information you want to hold, and unlike an array, you can efficiently clear out the information in the middle if you want to do so. If a user wants to remove information from the 43rd locker in the above example, they can simply open locker 201 and replace the 43 with 1. This flexibility makes linked lists very useful in situations where you’re dealing with variable data. The disadvantages of this structure are that it wastes space by having to store an extra number for each locker used, and it’s inefficient because, unlike an array, when you want to figure out where the next locker is, you can’t immediately find it like an array, but have to traverse the lockers from the beginning.
Data structures are the basis of many disciplines in computer science and are closely related to many real-world problems. In fact, many of the latest theories in machine learning and big data are theories that deal with massive amounts of data, and they can only be realized if there is a good data structure that can handle large amounts of data. Data structures are more than just theoretical concepts, they are also the foundation of the technologies we use every day. Arrays and linked lists are the most basic of these, and they are the basis for almost all other data structures. Understanding these two fundamental data structures is essential for dealing with more complex data structures. With a solid understanding of these theories, you’ll have the flexibility to choose the right structure to manage your data in any given situation. Furthermore, by learning more about the characteristics of different data structures and their applications in depth, we will be better equipped to solve more complex and diverse data processing problems.

 

About the author

Blogger

Hello! Welcome to Polyglottist. This blog is for anyone who loves Korean culture, whether it's K-pop, Korean movies, dramas, travel, or anything else. Let's explore and enjoy Korean culture together!