What are compilers, and how did they bridge the gap between machine language and high-level languages?

A key tool in the evolution of computers, compilers translate human-understandable languages into machine language, so that the same program can run on a variety of machines. Without compilers and their optimization capabilities, computers would likely not have become as widespread as they are.

In 1942, the Atanasoff-Berry Computer, often described as the first electronic digital computer, was completed. Computers have evolved continuously since then, to the point where it is hard to find anyone who does not use one. In the future, we will live in an invisible electromagnetic jungle called the Internet of Things, where everyday objects are connected over a network and exchange information with each other. When computers first appeared, people wondered, “Why build such an expensive, clunky hunk of metal to do calculations that humans can easily do?” Today, computers are so important that the world would no longer function without them. One of the most important tools behind that development is the compiler. Without compilers, computers might still be something you would only see in large research facilities, alongside fusion reactors and particle accelerators, far from the general public.
So what exactly are compilers, and why did they come into existence? In general, a compiler is a tool that translates a language humans can read and write into machine language. Machine language is, literally, the language a machine understands directly. It is represented in binary, as 0s and 1s, which correspond to an electrical device being on or off, or to current being present or absent. Everything the machine handles is encoded this way: in ASCII, for example, 01100001 (97 in decimal) represents the letter ‘a’. As you can see, binary is very inconvenient to read and write, so early in the development of computers a more intuitive notation called assembly language was used in place of raw machine language. However, a problem arose when programs were moved to other computers. Machine language is specific to one kind of machine, so it cannot be used on another. Assembly language is just a human-readable rendering of machine language, with an essentially one-to-one correspondence between the two, so it is equally tied to the machine. If you changed machines, you had to rewrite tens of thousands of lines of code from scratch. High-level languages emerged to solve this problem. They are machine-independent: they contain no machine-specific statements, so a program means the same thing on any machine. The catch is that machines cannot understand them directly. This is where compilers come in. A compiler takes care of the translation into each machine’s language, so programmers can write in one common high-level language no matter how many types of machines exist. Whereas 100 tasks on 100 kinds of computers used to require 100 × 100 = 10,000 programs, now 100 programs and 100 compilers (one per machine) are enough.
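To make the idea concrete, here is a minimal sketch in Python, not a real compiler. It assumes a toy "high-level language" consisting only of integers, + and *, and two made-up machines: a stack machine and a register machine. All of the instruction names (PUSH, LOADI, ADD, MUL) and function names are invented for this illustration. The same source program is compiled once per machine, and both machines arrive at the same answer.

```python
import ast


def parse(source):
    """Parse the toy 'high-level language': integers combined with + and *."""
    return ast.parse(source, mode="eval").body


def _opname(node):
    """Return 'ADD' or 'MUL' for a supported binary operation."""
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return "ADD"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mult):
        return "MUL"
    raise ValueError("unsupported construct in the toy language")


def compile_for_stack_machine(node, out):
    """Back end 1 (hypothetical): operands are pushed, operators pop two values."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        out.append(("PUSH", node.value))
    else:
        name = _opname(node)
        compile_for_stack_machine(node.left, out)
        compile_for_stack_machine(node.right, out)
        out.append((name,))
    return out


def compile_for_register_machine(node, out):
    """Back end 2 (hypothetical): instruction number i writes register i."""
    if isinstance(node, ast.Constant) and isinstance(node.value, int):
        out.append(("LOADI", node.value))
    else:
        name = _opname(node)
        compile_for_register_machine(node.left, out)
        left_reg = len(out) - 1   # the last instruction holds the left result
        compile_for_register_machine(node.right, out)
        right_reg = len(out) - 1  # the last instruction holds the right result
        out.append((name, left_reg, right_reg))
    return out


def run_stack_machine(program):
    """Simulate the stack machine and return the value left on the stack."""
    stack = []
    for instr in program:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instr[0] == "ADD" else left * right)
    return stack.pop()


def run_register_machine(program):
    """Simulate the register machine and return the last register written."""
    regs = []
    for instr in program:
        if instr[0] == "LOADI":
            regs.append(instr[1])
        else:
            name, a, b = instr
            regs.append(regs[a] + regs[b] if name == "ADD" else regs[a] * regs[b])
    return regs[-1]


# One machine-independent program, two machine-specific translations.
tree = parse("2 + 3 * 4")
stack_code = compile_for_stack_machine(tree, [])
register_code = compile_for_register_machine(tree, [])
stack_result = run_stack_machine(stack_code)
register_result = run_register_machine(register_code)
print(stack_code)
print(register_code)
print(stack_result, register_result)  # 14 14
```

The two instruction lists look nothing alike, which is the situation assembly programmers faced when they changed machines. The source text "2 + 3 * 4" never changes; only the back end does.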
The first compiler was developed by Dr. Grace Hopper to translate the A-0 language into machine language, and this was the first time the term “compiler” was used. Over time, however, “compiler” stopped being synonymous with “a tool that translates into machine language.” In a broad sense, a compiler simply turns one language into another, provided the semantics are preserved. For example, a tool that turns machine language back into a high-level language is called a decompiler. A compiler can also convert one high-level language into another high-level language, or one machine language into another machine language. In Java, the compiler translates the high-level language into bytecode, which is neither the machine language of a physical machine nor a high-level language. It is even possible to turn a high-level language into an image file, as long as the result can be interpreted with the same meaning. By the same logic, a tool that translates language A into language A is also a compiler, as long as it preserves the semantics exactly. This may sound a bit strange. After all, you have never heard of a translator who translates Korean into Korean.
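Java bytecode cannot be shown here without a Java toolchain, but Python happens to work in a similar spirit: its built-in compile() turns source text into a bytecode object that a virtual machine then executes. The sketch below is only an analogy, not Java's format. The variable names are made up for the example, and the exact instruction names printed differ between Python versions.

```python
import dis

# A one-line "high-level" program (names chosen for this example).
source = "total = price * quantity"

# compile() translates the source into a code object holding bytecode.
code_object = compile(source, "<example>", "exec")

# List the bytecode instruction names; they vary by Python version.
instruction_names = [ins.opname for ins in dis.get_instructions(code_object)]
print(instruction_names)

# The bytecode keeps the meaning of the source: running it computes total.
namespace = {"price": 3, "quantity": 4}
exec(code_object, namespace)
print(namespace["total"])  # 12
```

The printed instruction list is readable neither as ordinary source code nor as the machine language of any physical processor, yet it carries exactly the same meaning as the original line.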
To understand this, you need to know about one of the main functions of a compiler: optimization. Suppose you have to deliver packages to every floor of a building, visiting each floor from the first to the tenth exactly once. You could start on the first floor and work your way up to the tenth, or start on the tenth and work your way down. You could also zigzag, going to the tenth floor, then down to the first, then back up to the ninth, and so on; every package would still be delivered, but with far more travel. Similarly, there are countless ways to write code that completes a given task, and not all of them perform equally well, which is why it matters that compilers turn inefficiently written source code into efficient code. Optimization has mattered from early on: the Fortran compiler developed by John Backus’s team in 1957, generally considered the first complete compiler, was already an optimizing compiler. If a compiler between different languages is a “translator,” a compiler within the same language is more like an “editor” who revises and polishes a draft. This is why a tool that translates a language into that same language makes sense. In most cases, however, we never see the transformed program in the source language; it is compiled straight on into machine language. A form like this, sitting between the input and the output of the translation process, is called an “intermediate representation.” For example, if a high-level language is translated into assembly language and then into machine language, the assembly code is the intermediate representation.
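As a minimal sketch of such a same-language "editing" pass, the Python snippet below performs constant folding, one common optimization: arithmetic on literal numbers is computed ahead of time and replaced by its result. It uses Python's standard ast module, with the syntax tree acting as the intermediate form between input and output. The names ConstantFolder and optimize are made up for this example, only +, - and * are handled, and ast.unparse requires Python 3.9 or later.

```python
import ast
import operator

# Operators this sketch knows how to fold (division is left out on purpose
# so that folding can never raise ZeroDivisionError).
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on numeric literals, such as 2 * 3, with its result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the sub-expressions first
        fold = _FOLDABLE.get(type(node.op))
        if (fold is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            result = ast.Constant(value=fold(node.left.value, node.right.value))
            return ast.copy_location(result, node)
        return node


def optimize(source):
    """Translate Python source into equivalent, pre-computed Python source."""
    tree = ast.parse(source)              # source text -> syntax tree
    tree = ConstantFolder().visit(tree)   # rewrite the tree
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)              # syntax tree -> source text


original = "seconds_per_day = 60 * 60 * 24"
optimized = optimize(original)
print(optimized)  # seconds_per_day = 86400
```

Both versions assign the same value, so the semantics are preserved; the optimized one simply no longer performs two multiplications every time it runs. Real compilers apply many passes like this to their intermediate representation before emitting machine language.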
Another reason for translating a language into itself is readability. No matter how good the optimizer is, the performance of a program still depends on the skill of the programmer. However, code that performs well is not necessarily easy to read. Code is not written once and then finished; it has to be maintained as requirements change. Nobody can remember hundreds of thousands of lines of code, so it has to be read and reinterpreted every time, and code with complex logic is not only hard to interpret, but a misreading can introduce fatal errors into the program. So it is also important to make code easier for humans, not just machines, to understand.
