The P vs. NP problem is one of the most challenging problems in math and computer science, comparing the time to solve a problem with the time to check it. If solved, it could revolutionize many fields, including cryptography, but it remains unsolved.
On May 24, 2000, the Clay Mathematics Institute in the United States announced a prize of $1 million per problem for solving seven of the world’s most challenging unsolved problems in mathematics. These were called the “Millennium Problems”, and most people with a passing interest in math have heard of them. The high stakes meant that only the hardest problems were chosen, ones that had bothered many mathematicians for decades. Even understanding the problems can be difficult for non-mathematicians without a degree in the field. In this article, we’ll explain the P vs. NP problem, which is the most relevant to computer science, and by far the easiest to understand.
The P vs. NP problem looks like this
“Are P-problems and NP-problems the same problem?”
It’s a simple sentence, but it’s hard to understand this question if you don’t know what a P problem and an NP problem are. P problems stand for Polynomial-time problems, which are problems that take polynomial time to solve. For example, consider the problem of finding the nth term of the Fibonacci sequence. The Fibonacci sequence starts with 1, 1, and the next term is the sum of the previous two terms. The first few terms are as follows
1, 1, 2, 3, 5, 8, 13, 21, 34, …
To calculate the third term of the Fibonacci sequence, you only need to calculate 1+1=2. To calculate the fourth term, you need to do two additions: 1+1=2 and 1+2=3. Similarly, for the fifth term, you can do three additions: 1+1=2, 1+2=3, 2+3=5. In general, it only takes n-2 additions to find the nth term of the Fibonacci sequence, which means that this problem takes n-2 to solve and can be represented as a first degree polynomial in n. In technical terms, we say that this problem has a “time complexity” of O(n). First-order polynomials are written as O(n), second-order polynomials as O(n^2), and so on. If the time to solve a problem like this is polynomial in the number of seconds, it is called a P-problem.
Examples of P-problems are associated with many real-world computer-solvable problems. Operations like simple addition and multiplication are P problems, and many of the software we use every day solves them efficiently. But when we move on to NP problems, things get a bit more complicated.
NP stands for non-deterministic polynomial-time problem. It translates to “a problem that is undetermined to take polynomial time”, but that doesn’t exactly describe NP problems. The correct definition of NP is “a problem whose solution is polynomial in the time it takes to sum given the answer to the problem. A typical problem is the subset sum problem. The subset sum problem asks, given a set of real numbers, is there a subset whose sum is zero? For example, given the following sets
S = {-7, 3, 2, -5, 8}
The number of (non-conjugate) subsets of S is 31, and we need to find the subset whose elements sum to zero. How long does it take to see if one of them, {-7, 3, 2, 8}, answers the question? Simply add up all the elements of this set. In this case, we can calculate -7+3+2+8, which requires only three additions. The value is 6, so this subset can’t be the answer. In the same way, checking whether {3, 2, -5} is the answer requires only two additions. In general, given a set with n elements, the number of elements in any subset is at most n, and the time to check if the subset is the answer is n-1. In other words, the subset sum problem has time complexity O(n). Problems that take polynomial time to solve are called NP problems.
NP problems are often used to describe problems that are difficult to solve on a computer in the real world. Some of these problems play a very important role in modern society. For example, optimization problems or cryptography problems can be NP problems, and not solving them efficiently can lead to a huge waste of resources.
Now, back to the original question: what is a P vs. NP problem? Before we get back to that, it’s important to note that NP problems and P problems are not mutually exclusive. Because of the name NP, it’s easy to mistakenly assume that NPs are not Ps. However, if you recall the definitions of P and NP, every P problem is an NP problem. A problem that takes polynomial time to solve will naturally take polynomial time to check. We can write this as P⊂NP. Then, checking whether P=NP is equivalent to checking whether NP⊂P. So the P vs. NP problem can be written like this
“If a problem takes polynomial time to compute, does it also take polynomial time to solve?”
If you’ve read this far, you might be wondering: why are we so obsessed with “polynomial”? The reason is that polynomial is the time limit for computers to compute “relatively quickly”. Computers can compute much faster than humans, but they have their limits. Computers on the market today can typically perform between 1 billion and 3 billion operations per second. What happens if the time to solve is 2^n? For the subset sum problem we saw above, if we solve it by checking the answer for every subset, we need to do at least this many operations since the number of subsets is 2^n. For n=20, we only need to compute about 1 million times, so we can solve it in less than a second. However, if you try to solve this problem with n=100, it would take a computer until the end of the world. This is true even if the computer solves the problem 1000 times faster than it does now. This is because the growth rate of exponential functions is incomparably faster than that of polynomials. On the other hand, consider the P problem. A problem that takes n^3 to solve can be solved in 1 second if n=1000 because it requires 1 billion operations. Even if n is 10 times larger, you’ll only need to wait 10 minutes for the answer. Polynomial functions grow relatively slowly, so you can leave it to a computer to solve it in a relatively short time.
This is why so many computer scientists have tried to find ways to solve NP problems in polynomial time. However, the P vs. NP problem has remained unsolved and has plagued them for decades. One of the concepts designed to solve this problem is called NP-hard. Like P or NP, NP-hard is some kind of problem that, if you can solve it in polynomial time, you can solve all NP problems in polynomial time, surprisingly! NP-hard problems that are also NP-complete are called NP-complete, and the subset sum problem we saw above is one of them. If we could find a way to solve this problem in polynomial time, we could prove that P=NP. However, no one has found such a method yet. That’s why many people think that the answer to the P vs. NP problem is no.
What if the answer to the P vs. NP problem is P=NP? There is one field that takes advantage of computers’ inability to compute quickly: cryptography. While computers can quickly compute the product of any two primes, there is currently no known way to compute the product of two primes in polynomial time given the product of two primes. The world’s most widely used RSA cipher capitalizes on this fact by encrypting a message using the product of two prime numbers with more than 100 digits and making it impossible to decrypt the original message without knowing both prime numbers. But prime factorization is an NP problem, so if P=NP, a computer can crack RSA in a fraction of a second. The entire cryptosystem would be completely broken.
Of course, that’s not going to happen easily, but think about the ripple effect and you’ll realize how important it is. If you’re looking to win the jackpot of life, it’s worth a shot.