You hear a lot of computer scientists (and in particular, computer theorists) talk about the famous "P vs NP" problem. It's become so well known that it's entered popular culture in a number of different places, including an entire indie film that used it as its main plot device. However, there is

**much misconceptions about this problems out there, even among people with computer science degrees.**

*so*Most computer scientists and software engineers have a pretty good idea about what P means. It's the problems with a known polynomial time solution. In most cases (although not all) these are problems that run relatively efficiently.

The category NP throws people, though. A disarming number of people seem to use this term to mean "Not-Polynomial".

**But it couldn't be farther from the case!**NP actually refers to all the problems with a known "non-deterministic polynomial" solution. It means these problems

*can*be solved in polynomial time (that is, relatively efficiently)

*if*you have access to something called a non-deterministic Turing machine.

By definition,

*all problems in P are also in NP*. Sorting numbers? That's in NP. Finding the shortest path between two points? NP again. Calculating the area in a rectangle? NP, of course. All of the easiest problems are in NP because they are in P, and all problems in P are also in NP.

So at this point, if you've followed the P vs NP problem at all, you might be a little confused. "Isn't the point was that we didn't know if these things were the same? I thought NP meant really hard!!!" Ah yes, see, we know that all problems in P are in NP. What we don't know is whether or not all problems in NP are also in P! Because some things that are easy to do with a non-deterministic computer seem to be very, very hard to do on a real computer.

One famous example is the famous "traveling salesman" problem. What is the absolute most efficient way for a person to visit every city in his state, and return home? While it's possible to quickly come up with an answer that's close to the most efficient, there is no fast method for finding one that's certain to be the absolute fastest. The only way we really know how to do that is to basically try an exponential number of possible routes. Which takes a very long time if the number of cities is very large at all.

This is an example of what is known as an "NP-hard" problem.

*These*are the problems that are so hard for computers to solve. NP-hard problems are

*not necessarily in NP*. NP-hard simply means it's

*at least as hard*as the hardest problems in NP. It means that given a program that can solve an NP-hard problem, you can translate any other problem in NP in polynomial time to be solved by your NP-hard solving program. In other words, if you could solve the traveling salesman problem efficiently, you could solve all the other problems in NP relatively efficiently as well.

Traveling salesman is both NP-hard and in NP. I won't go into too many details about why this is, because over simplifications of NP often lead to more misunderstandings. (I'll save that for another post, if anyone is interested). Problems that are both NP-hard and in NP are known as "NP-complete". I believe when many people use the word "NP", what they really mean is "NP-complete". These are problems that (as far as we know) are very hard to solve on real computers, but are quite fast to solve on the theoretical device known as a non-deterministic Turing machine.

One common misconception out there is that a quantum computer is a non-deterministic Turing machine. This is not true. Quantum computers (if they ever really get built) are simply nowhere near that powerful. However, it is possible to use them to factor prime numbers quickly - something that could break the most popular form of encryption these days. Another common misconception is that prime factoring is NP-hard. It isn't. (If it was, quantum computers would be non-deterministic Turing machines). While we are pretty sure prime factoring is not as hard as NP-hard problems, we also don't known a polynomial time algorithm for it yet for traditional computers. There is one for quantum computers, though. I think this is part of what makes some people mistakenly think quantum computers are non-deterministic Turing machines. Sorry, no such luck.

So what's the big deal about the whole P vs NP problem? Well, after years and years of trying, computer scientists have been unable to do two different tasks. The first is to try and come up with a polynomial time solution on a standard computer for an NP-complete problem. If they could, it would mean P and NP are actually the same thing. All problems that can be solved on a non-deterministic computer in polynomial time could also be solved on a regular computer in polynomial time. This would be huge, as it would mean countless problems that are deemed too hard to solve today could potentially be solved relatively efficiently.

Most computer scientists think this is impossible, though, for reasons I can go into in another post. However, despite all their trying, no one has been able to write up a bullet-proof math proof showing

*why*this is impossible. Most of us are pretty sure it is, but there's a difference in mathematics between having good reasons to think something is impossible and actually

*proving*it. In other sciences, they'd probably just say "evidence shows this isn't true, so unless something changes, we are going to say P != NP". However in mathematics and computer theory, this doesn't cut it. These people expect perfect proofs, not just good evidence. (This is one of the major differences between mathematics and empirical science.) That said, from a working computer scientists perspective, it's generally just accepted that P != NP, although a very tiny minority still hold out hope.

And that, I believe, should help counter many of the major misconceptions about the famous P vs NP problem. P and NP are not opposites. The question is, are they exactly the same thing, or is NP everything in P, plus a lot more?