The method BigInteger.isProbablePrime() is quite strange; from the documentation, this will tell whether a number is prime with a probability of
1 - 1 / 2^arg, where
arg is the integer argument.
It has been present in the JDK for quite a long time, so it means it must have uses. My limited knowledge in computer science and algorithms (and maths) tells me that it does not really make sense to know whether a number is “probably” a prime but not exactly a prime.
So, what is a possible scenario where one would want to use this method? Cryptography?
Yes, this method can be used in cryptography. RSA encryption involves the finding of huge prime numbers, sometimes on the order of 1024 bits (about 300 digits). The security of RSA depends on the fact that factoring a number that consists of 2 of these prime numbers multiplied together is extremely difficult and time consuming. But for it to work, they must be prime.
It turns out that proving these numbers prime is difficult too. But the Miller-Rabin primality test, one of the primality tests uses by
isProbablePrime, either detects that a number is composite or gives no conclusion. Running this test
n times allows you to conclude that there is a 1 in 2n odds that this number is really composite. Running it
100 times yields the acceptable risk of 1 in 2100 that this number is composite.
If the test tells you an integer is not prime, you can certainly believe that 100%.
It is only the other side of the question, if the test tells you an integer is “a probable prime”, that you may entertain doubt. Repeating the test with varying “bases” allows the probability of falsely succeeding at “imitating” a prime (being a strong pseudo-prime with respect to multiple bases) to be made as small as desired.
The usefulness of the test lies in its speed and simplicity. One would not necessarily be satisfied with the status of “probable prime” as a final answer, but one would definitely avoid wasting time on almost all composite numbers by using this routine before bringing in the big guns of primality testing.
The comparison to the difficulty of factoring integers is something of a red herring. It is known that the primality of an integer can be determined in polynomial time, and indeed there is a proof that an extension of Miller-Rabin test to sufficiently many bases is definitive (in detecting primes, as opposed to probable primes), but this assumes the Generalized Riemann Hypothesis, so it is not quite so certain as the (more expensive) AKS primality test.
The standard use case for
BigInteger.isProbablePrime(int) is in cryptography. Specifically, certain cryptographic algorithms, such as RSA, require randomly chosen large primes. Importantly, however, these algorithms don’t really require these numbers to be guaranteed to be prime — they just need to be prime with a very high probability.
How high is very high? Well, in a crypto application, one would typically call
.isProbablePrime() with an argument somewhere between 128 and 256. Thus, the probability of a non-prime number passing such a test is less than one in 2128 or 2256.
Let’s put that in perspective: if you had 10 billion computers, each generating 10 billion probable prime numbers per second (which would mean less than one clock cycle per number on any modern CPU), and the primality of those numbers was tested with
.isProbablePrime(128), you would, on average, expect one non-prime number to slip in once in every 100 billion years.
That is, that would be the case, if those 10 billion computers could somehow all run for hundreds of billions of years without experiencing any hardware failures. In practice, though, it’s a lot more likely for a random cosmic ray to strike your computer at just the right time and place to flip the return value of
.isProbablePrime(128) from false to true, without causing any other detectable effects, than it is for a non-prime number to actually pass the probabilistic primality test at that certainty level.
Of course, the same risk of random cosmic rays and other hardware faults also applies to deterministic primality tests like AKS. Thus, in practice, even these tests have a (very small) baseline false positive rate due to random hardware failures (not to mention all other possible sources of errors, such as implementation bugs).
Since it’s easy to push the intrinsic false positive rate of the Miller–Rabin primality test used by
.isProbablePrime() far below this baseline rate, simply by repeating the test sufficiently many times, and since, even repeated so many times, the Miller–Rabin test is still much faster in practice than the best known deterministic primality tests like AKS, it remains the standard primality test for cryptographic applications.
(Besides, even if you happened to accidentally select a strong pseudoprime as one of the factors of your RSA modulus, it would not generally lead to a catastrophic failure. Typically, such pseudoprimes would be products of two (or rarely more) primes of approximately half the length, which means that you’d end up with a multi-prime RSA key. As long as none of the factors were too small (and if they were, the primality test should’ve caught them), the RSA algorithm will still work just fine, and the key, although somewhat weaker against certain types of attacks than normal RSA keys of the same length, should still be reasonably secure if you didn’t needlessly skimp on the key length.)
A possible use case is in testing primality of a given number (at test which in itself has many uses). The
isProbablePrime algorithm will run much faster than an exact algorithm, so if the number fails
isProbablePrime, then one need not go to the expense of running the more expensive algorithm.
Finding probable primes is an important problem in cryptography. It turns out that a reasonable strategy for finding a probable k-bit prime is to repeatedly select a random k-bit number, and test it for probable primality using a method like
For further discussion, see section 4.4.1 of the Handbook of Applied Cryptography.
Also see On generation of probable primes by incremental search by Brandt and Damgård.
Algorithms such as RSA key generation rely on being able to determine whether a number is prime or not.
However, at the time that the
isProbablePrime method was added to the JDK (February 1997), there was no proven way to deterministically decide whether a number was prime in a reasonable amount of time. The best known approach at that time was the Miller-Rabin algorithm – a probabilistic algorithm that would sometimes give false positives (i.e, would report non-primes as primes), but could be tuned to reduce the likelihood of false positives, at the expense of modest increases in runtime.
Since then, algorithms have been discovered that can deterministically decide whether a number is prime reasonably quickly, such as the AKS algorithm that was discovered in August 2002. However, it should be noted that these algorithms are still not as fast as Miller-Rabin.
Perhaps a better question is why no
isPrime method has been added to the JDK since 2002.