I am looking for an **implementation** or **clear algorithm** for getting the prime factorization of *N* in either python, pseudocode or anything else well-readable. There are a few demands/facts:

*N*is between 1 and ~20 digits- No pre-calculated lookup table, memoization is fine though.
- Need not to be mathematically proven (e.g. could rely on the Goldbach conjecture if needed)
- Need not to be precise, is allowed to be probabilistic/deterministic if needed

I need a fast prime factorization algorithm, not only for itself, but for usage in many other algorithms like calculating the Euler *phi(n)*.

I have tried other algorithms from Wikipedia and such but either I couldn’t understand them (ECM) or I couldn’t create a working implementation from the algorithm (Pollard-Brent).

I am really interested in the Pollard-Brent algorithm, so any more information/implementations on it would be really nice.

Thanks!

**EDIT**

After messing around a little I have created a pretty fast prime/factorization module. It combines an optimized trial division algorithm, the Pollard-Brent algorithm, a miller-rabin primality test and the fastest primesieve I found on the internet. gcd is a regular Euclid’s GCD implementation (binary Euclid’s GCD is **much** slower then the regular one).

# Bounty

Oh joy, a bounty can be acquired! But how can I win it?

- Find an optimalization or bug in my module.
- Provide alternative/better algorithms/implementations.

The answer which is the most complete/constructive gets the bounty.

And finally the module itself:

```
import random
def primesbelow(N):
# http://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
#""" Input N>=6, Returns a list of primes, 2 <= p < N """
correction = N % 6 > 1
N = {0:N, 1:N-1, 2:N+4, 3:N+3, 4:N+2, 5:N+1}[N%6]
sieve = [True] * (N // 3)
sieve[0] = False
for i in range(int(N ** .5) // 3 + 1):
if sieve[i]:
k = (3 * i + 1) | 1
sieve[k*k // 3::2*k] = [False] * ((N//6 - (k*k)//6 - 1)//k + 1)
sieve[(k*k + 4*k - 2*k*(i%2)) // 3::2*k] = [False] * ((N // 6 - (k*k + 4*k - 2*k*(i%2))//6 - 1) // k + 1)
return [2, 3] + [(3 * i + 1) | 1 for i in range(1, N//3 - correction) if sieve[i]]
smallprimeset = set(primesbelow(100000))
_smallprimeset = 100000
def isprime(n, precision=7):
# http://en.wikipedia.org/wiki/Miller-Rabin_primality_test#Algorithm_and_running_time
if n < 1:
raise ValueError("Out of bounds, first argument must be > 0")
elif n <= 3:
return n >= 2
elif n % 2 == 0:
return False
elif n < _smallprimeset:
return n in smallprimeset
d = n - 1
s = 0
while d % 2 == 0:
d //= 2
s += 1
for repeat in range(precision):
a = random.randrange(2, n - 2)
x = pow(a, d, n)
if x == 1 or x == n - 1: continue
for r in range(s - 1):
x = pow(x, 2, n)
if x == 1: return False
if x == n - 1: break
else: return False
return True
# https://comeoncodeon.wordpress.com/2010/09/18/pollard-rho-brent-integer-factorization/
def pollard_brent(n):
if n % 2 == 0: return 2
if n % 3 == 0: return 3
y, c, m = random.randint(1, n-1), random.randint(1, n-1), random.randint(1, n-1)
g, r, q = 1, 1, 1
while g == 1:
x = y
for i in range(r):
y = (pow(y, 2, n) + c) % n
k = 0
while k < r and g==1:
ys = y
for i in range(min(m, r-k)):
y = (pow(y, 2, n) + c) % n
q = q * abs(x-y) % n
g = gcd(q, n)
k += m
r *= 2
if g == n:
while True:
ys = (pow(ys, 2, n) + c) % n
g = gcd(abs(x - ys), n)
if g > 1:
break
return g
smallprimes = primesbelow(1000) # might seem low, but 1000*1000 = 1000000, so this will fully factor every composite < 1000000
def primefactors(n, sort=False):
factors = []
for checker in smallprimes:
while n % checker == 0:
factors.append(checker)
n //= checker
if checker > n: break
if n < 2: return factors
while n > 1:
if isprime(n):
factors.append(n)
break
factor = pollard_brent(n) # trial division did not fully factor, switch to pollard-brent
factors.extend(primefactors(factor)) # recurse to factor the not necessarily prime factor returned by pollard-brent
n //= factor
if sort: factors.sort()
return factors
def factorization(n):
factors = {}
for p1 in primefactors(n):
try:
factors[p1] += 1
except KeyError:
factors[p1] = 1
return factors
totients = {}
def totient(n):
if n == 0: return 1
try: return totients[n]
except KeyError: pass
tot = 1
for p, exp in factorization(n).items():
tot *= (p - 1) * p ** (exp - 1)
totients[n] = tot
return tot
def gcd(a, b):
if a == b: return a
while b > 0: a, b = b, a % b
return a
def lcm(a, b):
return abs((a // gcd(a, b)) * b)
```

Pollard-Brent in implemented in Python:

https://comeoncodeon.wordpress.com/2010/09/18/pollard-rho-brent-integer-factorization/

If you don’t want to reinvent the wheel, use the library sympy

```
pip install sympy
```

Use the function `sympy.ntheory.factorint`

```
>>> from sympy.ntheory import factorint
>>> factorint(10**20+1)
{73: 1, 5964848081: 1, 1676321: 1, 137: 1}
```

You can factor some very large numbers:

```
>>> factorint(10**100+1)
{401: 1, 5964848081: 1, 1676321: 1, 1601: 1, 1201: 1, 137: 1, 73: 1, 129694419029057750551385771184564274499075700947656757821537291527196801: 1}
```

There is no need to calculate `smallprimes`

using `primesbelow`

, use `smallprimeset`

for that.

`smallprimes = (2,) + tuple(n for n in xrange(3,1000,2) if n in smallprimeset)`

Divide your `primefactors`

into two functions for handling `smallprimes`

and other for `pollard_brent`

, this can save a couple of iterations as all the powers of smallprimes will be divided from n.

```
def primefactors(n, sort=False):
factors = []
limit = int(n ** .5) + 1
for checker in smallprimes:
print smallprimes[-1]
if checker > limit: break
while n % checker == 0:
factors.append(checker)
n //= checker
if n < 2: return factors
else :
factors.extend(bigfactors(n,sort))
return factors
def bigfactors(n, sort = False):
factors = []
while n > 1:
if isprime(n):
factors.append(n)
break
factor = pollard_brent(n)
factors.extend(bigfactors(factor,sort)) # recurse to factor the not necessarily prime factor returned by pollard-brent
n //= factor
if sort: factors.sort()
return factors
```

By considering verified results of Pomerance, Selfridge and Wagstaff and Jaeschke, you can reduce the repetitions in `isprime`

which uses Miller-Rabin primality test. From Wiki.

- if n < 1,373,653, it is enough to test a = 2 and 3;
- if n < 9,080,191, it is enough to test a = 31 and 73;
- if n < 4,759,123,141, it is enough to test a = 2, 7, and 61;
- if n < 2,152,302,898,747, it is enough to test a = 2, 3, 5, 7, and 11;
- if n < 3,474,749,660,383, it is enough to test a = 2, 3, 5, 7, 11, and 13;
- if n < 341,550,071,728,321, it is enough to test a = 2, 3, 5, 7, 11, 13, and 17.

**Edit 1**: Corrected return call of `if-else`

to append bigfactors to factors in `primefactors`

.

Even on the current one, there are several spots to be noticed.

- Don’t do
`checker*checker`

every loop, use`s=ceil(sqrt(num))`

and`checher < s`

- checher should plus 2 each time, ignore all even numbers except 2
- Use
`divmod`

instead of`%`

and`//`

You should probably do some prime detection which you could look here,

Fast algorithm for finding prime numbers?

You should read that entire blog though, there is a few algorithms that he lists for testing primality.

There’s a python library with a collection of primality tests (including incorrect ones for what not to do). It’s called pyprimes. Figured it’s worth mentioning for posterity’s purpose. I don’t think it includes the algorithms you mentioned.

I just ran into a bug in this code when factoring the number `2**1427 * 31`

.

```
File "buckets.py", line 48, in prettyprime
factors = primefactors.primefactors(n, sort=True)
File "/private/tmp/primefactors.py", line 83, in primefactors
limit = int(n ** .5) + 1
OverflowError: long int too large to convert to float
```

This code snippet:

```
limit = int(n ** .5) + 1
for checker in smallprimes:
if checker > limit: break
while n % checker == 0:
factors.append(checker)
n //= checker
limit = int(n ** .5) + 1
if checker > limit: break
```

should be rewritten into

```
for checker in smallprimes:
while n % checker == 0:
factors.append(checker)
n //= checker
if checker > n: break
```

which will likely perform faster on realistic inputs anyway. Square root is slow — basically the equivalent of many multiplications —, `smallprimes`

only has a few dozen members, and this way we remove the computation of `n ** .5`

from the tight inner loop, which is certainly helpful when factoring numbers like `2**1427`

. There’s simply no reason to compute `sqrt(2**1427)`

, `sqrt(2**1426)`

, `sqrt(2**1425)`

, etc. etc., when all we care about is “does the [square of the] checker exceed `n`

“.

The rewritten code doesn’t throw exceptions when presented with big numbers, **and** is roughly twice as fast according to `timeit`

(on sample inputs `2`

and `2**718 * 31`

).

Also notice that `isprime(2)`

returns the wrong result, but this is okay as long as we don’t rely on it. IMHO you should rewrite the intro of that function as

```
if n <= 3:
return n >= 2
...
```

You could factorize up to a limit then use brent to get higher factors

```
from fractions import gcd
from random import randint
def brent(N):
if N%2==0: return 2
y,c,m = randint(1, N-1),randint(1, N-1),randint(1, N-1)
g,r,q = 1,1,1
while g==1:
x = y
for i in range(r):
y = ((y*y)%N+c)%N
k = 0
while (k<r and g==1):
ys = y
for i in range(min(m,r-k)):
y = ((y*y)%N+c)%N
q = q*(abs(x-y))%N
g = gcd(q,N)
k = k + m
r = r*2
if g==N:
while True:
ys = ((ys*ys)%N+c)%N
g = gcd(abs(x-ys),N)
if g>1: break
return g
def factorize(n1):
if n1==0: return []
if n1==1: return [1]
n=n1
b=[]
p=0
mx=1000000
while n % 2 ==0 : b.append(2);n//=2
while n % 3 ==0 : b.append(3);n//=3
i=5
inc=2
while i <=mx:
while n % i ==0 : b.append(i); n//=i
i+=inc
inc=6-inc
while n>mx:
p1=n
while p1!=p:
p=p1
p1=brent(p)
b.append(p1);n//=p1
if n!=1:b.append(n)
return sorted(b)
from functools import reduce
#n= 2**1427 * 31 #
n= 67898771 * 492574361 * 10000223 *305175781* 722222227*880949 *908909
li=factorize(n)
print (li)
print (n - reduce(lambda x,y :x*y ,li))
```