agda-stdlib
agda-stdlib copied to clipboard
Add prime factorization and its properties
This PR proves that every non-zero natural number has a unique (up to permutation) prime factorization.
To this end, it also adds a definition of "rough" numbers (numbers which don't have prime factors below a bound)
This is something that I think is important to have a proof for, but I think it needs considerable improvement before it meets the quality criterion for this library, and I can't see how to improve it, so I'd appreciate feedback.
The factorization algorithm is trial division. It does not currently short-circuit when it checks a factor greater than the square root of the number being factorized, which would significantly improve its speed in some cases, but makes the code a bit more complicated.
Also needs a CHANGELOG entry
This all looks great! Two further comments, of different character:
- the big proof makes several appeals to decomposing proofs/a proof of
m ∣ nto expose both the quotientqand its defining equation. Some of those proofs (and the supportingwhereclauses for all the several lemmas involved) might usefully have those record projections exposed at the beginning, and then shared throughout.., (this is a cosmetic refactoring, but still worth doing, I think) - can you/does the library, show anywhere that the divisibility relation (Modulo having a factor/quotient >1) is well-founded? And with such a proof, would the ambient wf-recursion which drives the big proof be simplified/simplifiable in any way? UPDATED I don't now myself see this immediately, except as an alternative proof strategy...
Well foundedness of divisibility is independently interesting as part of showing various things, including I think the original proofs of the Fundamental Theorem. Might be interesting to compare!?
Re: the proof in #2181 that every number is 1 Rough
This permits the main complete induction proof factorise to (mostly!) avoid having to relitigate the 0, 1, 2+_ pattern analysis in favour of a (slightly!?) simpler induction on NonZero n, and thereby stripping out a great deal of 1<2+n to-ing and fro-ing, as follows:
factorise : ∀ n → .{{NonZero n}} → PrimeFactorisation n
factorise n@(suc _) = <-rec P facRec n (≤⇒≤‴ z<s) 1-rough
where
P : ℕ → Set
P m = .{{NonZero m}} → ∀ {k} → k ≤‴ m → k Rough m → PrimeFactorisation m
facRec : ∀ n → <-Rec P n → P n
facRec 1 rec k<1 rough = record
{ factors = []
; factorsPrime = []
; isFactorisation = refl
}
facRec n@(2+ _) rec = h
where
h : P n
h {0} (≤‴-step (≤‴-step k<n)) rough = h k<n 2-rough
h {1} (≤‴-step k<n) rough = h k<n 2-rough
h {k@(2+ _)} ≤‴-refl rough = record
{ factors = n ∷ []
; factorsPrime = prime rough ∷ []
; isFactorisation = ≡.sym (*-identityʳ n)
}
h {k@(2+ _)} (≤‴-step k<n) rough with k ∣? n
... | no k∤n = h k<n (∤⇒rough-suc k∤n rough)
... | yes k∣n with divides q eq ← k∣n = record
{ factors = k ∷ ps
; factorsPrime = rough⇒∣⇒prime 1<k rough k∣n ∷ primes
; isFactorisation = n≡k*Πps
}
where
open ≤-Reasoning
1<k : 1 < k
1<k = 1<2+n
n≡k*q : n ≡ k * q
n≡k*q = begin-equality
n ≡⟨ eq ⟩
q * k ≡⟨ *-comm q k ⟩
k * q ∎
1<q : 1 < q
1<q = ≰⇒> λ q≤1 → n≮n n $ begin-strict
n ≡⟨ eq ⟩
q * k ≤⟨ *-monoˡ-≤ k q≤1 ⟩
1 * k ≡⟨ *-identityˡ k ⟩
k <⟨ ≤‴⇒≤ k<n ⟩
n ∎
0<q : 0 < q
0<q = begin-strict
0 <⟨ z<s ⟩
1 <⟨ 1<q ⟩
q ∎
instance _ = >-nonZero 0<q
q<n : q < n
q<n = begin-strict
q <⟨ m<m*n q k 1<k ⟩
q * k ≡⟨ eq ⟨
n ∎
q∣n : q ∣ n
q∣n = divides k n≡k*q
k≤q : k ≤ q
k≤q = ≮⇒≥ λ q<k → rough (boundedComposite 1<q q<k q∣n)
factorisation[q] : PrimeFactorisation q
factorisation[q] = rec q<n (≤⇒≤‴ k≤q) (rough⇒∣⇒rough rough q∣n)
ps = factors factorisation[q]
primes : All Prime ps
primes = factorsPrime factorisation[q]
q≡Πps : q ≡ product ps
q≡Πps = isFactorisation factorisation[q]
n≡k*Πps : n ≡ k * product ps
n≡k*Πps = begin-equality
n ≡⟨ n≡k*q ⟩
k * q ≡⟨ cong (k *_) q≡Πps ⟩
k * product ps ∎
The factorization algorithm is trial division. It does not currently short-circuit when it checks a factor greater than the square root of the number being factorized, which would significantly improve its speed in some cases, but makes the code a bit more complicated.
Returning to your initial discussion, how hard is it to include the analysis in terms of only considering trial factors of n up to (a suitable approximation to) square root of n... esp. in terms of characterisation via Rough ness?
Another followup:
- there's a wealth of additional material on theorem proving about primes from 20+ years ago in a great paper from Laurent Thery
which might offer opportunities for interesting downstream extensions to your work.
And apologies for all the diversions around #2181 and #2182 getting in the way of you getting this merged, but hopefully they help streamline it!
After a fair amount of refactoring, I've successfully made it short-circut when the candidate factor is greater than the square root of the number being factorised! This makes it so much quicker (my very scientific test of "factorise a decent sized prime from stdin in a compiled program" went from 40s to instant for 433494437)
@Taneb you wrote on #2300
The reason I asked about commutative structures is that the original reason I wanted
foldMapwas because it felt like the right way to say thatsumrespects permutation - I've defined something similar in #1969 forproducthere
Following up on this here... why not factor this proof as one for a commutative monoid operation, and then instantiate to product being commutative?
Addressed some of the feedback. @Taneb if you're willing to write something brief about @JacquesCarette's comment on the unique multiplicity representation I'm happy to merge this in.
@jamesmckinna any chance you could turn your thumbs up into an "approve"? :smile:
@jamesmckinna any chance you could turn your thumbs up into an "approve"? 😄
Apologies for holding things up through inattention and too much nit-picking!
Now DONE, and hitting 'merge'...