In one of my earlier blogs, I gave extracts from the book, ‘A Mathematicians Apology’ written by the great mathematician G H Hardy. Yes the same person who discovered our Ramanujan. In this book he talks about Fermat’s Two Squares Theorem, as below:
- – - “Another famous and beautiful theorem is Fermat’s ‘two square’ theorem. The primes may (if we ignore the special prime 2) be arranged in two classes; the primes
5, 13, 17, 29, 37, 41, …
which leave remainder 1 when divided by 4, and the primes
3, 7, 11, 19, 23, 31,…
which leave remainder 3. All the primes of the first class, and none of the second, can be expressed as the sum of two integral squares: thus
5 = 12 + 22 , 13 = 22 + 32 ,
17 = 12 + 42 , 29 = 22 + 52 ;
but 3, 7, 11, and 19 are not expressible in this way (as the reader may check by trial). This is Fermat’s theorem, which is ranked, very justly, as one of the finest of arithmetic. Unfortunately, there is no proof within the comprehension of anybody but a fairly expert mathematician.” – - -
Fermat’s Two Square Theorem looks nice and simple. But why should its proof also be nice and simple? I have given a proof here which I think is simple enough for all to appreciate.
In number theory, formal statement of Pierre de Fermat’s theorem on sums of two squares goes as below:
An odd prime p is expressible as
p = x2 + y2 ; with x and y as integers, if and only if p =1 (mod 4).
Proof-1
Sum of two squares:
Case 1: Both numbers are odd
2m+1 and 2n+1
Sum of squares = (2m+1)2 + (2n+1)2
= 4m2+ 4n2+ 4m+ 4n+ 1+ 1
= 4(m2+ n2+ m+ n) + 2 = 4k + 2 (say)
Case 2: Both are even
2m+2 and 2n+2 ( to avoid zero)
Sum of squares = (2m+2)2 + (2n+2)2
= 4m2+ 4n2+ 8m+ 8n+ 4+ 4
= 4(m2+ n2+ 2m+ 2n + 2) = 4k + 0 (say)
Case 3: One is odd and the other is even
2m+1 and 2n+2 ( to avoid zero)
Sum of squares = (2m+1)2 + (2n+2)2
= 4m2+ 4n2+ 4m+ 8n+ 1+ 4
= 4(m2+ n2+ m+ 2n + 1) + 1 = 4k + 1 (say)
Hence any number of the form 4k + 3 can never be a sum of two squares. This includes primes also. Hence one part of Fermat’s theorem is proved.
As per the second part of the theorem, the prime numbers of the form 4k+1 will definitely qualify to be sum of squares as per case-3 above. But we have to prove that they are always of the form x2 + y2.
Proof-2
In this proof, I am using another theorem by Fermat known as Fermat’s Little Theorem, which is given below along with proof.
Fermat’s Little Theorem.
Let p be a prime which does not divide the integer a, then ap-1 = 1 (mod p).
Proof.
Start by listing the first p-1 positive multiples of a:
a, 2a, 3a, … (p -1)a
Suppose that ra and sa are the same modulo p.
Then, (r – s)a is divisible by p
Since prime p does not divide a, (as stated above), it should divide r-s.
This means r = s, (mod p), which cannot be true.
So the p-1 multiples of a above are distinct and nonzero,
Hence, they must be congruent to 1, 2, 3, …, p-1 (mod p), in some order.
Multiply all these congruencies together and we find,
a*.2a*.3a*.…*.(p-1)a = 1*2*.3*.…*.(p-1) (mod p)
or, a(p-1)(p-1)! = (p-1)! (mod p).
Divide both sides by (p-1)!
a(p-1) = 1 (mod p).
Hence proved.
By Fermat’s Little Theorem, for a prime number p, and for any number a not divisible by p,
a ** (p-1) = 1 (mod p),
If p=1 (mod 4), then p must be of the form p = 4k + 1
Hence we may write, for any prime p of the form p = 4k + 1, and for any number a not divisible by p, by Fermat’s Little Theorem,
a ** (4k) = 1 (mod p),
i.e.
a ** 2k = +1 or -1 (mod p), depending on the value of a. [Note: -1 (mod p) = p-1]
Now let us take one value of a, for which, (a ** 2k) = -1 (mod p)
(The proof for existence of such a value for ‘a’ is given in the end)
So, (a**2k + 1) is divisible by p, = f*p, say.
(i.e.) (a**k + j1)(a**k – j1) = f*p
Let us say p is a prime in complex domain also. Then p divides at least on of the complex numbers on the left hand side. Then it should divide its conjugate also. Then it should divide the difference also. i.e. p divides ‘j2’, which cannot be true. Hence p is not a prime in complex domain. This means p has complex factors though it is a prime in real numbers domain.
Let,
(p + j0) = (x +jy) (r+ js)
= (xr-ys) + j(yr+xs)
i.e. (yr+xs)= 0
i.e. yr = -xs, and s = -yr/x
and p = xr – ys = xr + y2r/x = (r/x)(x2 + y2)
Since p is prime, r/x =1 and p = (x2 + y2)
Hence proved
Examples
a. Consider p =13 = 4*k + 1, k=3
2 ** 2k = 2**6 = 64 = -1|mod p
(2**2k + 1) = divisible by p = 5 * 13
(2**3 + j1)(2**3 – j1) = 5 * 13
Substituting, 5 = (2-j1)(2+j1) and 13 = (3+j2)(3-j2)
We get, (8 + j1) (8 – j1)= (2 – j1)[(3+j2)(3-j2)] (2 + j1)
b. Consider p=17 = 4*k + 1, k = 4
3**2k = 3**8 = 6561 = -1|mod p
3**2k + 1 = divisible by p = 386 * 17
i.e. (3**4 + j1)(3**4 – j1) = (19+j5) [(4-j1)(4+j1)] (19 – j5)
Proof for existence of a value for ‘a’ such that (a ** 2k) = -1 (mod p)
Consider a taking values a = 1 to p-1, none of them divisible by p.
Consider the sum,
4k 4k
∑ (a+1)**(n+1) = ∑ a**(n+1) + (4k+1)**(n+1)
a=0 a=1
We can write,
(a+1)**(n+1)
= a**(n+1) + (n+1)C1 a**n + (n+1)C2 a**(n-1) + ……+ (n+1)Cn a + 1
So, the above summation may be re-written as
4k
∑[(n+1)C1 a**n + (n+1)C2 a**(n-1) + ……+ (n+1)Cn a] + (4k+1)
a=0
= (4k+1)**(n+1)
Realising that p = 4k+1, we get,
4k
∑[(n+1)C1 a**n + (n+1)C2 a**(n-1) + ……+ (n+1)Cn a] = p**(n+1) – p = 0|mod p
a=0
In short we may write, Sum(n) = p**(n+1) – p = 0|mod p
The above equation is true for all values of n, n = 1 to n.
Hence,
Sum(1) = ∑ 04k 2C1 a = p**2 – p = p(p-1) = 0|mod p;
i.e ∑ 04k a = p(p-1)/2 = 4k(4k+1)/2: a known result.
And,
Sum(2) = ∑ 04k [ 3C2 a**2 + 3C1 a] = p**3 – p = p(p**2 – 1) = 0|mod p
Using the above result for Sum(1), we get ∑ 04k a**2 = 0|mod p
And,
Sum(3)=∑ 04k [4C3 a**3+ 4C2 a**2 + 4C1 a] = p**4 – p = p(p**3 – 1) = 0|mod p
Using above Sum(1) and Sum(2), we get ∑ 04k a**3 = 0|mod p
And so on . . . . . .
The particular case of interest to us is, when n = 2k.
Using same successive logic as above, we get,
∑ a=0 4k a**2k = 0|mod p - – - – - – - – - - A
As stated at the start, by Fermat’s Little Theorem,
a ** (4k) = 1 (mod p),
Hence,
a ** 2k = +1 or -1 (mod p), depending on the value of a.
Equation A above means, (a ** 2k)|mod p, equally assumes
both values of +1 and -1, for ‘a’ varying from 1 to 4k.
Thus proved