## Introduction

I have decided to write a series of blog posts which shall culminate in explaining Shor's algorithm. Peter Shor in the year 1994 showed an algorithm that can be used to break RSA in a reasonable amount of time. Shor's algorithm is a quantum algorithm that can be run on a quantum computer. Fortunately, we don't yet have a practical realization of quantum computers that can break RSA. To understand Shor's algorithm, one needs to understand a bit of quantum mechanics and quantum computation. Before I delve into posts relating to quantum mechanics and quantum computation, as a warm-up , I shall start with the RSA algorithm.

Fundamentally this blog post shall explain the math behind RSA. Public key cryptography that underlies secure transmission which enables eCommerce on the internet, military communication employs RSA. In public-key cryptography, we use two keys, one known as public key which is used to encrypt the message and is known to everybody. The other is the private key (secret key) which is used for decryption. It is impossible or expensive to derive the private key from public key using classical computers that are currently in existence and this forms the security of RSA. This algorithm was discovered by Ronald, Shamir and Adleman at MIT in the year 1976 and thus the name RSA. To understand RSA, let's start with the math that underlies it.

## Number Theoretic Introduction

**Modular Arithmetic**:- Modular arithmetic allows arithmetic over a set of numbers. The set is formed by the list of numbers that can differ from the multiples of number N. This number N is a positive integer which is termed the modulus and there are N distinct integers which is given by the set $\{0,1, ..., N-1\}$. Modular arithmetic differs from normal arithmetic in that you do arithmetic not on a line but on a circle or a ring. Once you reach $N-1$, you wrap around to zero. Addition and Multiplication are the arithmetic operations defined in $\mod N$.

Let's define more formally. If a,b,N are integers, then in mathematical notation

$ a\equiv b\pmod N\iff a = b + kN \text{ for some integer k.}$

Examples are:

$26\equiv 5\pmod 7\text{ because } 26 = 5 + 3 * 7$

$-1\equiv 6\pmod 7\text{ because } -1 = 6 + (-1) * 7$.

The next concept that we need to understand is

A number x has an inverse in $\mod N$ if and if only gcd(x, N) = 1. Here gcd is the greatest common divisor (the largest integer that divides both the number.). The gcd of two numbers x and y can written as a linear combination of two integers a and b such that $$ gcd(x,y) = a . x + b .y $$ Using the above, let's write $gcd(x,N) = a . x + b . N$. Now let's apply $\mod N$ arithmetic to the preceding equation. $b . N$ cancels out since it is a multiple of N. What remains is $ a . x = 1 \pmod N$. This implies the inverse of x is a. The above proof shows that inverse exists and shows way to find the inverse efficiently. $x^{-1}=a$ can be found out using Extended Euclidean Algorithm whose runtime is $0(log^2N)$.

Let's discuss the case when the modulus N is a prime p. When the modulus is prime, other than zero, all other remaining $p-1$ integers have no factor common with p which means they are coprime. This means that $gcd(x,p) = 1$ which implies all elements have modulo inverses. Fermat in 1640 introduced the Fermat Little Theorem which states that if p is a prime and x be any integer, then $$ a ^ {p-1} \equiv 1 \pmod p $$ About a 100 years later, Euler stated a more general result $$ x ^ {\phi(n)-1} \equiv 1 \pmod N $$ What is this $\phi(N)$ ?. It is called the Euler totient function which is a count of the number of positive integers that are coprime with N. If N is a prime p, then $ \phi(p) = p - 1$. If N is product of two primes p and q, that is, $N = p . q$, then \phi(N) = N - p - q + 1 = (p-1)(q-1). This follows from the fact that the number of integers less than N that are not coprime with N are those that are divisible by either p or q. Among the N (distinct numbers, there are p ( $p.q/q$)of them which are divisible by q and q ($p.q/p$) of them which are divisble by p. Zero is added in both cases, so we add 1 to avoid the double addition. This is a very important result which form the basis of the RSA cryptosystem.

We are almost done with the math required to understand the RSA algorithm. Now we just need to understand group theory a bit. It is required to prove the existence of an inverse ( in the group $G_{(p-1)(q-1)}$

[TO BE COMPLETED LATER]

The next concept that we need to understand is

**modular inverse**. Inverse is usually something that when combines with which it is inverse for gives you the identity. As an example, the inverse of $2$ in rational system is $1/2$. The modular inverse of x is y such that $ x . y = 1 \pmod N$. Next question what all elements in $\mod N$ have inverses and how to find the inverse efficiently.A number x has an inverse in $\mod N$ if and if only gcd(x, N) = 1. Here gcd is the greatest common divisor (the largest integer that divides both the number.). The gcd of two numbers x and y can written as a linear combination of two integers a and b such that $$ gcd(x,y) = a . x + b .y $$ Using the above, let's write $gcd(x,N) = a . x + b . N$. Now let's apply $\mod N$ arithmetic to the preceding equation. $b . N$ cancels out since it is a multiple of N. What remains is $ a . x = 1 \pmod N$. This implies the inverse of x is a. The above proof shows that inverse exists and shows way to find the inverse efficiently. $x^{-1}=a$ can be found out using Extended Euclidean Algorithm whose runtime is $0(log^2N)$.

Let's discuss the case when the modulus N is a prime p. When the modulus is prime, other than zero, all other remaining $p-1$ integers have no factor common with p which means they are coprime. This means that $gcd(x,p) = 1$ which implies all elements have modulo inverses. Fermat in 1640 introduced the Fermat Little Theorem which states that if p is a prime and x be any integer, then $$ a ^ {p-1} \equiv 1 \pmod p $$ About a 100 years later, Euler stated a more general result $$ x ^ {\phi(n)-1} \equiv 1 \pmod N $$ What is this $\phi(N)$ ?. It is called the Euler totient function which is a count of the number of positive integers that are coprime with N. If N is a prime p, then $ \phi(p) = p - 1$. If N is product of two primes p and q, that is, $N = p . q$, then \phi(N) = N - p - q + 1 = (p-1)(q-1). This follows from the fact that the number of integers less than N that are not coprime with N are those that are divisible by either p or q. Among the N (distinct numbers, there are p ( $p.q/q$)of them which are divisible by q and q ($p.q/p$) of them which are divisble by p. Zero is added in both cases, so we add 1 to avoid the double addition. This is a very important result which form the basis of the RSA cryptosystem.

We are almost done with the math required to understand the RSA algorithm. Now we just need to understand group theory a bit. It is required to prove the existence of an inverse ( in the group $G_{(p-1)(q-1)}$

[TO BE COMPLETED LATER]

## RSA Algorithm

As human culture has a fancy for hero and heroine, so is the cryptography world. The hero and heroine character shall be played by people usually named Alice and Bob. The villain is the female Eve. Let's get to the situation where Bob, the hero wants to send a secret message to the heroine Alice using RSA. This involves three phases:

**Key Generation**:- The key generation phase involves generating the public and private key. Alice chooses two random primes p and q. p and q should be large primes and of similar bit length (it's usually 1024 bits in length). Then she calculates $N = p . q$. Compute the Euler phi function $\phi (N) = (p-1)(q-1)$. Then Alice chooses the encryption exponent e which is coprime to $\phi (N) $ and is in range $1 < e < \phi (N)$. Coprime means the numbers $e$ and $\phi (N)$ do not have any factors in common, that is, $ gcd(e,\phi (N)) = 1$. Then the decryption exponent d is found such that $ e . d = 1 \mod \phi(N)$. d is the multiplicative inverse and can be found using the Extended Euclid Algorithm. Alice outputs the public key which is the pair (e,N) and this is made known to Bob and whosoever she wishes. She keeps the private key d with her. It is to be noted that p, q and $\phi (N) $ are to be kept secret by Alice and shall be used to recover d.**Encryption**:- The Bob wants to send a secret message so that Eve doesn't eavesdrop this message and create havoc in their relationship. The message that he wants to transmit is usually converted into a number x. Then he applies the RSA function which forms the secret messsage that shall be transmitted over the wire: $$ RSA(x) \equiv y \equiv x^e \pmod N $$ It to be noted that pair (e,N) provides a 'one-way function' and using (e,N) it is not possible to recover the original message.**Decryption**:- Alice receives the message from Bob and the original message can be recovered by making use of the decryption exponent d which she possesses. The decryption provides the invert function which is: $$ x \equiv RSA^{-1}(y) \equiv y^d \pmod N$$ What remains to be proven is the validity of the above inverse function, that is, it indeed recovers the original message , that Bob has sent to Alice. In addition, the female Eve should not be able to infer the original message if she knows (e,N) and RSA(x). This shall be topic of the next section.

RSA is generally a computational expensive algorithm. In practice, RSA is used to exchange secret key of another cipher algorithm like DES and the encrypted messages are exchanged using this algorithm which is computationally efficient.

## Security of RSA and Classical Algorithm for factoring

Let me prove the validity that the invert function $RSA^{-1}$ indeed recovers the original message.

$RSA^{-1}(y) \equiv y^d \pmod N$

$ \equiv x^{e^d} \pmod N \equiv x^{ed} \pmod N \equiv x^{1+k . \phi(N)} \pmod N$

$ \equiv x . x^{k . \phi(N)} \pmod N \equiv x . x^{{\phi(N)}^k} \pmod N \equiv x . 1^k \pmod N \equiv x \pmod N$

which gives the original message.

$RSA^{-1}(y) \equiv y^d \pmod N$

$ \equiv x^{e^d} \pmod N \equiv x^{ed} \pmod N \equiv x^{1+k . \phi(N)} \pmod N$

$ \equiv x . x^{k . \phi(N)} \pmod N \equiv x . x^{{\phi(N)}^k} \pmod N \equiv x . 1^k \pmod N \equiv x \pmod N$

which gives the original message.

To break RSA, one has to find the prime factor p and q of N. How hard it is to find the prime factors p and q and what is the time complexity?

[ TO BE COMPELTED LATER]

## No comments :

Post a Comment