CSCI 530 Lab

RSA public-key algorithm
operationally hard, formulaically easy

The RSA algorithm is centered on a short, simple piece of math: raising an integer to one number then dividing by a second one.

xy mod n

But you protest, shouldn't "dividing by a second one" be written as:

xy ÷ n  ?

Usually in a division the "answer" is considered to be the quotient q. But a division has two results:

a = qb + r

The other result is the remainder r.

We tend to overlook it. Sometimes we explicitly discard it, focusing on the quotient as the purpose and outcome of dividing. But here, by contrast, the remainder interests us and the quotient doesn't. Just as the "÷" operation means determining the quotient, the "mod" operation means determining the remainder when doing a division. But it's the exact same operation. Division. The thing that differs is the choice of result:

a ÷ b ÷  means "division" - answer is the quotient, remainder is uninteresting
a mod b mod means "division" - answer is the remainder, quotient is unintersting

What's the answer when you divide 10 by 3? Why, it's 1 of course. Get it?

OK, now that we understand "same operation, different answer," what about the difficulty of doing RSA which means doing xy mod n? That's formulaically easy but operationally hard. It's easy because everybody knows about taking powers and dividing and remainders. If x is 4, y is 3 and n is 10 can you do the two steps? Can you raise 43? Sixty-four? Thank you. Can you divide 64 by 10? What's the answer? Four? Thank you. The steps are well-defined and familiar. How about if x is 65, y is 15141, and n is 212197? The challenge that now overcomes you is the size of the numbers. It's not that you don't know what to do, it's that you can't actually do it. You can't raise 65 to the power 15141 in your head. You can't do it on your calculator either. Or even your computer with a regular programming language. Those are made for reliable, accurate calculation. But not precise calculation. They offer some few digits of "holding capacity" for numbers and can't hold any more. They can hold a number as large as you like. The magnitude isn't the problem. But they can only hold a few digits of that big number. The precision is the problem. 6515141 is an integer with thousands of digits. The calculator can't hold it.

A tool you will like

There is an arbitrary precision calculator product called bc ("basic calculator"). It is installed by default in most linux distributions and a Windows version is available. bc contains a modest programming language. It can also be called from within shell scripts leaving the programming logic to the shell but calling on bc to do the heavy numeric lifting and return the results to the script. Unlike most numeric programs bc does not do its arithmetic in binary, which would limit it to the word size of the CPU and bus. Rather, it does the necessary internal contortions (unnatural to the computer) to conduct and maintain all its operations in decimal. It will give you as much precision as you want. It prints 6515141 on the screen down to the integer (which occupies several screensful of printed digits). It has no problem with performing such things as xy mod n, so it a great tool for doing RSA on a tutorial basis. For doing it on a practical basis bc is too slow. But it can do the operational part-- the hard part. It's just right for our purposes.

What RSA does

RSA does xy mod n twice, with different numbers. Once to encrypt, again to decrypt. Unlike secret key algorithms such as DES there is no excruciating sequence of intricate molestations forcing the plaintext through multiple meat grinders diagonally, laterally, and upside down. Just a single simple exponentiation divided. To make it work, you  need only employ the right numbers. The numbers involved:

a public key consisting of 2 integers - e and n
a matching, precalculated private key consisting of 2 integers - d and n

an input integer, the plaintext message - m
the ciphered version of m, an integer - c

Then here's what RSA does.

encryption:  me mod n

That is c, the ciphertext. Followed by

decryption:  cd mod n

That will reproduce the original m. Integers are what RSA encrypts and decrypts. To process a substantial message, reduce it to a series of integers (ascii codes are a possibility) and individually treat those. That's it.

 

The exercise to perform:

As a convention, the text below printed like this signifies commands you are intended to type in and run as you see them. As you perform this exercise, when you determine values for p, q, e, and d retain them (write them down). You'll need them at the end.

In a linux environment with bc "basic calculator" installed, run it:

bc

You are now in an interactive environment. Though none is printed, you are at a prompt. If you type an expression and press enter, the value of the expression appears. Try

33

Suppose you are interested in one hundred divided by thirty-three:

100 = 3 x 33 + 1

If you want to do division and get the quotient:

100/33

You get the "3". If you want to do division and get the remainder:

 100%33

You get the "1". In bc, " /" is the "division for quotient" operator while the "%" is the "division for remainder" operator. (While written texts use "mod", bc and other languages use "%" synonymously.)

Now do something heavier:

65^15141

Multiple screensful worth of answer show up. "^" is the exponentiation operator. (The language's few operators are documented in the bc man page, or the online manual.) Now, for the moment, quit from bc:

quit

Implementing RSA encryption and decryption in bc

As a preliminary, please obtain a file containing a needed library of functions for bc. What we need is the ability to find greatest common divisors. I found an online set of pre-written functions for bc including gcd(a,b), which returns the greatest common divisor of 2 integers (it uses Euclid's algorithm to do so). To be able to use the gcd function, acquire the file by the same name (which contains the desired function and a bunch of others). You may find it located within your vm, in /home/public. In that case you can get it by:

cp  /home/public/gcd  .   [ don't overlook the dot, it is shorthand for "the current directory" and denotes the destination for the copy ]

Alternatively, if your vm has internet connection, by:

wget  http://www-scf.usc.edu/~csci530l/downloads/gcd

or otherwise per your instructor. Put it in your current working directory. Then re-invoke bc together with gcd:

bc gcd

The gcd( ) function is now available, additively, within bc. So let's get back to business.

Doing the RSA algorithm boils down to 3 steps. The second one is encrypting an input with a public key and the third is decrypting the result with the matching private key. But don't forget step one, which is coming up with a pair of keys that match (i.e., that will actually work) in the first place. For right now, I will hand you a pair of keys that match so you can go ahead and perform encrypting/decrypting to satisfy that it works. Then we'll go back and learn how to generate keys if you don't already have them.

Encrypting an integer in RSA is just raising it to one number then dividing by a second one. The remainder, another integer, becomes the encrypted version of the first one. RSA is a substitution cipher, so it's used to replace the original integer in the data stream to be transmitted to a recipient. In order to do it, you just need to know what 2 numbers to utilize. They, as a pair, are "the key." Decrypting is the same thing with different numbers: the other key. As long as the numbers used as keys are right, you get correct recovery. Here you go:

Public key - 3 and 55

Matching private key - 27 and 55

These are suitable for encrypting and decrypting any integer up to 54. Let's try it on 43. In bc:

43^3%55

bc prints 32 as the result. That is, 43 encrypts to 32. Now, how do we revert 32 back into 43? In bc:

32^27%55

bc prints 43. It worked. Try it on some other integers below 55. You can't use this on any integer at all, and 55 is pretty limiting. In real calculations though you will use a much larger number in place of 55. But bear in mind, whatever that number, you can only process input below that threshold. Beyond it the math doesn't work.

Implementing RSA key generation in bc

This is the slightly tricky part. You can't just pick any numbers. My use of 3-55 and 27-55 was very calculated. To produce keys that work, here's what you do:

1. choose 2 primes - call them p, q
2. multiply them - call product n 
3. multiply their “predecessors” (p-1,q-1) - call product φ
4. pick some integer between 1 and φ (exclusive)  sharing no prime factor with φ - call it e
5. find the integer (there’s only one) that, times e divided by φ, leaves remainder 1 - call it d

then your keys are:
public: e together with n
private: d together with n

In the above set of keys, the primes I started with were 5 and 11. In general there are mulitple candidates for e, that is there are several or many integers less than φ that share no prime factor with it. It doesn't matter which one you adopt, they'll all work. Once you choose an e however, d is deterministically fixed. You just have to identify what number it is. bc takes the trouble out of all this. Please perform these steps and make keys of your own.

1. choose 2 primes

Your choice, your call. The prime numbers are widely published. Here are the primes under 2000.

    2     3     5     7    11    13    17    19    23    29
   31    37    41    43    47    53    59    61    67    71
   73    79    83    89    97   101   103   107   109   113
  127   131   137   139   149   151   157   163   167   173
  179   181   191   193   197   199   211   223   227   229
  233   239   241   251   257   263   269   271   277   281
  283   293   307   311   313   317   331   337   347   349
  353   359   367   373   379   383   389   397   401   409
  419   421   431   433   439   443   449   457   461   463
  467   479   487   491   499   503   509   521   523   541
  547   557   563   569   571   577   587   593   599   601
  607   613   617   619   631   641   643   647   653   659
  661   673   677   683   691   701   709   719   727   733
  739   743   751   757   761   769   773   787   797   809
  811   821   823   827   829   839   853   857   859   863
  877   881   883   887   907   911   919   929   937   941
  947   953   967   971   977   983   991   997  1009  1013
 1019  1021  1031  1033  1039  1049  1051  1061  1063  1069
 1087  1091  1093  1097  1103  1109  1117  1123  1129  1151
 1153  1163  1171  1181  1187  1193  1201  1213  1217  1223
 1229  1231  1237  1249  1259  1277  1279  1283  1289  1291
 1297  1301  1303  1307  1319  1321  1327  1361  1367  1373
 1381  1399  1409  1423  1427  1429  1433  1439  1447  1451
 1453  1459  1471  1481  1483  1487  1489  1493  1499  1511
 1523  1531  1543  1549  1553  1559  1567  1571  1579  1583
 1597  1601  1607  1609  1613  1619  1621  1627  1637  1657
 1663  1667  1669  1693  1697  1699  1709  1721  1723  1733
 1741  1747  1753  1759  1777  1783  1787  1789  1801  1811
 1823  1831  1847  1861  1867  1871  1873  1877  1879  1889
 1901  1907  1913  1931  1933  1949  1951  1973  1979  1987
 1993  1997  1999

You can choose from these or use a test-for-primality website to come up with others. (For performance, I suggest you pick a couple primes in the low hundreds.) In bc assign the primes you chose to variables named p and q:

p=<your 1st prime>
q=<your 2nd prime>

2. multiply them

You want to assign their product to a variable called n; it's also interesting to view it:

n=p*q
print n

3. multiply their “predecessors”

You want to assign this product to a variable called phi (φ); it's also interesting to view it:

phi=(p-1)*(q-1)
print phi

4. pick some integer between 1 and φ (exclusive)  sharing no prime factor with φ

Hmmmm.... how are we going to do this one? We have gcd(a,b), which returns the greatest common divisor of 2 integers. So my strategy is to run through all the integers from 1 to φ, test whether the gcd between each and phi is 1, and call those my set of candidates. (If an integer's gcd with φ is 1, it means it has no common factors with φ. That satisfies me, because if it has none at all then it has none that are prime, as required. This doesn't produce the whole set, but I don't need the whole set. I only need one and will pick from this qualifying subset.) Key in and run this implementing code.

for (i=1; i<phi; i++) { if (gcd(i,phi)==1) {print i," "} }

The numbers printed on the screen qualify, they are (some of) your candidates for e. You can use any. Pick one. (shift-PgUp and shift-PgDn allow vertical screen scrolling.) Assign it:

e=<your chosen candidate>

5. find the unique integer that, times e divided by φ, leaves remainder 1

Key in the following search, which fishes for (and catches) that integer:

for (d=1;d<phi;d++) { print d,"\t",d*e,"\t", d*e%phi,"\n"; if (d*e%phi==1) { print d,"\n"; break } }
print d

When the loop breaks, the value of d will be right. The 3 constituents of your keys-- e and n for the public, d and n for the private-- are now set in memory variables e, n, and d. Ready for use. As you did before with my keys, test out yours. Encrypt something, then decrypt it. Use 43 as your plaintext integer m, the same as you used with my keys.  Then, again, do it with some other integer m of your choosing-- presumably much larger than 43 but make sure it's less than your n. You know what to do: me mod n (which is c), followed by cd mod n. Make sure m is recovered as your last result-- m in, m out.

Be patient. Especially if you chose pretty big primes. If you aren't patient, start over with small primes or get another job. This is what they're talking about when they say "computationally expensive."

For this exercise I used:
p=643
q=919
n=590917
phi=589356
e=588911
d=411887

What to turn in:

What I want you to turn in will be produced by a shell script program named "save-rsa". It generates a file named "outfile". That's what I want you to turn in. Be sure to run this program and capture "outfile" on a medium for you to incorporate in your submittal. 

You may find save-rsa located within your vm, in /home/public. In that case you can get it by:

cp  /home/public/gcd  .

Alternatively, if your vm has internet connection, by:

wget  http://www-scf.usc.edu/~csci530l/downloads/save-rsa

or otherwise per your instructor. Then run it:

chmod +x save-rsa
./save-rsa 

It asks you for the integers p, q, e, d you used in this exercise. Do not use the sample values I provide above, nor those in the comments within the script. Use your own, from today's activity. It additionally asks you for a plaintext integer to encrypt. It encrypts, then decrypts (had better reproduce the plaintext).

When it is finished, capture/take file "outfile" for inclusion in your submittal.  Here is a screenshot of save-rsa's operation:

What is your first prime? 131
What is your second prime? 373
What value did you choose for e? 15941
What as a result was d? 17741
What to encrypt? (must be less than 48863) 29292

Thu Sep 18 13:49:38 PDT 2008

My two primes p,q:      131  373
Their product n:        48863
"lesser product" phi:   48360
"relative prime" e:     15941
Resulting d I found:    17741
Number I am encrypting: 29292
Encrypted value:        46460
Recovered value:        29292