The Knapsack Problem and Public Key Cryptography

Age 16 to 18
Article by NRICH team

Published 2004 Revised 2011


Public-Key cryptography was invented in the 1970s by Whitfield Diffie, Martin Hellman and Ralph Merkle.

Public-key cryptography needs two keys. One key tells you how to encrypt (or code) a message and this is "public" so anyone can use it. The other key allows you to decode (or decrypt) the message. This decryption code is kept secret (or private) so only the person who knows the key can decrypt the message. It is also possible for the person with the private key to encrypt a message with the private key, then anyone holding the public key can decrypt the message, although this seems to be of little use if you are trying to keep something secret!

The First General Public-Key Algorithm used what we call the Knapsack Algorithm. Although we now know that this algorithm is not secure we can use it to look at how these types of encryption mechanisms work.

The knapsack algorithm works like this:
Imagine you have a set of different weights which you can use to make any total weight that you need by adding combinations of any of these weights together.
Let us look at an example:
Imagine you had a set of weights 1, 6, 8, 15 and 24. To pack a knapsack weighing 30, you could use weights 1, 6, 8 and 15. It would not be possible to pack a knapsack that weighs 17 but this might not matter.
You might represent the weight 30 by the binary code 11110 (one 1, one 6, one 8, one 15 and no 24).

Plain text 10011 11010 01011 00000
Knapsack 1 6 8 15 24 1 6 8 15 24 1 6 8 15 24 1 6 8 15 24
Cipher text 1 + 15 + 24 = 40 1 + 6 + 15 = 22 6 + 15 + 24 = 45 0 = 0

What total weights is it possible to make?

So, if someone sends you the code 38 this can only have come from the plain text 01101.
When the Knapsack Algorithm is used in public key cryptography, the idea is to create two different knapsack problems. One is easy to solve, the other not. Using the easy knapsack, the hard knapsack is derived from it. The hard knapsack becomes the public key. The easy knapsack is the private key. The public key can be used to encrypt messages, but cannot be used to decrypt messages. The private key decrypts the messages.


The Superincreasing Knapsack Problem

An easy knapsack problem is one in which the weights are in a superincreasing sequence. A superincreasing sequence is one in which the next term of the sequence is greater than the sum of all preceding terms. For example, the set {1, 2, 4, 9, 20, 38} is superincreasing, but the set {1, 2, 3, 9, 10, 24} is not because 10 < 1+2+3+9.

It is easy to solve a superincreasing knapsack. Simply take the total weight of the knapsack and compare it with the largest weight in the sequence. If the total weight is less than the number, then it is not in the knapsack. If the total weight is greater then the number, it is in the knapsack. Subtract the number from the total, and compare with the next highest number. Keep working this way until the total reaches zero. If the total doesn't reach zero, then there is no solution.

So, for example, if you have a knapsack that weighs 23 that has been made from the weights of the superincreasing series {1, 2, 4, 9, 20, 38} then it does not contain the weight 38 (as 38 > 23)
but it does contain the weight 20; leaving 3;
which does not contain the weight 9 still leaving 3;
which does not contain the weight 4 still leaving 3;
which contains the weight 2, leaving 1; which contains the weight 1.
The binary code is therefore 110010.

It is much harder to decrypt a non-superincreasing knapsack problem. Give a friend a non-superincreasing knapsack and a total and see why this is the case.
One algorithm that uses a superincreasing knapsack for the private (easy) key and a non-superincreasing knapsack for the public key was created by Merkle and Hellman They did this by taking a superincreasing knapsack problem and converting it into a non-superincreasing one that could be made public, using modulus arithmetic.

Making the Public Key

To produce a normal knapsack sequence, take a superincreasing sequence; e.g. {1, 2, 4, 10, 20, 40}. Multiply all the values by a number, n, modulo m. The modulus should be a number greater than the sum of all the numbers in the sequence, for example, 110. The multiplier should have no factors in common with the modulus. So let's choose 31. The normal knapsack sequence would be:

1$\times$31 mod(110) = 31
2$\times$31 mod(110) = 62
4$\times$31 mod(110) = 14
10$\times$31 mod(110) = 90
20$\times$31 mod(110) = 70
40$\times$31 mod(110) = 30

So the public key is: {31, 62, 14, 90, 70, 30} and
the private key is {1, 2, 4, 10, 20.40}.

Let's try to send a message that is in binary code:
The knapsack contains six weights so we need to split the message into groups of six:
This corresponds to three sets of weights with totals as follows
100100 = 31 + 90 = 121
111100 = 31+62+14+90 = 197
101110 = 31+14+90+70 =205
So the coded message is 121 197 205.

Now the receiver has to decode the message...
The person decoding must know the two numbers 110 and 31 (the modulus and the multiplier). Let's call the modulus "m" and the number you multiply by "n".
We need $n^{-1}$, which is a multiplicative inverse of n mod m, i.e. $n(n^{-1})$ = 1 mod m

In this case I have calculated $n^{-1}$ to be 71.

All you then have to do is multiply each of the codes 71 mod 110 to find the total in the knapsack which contains {1, 2, 4, 10, 20, 40} and hence to decode the message.
The coded message is 121 197 205:

121$\times$71 mod(110) = 11 = 100100
197$\times$71 mod(110) = 17 = 111100
205$\times$71 mod(110) = 35 = 101110

The decoded message is:
Just as I thought!

Simple and short knapsack codes are far too easy to break to be of any real use. For a knapsack code to be reasonably secure it would need well over 200 terms each of length 200 bits.