ENCRYPTION ALGORITHMS AND PERMUTATION MATRICES
by Rick Martinelli, Haiku Laboratories, June 2003
Updated October 2007

1.INTRODUCTION

The electronic transmission of text-based information is widespread today and expected to increase with time.  Many situations arise in which some degree of privacy is desired for the transmitted message.  This memo describes a family of encryption algorithms that can be used to translate an Ascii text message into another Ascii text message of the same length, whose characters are  permutations of the originals.  These algorithms have the properties

1) all characters are swapped

2) a second encryption returns the original message

The number of permutations of the characters in an alphabet of size N is N-factorial, a huge number for any reasonably sized alphabet.  However, many permutations leave one or more characters unchanged and so are unsuitable for encryption purposes, reducing this number.   In addition we require that the same permutation be used for both encrypting and decrypting a message, further reducing this number.  In the following sections, we characterize this family of algorithms as the set of symmetric permutation matrices having zeros on the main diagonal, and find the size of this set for a given N.  We also describe a method to reproducibly select a particular permutation by means of a single code known only to the parties involved.

2. PERMUTATIONS

Define an alphabet A as a set

A = {a1,a2,...,aN}

of N different characters, or elements.  A permutation of A is a one-to-one and onto map P of A to itself (a bijection) given by

P(aj) = ak

where aj and ak. are each members of A.  Permutations 're-arrange' the elements of the alphabet A.  The fact that a permutation is a bijection ensures that it has an inverse that  returns each character to its original location.  We may identify a permutation by a matrix as follows.  If we regard the members of A as the entries in a row vector in an N- dimensional vector space, then a permutation matrix for the alphabet A is an N by N matrix of zeros and exactly N ones such that there is exactly 1 one per column and row.  The permutation of a given row vector corresponds to multiplication on the right by a permutation matrix.

EXAMPLE 1

Consider the 4-character alphabet A = {a,b,c,d}.  The matrix P given by

0 1 0 0
0 0 1 0
0 0 0 1
1 0 0 0

is a permutation matrix that maps the characters abcd to dabc, i.e. swaps all characters as required (there are no ones on the main diagonal).  In vector notation, (a,b,c,d)P = (dabc).

3. ENCRYPTION

For the purposes of this memo we define an encryption of alphabet A as a permutation of A such that, for each k=1,2,...,N,

1. P(ak) ¹ ak

2. P(P(ak)) = ak.

Property 1 guarantees all characters are swapped, while Property 2 ensures that the same permutation matrix used for encrypting a mesage may also be used to decrypt it.  Property 2 may also be written P2 = I, where I is the N-dimensional identity matrix.  A permutation matrix having properties 1 and 2 is called an encryption matrix.

EXAMPLE 2

The matrix P given in Example 1 is not an encryption matrix because P2 is not the identity matrix.  Instead, P2 is given by

0 1 0 0
1 0 0 0
0 0 0 1
0 0 1 0

which maps the characters abcd to badc.  However P2 is an encryption matrix because it has no ones on the diagonal, and P4 is the identity matrix.  Matrix P is also called a cycle and represents the first step in a 4-step sequence that returns all elements to their original positions.

The number of permutation matrices is well known to be N-factorial.  The number of permutations that are encryptions will now be determined to ensure there are "enough".  To do this, we use the following three facts.

FACT 1: Let P be an N-dimensional permutation matrix.  Then   P(ak) ≠ ak for all k if and only if P has no ones on its main diagonal.

We have already been using this fact, which is obvious from the definition of P.

FACT 2: Let P be an N-dimensional permutation matrix.  Then the following are equivalent:

a. P2 = I

b. P is symmetric.

PROOF:  To show a =>b, let Pij, i,j=1,2,...,N represent the entries of P, where i is a row number and j is a column number.  We must show that Pij=Pji for all i and j.  From a. we can write

P(P(ai)) = ai

for each i.  Now let i,k be a fixed pair for which Pik=1, i.e.,   P(ai)=ak. Substituting in the last equation gives

P(ak) = ai.

Hence Pki=1 whenever Pik=1.  Since P is orthogonal, all other entries must be zeros and so Pij=Pji for all i and j.

To show b.=>a., assume P is symmetric and let P2=Q.  We must show that Q=I.  Since Q is the matrix product of two copies of P, its entries may be written (see [1] or any book on linear algebra):

Qkr = Pk1P1r + Pk2P2r +...+ PkNPNr

where k,r=1,2,...,N.  For each row k, Pkr= 1 for exactly one r-value.  Fix k and let Pkm= 1 for some m.  Then

Qkk = PkmPmk = 1

for this k.  Since k is arbitrary, we conclude that the diagonal entries of Q are ones.  Since Q is orthogonal, being the product of two orthogonal matrices, all other entries in Q must be zeros.  Hence Q is the identity as was to be shown. |

FACT 3. If P is an encryption matrix of dimension N, then N is an even integer.  In other words, an alphabet must contain an even number of characters to be encrypted.

PROOF: P contains exactly N ones, and, from FACT 1, its main diagonal entries are all zeros. From FACT 2 the remaining entries in P have the property

Pnk = Pkn

and the N ones must therefore be assigned in pairs whenever k is distinct from n.  Therefore N must be even.  |

We can finally determine the number of encryptions:

Let A be an alphabet of N characters, where N is an even number.  Then the number of different encryptions of A is given by

JN = (N-1)(N-3)(N-5)...1

PROOF: Define Problem N as follows: starting with a matrix P of dimension NxN filled with N2 zeros, count the number of ways N ones may replace N of the zeros so that there are no ones on the main diagonal, no row or column contains more that 1 one, and the resulting matrix is symmetric.  Starting with row 1, P11 must be 0, so there are N-1 ways to fill the first row. Choose column m > 1.  Because P is symmetric, this determines that a one appear in row m of the first column, and that no further entries may be made in columns 1 or m, or rows 1 or m.  If these rows and columns are deleted from P, we are left with an (N-2) dimensional matrix and and N-2 ones: i.e., with Problem N-2. Hence we have the recursion

JN = (N-1)JN-2

When N=2, the only encryption matrix is

0 1
1 0

an so J2 = 1.  Substituting for J2 on the right-hand side of the recursion and expanding gives the result. |

As an example, if N = 4 then J4 = 3 and the encryptions are given by

0 1 0 0                    0 0 1 0                    0 0 0 1
1 0 0 0                    0 0 0 1                    0 0 1 0
0 0 0 1                    1 0 0 0                    0 1 0 0
0 0 1 0                    0 1 0 0                    1 0 0 0

For any effective alphabet, say one of size N=80, there are approximately 8x1058 encryptions.  Although this is quite a large number, it pales in comparison to the number of permutation matrices of the same degree, being smaller by a factor of about 1050.

4. ENCRYPTION ALGORITHMS

In this section we describe a fast algorithm for reproducibly selecting a particular encryption.  The algorithm is based on the argument used to count the encryption matrices in Section 3.  Starting with an alphabet of N characters, we add N ones to an NxN matrix of zeroes, one per row, using a reproducible, random-number sequence.  For the first row, select a column from 2 to N at random.  Because of symmetry, this fixes the entry in column one.   When the rows and columns occupied by the two ones are eliminated, we have an (N-2)x(N-2) matrix and a new first row.  The process is repeated exactly (N-2)/2 times to fill the matrix.

The purpose of this algorithm is to provide a means to modify an Ascii text message in such a way that the original message may be recovered if a certain code is known.  As a simple example, consider useing a 32-bit integer to hold a a code number.  In base 10 this is about a 9.6 digit integer or one billion code numbers.  Most compilers have a random-number generator that uses such a number as a 'seed' to generate a unique sequence of random numbers, which may be used in the algorithm.  On the other hand, if the message is to be written using N=80 of the Ascii characters, then there are about 1058 possible encryptions, far more than the number of possible code numbers.

As an alternative, Haiku Laboratories has developed a random-number generator based on a set of chaotic, non-linear equations.  The generator  requires both a 40 character string and a nine digit integer as seeds.  Either the string or the integer, or both, may be defaulted to null values.  For an alphabet of size 80, there are 109x1040  or about 1075 different combinations from a 40 character string and a 9 digit, base 10 integer.  So, in general, there are many more code-pair combinations than encryptions.

This method has been used in our software registration module for C programmers, where the name of an individual together with a unique 9-digit integer serve as a code-pair. (The random number generator used in our product is described in [2], but is confidential.)

5. REFERENCES

[1] Serge Lang, Linear Algebra, Third edition, Chapter 2, Springer, 1987.
[2] Software Registration Algorithm, Haiku Laboratories Tech Memo HL95081 (Confidential), August 1995.