**
ENCRYPTION ALGORITHMS AND PERMUTATION MATRICES**

by Rick Martinelli, Haiku Laboratories, June 2003

Copyright © 2003

Updated October 2007

Introduction

Permutations

Encryption

Algorithms

References

The
electronic transmission of text-based information is widespread today and
expected to increase with time. Many
situations arise in which some degree of privacy is desired for the transmitted
message. This memo describes a family
of encryption algorithms that can be used to translate an Ascii text message
into another Ascii text message of the same length, whose characters are *permutations* of the
originals. These algorithms have the
properties

1) all characters are swapped

2) a second encryption returns the original message

The number of permutations of the characters in an alphabet of size N is N-factorial, a huge number for any reasonably sized alphabet. However, many permutations leave one or more characters unchanged and so are unsuitable for encryption purposes, reducing this number. In addition we require that the same permutation be used for both encrypting and decrypting a message, further reducing this number. In the following sections, we characterize this family of algorithms as the set of symmetric permutation matrices having zeros on the main diagonal, and find the size of this set for a given N. We also describe a method to reproducibly select a particular permutation by means of a single code known only to the parties involved.

Define
an *alphabet* A as a set

A = {a_{1},a_{2},...,a_{N}}

of
N different characters, or elements. A *permutation* of A is a one-to-one and
onto map P of A to itself (a bijection) given by

P(a_{j}) = a_{k}

where
a_{j} and a_{k}. are each members of A. Permutations 're-arrange' the elements of
the alphabet A. The fact that a
permutation is a bijection ensures that it has an inverse that returns each character to its original
location. We may identify a permutation
by a matrix as follows. If we regard
the members of A as the entries in a row vector in an N- dimensional vector
space, then a *permutation
matrix* for the alphabet A is an N by N matrix of zeros and exactly N
ones such that there is exactly 1 one per column and row. The permutation of a given row vector
corresponds to multiplication on the right by a permutation matrix.

EXAMPLE 1

Consider the 4-character alphabet A = {a,b,c,d}. The matrix P given by

0 1 0 0

0 0 1 0

0 0 0 1

1 0 0 0

is a permutation matrix that maps the characters abcd to dabc, i.e. swaps all
characters as required (there are no ones on the main diagonal). In vector notation, (a,b,c,d)P = (dabc).

For
the purposes of this memo we define an *encryption* of alphabet A as a permutation of A such that, for each k=1,2,...,N,

1. P(a_{k}) ¹ a_{k}

2. P(P(a_{k})) = a_{k}.

Property
1 guarantees all characters are swapped, while Property 2 ensures that the same
permutation matrix used for encrypting a mesage may also be used to decrypt
it. Property 2 may also be written P^{2} = I, where I is the N-dimensional identity matrix. A permutation matrix having properties 1 and 2 is called an *encryption
matrix*.

EXAMPLE 2

The
matrix P given in Example 1 is *not* an encryption matrix because P^{2} is not the identity matrix. Instead, P^{2} is given by

0 1 0 0

1 0 0 0

0 0 0 1

0 0 1 0

which
maps the characters abcd to badc. However P^{2} *is* an encryption matrix because it has no
ones on the diagonal, and P^{4} is the identity matrix. Matrix P is also called a cycle and
represents the first step in a 4-step sequence that returns all elements to
their original positions.

The number of permutation matrices is well known to be N-factorial. The number of permutations that are encryptions will now be determined to ensure there are "enough". To do this, we use the following three facts.

FACT
1: Let P be an N-dimensional permutation matrix. Then P(a_{k}) ≠ a_{k} for all k if and only if P has no
ones on its main diagonal.

We have already been using this fact, which is obvious from the definition of P.

FACT 2: Let P be an N-dimensional permutation matrix. Then the following are equivalent:

a. P^{2} = I

b. P is symmetric.

PROOF: To show a =>b, let P_{ij},
i,j=1,2,...,N represent the entries of P, where i is a row number and j is a
column number. We must show that P_{ij}=P_{ji} for all i and j. From a. we can write

P(P(a_{i})) = a_{i}

for
each i. Now let i,k be a fixed pair for
which P_{ik}=1, i.e., P(a_{i})=a_{k}.
Substituting in the last equation gives

P(a_{k}) = a_{i}.

Hence
P_{ki}=1 whenever P_{ik}=1. Since P is orthogonal, all other entries must be zeros and so P_{ij}=P_{ji} for all i and j.

To
show b.=>a., assume P is symmetric and let P^{2}=Q. We must show that Q=I. Since Q is the matrix product of two copies
of P, its entries may be written (see [1] or any book on linear algebra):

Q_{kr} = P_{k1}P_{1r} + P_{k2}P_{2r} +...+ P_{kN}P_{Nr}

where
k,r=1,2,...,N. For each row k, P_{kr}=
1 for exactly one r-value. Fix k and
let P_{km}= 1 for some m. Then

Q_{kk} = P_{km}P_{mk} = 1

for this k. Since k is arbitrary, we conclude that the diagonal entries of Q are ones. Since Q is orthogonal, being the product of two orthogonal matrices, all other entries in Q must be zeros. Hence Q is the identity as was to be shown. |

FACT 3. If P is an encryption matrix of dimension N, then N is an even integer. In other words, an alphabet must contain an even number of characters to be encrypted.

PROOF: P contains exactly N ones, and, from FACT 1, its main diagonal entries are all zeros. From FACT 2 the remaining entries in P have the property

P_{nk} = P_{kn}

and the N ones must therefore be assigned in pairs whenever k is distinct from n. Therefore N must be even. |

We can finally determine the number of encryptions:

Let A be an alphabet of N characters, where N is an even number. Then the number of different encryptions of A is given by

J_{N} = (N-1)(N-3)(N-5)...1

PROOF:
Define Problem N as follows: starting with a matrix P of dimension NxN filled
with N^{2} zeros, count the number of ways N ones may replace N of the
zeros so that there are no ones on the main diagonal, no row or column contains
more that 1 one, and the resulting matrix is symmetric. Starting with row 1, P_{11} must be
0, so there are N-1 ways to fill the first row. Choose column m > 1. Because P is symmetric, this determines that
a one appear in row m of the first column, and that no further entries may be
made in columns 1 or m, or rows 1 or m. If these rows and columns are deleted from P, we are left with an (N-2)
dimensional matrix and and N-2 ones: i.e., with Problem N-2. Hence we have the
recursion

J_{N} = (N-1)J_{N-2}

When N=2, the only encryption matrix is

0 1

1 0

an
so J_{2} = 1. Substituting for
J_{2} on the right-hand side of the recursion and expanding gives the
result. |

As
an example, if N = 4 then J_{4} = 3 and the encryptions are given by

0 1 0 0 0 0 1 0 0 0 0 1

1 0 0 0 0 0 0 1 0 0 1 0

0 0 0 1 1 0 0 0 0 1 0 0

0 0 1 0 0 1 0 0 1 0 0 0

For
any effective alphabet, say one of size N=80, there are approximately 8x10^{58} encryptions. Although this is quite a
large number, it pales in comparison to the number of permutation matrices of
the same degree, being smaller by a factor of about 10^{50}.

4. ENCRYPTION ALGORITHMS

In this section we describe a fast algorithm for reproducibly selecting a particular encryption. The algorithm is based on the argument used to count the encryption matrices in Section 3. Starting with an alphabet of N characters, we add N ones to an NxN matrix of zeroes, one per row, using a reproducible, random-number sequence. For the first row, select a column from 2 to N at random. Because of symmetry, this fixes the entry in column one. When the rows and columns occupied by the two ones are eliminated, we have an (N-2)x(N-2) matrix and a new first row. The process is repeated exactly (N-2)/2 times to fill the matrix.

The
purpose of this algorithm is to provide a means to modify an Ascii text message
in such a way that the original message may be recovered if a certain code is
known. As a simple example, consider
useing a 32-bit integer to hold a a code number. In base 10 this is about a 9.6 digit integer or one billion code
numbers. Most compilers have a random-number
generator that uses such a number as a 'seed' to generate a unique sequence of
random numbers, which may be used in the algorithm. On the other hand, if the message is to be written using N=80 of
the Ascii characters, then there are about 10^{58} possible
encryptions, far more than the number of possible code numbers.

As
an alternative, Haiku Laboratories has developed a random-number generator
based on a set of chaotic, non-linear equations. The generator requires
both a 40 character string and a nine digit integer as seeds. Either the string or the integer, or both,
may be defaulted to null values. For an
alphabet of size 80, there are 10^{9}x10^{40} or about 10^{75} different
combinations from a 40 character string and a 9 digit, base 10 integer. So, in general, there are many more
code-pair combinations than encryptions.

This method has been used in our software registration module for C programmers, where the name of an individual together with a unique 9-digit integer serve as a code-pair. (The random number generator used in our product is described in [2], but is confidential.)

[1]
Serge Lang, Linear Algebra, Third edition, Chapter 2, Springer, 1987.

[2] Software Registration Algorithm, Haiku Laboratories Tech Memo HL95081
(Confidential), August 1995.