ENGLISH LETTER FREQUENCIES

ETOAN IRSHL DCUP FM WY BGV

Okay, so it's not quantitative, but it's how I remember the relative frequency of letters in the English language, and it serves to explain why I chose this domain name.

Here's some more detail on statistical occurrences of letters in the English language.

Letter Percent
E 13
T 9
O 8
A 8
N 7
I 7
R 7
S 6
H 6
L 4
D 4
C 3
U 3
P 3
F 3
M 2
W 2
Y 2
B 1
G 1
V 1
K <1
Q <1
X <1
J <1
Z <1

So, in a set of 100 letters of English text, on average 13 of them would be 'E', 9 would be 'T', and so on.
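
As a rough illustration of how percentages like these can be measured, here is a minimal Python sketch that counts the letters in a piece of text and reports each letter's share. The function name letter_percentages and the sample sentence are hypothetical, chosen only for this example; a sample this short will not reproduce the table, which reflects averages over large amounts of text.

```python
from collections import Counter


def letter_percentages(text):
    """Return {letter: percent of all letters}, most frequent letter first."""
    # Keep only the 26 unaccented letters; case is ignored.
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {letter: 100.0 * n / total for letter, n in counts.most_common()}


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"  # illustrative only
    for letter, pct in letter_percentages(sample).items():
        print(f"{letter} {pct:.1f}")
```

Feeding it a long text, such as a whole book, should give an ordering close to ETOAN IRSHL, though the exact figures depend on the text chosen.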

From Amazing Stories, volume 9 number 6 (October 1934):

Cryptography

Most elaborate systems of secret writing have been devised, but it is probable that any system can be deciphered by a student of the very difficult subject and of the methods employed. In the story of "The Gold Bug" a very simple example is given of a cryptogram or secret writing, for that is the meaning of the word. Its deciphering was based on the relative frequency of letters in English language texts. In the printer's art the same proposition comes up. A font or fount of type means a set comprising all the individual types required to print any text. It would clearly be useless to have as many x's or y's in a font as there are e's and t's. The reader can count the e's in a few lines of any text and then in the same line count other letters and he will find a great difference. In the font of type the following figures give the relative number of letters approximately to be expected. It will be of interest to cryptogrammists.

Letter Count   Letter Count   Letter Count   Letter Count   Letter Count   Letter Count   Letter Count
e 1000   t 770   a 728   i 704   s 680   o 672   n 670
h 540   r 528   d 392   l 360   u 296   c 280   m 272
f 236   w 190   y 184   p 168   g 168   b 158   v 120
k 88   j 55   q 50   x 46   z 22
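
The font counts are relative to e = 1000, so they are not directly comparable with the percent table further up. A small sketch, assuming the counts are transcribed correctly from the table above, converts them to percentages of the whole font. The names FONT_COUNTS and to_percentages are hypothetical.

```python
# Counts transcribed from the 1934 table above (relative to e = 1000).
FONT_COUNTS = {
    "e": 1000, "t": 770, "a": 728, "i": 704, "s": 680, "o": 672, "n": 670,
    "h": 540, "r": 528, "d": 392, "l": 360, "u": 296, "c": 280, "m": 272,
    "f": 236, "w": 190, "y": 184, "p": 168, "g": 168, "b": 158, "v": 120,
    "k": 88, "j": 55, "q": 50, "x": 46, "z": 22,
}


def to_percentages(counts):
    """Scale raw counts so that they sum to 100."""
    total = sum(counts.values())
    return {letter: 100.0 * n / total for letter, n in counts.items()}


if __name__ == "__main__":
    for letter, pct in to_percentages(FONT_COUNTS).items():
        print(f"{letter} {pct:4.1f}")
```

The counts sum to 9377, so 'e' comes out at about 10.7 percent of the font, somewhat lower than the 13 percent in the first table.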

GENERATING RANDOM NUMBERS

Click here for an explanation of my set-up to generate random numbers from a common radioactive source.

If you're looking for alternatives, here is a simple electronic number generator from 1976. I would not recommend it for serious cryptographic use, but it's neat to build nonetheless.

Machine generation of pseudo-random numbers dates back to the early days of computing. Click here for a PDF of John Mauchly's 1949 paper, "Pseudo-Random Numbers". The 12-page paper includes the generation algorithm (the "Binac routine" A-510-2B), a flowchart, results and analysis, as well as a one-page description of the UNIVAC system used to generate the numbers.

ON-LINE REFERENCES

  • The Handbook of Applied Cryptography by Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone is available on-line here.
  • The Cryptology ePrint Archive can be found here.
  • The National Institute of Standards and Technology (NIST) Computer Security Resource Center (CSRC) is here.

QUOTES

From the 1984 movie Repo Man:

Leila: Are you using a scrambler?
J. Frank: I can't hear you, I'm using a scrambler!

From the 1976 movie Midway:

Captain Garth: How much can you decipher?
Commander Rochefort: Fifteen percent...
Garth: Really decipher?
Rochefort: Ten percent.
Garth: That's one word in ten, Joe! You're guessing!
Rochefort: We like to call it "analysis."

(In reality, Commander Joseph J. Rochefort was the OIC (officer-in-charge) of the Communications Intelligence processing unit in Hawaii, part of OP-20-G, the Navy Radio Intelligence Section. He was in daily contact with Lieutenant Commander Edwin Layton, who was the intelligence officer for Admiral Chester W. Nimitz, Commander-in-Chief, Pacific Fleet. Layton later wrote a book entitled "And I Was There..." detailing his activities before and during World War II.)

From comedienne Elayne Boosler:

"I have six locks on my door all in a row. When I go out, I lock every other one. I figure no matter how long somebody stands there picking the locks, they are always locking three."

From the 1983 movie Never Say Never Again:

Bond: Commander Peterson, are you equipped with the new XT-7B's?
Peterson: That's Top Secret. How do you know about them?
Bond: From the Russian translation of one of your service manuals.

DIGRAPH FREQUENCY TABLE

ASCII digraph   Relative Frequency (e_ = 100.0)   Reverse/Forward Ratio
e_ 100.0 10.6%
_t 76.7 54.5%
th 60.8 1.9%
he 57.5 0.2%
s_ 52.8 47.2%
_a 44.8 26.6%
t_ 41.8 183.4%
re 39.8 88.6%
in 37.2 9.1%
CRLF 36.9 28.1%
er 35.2 112.9%
d_ 32.9 56.3%
n_ 30.3 38.1%
_i 29.7 1.6%
an 28.5 19.4%
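
A table like this can be approximated with a few lines of Python. The sketch below assumes that "_" in the table stands for a space and that the Reverse/Forward Ratio is the count of the reversed digraph divided by the count of the digraph itself. That reading is consistent with the table: t_ (41.8) and _t (76.7) give 76.7 / 41.8, or roughly 183 percent. The function names are hypothetical, and line breaks are left as-is rather than reported as CRLF.

```python
from collections import Counter


def digraph_counts(text):
    """Count overlapping two-character sequences, showing spaces as '_'."""
    normalized = text.lower().replace(" ", "_")
    return Counter(normalized[i:i + 2] for i in range(len(normalized) - 1))


def relative_frequencies(counts, reference="e_"):
    """Scale counts so that the reference digraph scores 100.0."""
    ref = counts[reference]
    if ref == 0:
        raise ValueError(f"reference digraph {reference!r} does not occur")
    return {d: 100.0 * n / ref for d, n in counts.most_common()}


def reverse_forward_ratio(counts, digraph):
    """Percent ratio of the reversed digraph's count to the digraph's count."""
    forward = counts[digraph]
    if forward == 0:
        return None  # ratio is undefined when the digraph never occurs
    return 100.0 * counts[digraph[::-1]] / forward


if __name__ == "__main__":
    sample = "the rain in spain stays mainly in the plain"  # illustrative only
    counts = digraph_counts(sample)
    for digraph, freq in relative_frequencies(counts).items():
        ratio = reverse_forward_ratio(counts, digraph)
        print(f"{digraph} {freq:.1f} {ratio:.1f}%")
```

As with single letters, a short sample only demonstrates the mechanics; figures resembling the table need a large body of text.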

KERCKHOFFS' PRINCIPLES

Auguste Kerckhoffs' Principles:

  1. The system must be physically, if not mathematically, undecipherable;
  2. The system must not require secrecy and can be stolen by the enemy without causing trouble;
  3. It must be easy to communicate and remember the keys without requiring written notes, and it must also be easy to change or modify the keys with different participants;
  4. The system ought to be compatible with telegraph communication;
  5. The system must be portable, and its use must not require more than one person;
  6. Finally, regarding the circumstances in which such a system is applied, it must be easy to use and must neither require stress of mind nor the knowledge of a long series of rules.

GROUPS OF FIVE

The practice of transmitting encrypted messages in sets of 5 characters (also known as "groups of five") is a historical convention that originated with the use of the telegraph and continued into the early days of radio communication.

In the early days of telegraphy, messages were transmitted in Morse code, a system of dots and dashes sent as a series of electrical pulses that operators tapped out on a telegraph key. Because Morse code does not differentiate between upper and lower case letters, and punctuation was often omitted, messages were often written out in all caps with no spaces between words. This made messages difficult to read and increased the risk of errors.

To make messages easier to read and reduce the risk of errors, telegraph operators would group the letters of the message into sets of five characters. These groups were typically separated by spaces or slashes and were known as "groups of five" or "five-letter groups." By breaking the message up into smaller chunks, operators could more easily read and transmit the message, and errors could be quickly identified and corrected.

The use of groups of five continued into the early days of radio communication, and the convention of using groups of five to transmit encrypted messages has persisted to this day. In modern cryptography, the use of groups of five is not strictly necessary, but it is still a common practice in some contexts as a nod to this historical convention.
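
The grouping itself is easy to sketch in Python. The example below is only an illustration: the function name groups_of_five is hypothetical, non-letters are simply dropped, and padding the last group with 'X' is an assumption, since padding conventions varied.

```python
def groups_of_five(text, pad="X"):
    """Format text as space-separated five-letter groups.

    Non-letters are dropped and the final group is padded with `pad`
    (an assumed convention) so that every group has five characters.
    """
    letters = [ch for ch in text.upper() if ch.isalpha()]
    remainder = len(letters) % 5
    if remainder:
        letters.extend(pad * (5 - remainder))
    return " ".join(
        "".join(letters[i:i + 5]) for i in range(0, len(letters), 5)
    )


if __name__ == "__main__":
    print(groups_of_five("Attack at dawn"))  # prints: ATTAC KATDA WNXXX
```

In practice the grouping is applied to the ciphertext rather than the plaintext; plaintext is used here only so the result is easy to check by eye.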

IBM AND DES

IBM briefly describes the standardization and use of the Data Encryption Standard (DES) in a 1979 Student Manual here.

COMPUTERS AND CRYPTOLOGY

Here is a scan of the Computers and Cryptology article by Fred Chesson that appeared in the January 1973 issue of Datamation magazine.

VOYNICH MANUSCRIPT

The Voynich Manuscript is available on-line here.

NSA TECHNICAL JOURNAL ARTICLES

The National Security Agency has declassified a number of Technical Journal articles that can be read here.

INTELLIGENT NOISE

In the late 1970s a team of engineers in Seattle designed a secure telephone they called the PhasorPhone. Their attempt to have the design patented resulted in a secrecy order issued at the direction of the National Security Agency (NSA). Some brief excerpts from James Bamford's book The Puzzle Palace describing the situation can be seen by clicking here.

Even TIME magazine ran a story about it in their October 2, 1978 issue. You can read a snippet here.

An Associated Press news story that ran in February 1980 contained the following:

The inventors, who are still working on devices that scramble or encode conversations or computer data, hope they don't run into the NSA again.
"The less we have to do with that agency the better," Raike said.

Timeline

Patent application filed on October 20, 1977.
Secrecy order issued April 21, 1978, at the behest of NSA.
Secrecy order lifted October 11, 1978, after inquiry by Senator Warren Magnuson (D-WA).
Patent 4,188,580 granted February 12, 1980. Inventors identified as Carl Nicolai, William Raike and David Miller. Invention assigned to Telesync Corporation of Carmel Valley, California.

Origins

The original design of the PhasorPhone was prompted by an article entitled Intelligent Noise which appeared in the December 1962 issue of Analog magazine. You can read the article by clicking here.

The article makes reference to radar signals using pseudo-noise sequences, which likely refers to either:

(a) MIT Lincoln Laboratory's high frequency NOMAC (NOise Modulation and Correlation) system, known under the Army Signal Corps production name F9C. NOMAC was a communications system using noise-like signals and cross-correlation detection. Papers describing this method and system date back to 1952. You can read more about the history of this system here;

You can also read Noise-like Signals and their Detection by Correlation (Dated 26 May 1952) here (local copy here).

or

(b) the Jet Propulsion Laboratory (JPL) radio communication system called CODORAC (COded DOppler, Ranging, and Command) that became the basis for what is now the Deep Space Network (DSN).

ENCRYPTION TYPES

The Committee on National Security Systems (CNSS), in the National Information Assurance (IA) Glossary, CNSS Instruction No. 4009, defines the following four types of encryption systems:

Type 1: U.S. Classified
Cryptographic equipment, assembly or component classified or certified by NSA for encrypting and decrypting classified and sensitive national security information when appropriately keyed. Developed using established NSA business process and containing NSA approved algorithms. Used to protect systems requiring the most stringent protection mechanisms.

Type 1 products contain classified NSA algorithms.

Type 2: U.S. Federal interagency
Cryptographic equipment, assembly or component certified by NSA for encrypting and decrypting sensitive national security information when appropriately keyed. Developed using established NSA business process and containing NSA approved algorithms. Used to protect systems requiring protection mechanisms exceeding best commercial practices including systems used for the protection of unclassified national security information.

Type 2 products may not be used for classified information, but contain classified NSA algorithms.

Type 3: Interoperable interagency (Federal, State and Local)
Unclassified cryptographic equipment, assembly or component used, when appropriately keyed, for encrypting or decrypting unclassified sensitive U.S. Government or commercial information, and to protect systems requiring protection mechanisms consistent with standard commercial practices. Developed using established commercial standards and containing NIST approved cryptographic algorithms/modules or successfully evaluated by the National Information Assurance Partnership (NIAP).

Type 4: Proprietary
Unevaluated commercial cryptographic equipment, assemblies, or components that neither NSA nor NIST certify for any government usage. These products are typically delivered as part of commercial offerings and are commensurate with the vendor's commercial practices. These products may contain either vendor proprietary algorithms, algorithms registered by NIST, or algorithms registered by NIST and published in a FIPS.

(See also Federal Standard 1037C)

ENIGMA MACHINE SIMULATION

You can see a simulation of the World War II-era Enigma encryption machine in Flash here.

PAPER ENIGMA MACHINE

Mike Koss has put together a basic model of the Enigma encipherment method using just a sheet of paper. You can read more about it here.

M-209 CIPHER MACHINE SIMULATION

Dirk Rijmenants has an excellent M-209 simulator for Windows here.

He also has an Enigma simulator here.

ENCRYPTION USED BY THE KNIGHTS TEMPLAR

Question for the reader: are there any good references for details regarding the coding scheme used by the Knights Templar during the Middle Ages for their "letters of credit"?
By 1150, the Order's original mission of guarding pilgrims had changed into a mission of guarding their valuables through an innovative way of issuing letters of credit, an early precursor of modern banking. Pilgrims would visit a Templar house in their home country, depositing their deeds and valuables. The Templars would then give them an encrypted letter which would describe their holdings. While traveling, the pilgrims could present the letter to other Templars along the way, to "withdraw" funds from their account. This kept the pilgrims safe since they were not carrying valuables, and further increased the power of the Templars.
(from Wikipedia)

ACA COMPUTER SUPPLEMENTS

The American Cryptogram Association (ACA) was organized originally to encompass the many phases of paper-and-pencil puzzle solving. It developed rapidly to include all forms of cryptography, and in recent years the home computer has begun to augment or replace the pencil and paper.

The Computer Supplement was a printed quarterly publication that served as an adjunct to the ACA's flagship magazine The Cryptogram, focused on the use of computers as aids to solving.

You can read more about the ACA and computer solutions to cryptographic puzzles here.


Last updated April 6, 2023