Frequency of Occurrence of Letters in German

The following table and text are from:
Fletcher Pratt, Secret and Urgent: the Story of Codes and Ciphers, Blue Ribbon Books, 1939, pp. 256-257.

Rank  Letter  Frequency per 1000 words  Frequency per 1000 letters
1     E       988                       166.93
2     N       586                       99.05
3     I       463                       78.12
4     S       401                       67.65
5     T       399                       67.42
6     R       387                       65.39
7     A       385                       65.06
8     D       321                       54.14
9     H       241                       40.64
10    U       219                       37.03
11    G       216                       36.47
12    M       178                       30.05
13    C       168                       28.37
14    L       167                       28.25
15    B       152                       25.66
16    O       132                       22.85
17    F       126                       20.44
18    K       112                       18.79
19    W       83                        13.96
20    V       63                        10.69
21    Z       59                        10.02
22    P       55                        9.44
23    J       11                        1.91
24    Q       3                         0.55
25    Y       2                         0.32
26    X       1                         0.22
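
The two frequency columns can be reproduced for any sample text by counting letters and words. The following is a minimal Python sketch, not Pratt's own procedure. The function name letter_stats and the counting rules (whitespace-separated words, umlauted vowels folded to their base letters, every other character outside A-Z skipped) are assumptions made for illustration.

```python
from collections import Counter
import string

# Assumption: umlauted vowels are folded to their base letters; every other
# character outside A-Z (digits, punctuation, ß) is skipped.
UMLAUT_FOLD = str.maketrans("äöüÄÖÜ", "aouAOU")
LETTERS = set(string.ascii_uppercase)


def letter_stats(text):
    """Return {letter: (per_1000_words, per_1000_letters)} for a sample text."""
    folded = text.translate(UMLAUT_FOLD)
    counts = Counter(ch.upper() for ch in folded if ch.upper() in LETTERS)
    # A word is any whitespace-separated token holding at least one A-Z letter.
    n_words = sum(
        1 for token in folded.split()
        if any(ch.upper() in LETTERS for ch in token)
    )
    n_letters = sum(counts.values())
    if n_letters == 0:
        return {}
    return {
        letter: (1000 * n / n_words, 1000 * n / n_letters)
        for letter, n in counts.items()
    }


sample = "der die das"
for letter, (per_words, per_letters) in sorted(letter_stats(sample).items()):
    print(letter, round(per_words, 2), round(per_letters, 2))
```

For any letter, the per-1000-words figure divided by the per-1000-letters figure equals the average word length of the sample. In the table, the E row gives 988 / 166.93, or about 5.92.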

"The average length of German words is 5.92 letters.

Letters with the umlaut have been treated as though they had no umlaut in this table. A, O and U are umlauted at times, but the frequency of all three is very small, being about that of J in the table. When the umlaut is not used, words containing an umlauted letter are usually spelled with an E after the letter; if this is done the frequency of E in the table should be slightly higher.

It will be noted that the letters, when arranged by relative frequency, fall into certain well-defined groups.

In short messages, any letter is likely to show a higher frequency than another letter of the same group.

For convenience' sake the groups may be listed as follows:

III S, T, R, A, D
IV H, U, G
V M, C, L, B
VI O, F, K
VII W, V, Z, P

If the articles are omitted, there is less change in German than in any other language, owing to the declension of the articles. D and E are the letters whose frequency is most affected. However, articles are not usually omitted even in cipher messages in German, as their omission frequently changes the meaning of a sentence.

C, U and O are the letters whose frequency shows the sharpest variation in short messages, but messages of even 100 words in length exhibit fairly normal frequencies.

Leading peculiarities by which German may be identified from English (in transposition ciphers):

Leading peculiarities by which German may be identified from French:

Leading peculiarities by which German may be identified from Spanish:

Double M is very frequent in German, as is double S, and the combination SZ, which occurs in no other language, is common.

In German, C is always followed by H or K, and G is nearly always followed by E."
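
Claims about letter pairs, such as C being followed by H or K or the frequency of double M, can be checked against a sample text by tallying which letters follow a given letter. The sketch below is one way to do that in Python. The function name follower_counts is hypothetical, and the sketch assumes plaintext with word breaks intact, so it does not apply directly to transposed cipher text.

```python
from collections import Counter


def follower_counts(text, letter):
    """Count which letters immediately follow `letter` inside words."""
    target = letter.upper()
    followers = Counter()
    # Note: str.upper() turns ß into SS, which adds to the double-S count.
    for word in text.upper().split():
        # Pair each character with its right-hand neighbour in the same word.
        for first, second in zip(word, word[1:]):
            if first == target and second.isalpha():
                followers[second] += 1
    return followers


print(follower_counts("ich backe kuchen", "c"))
print(follower_counts("Zimmer Himmel", "m"))
```

A doubled letter shows up as the letter appearing among its own followers, so the same function covers the double M and double S observations.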

Verified by: DT 3/99

While the Library has verified the information presented in these files in what it considers to be reliable and authoritative sources, it cannot take responsibility for nor guarantee the accuracy of the information presented.

