Idea: Because a Caesar cipher just rotates the alphabet, we can try every one of the 26 shifts, score each tentative plaintext with a simple English–frequency χ² test, and pick the shift whose score is smallest (i.e. closest to normal English statistics). For texts longer than a few lines this almost always singles out the correct key.
Algorithm (χ²-based search)
1. Strip the text to A–Z and convert to uppercase.
2. For k from 0 to 25 do
a. Rotate every letter left by k (A→A, B→A is k=1, etc.).
b. Count the 26 letter frequencies fi in the result.
c. Compute
chi2 = Σ((fi – pi·N)² / (pi·N))
where pi is the expected English probability for that letter and N is text length.
3. Return the shift whose χ² is minimum and its plaintext.
Python 3 reference implementation (≈30 lines) — no external files needed:
import string, sys
ENGLISH_FREQ = [ # ACEFHILN... ordered A-Z
8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015,
6.094, 6.966, 0.153, 0.772, 4.025, 2.406, 6.749,
7.507, 1.929, 0.095, 5.987, 6.327, 9.056,
2.758, 0.978, 2.360, 0.150, 1.974, 0.074]
def rotate(text, k):
res = []
for c in text:
if c.isalpha():
base = ord('A')
res.append(chr(base + (ord(c.upper()) - base - k) % 26))
else:
res.append(c)
return ''.join(res)
def chi2_score(freq, N):
return sum((freq[i] - ENGLISH_FREQ[i]*N/100)**2 / (ENGLISH_FREQ[i]*N/100)
for i in range(26))
def decrypt_caesar(cipher):
best_chi, best_shift, best_plain = 10**9, None, None
cleaned = ''.join(c for c in cipher if c.isalpha()).upper()
for k in range(26):
plain = rotate(cipher, k)
freq = [0]*26
for c in cleaned:
idx = (ord(c) - ord('A') - k) % 26
freq[idx] += 1
chi = chi2_score(freq, len(cleaned))
if chi < best_chi:
best_chi, best_shift, best_plain = chi, k, plain
return best_shift, best_plain
if __name__ == '__main__':
ciphertext = sys.stdin.read()
shift, plaintext = decrypt_caesar(ciphertext)
print(f"Shift used = {shift}")
print(plaintext)
Complexity: 26 passes over the text ⇒ O(26·N) ≈ O(N) time and O(1) additional memory.
Run the script, paste or pipe the ciphertext, and it prints the most likely key and the decrypted message.
Back to
Chapter 3