Kryha Cryptanalysis - Hill Climbing method


Home Page
Kryha Home Page
Kryha Cryptanalysis Home Page

Introduction

The hill-climbing method is revolutionary. Thanks to a computer, we can easily decipher historical encryption methods from before the Cold War. Modern methods (DES, AES, etc.) are immune to this type of attack.

Hill Climbing Algorithm

We present a very simple implementation of Hill Climbing adapted to find the internal alphabet. We assume we know the coding wheel and the starting sector. More complex approaches would undoubtedly make it possible to find the complete key.

Note: The method for determining the starting sector is described on another page (link).

If you use the fitness functions bigrams, trigrams, ..., instead of the IC function, you need to know the language used and have good statistics for this language.

1. Generate initial random internal alphabet, decrypt and compute an inital 
   score with the fitness function.

2. Repeat the following steps. Exit from loop if the fitness function no 
   longer improves the score for a predetermined number of loops (for example, 
   1000 steps)

   2.1. Randomly we swap two letters of the internal alphabet.
   2.2. Decrypt and Compute the fitness function.
   2.3. If the score with the new alphabet, keep it,
        otherwise, discard this new alphabet.

3. If the new score if better that better score, fix the better score to 
   the new one and print this score, the key and the beginning of the plain 
   text discovered.

4. Repeat from step 1, i.e. restart with new random alphabet. Stop if the 
   predetermined number of loops is reached (Shotgun approch).

Remark: I tried several fitness functions:

  • IC (Index of Coincidences) [default fitness function]
  • UNI (Unigram)
  • BIG (Bigram)
  • TRI (Trigram)
  • QUA (Quadrigram)

Only the internal alphabet is unknown

1. We create a cryptogram

The text tobe_lg.txt is 1155 characters long, tobe.txt is 192 characters long.

$ python3 kryha_tui.py -o -i EAKTQYVOMZDHGRPNUJXLCIBFWS -f 8 \
	< MSGS/tobe.txt  >| /tmp/toto.cry

$ python3 groupe.py < /tmp/toto.cry
LVRVH MSLXR SYKDL VXATW KCESK DVGHQ VGTHL MNOJI HYGJA OXUJS
NJYQD IDZQZ QCEBT ITNVT UXIKR TPZFW PTNXJ DETSX BFPIS LPZIO
HOTUN GJYBV IBICM ZZMKT RLYPS MSUJN XYGNB YFGBR MAEZM BHRFV
NVOMI QSUSN GSRQM ECRZY GEORV FCYXV FASRG PYFFJ UA

$ cat MSGS/tobe.txt
To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles,
And, by opposing, end them. To die, to sleep,
No more, and by a sleep to say we end

2. We use the IC fitness function with a long cryptogram

$ pypy3 hc_kry_aE.py -c /tmp/toto_lg.cry -W 8 -t 1 -M IC 
HIGH_SCORE initial: 0.03879898264646965
0 0.06408276876214485 RMZCMPLMRRMZRRFYRGQRUCOSRQRGMLUFCRFCPRRQLMZUCPGLCF

Key: JTQYVOMZDHGRPNUKXLCIBFWSEA Score: 0.06408276876214485
Plain text: RMZCMPLMRRMZRRFYRGQRUCOSRQRGMLUFCRFCPRRQLMZUCPGLCF

Notes:

  • The solution has been (almost) found, but the plaintext found is a simple substitution: RM ZC MP LMR RM ZR == TO BE OR NOT TO BE (a small error for the letter E).
  • pypy3 is a compiled version of the Python v3.x interpreter. You can obviously use the standard interpreter (python or python3), but a priori, it is slower.

3. We use trigram function

$ pypy3 hc_kry_aE.py -c /tmp/toto_lg.cry -W 8 -t 1 -M TRI -m english
HIGH_SCORE initial: 1853.3209267274315
0 6139.917549233701 TOBEORNOTTOBETHATISTHEQUESTIONWHETHERTISNOBLERINTH

4. We use unigram function

$ pypy3 hc_kry_aE.py -c /tmp/toto_lg.cry -W 8 -t 1 -M UNI -m english
HIGH_SCORE initial: 10591.32265406777
0 11695.604939139826 TOBEORNOTTOBETHATISTHEQUESTIONWHETHERTISNOBLERINTH

5. We use different fitness function but with short cryptogram

$ pypy3 hc_kry_aE.py -c /tmp/toto.cry -W 8 -t 1 -M BIG -m english
HIGH_SCORE inital: 833.8125373157359
0 1424.2020748183388 TOBEOGNOTTOBETHATHSTHERUESTIONWHETHTRUISNOBLDSINTH

$ pypy3 hc_kry_aE.py -c /tmp/toto.cry -W 8 -t 1 -M UNI -m english
HIGH_SCORE initial: 1757.8514830728209
0 1938.9078152003788 TOBEOGNOPTOEETHAPMSTHEFUESTIONWHETHTUTISNRBLIRENTH

Key: PFKTQAVOYZDHGRXEUJNLCIBMWS Score: 1938.9078152003788
Clear: TOBEOGNOPTOEETHAPMSTHEFUESTIONWHETHTUTISNRBLIRENTH

$ pypy3 hc_kry_aE.py -c /tmp/toto.cry -W 8 -t 1 -M TRI -m english
HIGH_SCORE initial: 317.61766223900946
0 1031.7093197034928 TOBEORNOTTOBETHATISTHEQUESTIONWHETHERTISNOBLERINTH

Key: EAKTQYVOMZDHGRPNUJXLCIBFWS Score: 1031.7093197034928
Clear: TOBEORNOTTOBETHATISTHEQUESTIONWHETHERTISNOBLERINTH

$ pypy3 hc_kry_aE.py -c /tmp/toto.cry -W 8 -t 15 -M QUA -m english
HIGH_SCORE initial: 17.843407578027147
0 653.6816767606102 TOBEORNOTTOBETHATISTHEQUESTIONWHETHERTISNOBLERINTH
Conclusion:
If we attack a cryptogram of approximately 200 characters, all fitness functions work except for the general IC and Unigram methods (although Unigram [UNI] partially succeeds).

With a cryptogram of approximately 1000 characters, the IC method works, but if we use the found key, the result is a simple substitution. It is easy to break.

Repeating Friedman's feat

If the starting sector is known, the Hill Climbing method allows us to find the internal alphabet. Then, using this key, we can decipher the cryptogram; the result is a simple substitution. It is easy to break.

Note: In the problem solved by Friedman, the wheel is known, but the two alphabets used (inner and outer) are disordered.

$ more MSGS/2h.cry

	XYICP NDEAM APDTR AXXPZ XHYRY TWQXF
	HCDJK AHQUR ZPPPZ QOFUV KFEMN EAONG
	TTXSV VUBDG JREJF HEOKV CQHFH ROKUP
	MQPQW ACOJC RLMBM EVKRV JDYNN SXUDL
	HNPFW MOCMJ FLGPM BKHAU XLIVV QSXUN
	JZUKK OBAAE UQOJY IZSZU HGWGW ATEJW
	YDIVX PEIKE ECMCI RXXLA ZLAIN MJZXI
	CIDKQ KMMTE LLFJT JUBQO LJAWM FEHEV
	SYCAS KFONO ZUMPA DAPJY LPFNT RUITC
	BWHJH MOLCV RDEPF QACIU HCZCB XTOKC
	IXGOS GCMRF HJVXS VZNMU GJJSO QBJQH
	BQNLH RTMEL YNHKU FXJDM JYCPA DPPWY
	MGUWO IAIIG PTSFC SOKID GGTYO AAQDR
	QRRMN TSHYN EXYVF CMJJK NXVTE FXAUT
	SEZQS HLULP CYGXO NLAWQ TEJNB SMVTE
	HSXUY NJKXF PEPGF CMMCW ZRPJY GOPZU
	ZNVXI AXZKQ MJEFW WMRQR TETPX RSUKC
	DLHED LLCTJ KXZMQ MKNJU VPFLY HYFQR
	EWNDZ MBMPB OJXEQ IZAXH NDBQQ WDIZQ
	PIFAY JGQJO FWFCD BXYNX YTWYK EQCDP
	DYDOZ HJFCZ UEDDJ BFXTT VFYGH CTBGO
	FEHBU BZDQQ TIGDY AIYFD FHABS AHYGX
	IBBLE CGOSE MOMZV KHQSI CMJFF EVVTL
	WTESL AYWFY CKOXP SVNAI GOCZZ KVVVJ
	SOPEN YXDDX LDCYA XMWWO CWOII BNXTV
	TLIVQ WXUET PSUHC SOYFP VYIKZ NFVIE
	YPHKI NCGGV IKROO SOVMG HKUNU SUNYV
	CFELO OWSAI YRREV NEXPE SEGRP ZNBMM
	YUZFG SXRXW MNWTL RHVFH GSXMW VREAJ
	DGOZA GRXKJ LDOGY PTYXN TMWQM YSQWL
	XHNGZ QDMCW PYATG NZFJK WFDKA VSJMH
	JGWJE CWTDB ZNMYT NAORV HARRP DXGCA
	PHJNZ KTLQR QJJAF FZGDX LRFFS AWSZN
	GLSAQ GMCDY JGMBL SXEOT LFJGG LGKKR
	YYWDA LHHJV CGYVR LYSPJ VPKGW WXHFA
	CMTRG UJEJW TAFSN ZXVVW IYWOO MTLUF
	SBCAJ RNRMP IYLWI KAOKH TMXCN IMWTF
	GTTDE HTDHM KKCDK EAPHI AXZYP

$ pypy3 hc_kry_aE.py -c MSGS/2h.cry -W 15 -t 1 -M IC
HIGH_SCORE inital:  0.03992883170563053
0 0.07096628829374792 RYSQGJIRWMJZUVCSRYSISHGISRGDSIQSWKSRYSDISMSZQSWZRY

Cle:  GFNJKPMIOLQTSRUVZXYWDBCAEH  Score:  0.07096628829374792
Clair:  RYSQGJIRWMJZUVCSRYSISHGISRGDSIQSWKSRYSDISMSZQSWZRY

$ python3 kryha_tui.py -o -f 15 -i GFNJKPMIOLQTSRUVZXYWDBCAEH -d \
	< MSGS/2h.cry >| /tmp/2h_pseudo.txt

$ python ic.py /tmp/2h_pseudo.txt
RYSQGJIRWMJZUVCSRYSISHGISRGDSIQSWKSRYSDISMSZQSWZRYSWZMRUZRQUMSGHUZEQWIQJBMR
UZQSLYWQYBWXYRPWHHSISZRWURSWRHIGBRYSQUMSMWZKGCKWZXRYSJMSGHRYSIJMMWUZMSUCPSQ
WPSPVEAJPXSUZAXSLMWZRYWMMWRJURWGZRYSQGJIRWMJZLWCCWZXRGISUQYUQGZQCEMWGZPWUBS
RIWQUCCEGDDGMSPRGRYURUIIWKSPURVERYURUVCSAJIWMRUHRSIRYSQUISHJCUZPDUWZMRUTWZX
QGZMWPSIURWGZGHRYSCULUZPRYSHUQRMLYWQYYWMCSUIZSPGDWZWGZWZPWQURSMYSXUKSRGRYSU
QRWGZMRIWSPVSHGISYWBRYSQGJIRUQQGIPWZXCEYGCPMRYURRYSISWZWZMJHHWQWSZRDIGGHWZR
YSDISMSZRISQGIPRGMJMRUWZUHWZPWFXRYURRYSIJMMWUZVJMWZSMMGIRYURWRMUXSZRMLSNSSB
DGLSISPRGUHHWNWRRGDGCWQWSMMGUMRGXWKSRYSBRYSGGIQSUZPSHHSQRGHMSUCSPWZMRIJBSZR
MRYSHWIMRQUJMSGHUQRWGZWMRYSISHGISVUIISPVEGJIMWNESUIMRURJRSGHCWBWRUHWGZMUZPW
RWMUUQGIPWZXCEJZZSQSMMUIERGQGZMWPSILYSRYSIRYSIJMMWUZMRURJRSGHCWBWRURWGZMGHR
SZESUIMQGBBSZQSPRGYJZJDGZRYSUQQIJUCGHRYSQUJMSGHUQRWGZUMRYSPSHSZPUZRQGZRSZPM
GIJDGZRYSQGBBSZQSBSZRGHRYSUQRWGZUMRYSDCUWZRWHHUIXJSMWRWMCWTSLWMSJZZSQSMMUIE
RGPSRSIBWZSLYSRYSIRYSJJZZWZXGHRYWMMRURJRSLUMMJMDSZPSPVSQUJMSGHRYSQCGMWZXGHR
YSIJMMWUZQGJIRMRGQGZWIGKSILWSMUIWMWZXGJRGHQWKWCISCURWGZMYWDGIWXWZURWZXDIWGI
RGRYSMGKWSRISKGCJRWGZRYSBGRWGZRGPWMBWMMRYSQGBDCUWZRWMXIUZRSPUMRGRYSHWIMRQUJ
MSGHUQRWGZ
0.0709662882937

$ pypy2 hc_substi.py -c /tmp/2h_pseudo.txt
(0, 8091.574477026247, 'IMEDTORIALONSKUEIMEREHTREITFERDEABEIMEFRELENDEANIM')
(1, 7924.771191680076, 'INOASDLITEDHRBYOINOLOFSLOISPOLAOTCOINOPLOEOHAOTHIN')
(2, 8714.78289227325, 'THECOURTISUNABLETHEREFORETOPERCEIVETHEPRESENCEINTH')
(3, 8714.78289227325, 'THECOURTISUNABLETHEREFORETOPERCEIVETHEPRESENCEINTH')
(4, 8118.379348544824, 'ICERNDSIOLDTABMEICESEGNSEINWESREOVEICEWSELETREOTIC')
(5, 8098.842117537764, 'EROMADNEITDSUKBOERONOLANOEAWONMOIPOEROWNOTOSMOISER')
(6, 8714.78289227325, 'THECOURTISUNABLETHEREFORETOPERCEIVETHEPRESENCEINTH')
(7, 8102.5004414960285, 'EDIONUSEALUTRBPIEDISIGNSIENMISOIAVIEDIMSILITOIATED')
(8, 8132.721747105868, 'OREHAUDOITUSNYGEOREDELADEOAFEDHEIVEOREFDETESHEISOR')
(9, 7896.665857427598, 'INTDSMOIRAMEHBLTINTOTWSOTISPTODTRYTINTPOTATEDTREIN')
()
('Key: ', 'UVQPSHXYWOTCBZGDAIMRJKLNEF', ' Score: ', 8714.78289227325)
THECOURTISUNABLETHEREFORETOPERCEIVETHEPRESENCEINTHEINSTANTCASEOFANYCIRCUMST
ANCEWHICHMIGHTDIFFERENTIATEITFROMTHECASESINVOLVINGTHEUSEOFTHERUSSIANSEALDEC
IDEDBYQUDGEANQGEWSINTHISSITUATIONTHECOURTISUNWILLINGTOREACHACONCLYSIONDIAME
TRICALLYOPPOSEDTOTHATARRIVEDATBYTHATABLEQURISTAFTERTHECAREFULANDPAINSTAKING
CONSIDERATIONOFTHELAWANDTHEFACTSWHICHHISLEARNEDOPINIONINDICATESHEGAVETOTHEA
CTIONSTRIEDBEFOREHIMTHECOURTACCORDINGLYHOLDSTHATTHEREININSUFFICIENTPROOFINT
HEPRESENTRECORDTOSUSTAINAFINDIZGTHATTHERUSSIANBUSINESSORTHATITSAGENTSWEXEEM
POWEREDTOAFFIXITTOPOLICIESSOASTOGIVETHEMTHEOORCEANDEFFECTOFSEALEDINSTRUMENT
STHEFIRSTCAUSEOFACTIONISTHEREFOREBARREDBYOURSIXYEARSTATUTEOFLIMITAFIONSANDI
TISAACORDINGLYUNNECESSARYTOCONSIDERWHETHERTHERUSSIANSTATUTEOFLIMITATIONSOFT
ENYEARSCOMMENCEDTOHUNUPONTHEACCRUALOFTHECAUSEOFACTIONASTHEDEFENDANTCONTENDS
ORUPONTHECOMMENCEMENTOFTHEACTIONASTHEPLAINTIFFARGUESITISLIKEWISEUNNECESSARY
TODETERMINEWHETHERTHEUUNNINGOFTHISSTATUTEWASSUSPENDEDBECAUSEOFTHECLOSINGOFT
HERUSSIANCOURTSTOCONIROVERWIESARISINGOUTOFCIVILRELATIONSHIPORIGINATINGPRIOR
TOTHESOVIETREVOLUTIONTHEMOTIONTODISMISSTHECOMPLAINTISGRANTEDASTOTHEFIRSTCAU
SEOFACTION