Chapter 1 : Introduction - Explanations

1. This is the Hagelin cipher machines encipherment formula :
        cipher_letter = ((key + slide) - plain_letter) % 26     
   The % operator means "Modulo"
   A=0, B=1, C=2, ... and so one.
   The key depends on Lugs and Pins.

   On the M-209 the slide is fixed to "Z" (25 or -1 % 26 ).
   On others Hagelin cipher machines (C-36, C-38m, C-48, C-446, CD-57, ...), 
   one can set the slide, it is a part of the message key.

   Plain text = "YOU..." = 24, 14, 20, ...
   Key = 18, 18, 0, ...

   first_cipher_letter = ((18 -1) - 24) % 26 = 19 = T 
   second_cipher_letter= ((18 -1) - 14) % 26 =  3 = D
   third_cipher_letter = (0 -1)  - 20 ) % 26 =  5 = F

   Cipher text = "TDF..."


2. The key (which depends on the Lugs and Pins) is calculated by this formula:
      key = ( cipher_letter + plain_letter - slide ) % 26

   Cipher message: TMWIV AJGEC KTAXY KWGUH
   Plain  message: WEZAR EZNOT ZAZCL ONEZZ

   first_key = (19 + 22 +1)%26 = 42%26 = 16 
   second_key= (12 + 4 + 1)%26 = 17 
   third_key = (22 + 25 + 1)%26 = 22

   Key = 16, 17, 22, ...

3. For Hagelin machines, the encipherment and decipherment are identical : 
     cipher_letter = ((key + slide) - plain_letter) % 26     
     plain_letter  = ((key + slide) - cipher_letter)% 26

   Cipher_text = "AIO..." = 0, 8, 14, ...
   Key = 14, 13, 17, ....

   first_plain_letter = ((14 - 1) - 0)%26 = 13 = N
   second_plain_letter= ((13 - 1) -8)%26 = 4 = E
   third_plain_letter = ((17 - 1) -14)%26 = 2 = C

   Plain text = "NEC..."

4. For Hagelin machines, the encipherment of the i th letter is given by 
   this formula : 
     cipher_letter[i] = ((key[i] + slide) - plain_letter[i]) % 26     
   
   The key[i] depends on lugs and pins:
     key[i] = 0
     For each Wheel j ,do:
	key[i] = key[i] + Pins[j][i % length_of_the_wheel[j]]*Lugs[j]

   The Pins[j][i] is equal to zero if the Pin is inactive and is equal to one
   if the Pin is active.

   This algorithm is correct if there are no overlap (see later).
   This is true for Hagelin C35, C36(first model), C-362, C48, CD57... 
   and a bad setting of M-209.

   The Hagelin C35 and C36 have only 5 wheels. The first C35/C36 models have
   immutable number of lugs.

   C35:
   Lugs[0] = 1, length_of_wheel[0] = 25
   Lugs[1] = 2, length_of_wheel[1] = 23
   Lugs[2] = 4, length_of_wheel[2] = 21
   Lugs[3] = 8, length_of_wheel[3] = 19
   Lugs[4] =10, length_of_wheel[4] = 17

   C36:
   Lugs[0] = 1, length_of_wheel[0] = 25
   Lugs[1] = 2, length_of_wheel[1] = 23
   Lugs[2] = 3, length_of_wheel[2] = 21
   Lugs[3] = 7, length_of_wheel[3] = 19
   Lugs[4] =12, length_of_wheel[4] = 17

   C362: Only one lug by bar (then there are no overlap).
 
  Remark: the sum of lugs are 25 for C36 model. It is 27 for M-209
   and for the swedish C38 model, it is 29. For CD-57, it is 40.

   The key list gives the number of lugs in front of each wheel and 
   the active Pins of each wheel.

     wheel 26, Lugs :  2, Pins:  __CDEF_H_JK___OP__S_U__XYZ
     wheel 25, Lugs :  5, Pins:  _BC_E___I__LM__P_RS_U___Z
     wheel 23, Lugs :  1, Pins:  A__DEFGHI_K___O_Q_ST_V_
     wheel 21, Lugs :  3, Pins:  _BC_EF___J__M_OP_R_TU
     wheel 19, Lugs :  9, Pins:  ABC__F_H_J__M_O___S
     wheel 17, Lugs :  7, Pins:  __C__FGH_J_LM_OP_

     External Key : LLKJIH (then, the active pins are : AAAAAA)

     Plain text: AAAAA
     Cipher Text:

     Pin     :  A  B  C  D  E
     ith letter:0  1  2  3  4
     Wheel 26:  0  0  2  2  2
     Wheel 25:  0  5  5  0  5
     Wheel 23:  1  0  0  1  1
     Wheel 21:  0  3  3  0  3
     Wheel 19:  9  9  9  0  0
     Wheel 17:  0  0  7  0  0
               ==============
               10 17 26  3 11 (Key)
               -1 -1 -1 -1 -1 (Slide)
               -0 -0 -0 -0 -0 (Plain: AAAAA)
               ==============
                9 16 25  2 10 (Cipher: JQZCK)

5. Decipher the message "WGBQM". 
   The external key is "PEOPLE" and the key list is given.
	
   First, we need to know which are the active pins.
   If the external key of wheel 26 is "L" (11), the active pin is "A". 
   Then there is a shift of -11 for wheel 26.
  
   Here there are the shift for each Wheel:
   Wheel 26 : L => - 11 % 26 = 15    Wheel 21 : J => -10 % 21 = 12
   Wheel 25 : L => - 11 % 25 = 14    Wheel 19 : I => -9  % 19 = 11
   Wheel 23 : K => - 10 % 23 = 13    Wheel 17 : H => -8  % 17 = 10

   If the external key is "AAAAAA" (0,0,0,0,0,0):
   (0+15)%26=15=P, (0+14)%25=14=O, (0+13)%23=13=N, (0+12)%21=11=M,
   (0+11)%19=11=L, (0+10)%17=10=K => The active pins (used to check
   the 26 letters "A") are "PONMLK". 

   Remark: For C36, if the external key is "AAAAA", the active pins
   are "JIHHG".

   If the External Key is "PEOPLE" (15,4,14,15,11,4):
   (15+15)%26=4=E, (4+14)%25=18=S, (14+13)%23=4=E, 
   (15+12)%21=6=G, (11+11)%19=3=D, (4+10)%17=14=O,
   The active Pins are "ESEGDO".

   If we analyse the key list, we can find the weight of each wheel:
   Wheel 26 : 2 Lugs      Wheel 21 : 3 Lugs     2 overlaps:
   Wheel 25 : 5 Lugs      Wheel 19 :11 Lugs     - 1st : 2-5
   Wheel 23 : 1 Lug       Wheel 17 : 7 Lugs     - 2nd : 5-6

   Overlaps
   Mr Hagelin took the advice of Mr Yves Gylden and improved its C-36 which
   became C-38. First, was added a wheel (with 26 pins). Second, the lugs 
   were no longer set and it was possible to have overlaps. Then on each bar, 
   it was possible to set one or two lugs to active positions (so opposite 
   any of the six wheels). If one sets two lugs, there is an overlap. 
   According to which pins are active, the key will be increased only by one 
   even if the two pins opposite lugs are active. This new feature 
   significantly improves the security of the C models. For each overlap, 
   we need to substract one from the sum of lugs at each position.

   Remarks: 
   If Lugs can be set, it is possible to set one (or two) Lug(s) to the zero 
   position. In this case, the Lug is not effective. If we set the two lugs 
   of a bar in zero position then the sum of effective Lugs is lower than 27. 
   It is senseless, it is better to use the maximum number of lugs. 

   For security reasons, we must set all lugs to zero position when
   transporting cipher machine. In this case, all pins must be set to 
   ineffective positions too.

               Active Pins      The weight of each wheel
   Wheel 26    E  F  G  H  I    2  2  0  2  0
   Wheel 25    S  T  U  V  X    5  0  5  0  0
   Wheel 23    E  F  G  H  I    1  1  1  1  1
   Wheel 21    G  H  I  J  K    0  0  0  3  0
   Wheel 19    D  E  F  G  H    0  0 11  0 11
   Wheel 17    O  P  Q] A  B    7  7  0] 0  0
               =========================================
                               15 10 17  6 12
               Overlaps:       -0 -0 -1 -0 -0
               =========================================
               Key :           15 10 16  6 12
               Slide :         -1 -1 -1 -1 -1
               Cipher :       -22-16 -1-16-12  W G B Q M
               =========================================
               Plain :         18 19 14 15 25  S T O P Z


6. Try to calculate the key :
      key = ( cipher_letter + plain_letter - slide ) % 26

   Plain : "WHEN", perhaps the next letter is "Z" ?
   Cipher: "DSVMA RMAIL..."

   key1 : (22+3+1)%26 = 0
   key2 : (7+18+1)%26 = 0
   key3 : (4+21+1)%26 = 0
   key4 : (13+12+1)%26= 0
   key5 : (25+0+1)%26 = 0

   Perhaps the key is a sequence of zero ?
   Test this hypothesis : 
   plain6: ((0-1) - 17)%26 = 8 = I
   plain7: ((0-1) - 12)%26 = 13 = N
   plain8: ((0-1) - 0)%26 = 25 = Z
   ...
   We are on the good track!


7. Perhaps, the superposition method is the most powerful cryptanalysis one.
   It was invented by Auguste Kerckhoffs. To use it, it is necessary to have 
   several messages which were enciphered with the same key. The number of 
   messages need to be placed over each other, "in depth". All letters of 
   each column are ciphered by the same alphabet!

   With a cipher machine like Enigma, it is necessary to have more than a dozen 
   messages to have any hope of finding the solution. But, in case of cipher 
   machines which use an additive method, only two messages in depth are sufficent. 
   Naturally the more messages in depth you have, the easier it is.

   In the same column, for a single letter:
           cipher1 = ((key + slide) - plain1) % 26   
           cipher2 = ((key + slide) - plain2) % 26
   We can calculate difference (delta):
           DELTA = cipher1 - cipher2 = plain2 - plain1
           plain2 = DELTA + plain1

   We know cipher1 and cipher2. If we guess plain1 (or plain2) we can calculate
   the letter plain2 (or plain1).
   In short, if we guess a probable word in the good position, plain text
   will appear (a word or the begining of a word or the ending of a word).

   We guess that the 3rd message begins by the word "BIRDS". Try to decipher
   the 1st message:
       F - X = B - ? =  5 - 23 =  1 - ? = 8  => ? = 1 - 8 = -7 = 19 = T
       Z - Y = I - ? = 25 - 24 =  8 - ? = 1  => ? = 8 - 1 = 7 = H
       G - T = R - ? =  6 - 19 = 17 - ? = 13  => ? = 17 - 13 = 4 = E
       Y - U = D - ? = 24 - 20 =  3 - ? = 4  => ? = 3 - 4 = -1 = 25 = Z
       U - R = S - ? = ...
   Then, the 1st message begins by the word "THE". We are on the good track!

   The 1st hint is the probable word "WHEEL" in the 2nd message. We need to
   try this word in all positions and verify if plain text appears in other
   texts.

   For example, if we scan the 2nd message with probable string "ZWHEELZ",
   we find the following strings in the 1st message:
   ...
   06 SRPMFYL
   07 UEPFRXY
   08 HEIRQKC
   09 HXUQDOO
   10 AJTDHAS
   11 MIGHTER   (Hooray !!!)

   We have found several things:
   - The string "ZWHEELZ" appears in the 2nd message at the 11th position.
   - The string "MIGHTER" appears in the 1st message at the 11th position.  
     Perhaps it is bordered by "Z": "ZMIGHTERZ" ?
   - We can search for plain text at the 11th position in other messages:
     in the 3rd message, we found "FEATHER".
     in the 4th message, we found "ED IS A".
 
   Normally, with the 2nd hint ("PENNY"), it is easy to find the solution.


8. We can use the superposition method. But as there are only three messages,
   it is harder to find the solution.
     
9. Here also we can use the superposition method.
   But as we have many messages in depth we can use statistical scoring instead
   of searching for probable words.

   Instead of searching for probable words, we can try every frequent trigram of 
   the language used at each position of each message. Naturally, we need to 
   know this language. At each position we can see other trigrams for others
   messages. Normally they too are frequent in the language used if the first
   trigram is correct. If we sum the logarithm of the probability of these 
   trigrams we have a weight for the first trigram at this position. We retain 
   the trigram which gives the top weight. It is also possible to keep only 
   trigrams which give high weight for trigrams at each side. If we are lucky, 
   we can solve all messages automatically .


10. Here also we can use the superposition method.
    Normally  it is very hard to solve only two messages in depth. But in 
    some cases the solution is easy to find.
    The same message can be transmited twice but with light differences 
    (spacing, abbreviations, ...). This phenomena is called "Stagger".
    We can detect them by the length of the messages, timestamp, number of messages, .... 
    In this case (in additive ciphers), if we guess a simple word we can 
    recover the whole message. The only difficulty is to find the initial shift 
    between the two messages.


11. Here also we can use the superposition method.
    But there are only two messages in depth. It is very hard to find the
    solution. It is necessary to try several hundred probable words.
    It is very important to know the subject and any information about
    the content of the messages.
     
    Examples of probable words or probable strings:

    The figures: "ONE" (or "ZONEZ"), "TWO", "THREE", "FOUR", ...

    Signs or abbreviations: "PAREN" (or "ZPARENZ"), "DASH", "QUOTE",
    "CMA" [meaning "comma" ?], "PD" [period ?], "CLN" [meaning ???], "XXX", ...

    Short words: THE (or "ZTHEZ"), AND (or "ZANDZ"), HAS (or "ZHAS"), 
    FOR (or "ZFORZ"), THAT (or "ZTHATZ"), OFTEN (or "ZOFTENZ"), 
    WITH (or "ZWITHZ"), NEVER (or "ZNEVERZ"), IN (or "ZINZ), TO (or "ZTOZ"), ...

    Beginning or ending of words: -TION (or "TIONZ"), -MENT (or "MENTZ"), 
    -ING (or "INGZ"), THER- (or "ZTHER"), ...

    With experience, a depth of two messages can be solved easily (with the
    help of a computer :-).