Collatz Conjecture      Bradley Berg            November 21, 2018
                                                Edited January  24, 2019


                              ABSTRACT

      A revised definition of the Collatz conjecture is used to
      understand its behavior.  Two thirds of the values in the original
      series are skipped by considering values with:   N % 3 = 2

      A graph of these transitions form a directed acyclic tree.  For
      the tree to be complete it can have no disjoint branches and all
      branches converge to the root.  Such a branch would be infinitely
      long and the transactions would produce ever increasing values.

      An argument is made that this branch could not exist.  The
      distribution of transactions in the branch would need to be
      substantially skewed from the pool of possible transactions.

      The key observation is that values in the sequence maintain enough
      entropy so that skew cannot be sustained over an infinite path.
      A proof of the conjecture would need to account for this skew.



I. Collatz Conjecture

     even S:  S' = S / 2
      odd S:  S' = S * 3 + 1

   The sequence eventually reduces to 1.


A. Rewritten Conjecture

      notation:   %   modulus


   The original sequence is rewritten to produce only values where:  N % 3 = 2
   This eliminates 2/3 of the values produced in order to simplify analysis.
   The rewrite combines two or three iterations in each step.

   The transition from an odd number is always even so we can take 2 steps:

      N' = {N * 3 + 1} / 2


   On an even number apply one of these two transitions:

      When N % 4 = 0 then:  N' = N / 4
      When N % 4 = 2 then:  N' = ([N / 2] * 3 + 1) / 2 = {N * 3 + 2} / 4


   The first even case takes 2 steps and the second takes 3.

   If the starting value does not meet the condition N % 3 = 2, apply the
   original transitions until it does.  Then repeatedly apply the rewritten
   transitions.  Eventually they reduce to two; which is trivially reduced
   to one using the original series.

    original:  13 => 40 => 20 => 10 => 5 => 16 => 8 => 4 => 2 => 1  terminate

     revised:  13  advance 20    =>    5    =>    8    =>   2  terminate => 1


Here is a program to print the revised Collatz series:  collatz.exe

   Run from a command line:    collatz   63_728_127


B. Values Where N % 3 = 2

   The revised transitions consume and produce values where:  N % 3 = 2

       If N is odd  and  N = 6 * k + 5  then  N % 3 = 2
          N' = {3 * N + 1} / 2
             =  3 * {[6 * k + 5] + 1} / 2
             =      [18 * k + 16] / 2
             =        9 * k + 8  then  N' % 3 = 2


       If N is even  and  N = 12 * k + 8  then  N % 4 = 0  and  N % 3 = 2  
          N' = N / 4
             = [12 * k + 8] / 4
             =   3 * k + 2  then  N' % 3 = 2


       If N is even  and  N = 12 * k + 2  then  N % 4 = 2  and  N % 3 = 2  
          N' = (3 * N + 2) / 4
             = (3 * [12 * k + 2] + 2) / 4
             =      [36 * k + 8] / 4
             =        9 * k + 2  then  N' % 3 = 2


   The number of possible values between transitions is reduced by 2/3.


II. Advance the Initial Value

       notation:  /\    bitwise and


   An arbitrary starting number may not have N % 3 = 2.  To begin the revised
   sequence we need to first advance from the starting number until the
   condition is met.


A. When N % 3 = 0

      DO until N /\ 1:           DO until the number is odd,
         N = N / 2;                 Advance one even transition.
      -

      N = (3 * N + 1) / 2;       Advance an odd then even transition.


         Examples:  20 => 10 => 5 => 8     36 => 18 => 9 => 14


B. When N % 3 = 1

      IF N /\ 1                  IF N is odd,
         N = (3 * N + 1) / 2;       Advance an odd then even transition.

      ELSE:                      ELSE N is even,
         N = N / 2;                 Advance one even transition.
      .


         Examples:  7 => 11         10 => 5


III. Transition Table

   To graph the transitions we start by using a table to illustrate node
   values and their interconnections.  The table below shows the first few
   elements for each of the three transitions labeled A, B, and C.
   In the first column reading down we have:  5 => 8   8 => 2   2 => 2

   The labels under each in/out pair of values designate the location of the
   next node; represented as an edge in the graph.  The letter denotes the
   next transition and the number denotes the number of columns to move left
   or right.  Positive numbers move right and negatives move left.
   In the first column all cases stay in the first column.

   Each row marked "Shift" in the table corresponds to an edge connecting two
   transition nodes.  There are 9 kinds or edges.  The percentage that each
   edge occurs in the graph is listed.


A. Odd N Apply (3n+1)/2

      in             5  11  17  23  29  35  41  47  53  59  65  71  77  83  89
   (3n+1)/2          8  17  26  35  44  53  62  71  80  89  98 107 116 125 134

   Shift:A    25.0%     1A      2A      3A      4A      5A      6A      7A
   Shift:B    12.5%  0B            -1B             -2B             -3B
   Shift:C    12.5%         0C             -1C             -2C             -3C
                  
                  
B. N % 4 = 0 Apply n/4

      in             8  20  32  44  56  68  80  92 104 116 128 140 152 164 176
     n/4             2   5   8  11  14  17  20  23  26  29  32  35  38  41  44

   Shift:A    12.5%    -1A     -2A     -3A     -4A     -5A     -6A     -7A
   Shift:B    6.25%         -2B             -5B             -8B             -11B
   Shift:C    6.25%  0C             -3C             -6C             -9C
                  
                  
C. N % 4 = 2 Apply (3n+2)/4

      in             2  14  26  38  50  62  74  86  98 110 122 134 146 158 170
   (3n+2)/4          2  11  20  29  38  47  56  65  74  83  92 101 110 119 128

   Shift:A    12.5%     0A      1A      2A      3A      4A      5A      6A
   Shift:B    6.25%        -1B             -2B             -3B             -4B
   Shift:C    6.25%  0C             -1C             -2C             -3C


               Example:  35 => 53 => 80 => 20 => 5 => 8 => 2
                         3A   -2B   -5B   -1A  -0B   0C   0C



IV. Tree Components

   Transitions between values can be represented graphically.  The resulting
   graph forms an acyclic directed tree.  To show this we start by
   describing elements of the graph.  Lastly in section D the elements are
   interconnected to form a complete graph.


      notation:  ..    A chain of values that infinitely increase by four.


A. Graph Edges

   Edges between graph nodes move up for odds (increasing) and left for evens
   (decreasing).  The result is a directed acyclic tree.  It is acyclic
   because a cycle would require edges that move right and down as well.

       aa      ab    ac      ba       bb       bc       ca      cb       cc

        a       b     c    a <= b   b <= b   a <= b   a <= c  b <= c   c <= c
        ^       ^     ^
        a       a     a

   
   This example subtree contains each of the 9 kinds of edges:

       c    b     b      b
       2 <= 8 <= 32 <= 128 <= ..
            ^
            |     b     c      b
          a 5 <= 20 <= 26 <= 104 <= ..
                        ^
                        |     b
                     a 17 <= 68 <= ..
                        ^
                        |      c
                        |  <= 14 <= ..
                     a 11 {
                           <= 44 <= ..
                             b ^
                               |
                              29 <= 38 <= 50 <= 200 <= ..
                               a     c     c      b


B. Traverse in Reverse

   The table in section III can be inverted and used to walk the tree in
   reverse.  That is, outputs are mapped to inputs.  The table below lists
   the first few values in this mapping.   Each column in the table
   represents a single node.

      Output          2   5   8  11  14  17  20  23  26  29  32  35  38  41  44

     (Output*2-1)/3           5          11          17          23          29
      Output*4        8  20  32  44  56  68  80  92 104 116 128 140 152 164 176
     (Output*4-2)/3   2          14          26          38          50


   Inputs to each node are the one or two numbers listed under the outputs.
   They correspond to values such that:

         [ (Output*2-1)/3 ] % 3 = 2     True 1 out of 3 times.
               [ Output*4 ] % 3 = 2     True for all Outputs.
         [ (Output*4-2)/3 ] % 3 = 2     True 1 out of 3 times.


   Nodes are uniquely identified by the output value.
   There is one output for every number such that:  n % 3 = 2

   Collectively inputs also cover all numbers where:  n % 3 = 2
   For every value there is one corresponding input, one output,
   and one edge connecting them.


C. Graph Transitions

   Each of the three transitions has a corresponding graphical feature.
   When a node transitions in sequence to nodes of the same type then a
   chain is formed.  These chains are subtrees that can then be assembled
   into a complete tree.


1. By-Four Chain

   A By-Four chain has an output value and all it's multiples of four.
   The lowest output is any number such that:  N % 8 ~= 0

       => {2, 5, 11, 14, 17, 23, 26, 29, 35, 38, 41, 50, ...}


   By-Four chains have infinite length.  The root path is the first chain
   beginning with 2.

                 b      b      b        b         b          b          b

      c    2 <=  8 <=  32 <=   128 <=   512 <=  2_048 <=  8_192 <=  32_768 ..
      a    5 <= 20 <=  80 <=   320 <= 1_280 <=  5_120 <= 20_480 <= 801_920 ..
      a   11 <= 44 <= 176 <=   704 <= 2_816 <= 11_264 <= 45_056 <= 180_224 ..
      c   14 <= 56 <= 224 <=   896 <= 3_584 <= 14_336 <= 57_344 <= 229_376 ..
      a   17 <= 68 <= 272 <= 1_088 <= 4_352 <= 17_408 <= 69_632 <= 278_528 ..


   All values are contained in some By-Four chain with no duplicates.
   Odd terminal values are type A nodes and the evens are type C.
   All interior values are of type B.

   The chains contain all edges with the form A<=B, C<=B, and B<=B.
   The other six kinds of edges are not covered.  Instead they
   interconnect these By-Four chains to form a complete tree.


2. Odd Number Chain

   An odd number chain has one even output and multiple even inputs.
   The chain can have many odd numbers, but are always finite in length.
   In this example the output is 26 and the inputs are {14, 44, 68}

                 20 <= 26
                        ^
                        |
                       17 <= 68 ..
                        ^
                        |     <= 44 ..
                       11 <= {
                              <= 14 <= 56 ..


   The length of the chain is determined by the lowest odd bit pattern.
   The number of consecutive low order 1 bits determines the chain length.
   The bit pattern for 11 is 1011 base 2 so there are 2 elements in its chain.

   Some odd chains take two input values and some have only one.
   There is a second value when the lowest odd value in the chain has:

      [(Input * 3 + 2) / 4] % 3 = 2


   The lowest odd input in a chain has:  N % 18 = 5 | 11
                                           => {5, 11, 23, 29, ...}

   Outputs have the form:  18 * k + 8 => {8, 26, 44, 62, 80, 98, ...}

   Taken sequentially outputs connect alternatively to a type B value and
   then the next chain has a type C output value.

          2 <=   8 <=   5 <= 20
         20 <=  26 <=  17 <= 11 <= {44, 14}
         11 <=  44 <=  29 <= {116, 38}
         47 <=  62 <=  41 <= 164
         20 <=  80 <=  53 <= 35 <= 23 <= 92
         74 <=  98 <=  65 <= {260, 86}
         29 <= 116 <=  77 <= 308
        101 <= 134 <=  89 <= 59 <= 236
         38 <= 152 <= 101 <= {404, 134}
        128 <= 170 <= 113 <= 452
         47 <= 188 <= 125 <= 83 <= {332, 110}
        155 <= 206 <= 137 <= {548, 182}
         56 <= 224 <= 149 <= 596
        182 <= 242 <= 161 <= 107 <= 71 <= 47 <= {188, 62}


   Every odd number is a part of an odd chain and only one odd chain.  All
   By-Four chains with an odd outout are connected to a node in an odd chain.
   This shows complete coverage for this half of series values.

   For the first few lengths, the frequency that different chain lengths
   occur, their gain, and the average odd chain length is:

                                                          ______
      Length    Frequency = 2/(3^L)    Gain = (3/2)^L     Length

         1              2/3                 3/2           0.66667
         2              2/9                 9/4           1.11111
         3              2/27               27/8           1.33333
         4              2/81               81/16          1.43210
         5              2/243             243/32          1.47325


   Over all odd chains in total the average length converges to 1.5.
   The total average gain is the sum of:  2/(3^L) * (3/2)^L   => 2

   If you take one step beyond the odd chain, then half of those steps reduce
   the gain by 1/4 and the other half by 3/4.  The average gain is then
   reduced from 2 to 1.  Just considering odd chains and the next step the
   average gain breaks even.

   The combination of odd chains plus one step uses only a third of the
   reductions for even values.  Applying the remaining 2/3 of even
   transitions yield further reductions to get below break even.

   The following table shows the average observed gain for segments that
   start with an odd chain followed by even values.  Averages were computed
   for the first 10,000 samples.  The average gain of odd/even segments does
   not increase until the length of the odd chain is 4 or more.  These
   longer segments occur only 3.7% of the time in total.


      Odd Chain Length     Average Gain      Frequency

            1                 0.3858          0.66667
            2                 0.5785          0.22222
            3                 0.8678          0.07407
            4                 1.3024          0.02469
            5                 1.9515          0.00823
            6                 2.9281          0.00274
            7                 4.3923          0.00091
            8                 6.5924          0.00030


3. Even Number Chain

   Type C transitions form an even chain when they produce type C outputs.
   The output type of the first few transitions is denoted in the bottom row:
                 

       in     2  14  26  38  50  62  74  86  98 110 122 134 146 158 170 182 194
    (3n+2)/4  2  11  20  29  38  47  56  65  74  83  92 101 110 119 128 137 146

              c   a   b   a   c   a   b   a   c   a   b   a   c   a   b   a   c


   Each output of an even chain at every step also connects to the output of
   a By-Four chain.  

                              b
                          <= 440 ..     b
             83 <= 110 <={          <= 584 ..     b 
              a     c     <= 146 <={          <= 776 ..
                              c     <= 194 <={
                                        c     <= 258 <= 1032 ..
                                                  c       b


   The length of the chain can be determined by looking at the low order bits
   of the highest type C input that starts the chain.  All type C values end
   with bits 10.  Those that form a chain have a series of zero bits above
   that until the next one bit.  The length of the chain is half the number of
   zero bits plus one.

       258 = 1_00_00_00_10 base 2


   There are 3 pairs of zeros so there are 4 type C nodes in the chain before
   it transitions to the odd value 83.  Along the chain each value is the
   output of a By-Four chain as well.  In this next example there are and odd
   number of zero bits above that until the next one bit.

       386 = 110_00_00_10 base 2


   There are 2 pairs of zeros so there are 3 type C nodes before it
   transitions to the type B value 164.  When there is an odd number of
   zeros then the output is Type B (x4) and when even the output is odd.

                               b
                           <= 872 ..      b
             164 <= 218 <={          <= 1160 ..
              b      c     <= 290 <={            b
                               c     <= 386 <= 1544 ..
                                         c   


   The two edges out of type C chains are:

      A <= C    Connects to the input of an odd chain.
      B <= C    Connects to an interior node within a By-Four chain.


   Every even number that is a type C value is a part of an even chain and
   only one even chain.  On all By-Four chains with an even outout, the
   output is a type C value.  This connects every even By-Four chains to a
   node in an even chain.

   The set of all even chains is ordered by their input values; which can be
   either the output of an odd chain or the output of a By-Four chain.


                            next     c       b  By-Four chain
 
                     ... <=   2 <=   2 <=    8
                             11 <=  14 <=   56
                             20 <=  26 <=  104
                             29 <=  38 <=  152
                      29 <=  38 <=  50 <=  200
                             47 <=  62 <=  248
                             56 <=  74 <=  296
                             65 <=  86 <=  344
                      56 <=  74 <=  98 <=  392
                             83 <= 110 <=  440
                             92 <= 122 <=  488
                            101 <= 134 <=  536
                      83 <= 110 <= 146 <=  584
                            119 <= 158 <=  632
                            128 <= 170 <=  680
                            137 <= 182 <=  728
               83 <= 110 <= 146 <= 194 <=  776


                           next       c     a  Odd Chain

                             20 <=   26 <=  17
                             47 <=   62 <=  41
                     56 <=   74 <=   98 <=  65
                            101 <=  134 <=  89
                            128 <=  170 <= 113
                            155 <=  206 <= 137
                    137 <=  182 <=  242 <= 161
                            209 <=  278 <= 185
                            236 <=  314 <= 209
                            263 <=  350 <= 233
            164 <=  218 <=  290 <=  386 <= 257
 

D. Construct a Complete Tree

   Using elements described in the previous section B, we can construct a
   complete tree.  Even chains interconnect odd and By-Four chains in these
   four forms.  In these templates the uppercase letters represent each
   type of chain.

                                        ^
                                        |
           <= B <= C <= B <=            A <= C <= B <=


              ^
              |
              A <= C                 <= B <= C
                   ^                         ^
                   |                         |
                   A <=                      A <=


V. Scale Factor Per Transition

   A statistical argument for termination shows that on average values along
   a path decrease and eventually reduce to the root node.  The average
   change in each transition is represented using a scale factor.  A scale
   factor under one means that on average each transition is a reduction.

   This is similar to American Roulette where betting on red each round has
   odds that ensure you will lose an average of 7.26%; making the scale
   factor 0.9274 (1 - 7.26%).  Even if you have an initial lucky run
   eventually you will lose all your money.


A. Average Scale Factor

   The average scale factor over the entire graph is computed using the
   frequency that each transition occurs.  In the limit the scale factors
   for the three transitions are:  3/2, 1/4, 3/4

   The average scale factor over the entire graph is then:

       (3/2)^.5 * (1/4)^.25 * (3/4)^.25 = 0.80593... = (27/64) ^ (1/4)
      

   Note that in the limit the effect of the "+1" in 3n+1 diminishes to zero.
   This means that in practice the overall average will be a little larger.
   The following observations reveal variations in the scale factor.

                 (3n+1)/2      n/4     (3n+2)/4   1.5^pa * .25^pb * .75^pc

         graph      50%        25%       25%            0.80593

        to 10^6   0.5068     0.2501     0.2431          0.80969
        to 10^9   0.5043     0.2502     0.2456          0.80815
        1B to 2B  0.5039     0.2501     0.2458          0.80800


   For a given starting number there is an individual path taken within the
   tree.  The average scale factor for each path can vary quite a bit.  When
   there are none or few odd chains it is low.

             n          (3n+1)/2     n/4   (3n+2)/4   1.5^pa * .25^pb * .75^pc

        1_267_611_776       0      0.8236   0.1764          0.303485


   As the ratio of odd chains increases the average scale factor increases.

             n          (3n+1)/2     n/4   (3n+2)/4   1.5^pa * .25^pb * .75^pc

        1_005_925_919    0.5687    0.1400   0.2913        0.953995


B. Disjoint Branch

   If the tree was not fully interconnected there would be a path that never
   reached the root.  The branch would either be cyclic or infinitely long.
   Cyclic branches would have repeating values whereas within acyclic branches
   all values are unique.  To remain unique an acyclic disjoint branch would
   contain values from an increasingly larger pool of numbers.

   As with the gambler that does not quit while ahead, an infinite branch
   would succumb to the pressure of the house odds.  Odd transitions would
   need to be applied over 2/3 of the time.  An infinite disjoint branch
   would contain more than twice as many odd values as even.

      1.5^.672 * .25^.164 * .75^.164 = 0.99794     Just under breaking even

      1.5^.674 * .25^.163 * .75^.163 = 1.00043     Just over breaking even


   Since odds make up half the tree and even nodes each occur a fouth of the
   time, it should not be possible to construct an infinite disjoint branch.
   With no disjoint branches the tree is complete and there is a path from
   every node to the root.

   If By-Four chains were to never occur, even and odd values could appear
   half the time, but never converge.  This too, could not happen in an
   infinite disjoint branch.  Long paths can be contrived that begin with
   few By-Four chains values such as the path starting with 71.

      1.5^.5 * .25^.0 * .75^.5 = 1.06066


   Physical imperfections aside, the model for roulette assumes that each
   spin is an independent purely random event.  This differs from the
   Collatz sequence where a given starting number produces a deterministic
   sequence.

   The remaining question then is whether even and odd values are evenly
   distributed over an infinite disjoint branch.  Actually though, we just
   need to show there is enough entropy so that there are fewer than twice
   as many odd values as evens for the series to converge.

   The lowest value at the start of an odd chain has a sequence of low order
   one bits.  They correspond to the Mersenne number 2^k-1.  The upper bits
   can be written as s*2^(k+1) for a total of:  s*2^(k+1) + 2^k-1

   After advancing k steps the even output of the the odd chain is:

         s*3^k + 3^k - 1


   Starting with 11:  k = 2; s = 1; and 11 = s*2^(k+1) + 2^k-1.

   Taking two steps the output is:  26 = s*3^k + 3^k-1

   The entropy of a binary number is measured in bits:

      p0 = (zero bits) / (total bits)
      p1 =  (one bits) / (total bits) = 1 - p0
       H = -p0 * log2( p0 ) + -p1 * log2( p1 )

   The entropy of the bits in 11 is 0.81; which increases to 0.97 for 26.
   Starting with 29 (H = .72) the output is 44 which has the maximum H = 1.

   Mersenne numbers have no entropy as the bits are all one.  The number
   3^k-1 has a mix of zero and one bits so it's entropy will be higher.  The
   upper bits are also blended with the low order bits to further increase
   entropy.

   After advancing the sequence the low order bits of the output value will
   have more entropy.  The overall increase in entropy for odd chains is a
   self-regulating mechanism that corrects any skew in the ratio of odd to
   even values.

   By-Four chains also increase entropy.  Values within the chain end in
   pairs of zero bits.  These bits also have no entropy as they are all zero.
   Each tranisition deletes the lowest pair until they are all deleted;
   increasing entropy.

   For a branch to continuously increase its path needs to consist primarily
   of odd and type C chains.  However after each long odd chain is reached
   the bit pattern becomes more randomized.  A branch that started with a
   dearth of By-Four chains would eventually converge due to entropy, making
   it impossible to form an infinite disjoint branch.

   Working with this revised series has the advantage that it separates the
   three types of chains.  It also elides two thirds of the values.  Mainly
   it makes clear that considering just the ratio of even and odd values is
   insufficient.  Even values are split into two types of chains.  The blend
   of these three chains along a path needs to be considered.