Collatz Conjecture Bradley Berg November 21, 2018 Edited January 24, 2019 ABSTRACT A revised definition of the Collatz conjecture is used to understand its behavior. Two thirds of the values in the original series are skipped by considering values with: N % 3 = 2 A graph of these transitions form a directed acyclic tree. For the tree to be complete it can have no disjoint branches and all branches converge to the root. Such a branch would be infinitely long and the transactions would produce ever increasing values. An argument is made that this branch could not exist. The distribution of transactions in the branch would need to be substantially skewed from the pool of possible transactions. The key observation is that values in the sequence maintain enough entropy so that skew cannot be sustained over an infinite path. A proof of the conjecture would need to account for this skew. I. Collatz Conjecture even S: S' = S / 2 odd S: S' = S * 3 + 1 The sequence eventually reduces to 1. A. Rewritten Conjecture notation: % modulus The original sequence is rewritten to produce only values where: N % 3 = 2 This eliminates 2/3 of the values produced in order to simplify analysis. The rewrite combines two or three iterations in each step. The transition from an odd number is always even so we can take 2 steps: N' = {N * 3 + 1} / 2 On an even number apply one of these two transitions: When N % 4 = 0 then: N' = N / 4 When N % 4 = 2 then: N' = ([N / 2] * 3 + 1) / 2 = {N * 3 + 2} / 4 The first even case takes 2 steps and the second takes 3. If the starting value does not meet the condition N % 3 = 2, apply the original transitions until it does. Then repeatedly apply the rewritten transitions. Eventually they reduce to two; which is trivially reduced to one using the original series. original: 13 => 40 => 20 => 10 => 5 => 16 => 8 => 4 => 2 => 1 terminate revised: 13 advance 20 => 5 => 8 => 2 terminate => 1 Here is a program to print the revised Collatz series: collatz.exe Run from a command line: collatz 63_728_127 B. Values Where N % 3 = 2 The revised transitions consume and produce values where: N % 3 = 2 If N is odd and N = 6 * k + 5 then N % 3 = 2 N' = {3 * N + 1} / 2 = 3 * {[6 * k + 5] + 1} / 2 = [18 * k + 16] / 2 = 9 * k + 8 then N' % 3 = 2 If N is even and N = 12 * k + 8 then N % 4 = 0 and N % 3 = 2 N' = N / 4 = [12 * k + 8] / 4 = 3 * k + 2 then N' % 3 = 2 If N is even and N = 12 * k + 2 then N % 4 = 2 and N % 3 = 2 N' = (3 * N + 2) / 4 = (3 * [12 * k + 2] + 2) / 4 = [36 * k + 8] / 4 = 9 * k + 2 then N' % 3 = 2 The number of possible values between transitions is reduced by 2/3. II. Advance the Initial Value notation: /\ bitwise and An arbitrary starting number may not have N % 3 = 2. To begin the revised sequence we need to first advance from the starting number until the condition is met. A. When N % 3 = 0 DO until N /\ 1: DO until the number is odd, N = N / 2; Advance one even transition. - N = (3 * N + 1) / 2; Advance an odd then even transition. Examples: 20 => 10 => 5 => 8 36 => 18 => 9 => 14 B. When N % 3 = 1 IF N /\ 1 IF N is odd, N = (3 * N + 1) / 2; Advance an odd then even transition. ELSE: ELSE N is even, N = N / 2; Advance one even transition. . Examples: 7 => 11 10 => 5 III. Transition Table To graph the transitions we start by using a table to illustrate node values and their interconnections. The table below shows the first few elements for each of the three transitions labeled A, B, and C. In the first column reading down we have: 5 => 8 8 => 2 2 => 2 The labels under each in/out pair of values designate the location of the next node; represented as an edge in the graph. The letter denotes the next transition and the number denotes the number of columns to move left or right. Positive numbers move right and negatives move left. In the first column all cases stay in the first column. Each row marked "Shift" in the table corresponds to an edge connecting two transition nodes. There are 9 kinds or edges. The percentage that each edge occurs in the graph is listed. A. Odd N Apply (3n+1)/2 in 5 11 17 23 29 35 41 47 53 59 65 71 77 83 89 (3n+1)/2 8 17 26 35 44 53 62 71 80 89 98 107 116 125 134 Shift:A 25.0% 1A 2A 3A 4A 5A 6A 7A Shift:B 12.5% 0B -1B -2B -3B Shift:C 12.5% 0C -1C -2C -3C B. N % 4 = 0 Apply n/4 in 8 20 32 44 56 68 80 92 104 116 128 140 152 164 176 n/4 2 5 8 11 14 17 20 23 26 29 32 35 38 41 44 Shift:A 12.5% -1A -2A -3A -4A -5A -6A -7A Shift:B 6.25% -2B -5B -8B -11B Shift:C 6.25% 0C -3C -6C -9C C. N % 4 = 2 Apply (3n+2)/4 in 2 14 26 38 50 62 74 86 98 110 122 134 146 158 170 (3n+2)/4 2 11 20 29 38 47 56 65 74 83 92 101 110 119 128 Shift:A 12.5% 0A 1A 2A 3A 4A 5A 6A Shift:B 6.25% -1B -2B -3B -4B Shift:C 6.25% 0C -1C -2C -3C Example: 35 => 53 => 80 => 20 => 5 => 8 => 2 3A -2B -5B -1A -0B 0C 0C IV. Tree Components Transitions between values can be represented graphically. The resulting graph forms an acyclic directed tree. To show this we start by describing elements of the graph. Lastly in section D the elements are interconnected to form a complete graph. notation: .. A chain of values that infinitely increase by four. A. Graph Edges Edges between graph nodes move up for odds (increasing) and left for evens (decreasing). The result is a directed acyclic tree. It is acyclic because a cycle would require edges that move right and down as well. aa ab ac ba bb bc ca cb cc a b c a <= b b <= b a <= b a <= c b <= c c <= c ^ ^ ^ a a a This example subtree contains each of the 9 kinds of edges: c b b b 2 <= 8 <= 32 <= 128 <= .. ^ | b c b a 5 <= 20 <= 26 <= 104 <= .. ^ | b a 17 <= 68 <= .. ^ | c | <= 14 <= .. a 11 { <= 44 <= .. b ^ | 29 <= 38 <= 50 <= 200 <= .. a c c b B. Traverse in Reverse The table in section III can be inverted and used to walk the tree in reverse. That is, outputs are mapped to inputs. The table below lists the first few values in this mapping. Each column in the table represents a single node. Output 2 5 8 11 14 17 20 23 26 29 32 35 38 41 44 (Output*2-1)/3 5 11 17 23 29 Output*4 8 20 32 44 56 68 80 92 104 116 128 140 152 164 176 (Output*4-2)/3 2 14 26 38 50 Inputs to each node are the one or two numbers listed under the outputs. They correspond to values such that: [ (Output*2-1)/3 ] % 3 = 2 True 1 out of 3 times. [ Output*4 ] % 3 = 2 True for all Outputs. [ (Output*4-2)/3 ] % 3 = 2 True 1 out of 3 times. Nodes are uniquely identified by the output value. There is one output for every number such that: n % 3 = 2 Collectively inputs also cover all numbers where: n % 3 = 2 For every value there is one corresponding input, one output, and one edge connecting them. C. Graph Transitions Each of the three transitions has a corresponding graphical feature. When a node transitions in sequence to nodes of the same type then a chain is formed. These chains are subtrees that can then be assembled into a complete tree. 1. By-Four Chain A By-Four chain has an output value and all it's multiples of four. The lowest output is any number such that: N % 8 ~= 0 => {2, 5, 11, 14, 17, 23, 26, 29, 35, 38, 41, 50, ...} By-Four chains have infinite length. The root path is the first chain beginning with 2. b b b b b b b c 2 <= 8 <= 32 <= 128 <= 512 <= 2_048 <= 8_192 <= 32_768 .. a 5 <= 20 <= 80 <= 320 <= 1_280 <= 5_120 <= 20_480 <= 801_920 .. a 11 <= 44 <= 176 <= 704 <= 2_816 <= 11_264 <= 45_056 <= 180_224 .. c 14 <= 56 <= 224 <= 896 <= 3_584 <= 14_336 <= 57_344 <= 229_376 .. a 17 <= 68 <= 272 <= 1_088 <= 4_352 <= 17_408 <= 69_632 <= 278_528 .. All values are contained in some By-Four chain with no duplicates. Odd terminal values are type A nodes and the evens are type C. All interior values are of type B. The chains contain all edges with the form A<=B, C<=B, and B<=B. The other six kinds of edges are not covered. Instead they interconnect these By-Four chains to form a complete tree. 2. Odd Number Chain An odd number chain has one even output and multiple even inputs. The chain can have many odd numbers, but are always finite in length. In this example the output is 26 and the inputs are {14, 44, 68} 20 <= 26 ^ | 17 <= 68 .. ^ | <= 44 .. 11 <= { <= 14 <= 56 .. The length of the chain is determined by the lowest odd bit pattern. The number of consecutive low order 1 bits determines the chain length. The bit pattern for 11 is 1011 base 2 so there are 2 elements in its chain. Some odd chains take two input values and some have only one. There is a second value when the lowest odd value in the chain has: [(Input * 3 + 2) / 4] % 3 = 2 The lowest odd input in a chain has: N % 18 = 5 | 11 => {5, 11, 23, 29, ...} Outputs have the form: 18 * k + 8 => {8, 26, 44, 62, 80, 98, ...} Taken sequentially outputs connect alternatively to a type B value and then the next chain has a type C output value. 2 <= 8 <= 5 <= 20 20 <= 26 <= 17 <= 11 <= {44, 14} 11 <= 44 <= 29 <= {116, 38} 47 <= 62 <= 41 <= 164 20 <= 80 <= 53 <= 35 <= 23 <= 92 74 <= 98 <= 65 <= {260, 86} 29 <= 116 <= 77 <= 308 101 <= 134 <= 89 <= 59 <= 236 38 <= 152 <= 101 <= {404, 134} 128 <= 170 <= 113 <= 452 47 <= 188 <= 125 <= 83 <= {332, 110} 155 <= 206 <= 137 <= {548, 182} 56 <= 224 <= 149 <= 596 182 <= 242 <= 161 <= 107 <= 71 <= 47 <= {188, 62} Every odd number is a part of an odd chain and only one odd chain. All By-Four chains with an odd outout are connected to a node in an odd chain. This shows complete coverage for this half of series values. For the first few lengths, the frequency that different chain lengths occur, their gain, and the average odd chain length is: ______ Length Frequency = 2/(3^L) Gain = (3/2)^L Length 1 2/3 3/2 0.66667 2 2/9 9/4 1.11111 3 2/27 27/8 1.33333 4 2/81 81/16 1.43210 5 2/243 243/32 1.47325 Over all odd chains in total the average length converges to 1.5. The total average gain is the sum of: 2/(3^L) * (3/2)^L => 2 If you take one step beyond the odd chain, then half of those steps reduce the gain by 1/4 and the other half by 3/4. The average gain is then reduced from 2 to 1. Just considering odd chains and the next step the average gain breaks even. The combination of odd chains plus one step uses only a third of the reductions for even values. Applying the remaining 2/3 of even transitions yield further reductions to get below break even. The following table shows the average observed gain for segments that start with an odd chain followed by even values. Averages were computed for the first 10,000 samples. The average gain of odd/even segments does not increase until the length of the odd chain is 4 or more. These longer segments occur only 3.7% of the time in total. Odd Chain Length Average Gain Frequency 1 0.3858 0.66667 2 0.5785 0.22222 3 0.8678 0.07407 4 1.3024 0.02469 5 1.9515 0.00823 6 2.9281 0.00274 7 4.3923 0.00091 8 6.5924 0.00030 3. Even Number Chain Type C transitions form an even chain when they produce type C outputs. The output type of the first few transitions is denoted in the bottom row: in 2 14 26 38 50 62 74 86 98 110 122 134 146 158 170 182 194 (3n+2)/4 2 11 20 29 38 47 56 65 74 83 92 101 110 119 128 137 146 c a b a c a b a c a b a c a b a c Each output of an even chain at every step also connects to the output of a By-Four chain. b <= 440 .. b 83 <= 110 <={ <= 584 .. b a c <= 146 <={ <= 776 .. c <= 194 <={ c <= 258 <= 1032 .. c b The length of the chain can be determined by looking at the low order bits of the highest type C input that starts the chain. All type C values end with bits 10. Those that form a chain have a series of zero bits above that until the next one bit. The length of the chain is half the number of zero bits plus one. 258 = 1_00_00_00_10 base 2 There are 3 pairs of zeros so there are 4 type C nodes in the chain before it transitions to the odd value 83. Along the chain each value is the output of a By-Four chain as well. In this next example there are and odd number of zero bits above that until the next one bit. 386 = 110_00_00_10 base 2 There are 2 pairs of zeros so there are 3 type C nodes before it transitions to the type B value 164. When there is an odd number of zeros then the output is Type B (x4) and when even the output is odd. b <= 872 .. b 164 <= 218 <={ <= 1160 .. b c <= 290 <={ b c <= 386 <= 1544 .. c The two edges out of type C chains are: A <= C Connects to the input of an odd chain. B <= C Connects to an interior node within a By-Four chain. Every even number that is a type C value is a part of an even chain and only one even chain. On all By-Four chains with an even outout, the output is a type C value. This connects every even By-Four chains to a node in an even chain. The set of all even chains is ordered by their input values; which can be either the output of an odd chain or the output of a By-Four chain. next c b By-Four chain ... <= 2 <= 2 <= 8 11 <= 14 <= 56 20 <= 26 <= 104 29 <= 38 <= 152 29 <= 38 <= 50 <= 200 47 <= 62 <= 248 56 <= 74 <= 296 65 <= 86 <= 344 56 <= 74 <= 98 <= 392 83 <= 110 <= 440 92 <= 122 <= 488 101 <= 134 <= 536 83 <= 110 <= 146 <= 584 119 <= 158 <= 632 128 <= 170 <= 680 137 <= 182 <= 728 83 <= 110 <= 146 <= 194 <= 776 next c a Odd Chain 20 <= 26 <= 17 47 <= 62 <= 41 56 <= 74 <= 98 <= 65 101 <= 134 <= 89 128 <= 170 <= 113 155 <= 206 <= 137 137 <= 182 <= 242 <= 161 209 <= 278 <= 185 236 <= 314 <= 209 263 <= 350 <= 233 164 <= 218 <= 290 <= 386 <= 257 D. Construct a Complete Tree Using elements described in the previous section B, we can construct a complete tree. Even chains interconnect odd and By-Four chains in these four forms. In these templates the uppercase letters represent each type of chain. ^ | <= B <= C <= B <= A <= C <= B <= ^ | A <= C <= B <= C ^ ^ | | A <= A <= V. Scale Factor Per Transition A statistical argument for termination shows that on average values along a path decrease and eventually reduce to the root node. The average change in each transition is represented using a scale factor. A scale factor under one means that on average each transition is a reduction. This is similar to American Roulette where betting on red each round has odds that ensure you will lose an average of 7.26%; making the scale factor 0.9274 (1 - 7.26%). Even if you have an initial lucky run eventually you will lose all your money. A. Average Scale Factor The average scale factor over the entire graph is computed using the frequency that each transition occurs. In the limit the scale factors for the three transitions are: 3/2, 1/4, 3/4 The average scale factor over the entire graph is then: (3/2)^.5 * (1/4)^.25 * (3/4)^.25 = 0.80593... = (27/64) ^ (1/4) Note that in the limit the effect of the "+1" in 3n+1 diminishes to zero. This means that in practice the overall average will be a little larger. The following observations reveal variations in the scale factor. (3n+1)/2 n/4 (3n+2)/4 1.5^pa * .25^pb * .75^pc graph 50% 25% 25% 0.80593 to 10^6 0.5068 0.2501 0.2431 0.80969 to 10^9 0.5043 0.2502 0.2456 0.80815 1B to 2B 0.5039 0.2501 0.2458 0.80800 For a given starting number there is an individual path taken within the tree. The average scale factor for each path can vary quite a bit. When there are none or few odd chains it is low. n (3n+1)/2 n/4 (3n+2)/4 1.5^pa * .25^pb * .75^pc 1_267_611_776 0 0.8236 0.1764 0.303485 As the ratio of odd chains increases the average scale factor increases. n (3n+1)/2 n/4 (3n+2)/4 1.5^pa * .25^pb * .75^pc 1_005_925_919 0.5687 0.1400 0.2913 0.953995 B. Disjoint Branch If the tree was not fully interconnected there would be a path that never reached the root. The branch would either be cyclic or infinitely long. Cyclic branches would have repeating values whereas within acyclic branches all values are unique. To remain unique an acyclic disjoint branch would contain values from an increasingly larger pool of numbers. As with the gambler that does not quit while ahead, an infinite branch would succumb to the pressure of the house odds. Odd transitions would need to be applied over 2/3 of the time. An infinite disjoint branch would contain more than twice as many odd values as even. 1.5^.672 * .25^.164 * .75^.164 = 0.99794 Just under breaking even 1.5^.674 * .25^.163 * .75^.163 = 1.00043 Just over breaking even Since odds make up half the tree and even nodes each occur a fouth of the time, it should not be possible to construct an infinite disjoint branch. With no disjoint branches the tree is complete and there is a path from every node to the root. If By-Four chains were to never occur, even and odd values could appear half the time, but never converge. This too, could not happen in an infinite disjoint branch. Long paths can be contrived that begin with few By-Four chains values such as the path starting with 71. 1.5^.5 * .25^.0 * .75^.5 = 1.06066 Physical imperfections aside, the model for roulette assumes that each spin is an independent purely random event. This differs from the Collatz sequence where a given starting number produces a deterministic sequence. The remaining question then is whether even and odd values are evenly distributed over an infinite disjoint branch. Actually though, we just need to show there is enough entropy so that there are fewer than twice as many odd values as evens for the series to converge. The lowest value at the start of an odd chain has a sequence of low order one bits. They correspond to the Mersenne number 2^k-1. The upper bits can be written as s*2^(k+1) for a total of: s*2^(k+1) + 2^k-1 After advancing k steps the even output of the the odd chain is: s*3^k + 3^k - 1 Starting with 11: k = 2; s = 1; and 11 = s*2^(k+1) + 2^k-1. Taking two steps the output is: 26 = s*3^k + 3^k-1 The entropy of a binary number is measured in bits: p0 = (zero bits) / (total bits) p1 = (one bits) / (total bits) = 1 - p0 H = -p0 * log2( p0 ) + -p1 * log2( p1 ) The entropy of the bits in 11 is 0.81; which increases to 0.97 for 26. Starting with 29 (H = .72) the output is 44 which has the maximum H = 1. Mersenne numbers have no entropy as the bits are all one. The number 3^k-1 has a mix of zero and one bits so it's entropy will be higher. The upper bits are also blended with the low order bits to further increase entropy. After advancing the sequence the low order bits of the output value will have more entropy. The overall increase in entropy for odd chains is a self-regulating mechanism that corrects any skew in the ratio of odd to even values. By-Four chains also increase entropy. Values within the chain end in pairs of zero bits. These bits also have no entropy as they are all zero. Each tranisition deletes the lowest pair until they are all deleted; increasing entropy. For a branch to continuously increase its path needs to consist primarily of odd and type C chains. However after each long odd chain is reached the bit pattern becomes more randomized. A branch that started with a dearth of By-Four chains would eventually converge due to entropy, making it impossible to form an infinite disjoint branch. Working with this revised series has the advantage that it separates the three types of chains. It also elides two thirds of the values. Mainly it makes clear that considering just the ratio of even and odd values is insufficient. Even values are split into two types of chains. The blend of these three chains along a path needs to be considered.