Entropy Makes the Collatz Series Go Down

Bradley Berg
November 21, 2018
Edited September 27, 2022
ABSTRACT

We look at the Collatz series from an information theory perspective to lay
out its underlying computational mechanics. The mechanisms are similar to
those used by pseudo random number generators and one way hashes.

An equivalent restatement of the Collatz series steps through alternating
even and odd chains. The result of each step is an even number with one or
more trailing zeros. The next higher bits are one or more ones. The remaining
high order bits form an even number with no special ordering. Integers made
up entirely of ones or zeros have no entropy. Transitions discard the low
order zeros, increasing the entropy of the output value. The next higher
group of ones is then replaced by combining it with the remaining high order
bits. This also increases entropy.

Generally, as the entropy of a series increases the values become more
randomized. The more random the series, the more it trends towards its
statistical averages. The average gain of the Collatz series is well below
one, so values drift lower until reaching one, provided the series is not
circular.

notation:
    ^    exponentiation operator
    //   logical shift right operator
    \\   logical shift left operator
    ->   "Transition to" the next value in a sequence

1. Combining Transitions

This is an informal explanation of the computational mechanics of the Collatz
sequence. The same class of computations is used in cryptographic random
number generators.

Often the sequence is defined with the Odd transition as just "3 * N + 1".
Instead we also divide the result by two. This way the odds of the next step
being even or odd are equal: apply the transition to a range of odd inputs
and the outputs split evenly between odd and even values.

    N is Even:  N' = N / 2
    N is Odd:   N' = (3 * N + 1) / 2

The gain of a transition is its output divided by its input (N' / N). As N
gets larger the "+ 1" term quickly becomes insignificant. It is dropped so we
can compute the gain of the transition in the limit.

                Output / Input            Gain
    N is Even:  {N / 2} / N               0.5
    N is Odd:   {(3 * N + 1) / 2} / N     1.5

The gain of a series of transitions is the product of the gains of each
transition. We get the average gain of a sequence by using the probability of
taking either an odd or even path.

    Gs = 1.5 ^ p(odd) * 0.5 ^ p(even)

If a sequence produces uniformly randomized values then the chances of taking
either transition are 50:50. The choice of path is determined by the low
order bit of the input value, so the low bit would need to be randomized over
the sequence. The average gain of a randomized sequence is then:

    Average transition gain = <transition gain> ^ p(<transition>)

    Average odd gain  = 1.5 ^ .5 = 1.22474
    Average even gain = 0.5 ^ .5 = 0.70711

    Average sequence gain:  Ga = 1.22474 * 0.70711 = 0.86603+

The statistical average gain of each step is less than one, so on average the
sequence declines. However, the gain of an actual series can vary
significantly. There could potentially still be a series whose total gain
indefinitely exceeds one and never terminates.
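To make the averages concrete, here is a minimal Python sketch (my own
illustration, not part of the original argument) that applies the transitions
to an arbitrary sample of seeds and measures the observed split of odd and
even steps along with the geometric mean of the per-step gain. Note that
Python's "//" below is integer division, not the shift operator from the
notation above.

    import math
    import random

    def step(n):
        # Transitions from section 1: halve evens, (3 * n + 1) / 2 for odds.
        return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

    random.seed(1)
    log_gain_sum = 0.0
    odd_steps = even_steps = 0

    for _ in range(1000):
        n = random.randrange(10**6, 10**7)   # arbitrary sample of large seeds
        while n > 1:
            nxt = step(n)
            log_gain_sum += math.log(nxt / n)
            if n % 2:
                odd_steps += 1
            else:
                even_steps += 1
            n = nxt

    total = odd_steps + even_steps
    print("fraction of odd steps:", odd_steps / total)               # roughly 0.5
    print("geometric mean gain  :", math.exp(log_gain_sum / total))  # roughly 0.866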
Setting the average gain to one gives the break-even probability of taking an
odd transition:

    Gain = 1.5 ^ p * 0.5 ^ (1 - p) = 1

    ln( 1.5 ^ p * 0.5 ^ [1 - p] ) = ln( 1 ) = 0
                                  = ln( 1.5 ^ p ) + ln( 0.5 ^ [1 - p] )
                                  = p * ln( 1.5 ) + (1 - p) * ln( 0.5 )

    p * ln( 1.5 )             = -(1 - p) * ln( 0.5 )
    ln( 1.5 ) / ln( 0.5 )     = -(1 - p) / p
    ln( 1.5 ) / ln( 0.5 )     = -1 / p + 1
    ln( 1.5 ) / ln( 0.5 ) - 1 = -1 / p

    p = -1 / {ln( 1.5 ) / ln( 0.5 ) - 1} = 0.63093+

    Gain = 1.5^0.63 * 0.5^0.37 = 0.99898    Just under breaking even
    Gain = 1.5^0.64 * 0.5^0.36 = 1.01001    Just over breaking even

For a series to run on forever it would have to sustain an average gain over
one. To do this, odd transitions would need to be applied over 1.7 times more
often than even ones. We've determined that an equal number of transitions
gives an average gain of about 0.86603.

1.1 Even and Odd Chains

There are often consecutive iterations of the same kind of transition in the
sequence, forming a chain.

Even chains start with an even seed value that in binary has one or more
trailing zeros. The Even transitions simply remove the zeros at the end of an
even chain.

Odd chains consume an odd input and can have multiple odd intermediate
values. The output of an Odd chain is an even number. The number of
consecutive low order one bits determines the chain length. For example, an
input of 19 is 10011 in binary, so the subsequent chain has two Odd
transitions: 19 -> 29 -> 44.

A chain of length k takes an input of the form j * 2^k - 1. The output value
simplifies to j * 3^k - 1.

    N(1)   = (3 * N + 1) / 2                       First transition
           = (3 * [j * 2^k - 1] + 1) / 2           Substitute N = j * 2^k - 1
           = ([3 * j * 2^k - 3] + 1) / 2
           = [3 * j * 2^k - 2] / 2
           = 3^1 * j * 2^[k-1] - 1

    N(i+1) = (3 * {3^i * j * 2^[k-i] - 1} + 1) / 2     Subsequent transitions
           = ({3^[i+1] * j * 2^[k-i] - 3} + 1) / 2
           = {3^[i+1] * j * 2^[k-i] - 2} / 2
           = 3^[i+1] * j * 2^[k-i-1] - 1

    N(k)   = 3^k * j - 1                           Odd chain output

A Collatz sequence will have segments with consecutive runs of each
transition. These chains have length k and have the form:

                Chain Input     Chain Output    Gain
    N is Odd:   j * 2^k - 1     j * 3^k - 1     1.5^k
    N is Even:  j * 2^k         j               0.5^k

Note that for even inputs j is the input shifted right by the number of its
trailing zeros. For odd inputs j is 1 plus the input shifted right by the
number of trailing ones.

For reference, here are the first few chains for the series beginning with a
seed of 27.

    Original                            Even       Odd          j    k
    27 -> 41 -> 62                                 27 -> 62     7    2
    -> 31                               -> 31                  31    1
    -> 47 -> 71 -> 107 -> 161 -> 242               -> 242       1    5
    -> 121                              -> 121                121    1
    -> 182                                         -> 182      61    1
    -> 91                               -> 91                  91    1
    -> 137 -> 206                                  -> 206      23    2
    -> 103                              -> 103                103    1
    -> 155 -> 233 -> 350                           -> 350      13    3
    -> 175                              -> 175                175    1
    -> 263 -> 395 -> 593 -> 890                    -> 890      11    4

1.2 Combining Even And Odd Chains

The series alternates between even and odd chains. We represent this aspect
by algebraically merging both into a single step. This gives us a series
defined using a single transition. Since all values in the combined
definition are even, an initial odd value needs to transition one step using
the "3 * N + 1" rule to get an even number.

Every input has the binary form [ j | ones | zeros ]:

    N = ([j + 1] * 2^ko - 1) * 2^kz  ->  [j + 1] * 3^ko - 1

    where:
        kz - number of trailing zeros in N
        ko - number of the next higher run of ones in N
        j  - N shifted right by kz + ko bits: N / 2^[kz + ko]

Using the previous example, initially we transition 27 to 82.
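As a minimal Python sketch (my own illustration; the function and variable
names are not from the original text), the combined transition can be
computed directly from this bit decomposition:

    def combined_step(N):
        # N must be even; returns the next even value in the combined series.
        kz = 0
        while N % 2 == 0:        # count and strip the trailing zeros
            N //= 2              # (kz is counted but discarded by the step)
            kz += 1
        ko = 0
        while N % 2 == 1:        # count and strip the run of ones above them
            N //= 2
            ko += 1
        j = N                    # remaining high order bits
        return (j + 1) * 3**ko - 1

    n = 82                       # 27 transitions to 82 via 3 * 27 + 1
    for _ in range(6):
        nxt = combined_step(n)
        print(n, "->", nxt)      # 82 -> 62, 62 -> 242, 242 -> 182, ...
        n = nxt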
From there the next few steps are:

                                     j   ko   kz   binary
    82 -> 41 -> 124 -> 62           20    1    1   101_0010
    -> 31 -> 484 -> 242              0    5    1   11_1110
    -> 121 -> 364 -> 182            60    1    1   1111_0010
    -> 91 -> 274 -> 206             22    2    1   1011_0110
    -> 103 -> 233 -> 700 -> 350     12    3    1   1100_1110
    -> 593 -> 1780 -> 890           10    4    1   1_0101_1110

The j, ko, kz, and binary columns describe each step's even input value.

2. Sequence Entropy

Shannon entropy is a measure of information indicating the level of
uncertainty about the possible outcomes of a random variable.

    H = -p0 * log2( p0 ) - p1 * log2( p1 )

    where:
        p0 is the probability a bit is zero.
        p1 is the probability a bit is one (1 - p0).

A set of coin tosses has p0 = p1 = 0.5, so its entropy is 1: totally random.
When looking at the entropy of bits in a number, p0 is the fraction of zero
bits. For the binary number 1010_1111, p0 is 0.25 (H = 0.811). Strings of all
ones or all zeros have no entropy (H = 0).

In these examples we are measuring the bits in a number horizontally. Bits in
a series of numbers have two dimensions - horizontal bits in each individual
value and vertical bits over the duration of the series.

We can also measure entropy vertically over a number series. That means we
look at a selected bit in each value as the series progresses. For Collatz
the low order bit is of interest because it determines whether a number is
odd or even. In turn that determines which transition to take. When the
entropy of the low order bit is high, on average there will be nearly as many
even transitions as odd ones.

Each kind of chain takes a value whose low order bits are a string of zeros
or ones and either removes or replaces them. Even chains remove the low order
zeros. The expression for odd chain inputs has a 2^k term that transitions to
3^k. Strings of zeros or ones have no entropy, while the 3^k term has
positive entropy. This adds entropy each time a chain of transitions is
applied.

Removing the low order zeros that have no entropy only increases the entropy
of the result. Entropy is also increased by removing the repeated ones and
replacing them with scrambled bits. The upper bits, j, are scrambled by
multiplying j by a power of 3. This uniformly randomizes the choice of which
transition to apply next.

2.1 Losing Information

The seed can be contrived to produce a Collatz series of any desired length.
The longer the series, the larger the seed has to be. It has to contain
enough information to influence the desired outcome. Initially, as the series
progresses, the information contained in the seed is lost.

When there are two ways to reach the same number in a sequence we lose the
information about which path was taken. Under the original rules, odd numbers
always transition (3 * N + 1) to even numbers, so an odd value cannot be
reached from another odd value. Certain even numbers, however, can be reached
by either an odd or an even transition. For example, an output of 16 can be
reached from either 32 or 5.

    32 -> 32 / 2    = 16
    5  -> 3 * 5 + 1 = 16

Whenever transitioning to an even value such that (even - 1) % 3 = 0, the
previous value could have been either:

    2 * even    or    (even - 1) / 3

For Collatz, a bit of information contained in the seed is lost each time one
of these select even numbers is reached. After all bits in the seed are
scrubbed the initial phase is complete. Any attempt to contrive a seed to
skew results can only directly affect values during this phase.

One way hash functions use this concept of lost information. Secure hashes
have very many ways to reach each hash value. This is how passwords are
encoded so they can be authenticated.
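As a small Python sketch of this loss of information (my own illustration,
using the rules above), the predecessors of an even value can be enumerated
directly:

    def predecessors(even):
        # Possible previous values under the original rules
        # (N -> N / 2 for even N, N -> 3 * N + 1 for odd N).
        prev = [2 * even]                # halving an even value always works
        if (even - 1) % 3 == 0:          # (even - 1) / 3 is then an odd number
            prev.append((even - 1) // 3)
        return prev

    print(predecessors(16))   # [32, 5] - two ways in, one bit of path is lost
    print(predecessors(20))   # [40]    - only one way in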
In this sense Collatz is a trivial example of a one way hash.

2.2 Randomization Phase

This next phase is key, as the sequence acts as a pseudo random number
generator. Randomized values eventually trend towards their average. In turn
this drives transitions towards their average gain. In the introduction we
showed that Collatz has an average gain of 0.86603, driving the series below
the seed value.

Computations for a single Odd step can be performed using shifts and
additions:

    N' = (3 * N + 1) / 2 = [(N \\ 1) + N + 1] // 1

The input is odd so the low order bit is one. The left shift and the two
additions multiply N by 3 and increment the result. This scrambles the bits
in N and the increment ensures the intermediate value is even. The final
right shift eliminates the known low order zero, leaving only the scrambled
bits.

Binary addition is used to scramble values. The following table shows all
combinations for the three inputs (A, B, Carry In) and the two outputs (Sum,
Carry Out). It also shows changes (xor) between the sum and inputs A and B.

            Carry |       Carry   A xor   B xor
    A   B   In    |  Sum  Out     Sum     Sum
    0   0   0     |  0    0       0       0
    0   0   1     |  1    0       1       1
    0   1   0     |  1    0       1       0
    0   1   1     |  0    1       0       1
    1   0   0     |  1    0       0       1
    1   0   1     |  0    1       1       0
    1   1   0     |  0    1       1       1
    1   1   1     |  1    1       0       0

Input bits A and B are vertically aligned and are altered by addition.
Carries are applied horizontally and propagate to higher order bits. Bits in
both directions become scrambled.

Note that all the columns in the table are different. This aspect is how
randomized results are produced. Also note that all columns contain 4 zeros
and 4 ones. This balance produces results that are unbiased towards zero or
one bits. The end result is a series of pseudo random numbers whose values
have uniformly distributed bits.

Each odd input has a low order one bit and upper bits that are unknown. The
known low order one bit is removed in the output and only the unknown bits
remain. Entropy is increased in this step by scrambling the input. Similarly,
in an Even transition the known low order zero is shifted out. Removing the
known low order zero leaves all unknown bits, which also increases entropy.

Adding a constant to uniformly random numbers does not change the uniformity
of the values. They are simply offset by the constant value. Adding 1 to
random numbers from 0 to 9 produces equally random numbers from 1 to 10. The
product of a random variable and a constant is also uniformly random, but
with a larger gap between values. Multiplying random numbers from 1 to 10 by
3 yields random numbers from 3 to 30; they simply have a gap of 3 between
them instead of 1.

For the combined series from section 1.2 we have:

    N = ([j + 1] * 2^ko - 1) * 2^kz  ->  [j + 1] * 3^ko - 1

Overall we are taking a value with unknown bits, R, and scrambling them by
multiplying by a power of 3. Note that the kz term is discarded.

    R' = R * 3^ko - 1

Outputs in any individual series depend on the values kz and ko. The more
random they are, the more random the series. The k values measure the width
of a horizontal subset of bits in each value. Pseudo random number generators
that combine operations on horizontal and vertical bit sets rely on the
independence of these orthogonal values.

Repeated low order bits with low entropy are continuously removed and
replaced by scrambled bits. This creates a self regulating system that
randomizes the low order bits. Those bits control the choice of which even or
odd transition to apply. If the series is uniformly random then even and odd
transitions occur equally often on average.
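As an illustrative Python sketch (mine, with an arbitrarily chosen seed), the
vertical entropy of the low order bit can be measured over a single series
using the (3 * N + 1) / 2 form from section 1:

    import math

    def entropy(p0):
        # Shannon entropy of a bit with probability p0 of being zero.
        h = 0.0
        for p in (p0, 1.0 - p0):
            if p > 0:
                h -= p * math.log2(p)
        return h

    n = 2**64 + 27                    # arbitrary large seed
    low_bits = []
    while n > 1:
        low_bits.append(n & 1)        # the bit that selects the transition
        n = n // 2 if n % 2 == 0 else (3 * n + 1) // 2

    p0 = low_bits.count(0) / len(low_bits)
    print("fraction of even steps:", p0)           # roughly 0.5
    print("vertical entropy      :", entropy(p0))  # close to 1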
In turn this causes the series to decrease.

It remains to formally prove uniformity. The computations must be shown to
have no aspect that is biased towards more one bits than zero bits. Let's
look at the output of the combined transition, [j + 1] * 3^ko - 1, one
operation at a time:

    [j + 1]    Changes the low zero bit of j to one. Adding a constant is
               unbiased.
    * 3^ko     Repeated shift and add:
                   x \\ 1   appends a zero to the end
                   + x      mixes the bits; the low bit remains one
    - 1        Borrows 1 from the result to reset the low bit to zero.
               Subtracting one is unbiased.

If the sequence were not uniformly random then the bias would have to be
reflected in the arithmetic of the transitions. The result of each of these
steps is unbiased, so the series acts as a random number generator. Even if
the seed initially results in highly skewed values, that cannot be sustained.
Values will become more randomized as the series progresses and trend toward
average results. Even if you get lucky and correctly call the results of
several coin tosses, your luck will run out in the long run.

2.3 Reduce To One

The previous phase leaves us with a value of N that is below the seed. Once
this happens we know the series will eventually terminate at one. First, we
know all values below some arbitrary small number M (say 10) transition to
one. Next, starting with the next higher seed, M + 1, we transition until the
value reaches M or less. Since we already know seeds of M or less reach one,
by induction, a series that goes below its seed will reach one.

In this final phase the law of small numbers takes over and the series will
likely behave less randomly. Once one is reached, all information about the
previous values leading up to it is gone. The final value is a 1, which has
no entropy.

3. Conclusion

The Collatz sequence integrates several mechanisms used to create pseudo
random number generators.

    * A one way hash loses regularities due to a contrived seed.
    * Repetition in low order bits is erased in each step.
    * A product of independent values is used to randomize values.

Any individual Collatz series is partitioned into three phases. In the
initial phase the seed value can influence the outcome to produce arbitrarily
long runs. After that the series generates randomized values until it goes
below the seed value. From there it is guaranteed to reduce to one unless the
series is circular.

A uniformly randomized series eventually moves towards a statistically
average gain. For a Collatz series to sustain an average gain above one would
require over 1.7 times more odd transitions than even. This is well above
parity. When the low order bit is random the series will average out and
decrease until it inevitably goes below the seed. Once it does that we know
it will terminate.

The random behavior of the Collatz series makes it impossible to prove
algebraically. The irony is that this randomness is the force that leads to
its convergence.