```
Entropy Makes the Collatz Series Go Down

Edited August  6, 2022

ABSTRACT

We look at the Collatz Series from an information theory
perspective to lay out its underlying computational mechanics.
The mechanisms are similar to those used by pseudo random number
generators and one way hashes.

An equivalent restatement of the Collatz series steps through
alternating even and odd chains.  The result of each step is an
even number with one or more trailing zeros.  The next higher
bits will be one or more ones.  The remaining high order bits are
an even number with no special ordering.

Numbers with all ones or zeros have no entropy.  Transitions
discard low order zeros; increasing the entropy of the output
value.  The next higher group of ones is then replaced by
combining them with the remaining high order bits.  This also
increases entropy.

Generally, as the entropy of a series increases the values become
more randomized.  The more random the series the more it trends
towards statistical averages for the series.  The average gain of
the Collatz Series is well below one so that values drift lower
until reaching one; provided it is not circular.

notation:
->     "Transition to" the next value in a sequence.
^      exponentiation operator
//     logical shift right operator
\\     logical shift left operator

1.  Combining Transitions

This is an informal explanation of the computational mechanics of the
Collatz Sequence.  The same class of computations is used in
cryptographic random number generators.

Often the sequence is defined so that the Odd transition is just "3*n+1".
Instead we also divide the result by two.  This way the average odds of
taking either step are even: run a full range of inputs through the
transition and the output parities are split exactly evenly.

N is Even:    N' = N / 2

N is Odd:     N' = (3 * N + 1) / 2
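As a sketch (in Python, which is not part of the original notation), the
two transitions and the even parity split can be checked directly:

```python
def step(n):
    # Halved Collatz transition: evens shift down, odds apply (3n + 1) / 2.
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

# Over a full range of odd inputs the output parities split exactly
# evenly: n = 1 mod 4 gives an even output, n = 3 mod 4 gives an odd one.
odd_outputs = [step(n) % 2 for n in range(1, 1025, 2)]  # 512 odd inputs
print(odd_outputs.count(0), odd_outputs.count(1))  # -> 256 256
```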

The gain of a transition is its Output divided by the Input (N' / N).
As N gets larger the "+ 1" term quickly becomes insignificant.  It is
dropped so we can compute the gain for the transition in the limit.

              Output / Input               Gain

N is Even:    {N / 2} / N                   0.5

N is Odd:     {(3 * N + 1) / 2} / N         1.5

The gain of a series of transitions is the product of the gains for
each transition.  We get the average gain of a sequence by using the
probability of taking either an odd or even path.

Gs = 1.5 ^ p(odd) * .5 ^ p(even)

If a sequence produces uniformly randomized values then the chances of
using either transitions are 50:50.  In this case the choice of path is
determined by the low order bit of the input value.  This means the low
bit would need to be randomized over the sequence.  The average gain of a
randomized sequence is then:

Average transition gain = <transition gain> ^ p(<transition>)

Average odd gain  = 1.5 ^ .5 = 1.22474

Average even gain = 0.5 ^ .5 = 0.70711

Average sequence gain:   Ga = 1.22474 * 0.70711 = 0.86603+

The statistical average gain in each step is less than one so on average
the sequence declines.  However, the gain in an actual series can vary
significantly.  There could potentially still be a series where the
total gain indefinitely exceeds one and never terminates.

To find the break-even point, solve for the probability p of an odd
transition that makes the total gain exactly one:

Gain = 1.5 ^ p * 0.5 ^ (1 - p) = 1

ln( 1.5 ^ p * 0.5 ^ [1 - p]) = ln( 1 ) = 0

= ln( 1.5 ^ p ) + ln( 0.5 ^ [1 - p] )

= p * ln( 1.5 ) + (1 - p) * ln( 0.5 )

p * ln( 1.5 )  =  -(1 - p) * ln( 0.5 )

ln( 1.5 ) / ln( 0.5 )  =  -(1 - p) / p

ln( 1.5 ) / ln( 0.5 )  =  -1 / p + 1

ln( 1.5 ) / ln( 0.5 ) - 1  =  -1 / p

p  =  -1 / {ln( 1.5 ) / ln( 0.5 ) - 1} = 0.63093+

Gain = 1.5^0.63 * 0.5^0.37 = 0.99898      Just under breaking even

Gain = 1.5^0.64 * 0.5^0.36 = 1.01001      Just over breaking even

For a series to run on forever it would have to sustain an average gain
over one.  To do this odd transitions would need to be applied over 1.7
times as often as evens.  We've determined that an equal number of
transitions gives an average gain of about 0.86603.
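The averages above can be reproduced numerically.  A minimal Python
check of the average gain and the break-even probability derived above:

```python
import math

# Average gain of one transition when odd and even are equally likely:
# Ga = 1.5^0.5 * 0.5^0.5 = sqrt(0.75)
ga = 1.5 ** 0.5 * 0.5 ** 0.5
print(round(ga, 5))  # 0.86603

# Break-even probability of an odd transition, solving
# 1.5^p * 0.5^(1 - p) = 1 as in the derivation above.
p = -1 / (math.log(1.5) / math.log(0.5) - 1)
print(round(p, 5))          # 0.63093
print(round(p / (1 - p), 2))  # odd:even ratio, about 1.71
```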

1.1 Even and Odd Chains

There are often consecutive iterations of the same kind of transition in
the sequence to form a chain.  Even chains start with an even seed value
that in binary have one or more trailing zeroes.  The Even transition
simply removes the zeros at the end of an even chain.

Odd chains consume an odd input and can have multiple odd intermediate
values.  The output of an Odd chain is an even number.  The number of
consecutive low order one bits determines the chain length.  For example,
an input of 19 is a binary 10011 so the subsequent chain has two Odd
transitions:  19 -> 29 -> 44

A chain of length k takes an input of the form:  j * 2^k - 1
The output value simplifies to:  j * 3^k - 1

N(1) = (3 * N + 1) / 2                         First transition

= (3 * [j * 2^k - 1] + 1) / 2             Substitute N = j * 2^k - 1

= ([3 * j * 2^k - 3] + 1) / 2

= [3 * j * 2^k - 2] / 2

= 3^1 * j * 2^[k-1] - 1

N(i+1) = (3 * {3^i * j * 2^[k-i] - 1} + 1) / 2   Subsequent transitions

= ({3^[i+1] * j * 2^[k-i] - 3} + 1) / 2

= {3^[i+1] * j * 2^[k-i] - 2} / 2

= 3^[i+1] * j * 2^[k-i-1] - 1

N(k) = 3^k * j - 1                             Odd chain output
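The closed form can be confirmed by brute force.  A small Python sketch
that iterates the odd transition k times over inputs of the given form:

```python
def odd_step(n):
    # Odd transition with the halving folded in.
    return (3 * n + 1) // 2

# Check the closed form: an input j * 2^k - 1 reaches j * 3^k - 1
# after exactly k odd transitions.
for j in range(1, 20):
    for k in range(1, 10):
        n = j * 2 ** k - 1
        for _ in range(k):
            n = odd_step(n)
        assert n == j * 3 ** k - 1
print("closed form verified")
```

For example 19 = 7 * 2^2 - 1 hits 7 * 3^2 - 1 = 44 wait, 19 has k = 2
and j = 5 under the trailing-ones rule, so 5 * 9 - 1 = 44 as in the
earlier chain 19 -> 29 -> 44.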

A Collatz sequence will have segments with consecutive runs of each
transition.  These chains have length k and have the form:

              Chain Input        Chain Output       Gain

N is Odd:     j * 2^k - 1        j * 3^k - 1        1.5^k

N is Even:    j * 2^k            j                  0.5^k

Note that for even inputs j is the input shifted right by the number of
its trailing zeros.  For odd inputs j is 1 plus the input shifted right
by the number of trailing ones.  For reference, here are the first few
chains for the series beginning with a seed of 27.

Original                            Even      Odd         j    k

27 -> 41 -> 62                                27 -> 62    7    2

-> 31                               -> 31                31    1

-> 47 -> 71 -> 107 -> 161 -> 242              -> 242      1    5

-> 121                              -> 121              121    1

-> 182                                        -> 182     61    1

-> 91                               -> 91                91    1

-> 137 -> 206                                 -> 206     23    2

-> 103                              -> 103              103    1

-> 155 -> 233 -> 350                          -> 350     13    3

-> 175                              -> 175              175    1

-> 263 -> 395 -> 593 -> 890                   -> 890     11    4
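The j and k extraction rules above can be sketched in Python (the
function name next_chain is mine, not from the text).  It decomposes a
value into its next chain and reproduces the seed-27 chain sequence:

```python
def next_chain(n):
    # Decompose the next chain starting at n into (kind, j, k, output).
    if n % 2 == 0:
        k = (n & -n).bit_length() - 1          # count of trailing zeros
        return ("even", n >> k, k, n >> k)     # even chain: shift out zeros
    k = ((n + 1) & -(n + 1)).bit_length() - 1  # count of trailing ones
    j = (n >> k) + 1                           # 1 + input shifted right by k
    return ("odd", j, k, j * 3 ** k - 1)

n = 27
for _ in range(6):
    kind, j, k, out = next_chain(n)
    print(n, "->", out, kind, "j =", j, "k =", k)
    n = out
```

Running this prints the chain outputs 62, 31, 242, 121, 182, 91 with the
same j and k values as the table.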

1.2  Combining Even And Odd Chains

The series alternates between even and odd chains.  We represent this
aspect by algebraically merging both into a single step.  This gives us a
series defined using a single transition.  Since all intermediate values
using the combined definition are even, an initial odd value needs to
transition one step using the "3 * n + 1" rule to get an even number.

Every input has the binary form:   [ j | ko ones | kz zeros ]

N = ([j + 1] * 2^ko - 1) * 2^kz  -->  [j + 1] * 3^ko - 1

where:  kz - number of trailing zeros in N
        ko - number of the next higher run of ones in N
        j  - N shifted right by kz + ko bits:  N / 2^[kz + ko]

Using the previous example, initially we transition 27 to 82.  From there
the next few combined steps are:

Input     binary           j    ko   kz    Output

   82     101_0010         20    1    1        62

   62      11_1110          0    5    1       242

  242     1111_0010        60    1    1       182

  182     1011_0110        22    2    1       206

  206     1100_1110        12    3    1       350

  350     1_0101_1110      10    4    1       890
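A minimal Python sketch of the combined transition, assuming the j, ko,
kz decomposition above (the name combined_step is mine):

```python
def combined_step(n):
    # One combined step: strip the kz trailing zeros, then fold the next
    # ko ones into the upper bits j via [j + 1] * 3^ko - 1.
    assert n % 2 == 0
    kz = (n & -n).bit_length() - 1             # trailing zeros
    m = n >> kz                                # odd, with ko trailing ones
    ko = ((m + 1) & -(m + 1)).bit_length() - 1 # trailing ones
    j = m >> ko
    return (j + 1) * 3 ** ko - 1

n = 3 * 27 + 1   # an odd seed first takes one "3 * n + 1" step: 27 -> 82
for _ in range(6):
    m = combined_step(n)
    print(n, "->", m)
    n = m
```

The loop prints 82 -> 62 -> 242 -> 182 -> 206 -> 350 -> 890, one even
value per combined step.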

2.  Sequence Entropy

Entropy is a measure of information indicating the level of uncertainty
about the possible outcomes of a random variable.

H = -p0 * log2( p0 ) - p1 * log2( p1 )

where:  p0 is the probability a bit is zero.
        p1 is the probability a bit is one (1 - p0).

A set of coin tosses has p0 = p1 = .5 so its entropy is 1, totally random.
When looking at the entropy of bits in a number then p0 is the percentage
of zero bits.  For the binary number, 1010_1111, p0 is .25 (H = 0.811).
Strings of all ones or zeros have no entropy (H = 0).  In these examples
we are measuring the bits in a number horizontally.
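The horizontal measurement can be sketched in a few lines of Python
(bit_entropy is an illustrative name, not from the text):

```python
import math

def bit_entropy(n):
    # Shannon entropy of the bits in n's binary representation.
    bits = bin(n)[2:]
    p1 = bits.count("1") / len(bits)
    if p1 in (0.0, 1.0):
        return 0.0               # all ones or all zeros: no entropy
    p0 = 1 - p1
    return -p0 * math.log2(p0) - p1 * math.log2(p1)

print(round(bit_entropy(0b1010_1111), 3))  # 0.811
print(bit_entropy(0b1111))                 # 0.0
```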

Bits in a series of numbers have two dimensions - horizontal bits in each
individual value and vertical bits over the duration of the series.  We
can also measure entropy vertically over a number series.  That means we
can look at a select bit in each value as the series progresses.

For Collatz the low order bit is interesting because it determines if a
number is odd or even.  In turn that determines which transition to take.
When the entropy of the low order bit is high then on average there will
be nearly as many even transitions as odds.

Each kind of chain takes a value where the low order bits are a string of
zeros or ones and either removes or replaces them.  Even inputs remove low
order zeros.  The expression for odd chain inputs has a 2^k term that
transitions to 3^k.

Strings of zeros or ones have no entropy and the 3^k term has positive
entropy.  This adds entropy each time a chain of transitions is applied.
Removing the low order zeros that have no entropy only increases the
entropy of the result.  Entropy is also increased by removing the
repeated ones and replacing them with scrambled bits.  The upper bits, j,
are scrambled by multiplying j by a power of 3.  This uniformly
randomizes the choice of transition to apply next.

2.1 Losing Information

The seed can be contrived to produce a Collatz series of any desired
length.  The longer the series the larger the seed has to be.  It has to
contain enough information to influence the desired outcome.

Initially, as the series progresses information in the seed is lost.
When there are two ways to reach the same number in a sequence we lose
the information about which path was taken.

Odd numbers always transition (3 * n + 1) to even numbers so an odd value
cannot be reached from another odd.  You can reach certain even numbers
from either an odd or even transition.  For example, an output of 16 can
be reached from either 32 or 5.

32 -> 32 / 2 = 16     5 -> 3 * 5 + 1 = 16

Whenever transitioning to an even value such that (even - 1) % 3 = 0
then the previous value could have been either:

2 * even   or   (even - 1) / 3
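A small Python sketch of this predecessor rule under the classic
3n + 1 form (predecessors is an illustrative name):

```python
def predecessors(n):
    # Values that reach n in one step of the classic Collatz rule.
    preds = [2 * n]                       # halving an even predecessor
    if n % 2 == 0 and (n - 1) % 3 == 0:
        preds.append((n - 1) // 3)        # 3m + 1 from an odd m
    return preds

print(predecessors(16))   # [32, 5] -- two paths, one bit of history lost
print(predecessors(20))   # [40]   -- 19 is not divisible by 3
```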

For Collatz, a bit of information contained in the seed is lost each time
one of these select even numbers is reached.  After all bits in the seed
are scrubbed the initial phase is complete.  Any attempt to contrive a
seed to skew results can only directly affect values during this phase.

One way hash functions use this concept of lost information.  Secure
hashes have very many ways to reach each hash value.  This is how
passwords are encoded so they can be authenticated.  In this sense
Collatz is a trivial example of a one way hash.

2.2  Randomization Phase

This next phase is key as the sequence acts as a pseudo random number
generator.  Randomized values eventually trend towards their average.
In turn this drives transitions towards their average gain.  In the
introduction we showed that Collatz has an average gain of 0.86603;
driving the series below the seed value.

Computations for a single Odd step can be performed using shifts and adds:

N'  =  (3 * N + 1) / 2  =  [(N \\ 1) + N + 1] // 1

The input is odd so the low order bit is one.  The left shift and the two
additions multiply N by 3 and increment the result.  This scrambles the
bits in N and the increment ensures an even intermediate value.  The final
right shift eliminates the known low order zero; leaving only the
scrambled bits.
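The shift-and-add formulation can be checked against the direct
arithmetic in a short Python sketch:

```python
def odd_step_shifts(n):
    # (3n + 1) / 2 via shifts: double, add n, increment, shift out the zero.
    assert n % 2 == 1
    return ((n << 1) + n + 1) >> 1

# The shift form agrees with the direct arithmetic for every odd input.
for n in range(1, 100, 2):
    assert odd_step_shifts(n) == (3 * n + 1) // 2
print(odd_step_shifts(27))  # 41
```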

An input value has a low order one bit and the upper bits are unknown.
The output removes the known low order one bit and produces an unknown
output.  Entropy is increased in this step by scrambling the input.

Similarly, in the Even step the known low order zero is shifted out.
Removing the known low order zero leaves all unknown bits which also
increases entropy.

Adding a constant to uniformly random numbers does not change the
uniformity of the values.  They are simply offset by the constant value.
Adding 1 to random numbers from 0 to 9 produces equally random numbers
from 1 to 10.

The product of a random variable by a constant is also random, but they
have a larger gap between them.  Multiplying random numbers from 1 to 10
by 3 yields random numbers from 3 to 33.  They simply have a gap of 3
between them.

For the combined series from section 1.2 we have:

N  =  ([j + 1] * 2^ko - 1) * 2^kz  ->  [j + 1] * 3^ko - 1

Overall we are taking a value with unknown bits, R, and scrambling them
by multiplying by a power of 3.  Note that the kz term is discarded.

R'  =  R * 3^ko - 1

Outputs in any individual series depend on the values kz and ko.  The more
random they are then the more random the series.  The k values measure
the width of a horizontal subset of bits in each value.  Pseudo random
number generators that conflate operations on horizontal and vertical bit
sets rely on the independence of these orthogonal values.

Repeated low order bits with low entropy are continuously removed and
replaced by scrambled bits.  This creates a self regulating system that
randomizes the low order bits.  Those bits control the choice of which
even or odd transition to apply.

If the series is uniformly random then the average number of even and odd
transitions are the same.  In turn this causes the series to decrease. It
remains to formally prove uniformity.  The computations must be shown
to have no aspect that is biased towards more one bits than zero bits.

Let's look at the output of the combined transition:  [j + 1] * 3^ko - 1

[j + 1]   Changes the low zero bit to one.

* 3^ko    Repeated shift and add; for each of the ko ones:

              x \\ 1   adds a zero to the end

              + x      mixes the bits; the low bit remains one

- 1       Borrows 1 from the result to reset the low bit to zero.
          Subtracting one is unbiased.

If the sequence were not uniformly random then that would be reflected in
the arithmetic in the transitions.  The result of each of these steps is
unbiased so that the series acts as a random number generator.   Even if
the seed initially results in highly skewed values, that cannot be
sustained.  Values will become more randomized as the series progresses
and trend toward average results.  Even if you get lucky and call the
results of several coin tosses, your luck will run out in the long run.

2.3  Reduce To One

The previous phase leaves us with a value of N that is below the seed.
Once this happens we know the series will eventually terminate at one.
Firstly, we know all values below some arbitrary small number M (say 10)
transition to one.

Next, starting with the next higher seed, M + 1, we transition until it
reaches M or less.  Since we already know seeds of M or less will reach
one, by induction, a series that goes below its seed will reach one.
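The inductive step can be exercised empirically.  A Python sketch that
checks every seed up to a bound eventually drops below itself (the name
drops_below_seed and the bounds are mine):

```python
def drops_below_seed(seed, limit=10_000):
    # True if the halved Collatz sequence from seed falls below seed
    # within limit steps; seeds 1 and 2 reach one trivially.
    if seed <= 2:
        return True
    n = seed
    for _ in range(limit):
        n = n // 2 if n % 2 == 0 else (3 * n + 1) // 2
        if n < seed:
            return True
    return False

assert all(drops_below_seed(s) for s in range(1, 10_000))
print("every seed below 10000 drops below itself")
```

By the induction above, each of these seeds therefore reaches one.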

In this final phase the law of small numbers takes over and the series
will likely behave less randomly.  Once one is reached all information
about previous values leading up to it is gone.  The final value is a 1
which has no entropy.

3. Conclusion

The Collatz sequence integrates several mechanisms used to create
pseudo random number generators.

* A one way hash loses regularities due to a contrived seed.

* Repetition in low order bits is erased in each step.

* The product of independent values is used to randomize values.

Any individual series is partitioned into three phases.  In the initial
phase the seed value can influence the outcome to produce arbitrarily
long runs.  After that the series generates randomized values until it
goes below the seed value.  From there it is guaranteed to reduce to one
unless the series is circular.

A uniformly randomized series eventually moves towards a statistically
average gain.  For a Collatz series to sustain an average gain above one
would require over 1.7 times more odd transitions than even.  This is
well above parity.  When the low order bit is random the series will
average out and decrease until it inevitably goes below the seed.  Once
it does that we know it will terminate.

The random behavior of the Collatz series makes it impossible to prove
algebraically.  The irony is that this randomness is the force that
causes it to converge.

```