Syracuse library documentation

This package provides some tools to deal with the Collatz conjecture.

The Collatz conjecture, also known as the "3n + 1 problem" or the "Syracuse problem", is an unsolved hypothesis in mathematics that concerns a sequence of operations applied to a positive integer. The conjecture states that no matter what the starting value of this integer is, the sequence will always eventually reach the value 1.

More specifically, the Collatz conjecture states that if one takes a positive integer n and applies the following function:

If n is even, divide it by 2.
If n is odd, multiply it by 3 and add 1.

Then, repeat the process with the resulting value, applying the function over and over again until the value of n eventually reaches 1.

Example

If one takes n = 6, the sequence for the Collatz conjecture would be: 6, 3, 10, 5, 16, 8, 4, 2, 1.

Although the conjecture has been computationally tested for extremely large values, it remains unproven to this day.

Since 3n+1 is even whenever n is odd, it is possible to define a "shortcut" (or "compressed") form for the Collatz function:

If n is even, divide it by 2.
If n is odd, multiply it by 3, add 1 and divide the whole result by 2.

This definition yields smaller values for the stopping time and total stopping time without changing the overall dynamics of the process.

See the related article on Wikipedia for deeper information.

The package provides the following modules:

Module	Description
`core`	Provides classes to define Collatz sequences
`drawing`	Provides tool functions to render in various ways the Collatz sequences
`sequences`	Provides well-known sequences related to Collatz sequences

It is important to notice that the core module can be used without being imported explicitly. However, all other modules need to be imported explicitly.

>>> import syracuse # import all objects exposed by the core module
>>> import syracuse.drawing # import objects exposed by the drawing module
>>> import syracuse.sequences # import objects exposed by the sequences module

`syracuse.core`

This module provides two classes to represent a full or a compressed form of Collatz sequence.

`Syracuse`

A (full) Collatz sequence instanciated with a given initial value.

The instance is iterable, and gives the successive values of the Collatz sequence.

Example

syracuse = Syracuse(1000)
for value in syracuse:
        print(value)
        if value == 1:
                break

Be aware that a Collatz sequence is infinite: once it reaches 1 (and that is apparently the case for all tested initial values until now !), the cycle 4,2,1 returns indefinitely. So make sure you have implemented a stopping condition when you iterate over the sequence.

For each sequence an underlying, directional graph is implemented, and is populated the first time several attributes are read (like total_stopping_time, max and of course graph and graph_view), resulting an additionnal computing duration, that can be noticeable on "big" sequences. But the duration becomes imperceptible next times those attributes are read.

Furthermore, a "global" graph can be optionaly initialized, and is shared with all instances. It is basically useful when working on a large number of sequences, or on same sequences repeatedly. If a global graph population is asked for a sequence instantiation, then the local graph is deduced from it only when needed, and stored in a class attribute for further use. Be aware that the global graph activation generates a significant amount of additional memory storage, but on the other side the computation speed is generally highly improved.

Parameters:

Name	Type	Description	Default
`initial`	`int`	Initial value of the Collatz sequence. Must be strictly positive.	required
`populate_global_graph`	`bool`	If `True`, a global graph is built, and populated with the graph of the current sequence. Default is `False`	`False`

Attributes:

Name	Type	Description
`initial_value`	`int`	Reminder of the initial value given for the sequence initialisation. read-only
`stopping_time`	`int`	The smallest rank such that the value becomes lower than the initial one. read-only
`total_stopping_time`	`int`	The smallest rank such that the value equals 1. read-only
`total_stopping_sequence`	`Tuple[int]`	Tuple of the sequence from the initial value, to 1. read-only
`max`	`int`	Maximum value of the sequence. read-only
`max_rank`	`int`	Rank of the maximum value of the sequence. read-only
`graph_view`	`dict`	A dict-like structure representing the graph of the successive members of the current sequence. read-only
`graph`	`networkx.Digraph`	The underlying, directional graph for the current Collatz sequence. read-only
`global_graph`	`networkx.Digraph`	The global graph shared with all sequences, gathering in the same graph all sequences' graphs instanciated with `populate_global_graph = True`. read-only
`single_graphs`	`dict[networkx.Digraph]`	Store the graphs generated for a single sequence to avoid repeated computations of the same graph (used only if `populate_global_graph` is `True`)

`generate_global_graph(max_initial_value, min_initial_value=1, excludes=[], reverse=False, parallel=False, parallel_algo=0)` `classmethod`

Generate a global graph gathering all Collatz sequences with initial values from min_initial_value to max_initial_value, without those beginning by members of excludes.

To populate the graph, this function temporarly instanciates the needed Syracuse objects, that can cause lot of memory consumtion; nevertheless each instance are deleted before the initialization of the next one. The resulting graph is stored in the class attribute global_graph, and is returned by this function too for convenience.

It is possible to switch to an alternative computation algorithm, using the ability of the computer/OS to execute simultaneous tasks. Depending on the hardware (ie: number of "cores" of the CPU), the benefit can be really interesting for a large range of values (the definition of "large" depends heavily on your configuration). For the most little ranges, it is better to use the classical, sequential approach.

The available types of parallel computation are the following:

Type number	Description
0	Usage of `multiprocessing.Pool` objects.
1	Usage of `multiprocessing.Process` objects: each process computes a part of edges and nodes of the global graph as shared objects, and the global graph is built in the parent process once all edges and nodes are ready.
2	Usage of `multiprocessing.Process` objects: each process computes its own part of the global graph, and the full global graph is later gathered in the parent process.

Parameters:

Name	Type	Description	Default
`min_initial_value`	`int`	The minimal initial value of the proceeded sequences.	`1`
`max_initial_value`	`int`	The maximal initial value of the proceeded sequences.	required
`excludes`	`list[int]`	list of sequences excluded in the range. No effect for members lower than `min_initial_value` or greater than `max_initial_value`.	`[]`
`reverse`	`bool`	If `True`, generate the global graph by computing the constitutive sequences from the maximal to the minimal initial value. Only relevant for non-parallel computation (ie: if `parallel` is `False`).	`False`
`parallel`	`bool`	If `True`, activates the parallel computation algorithm, using pool of multiprocessing workers.	`False`
`parallel_algo`	`int`	Type of algorithm used for the parallel computation. Only relevant when `parallel` is `True`.	`0`

Returns:

Type	Description
`nx.DiGraph`	networkx.DiGraph: A representation of the global graph of the Collatz sequences.

`total_stopping_times_range(max_initial_value, min_initial_value=1, parallel=False)` `classmethod`

Generate the tuple of the total stopping times of all Collatz sequences with initial values from min_initial_value to max_initial_value.

It is possible to switch to an alternative computation algorithm, using the ability of the computer/OS to execute simultaneous tasks. Depending on the hardware (ie: number of "cores" of the CPU), the benefit can be really interesting for a large range of values (the definition of "large" depends heavily on your configuration). For the most little ranges, it is better to use the classical, sequential approach.

Parameters:

Name	Type	Description	Default
`min_initial_value`	`int`	The minimal initial value of the proceeded sequences	`1`
`max_initial_value`	`int`	The maximal initial value of the proceeded sequences	required
`parallel`	`bool`	If True, activates the parallel computation algorithm, using pool of multiprocessing workers	`False`

Returns:

Type	Description
`Tuple[int]`	The ordered total stopping times of the Collatz sequences

`max_reached_values_range(max_initial_value, min_initial_value=1, parallel=False)` `classmethod`

Generate the tuple of the maximal values reached in all Collatz sequences with initial values from min_initial_value to max_initial_value.

It is possible to switch to an alternative computation algorithm, using the ability of the computer/OS to execute simultaneous tasks. Depending on the hardware (ie: number of "cores" of the CPU), the benefit can be really interesting for a large range of values (the definition of "large" depends heavily on your configuration). For the most little ranges, it is better to use the classical, sequential approach.

Parameters:

Name	Type	Description	Default
`min_initial_value`	`int`	The minimal initial value of the proceeded sequences	`1`
`max_initial_value`	`int`	The maximal initial value of the proceeded sequences	required
`parallel`	`bool`	If True, activates the parallel computation algorithm, using pool of multiprocessing workers	`False`

Returns:

Type	Description
`Tuple[int]`	A tuple with the ordered maximal reached values of the Collatz sequences

`reset_global_graph()` `classmethod`

Reset all graphs recorded in the class level: the global graph and the single_graphs list.

`CompressedSyracuse`

Bases: Syracuse

A compressed Collatz sequence instanciated with a given initial value.

This class provides the same parameters, attributes and methods as the inherited Syracuse class.

`syracuse.drawing`

This module provides several functions to render Collatz sequences into various displaying formats

`single_sequence_to_dot_string(syr_sequence, converter='native')`

Create a string containing a Graphviz dot format representing a single Collatz sequence, that can be read by Graphviz to render into various image formats.

It is possible to choose among several "converters" to create dot strings from the Collatz sequence. The "native" one (the default) does not need any external module. It is inspirated by https://en.wikipedia.org/wiki/File:Collatz-graph-300.svg. The "pydot" one uses the pydot library, and creates the dot string from the internal Networkx graph.

Examples:

>>> import syracuse
>>> import syracuse.drawing
>>>
>>> # Native converter
>>> syr = syracuse.Syracuse(6)
>>> collatz6 = syracuse.drawing.single_sequence_to_dot_string(syr)
>>> print(collatz6)
digraph {
6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1;
}
>>>
>>> # pydot converter
>>> syr = syracuse.Syracuse(10)
>>> collatz10 = syracuse.drawing.single_sequence_to_dot_string(syr, "pydot")
>>> print(collatz10)
strict digraph {
10;
5;
16;
8;
4;
2;
1;
10 -> 5;
5 -> 16;
16 -> 8;
8 -> 4;
4 -> 2;
2 -> 1;
}

Parameters:

Name	Type	Description	Default
`syr_sequence`	`Syracuse`	Sequence to be rendered	required
`converter`	`str`	The converter used to create the dot string. The available values are: "native" or "pydot"	`'native'`

Returns:

Type	Description
`str`	A Graphviz dot format representation of the sequence graph

`range_sequences_to_dot_string(compressed=False, limit=10, orientation='portrait', excludes=[], colored=False, converter='native')`

Create a string containing a Graphviz dot format representing a range of Collatz sequences, that can be read by Graphviz to render into various image formats.

It is possible to choose among several "converters" to create dot strings from the Collatz sequence. The "native" one (the default) does not need any external module. It is inspirated by https://en.wikipedia.org/wiki/File:Collatz-graph-300.svg. The "pydot" one uses the pydot library, and creates the dot string from the Networkx global graph.

Examples:

>>> import syracuse
>>> import syracuse.drawing
>>> collatz_range15 = syracuse.drawing.range_sequences_to_dot_string(limit = 15)
>>> print(collatz_range15)
digraph {
1;
2 -> 1;
3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2;
4;
5;
6 -> 3;
7 -> 22 -> 11 -> 34 -> 17 -> 52 -> 26 -> 13 -> 40 -> 20 -> 10;
8;
9 -> 28 -> 14 -> 7;
10;
11;
12 -> 6;
13;
14;
15 -> 46 -> 23 -> 70 -> 35 -> 106 -> 53 -> 160 -> 80 -> 40;
}
>>> collatz_range10_horizontal_colored = syracuse.drawing.range_sequences_to_dot_string(orientation = "landscape", colored = True)
>>> print(collatz_range10_horizontal_colored)
digraph {
node [colorscheme=spectral10]
1 [color=1]
2 [color=1]
3 [color=1]
10 [color=3]
5 [color=2]
16 [color=4]
8 [color=2]
4 [color=1]
6 [color=2]
7 [color=2]
22 [color=5]
11 [color=3]
34 [color=7]
17 [color=4]
52 [color=10]
26 [color=6]
13 [color=3]
40 [color=9]
20 [color=5]
9 [color=2]
28 [color=6]
14 [color=3]
rankdir="LR"
1;
2 -> 1;
3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2;
4;
5;
6 -> 3;
7 -> 22 -> 11 -> 34 -> 17 -> 52 -> 26 -> 13 -> 40 -> 20 -> 10;
8;
9 -> 28 -> 14 -> 7;
10;
}

Parameters:

Name	Type	Description	Default
`compressed`	`bool`	indicates weither one deals with either compressed or "normal" sequences	`False`
`limit`	`int`	the upper bound of the range values from whom the sequences are generated	`10`
`orientation`	`str`	orientation of the rendered graph. Possible values: `landscape` or `portrait`	`'portrait'`
`excludes`	`list[int]`	list of sequences excluded in the range. No effect for members greater than `limit`.	`[]`
`colored`	`bool`	If `True`, the rendered graph will have gradient colors, depending on the value of each node	`False`
`converter`	`str`	The converter used to create the dot string. The available values are: "native" or "pydot". Please notice that this parameter is only relevant if colored is `False` (if colored is `True`, the native converter is used).	`'native'`

Returns:

Type	Description
`str`	A Graphviz dot format representation of the sequences range graph

`single_sequence_to_matplotlib_figure(syr_sequence, test=False)`

Create a Matplotlib Figure object representing the evolution of the successive values of a particular Collatz sequence.

Parameters:

Name	Type	Description	Default
`syr_sequence`	`Syracuse`	Sequence to be rendered	required
`test`	`bool`	if `True`, displays the graph immediately (for testing purposes)	`False`

Returns:

Type	Description
`mpl.figure.Figure`	matplotlib.figure.Figure: A representation of the successive values of the Collatz sequence

`range_sequences_distribution_to_matplotlib_figure(max_initial_value, min_initial_value=1, statistic=True, test=False)`

Create a Matplotlib Figure object representing the distribution of the successive values of a range of Collatz sequences.

Parameters:

Name	Type	Description	Default
`min_initial_value`	`int`	The minimal initial value of the proceeded sequences	`1`
`max_initial_value`	`int`	The maximal initial value of the proceeded sequences	required
`statistic`	`bool`	If `True`, displays some statistical charateristics on the returned figure	`True`
`test`	`bool`	if `True`, displays the graph immediately (for testing purposes)	`False`

Returns:

Type	Description
`mpl.figure.Figure`	matplotlib.figure.Figure: A bar graph representation of the distribution

`syracuse.sequences`

This module provides several well-known sequences related to Collatz sequences.

Unless explicitely mentioned, all sequences provided here implement the iterator protocol. That is, they can be used in for loops, and as parameter for the built-in iter() and next() functions. It also means that only one item is kept in memory at a time, the next items being generated on demand or lazily. Unlike container types like lists or dictionnaries, data are not stored but forgotten once the next item is yield. By the way, iterators are very memory-efficient and can process infinite data streams, like most of the sequences generated here.

The available sequences are the following:

sequence	OEIS reference	Description
total_stopping_time_records	A006877	Sequence of starting values of Collatz sequences with a total stopping time longer than of any smaller starting value.
max_value_records	A006884	Sequence of starting values of Collatz sequences with maximum values higher than of any smaller starting value.

`total_stopping_time_records`

Bases: Iterator

Sequence of starting values of Collatz sequences with a total stopping time longer than of any smaller starting value. OEIS reference: A006877.

Example

6 is included in this sequence because the total stopping time of the Collatz sequence beginning by 6 (that is, 8) is greater than total stopping times of all Collatz sequences beginning by 1, 2, 3, 4 and 5.

In order to speedup the computation, it is possible to apply optional optimizations. Those optimizations are combinable (by addition of their numbers) for a greater efficiency. Deeper details (including mathematical demontrations) are available in this article from T. Leavens and M. Vermeulen.

The available sorts of optimizations are the following (the speedup factor is calculated with the first 50 records. It is only indicative, since it may differ on your own configuration):

Number	Name	Description	Speedup factor
0	None	No optimization	1.0
1	Even numbers	Due to the fact that `Syracuse(2k).total_stopping_time` = `Syracuse(k).total_stopping_time + 1`, it is easy to predict the next even candidate for this sequence, and then it is not necessary to compute the Collatz sequences for the other even numbers below this candidate.	1.9
2	k mod 6 = 5	If the remainder of the division of the initial value by 6 is 5, then this cannot be a record for the total stopping time.	1.2
4	a posteriori cutoff	It is possible to stop the iteration process of a Collatz sequence before its end, by comparing the current iterate value with all previous records and the number of steps necessary to reach it. This optimization needs to store all items previously computed, making this iterator less memory-efficient.	2.0
8	make_odd	Speed up the iterations of the Collatz sequences by replacing the successive divisions by 2, by only one step. Be aware that using this "optimization" alone is absolutly not efficient (as you can see, the speedup factor is more or less 1). However, it is a real "booster" when associated with other optimization algorithms. For instance, the a posteriori cutoff is boosted with a speedup factor of 3.3 when used together with the "make_odd" one.	1.0

Parameters:

Name	Type	Description	Default
`optimization`	`int`	Type of optimization(s) applied. No optimization by default.	`0`

`max_value_records`

Bases: Iterator

Sequence of starting values of Collatz sequences with maximum values higher than of any smaller starting value. OEIS reference: A006884.

Example

7 is included in this sequence because the maximum value of the Collatz sequence beginning by 7 (that is, 52) is greater than maximum values of all Collatz sequences beginning by 1, 2, 3, 4, 5 and 6.

In order to speedup the computation, it is possible to apply optional optimizations. Those optimizations are combinable (by addition of their numbers) for a greater efficiency. Deeper details (including mathematical demontrations) are available in this article from T. Leavens and M. Vermeulen.

The available sorts of optimizations are the following (the speedup factor is calculated with the first 40 records. It is only indicative, since it may differ on your own configuration):

Number	Name	Description	Speedup factor
0	None	No optimization	1.0
1	Odd numbers only	Except 2, there is no even record for max value. So it is not necessary to compute Collatz sequences starting by even values.	2.0
2	k mod 6 = 5	If the remainder of the division of the initial value by 6 is 5, then this cannot be a record for the max value.	1.2
4	a posteriori cutoff	A record, if found, is a max value appearing before the iteration process has fallen under the initial value. Hence, once the initial value is reached, it is possible to stop the iteration process.	8.0
8	make_odd	Speed up the iterations of the Collatz sequences by replacing the successive divisions by 2, by only one step.	1.9

Parameters:

Name	Type	Description	Default
`optimization`	`int`	Type of optimization(s) applied. No optimization by default.	`0`

Syracuse library documentation

syracuse.core

Syracuse

generate_global_graph(max_initial_value, min_initial_value=1, excludes=[], reverse=False, parallel=False, parallel_algo=0) classmethod

total_stopping_times_range(max_initial_value, min_initial_value=1, parallel=False) classmethod

max_reached_values_range(max_initial_value, min_initial_value=1, parallel=False) classmethod

reset_global_graph() classmethod

CompressedSyracuse

syracuse.drawing

single_sequence_to_dot_string(syr_sequence, converter='native')

range_sequences_to_dot_string(compressed=False, limit=10, orientation='portrait', excludes=[], colored=False, converter='native')

single_sequence_to_matplotlib_figure(syr_sequence, test=False)

range_sequences_distribution_to_matplotlib_figure(max_initial_value, min_initial_value=1, statistic=True, test=False)

syracuse.sequences

total_stopping_time_records

max_value_records

`syracuse.core`

`Syracuse`

`generate_global_graph(max_initial_value, min_initial_value=1, excludes=[], reverse=False, parallel=False, parallel_algo=0)` `classmethod`

`total_stopping_times_range(max_initial_value, min_initial_value=1, parallel=False)` `classmethod`

`max_reached_values_range(max_initial_value, min_initial_value=1, parallel=False)` `classmethod`

`reset_global_graph()` `classmethod`

`CompressedSyracuse`

`syracuse.drawing`

`single_sequence_to_dot_string(syr_sequence, converter='native')`

`range_sequences_to_dot_string(compressed=False, limit=10, orientation='portrait', excludes=[], colored=False, converter='native')`

`single_sequence_to_matplotlib_figure(syr_sequence, test=False)`

`range_sequences_distribution_to_matplotlib_figure(max_initial_value, min_initial_value=1, statistic=True, test=False)`

`syracuse.sequences`

`total_stopping_time_records`

`max_value_records`