whisker.random
whisker/random.py
Copyright © 2011-2020 Rudolf Cardinal (rudolf@pobox.com).
This file is part of the Whisker Python client library.
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Randomization functions that may be used by Whisker tasks.
-
class whisker.random.ShuffleLayerMethod(flat: bool = False, layer_key: int = None, layer_attr: str = None, layer_func: Callable[[Any], Any] = None, shuffle_func: Callable[[Sequence[Any]], List[int]] = None)
Class representing instructions to layered_shuffle() (q.v.).
Parameters:
- flat – take data as x[index]?
- layer_key – take data as x[index][layer_key]?
- layer_attr – take data as getattr(x[index], layer_attr)?
- layer_func – take data as layer_func(x[index])?
- shuffle_func – function (N.B. may be a lambda function with parameters attached) that takes a list of objects and returns a list of INDEXES, suitably shuffled.
Typical values of shuffle_func:
- None: no shuffle
- random_shuffle_indexes: plain shuffle; see random_shuffle_indexes()
- dwor_shuffle_indexes: DWOR-style shuffle (will need a lambda for its multiplier parameter); see dwor_shuffle_indexes()
- block_shuffle_indexes_by_value: aggregate items into blocks, defined by value, and shuffle those blocks; see block_shuffle_indexes_by_value()
- sort_indexes: not exactly shuffling! See sort_indexes().
- reverse_sort_indexes: not exactly shuffling! See reverse_sort_indexes().
-
get_indexes_for_value(x: List[Any], value: Any) → List[int]
Returns a list of indexes of x where its value (as defined by this layer) is value.
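As a rough illustration of the four "take data as" options, the sketch below reads a layer value from each item and then collects the matching indexes. It is not the library's code: the helper names layer_value and indexes_for_value, and the order in which the options are checked, are assumptions made for this example.

```python
from collections import namedtuple
from typing import Any, Callable, List, Optional


def layer_value(item: Any,
                flat: bool = False,
                layer_key: Optional[int] = None,
                layer_attr: Optional[str] = None,
                layer_func: Optional[Callable[[Any], Any]] = None) -> Any:
    """Hypothetical helper: the value of one item as seen by a layer."""
    if flat:
        return item                       # x[index]
    if layer_key is not None:
        return item[layer_key]            # x[index][layer_key]
    if layer_attr is not None:
        return getattr(item, layer_attr)  # getattr(x[index], layer_attr)
    if layer_func is not None:
        return layer_func(item)           # layer_func(x[index])
    raise ValueError("no way of reading the layer value was specified")


def indexes_for_value(x: List[Any], value: Any, **how: Any) -> List[int]:
    """Hypothetical helper: indexes of x whose layer value equals value."""
    return [i for i, item in enumerate(x)
            if layer_value(item, **how) == value]


Pair = namedtuple("Pair", ["upper", "digit"])
pairs = [Pair("A", 1), Pair("B", 2), Pair("A", 3)]
print(indexes_for_value(pairs, "A", layer_attr="upper"))   # [0, 2]
print(indexes_for_value(pairs, 2, layer_key=1))            # [1]
print(indexes_for_value(["a", "b", "a"], "a", flat=True))  # [0, 2]
```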
-
whisker.random.block_shuffle_by_attr(x: List[Any], attrorder: List[str], start: int = None, end: int = None) → None
DEPRECATED: layered_shuffle() is more powerful.
Exactly as for block_shuffle_by_item(), but by item attribute rather than item index number.
For example:
    from collections import namedtuple
    import itertools
    from whisker.random import block_shuffle_by_attr

    p = list(itertools.product("ABC", "xyz", "123"))
    Trio = namedtuple("Trio", ["upper", "lower", "digit"])
    q = [Trio(*x) for x in p]
    block_shuffle_by_attr(q, ['upper', 'lower', 'digit'])
q started off as:
    [
        Trio(upper='A', lower='x', digit='1'), Trio(upper='A', lower='x', digit='2'), Trio(upper='A', lower='x', digit='3'),
        Trio(upper='A', lower='y', digit='1'), Trio(upper='A', lower='y', digit='2'), Trio(upper='A', lower='y', digit='3'),
        Trio(upper='A', lower='z', digit='1'), Trio(upper='A', lower='z', digit='2'), Trio(upper='A', lower='z', digit='3'),
        Trio(upper='B', lower='x', digit='1'), Trio(upper='B', lower='x', digit='2'), Trio(upper='B', lower='x', digit='3'),
        Trio(upper='B', lower='y', digit='1'), Trio(upper='B', lower='y', digit='2'), Trio(upper='B', lower='y', digit='3'),
        Trio(upper='B', lower='z', digit='1'), Trio(upper='B', lower='z', digit='2'), Trio(upper='B', lower='z', digit='3'),
        Trio(upper='C', lower='x', digit='1'), Trio(upper='C', lower='x', digit='2'), Trio(upper='C', lower='x', digit='3'),
        Trio(upper='C', lower='y', digit='1'), Trio(upper='C', lower='y', digit='2'), Trio(upper='C', lower='y', digit='3'),
        Trio(upper='C', lower='z', digit='1'), Trio(upper='C', lower='z', digit='2'), Trio(upper='C', lower='z', digit='3')
    ]
but after the shuffle q might now be:
    [
        Trio(upper='B', lower='z', digit='1'), Trio(upper='B', lower='z', digit='3'), Trio(upper='B', lower='z', digit='2'),
        Trio(upper='B', lower='x', digit='1'), Trio(upper='B', lower='x', digit='3'), Trio(upper='B', lower='x', digit='2'),
        Trio(upper='B', lower='y', digit='3'), Trio(upper='B', lower='y', digit='2'), Trio(upper='B', lower='y', digit='1'),
        Trio(upper='A', lower='z', digit='2'), Trio(upper='A', lower='z', digit='1'), Trio(upper='A', lower='z', digit='3'),
        Trio(upper='A', lower='x', digit='1'), Trio(upper='A', lower='x', digit='2'), Trio(upper='A', lower='x', digit='3'),
        Trio(upper='A', lower='y', digit='3'), Trio(upper='A', lower='y', digit='1'), Trio(upper='A', lower='y', digit='2'),
        Trio(upper='C', lower='x', digit='2'), Trio(upper='C', lower='x', digit='3'), Trio(upper='C', lower='x', digit='1'),
        Trio(upper='C', lower='y', digit='2'), Trio(upper='C', lower='y', digit='1'), Trio(upper='C', lower='y', digit='3'),
        Trio(upper='C', lower='z', digit='1'), Trio(upper='C', lower='z', digit='2'), Trio(upper='C', lower='z', digit='3')
    ]
You can see that the A/B/C group has been shuffled as blocks. Then, within B, the x/y/z groups have been shuffled (and so on for A and C). Then, within B.z, the 1/2/3 values have been shuffled (and so on).
-
whisker.random.block_shuffle_by_item(x: List[Any], indexorder: List[int], start: int = None, end: int = None) → None
DEPRECATED: layered_shuffle() is more powerful.
Shuffles the list x[start:end] hierarchically, in place.
Parameters:
- x – list to shuffle
- indexorder – a list of indexes of each item of x. The first index varies slowest; the last varies fastest.
- start – start index of x
- end – end index of x
For example:
    import itertools
    from whisker.random import block_shuffle_by_item

    p = list(itertools.product("ABC", "xyz", "123"))
p is now a list of tuples looking like ('A', 'x', '1').
    block_shuffle_by_item(p, [0, 1, 2])
p might now look like:
    C z 1 } all values of "123" appear  } first "xyz" block
    C z 3 } once, but randomized        }
    C z 2 }                             }
                                        }
    C y 2 } next "123" block            }
    C y 1 }                             }
    C y 3 }                             }
                                        }
    C x 3                               }
    C x 2                               }
    C x 1                               }

    A y 3                               } second "xyz" block
    ...                                 } ...
A clearer explanation is in block_shuffle_by_attr().
-
whisker.random.block_shuffle_indexes_by_value(x: List[Any]) → List[int]
Returns a list of indexes of x, block-shuffled by value.
That is: we aggregate items into blocks, defined by value, and shuffle those blocks, returning the corresponding indexes of the original list.
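The description above can be sketched in plain Python as follows. This is an illustration of the idea only, not the library's implementation; the function name block_shuffle_indexes_sketch is made up for this example.

```python
import random
from typing import Any, List


def block_shuffle_indexes_sketch(x: List[Any]) -> List[int]:
    """Sketch: group equal values into blocks, then shuffle the blocks."""
    # Collect the distinct values in first-seen order (no hashing needed).
    blocks = []  # type: List[Any]
    for item in x:
        if item not in blocks:
            blocks.append(item)
    # Shuffle the order of the blocks, not the items inside them.
    random.shuffle(blocks)
    # Emit, block by block, the original indexes holding that value.
    indexes = []  # type: List[int]
    for value in blocks:
        indexes.extend(i for i, item in enumerate(x) if item == value)
    return indexes


x = ["a", "b", "a", "c", "b"]
idx = block_shuffle_indexes_sketch(x)
print(idx)
print([x[i] for i in idx])  # equal values are now adjacent
```

Applying the returned indexes to x gives a list in which each value forms one contiguous run, with the runs in random order.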
-
whisker.random.dwor_shuffle_indexes(x: List[Any], multiplier: int = 1) → List[int]
Returns a list of indexes of x, DWOR-shuffled by value.
This is a bit tricky as we don’t have a guarantee of equal numbers. It does sensible things in those circumstances.
-
whisker.random.gen_dwor(values: Iterable[Any], multiplier: int = 1) → Generator[Any, None, None]
Generates values using a draw-without-replacement (DWOR) system.
Parameters:
- values – values to generate
- multiplier – DWOR multiplier; see below.
Yields: successive values
Here’s how it works.
Suppose values == [A, B, C].
We’ll call n the number of values (here, 3), and k the “multiplier” parameter.
If you iterate through gen_dwor(values, multiplier=1), you will get a sequence that might look like this (with spaces added for clarity):
    CAB ABC BCA BAC BAC ACB CBA ...
That is, individual values are drawn randomly from a “hat” of size n = 3, containing one of each thing from values. When the hat is empty, it is refilled with n more.
If you iterate through gen_dwor(values, multiplier=2), however, you might get this:
    AACBBC CABBAC BAACCB ...
The computer has put k copies of each value in the hat, and then draws one each time at random (so the hat starts with n × k values in it). When the hat is exhausted, it re-populates.
The general idea is to provide randomness, but randomness that is constrained to prevent unlikely but awkward sequences like
    AAAAAAAAAAAAAAAA ... unlikely but possible with full randomness!
yet also have the option to avoid predictability. With k = 1, a clever subject could infer exactly what’s coming up on every nth trial. So a low value of k brings very few “runs” but some predictability; as k approaches infinity, it’s equivalent to full randomness; some reasonably low value of k in between may be a useful experimental sweet spot.
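The hat mechanism described above can be sketched as a small generator. This is a minimal illustration written for this page, not the library's own code; the names dwor_sketch and take are hypothetical.

```python
import random
from typing import Any, Generator, Iterable, List


def dwor_sketch(values: Iterable[Any],
                multiplier: int = 1) -> Generator[Any, None, None]:
    """Sketch: draw without replacement from a hat that is refilled forever."""
    base = list(values)
    if not base or multiplier < 1:
        return  # nothing to draw from
    while True:
        hat = base * multiplier  # "multiplier" copies of every value
        random.shuffle(hat)      # random drawing order for this hat
        yield from hat           # once the hat is empty, the loop refills it


def take(gen: Generator[Any, None, None], count: int) -> List[Any]:
    """Hypothetical helper: the first `count` values from a generator."""
    return [next(gen) for _ in range(count)]


seq = take(dwor_sketch("ABC", multiplier=2), 18)
# Print in hat-sized groups (3 values x multiplier 2 = 6 per hat).
print(" ".join("".join(seq[i:i + 6]) for i in range(0, 18, 6)))
```

Each group of six printed characters contains exactly two of each letter, which is the constraint that rules out long runs of a single value.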
See also, for example:
-
whisker.random.get_dwor_list(values: Iterable[Any], length: int, multiplier: int = 1) → List[Any]
Makes a fixed-length list via gen_dwor().
Parameters:
- values – values to pick from
- length – list length
- multiplier – DWOR multiplier
Returns: list of length length
Example:
    from whisker.random import get_dwor_list

    values = ["a", "b", "c"]
    print(get_dwor_list(values, length=24, multiplier=1))
    print(get_dwor_list(values, length=24, multiplier=2))
    print(get_dwor_list(values, length=24, multiplier=3))
-
whisker.random.get_indexes_for_value(x: List[Any], value: Any) → List[int]
Returns a list of indexes of x where its value is value.
-
whisker.random.get_unique_values(iterable: Iterable[Any]) → List[Any]
Gets the unique values of its input. See https://stackoverflow.com/questions/12897374/get-unique-values-from-a-list-in-python.
(We don’t use list(set(x)), because if the elements of x are themselves lists (perfectly common!), that gives TypeError: unhashable type: 'list'.)
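One way to collect unique values without hashing is to compare by equality, as in the sketch below. This shows the general technique only; whether the library does exactly this, and whether it preserves first-seen order, is an assumption. The name unique_in_order is made up for the example.

```python
from typing import Any, Iterable, List


def unique_in_order(iterable: Iterable[Any]) -> List[Any]:
    """Sketch: unique values in first-seen order, using == rather than hash."""
    seen = []  # type: List[Any]
    for item in iterable:
        if item not in seen:  # equality test, so unhashable items are fine
            seen.append(item)
    return seen


rows = [[1, 2], [3], [1, 2], [4], [3]]
print(unique_in_order(rows))  # [[1, 2], [3], [4]]

try:
    list(set(rows))  # a set needs hashable elements; lists are not
except TypeError as exc:
    print("set() failed:", exc)
```

The price of avoiding hashing is quadratic time in the number of distinct values, which rarely matters for trial lists of ordinary length.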
-
whisker.random.last_index_of(x: List[Any], value: Any) → int
Gets the index of the last occurrence of value in the list x.
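A minimal sketch of finding the last occurrence is shown below. The name last_index_sketch is hypothetical, and raising ValueError when the value is absent is an assumption borrowed from list.index(); the library's behaviour in that case is not stated here.

```python
from typing import Any, List


def last_index_sketch(x: List[Any], value: Any) -> int:
    """Sketch: index of the last item of x equal to value."""
    # Walk backwards so the first match found is the last occurrence.
    for i in range(len(x) - 1, -1, -1):
        if x[i] == value:
            return i
    raise ValueError("{!r} is not in list".format(value))


print(last_index_sketch(list("abcabc"), "b"))  # 4
```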
-
whisker.random.layered_shuffle(x: List[Any], layers: List[whisker.random.ShuffleLayerMethod]) → None
Most powerful hierarchical shuffle command here.
Shuffles x in place in a layered way as specified by the sequence of methods.
In more detail:
- for each layer, it shuffles values of x as defined by the ShuffleLayerMethod (for example: “shuffle x in blocks based on the value of x.someattr”, or “shuffle x randomly”)
- it then proceeds to deeper layers within sub-lists defined by each unique value from the previous layer.
Parameters:
- x – sequence (e.g. list) to shuffle
- layers – list of ShuffleLayerMethod instructions
Examples:
    from collections import namedtuple
    import itertools
    import logging
    import random
    from whisker.random import *

    logging.basicConfig(level=logging.DEBUG)

    startlist = ["a", "b", "c", "d", "a", "b", "c", "d", "a", "b", "c", "d"]
    x1 = startlist[:]
    x2 = startlist[:]
    x3 = startlist[:]
    x4 = startlist[:]

    do_nothing_method = ShuffleLayerMethod(flat=True, shuffle_func=None)
    do_nothing_method.get_unique_values(x1)
    do_nothing_method.get_indexes_for_value(x1, "b")
    layered_shuffle(x1, [do_nothing_method])
    print(x1)

    flat_randomshuffle_method = ShuffleLayerMethod(
        flat=True, shuffle_func=random_shuffle_indexes)
    flat_randomshuffle_method.get_unique_values(x1)
    flat_randomshuffle_method.get_indexes_for_value(x1, "b")
    layered_shuffle(x1, [flat_randomshuffle_method])
    print(x1)

    flat_blockshuffle_method = ShuffleLayerMethod(
        flat=True, shuffle_func=block_shuffle_indexes_by_value)
    layered_shuffle(x2, [flat_blockshuffle_method])
    print(x2)

    flat_dworshuffle_method = ShuffleLayerMethod(
        flat=True, shuffle_func=dwor_shuffle_indexes)
    layered_shuffle(x3, [flat_dworshuffle_method])
    print(x3)

    flat_dworshuffle2_method = ShuffleLayerMethod(
        flat=True,
        shuffle_func=lambda x: dwor_shuffle_indexes(x, multiplier=2))
    layered_shuffle(x4, [flat_dworshuffle2_method])
    print(x4)

    p = list(itertools.product("ABC", "xyz", "123"))
    Trio = namedtuple("Trio", ["upper", "lower", "digit"])
    q = [Trio(*x) for x in p]
    print("\n".join(str(x) for x in q))

    upper_method = ShuffleLayerMethod(
        layer_attr="upper", shuffle_func=block_shuffle_indexes_by_value)
    lower_method = ShuffleLayerMethod(
        layer_attr="lower", shuffle_func=reverse_sort_indexes)
    digit_method = ShuffleLayerMethod(
        layer_attr="digit", shuffle_func=random_shuffle_indexes)
    layered_shuffle(q, [upper_method, lower_method, digit_method])
    print("\n".join(str(x) for x in q))
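To make the two steps of the layered procedure concrete, here is a self-contained sketch that does not import whisker. It is a simplified stand-in, not the library's algorithm: each layer is represented as a hypothetical (value function, index function) pair instead of a ShuffleLayerMethod, and all function names are invented for this example.

```python
import itertools
import random
from typing import Any, Callable, List, Optional, Sequence, Tuple

IndexFunc = Callable[[Sequence[Any]], List[int]]
Layer = Tuple[Callable[[Any], Any], Optional[IndexFunc]]


def unique_in_order(values: Sequence[Any]) -> List[Any]:
    seen = []  # type: List[Any]
    for v in values:
        if v not in seen:
            seen.append(v)
    return seen


def block_indexes(values: Sequence[Any]) -> List[int]:
    blocks = unique_in_order(values)
    random.shuffle(blocks)  # shuffle whole blocks of equal values
    return [i for b in blocks for i, v in enumerate(values) if v == b]


def reverse_sorted_indexes(values: Sequence[Any]) -> List[int]:
    return sorted(range(len(values)), key=values.__getitem__, reverse=True)


def random_indexes(values: Sequence[Any]) -> List[int]:
    order = list(range(len(values)))
    random.shuffle(order)
    return order


def layered_sketch(x: List[Any], layers: List[Layer]) -> None:
    """Sketch: reorder x in place, layer by layer."""
    if not layers:
        return
    value_of, index_func = layers[0]
    values = [value_of(item) for item in x]
    if index_func is not None:  # None means "leave this layer's order alone"
        order = index_func(values)
        x[:] = [x[i] for i in order]
        values = [values[i] for i in order]
    # Descend into the sub-list of items sharing each unique value.
    for v in unique_in_order(values):
        positions = [i for i, val in enumerate(values) if val == v]
        sub = [x[i] for i in positions]
        layered_sketch(sub, layers[1:])
        for i, item in zip(positions, sub):
            x[i] = item


q = list(itertools.product("ABC", "xyz", "123"))
layered_sketch(q, [
    (lambda t: t[0], block_indexes),           # upper: shuffle as blocks
    (lambda t: t[1], reverse_sorted_indexes),  # lower: z, then y, then x
    (lambda t: t[2], random_indexes),          # digit: plain shuffle
])
print("\n".join(" ".join(t) for t in q))
```

In the printed result the upper-case letters appear as three blocks in random order, the lower-case letters run z, y, x inside each block, and the digits are shuffled within each upper/lower pair.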
-
whisker.random.make_dwor_hat(values: Iterable[Any], multiplier: int = 1) → List[Any]
Makes a “hat” to draw values from. See gen_dwor(). Does not modify the starting list; returns a copy.
-
whisker.random.random_shuffle_indexes(x: List[Any]) → List[int]
Returns a list of indexes of x, randomly shuffled.
-
whisker.random.reverse_sort_indexes(x: List[Any]) → List[int]
Returns the indexes of x in an order that would reverse-sort x by value.
-
whisker.random.shuffle_list_chunks(x: List[Any], chunksize: int) → None
Divides a list into chunks and shuffles the chunks themselves (in place). For example:
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    shuffle_list_chunks(x, 4)
x might now be
    [5, 6, 7, 8, 1, 2, 3, 4, 9, 10, 11, 12]
     ^^^^^^^^^^  ^^^^^^^^^^  ^^^^^^^^^^^^^
Uses cardinal_pythonlib.lists.flatten_list() and cardinal_pythonlib.lists.sort_list_by_index_list(). (I say that mainly to test Intersphinx, when it is enabled.)
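The chunk-level shuffle can be sketched with the standard library alone. This illustration does not use the cardinal_pythonlib helpers named above, so it should be read as an equivalent idea rather than the actual implementation; the name shuffle_chunks_sketch is hypothetical.

```python
import random
from typing import Any, List


def shuffle_chunks_sketch(x: List[Any], chunksize: int) -> None:
    """Sketch: reorder whole chunks of x in place, keeping each chunk intact."""
    chunks = [x[i:i + chunksize] for i in range(0, len(x), chunksize)]
    random.shuffle(chunks)  # move chunks around as units
    x[:] = [item for chunk in chunks for item in chunk]  # flatten, in place


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
shuffle_chunks_sketch(x, 4)
print(x)
```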
-
whisker.random.shuffle_list_slice(x: List[Any], start: int = None, end: int = None) → None
Shuffles a segment of a list, x[start:end], in place.
Note that start=None means “from the beginning” and end=None means “to the end”.
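Python slices already treat None as "no bound", so a slice shuffle can be sketched in a few lines. The function name shuffle_slice_sketch is invented here; this is an illustration rather than the library's code.

```python
import random
from typing import Any, List, Optional


def shuffle_slice_sketch(x: List[Any],
                         start: Optional[int] = None,
                         end: Optional[int] = None) -> None:
    """Sketch: shuffle x[start:end] in place, leaving the rest untouched."""
    segment = x[start:end]  # None bounds mean "from the start" / "to the end"
    random.shuffle(segment)
    x[start:end] = segment  # same length, so x keeps its size


x = list(range(10))
shuffle_slice_sketch(x, 3, 7)
print(x)  # positions 3..6 are shuffled; 0..2 and 7..9 stay put
```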
-
whisker.random.shuffle_list_subset(x: List[Any], indexes: List[int]) → None
Shuffles some elements of a list (in place). The elements to interchange (shuffle) are specified by indexes.
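A subset shuffle can be pictured as: lift out the values at the chosen positions, shuffle them, and put them back into those same positions. The sketch below shows that idea; shuffle_subset_sketch is a hypothetical name and this is not the library's code.

```python
import random
from typing import Any, List


def shuffle_subset_sketch(x: List[Any], indexes: List[int]) -> None:
    """Sketch: shuffle only the elements of x found at `indexes`, in place."""
    values = [x[i] for i in indexes]
    random.shuffle(values)
    for i, value in zip(indexes, values):
        x[i] = value  # other positions are never written


x = list("abcdef")
shuffle_subset_sketch(x, [0, 2, 4])
print(x)  # 'b', 'd' and 'f' keep their places
```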
-
whisker.random.shuffle_list_within_chunks(x: List[Any], chunksize: int) → None
Divides a list into chunks and shuffles WITHIN each chunk (in place). For example:
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    shuffle_list_within_chunks(x, 4)
x might now be:
    [4, 1, 3, 2, 7, 5, 6, 8, 9, 12, 11, 10]
     ^^^^^^^^^^  ^^^^^^^^^^  ^^^^^^^^^^^^^
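For contrast with shuffling the chunks themselves, a within-chunk shuffle keeps every chunk where it is and randomizes only its contents. A minimal sketch, with the made-up name shuffle_within_chunks_sketch, might look like this:

```python
import random
from typing import Any, List


def shuffle_within_chunks_sketch(x: List[Any], chunksize: int) -> None:
    """Sketch: shuffle the contents of each chunk of x, in place."""
    for start in range(0, len(x), chunksize):
        segment = x[start:start + chunksize]
        random.shuffle(segment)
        x[start:start + chunksize] = segment  # chunk stays in its position


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
shuffle_within_chunks_sketch(x, 4)
print(x)
```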
-
whisker.random.shuffle_where_equal_by_attr(x: List[Any], attrname: str) → None
DEPRECATED: layered_shuffle() is more powerful.
Shuffles a list x, in place, where list members are equal as judged by the attribute attrname.
This is easiest to show by example:
    from collections import namedtuple
    import itertools
    from whisker.random import shuffle_where_equal_by_attr

    p = list(itertools.product("ABC", "xyz", "123"))
    Trio = namedtuple("Trio", ["upper", "lower", "digit"])
    q = [Trio(*x) for x in p]
    shuffle_where_equal_by_attr(q, 'digit')
q started off as:
    [
        Trio(upper='A', lower='x', digit='1'), Trio(upper='A', lower='x', digit='2'), Trio(upper='A', lower='x', digit='3'),
        Trio(upper='A', lower='y', digit='1'), Trio(upper='A', lower='y', digit='2'), Trio(upper='A', lower='y', digit='3'),
        Trio(upper='A', lower='z', digit='1'), Trio(upper='A', lower='z', digit='2'), Trio(upper='A', lower='z', digit='3'),
        Trio(upper='B', lower='x', digit='1'), Trio(upper='B', lower='x', digit='2'), Trio(upper='B', lower='x', digit='3'),
        Trio(upper='B', lower='y', digit='1'), Trio(upper='B', lower='y', digit='2'), Trio(upper='B', lower='y', digit='3'),
        Trio(upper='B', lower='z', digit='1'), Trio(upper='B', lower='z', digit='2'), Trio(upper='B', lower='z', digit='3'),
        Trio(upper='C', lower='x', digit='1'), Trio(upper='C', lower='x', digit='2'), Trio(upper='C', lower='x', digit='3'),
        Trio(upper='C', lower='y', digit='1'), Trio(upper='C', lower='y', digit='2'), Trio(upper='C', lower='y', digit='3'),
        Trio(upper='C', lower='z', digit='1'), Trio(upper='C', lower='z', digit='2'), Trio(upper='C', lower='z', digit='3')
    ]
but after the shuffle q might now be:
    [
        Trio(upper='A', lower='x', digit='1'), Trio(upper='A', lower='y', digit='2'), Trio(upper='A', lower='z', digit='3'),
        Trio(upper='B', lower='z', digit='1'), Trio(upper='A', lower='z', digit='2'), Trio(upper='C', lower='x', digit='3'),
        Trio(upper='B', lower='y', digit='1'), Trio(upper='A', lower='x', digit='2'), Trio(upper='C', lower='y', digit='3'),
        Trio(upper='A', lower='y', digit='1'), Trio(upper='C', lower='y', digit='2'), Trio(upper='C', lower='z', digit='3'),
        Trio(upper='C', lower='y', digit='1'), Trio(upper='C', lower='z', digit='2'), Trio(upper='A', lower='y', digit='3'),
        Trio(upper='B', lower='x', digit='1'), Trio(upper='B', lower='z', digit='2'), Trio(upper='B', lower='y', digit='3'),
        Trio(upper='C', lower='z', digit='1'), Trio(upper='C', lower='x', digit='2'), Trio(upper='B', lower='z', digit='3'),
        Trio(upper='C', lower='x', digit='1'), Trio(upper='B', lower='x', digit='2'), Trio(upper='A', lower='x', digit='3'),
        Trio(upper='A', lower='z', digit='1'), Trio(upper='B', lower='y', digit='2'), Trio(upper='B', lower='x', digit='3')
    ]
As you can see, the digit attribute seems to have stayed frozen and everything else has jumbled. What has actually happened is that everything with digit == 1 has been shuffled among themselves, and similarly for digit == 2 and digit == 3.
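The "shuffled among themselves" behaviour can be sketched by grouping positions by attribute value and shuffling each group separately. The code below is an illustration only: shuffle_where_equal_sketch is a made-up name, and it assumes the attribute values are hashable so that they can be used as dictionary keys.

```python
import itertools
import random
from collections import namedtuple
from typing import Any, Dict, List


def shuffle_where_equal_sketch(x: List[Any], attrname: str) -> None:
    """Sketch: shuffle items of x among positions sharing an attribute value."""
    positions_by_value = {}  # type: Dict[Any, List[int]]
    for i, item in enumerate(x):
        positions_by_value.setdefault(getattr(item, attrname), []).append(i)
    for positions in positions_by_value.values():
        members = [x[i] for i in positions]
        random.shuffle(members)
        for i, member in zip(positions, members):
            x[i] = member  # each item lands on a slot with the same value


Trio = namedtuple("Trio", ["upper", "lower", "digit"])
q = [Trio(*t) for t in itertools.product("ABC", "xyz", "123")]
digits_before = [t.digit for t in q]
shuffle_where_equal_sketch(q, "digit")
print([t.digit for t in q] == digits_before)  # True: digit column is "frozen"
```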
-
whisker.random.sort_indexes(x: List[Any]) → List[int]
Returns the indexes of x in an order that would sort x by value.
See https://stackoverflow.com/questions/7851077/how-to-return-index-of-a-sorted-list
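The usual idiom for this (an "argsort") is to sort the index range using the list's own values as the key. The sketch below shows it for both directions; the name sort_indexes_sketch and its reverse parameter are assumptions for this example, standing in for sort_indexes() and reverse_sort_indexes().

```python
from typing import Any, List


def sort_indexes_sketch(x: List[Any], reverse: bool = False) -> List[int]:
    """Sketch: indexes of x in the order that would sort x by value."""
    return sorted(range(len(x)), key=x.__getitem__, reverse=reverse)


x = ["c", "a", "b"]
print(sort_indexes_sketch(x))                # [1, 2, 0]
print(sort_indexes_sketch(x, reverse=True))  # [0, 2, 1]
print([x[i] for i in sort_indexes_sketch(x)])  # ['a', 'b', 'c']
```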