Branch and Keyspace Management

Tools for managing the parameter space of a parallel run.

class vivarium_cluster_tools.psimulate.branches.Keyspace(branches, keyspace)

A representation of a collection of simulation configurations.

Parameters:
  • branches

  • keyspace

classmethod from_branch_configuration(branch_configuration_file)
Return type:

Keyspace

Parameters:

branch_configuration_file (str | Path) – Absolute path to the branch configuration file.

classmethod from_previous_run(keyspace_path, branches_path)
Return type:

Keyspace

Parameters:
  • keyspace_path (Path)

  • branches_path (Path)

classmethod for_load_test(num_workers)

Create a keyspace for load testing.

Return type:

Keyspace

Parameters:

num_workers (int) – The number of workers (and thus jobs) to create.

Returns:

A Keyspace with the specified number of unique random seeds and input draws and an empty branch configuration.

classmethod from_entry_point_args(input_branch_configuration_path, keyspace_path, branches_path, extras)
Return type:

Keyspace

Parameters:
  • input_branch_configuration_path (Path | None)

  • keyspace_path (Path)

  • branches_path (Path)

  • extras (dict[str, Any])

persist(keyspace_path, branches_path)
Return type:

None

Parameters:
  • keyspace_path (Path)

  • branches_path (Path)

add_draws(num_draws)
Return type:

None

Parameters:

num_draws (int)

add_seeds(num_seeds)
Return type:

None

Parameters:

num_seeds (int)

vivarium_cluster_tools.psimulate.branches.calculate_input_draws(input_draw_count, existing_draws=None)

Determines a random sample of the GBD input draws to use.

Notes

The returned draws satisfy the requested draw count while excluding any draws that have already been used.

Parameters:
  • input_draw_count (int) – The number of draws to pull.

  • existing_draws (list[int] | None) – Any draws that have already been pulled and should not be pulled again.

Returns:

A list of unique input draw numbers, guaranteed not to overlap with any existing draw numbers.

Return type:

list[int]
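
As a rough illustration of the non-overlap guarantee, the sketch below samples draw numbers without replacement from the draws that have not been used yet. The function name sample_input_draws, the max_draw_count default of 1000, and the seed parameter are assumptions made for this example; they are not taken from the library, whose actual sampling scheme may differ.

```python
import random


def sample_input_draws(input_draw_count, existing_draws=None,
                       max_draw_count=1000, seed=0):
    """Hypothetical sketch: sample new, non-overlapping draw numbers.

    Draw numbers are assumed to lie in range(max_draw_count); both that
    range and the seeding are illustrative assumptions.
    """
    used = set(existing_draws or [])
    # Only draws that have not been pulled before are eligible.
    available = [d for d in range(max_draw_count) if d not in used]
    if input_draw_count > len(available):
        raise ValueError(
            f"Requested {input_draw_count} draws but only "
            f"{len(available)} unused draws remain."
        )
    # A seeded generator keeps the sample reproducible between calls.
    rng = random.Random(seed)
    # sample() draws without replacement, so the result has no duplicates.
    return rng.sample(available, input_draw_count)
```

Sampling from the pre-filtered pool means the result cannot collide with existing_draws, at the cost of materializing the whole pool in memory.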

vivarium_cluster_tools.psimulate.branches.calculate_random_seeds(random_seed_count, existing_seeds=None)

Generates random seeds to use given a count of seeds and any existing seeds.

Return type:

list[int]

Parameters:
  • random_seed_count (int) – The number of random seeds to generate.

  • existing_seeds (list[int] | None) – Any random seeds that have already been generated and should not be generated again.

Returns:

A list of unique random seeds, guaranteed not to overlap with any existing random seeds.
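
One way to picture the uniqueness guarantee when the seed range is too large to enumerate is rejection sampling: propose a candidate, discard it if it has been seen, and repeat until enough seeds are collected. The sketch below assumes this approach; the name generate_random_seeds, the max_seed bound, and the rng_seed parameter are hypothetical and do not describe the library's implementation.

```python
import random


def generate_random_seeds(random_seed_count, existing_seeds=None,
                          max_seed=2**32, rng_seed=0):
    """Hypothetical sketch: generate new seeds by rejection sampling.

    Seeds are assumed to lie in range(max_seed); that bound is an
    illustrative assumption.
    """
    used = set(existing_seeds or [])
    # Guard against an endless loop when too few unused seeds remain.
    used_in_range = sum(1 for s in used if 0 <= s < max_seed)
    if random_seed_count > max_seed - used_in_range:
        raise ValueError("Not enough unused seeds in the allowed range.")
    rng = random.Random(rng_seed)
    new_seeds = []
    while len(new_seeds) < random_seed_count:
        candidate = rng.randrange(max_seed)
        # Reject candidates that already exist or were just generated.
        if candidate not in used:
            used.add(candidate)
            new_seeds.append(candidate)
    return new_seeds
```

With a large seed range and comparatively few existing seeds, rejections are rare, so the loop finishes in roughly random_seed_count iterations.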

vivarium_cluster_tools.psimulate.branches.calculate_keyspace(branches)
Return type:

dict[str, list[Any]]

Parameters:

branches (list[dict[str, Any]])

vivarium_cluster_tools.psimulate.branches.load_branch_configuration(path)
Return type:

tuple[list[dict[str, Any]], int, int, list[int] | None, list[int] | None]

Parameters:

path (Path)

vivarium_cluster_tools.psimulate.branches.expand_branch_templates(templates)

Expand branch template lists into individual branches.

Takes a list of dictionaries of configuration values (like the ones used in experiment branch configurations) and expands each one. Every value that is a list is treated as a set of alternatives, and one branch is created for each element of the Cartesian product of those lists. Non-list values are copied unchanged into every branch.

For example, this:

{'a': {'b': [1,2], 'c': 3, 'd': [4,5,6]}}

becomes this:

[
    {'a': {'b': 1, 'c': 3, 'd': 4}},
    {'a': {'b': 2, 'c': 3, 'd': 5}},
    {'a': {'b': 1, 'c': 3, 'd': 6}},
    {'a': {'b': 2, 'c': 3, 'd': 4}},
    {'a': {'b': 1, 'c': 3, 'd': 5}},
    {'a': {'b': 2, 'c': 3, 'd': 6}}
]
Return type:

list[dict[str, Any]]

Parameters:

templates (list[dict[str, Any]]) – A list of dictionaries of configuration values, any of which may be lists.

Returns:

A list of dictionaries, each representing a single branch configuration.
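
The expansion can be sketched with itertools.product over the flattened leaves of each template. This is a minimal illustration under stated assumptions, not the library's implementation: the names expand_templates_sketch, _flatten, and _set_nested are hypothetical, and the order of the resulting branches follows itertools.product, which may differ from what the library produces.

```python
from itertools import product
from typing import Any


def _flatten(config: dict, prefix: tuple = ()):
    """Yield (key_path, value) pairs for every non-dict leaf."""
    for key, value in config.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from _flatten(value, path)
        else:
            yield path, value


def _set_nested(root: dict, path: tuple, value: Any) -> None:
    """Write value into root at the nested key path, creating dicts as needed."""
    current = root
    for key in path[:-1]:
        current = current.setdefault(key, {})
    current[path[-1]] = value


def expand_templates_sketch(templates: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical sketch: one branch per combination of list-valued leaves."""
    branches = []
    for template in templates:
        leaves = list(_flatten(template))
        paths = [path for path, _ in leaves]
        # Wrap scalars in a one-element list so product() treats them uniformly.
        options = [v if isinstance(v, list) else [v] for _, v in leaves]
        for combination in product(*options):
            branch: dict[str, Any] = {}
            for path, value in zip(paths, combination):
                _set_nested(branch, path, value)
            branches.append(branch)
    return branches


expanded = expand_templates_sketch([{'a': {'b': [1, 2], 'c': 3, 'd': [4, 5, 6]}}])
```

Here expanded holds six branches, one for each pairing of a 'b' value with a 'd' value, and every branch carries 'c': 3.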

vivarium_cluster_tools.psimulate.branches.validate_artifact_path(artifact_path)

Validates that the path to the data artifact from the branches file exists.

The path specified in the configuration must be absolute.

Return type:

None

Parameters:

artifact_path (str) – The path to the artifact.

Raises:

FileNotFoundError – If the artifact path is not an absolute path or does not exist.
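
The documented behavior (reject a path that is relative or missing) can be approximated with pathlib. The function name check_artifact_path and the error message are assumptions for this sketch; the library's actual checks and wording may differ.

```python
from pathlib import Path


def check_artifact_path(artifact_path: str) -> None:
    """Hypothetical sketch: require an absolute path that exists on disk."""
    path = Path(artifact_path)
    # Both conditions map to the same exception, mirroring the documented
    # "not an absolute path or does not exist" rule.
    if not path.is_absolute() or not path.exists():
        raise FileNotFoundError(
            f"Artifact path must be absolute and must exist: {artifact_path}"
        )
```

A relative path fails this check even when the file exists relative to the current working directory.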