psimulate CLI

Command line interface for psimulate.

psimulate

A command line utility for running many simulations in parallel.

You may initiate a new run with the run sub-command, restart a stopped run from where it left off with the restart sub-command, or add input draws and random seeds to a previous run with the expand sub-command.

psimulate [OPTIONS] COMMAND [ARGS]...
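For example, to list the available sub-commands and their options:

psimulate --help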

expand

Expand a previous run at RESULTS_ROOT by adding input draws and/or random seeds.

Expanding will not erase existing results, but will start workers to perform the additional simulations determined by the added draws/seeds. RESULTS_ROOT is expected to be an output directory from a previous psimulate run invocation.

psimulate expand [OPTIONS] RESULTS_ROOT
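For example, to add ten input draws and five random seeds to a previous run (the results path and project here are illustrative):

psimulate expand /path/to/previous/results --add-draws 10 --add-seeds 5 -P proj_simscience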

Options

--add-draws <add_draws>

The number of input draws to add to a previous run.

Default: 0

--add-seeds <add_seeds>

The number of random seeds to add to a previous run.

Default: 0

--pdb

Drop into the Python debugger if an error occurs.

-v

Configure logging verbosity.

--no-batch

Don't batch results; write them as they come in.

--redis <redis>

Number of Redis databases to use. Defaults to one Redis instance for every 1000 workers.

-w, --max-workers <max_workers>

The maximum number of workers (and therefore jobs) to run concurrently. Defaults to the total number of jobs.

Default: 8000

-h, --hardware <hardware>

The specific hardware (comma-separated) to run the jobs on. This can be useful to request particularly fast nodes ('-h r650xs') or high-capacity nodes ('-h r630,r650,r650v2'). Note that the available hardware changes on a roughly annual schedule. The currently supported options are: ['c6320', 'r630', 'c6420v1', 'c6420v2', 'r650', 'r650v2', 'r650xs']. For details, refer to: https://docs.cluster.ihme.washington.edu/#hpc-execution-host-hardware-specifications

-m, --peak-memory <peak_memory>

The estimated maximum memory usage in GB of an individual simulate job. The simulations will be run with this value as a limit.

Default: 3

-r, --max-runtime <max_runtime>

The estimated maximum runtime (hh:mm:ss) of the simulation jobs. The maximum supported runtime is 3 days. Keep in mind that the session you are launching from must remain alive at least as long as the simulation jobs, and that runtimes vary widely across node types.

Default: 24:00:00

-q, --queue <queue>

The cluster queue to assign psimulate jobs to. Defaults to the appropriate queue based on max-runtime. long.q allows for much longer runtimes, but there may be reasons to send jobs to that queue even when they don't have runtime constraints, such as node availability.

Options: all.q | long.q

-P, --project <project>

Required. The cluster project under which to run the simulation.

Options: proj_simscience | proj_simscience_prod | proj_csu

Arguments

RESULTS_ROOT

Required argument

restart

Restart a parallel simulation from a previous run at RESULTS_ROOT.

Restarting will not erase existing results, but will start workers to perform the remaining simulations. RESULTS_ROOT is expected to be an output directory from a previous psimulate run invocation.

psimulate restart [OPTIONS] RESULTS_ROOT
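For example, to resume an interrupted run using the same cluster project (the results path here is illustrative):

psimulate restart /path/to/previous/results -P proj_simscience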

Options

--pdb

Drop into the Python debugger if an error occurs.

-v

Configure logging verbosity.

--no-batch

Don't batch results; write them as they come in.

--redis <redis>

Number of Redis databases to use. Defaults to one Redis instance for every 1000 workers.

-w, --max-workers <max_workers>

The maximum number of workers (and therefore jobs) to run concurrently. Defaults to the total number of jobs.

Default: 8000

-h, --hardware <hardware>

The specific hardware (comma-separated) to run the jobs on. This can be useful to request particularly fast nodes ('-h r650xs') or high-capacity nodes ('-h r630,r650,r650v2'). Note that the available hardware changes on a roughly annual schedule. The currently supported options are: ['c6320', 'r630', 'c6420v1', 'c6420v2', 'r650', 'r650v2', 'r650xs']. For details, refer to: https://docs.cluster.ihme.washington.edu/#hpc-execution-host-hardware-specifications

-m, --peak-memory <peak_memory>

The estimated maximum memory usage in GB of an individual simulate job. The simulations will be run with this value as a limit.

Default: 3

-r, --max-runtime <max_runtime>

The estimated maximum runtime (hh:mm:ss) of the simulation jobs. The maximum supported runtime is 3 days. Keep in mind that the session you are launching from must remain alive at least as long as the simulation jobs, and that runtimes vary widely across node types.

Default: 24:00:00

-q, --queue <queue>

The cluster queue to assign psimulate jobs to. Defaults to the appropriate queue based on max-runtime. long.q allows for much longer runtimes, but there may be reasons to send jobs to that queue even when they don't have runtime constraints, such as node availability.

Options: all.q | long.q

-P, --project <project>

Required. The cluster project under which to run the simulation.

Options: proj_simscience | proj_simscience_prod | proj_csu

Arguments

RESULTS_ROOT

Required argument

run

Run a parallel simulation.

The simulation itself is defined by a MODEL_SPECIFICATION yaml file and the parameter changes across runs are defined by a BRANCH_CONFIGURATION yaml file.

The path to the data artifact can be provided as an argument here, in the branch configuration file, or in the model specification file. A value provided as a command line argument or in the branch configuration file will override a value specified in the model specification file. If an artifact path is provided both as a command line argument and in the branch configuration file, a ConfigurationError will be raised.

Within the provided or default results directory, a subdirectory will be created with the same name as the MODEL_SPECIFICATION if it does not already exist. Results will be written to a further subdirectory named after the start time of the simulation run.
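For example, a run launched with a model specification named model_spec.yaml and an output directory of /path/to/results (both hypothetical) would write its results under a path of the form:

/path/to/results/model_spec/<start_time>/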

psimulate run [OPTIONS] MODEL_SPECIFICATION BRANCH_CONFIGURATION
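For example, assuming hypothetical paths for the model specification, branch configuration, artifact, and output directory:

psimulate run model_spec.yaml branches.yaml -i /path/to/artifact.hdf -o /path/to/results -P proj_simscience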

Options

-i, --artifact_path <artifact_path>

The path to the artifact data file.

-o, --result-directory <result_directory>

Required. The directory to write results to. A folder with the same name as the model specification file will be created in this directory.

--pdb

Drop into the Python debugger if an error occurs.

-v

Configure logging verbosity.

--no-batch

Don't batch results; write them as they come in.

--redis <redis>

Number of Redis databases to use. Defaults to one Redis instance for every 1000 workers.

-w, --max-workers <max_workers>

The maximum number of workers (and therefore jobs) to run concurrently. Defaults to the total number of jobs.

Default: 8000

-h, --hardware <hardware>

The specific hardware (comma-separated) to run the jobs on. This can be useful to request particularly fast nodes ('-h r650xs') or high-capacity nodes ('-h r630,r650,r650v2'). Note that the available hardware changes on a roughly annual schedule. The currently supported options are: ['c6320', 'r630', 'c6420v1', 'c6420v2', 'r650', 'r650v2', 'r650xs']. For details, refer to: https://docs.cluster.ihme.washington.edu/#hpc-execution-host-hardware-specifications

-m, --peak-memory <peak_memory>

The estimated maximum memory usage in GB of an individual simulate job. The simulations will be run with this value as a limit.

Default: 3

-r, --max-runtime <max_runtime>

The estimated maximum runtime (hh:mm:ss) of the simulation jobs. The maximum supported runtime is 3 days. Keep in mind that the session you are launching from must remain alive at least as long as the simulation jobs, and that runtimes vary widely across node types.

Default: 24:00:00

-q, --queue <queue>

The cluster queue to assign psimulate jobs to. Defaults to the appropriate queue based on max-runtime. long.q allows for much longer runtimes, but there may be reasons to send jobs to that queue even when they don't have runtime constraints, such as node availability.

Options: all.q | long.q

-P, --project <project>

Required. The cluster project under which to run the simulation.

Options: proj_simscience | proj_simscience_prod | proj_csu

Arguments

MODEL_SPECIFICATION

Required argument

BRANCH_CONFIGURATION

Required argument

test

Run a psimulate load test. TEST_TYPE selects the test workload (sleep or large_results).

psimulate test [OPTIONS] {sleep|large_results}
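For example, to run the sleep load test with 500 workers, writing to the default result directory (the project here is illustrative):

psimulate test sleep -n 500 -P proj_simscience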

Options

-n, --num-workers <num_workers>

Default: 1000

-o, --result-directory <result_directory>

Default: /mnt/team/simulation_science/priv/engineering/load_tests

--pdb

Drop into the Python debugger if an error occurs.

-v

Configure logging verbosity.

--no-batch

Don't batch results; write them as they come in.

--redis <redis>

Number of Redis databases to use. Defaults to one Redis instance for every 1000 workers.

-w, --max-workers <max_workers>

The maximum number of workers (and therefore jobs) to run concurrently. Defaults to the total number of jobs.

Default: 8000

-h, --hardware <hardware>

The specific hardware (comma-separated) to run the jobs on. This can be useful to request particularly fast nodes ('-h r650xs') or high-capacity nodes ('-h r630,r650,r650v2'). Note that the available hardware changes on a roughly annual schedule. The currently supported options are: ['c6320', 'r630', 'c6420v1', 'c6420v2', 'r650', 'r650v2', 'r650xs']. For details, refer to: https://docs.cluster.ihme.washington.edu/#hpc-execution-host-hardware-specifications

-m, --peak-memory <peak_memory>

The estimated maximum memory usage in GB of an individual simulate job. The simulations will be run with this value as a limit.

Default: 3

-r, --max-runtime <max_runtime>

The estimated maximum runtime (hh:mm:ss) of the simulation jobs. The maximum supported runtime is 3 days. Keep in mind that the session you are launching from must remain alive at least as long as the simulation jobs, and that runtimes vary widely across node types.

Default: 24:00:00

-q, --queue <queue>

The cluster queue to assign psimulate jobs to. Defaults to the appropriate queue based on max-runtime. long.q allows for much longer runtimes, but there may be reasons to send jobs to that queue even when they don't have runtime constraints, such as node availability.

Options: all.q | long.q

-P, --project <project>

Required. The cluster project under which to run the simulation.

Options: proj_simscience | proj_simscience_prod | proj_csu

Arguments

TEST_TYPE

Required argument