Cluster Interface

class vivarium.cluster_tools.core.cluster.interface.NativeSpecification(job_name, project, queue, peak_memory, max_runtime, hardware, cores, requires_archive_node)
Parameters:
job_name: str

Alias for field number 0

project: str

Alias for field number 1

queue: str

Alias for field number 2

peak_memory: float

Alias for field number 3

max_runtime: str

Alias for field number 4

hardware: list[str]

Alias for field number 5

cores: int

Number of CPU cores to request from SLURM. Default is 1.

requires_archive_node: bool

Whether the task must land on a node tagged with the SLURM archive feature.

to_jobmon_spec(worker_logging_root)

Build the Jobmon compute resources dict from this NativeSpecification.

Return type:

dict[str, Any]

Parameters:

worker_logging_root (Path) – Root directory for worker logs.

Returns:

Dictionary of compute resources for Jobmon.

Notes

  • memory is passed in GB because the Jobmon SLURM plugin performs its own GB → MB conversion internally.

  • constraints is a SLURM --constraint expression built from hardware and requires_archive_node. The hardware group is always parenthesized and pipe-joined (OR); archive is AND-joined when required. Examples: "(r650)", "(r650|r650v2)", "(r650)&archive", "(r650|r650v2)&archive", "archive". The key is omitted when neither is set.

  • standard_output and standard_error route SLURM stdout/stderr to the cluster logs directory. The Jobmon SLURM plugin appends the task name and SLURM job ID to these paths automatically.
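The constraint rules in the notes above can be sketched as a small standalone function. This is only an illustration of the documented behavior, not the library's code: build_constraints is a hypothetical name, and returning None stands in for "the key is omitted".

```python
from __future__ import annotations


def build_constraints(hardware: list[str], requires_archive_node: bool) -> str | None:
    """Hypothetical helper that mirrors the documented constraint rules."""
    parts = []
    if hardware:
        # The hardware group is always parenthesized and pipe-joined (OR).
        parts.append("(" + "|".join(hardware) + ")")
    if requires_archive_node:
        # The archive feature is AND-joined to whatever precedes it.
        parts.append("archive")
    # None signals that the constraints key should be left out entirely.
    return "&".join(parts) if parts else None


# Inputs taken from the examples listed in the notes.
cases = [
    (["r650"], False),
    (["r650", "r650v2"], False),
    (["r650"], True),
    (["r650", "r650v2"], True),
    ([], True),
    ([], False),
]
for hardware, archive in cases:
    print(hardware, archive, "->", build_constraints(hardware, archive))
```

Collecting the two groups in a list and joining them with "&" keeps the archive-only case free of a stray leading "&".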

vivarium.cluster_tools.core.cluster.interface.get_workflow_timeout_seconds()

Get the Jobmon workflow’s timeout in seconds.

The result includes a small buffer so that the workflow can shut down gracefully before SLURM terminates the runner node.

Return type:

int | None

Returns:

Remaining time in seconds (with buffer subtracted), or None if there is no SLURM allocation (in which case Jobmon’s built-in default timeout is used).

Raises:

RuntimeError – If the remaining time is less than the safety buffer, or if the remaining time cannot be determined from squeue.
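The buffer arithmetic described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names parse_time_left and timeout_with_buffer are hypothetical, the 300-second buffer in the usage line is an arbitrary value (the actual buffer size is not stated here), and the input is assumed to be a time-left string in SLURM's [days-][hours:]minutes:seconds form. The real function may obtain and parse the remaining time differently.

```python
from __future__ import annotations

import re

# Matches "MM:SS", "HH:MM:SS", and "D-HH:MM:SS" (assumed input format).
_TIME_LEFT = re.compile(r"^(?:(\d+)-)?(?:(\d+):)?(\d+):(\d+)$")


def parse_time_left(text: str) -> int:
    """Convert a time-left string to seconds, or raise RuntimeError."""
    match = _TIME_LEFT.match(text.strip())
    if match is None:
        # Covers values such as "UNLIMITED" or empty output.
        raise RuntimeError(f"Cannot determine remaining time from {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def timeout_with_buffer(time_left: str | None, buffer_seconds: int) -> int | None:
    """Hypothetical helper: subtract a safety buffer from the remaining time."""
    if time_left is None:
        # No SLURM allocation: the caller falls back to Jobmon's default timeout.
        return None
    remaining = parse_time_left(time_left)
    if remaining < buffer_seconds:
        raise RuntimeError(
            f"Only {remaining}s remain, which is less than the {buffer_seconds}s buffer"
        )
    return remaining - buffer_seconds


# One hour left with an assumed 300-second buffer.
print(timeout_with_buffer("1:00:00", buffer_seconds=300))
```

Raising instead of returning a very small or negative timeout makes a nearly expired allocation fail early, which matches the documented RuntimeError cases.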