Workflow Configuration


vivarium.cluster_tools.dagger.config.config.REQUIRED_WORKFLOW_FIELDS = {'steps'}

Top-level workflow fields that must appear in the YAML file and cannot be supplied through a CLI override.

name, project, queue, and output_directory are each required overall but may be provided via either the YAML or a CLI override; their presence is validated by load_workflow_config() rather than by WorkflowConfig.parse_yaml_file().
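The split between YAML-only fields and fields accepted from either source can be sketched as below. This is a minimal illustration, not the library's implementation: resolve_workflow_fields and the cli_overrides dictionary are hypothetical names, the queue name is a placeholder, and the rule that a non-None CLI value replaces the YAML value is an assumption.

```python
from typing import Any

# Mirrors the documented constant; everything else is a hypothetical stand-in.
REQUIRED_WORKFLOW_FIELDS = {"steps"}
# Fields documented as acceptable from either the YAML or a CLI override.
EITHER_SOURCE_FIELDS = ("name", "project", "queue", "output_directory")


def resolve_workflow_fields(
    yaml_workflow: dict[str, Any], cli_overrides: dict[str, Any]
) -> dict[str, Any]:
    """Merge CLI overrides onto YAML values and check that nothing is missing."""
    missing_yaml = REQUIRED_WORKFLOW_FIELDS - set(yaml_workflow)
    if missing_yaml:
        raise ValueError(f"Missing from YAML: {sorted(missing_yaml)}")

    merged = dict(yaml_workflow)
    # Assumed precedence: a CLI value that is not None replaces the YAML value.
    merged.update({k: v for k, v in cli_overrides.items() if v is not None})

    missing = [f for f in EITHER_SOURCE_FIELDS if merged.get(f) is None]
    if missing:
        raise ValueError(f"Missing from both YAML and CLI: {missing}")
    return merged


resolved = resolve_workflow_fields(
    {"name": "demo", "project": "proj_simscience", "steps": [{"name": "s1"}]},
    {"queue": "all.q", "output_directory": "results", "project": None},
)
```

Here name and project come from the YAML, while queue and output_directory come from the overrides; omitting any of the four from both sources raises ValueError.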

vivarium.cluster_tools.dagger.config.config.DEFAULT_BACKUP_FREQ_SECONDS = 1800.0

Default backup frequency in seconds (30 minutes), matching psimulate run.

class vivarium.cluster_tools.dagger.config.config.ResourceConfig(memory_gb, project=None, queue=None, runtime='01:00:00', cores=1, hardware=None, requires_archive_node=False)[source]

Compute resource specification for a workflow step.

Parameters:
  • memory_gb (int)

  • project (str | None)

  • queue (str | None)

  • runtime (str)

  • cores (int)

  • hardware (list[str] | None)

  • requires_archive_node (bool)

memory_gb: int

Memory in GB.

project: str | None = None

Cluster project to charge. Falls back to the workflow-level project.

queue: str | None = None

Cluster queue to submit to. Falls back to the workflow-level queue.

runtime: str = '01:00:00'

Maximum runtime in hh:mm:ss format. Default is 01:00:00.
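To illustrate the hh:mm:ss format, the sketch below converts a runtime string to seconds. runtime_to_seconds is a hypothetical helper, not part of this module, and whether the library accepts hour values longer than two digits is an assumption here.

```python
import re


def runtime_to_seconds(runtime: str) -> int:
    """Convert an hh:mm:ss string such as '01:00:00' to a number of seconds."""
    # Assumption: hours may have any number of digits; minutes and seconds are 00-59.
    match = re.fullmatch(r"(\d+):([0-5]\d):([0-5]\d)", runtime)
    if match is None:
        raise ValueError(f"Expected hh:mm:ss, got {runtime!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


default_seconds = runtime_to_seconds("01:00:00")
```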

cores: int = 1

Number of CPU cores to request. Default is 1.

hardware: list[str] | None = None

Optional list of hardware types to target (e.g. ["r650", "r650v2"]).

requires_archive_node: bool = False

Whether the step's jobs must run on an archive node.

classmethod from_dict(data, *, workflow_project=None, workflow_queue=None)[source]

Create a ResourceConfig from a dictionary.

Step-level values take precedence; workflow-level defaults fill in any that are absent.

Return type:

ResourceConfig

Parameters:
  • data (dict[str, Any]) – Resource dictionary from a step’s resources section.

  • workflow_project (str | None) – Workflow-level project used as fallback.

  • workflow_queue (str | None) – Workflow-level queue used as fallback.
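The fallback rule can be sketched with a reduced stand-in class. ResourceFallbackSketch is hypothetical and carries only three of the real fields; treating an explicit None the same as an absent key is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class ResourceFallbackSketch:
    """Hypothetical stand-in for ResourceConfig with a reduced field set."""

    memory_gb: int
    project: str | None = None
    queue: str | None = None

    @classmethod
    def from_dict(
        cls,
        data: dict[str, Any],
        *,
        workflow_project: str | None = None,
        workflow_queue: str | None = None,
    ) -> ResourceFallbackSketch:
        # Step-level value wins; the workflow-level value fills in when absent.
        project = data.get("project")
        if project is None:
            project = workflow_project
        queue = data.get("queue")
        if queue is None:
            queue = workflow_queue
        return cls(memory_gb=data["memory_gb"], project=project, queue=queue)


resources = ResourceFallbackSketch.from_dict(
    {"memory_gb": 8, "queue": "long.q"},  # placeholder queue names
    workflow_project="proj_simscience",
    workflow_queue="all.q",
)
```

The step supplies its own queue, so the workflow-level queue is ignored, while project falls back to the workflow-level value.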

to_dict()[source]

Serialize to a dictionary, omitting None values and defaults.

Return type:

dict[str, Any]
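One way to omit None values and defaults is to compare each field against its declared dataclass default. The sketch below uses a hypothetical ResourceSerializeSketch class with a subset of the documented fields; the real method may be implemented differently.

```python
from __future__ import annotations

from dataclasses import MISSING, dataclass, fields
from typing import Any


@dataclass
class ResourceSerializeSketch:
    """Hypothetical stand-in for ResourceConfig with a reduced field set."""

    memory_gb: int
    project: str | None = None
    queue: str | None = None
    runtime: str = "01:00:00"
    cores: int = 1

    def to_dict(self) -> dict[str, Any]:
        result: dict[str, Any] = {}
        for field in fields(self):
            value = getattr(self, field.name)
            if value is None:
                continue  # drop unset optional values
            if field.default is not MISSING and value == field.default:
                continue  # drop values equal to the declared default
            result[field.name] = value
        return result


serialized = ResourceSerializeSketch(memory_gb=4, queue="all.q", cores=2).to_dict()
```

memory_gb has no default and is always kept; runtime is dropped because it equals its default.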

to_native_specification(job_name)[source]

Convert to a NativeSpecification for Jobmon task submission.

Return type:

NativeSpecification

Parameters:

job_name (str) – The SLURM job name for this step’s tasks.

class vivarium.cluster_tools.dagger.config.config.ParsedStep(step_type, name, api_kwargs)[source]

A parsed workflow step ready to be passed to an interface API function.

Produced by parse_step_from_yaml(). Holds the inputs to the matching get_*_step_tasks function (in api_kwargs) plus the step type tag used to dispatch task building and YAML serialization.

Parameters:
  • step_type (str)

  • name (str)

  • api_kwargs (dict[str, Any])

step_type: str

One of "bash", "simulation", "pytest", "python", "notebook".

name: str

The step’s unique name within the workflow.

api_kwargs: dict[str, Any]

Kwargs ready to send into the matching interface API function. Excludes tool and is_resume, which are supplied by the builder.
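Dispatching on step_type might look like the sketch below. All names are hypothetical: StepSketch stands in for ParsedStep, the builders are placeholders for the get_*_step_tasks functions, and the "command" key in api_kwargs is invented for illustration. The point is only that tool and is_resume are added by the caller rather than stored on the step.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class StepSketch:
    """Hypothetical stand-in for ParsedStep."""

    step_type: str
    name: str
    api_kwargs: dict[str, Any]


def _placeholder_builder(label: str) -> Callable[..., dict[str, Any]]:
    """Return a fake builder that records how it was called."""

    def build(*, tool: Any, is_resume: bool, **api_kwargs: Any) -> dict[str, Any]:
        return {"builder": label, "is_resume": is_resume, "kwargs": api_kwargs}

    return build


# One placeholder builder per documented step type.
BUILDERS = {
    step_type: _placeholder_builder(step_type)
    for step_type in ("bash", "simulation", "pytest", "python", "notebook")
}


def build_tasks(step: StepSketch, *, tool: Any, is_resume: bool) -> dict[str, Any]:
    """Look up the builder for the step type and pass the stored kwargs through."""
    try:
        builder = BUILDERS[step.step_type]
    except KeyError:
        raise ValueError(f"Unknown step type: {step.step_type!r}") from None
    return builder(tool=tool, is_resume=is_resume, **step.api_kwargs)


step = StepSketch("bash", "say_hello", {"command": "echo hi"})
call = build_tasks(step, tool=object(), is_resume=False)
```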

class vivarium.cluster_tools.dagger.config.config.WorkflowConfig(name, project, queue, output_directory, default_environment, steps, max_attempts=2)[source]

Parsed and validated workflow configuration.

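Parameters:
  • name (str)

  • project (str)

  • queue (str)

  • output_directory (Path)

  • default_environment (str | None)

  • steps (list[ParsedStep])

  • max_attempts (int)

name: str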

Name of the workflow. This is the name displayed in Jobmon.

project: str

Project that this workflow will be run under. E.g. ‘proj_simscience’.

queue: str

Queue to submit the workflow to.

output_directory: Path

Directory where workflow outputs will be stored. Both relative and absolute paths are accepted.

default_environment: str | None

Default environment to use for steps that do not specify one.

steps: list[ParsedStep]

Parsed workflow steps, each carrying the kwargs needed by the matching interface API function.

max_attempts: int = 2

Maximum number of Jobmon task attempts. Default is 2.

static parse_yaml_file(path)[source]

Read and perform basic structural validation on a workflow YAML file.

Returns the workflow dict found under the top-level workflow key.

Return type:

dict[str, Any]

Parameters:

path (Path) – Path to the YAML file.

Raises:

ValueError – If the file does not contain a top-level workflow key, if required workflow-level fields are missing, or if the workflow ‘steps’ list is empty.

Returns:

The raw workflow dictionary from the YAML file, without any further parsing or validation.
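The three documented failure conditions can be sketched as checks on an already-loaded document. extract_workflow is a hypothetical function, and it assumes the file has been read into a Python object first (for example with yaml.safe_load from PyYAML); the error messages are illustrative.

```python
from typing import Any

# Mirrors the documented constant.
REQUIRED_WORKFLOW_FIELDS = {"steps"}


def extract_workflow(raw: Any) -> dict[str, Any]:
    """Return the dict under the top-level 'workflow' key after structural checks."""
    if not isinstance(raw, dict) or "workflow" not in raw:
        raise ValueError("File must contain a top-level 'workflow' key.")
    workflow = raw["workflow"]
    if not isinstance(workflow, dict):
        raise ValueError("'workflow' must be a mapping.")
    missing = REQUIRED_WORKFLOW_FIELDS - set(workflow)
    if missing:
        raise ValueError(f"Missing required workflow fields: {sorted(missing)}")
    if not workflow["steps"]:
        raise ValueError("Workflow 'steps' list is empty.")
    # No further parsing: the caller receives the raw dictionary.
    return workflow


loaded = {"workflow": {"name": "demo", "steps": [{"name": "s1"}]}}
workflow = extract_workflow(loaded)
```

Fields such as name or queue are deliberately not checked here, since the documentation assigns that validation to load_workflow_config().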