Results Writing
Simple per-task result writing. The workflow script serializes metadata JSON files for the worker to pick up. Each worker writes one parquet file per metric directly to the results directory.
Directory structure:
results/
    metadata/
        {task_id}.json
    {metric_name}/
        {task_id}.parquet
Reading all results for a metric is simply pd.read_parquet(results_dir / metric_name),
which automatically combines all parquet files in the directory.
Task completion is determined by the existence of result parquet files. Metadata for completed tasks is read from the metadata JSON files in the metadata directory.
- vivarium_cluster_tools.psimulate.results.writing.write_metadata(metadata_dir, job_parameters)[source]
Write a metadata JSON file for a single task.
The metadata file serializes the job parameters for the workhorse script to pick up, and also serves as the reference for restart and expand metadata.
- Return type:
- Parameters:
metadata_dir (Path) – Directory to write the metadata file.
job_parameters (JobParameters) – The job parameters for this task.
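As a rough illustration of the one-file-per-task convention, the following minimal sketch writes a `{task_id}.json` file into a metadata directory using only the standard library. It is not the library's implementation: the helper name `write_metadata_sketch`, the explicit `task_id` argument, and the parameter keys are hypothetical stand-ins, since the fields of `JobParameters` are not described here.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_sketch(metadata_dir: Path, task_id: str, parameters: dict) -> Path:
    """Write one JSON file per task, named after the task ID (hypothetical helper)."""
    metadata_dir.mkdir(parents=True, exist_ok=True)
    path = metadata_dir / f"{task_id}.json"
    path.write_text(json.dumps(parameters, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp:
    metadata_dir = Path(tmp) / "metadata"
    # Hypothetical parameter names; the real JobParameters fields may differ.
    params = {"input_draw": 3, "random_seed": 42}
    written = write_metadata_sketch(metadata_dir, "task_0001", params)
    written_name = written.name
    # Reading the file back yields the same parameters that were serialized.
    round_tripped = json.loads(written.read_text())
```

Keeping one small file per task means a restart only needs to list the directory, with no shared file to lock or merge.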
- vivarium_cluster_tools.psimulate.results.writing.write_task_results(results_dir, job_parameters, results_dict)[source]
Write a single task’s results directly to the results directory.
- Return type:
- Parameters:
results_dir (Path) – The results directory (e.g., output_root/results).
job_parameters (JobParameters) – The job parameters for this task.
results_dict (dict[str, DataFrame]) – Dictionary mapping metric names to results DataFrames.
- vivarium_cluster_tools.psimulate.results.writing.get_completed_task_ids(results_dir)[source]
Get task IDs that have result parquet files.
Scans all subdirectories of results_dir for .parquet files and extracts the task IDs from their filenames (stems).
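The scan described above can be approximated with `pathlib`. This is a sketch under assumptions, not the actual function body: the name `completed_task_ids_sketch` is hypothetical, and returning a `set` is an assumption, since the return type is not documented here.

```python
import tempfile
from pathlib import Path


def completed_task_ids_sketch(results_dir: Path) -> set[str]:
    """Collect the stems of all .parquet files one level below results_dir."""
    task_ids: set[str] = set()
    if not results_dir.is_dir():
        return task_ids
    for subdir in results_dir.iterdir():
        if subdir.is_dir():
            # A task writing several metrics appears once, because stems are deduplicated.
            task_ids.update(p.stem for p in subdir.glob("*.parquet"))
    return task_ids


with tempfile.TemporaryDirectory() as tmp:
    results_dir = Path(tmp) / "results"
    # Hypothetical metric and task names; empty files stand in for real parquet output.
    for rel in ("deaths/task_a.parquet", "ylls/task_a.parquet", "deaths/task_b.parquet"):
        target = results_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    # A JSON file is ignored because only .parquet files count as results.
    (results_dir / "deaths" / "notes.json").touch()
    completed = completed_task_ids_sketch(results_dir)
```

In this sketch a task counts as completed as soon as any one of its metric files exists, which matches the "existence of result parquet files" rule only loosely. Whether the real function requires every metric to be present is not stated.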
- vivarium_cluster_tools.psimulate.results.writing.collect_metadata(metadata_dir, results_dir)[source]
Collect metadata for completed tasks.
Determines which tasks completed by scanning for result parquet files in results_dir, then reads the corresponding metadata JSON files from metadata_dir to build the metadata DataFrame.
- Return type:
DataFrame
- Parameters:
- Returns:
Combined metadata DataFrame with flattened job-specific parameters, or an empty DataFrame if no completed tasks exist.
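The two-step flow (find completed tasks, then load their metadata) might look roughly like the sketch below. It uses only the standard library and returns a list of flat dictionaries rather than a DataFrame. The function name, the added `task_id` key, the dotted-key flattening scheme, and all sample field names are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def collect_metadata_sketch(metadata_dir: Path, results_dir: Path) -> list[dict]:
    """Return one flat dict per task that has at least one result parquet file."""
    completed = {
        p.stem
        for subdir in results_dir.iterdir() if subdir.is_dir()
        for p in subdir.glob("*.parquet")
    }
    rows = []
    for task_id in sorted(completed):
        metadata_file = metadata_dir / f"{task_id}.json"
        if not metadata_file.exists():
            continue  # Skipping tasks without metadata is an assumption of this sketch.
        raw = json.loads(metadata_file.read_text())
        row = {"task_id": task_id}
        for key, value in raw.items():
            if isinstance(value, dict):
                # Flatten one level of nesting into dotted keys (hypothetical scheme).
                row.update({f"{key}.{k}": v for k, v in value.items()})
            else:
                row[key] = value
        rows.append(row)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    metadata_dir = Path(tmp) / "metadata"
    results_dir = Path(tmp) / "results"
    (results_dir / "deaths").mkdir(parents=True)
    metadata_dir.mkdir()
    (results_dir / "deaths" / "task_a.parquet").touch()  # task_a completed
    (metadata_dir / "task_a.json").write_text(
        json.dumps({"input_draw": 1, "branch": {"scenario": "baseline"}})
    )
    # task_b has metadata but no results, so it is left out.
    (metadata_dir / "task_b.json").write_text(json.dumps({"input_draw": 2}))
    rows = collect_metadata_sketch(metadata_dir, results_dir)
```

A list of flat dictionaries like `rows` can be passed to `pd.DataFrame(rows)`, and an empty list gives an empty DataFrame, which mirrors the documented behavior when no completed tasks exist.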