<no title> — vivarium_cluster_tools 3.1.6 documentation

vivarium_cluster_tools.psimulate.performance_logger.transform_perf_df_for_appending(perf_df, output_paths)[source]

Transform performance DataFrame for appending to central logs.

Takes the performance DataFrame from the performance report and (1) turns its index into columns so it can be written to CSV, (2) adds an artifact name column, and (3) aggregates the scenario information into a single column.

Return type:

DataFrame

Parameters:

perf_df (DataFrame) – DataFrame pulled from performance report with index values uniquely identifying each child job and column values containing their performance data.
output_paths (OutputPaths) – OutputPaths object containing information about the results directory.

Returns:

The transformed DataFrame, which can be appended directly to our central logs. It has a simple RangeIndex, the former index values as columns, a new artifact name column, and a new scenario parameters column.
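
The three steps described above can be sketched with plain pandas. This is a minimal illustration and not the library's implementation: the function name transform_sketch, the column names artifact_name and scenario_parameters, the scenario_ column prefix, and the choice of a JSON string for the aggregated column are all assumptions made for the example. The artifact name is passed in directly here because how it is derived from output_paths is not documented on this page.

```python
import json

import pandas as pd


def transform_sketch(
    perf_df: pd.DataFrame,
    artifact_name: str,
    scenario_prefix: str = "scenario_",  # hypothetical naming convention
) -> pd.DataFrame:
    # 1) Move every index level into an ordinary column. The result has a
    #    plain RangeIndex, so it can be written to CSV without losing the IDs.
    out = perf_df.reset_index()

    # 2) Add the artifact name as a constant column (hypothetical column name).
    out["artifact_name"] = artifact_name

    # 3) Collapse all scenario columns into a single JSON-string column.
    scenario_cols = [c for c in out.columns if str(c).startswith(scenario_prefix)]
    records = out[scenario_cols].to_dict(orient="records")
    out["scenario_parameters"] = [json.dumps(rec, default=str) for rec in records]

    # Drop the now-redundant per-scenario columns.
    return out.drop(columns=scenario_cols)
```

Called on a frame indexed by, say, draw and seed, the sketch returns one row per child job with those two identifiers as regular columns next to the performance data.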

vivarium_cluster_tools.psimulate.performance_logger.append_child_job_data(child_job_performance_data)[source]

Append child job data to the central logs and return the name of the first file containing this data.

Return type:

str

Parameters:

child_job_performance_data (DataFrame) – DataFrame pulled from transform_perf_df_for_appending.

Returns:

The first file in our central logs containing child job data.

vivarium_cluster_tools.psimulate.performance_logger.generate_runner_job_data(job_number, output_paths, first_file_with_data)[source]

Create runner job data to append to central logs.

Return type:

DataFrame

Parameters:

job_number (int) – The job number for our runner job.
output_paths (OutputPaths) – OutputPaths object containing information about the results directory.
first_file_with_data (str) – The first file in our central logs containing child job data launched by our runner job.

vivarium_cluster_tools.psimulate.performance_logger.append_perf_data_to_central_logs(perf_df, output_paths)[source]

Append performance data to the central logs.

This consists of child job data and runner job data. The child job data contains performance and identifying information for each child job. The runner data describes the runner job that launched these child jobs.

Return type:

None

Parameters:

perf_df (DataFrame) – DataFrame pulled from performance report.
output_paths (OutputPaths) – OutputPaths object containing information about the results directory.
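
One way the functions on this page could fit together is sketched below. It is an assumption based only on the signatures and descriptions above, not the library's actual code. The stages are passed in as callables so the sketch runs without the library; append_runner is a hypothetical stand-in for whatever writes the runner row, and job_number is taken as an explicit argument because this page does not say where the real function obtains it.

```python
from typing import Any, Callable


def append_perf_data_sketch(
    perf_df: Any,
    output_paths: Any,
    job_number: int,  # assumption: the documented function takes no such argument
    *,
    transform: Callable[[Any, Any], Any],
    append_child: Callable[[Any], str],
    make_runner_data: Callable[[int, Any, str], Any],
    append_runner: Callable[[Any], None],  # hypothetical helper
) -> None:
    # Reshape the performance report into the central-log layout.
    child_data = transform(perf_df, output_paths)

    # Append the child job rows and remember the first file that received them.
    first_file = append_child(child_data)

    # Build the runner job row, which points back at that first file.
    runner_data = make_runner_data(job_number, output_paths, first_file)

    # Append the runner job row to the central logs.
    append_runner(runner_data)
```

Passing first_file into the runner stage matches the documented first_file_with_data parameter of generate_runner_job_data, which links a runner job to the child job data it launched.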