Performance Reporting

Tools for summarizing and reporting performance information.

class vivarium.cluster_tools.vipin.perf_report.PerformanceSummary(log_dir)

Provides access to the data in the workers’ performance logs.

Given the Path of a log directory, a PerformanceSummary instance provides a generator that yields each entry in the workers’ performance logs, and a method that returns all entries as a pd.DataFrame. The class is intended to be used as a singleton that provides data about a single Vivarium simulation run.

Parameters:

log_dir (Path)

log_dir

Path of the directory that contains the performance logs

errors

Number of errors encountered while parsing logs

get_summaries()

Generator that yields every performance summary log message available to this PerformanceSummary.

Return type:

Generator[DataFrame, None, None]
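
To illustrate the generator-plus-error-counter design described above, here is a minimal, stdlib-only sketch. It is not the vivarium implementation: the class name SketchSummary is hypothetical, entries are yielded as plain dicts rather than DataFrames, and it assumes each telemetry line is a standalone JSON object, which the real log format may not guarantee. The two patterns are copied from the class attributes in this reference.

```python
import json
import re
import tempfile
from pathlib import Path
from typing import Iterator

# Patterns copied verbatim from the documented class attributes.
TELEMETRY_PATTERN = re.compile('^{\\"host\\".+\\"job_number\\".+}$')
PERF_LOG_PATTERN = re.compile('^perf\\.[0-9a-f]{16}\\.log$')


class SketchSummary:
    """Hypothetical stand-in for PerformanceSummary (yields dicts, not DataFrames)."""

    def __init__(self, log_dir: Path) -> None:
        self.log_dir = log_dir
        self.errors = 0  # number of telemetry lines that failed to parse

    def get_summaries(self) -> Iterator[dict]:
        for log_path in sorted(self.log_dir.iterdir()):
            # Skip files that are not performance logs.
            if not PERF_LOG_PATTERN.fullmatch(log_path.name):
                continue
            with log_path.open() as log_file:
                for raw_line in log_file:
                    line = raw_line.strip()
                    # Skip lines that do not look like telemetry.
                    if not TELEMETRY_PATTERN.fullmatch(line):
                        continue
                    try:
                        yield json.loads(line)
                    except json.JSONDecodeError:
                        self.errors += 1


# Build a throwaway log directory with made-up contents and read it back.
with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    (log_dir / "perf.0123456789abcdef.log").write_text(
        '{"host": "node01", "job_number": 1, "exec_time": 2.5}\n'
        "plain text line\n"
        '{"host": "node02", "job_number": 2, broken}\n'
    )
    (log_dir / "other.log").write_text('{"host": "node03", "job_number": 9}\n')

    summary = SketchSummary(log_dir)
    entries = list(summary.get_summaries())

print(entries)
print("parse errors:", summary.errors)
```

Because the generator is lazy, the error count is only complete once the generator has been fully consumed, which is why the sketch materializes it with list() before reading errors.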

to_df()

Return all entries in the workers’ performance logs as a DataFrame.

Return type:

DataFrame

TELEMETRY_PATTERN = re.compile('^{\\"host\\".+\\"job_number\\".+}$')

PERF_LOG_PATTERN = re.compile('^perf\\.[0-9a-f]{16}\\.log$')

clean_perf_logs()

Remove all performance logs from the log_dir (after to_df has been called)

Return type:

None
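
As a quick check of what the TELEMETRY_PATTERN and PERF_LOG_PATTERN class attributes accept, the sketch below applies them to a few strings using only the standard re module. The patterns are copied from this reference; the file names and log lines are made up for illustration.

```python
import re

# Patterns copied verbatim from the documented class attributes.
TELEMETRY_PATTERN = re.compile('^{\\"host\\".+\\"job_number\\".+}$')
PERF_LOG_PATTERN = re.compile('^perf\\.[0-9a-f]{16}\\.log$')

# Hypothetical file names.
file_names = [
    "perf.0123456789abcdef.log",  # 16 lowercase hex digits: matches
    "perf.0123456789ABCDEF.log",  # uppercase hex digits: no match
    "perf.abc.log",               # too few hex digits: no match
    "worker.log",                 # wrong prefix: no match
]
perf_logs = [name for name in file_names if PERF_LOG_PATTERN.fullmatch(name)]

# Hypothetical log lines.
lines = [
    '{"host": "node01", "job_number": 7, "exec_time": 1.5}',  # matches
    '{"job_number": 7, "host": "node01"}',  # "host" is not the first key: no match
    "INFO worker started",                  # not telemetry: no match
]
telemetry = [line for line in lines if TELEMETRY_PATTERN.fullmatch(line)]

print(perf_logs)
print(telemetry)
```

Note that the telemetry pattern is order-sensitive: a line must start with the "host" key and contain "job_number" later on.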

vivarium.cluster_tools.vipin.perf_report.set_index_scenario_cols(perf_df)

Get the columns that are useful for indexing the performance data.

Return type:

tuple[DataFrame, list[str]]

Parameters:

perf_df (DataFrame)

vivarium.cluster_tools.vipin.perf_report.print_stat_report(perf_df, scenario_cols)

Print some helpful stats from the performance data.

The stats are grouped by scenario_cols.

Return type:

None

Parameters:
  • perf_df (DataFrame)

  • scenario_cols (list[str])
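
The idea of grouping statistics by scenario_cols can be sketched without pandas as follows. This is an illustration only and not what print_stat_report does internally: the helper grouped_stats, the column names scenario and exec_time, and the choice of statistics are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical performance rows; the column names are illustrative only.
rows = [
    {"scenario": "baseline", "exec_time": 10.0},
    {"scenario": "baseline", "exec_time": 14.0},
    {"scenario": "treatment", "exec_time": 20.0},
]
scenario_cols = ["scenario"]


def grouped_stats(rows, scenario_cols, value_col):
    """Group rows by the scenario columns and summarize one numeric column."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in scenario_cols)
        groups[key].append(row[value_col])
    return {
        key: {
            "count": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


stats = grouped_stats(rows, scenario_cols, "exec_time")
for key, group_summary in sorted(stats.items()):
    print(key, group_summary)
```

With a DataFrame, the same grouping would typically be expressed with a groupby over scenario_cols.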

vivarium.cluster_tools.vipin.perf_report.report_performance(input_directory, output_directory, output_hdf, verbose)

Main method for vipin reporting.

Gets job performance data, outputs to a file, and logs a report.

Return type:

DataFrame | None

Parameters: