Rich Table ReportπŸ”—

Rich Tables are a popular way to display benchmark results in a clear and concise manner. SimpleBench leverages the rich library to generate these tables, providing visually appealing and easy-to-read reports.

Note

Rich Table reports are just one of several reporting options available in SimpleBench. You can also generate CSV reports, graph reports, and JSON reports, each providing different perspectives on your benchmark results.

Refer to the command_line_options section for more details on how to generate and customize these reports.

To generate a Rich Table report, use an option like --rich-table.ops when running your benchmarks. For example:

Running a benchmark with a Rich Table reportπŸ”—
  python my_benchmark_script.py --rich-table.ops --progress

This command executes the benchmarks in my_benchmark_script.py and generates a Rich Table in the terminal displaying the operations-per-second results. A basic output will look something like this:

Sample Rich Table Output (operations per second)πŸ”—
                                                                   addition_benchmark
                                                                operations per second

                                           A simple addition benchmark of Python's built-in sum function.
 ┏━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━┓
 ┃        ┃            ┃        ┃ Elapsed ┃             ┃               ┃            ┃            ┃            ┃             ┃                ┃        ┃
 ┃   N    ┃ Iterations ┃ Rounds ┃ Seconds ┃ mean kOps/s ┃ median kOps/s ┃ min kOps/s ┃ max kOps/s ┃ 5th kOps/s ┃ 95th kOps/s ┃ std dev kOps/s ┃  rsd%  ┃
 ┑━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━┩
 β”‚      1 β”‚    44872   β”‚      1 β”‚  0.32   β”‚    143.00   β”‚     144.00    β”‚      1.07  β”‚    153.00  β”‚    140.00  β”‚    150.00   β”‚        9.28    β”‚  6.51% β”‚
 β””β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Note

Significant Digits in Reported Results

To avoid β€œfalse precision”, statistical results are shown to three significant digits. Because performance measurement is inherently variable, any further digits are typically meaningless statistical noise, so they are omitted to avoid over-interpretation.

This is not an issue with SimpleBench itself, but rather a fundamental aspect of benchmarking and performance measurement in the real world.
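As an illustration of the principle (not SimpleBench's internal code), rounding a value to three significant digits can be sketched as:

```python
import math

def round_sig(value: float, digits: int = 3) -> float:
    """Round a value to the given number of significant digits."""
    if value == 0:
        return 0.0
    magnitude = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - magnitude)

# e.g. a measured 143217.8 Ops/s is reported as 143000.0 --
# the trailing digits are below the noise floor of the measurement
```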

Note

Interpreting Outliers in Benchmark Results

In the sample output above, you may notice that the min kOps/s value is an extreme outlier, far from the mean and median. This is a realistic reflection of real-world benchmarking. System events like garbage collection, process scheduling, or I/O interrupts can cause individual iterations to be significantly slower than the typical case.

This is precisely why SimpleBench provides a full suite of statistics. Instead of relying solely on the mean, you should also consider:

  • The median, which is resistant to outliers and often gives a better sense of β€œtypical” performance.

  • The 5th and 95th percentiles, which show the range of performance for the vast majority of iterations, excluding the most extreme outliers.

  • The standard deviation (std dev) and RSD%, which quantify the level of inconsistency in the results. A high value indicates significant variability.

By providing these metrics, SimpleBench allows you to get a complete and honest picture of your code’s performance, including its variability.
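The effect of a single outlier on these statistics can be demonstrated with Python's standard statistics module (the sample values below are illustrative, loosely based on the table above):

```python
import statistics

# kOps/s samples with one slow outlier (e.g. a GC pause) -- illustrative values
samples = [144.0, 150.0, 141.0, 153.0, 1.07, 148.0, 140.0, 146.0]

mean = statistics.mean(samples)      # dragged down by the single outlier
median = statistics.median(samples)  # barely affected by it
stdev = statistics.stdev(samples)    # inflated by the outlier
rsd_pct = stdev / mean * 100         # relative standard deviation (rsd%)
```

The median stays near the β€œtypical” 145 kOps/s while the mean is pulled well below it, which is exactly the pattern visible in the sample report.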

Report Header InformationπŸ”—

The report header provides context about the benchmark being reported:

Common Header Fields
Title

The title of the benchmark function. It can be set via the title parameter of the @benchmark decorator or defaults to the name of the benchmark function if not set. In this case, it is the name of the function: addition_benchmark.

Report Section

The section of benchmark statistics being reported. In this case, we are reporting the operations-per-second statistics (OPS), as requested via the --rich-table.ops command-line option. Other available report sections include timing statistics (--rich-table.timing), which report how long one execution of the function takes, and memory usage statistics (--rich-table.memory), which report the net and peak memory usage of the benchmarked function as two separate tables.

If run as python basic_benchmark.py --rich-table (without specifying a report section), all available report sections will be included in the output.

Multiple options can be individually specified to include multiple report sections, for example: --rich-table.ops --rich-table.timing.

Description

A description of the benchmark function. This can be set via the description parameter to the @benchmark decorator, or defaults to the first line of the function’s docstring if not set there.

If not set in either location, it defaults to β€˜(no description)’.

In this case, it is the docstring of the addition_benchmark function.
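The title and description defaulting rules can be sketched with a toy stand-in decorator (illustrative only — it is not SimpleBench's actual @benchmark implementation; see the defining_benchmarks section for the real decorator):

```python
def benchmark(title=None, description=None):
    """Toy stand-in showing the title/description defaulting rules."""
    def wrap(func):
        # title defaults to the function name if not given
        func.title = title or func.__name__
        # description defaults to the first docstring line, then a placeholder
        doc_lines = (func.__doc__ or "").strip().splitlines()
        func.description = description or (
            doc_lines[0] if doc_lines else "(no description)"
        )
        return func
    return wrap

@benchmark()
def addition_benchmark():
    """A simple addition benchmark of Python's built-in sum function."""
    return sum(range(100))
```

With no arguments given, the report header would show the function name as the title and the docstring's first line as the description, matching the sample output above.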

Table Column DefinitionsπŸ”—

The fields included in each report type are described below.

Common Report ColumnsπŸ”—

These columns are common to all Rich Table report types.

Common Report Columns
N

A complexity weighting used to indicate the input size for a benchmark.

A Big-O (O(n), etc) complexity weighting indicating the β€˜size’ of the input to a parameterized benchmark. It defaults to 1 unless overridden by the benchmark. The N value helps compare performance across different input sizes (if applicable) and analyze how the function scales as the input grows.

Iterations

The number of statistical samples taken for the benchmark.

The total number of iterations executed during the benchmark. An iteration is an execution of the benchmarked function once for statistical reporting purposes. It may be composed of multiple actual rounds to improve accuracy and precision, but is reported as a single count for the purposes of the table.

Rounds

The number of times the benchmarked function is executed within a single iteration.

The number of rounds executed during an iteration. A round is a single execution of the benchmarked function. Multiple rounds are often executed within an iteration to gather more accurate timing and performance data. They are executed in rapid succession, and their results are aggregated to produce the final metrics for an iteration.
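The iteration/round relationship can be sketched as a simple timing loop (a simplified illustration, not SimpleBench's actual measurement code):

```python
import time

def run_iteration(func, rounds: int) -> float:
    """Time `rounds` back-to-back calls; return seconds per round.

    The rounds are aggregated into ONE statistical sample (one iteration).
    """
    start = time.perf_counter()
    for _ in range(rounds):
        func()
    elapsed = time.perf_counter() - start
    return elapsed / rounds

# Five iterations -> five statistical samples, each built from 1000 rounds
samples = [run_iteration(lambda: sum(range(100)), rounds=1000) for _ in range(5)]
```

Batching many rounds per iteration amortizes timer overhead, which is why a single iteration of a very fast function is not timed call-by-call.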

Elapsed Seconds

The total CPU time spent executing the benchmarked code.

The total measured elapsed time in seconds for all iterations of the benchmark. This metric provides an overview of how long the benchmark took to complete. This does not include any setup or teardown time, focusing solely on the execution time of the benchmarked code. By default, this measures CPU time, not wall-clock time, to provide a more accurate representation of the code’s performance. It can be overridden to measure wall-clock time instead if so desired.
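In standard Python terms, this distinction corresponds to time.process_time (CPU time, the default behavior described above) versus time.perf_counter (wall-clock time):

```python
import time

cpu_start = time.process_time()   # CPU time: excludes sleep and other processes
wall_start = time.perf_counter()  # wall clock: includes everything

time.sleep(0.05)                  # consumes wall time but almost no CPU time
total = sum(range(10_000))        # doing actual work consumes both

cpu_elapsed = time.process_time() - cpu_start
wall_elapsed = time.perf_counter() - wall_start
# wall_elapsed is at least ~0.05 s; cpu_elapsed is far smaller
```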

Operations Per SecondπŸ”—

The operations per second report provides a detailed overview of the performance of the benchmarked code in terms of how many operations it can perform per second. This is a common metric used to evaluate the efficiency of code, especially in performance-critical applications.

Output numbers are scaled to appropriate units (Ops/s, kOps/s, MOps/s, etc) for easier readability.
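The unit scaling can be sketched as follows (an illustration of the idea, not SimpleBench's own formatting code):

```python
def scale_ops(ops_per_sec: float) -> str:
    """Pick a human-friendly unit for an operations-per-second value."""
    for factor, unit in ((1e9, "GOps/s"), (1e6, "MOps/s"), (1e3, "kOps/s")):
        if ops_per_sec >= factor:
            return f"{ops_per_sec / factor:.2f} {unit}"
    return f"{ops_per_sec:.2f} Ops/s"
```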

Operations Per Second Report Columns
mean Ops/s

The average number of operations per second.

The arithmetic mean number of operations per second (Ops/s) performed during the benchmark. This metric is calculated by dividing the total number of operations executed by the total elapsed time, then scaling it by an appropriate factor (for example, kOps/s) for easier readability. It provides a quick overview of the benchmark’s performance.

median Ops/s

The 50th percentile (middle value) of operations per second.

The median (50th percentile) number of operations per second (Ops/s) performed during the benchmark. This metric represents the middle value of the Ops/s measurements collected during the benchmark, providing a robust measure of central tendency that is less affected by outliers compared to the mean.

min Ops/s

The lowest (worst) performance recorded across all iterations.

The minimum number of operations per second (Ops/s) recorded during the benchmark. This metric indicates the lowest performance observed during the benchmark runs, which can be useful for identifying potential bottlenecks or performance issues.

max Ops/s

The highest (best) performance recorded across all iterations.

The maximum number of operations per second (Ops/s) recorded during the benchmark. This metric indicates the highest performance observed during the benchmark runs, showcasing the best-case scenario for the benchmarked code.

5th Ops/s

The 5th percentile of operations per second.

The 5th percentile number of operations per second (Ops/s) recorded during the benchmark. This metric indicates that 5% of the Ops/s measurements were below this value, providing insight into the lower end of the typical performance distribution.

95th Ops/s

The 95th percentile of operations per second.

The 95th percentile number of operations per second (Ops/s) recorded during the benchmark. This metric indicates that 95% of the Ops/s measurements were below this value, providing insight into the upper end of the typical performance distribution.

std dev Ops/s

A measure of the variation or inconsistency in performance.

The standard deviation of the operations per second (Ops/s) measurements collected during the benchmark. This metric quantifies the amount of variation or dispersion in the Ops/s values, providing insight into the consistency of the benchmark’s performance. A lower standard deviation indicates more consistent performance, while a higher standard deviation suggests greater variability in the results.

rsd%

A normalized measure of performance inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean and multiplying by 100. It provides a normalized measure of variability, allowing for easier comparison of consistency across different benchmarks or parameter configurations. A lower RSD% indicates more consistent performance relative to the mean, while a higher RSD% suggests greater variability in the results.

TimingπŸ”—

A timing report focuses on the time taken to execute the benchmarked code, rather than the number of operations per second. It provides insights into the average time per operation and other timing-related statistics.

Output numbers are scaled to appropriate units (seconds, milliseconds, microseconds, etc) for easier readability.
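The timing and operations-per-second views are reciprocals of each other, so either can be derived from the other:

```python
ops_per_sec = 143_000.0          # 143 kOps/s, from the sample report above
sec_per_op = 1.0 / ops_per_sec   # roughly 6.99 microseconds per operation
usec_per_op = sec_per_op * 1e6   # scaled to microseconds for readability
```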

Timing Report Columns
mean s/op

The average time in seconds per operation.

The arithmetic mean time in seconds per operation (s/op). This metric is calculated by dividing the total elapsed time by the total number of operations. It provides a direct measure of how long a single operation takes on average.

median s/op

The 50th percentile (middle value) of seconds per operation.

The median (50th percentile) time in seconds per operation. This metric represents the middle value of the timing measurements, providing a robust measure of central tendency that is less affected by unusually fast or slow iterations (outliers).

min s/op

The lowest (fastest) time per operation recorded across all iterations.

The minimum time in seconds per operation recorded during the benchmark. This metric indicates the best-case performance observed, showcasing the fastest execution time for a single operation.

max s/op

The highest (slowest) time per operation recorded across all iterations.

The maximum time in seconds per operation recorded during the benchmark. This metric indicates the worst-case performance observed, which can be useful for identifying potential bottlenecks or performance stalls.

5th s/op

The 5th percentile of seconds per operation. 5% of iterations were faster than this.

The 5th percentile time in seconds per operation. This metric indicates that 5% of the timing measurements were faster than this value, providing insight into the best-case end of the performance distribution.

95th s/op

The 95th percentile of seconds per operation. 95% of iterations were faster than this.

The 95th percentile time in seconds per operation. This metric indicates that 95% of the timing measurements were faster than this value, providing insight into the typical worst-case performance, excluding extreme outliers.

std dev s/op

A measure of the variation or inconsistency in the time per operation.

The standard deviation of the seconds per operation (s/op) measurements. This metric quantifies the amount of variation in the timing values. A lower standard deviation indicates more consistent, predictable execution times.

rsd%

A normalized measure of timing inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean time. It provides a normalized measure of variability, allowing for easier comparison of timing consistency across different benchmarks.

Memory UsageπŸ”—

A memory usage Rich Table report provides information about the memory consumption of the benchmarked code. It includes statistics on average and peak memory usage during the benchmark runs. Output numbers are scaled to appropriate units (bytes, kB, MB, etc) for easier readability.
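The net/peak distinction can be sketched with Python's built-in tracemalloc module (an illustration; SimpleBench's own instrumentation may differ):

```python
import tracemalloc

def measure_memory(func):
    """Return (net_bytes, peak_bytes) traced while running func once."""
    tracemalloc.start()
    func()
    net, peak = tracemalloc.get_traced_memory()  # (current, peak)
    tracemalloc.stop()
    return net, peak

net, peak = measure_memory(lambda: [i * i for i in range(10_000)])
# peak covers the list while it was alive; net is what stayed referenced after
```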

For a memory usage Rich Table report, two tables are generated: one for average memory usage and another for peak memory usage. The key fields in these tables include:

Memory Usage Report Columns
mean bytes

The average memory allocated per operation, in bytes.

The arithmetic mean memory allocated per operation. This metric provides a general overview of the benchmark’s memory footprint under typical execution.

median bytes

The 50th percentile (middle value) of memory allocated per operation.

The median (50th percentile) of memory allocated per operation. This provides a robust measure of the typical memory usage that is less affected by iterations with unusually high or low memory consumption.

min bytes

The minimum memory allocated per operation across all iterations.

The minimum memory allocated per operation recorded during the benchmark. This metric indicates the lowest memory footprint observed, representing the best-case scenario for memory efficiency.

max bytes

The maximum memory allocated per operation across all iterations.

The maximum memory allocated per operation recorded during the benchmark. This metric indicates the highest memory footprint observed, which is crucial for understanding peak memory demand and potential memory-related issues.

5th bytes

The 5th percentile of memory allocated per operation.

The 5th percentile of memory allocated per operation. This metric indicates that 5% of the iterations used less memory than this value, providing insight into the lower end of the memory usage distribution.

95th bytes

The 95th percentile of memory allocated per operation.

The 95th percentile of memory allocated per operation. This metric indicates that 95% of the iterations used less memory than this value, which is useful for understanding the typical upper bound of memory usage, excluding extreme outliers.

std dev bytes

A measure of the variation in memory allocation per operation.

The standard deviation of the memory allocation measurements. This metric quantifies the amount of variation in memory usage across iterations. A lower value indicates more consistent and predictable memory behavior.

rsd%

A normalized measure of memory usage inconsistency, expressed as a percentage.

The relative standard deviation (RSD) expressed as a percentage. This metric is calculated by dividing the standard deviation by the mean memory usage. It provides a normalized measure of variability, allowing for easier comparison of memory consistency across different benchmarks.

Report Variations and DestinationsπŸ”—

The example above shows an operations-per-second report printed to the console. SimpleBench provides several variations:

  • --rich-table: Generates tables for all result types (ops, timing, and memory).

  • --rich-table.ops: Generates tables only for operations-per-second results.

  • --rich-table.timing: Generates tables only for timing results.

  • --rich-table.memory: Generates tables only for memory usage results.

By default, reports are displayed in the console. You can send a report to other destinations, such as the filesystem, by appending the destination name. For example, to save the report to a file instead of printing it to the terminal:

Saving a Rich Table report to the filesystemπŸ”—
  python my_benchmark_script.py --rich-table.ops filesystem

Advanced FeaturesπŸ”—

Beyond the basic fields shown above, the reports also support advanced features such as:

Parameterized Benchmarks

Including results for benchmarks that take parameters, allowing for analysis of performance across different input sizes or configurations.

Custom Complexity Weightings

Including Big-O complexity weight/size annotations to help analyze how performance scales with input size.

These features make these reports a powerful tool for understanding the performance characteristics of your code in a clear and structured manner.

Parameterized BenchmarksπŸ”—

When benchmarks are parameterized, SimpleBench generates additional columns in the report for each requested parameter value.

This allows you to easily compare performance across different configurations. For example, if you have a benchmark that takes an input size parameter, the report can include how performance varies with different input sizes.

See the defining_benchmarks section for more details on defining and using parameterized benchmarks.

Custom Complexity WeightingsπŸ”—

Related to parameterized benchmarks, SimpleBench allows you to specify custom complexity weightings (a number/size weighting) for your benchmarks.

These weightings are included in the report as the N column value, helping you analyze how performance scales with input size and parameterization.

For example, you might specify that a benchmark set covers input sizes 1, 20, 100, 1000, which will be indicated in the N column of the report with a row for each size.

When defining a parameterized benchmark, you can provide complexity weightings, matched to the parameters in use, that reflect the expected performance characteristics of the code being benchmarked. This helps in understanding how performance changes as the input size or other parameters vary.
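Conceptually, a parameterized benchmark with complexity weightings pairs each N value with its own input size, yielding one report row per N (a generic sketch of the idea, not SimpleBench's actual decorator API; see defining_benchmarks for that):

```python
import time

def bench_sum(n: int, rounds: int = 1000) -> float:
    """Seconds per call of sum(range(n)) -- one row per N in the report."""
    data = range(n)
    start = time.perf_counter()
    for _ in range(rounds):
        sum(data)
    return (time.perf_counter() - start) / rounds

# The N column values from the example above, each measured separately
results = {n: bench_sum(n) for n in (1, 20, 100, 1000)}
```

Comparing the per-N timings against the declared weightings is what lets you judge whether the code scales as expected (e.g. roughly linearly for sum).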

These advanced features make these reports a powerful tool for analyzing the performance of parameterized benchmarks and understanding the scalability of your code.