Benchmarks

Here at Qognitive, we use Pauli Strings and Pauli Operators throughout our codebase. Some of our most critical performance bottlenecks involve applying these operators to a state or batch of states. The benchmarks below cover only a few of the functions we’ve optimized in fast-pauli; see our Python API and C++ API for the rest. All benchmark figures are interactive and we encourage you to explore them!

Below are several benchmarks comparing fast-pauli to Qiskit. All benchmarks were run on a single machine with the following specifications:

CPU: 13th Gen Intel(R) Core(TM) i9-13950HX
RAM: 64GB
Threads: 32
OS: Ubuntu 22.04.4 LTS
Architecture: x86_64
Compiler (for fast-pauli): LLVM 18.1.8
Python: 3.12.7

Pauli String Applied to a State

Starting simply, we benchmarked applying a single Pauli String (\(\mathcal{\hat{P}}\)) to a single state (\(\ket{\psi}\)), which is equivalent to the following expression:

(1)\[\mathcal{\hat{P}} \ket{\psi}\]

We saw that applying the sparse Pauli String representation used by fast-pauli to the state is significantly faster than applying the Pauli String representation used by Qiskit. For most operator sizes, the performance improvement is several orders of magnitude.
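To illustrate why a sparse representation helps, here is a minimal pure-Python sketch. It is not fast-pauli's implementation, and the function name `apply_pauli_string` is hypothetical. It relies on the fact that every row of a Pauli String matrix has exactly one nonzero entry, so the product with a state needs only one pass over the state vector and never builds the full matrix. The sketch assumes the leftmost character of the string acts on the most significant bit of the state index (Kronecker-product order).

```python
def apply_pauli_string(pauli, state):
    """Apply a Pauli string such as "XIZY" to a state vector of length 2**n.

    Illustrative sketch only (hypothetical helper, not the fast-pauli API).
    """
    n = len(pauli)
    dim = 1 << n
    if len(state) != dim:
        raise ValueError("state must have length 2**len(pauli)")

    # Encode the string as bit masks, using Y = i * X * Z on each qubit.
    x_mask = z_mask = n_y = 0
    for k, p in enumerate(pauli):
        bit = 1 << (n - 1 - k)  # leftmost character <-> most significant bit
        if p in "XY":
            x_mask |= bit  # X and Y flip this bit
        if p in "ZY":
            z_mask |= bit  # Z and Y contribute a sign
        if p == "Y":
            n_y += 1
    phase = (1, 1j, -1, -1j)[n_y % 4]  # global factor i**n_y

    out = [0j] * dim
    for r in range(dim):
        c = r ^ x_mask  # column of the single nonzero entry in row r
        sign = -1 if bin(z_mask & c).count("1") % 2 else 1
        out[r] = phase * sign * state[c]
    return out
```

For example, `apply_pauli_string("Y", [1, 0])` returns `[0, 1j]`, matching \(\hat{Y}\ket{0} = i\ket{1}\). The loop touches each amplitude once, so the cost of this sketch grows with the state dimension \(2^{N_{\text{qubits}}}\) rather than with its square.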

Note

All datapoints in our benchmarks have error bars indicating the standard deviation of the mean, but for most points the error bars are too small to see.

Pauli Operator Applied to a State

Next we benchmarked applying a Pauli Operator (a linear combination of Pauli Strings) to a single state:

(2)\[\big( \sum_i c_i \mathcal{\hat{P_i}} \big) \ket{\psi}\]

Again, we saw significant performance improvements for the same reasons stated above: fast-pauli is often an order of magnitude faster than Qiskit. Here \(N_{\text{pauli strings}}\) is the number of Pauli Strings in the Pauli Operator, i.e. the number of terms in the linear combination shown in (2). Note that fast-pauli performs better relative to Qiskit when the Pauli Operator is more sparse, i.e. when there are fewer Pauli Strings in the operator.
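The linear combination in (2) can be sketched in pure Python by accumulating one weighted Pauli String at a time. This is an illustrative sketch, not fast-pauli's implementation; the helper names `pauli_masks` and `apply_pauli_operator` are hypothetical, and the leftmost character of each string is assumed to act on the most significant bit of the state index.

```python
def pauli_masks(pauli):
    """Return (x_mask, z_mask, phase) for a Pauli string such as "XIZY".

    Hypothetical helper: uses Y = i * X * Z, so phase = i**(number of Y's).
    """
    n = len(pauli)
    x_mask = z_mask = n_y = 0
    for k, p in enumerate(pauli):
        bit = 1 << (n - 1 - k)  # leftmost character <-> most significant bit
        if p in "XY":
            x_mask |= bit
        if p in "ZY":
            z_mask |= bit
        if p == "Y":
            n_y += 1
    return x_mask, z_mask, (1, 1j, -1, -1j)[n_y % 4]


def apply_pauli_operator(coeffs, strings, state):
    """Compute (sum_i coeffs[i] * strings[i]) applied to a state vector."""
    out = [0j] * len(state)
    for coeff, pauli in zip(coeffs, strings):
        x_mask, z_mask, phase = pauli_masks(pauli)
        for r in range(len(state)):
            c = r ^ x_mask  # only nonzero column in row r of this string
            sign = -1 if bin(z_mask & c).count("1") % 2 else 1
            out[r] += coeff * phase * sign * state[c]
    return out
```

In this sketch the work is proportional to \(N_{\text{pauli strings}} \times 2^{N_{\text{qubits}}}\), so an operator with fewer Pauli Strings is cheaper to apply.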

Expectation Values of a Pauli Operator

Finally, we benchmarked computing the expectation value of a Pauli Operator for each state in a batch of states:

(3)\[\bra{\psi_t} \sum_i c_i \mathcal{\hat{P_i}} \ket{\psi_t}\]

In this benchmark, we fixed the number of Pauli Strings at \(N_{\text{pauli strings}} = 1024\) and varied the number of qubits and states. As in the previous benchmarks, fast-pauli showed significant performance improvements over Qiskit. The advantage of fast-pauli tends to grow with the size of the batch of states, but it narrows as the number of qubits increases. With that said, we’re still more than 2x faster for these larger operators!
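As a rough illustration of expression (3), the sketch below computes one expectation value per state in a batch. It is a pure-Python sketch under the same assumptions as the text (a Pauli Operator given as coefficients and Pauli Strings), not the method used by fast-pauli or Qiskit; the names `pauli_masks` and `expectation_values` are hypothetical, and the leftmost character of each string is assumed to act on the most significant bit of the state index.

```python
def pauli_masks(pauli):
    """Return (x_mask, z_mask, phase) for a Pauli string (hypothetical helper)."""
    n = len(pauli)
    x_mask = z_mask = n_y = 0
    for k, p in enumerate(pauli):
        bit = 1 << (n - 1 - k)  # leftmost character <-> most significant bit
        if p in "XY":
            x_mask |= bit
        if p in "ZY":
            z_mask |= bit
        if p == "Y":
            n_y += 1
    return x_mask, z_mask, (1, 1j, -1, -1j)[n_y % 4]


def expectation_values(coeffs, strings, states):
    """Return <psi_t| sum_i c_i P_i |psi_t> for every psi_t in states."""
    masks = [pauli_masks(p) for p in strings]  # encode each string once
    results = []
    for psi in states:
        total = 0j
        for coeff, (x_mask, z_mask, phase) in zip(coeffs, masks):
            acc = 0j
            for r in range(len(psi)):
                c = r ^ x_mask  # only nonzero column in row r
                sign = -1 if bin(z_mask & c).count("1") % 2 else 1
                acc += complex(psi[r]).conjugate() * sign * psi[c]
            total += coeff * phase * acc
        results.append(total)
    return results
```

The Pauli Strings are encoded once and reused for every state, so the per-string setup cost is shared across the whole batch. For instance, with the operator \(\hat{Z} + \hat{X}\), the state \(\ket{0}\) gives an expectation value of 1 and \(\ket{1}\) gives -1.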

Note

The data point for Qiskit with \(N_{\text{qubits}} = 16\) and \(N_{\text{states}} = 1000\) is not shown in the plot above because the run failed with out-of-memory (OOM) errors.