SUperman: Efficient Permanent Computation on GPUs
Deniz Elbek, Fatih Taşyaran, Bora Uçar, Kamer Kaya
公開日: 2025/2/23
Abstract
The permanent is a function, defined for a square matrix, with applications in various domains including quantum computing, statistical physics, complexity theory, combinatorics, and graph theory. Its formula is similar to that of the determinant; however, unlike the determinant, its exact computation is #P-complete, i.e., there is no algorithm to compute the permanent in polynomial time unless P=NP. For an $n \times n$ matrix, the fastest algorithm has a time complexity of $O(2^{n-1}n)$. Although supercomputers have been employed for permanent computation before, there is no work and, more importantly, no publicly available software that leverages cutting-edge High-Performance Computing accelerators such as GPUs. In this work, we design, develop, and investigate the performance of SUperman, a complete software suite that can compute matrix permanents on multiple nodes/GPUs on a cluster while handling various matrix types, e.g., real/complex/binary and sparse/dense, etc., with a unique treatment for each type. SUperman run on a single Nvidia A100 GPU is up to $86\times$ faster than a state-of-the-art parallel algorithm on 44 Intel Xeon cores running at 2.10GHz. Leveraging 192 GPUs, SUperman computes the permanent of a $62 \times 62$ matrix in 1.63 days, marking the largest reported permanent computation to date.