Recursive vectorized computation of the Frobenius norm

Vedran Novaković

Published: 2025/9/7

Abstract

Recursive algorithms for computing the Frobenius norm of a real array are proposed, based on hypot, a hypotenuse function. A comparison of their relative accuracy bounds with those of the BLAS routine DNRM2 shows that the proposed algorithms could in many cases be significantly more accurate. The scalar recursive algorithms are vectorized with Intel's vector instructions to achieve performance comparable to xNRM2, and are further parallelized with OpenCilk. Some scalar algorithms are unconditionally bitwise reproducible, while the reproducibility of the vector ones depends on the vector width.
