Formal verification for robo-advisors: Irrelevant for subjective end-user trust, yet decisive for investment behavior?

Alina Tausch, Magdalena Wischnewski, Mustafa Yalciner, Daniel Neider

Published: 2025/9/10

Abstract

This online-vignette study investigates the impact of certification and verification as measures for quality assurance of AI on trust and use of a robo-advisor. Confronting 520 participants with an imaginary situation where they were using an online banking service to invest their inherited money, we formed 4 experimental groups. EG1 achieved no further information of their robo-advisor, while EG2 was informed that their robo-advisor was certified by a reliable agency for unbiased processes, and EG3 was presented with a formally verified robo-advisor that was proven to consider their investment preferences. A control group was presented a remote certified human financial advisor. All groups had to decide on how much of their 10,000 euros they would give to their advisor to autonomously invest for them and report on trust and perceived dependability. A second manipulation happened afterwards, confronting participants with either a successful or failed investment. Overall, our results show that the level of quality assurance of the advisor had surprisingly near to no effect of any of our outcome variables, except for people's perception of their own mental model of the advisor. Descriptively, differences between investments show that seem to favor a verified advisor with a median investment of 65,000 euros (vs. 50,000). Success or failure information, though influences only partially by advisor quality, has been perceived as a more important clue for advisor trustworthiness, leading to substantially different trust and dependability ratings. The study shows the importance of thoroughly investigating not only trust, but also trusting behavior with objective measures. It also underlines the need for future research on formal verification, that might be the gold standard in proving AI mathematically, but seems not to take full effect as a cue for trustworthiness for end-users.

Read Full Paper (arXiv.org)