(L-R) Guillaume Toussaint; Sara Yasmine Ouerk; Francesco Ferrero; Olivier Gorin; Stefaan Roegiers; Karim Colabelli; Artur Gaynutdinov; Alessio Buscemi; Suraj Maurya
Credit: LIST/BIL
On Thursday 26 March 2026, the Luxembourg Institute of Science and Technology (LIST) and Banque Internationale à Luxembourg (BIL) announced the completion of their joint research project on evaluating conversational artificial intelligence (AI) systems in a banking context.
Launched in June 2025, the project set out to develop and apply a rigorous, science-based evaluation framework to assess AI systems used in customer interactions, with a focus on fairness, robustness and service quality. The collaboration combined LIST’s expertise in AI testing using the LIST AI Sandbox with BIL’s real-world operational requirements.
LIST said the project delivered actionable insights under strict confidentiality conditions, with no access to customer conversations or data by LIST researchers and all testing performed within BIL’s secure IT infrastructure.
According to LIST, the results identified no significant discriminatory behaviour. Some minor variations were observed, suggesting opportunities for refinement, but these did not affect the overall fairness of the system. The analysis of user experience revealed that frustration was primarily driven by situational constraints or unmet expectations, while the system’s tone and language remained consistently professional and reassuring. In addition, adversarial testing confirmed strong safeguards, with compliance and security maintained across multiple scenarios and languages.
A key enabler of the collaboration was the LIST AI Sandbox, a platform specifically designed to test, validate and improve the reliability of AI in a controlled environment. The Sandbox allows companies, researchers and public institutions to evaluate their AI models on real-world cases, while minimising technical, regulatory and ethical risks. LIST said that by leveraging this platform, the project was able to conduct comprehensive assessments without exposing sensitive systems or data, providing actionable insights while maintaining security and compliance.
LIST highlighted that beyond the individual evaluation results, the collaboration contributed to a broader objective: establishing a replicable and structured approach to AI evaluation and governance.
The methodologies and tools developed during the project support a continuous learning cycle, from assessment through to reporting, and help organisations better understand and manage the behaviour of AI systems over time.
Francesco Ferrero, Leader of the Flagship Initiative on Artificial Intelligence and the Human-Centred AI, Data and Software Research Unit at LIST, stated: “This joint project with BIL demonstrates how financial institutions can combine scientific methods with secure infrastructure to ensure AI systems are trustworthy, reliable, and aligned with regulatory and ethical standards.”
Olivier Gorin, Head of Digital Banking at BIL, concluded: “At BIL, we are always looking for ways to make our services genuinely easier and more helpful for our clients. This project has helped us improve how we use AI so that Berry, our AI assistant, can provide support that is clearer, quicker and reassuring. Since its launch in December 2025, Berry has already made everyday banking smoother for many of our clients. We are grateful to the LIST teams for their excellent work and the constructive collaboration throughout the project.”