Benchmarks and European AI sovereignty in finance: key challenges for the sector

Workshop organized by the Institut Louis Bachelier, GENCI, and the AI Factory Finance Vertical, held at the Palais Brongniart on February 3,

Apr 20, 2026 16:17

WORKSHOP SUMMARY:
THE CHALLENGE OF BENCHMARKS AND EUROPEAN AI SOVEREIGNTY IN FINANCE

The Institut Louis Bachelier (ILB), in partnership with GENCI, organized a workshop at the Palais Brongniart focused on the “AI Factory” and the challenges of artificial intelligence in finance. The event brought together researchers, institutions, and market participants around a shared goal: identifying the sector’s needs in terms of reference datasets, benchmarks, and evaluation metrics tailored to financial realities. The discussions prompted numerous questions from the audience, reflecting the ecosystem’s strong interest in these topics. Finance is among the sectors most affected by AI, which reinforces the need for robust, shared evaluation standards.

During the opening session, Marie Brière (CEO of ILB) and Cédric Auliac emphasized that AI is already embedded at every level of financial decision-making: compliance, fraud detection, forecasting, allocation, risk analysis, and client relations. Cédric Auliac noted in particular that while GENCI’s computing power historically served nuclear physics and fluid mechanics, finance has now become a key vertical for these national resources. The priority is shifting toward reliability, measurement, and evaluation frameworks.

Speakers addressed fundamental questions. What defines a reliable AI model in finance? How can accuracy and robustness be measured in the face of regime shifts or adversarial attacks? How should biases and overall value be assessed within a multidimensional framework that includes both financial and environmental costs? The workshop also raised major issues of sovereignty and independence: should these models be specifically trained and evaluated on European data?

The roundtable brought together researchers and practitioners from BNP Paribas Global Markets, Natixis CIB, Ardian, Dragon LLM, École Polytechnique, and Google DeepMind. Discussions highlighted a significant gap: while use cases for large language models (LLMs) already exist in finance, evaluation frameworks remain insufficiently structured. The main bottleneck is not application, but reliability measurement. Similarly, in risk management and valuation (deep hedging, fast pricing), experts stressed the need for strict scientific standards, including dedicated evaluation test sets for each use case before deployment. Human validation remains central, with continuous supervision required to ensure AI remains a “productivity assistant” rather than an autonomous risk.

Speakers also emphasized efficiency and targeting: choosing the right model for the right task, prioritizing methodology, traceability, and control. They warned against in-sample testing and purely qualitative evaluations, advocating instead for out-of-sample comparisons based on clearly defined benchmarks. For text processing, this includes the use of smaller, specialized models capable of outperforming large general-purpose models on targeted financial tasks, while being more cost-efficient and energy-efficient.
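The warning against in-sample testing can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than something presented at the workshop: the synthetic data, the two toy models (a "memorizer" that returns the target of the nearest training point, and an ordinary least squares line), and the function names. It only shows why a score computed on training data can rank models differently from a score computed on a held-out, later segment.

```python
import random


def make_series(n, seed=0):
    """Synthetic (feature, target) pairs in time order: y = 2x + Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))
    return data


def fit_memorizer(train):
    """Overfit toy model: predicts the target of the nearest training point."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict


def fit_linear(train):
    """Ordinary least squares line y = a + b * x."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return lambda x: a + b * x


def mse(model, data):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def evaluate(n=1000, train_fraction=0.7, seed=0):
    """Score both models in-sample and on a later, held-out segment."""
    data = make_series(n, seed)
    cut = round(n * train_fraction)
    train, test = data[:cut], data[cut:]  # chronological split, no shuffling
    models = {"memorizer": fit_memorizer(train), "linear": fit_linear(train)}
    return {
        name: {"in_sample": mse(m, train), "out_of_sample": mse(m, test)}
        for name, m in models.items()
    }


if __name__ == "__main__":
    for name, scores in evaluate().items():
        print(name, scores)
```

In this synthetic setup the memorizer reproduces its training data exactly, so an in-sample comparison would favor it, while the held-out segment is expected to favor the simpler linear model. A real financial benchmark would replace the synthetic series with a fixed, documented test set per use case.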

The roundtable covered operational productivity, valuation, hedging, risk management, and model evaluation under distribution shifts, concluding with sovereignty challenges.

The shared conclusion is that the objective is no longer simply to “make models work,” but to demonstrate, measure, and govern their behavior within a financial environment. All participants stressed the importance of developing structured European benchmarks to ensure sovereignty. This effort requires collective action to pool data, metrics, and methodologies across research and industry. In this context, a working group is currently being formed in partnership with the Institut Louis Bachelier, the AI Finance vertical of the AI Factory, GENCI, and the Cercle IA et Finance, to advance these sovereign standards.

