Juan Cruz Fortunatti

Agentic AI Engineer

Delivery Hero

Juan is an Agentic AI Engineer at Delivery Hero, where he builds and operates production-grade multi-agent systems supporting vendor operations at scale. His work focuses on the infrastructure behind agent behavior, including evaluation frameworks, trace replay, observability, annotations, tool orchestration, and debugging workflows. A hands-on engineer and active open-source contributor, Juan works across agent runtimes, memory systems, MCP tooling, coordination layers, and portable agent architectures. Prior to Delivery Hero, he led AI and product innovation initiatives at PayPal, TriNet, and The Walt Disney Studios, delivering GenAI applications, internal assistants, developer tools, and data platforms.

29 July 2026 10:30 - 11:00

Evaluation is the new testing: Measuring agent quality at scale

As organizations deploy increasingly complex agentic systems, evaluating performance has become one of the biggest barriers to production readiness. Unlike traditional software, agent behaviour can vary across tasks, environments, and interactions, making quality difficult to measure and maintain. This panel examines how leading teams are approaching evaluation at scale. From automated assessments and benchmark design to human review and production monitoring, we'll explore the frameworks, metrics, and processes being used to understand agent performance and identify regressions before they impact users. The conversation will address the growing challenge of establishing trust in agentic systems through rigorous, repeatable, and scalable evaluation practices.