Cover Image for The One About AI Testing
Presented by
Lorong AI
Registration
Past Event
About Event

How do we ensure AI systems are both reliable and safe for our context? From enterprise deployments to local language safety, join us to explore critical aspects of making AI work in our region.

More About the Sharings:

  • Gabriel (Data Scientist, GovTech's AI Practice) will share more about RabakBench, a safety benchmark for Singapore’s context. Large language models (LLMs) and their safety classifiers often perform poorly on low-resource languages because training data and evaluation benchmarks are limited. Evaluations of 11 popular open-source and closed-source guardrail classifiers on RabakBench reveal significant performance degradation. In this talk, they will share how they built this new multilingual safety benchmark localized to Singapore’s unique linguistic context, covering Singlish, Chinese, Malay, and Tamil. In particular, they will share how they leveraged LLMs to scale human supervision for both annotation and translation. (Technical Level: 200)

  • Shameek (Executive Director, AI Verify Foundation) will share more about "GenAI Accuracy, Reliability in Real-World Scenarios". Hear insights from their Global AI Assurance Pilot, which tested GenAI applications across 17 use cases and 30 global companies. While most efforts focus on model safety, real-world deployment demands attention to end-to-end system reliability. Through practical examples spanning 10 industries, discover how context and complexity shape AI performance at scale. Learn about testing frameworks and risk assessment methods that bridge the gap between lab performance and real-world success, helping organizations build GenAI solutions that truly deliver value. (Technical Level: 100-200)

More About the Speakers:

  • Gabriel Chua is a Data Scientist at GovTech, focusing on MLOps, LLM solutions, and Responsible AI. He co-organizes AI Wednesdays and various community tech events, bringing together AI practitioners across Singapore. Previously a policy analyst at the Ministry of Health working on Healthcare Finance, he holds degrees from LSE (Economics) and MIT (Business Analytics). Outside of tech, you'll find him at pilates, hiking trails, or enjoying craft beer.

  • Shameek Kundu (Executive Director, AI Verify Foundation) is a senior Data and AI professional with 25+ years of experience across AI safety and testing (AI Verify, TruEra), Financial Services (Group CDO at Standard Chartered) and Consulting (McKinsey). Before joining AI Verify, Shameek helped build and scale an AI testing software business at Silicon Valley startup TruEra. He serves or has served on multiple consultative forums on AI governance, including those of the Bank of England, the Monetary Authority of Singapore and the OECD/Global Partnership on AI.

Location
Lorong AI (WeWork@22 Cross St.)
Take the back entrance and head up to L3!