Episode 507: Not All Hammers Are Equal: Benchmarking AI for AL Code

One developer decided to stop guessing which AI model is best for AL coding — and built a system to find out. In this episode of Dynamics Corner, Brad and Kristoffer sit down with Torben Leth, the creator of CentralGage, an open-source benchmarking tool that ranks LLMs specifically on their ability to write AL code for Business Central.
Torben walks through how he built an automated testing pipeline that gives each model multiple passes, compiles the output, runs pre-built AL tests, and scores the results from 0 to 100. What he discovered about why developers swear by completely different models might change how you think about your own setup. He has also found a way to use the benchmark failures to patch a cheaper model's blind spots so it rivals the top performers.
Plus: gamertags embroidered on wedding suits, chili plants managed by Home Assistant, and the philosophical question nobody in this space can seem to answer — should we even try to keep up?
Find Torben: blog.sshadows.dk | LinkedIn | X: @Sshadows
CentralGage: ai.sshadows.dk
#MSDyn365BC #BusinessCentral #BC #DynamicsCorner
Follow Kris and Brad for more content:
https://matalino.io/bio
https://bprendergast.bio.link/