A benchmark for evaluating how well LLMs generate SQL queries from natural-language questions. Models are tested against the Microsoft AdventureWorks sample database using an agentic tool-calling ...
A production-grade, full-stack AI system that lets users upload datasets and ask natural-language questions about them. An autonomous Planner → Executor → Critic agent pipeline understands the question, ...