4 MINUTE READ | February 16, 2026

How PMG Makes Agentic AI Faster: Caching Reasoning, Not Just Responses
Dylan Kline is an AI & Software Engineer with PMG's Alli Data team, where he focuses on improving AI efficiency and search performance across products. He has interests across AI subfields, including reinforcement learning and deep learning for generative applications.
Varun Chillara is a Lead AI & SWE Engineer with the Alli Data team at PMG. He focuses on building scalable AI systems at the intersection of software engineering and applied AI. His current interests include deep learning, RLHF, and improving accuracy and reliability in Agentic AI systems.
AI is powerful, but it can be slow and expensive. When you need quick answers for dashboards and reports, waiting for an AI to think from scratch every time slows everything down.
To solve this, PMG developed Alli. Alli is a proprietary operating system that connects data, strategy, and execution for every team across PMG. It serves as the foundation for our operations, allowing teams to manage the entire lifecycle of their work. This includes measuring performance, planning campaigns, forecasting outcomes, and automating workflows in a single environment.
Central to this system is the ability to ask questions about your business using plain language. To make this interaction fast enough for daily enterprise use, we built a new architecture called SemanticALLI. It makes the AI engine faster by reusing reasoning it has already done instead of repeating it for every question.
Standard caching has a flaw. It only reuses an answer when a new question looks like one it has already seen. But in a real business, people ask for the exact same data in a hundred different ways.
Because of this, a standard cache rarely finds a match. It also treats the AI process as one big step, ignoring the fact that answering a question requires multiple smaller steps. The AI has to figure out your goal, pick the right metrics, and then build a chart. Those internal steps often repeat even if the user asks a completely different question.
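To make the flaw concrete, here is a minimal sketch of a cache keyed on the question text. It is an illustration only, not Alli's implementation: `ExactMatchCache`, `exact_key`, and `fake_model` are hypothetical names, and the model call is replaced by a stub.

```python
import hashlib


def exact_key(question: str) -> str:
    """Build a cache key from the question text (case and whitespace normalized)."""
    normalized = " ".join(question.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ExactMatchCache:
    """Hypothetical response cache that only matches near-identical wording."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, question, compute):
        key = exact_key(question)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: pay for the full (slow, expensive) model call.
        self.misses += 1
        answer = compute(question)
        self._store[key] = answer
        return answer


def fake_model(question: str) -> str:
    """Stub standing in for a real model call."""
    return "spend report for last month"


cache = ExactMatchCache()
cache.get_or_compute("What was our spend last month?", fake_model)
cache.get_or_compute("what was  our spend last month?", fake_model)  # same wording: hit
cache.get_or_compute("How much did we spend in the previous month?", fake_model)  # paraphrase: miss
print(cache.hits, cache.misses)
```

The third question asks for the same data as the first two, yet the cache treats it as brand new because the wording differs.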
SemanticALLI solves this by remembering the logic instead of just the final answer. We break the AI thought process into two distinct stages:
1. Understand the goal. Alli translates your words into a clear map of metrics and filters.
2. Build the visual. Alli turns that map into a clear chart or dashboard.
By saving the work at both of these stages, Alli reuses its past reasoning. It does not matter if two users phrase their questions completely differently. If they want the same outcome, Alli already knows how to build it.
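The two stages can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the SemanticALLI implementation: `Intent`, `resolve_intent`, `build_visual`, and `TwoStageCache` are hypothetical names, both model calls are replaced by keyword-based stubs, and the first stage is keyed on plain question text to keep the example short.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """Hypothetical 'map of metrics and filters' produced by stage one."""
    metrics: tuple
    filters: tuple  # (field, value) pairs

    def cache_key(self) -> str:
        # Sort so that ordering differences do not produce different keys.
        return json.dumps(
            {"metrics": sorted(self.metrics), "filters": sorted(self.filters)},
            sort_keys=True,
        )


def resolve_intent(question: str) -> Intent:
    """Stub for stage one. A real system would call a model here."""
    q = question.lower()
    metrics = []
    if "spend" in q or "cost" in q:
        metrics.append("spend")
    if "click" in q:
        metrics.append("clicks")
    filters = []
    if "last month" in q or "previous month" in q:
        filters.append(("period", "last_month"))
    return Intent(tuple(metrics), tuple(filters))


def build_visual(intent: Intent) -> dict:
    """Stub for stage two. A real system would generate a chart spec here."""
    return {"chart": "line", "y": list(intent.metrics), "where": dict(intent.filters)}


class TwoStageCache:
    def __init__(self):
        self.intent_cache = {}   # question text -> Intent
        self.visual_cache = {}   # Intent key -> visual spec
        self.visual_builds = 0   # how often stage two actually ran

    def answer(self, question: str) -> dict:
        q = " ".join(question.lower().split())
        intent = self.intent_cache.get(q)
        if intent is None:
            intent = resolve_intent(question)
            self.intent_cache[q] = intent
        key = intent.cache_key()
        if key not in self.visual_cache:
            self.visual_builds += 1
            self.visual_cache[key] = build_visual(intent)
        return self.visual_cache[key]


alli = TwoStageCache()
first = alli.answer("How much did we spend last month?")
second = alli.answer("Show me cost for the previous month")
print(first == second, alli.visual_builds)
```

In this sketch the two questions share no cached text, but they resolve to the same metrics and filters, so the visual is built once and reused for the second question.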
The results are immediate. In real-world marketing tests, standard caching matched questions only 38% of the time. By saving the visual building blocks instead, SemanticALLI achieved an 83% hit rate.
Across just 500 questions, this saved the AI from having to think from scratch over 4,000 times. Median response time was under three milliseconds, and token usage fell by 78%. Alli reuses what it already knows instead of doing the same work twice.
To move fast without adding risk, enterprise AI must be both efficient and secure. By remembering the steps of reasoning instead of just the final words, AI systems become significantly faster. They maintain perfect accuracy around your business logic while cutting out the wait time.
For teams using Alli, this means instant answers, lower costs, and a platform that scales with your business. You get the clarity you need to act without delay.
Download the full whitepaper: SemanticALLI: Caching Reasoning in Agentic Systems.