Optimizing Agent Latency
Speed is a feature. In an agent chain, delays compound. If Agent A takes 2s and Agent B takes 2s, the user waits 4s.
Edge Deployment
We are moving our agent runtime to the Edge. By executing the logic closer to the user (and closer to the database), we shave off critical milliseconds.
Speculative Execution
We are also experimenting with Speculative Execution. If an agent is likely to call a tool, we start warming up that tool before the agent even finishes generating the token.
These optimizations are live in our Enterprise Tier.