At DefCon this year a dude had a robot that used different LLMs for different pieces, including a movement one that output structured API calls.
Maybe something like that could take complex output and turn it into smaller, lower-latency outputs.
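Roughly what I mean by the structured-output piece, as a minimal sketch (the command names, fields, and dispatch function here are all made up for illustration, not whatever API that robot actually used):

    import json
    from dataclasses import dataclass

    # Tiny, fixed action vocabulary the movement model is allowed to emit.
    # Keeping it this small is what makes a small, low-latency model workable.
    ALLOWED_ACTIONS = {"walk", "turn", "stop"}

    @dataclass
    class MoveCommand:
        action: str        # one of ALLOWED_ACTIONS
        direction: float   # heading in degrees, 0-360
        speed: float       # normalized 0.0-1.0

    def parse_command(raw: str) -> MoveCommand:
        """Validate the model's structured output before it touches hardware."""
        data = json.loads(raw)
        cmd = MoveCommand(
            action=data["action"],
            direction=float(data.get("direction", 0.0)),
            speed=float(data.get("speed", 0.0)),
        )
        if cmd.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {cmd.action!r}")
        if not (0.0 <= cmd.speed <= 1.0):
            raise ValueError(f"speed out of range: {cmd.speed}")
        return cmd

    def dispatch(cmd: MoveCommand) -> None:
        """Map the validated command onto whatever low-level motor API exists."""
        print(f"-> {cmd.action} dir={cmd.direction:.0f}deg speed={cmd.speed:.2f}")

    if __name__ == "__main__":
        # Stand-in for what a movement model might emit; a real system would get
        # this string from the model's JSON/constrained output mode.
        raw_output = '{"action": "walk", "direction": 90, "speed": 0.4}'
        dispatch(parse_command(raw_output))

The point is the model only ever emits a tiny, fixed schema, and everything continuous happens in plain code on the other side of the parser.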
How do they translate between the continuous character movements, the 60fps 3D world, and the words that go in and out of an LLM?