At DEF CON this year, a guy had a robot that used different LLMs for different pieces, including a movement one that output structured output for a specific API.
Maybe something like that could take complex outputs and turn them into smaller, lower-latency ones.
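
Rough sketch of what I mean, in Python. To be clear, none of this is from the DEF CON robot: the `MoveCommand` schema, the action names, and `fake_movement_model` are all made up for illustration, and the "model" is just a keyword stub standing in for a small, fast LLM that would be prompted to reply with JSON only.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical command schema; the real robot's API is not known.
ALLOWED_ACTIONS = {"forward", "backward", "turn_left", "turn_right", "stop"}


@dataclass
class MoveCommand:
    action: str   # one of ALLOWED_ACTIONS
    value: float  # metres for forward/backward, degrees for turns, 0 for stop


def parse_move_command(raw: str) -> MoveCommand:
    """Validate the small model's JSON reply against the hypothetical schema."""
    data = json.loads(raw)
    action = data["action"]
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return MoveCommand(action=action, value=float(data.get("value", 0.0)))


def fake_movement_model(plan: str) -> str:
    """Stand-in for a small movement model (assumption: it replies with JSON only)."""
    text = plan.lower()
    if "turn left" in text:
        return '{"action": "turn_left", "value": 90}'
    if "forward" in text:
        return '{"action": "forward", "value": 1.0}'
    return '{"action": "stop", "value": 0}'


def plan_to_command(plan: str, model=fake_movement_model) -> MoveCommand:
    """Reduce a long free-text plan to one compact, validated command."""
    return parse_move_command(model(plan))


verbose_plan = (
    "The hallway ahead looks clear, so the robot should move forward "
    "about a metre before checking its surroundings again."
)
print(json.dumps(asdict(plan_to_command(verbose_plan))))
```

The point is that the big model can ramble all it wants, and whatever talks to the motors only ever sees a tiny validated payload, which should be cheaper to generate and quicker to act on.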