End-of-speech detection is the voice-agent setting nobody tunes, and it is why yours feels slow or talks over people
The lag users complain about is usually not the model thinking. It is the endpointer waiting.
Two complaints kill voice agents, and they sound like opposites. "It is slow, it leaves these awkward paus