Hoovik: a video call app that watches your face and listens to your voice to detect emotions in real time.
WebRTC alone is already painful. Add a Python ML service trying to analyze live audio/video streams on top of that and everything just… breaks in ways you don't expect.
The stack ended up being four separate services just to make it work: React + WebRTC on the frontend, Node.js handling all the signaling chaos, FastAPI running PyTorch + MediaPipe for emotion detection, and Whisper for transcripts with per-speaker tags.
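To give a sense of what "signaling" means here, below is a minimal sketch of a relay that forwards WebRTC offers, answers, and ICE candidates between peers in a room. It is an illustration under assumptions, not Hoovik's actual code: the `SignalMessage` shape, the `SignalingRelay` class, and the room/peer IDs are all hypothetical, and the transport (WebSocket or otherwise) is abstracted into a plain `send` callback.

```typescript
// Hypothetical message shape; the real protocol may differ.
type SignalKind = "offer" | "answer" | "ice-candidate";

interface SignalMessage {
  kind: SignalKind;
  roomId: string;
  from: string;
  payload: unknown; // SDP or ICE candidate, opaque to the relay
}

// Stand-in for "write this message to that peer's connection".
type Send = (msg: SignalMessage) => void;

class SignalingRelay {
  // roomId -> (peerId -> send callback)
  private rooms = new Map<string, Map<string, Send>>();

  join(roomId: string, peerId: string, send: Send): void {
    let room = this.rooms.get(roomId);
    if (!room) {
      room = new Map<string, Send>();
      this.rooms.set(roomId, room);
    }
    room.set(peerId, send);
  }

  leave(roomId: string, peerId: string): void {
    const room = this.rooms.get(roomId);
    if (!room) return;
    room.delete(peerId);
    // Drop empty rooms so the map does not grow without bound.
    if (room.size === 0) this.rooms.delete(roomId);
  }

  // Forward a message to every other peer in the room.
  // Returns the number of peers it was delivered to.
  relay(msg: SignalMessage): number {
    const room = this.rooms.get(msg.roomId);
    if (!room) return 0;
    let delivered = 0;
    room.forEach((send, peerId) => {
      if (peerId === msg.from) return; // never echo back to the sender
      send(msg);
      delivered++;
    });
    return delivered;
  }
}
```

The relay never inspects the payload. In a setup like this, the server only moves session descriptions and candidates between browsers, while the media itself flows over the peer connection.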
Wrote up everything that went wrong and what I'd do differently: