I was running sensor nodes on 4G cellular and kept finding gaps in my data with zero errors logged. Turns out even with QoS 1 there's a window: the message leaves your code, the network drops before the broker sends PUBACK, and it's just gone. paho doesn't track that gap across reconnects, and there's no log unless you write one yourself.
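To make that window concrete, here's a minimal sketch of the bookkeeping needed to close it. This is not robmqtt's actual code; the `Outbox` class, its table layout, and its method names are all made up for illustration. The idea is to write every message to SQLite before handing it to the MQTT client, remember the message id it went out under, and only delete the row once the ack comes back. Anything still in the table after a reconnect needs resending.

```python
import sqlite3
import time


class Outbox:
    """Hypothetical durable record of messages that have not been acked yet."""

    def __init__(self, path=":memory:"):
        # Use a file path instead of ":memory:" to survive process restarts.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " topic TEXT NOT NULL,"
            " payload BLOB NOT NULL,"
            " mid INTEGER,"  # MQTT message id while inflight, NULL otherwise
            " created REAL NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, topic, payload):
        """Persist the message BEFORE it is handed to the MQTT client."""
        cur = self.db.execute(
            "INSERT INTO outbox (topic, payload, created) VALUES (?, ?, ?)",
            (topic, payload, time.time()),
        )
        self.db.commit()
        return cur.lastrowid

    def mark_sent(self, row_id, mid):
        """Record the message id the client assigned when publishing."""
        self.db.execute("UPDATE outbox SET mid = ? WHERE id = ?", (mid, row_id))
        self.db.commit()

    def ack(self, mid):
        """Delete the row once the broker's PUBACK for this mid has arrived."""
        cur = self.db.execute("DELETE FROM outbox WHERE mid = ?", (mid,))
        self.db.commit()
        return cur.rowcount

    def reset_inflight(self):
        """On disconnect: old mids mean nothing to us any more, so clear them."""
        self.db.execute("UPDATE outbox SET mid = NULL")
        self.db.commit()

    def pending(self):
        """Rows that still need to be (re)sent, oldest first."""
        return self.db.execute(
            "SELECT id, topic, payload FROM outbox WHERE mid IS NULL ORDER BY id"
        ).fetchall()


# Simulate: two messages sent, only the first is acked before the link drops.
box = Outbox()
first = box.enqueue("sensors/temp", b"21.5")
second = box.enqueue("sensors/temp", b"21.7")
box.mark_sent(first, 1)
box.mark_sent(second, 2)
box.ack(1)
box.reset_inflight()
print(box.pending())  # only the second message is left to resend
```

With paho-mqtt, the wiring would be roughly: take the `mid` from the object returned by `client.publish(topic, payload, qos=1)` and pass it to `mark_sent`, call `ack(mid)` from the `on_publish` callback, and call `reset_inflight()` from `on_disconnect`. Resending can produce duplicates on the subscriber side, which is expected with QoS 1 (at-least-once).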
I ended up building a resilient client to handle it (offline SQLite queue, inflight tracking, priority eviction, backoff) and published it as robmqtt on PyPI. But I'm curious how others deal with this.
Would love to hear how people solve message reliability on unreliable networks.