That memory bleed is real. We hit it too. The contrib image ships with everything enabled by default, which is... not great for ops.
The trick we found: separate collectors by signal type. One lightweight instance just for metrics (Prometheus exporter, maybe 80 MB of memory), another for traces with aggressive sampling at ingestion, before anything gets buffered. That way you're not paying for processors you don't use.
On the sampling trade-off: a flat 5% is too aggressive if you're trying to catch production bugs. We do probabilistic sampling keyed on response status (keep 100% of 5xx, 0.5% of 2xx). Costs maybe 15-20% more in ingestion but catches the actual failures.
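Rough sketch of the status-based decision, in Python just to show the logic. This is not our actual collector config; `keep_span`, `KEEP_RATES`, and `DEFAULT_RATE` are made-up names, and it assumes the HTTP status is already known when the decision is made (i.e. the span has finished). How other status classes like 4xx get treated is also an assumption here.

```python
import hashlib

# Hypothetical rates mirroring the numbers above: keep every 5xx, 0.5% of 2xx.
KEEP_RATES = {5: 1.0, 2: 0.005}
# Assumption: any other status class is sampled like 2xx.
DEFAULT_RATE = 0.005


def keep_span(trace_id: str, http_status: int) -> bool:
    """Return True if a span with this status should be kept."""
    rate = KEEP_RATES.get(http_status // 100, DEFAULT_RATE)
    # Hash the trace ID instead of calling random() so the decision is
    # repeatable for the same trace ID and rate.
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate
```

Hashing the trace ID rather than rolling a random number means two collectors looking at the same trace with the same rate make the same call.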
What exporter are you pushing to? Some backends are way more expensive per span than others.