Because how can you determine when to invalidate it? Only if you knew the future could you actually do it properly. At least that's the idea behind Bélády's algorithm, which evicts the entry whose next use lies furthest in the future.
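To make the "know the future" part concrete, here's a minimal sketch of Bélády's policy run as a simulation over a request trace that is fully known in advance. The function name `belady_misses` and the trace are made up for illustration, and it assumes `capacity` is at least 1. It's not something you could run against live traffic, which is exactly the point.

```python
def belady_misses(requests, capacity):
    """Count cache misses under Bélády's clairvoyant eviction policy.

    `requests` is the complete, known-in-advance sequence of keys.
    Assumes capacity >= 1.
    """
    cache = set()
    misses = 0
    for i, key in enumerate(requests):
        if key in cache:
            continue  # hit, nothing to do
        misses += 1
        if len(cache) >= capacity:
            future = requests[i + 1:]

            def next_use(k):
                # Position of the next request for k, or infinity if never used again.
                return future.index(k) if k in future else float("inf")

            # Evict the key whose next use is furthest away.
            cache.remove(max(cache, key=next_use))
        cache.add(key)
    return misses
```

For the trace `[1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]` and a capacity of 3 this gives 7 misses. A real cache never gets to look at `requests[i + 1:]`, so practical policies such as LRU can only approximate it.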
The next question is what to cache, where, and for how long. For example, if you use a memcache cluster in front of your database, how do you know when to invalidate the cache? You could cockily say "after a successful database write", which leads to the next question: how do you keep the two consistent? Should the database itself trigger the memcache invalidation? That would probably be the most realistic way. And that's still ignoring the CAP theorem, the question of "when do you know something is done", and concurrency issues.
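Here's a small sketch of why "just invalidate on write" isn't the end of the story. Plain dicts stand in for the database and the memcache cluster (an assumption to keep it self-contained, no real memcache client is involved), the names `write` and `simulate_race` are hypothetical, and the interleaving of reader and writer is ordered by hand to show one unlucky schedule.

```python
def simulate_race():
    """Replay one bad interleaving of a cache-filling reader and a writer."""
    db = {"user:1": "alice"}   # stand-in for the database
    cache = {}                 # stand-in for the memcache cluster

    def write(key, value):
        db[key] = value
        cache.pop(key, None)   # invalidate after the successful write

    # 1. Reader misses the cache and reads the current value from the database.
    value_seen_by_reader = db["user:1"]

    # 2. Writer updates the database and invalidates the (still empty) cache entry.
    write("user:1", "bob")

    # 3. Reader finally stores what it read in step 1, which is now stale.
    cache["user:1"] = value_seen_by_reader

    return db["user:1"], cache["user:1"]
```

After this schedule the database holds `"bob"` while the cache keeps serving `"alice"` until something else evicts or overwrites the entry. The invalidation ran at the "right" time and the result is still inconsistent.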
Is it efficient to discard information at this point? The main problem with cache invalidation is efficiency: throw an entry away too early and you pay to fetch it again, keep it too long and you serve stale data.
And when you think about the level of complexity in the distribution models, going from a single core to multiple cores, and then you add multiple machines, containers in virtual memory and network distribution on top, the whole thing really starts to get complicated... Those are just things off the top of my head, and I won't get into the algorithms :)