I cannot speak to the architecture used by Facebook and Netflix, but my company develops a product for just such a purpose. Our approach is fairly cutting edge: we use evolutionary (genetic) algorithms and artificial intelligence to evaluate and optimize multivariate experiments. Evolution lets us handle a larger search space without evaluating the full set of permutations, and AI lets us run multivariate experiments rather than simple A/B tests by analyzing the importance of individual features without needing to isolate them.
The process goes something like this: a subset of candidates is generated from the full set of permutations. Users are allocated to different candidates and their interactions with the site are recorded. After a certain amount of time and traffic, the results are analyzed. Candidates that performed well are chosen to move on; the rest are killed off (think survival of the fittest). The surviving candidates then "reproduce" with one another to form offspring with similar but slightly varied features. This process is repeated again and again until the difference in performance of the elite candidates between successive generations becomes insignificant.
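To make that loop concrete, here is a minimal sketch in Python. It is not our production code, just an illustration of the generate / measure / select / reproduce cycle. The search space, the parameter names (`population_size`, `elite_size`, `tolerance`) and the `simulated_conversion_rate` function are all made up for the example; on a real site the measurement step would be live traffic collected over time, not a function call.

```python
import random

# Hypothetical search space: each page variable has a few possible values.
SEARCH_SPACE = {
    "headline": ["A", "B", "C"],
    "button_color": ["red", "green", "blue"],
    "layout": ["grid", "list"],
}


def random_candidate(rng):
    """Pick one value per variable to form a candidate."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Offspring takes each variable from one of its two parents."""
    return {name: rng.choice([parent_a[name], parent_b[name]]) for name in SEARCH_SPACE}


def mutate(candidate, rng, rate=0.1):
    """Occasionally re-draw a variable so offspring vary slightly."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in candidate.items()
    }


def evolve(measure, population_size=8, elite_size=3, tolerance=1e-3,
           max_generations=50, seed=0):
    """Run generations until the elite's mean score stops changing noticeably."""
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(population_size)]
    previous_elite_mean = None
    for generation in range(1, max_generations + 1):
        # "Measure" every candidate once, then rank best-first.
        scored = sorted(((measure(c), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)
        elite = scored[:elite_size]
        elite_mean = sum(score for score, _ in elite) / elite_size
        if previous_elite_mean is not None and abs(elite_mean - previous_elite_mean) < tolerance:
            break
        previous_elite_mean = elite_mean
        # Survivors carry over; the rest of the population is their offspring.
        survivors = [c for _, c in elite]
        offspring = [
            mutate(crossover(*rng.sample(survivors, 2), rng), rng)
            for _ in range(population_size - elite_size)
        ]
        population = survivors + offspring
    best_score, best_candidate = elite[0]
    return best_candidate, best_score, generation


def simulated_conversion_rate(candidate):
    """Stand-in for real traffic data: an invented score per variable value."""
    weights = {"headline": {"A": 0.01, "B": 0.03, "C": 0.02},
               "button_color": {"red": 0.01, "green": 0.02, "blue": 0.00},
               "layout": {"grid": 0.02, "list": 0.01}}
    return sum(weights[name][value] for name, value in candidate.items())


if __name__ == "__main__":
    print(evolve(simulated_conversion_rate))
```

Note that only `population_size` candidates are measured per generation, which is the point: the full set of combinations is never enumerated. The stopping rule here (change in the elite's mean score below `tolerance`) is one simple choice; with noisy real-world data you would want a proper significance test instead.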
In a similar vein, for feature flagging (not content optimization) we use a tool made by LaunchDarkly to toggle features on and off for specific users or environments. These flags, however, are configured by hand as we roll out certain features.
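For anyone unfamiliar with the idea, the sketch below shows what "toggle a feature for specific users or environments" boils down to. To be clear, this is not LaunchDarkly's SDK or API; it is a hand-rolled illustration, and the `Flag` class, its fields and the `new_checkout` flag name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    """A hand-configured flag: on for listed users or environments, else the default."""
    default: bool = False
    enabled_users: set = field(default_factory=set)
    enabled_environments: set = field(default_factory=set)

    def is_enabled(self, user_id, environment):
        if user_id in self.enabled_users:
            return True
        if environment in self.enabled_environments:
            return True
        return self.default


# Hypothetical configuration, edited by hand as a feature is rolled out.
FLAGS = {
    "new_checkout": Flag(enabled_users={"alice"}, enabled_environments={"staging"}),
}

if __name__ == "__main__":
    print(FLAGS["new_checkout"].is_enabled("alice", "production"))
    print(FLAGS["new_checkout"].is_enabled("bob", "production"))
```

The contrast with the evolutionary approach is that nothing here is measured or optimized; someone decides who sees the feature and edits the configuration accordingly.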