@chrischapman
R, Quant UX, and Marketing Research author. Executive Director, Quant UX Association.
You're welcome, and I'm glad it was useful. The Van Westendorp (VW) model here has various problems and limitations, and it is not unusual for the lines to look strange or not to cross. The main point is that VW is more of a semi-qualitative tool about stated expectations than a rigorous quantitative assessment; it is sensitive to exact wording and to respondents' understanding. My recommendation in that case is to describe the lines of interest with summary statistics such as the median or other percentiles. For example, we might report something like "50% of the sample believed $50 was too cheap, and 75% would consider it at $100." That can deliver similar insight while staying with the data we have. As for supplementing with another tool: if you can get more sample, then I'd recommend conjoint analysis and/or a "monadic pricing" approach. If you can't get more sample, then I would go with summary stats as noted above. HTH!
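To illustrate the percentile-summary idea, here is a minimal sketch in Python using entirely hypothetical "too cheap" responses (simulated data; the distribution and sample size are assumptions, not from the original survey):

```python
import numpy as np

# Hypothetical stated "too cheap" price thresholds from N=200 respondents
rng = np.random.default_rng(1)
too_cheap = rng.lognormal(mean=np.log(50), sigma=0.4, size=200)

# Report summary percentiles instead of relying on curve crossings
median_too_cheap = np.percentile(too_cheap, 50)
p75_too_cheap = np.percentile(too_cheap, 75)
print(f"50% of the sample said prices below ${median_too_cheap:.0f} are too cheap")
print(f"75% said prices below ${p75_too_cheap:.0f} are too cheap")
```

The same two-line summary works for any of the four VW question lines, and it stays interpretable even when the lines never cross.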
Thomas Hettig That is exactly what I would do: run the survey with the best experience I can, to get good data from a good respondent experience, and not worry about whether I have some "right" number of observations. The other thing I would say is this: in that case, the specific estimates for any single individual will be imprecise, but the overall averages and the estimated numbers of individuals preferring one thing or another will be precise. Unless you are specifically targeting individuals, that is good enough for almost everything (some forms of segmentation being the exception).
Yes, generally speaking, segmentation benefits from higher per-respondent precision. The usual recommendation is to have 3 exposures of each item for each respondent. So, with N=14 items shown 5 at a time, that would be 14x3 / 5 = 8.4 (or 8) screens. OTOH, respondent attention lags over time, and there is some evidence that preferences do not change much after 6-8 tasks, at least not unless attention is refreshed, e.g., with an interstitial task or screen. In the present case, I went with 6 screens x 5 items / 14 items = 2.1 average presentations per item per respondent. But the goal here was mostly to do high-level summaries. (In a later post, I look at segmentation as a kind of side analysis.) As for individual-level precision and segmentation, it also depends whether you want to do LCA (estimating segments together with utilities) or post-hoc clusters (utilities first, then segments). LCA is preferable to post-hoc clustering when there are smaller numbers of per-respondent tasks. HTH!
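The design arithmetic above is simple enough to sketch directly (the numbers are the ones from this discussion; nothing else is assumed):

```python
n_items, items_per_screen = 14, 5

# Rule of thumb: ~3 exposures of each item per respondent
target_exposures = 3
screens_needed = n_items * target_exposures / items_per_screen  # 14*3/5 = 8.4

# Design actually used here: 6 screens of 5 items each
screens_used = 6
avg_exposures = screens_used * items_per_screen / n_items  # 30/14 ~= 2.1

print(f"Screens for 3 exposures/item: {screens_needed:.1f}")
print(f"Average exposures/item with 6 screens: {avg_exposures:.1f}")
```

Swapping in your own item count and screen size shows quickly whether a design hits the 3-exposure rule of thumb or, as here, trades some per-respondent precision for a shorter survey.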
The answer would depend on various details. If you're looking to compare groups for a difference in counts between them, then a general Poisson regression would work; just use the group membership as a factor in the model. However, if you want to characterize the statistics of the difference itself, then yes, some other approach would be needed, and the possibilities depend on the details. If it's a change over time, then a time series regression could work. If it's about the absolute difference itself, then I would probably lean towards bootstrapping, because it's hard to say much a priori about its distribution or expectation. A way to sidestep some of those issues is to consider whether proportions might work for your problem instead of counts; then an option like beta regression could work (where the outcome is a beta variable, i.e., a proportion). For my $0.02, that's probably the order I'd try: regression with a class variable; time series if appropriate; bootstrapping; beta or similar regression. HTH!
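As a concrete illustration of the bootstrapping option, here is a minimal Python sketch with simulated data (the group sizes, Poisson rates, and replicate count are all hypothetical, chosen only for the example):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical event counts for two groups (Poisson-distributed)
group_a = rng.poisson(lam=4.0, size=120)
group_b = rng.poisson(lam=5.0, size=120)

# Bootstrap the difference in mean counts: resample each group with
# replacement and recompute the difference many times
n_boot = 5000
boot = np.empty(n_boot)
for i in range(n_boot):
    boot[i] = (rng.choice(group_b, size=group_b.size).mean()
               - rng.choice(group_a, size=group_a.size).mean())

# Percentile 95% confidence interval for the difference
ci_lo, ci_hi = np.percentile(boot, [2.5, 97.5])
print(f"95% bootstrap CI for the difference in mean counts: "
      f"[{ci_lo:.2f}, {ci_hi:.2f}]")
```

The appeal is that nothing needs to be assumed about the distribution of the difference itself; the resampling distribution stands in for it directly.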
I agree! Anchored MaxDiff is a great option to get at something beyond purely relative importance in MaxDiff. OTOH, the primary implementation of Anchored MaxDiff relies on hierarchical Bayes estimation, and that is beyond the scope of something like the "easy" counts-based approach taken here. I'll think about it! (This post is not a recommendation to do counts, but rather was intended to be a demonstration someone might use for learning, before jumping into a real MaxDiff implementation.)
I think you're right, and the future is not so much "AGI" as it is "what does X system solve?" More generally, I imagine we'll see a few paths, and I'm not sure which (or maybe all) will endure. One path is finding the places where LLMs fit well. I think today they are vastly overestimated, but the hype should recede as they settle into specific use cases. Second should be growth in other AI approaches such as conceptual / symbolic / "ontological" AI. Those are more difficult than the grab-and-train black box approach of LLMs but are also more plausibly powerful; what is unclear is when and whether they will achieve large-scale success. A third category is various mash-ups between the two, which are already happening, such as LLMs for generality plus targeted symbolic AI for depth in some area or other.