Wow, (4) sounds really cool! Putting it on my Zotero reading list. I don't understand your walk-through just yet, though. The generative part of the model learns from the actual occurrences of "bread and butter" vs. "butter and bread". Is the binomial now drawn per usage instance of that phrase, per user, or per context in some other way? How would you even encounter a situation where the polarity of the beta distribution is flipped? If each binomial is drawn i.i.d. from the beta, how would they all gravitate toward the opposite polarity? And isn't the beta learned from the actual text, so it should place a high weight on the final correct polarity? Maybe I just need to sit down with the statistics for a while.

(2) is a really neat example of how we develop a consensus on language through recursive modeling! Coming from a computational background, I don't know much about pragmatics or this sort of maxim, but I did get really excited by a NIPS paper a couple of years ago on a model of consensus-building in language learning, also from Goodman's lab.

When you look at the parameters of this model after training, you find that the concentration parameter for very high-frequency expressions is quite low, so most expressions become strongly polarized toward their generatively preferred order, a very few become polarized in the opposite direction, and almost nothing sits in between.
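To make that last point concrete for myself, here's a tiny sketch (my own toy parameterization, not anything from the paper; `mu` and `nu` are just my stand-ins for the Beta's mean and concentration) of how a low concentration parameter pushes i.i.d. draws to the extremes instead of toward the mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameterize the Beta by a mean preference mu and a concentration nu,
# so alpha = mu * nu and beta = (1 - mu) * nu.  Names are mine, not the paper's.
mu = 0.8   # overall preference for, say, "bread and butter"
nu = 0.3   # low concentration -> both shape parameters < 1 -> U-shaped Beta

alpha, beta = mu * nu, (1 - mu) * nu

# Each expression (or speaker, or context -- exactly the modeling choice
# I'm asking about) gets its own ordering probability theta, drawn i.i.d.
theta = rng.beta(alpha, beta, size=10_000)

print("theta > 0.9 (locked into the preferred order):", np.mean(theta > 0.9))
print("theta < 0.1 (locked into the flipped order):  ", np.mean(theta < 0.1))
print("in between:", np.mean((theta >= 0.1) & (theta <= 0.9)))
```

With nu well below 1 the Beta is U-shaped, so the i.i.d. draws don't cluster around the mean at all: nearly every theta lands near 0 or 1, mostly on the preferred side with a few flipped, which seems to be exactly the all-or-nothing pattern described. Whether theta is per expression, per speaker, or per context is still the part I'd like cleared up.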