Permission to conduct this study was granted by mk and thenewgreen.
Opinions are my own. Data was collected from hubski.com/community. If you would like the PDF document or the raw data in Excel format, both are available upon request.
If you would like to contact me, feel free to message me via Hubski, email (cadell.last@gmail.com) or via my website (www.theadvancedapes.com).
First off, well done! This is incredible. I think I know why the commenting data might not correlate well, however. Currently, the popular comment list has a time-dependent variable. The list is calculated by looking at the average score of commentors, but modified by a decay based on the time of their last comment. Due to this decay, the list is not sorted by the most highly rated commentors, but by the most highly rated recent commentors. There are a few reasons why I do this. One is to allow for new users that make good comments to be noticed and appreciated. I also don't want the list to grow stale. Finally, I don't want those that comment frequently to be missed just because the average of their comments obscures a number of quality ones they have made. Thus, the list will always feature commentors that are appreciated, but not only those that have been most appreciated overall. I was just discussing this with StephenBuckley earlier today, actually. I have put some thought into making the 'popular users' list more dynamic as well. IMO the site is improved by a diversity of conversations, and at some level, I think it's counter-productive to reinforce the popularity of a small number of users. That said, I can provide some time-independent comment data if you'd like. I think it would be interesting to see if badges do correlate to a user's average comment score, or perhaps their top or median comment score. Really interesting analysis. I love this stuff.
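To make the ranking described above concrete, here is a minimal sketch. It is not Hubski's actual code. The comment does not say what form the decay takes, so the exponential decay, the 48-hour half-life, and every name here (Commenter, decayed_score, rank) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Commenter:
    # Hypothetical record; field names are illustrative only.
    name: str
    scores: List[float]       # score of each comment the user has made
    hours_since_last: float   # time elapsed since the user's latest comment


def decayed_score(c: Commenter, half_life_hours: float = 48.0) -> float:
    """Average comment score, discounted by time since the last comment.

    Assumption: exponential decay with a fixed half-life. The real
    list may use a different decay function entirely.
    """
    if not c.scores:
        return 0.0
    mean = sum(c.scores) / len(c.scores)
    decay = 0.5 ** (c.hours_since_last / half_life_hours)
    return mean * decay


def rank(commenters: List[Commenter], half_life_hours: float = 48.0) -> List[Commenter]:
    """Sort commenters from highest to lowest decayed score."""
    return sorted(
        commenters,
        key=lambda c: decayed_score(c, half_life_hours),
        reverse=True,
    )
```

Under these assumptions, a user averaging 5 who last commented four days ago scores 1.25, while a newcomer with a single comment scored 3 moments ago scores 3.0 and ranks first. That would be consistent with infrequent or new commentors appearing near the top of the list.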
Hm. I definitely get your interpretation of these data, and your rationale for organizing them this way makes sense. However, I did feel that it was unfair that some people with the top popularity rating had only made a couple of comments. I still feel like it could cause people to feel that frequently and thoughtfully contributing won't add to their overall status on the site. Potentially... that is my interpretation. But I can definitely understand why you want those sections to be a little more fluid. Maybe there is a middle ground? I think a future study on the frequency of badges for posts compared to comments would maybe add some insight into what system would work the best for promoting thoughtful discussion.
I agree with this. I can help out by providing easier access to the data.
Very nice. I seem to be out of badges, or I would slap one on it.
What is the ratio of badges given for comments and those given for posts? Could that give insight into the lack of strong correlation?
As I stated in the article, I think the ratio of badges given for comments as opposed to posts is important to quantify. In the future, I would like to collect that data and run another analysis. However, it won't explain the lack of strong correlation, since comment rating is also not significantly correlated with popularity rank or follower total. I think the lack of strong correlation comes from not taking into account the number of times someone has commented. Those who comment infrequently and/or those who have just joined the site are being ranked high because they have a small sample size of comments. (I probably explain it better in point 3 of the discussion.)
I think one way to add another dimension (and this could only be done with some greater ability to access data, which I don't think we have; do we, mk?) is to look at the comment score at the top of a thread, and the number of threads started. The comment score of heavy users is probably driven down by long conversations between users. No one is going to upvote a comment that's the 10th response about some topic that's only interesting to the two parties involved. The number of threads started, and the average score of those initial comments, could say a lot about how engaged the user is, as well as how much the user contributes to getting others involved.
Definitely agree with this. There is a lot of data that is publicly available, but it would take a long time to organize. In the future, I hope that those who are interested can organize it all and offer mk and thenewgreen some useful information to make Hubski even better!
Perhaps a metric that is closer to the sum of upvotes with some kind of weight on currency, rather than the mean upvotes a user's comments received (I assume that is what we have), would solve the issue . . . if there is one. I would be thrilled to find out top commentary does not come from the same people as best stories. Division of labor and all that.
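As a rough sketch of the metric suggested above, the code below compares a recency-weighted sum of upvotes with a plain mean. It reads "currency" as recency. The exponential weighting, the 30-day half-life, and the function names are all assumptions for illustration. None of it comes from Hubski.

```python
from typing import Iterable, Tuple

# Each comment is a hypothetical (upvotes, age_in_days) pair.
Comment = Tuple[float, float]


def recency_weighted_sum(comments: Iterable[Comment], half_life_days: float = 30.0) -> float:
    """Sum of upvotes, with each comment's upvotes halved per half-life of age."""
    return sum(up * 0.5 ** (age / half_life_days) for up, age in comments)


def mean_upvotes(comments: Iterable[Comment]) -> float:
    """Plain average of upvotes per comment; 0.0 if there are no comments."""
    items = list(comments)
    if not items:
        return 0.0
    return sum(up for up, _ in items) / len(items)


# A frequent commenter with modest scores versus a one-comment user.
frequent = [(2, 0), (2, 0), (2, 30), (2, 30)]
occasional = [(4, 0)]
```

With these made-up numbers, the mean favors the occasional user (4.0 versus 2.0), while the weighted sum favors the frequent one (6.0 versus 4.0). A sum-based metric therefore rewards steady contribution, at the cost of favoring sheer volume.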