KuroshinRatingIssues

MeatballWiki | RecentChanges | Random Page | Indices | Categories

From MojoAnalysis. Discussion moved to separate pages:


Comment rating is entirely negative

Comment rating as it stands will always result in negative behaviour, because comment rating is entirely negative; that is, it is punitive. The best you can do to encourage good work is to punish someone less than you otherwise would.

Consider: since the point of posting is to be read, and higher-rated comments sit at the top of the list, the top rating is the natural state of the material. Thus, every vote other than 5 is a vote down.

Since negativity fosters negativity, there will certainly be issues with comment rating. People will be angered and feel dejected by bad ratings. There will be arguments over the merits of one rating versus another, because it's so important to get the rating up. See UnfairRatingAlert.

Slashdot moderation is just as punitive as Kuro5hin's comment rating system, and is likewise a reinforcing cycle. The punishment there is being ignored: denying people their voice. However, it does actually encourage people with the +1 ratings, which are cumulative. In that case, however, it suffers from the folly of rewarding A while expecting B: it rewards mass appeal in the hope of achieving quality. However, as any stable democracy knows (ironically), mass appeal is the last thing you want. -- SunirShah

The other half of Slashdot moderation: for those who do care, the moderation system works to the extent that, for the most part, good stuff gets moderated up, so browsing with a high threshold tends to produce decent posts. But you miss a lot due to the false-negative problem I mentioned in my other post(s). -- KarstenSelf

I disagree rather strongly with your assessment that moderation is strictly negative. It isn't. Content starts its life on K5 as neutral, as posted by virtually all users (untrusted users being the only exception). Moderation adds value to the content by indicating its relative worth, which may be lower or higher. If your problem is with assigning relative value and/or merit to things, I'm afraid we have a fundamental disagreement. If your issue is that any particular measurement will have a degree of arbitrariness, and variable appropriateness according to individual preferences, we're in agreement. But I return to my point on FairnessOfKuroshinCommentRating: moderation is better than no moderation, in a sufficiently data-rich environment. -- KarstenSelf

Premises:

  1. People (mostly) want their comments to be read.
  2. Readers read the topmost comments first.
  3. Readers don't necessarily read all comments.

Therefore, people want their comments at the top of the sort, and anything that jeopardizes that position is a move against the author's wishes. The existing sorts are: unrated then highest, highest first, lowest first, or no sorting. In the last case, the rating system doesn't matter. In the first case, merely rating a comment moves it down the list. With highest first, as lower-rated comments move down, the only way to satisfy a particular author's wish is to vote 5.0; anything else would be a slight. Lowest first is just the inverted case of highest first.
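For concreteness, the "unrated then highest" ordering described above can be sketched in Python. This is a hypothetical illustration, not Scoop's actual code; the data and key function are invented. Unrated comments (represented as None) sort first, followed by rated comments in descending order:

```python
# Illustrative comments: (name, rating), with None meaning "unrated".
comments = [("a", 4.5), ("b", None), ("c", 3.0), ("d", None), ("e", 5.0)]

def unrated_then_highest(comment):
    _, rating = comment
    # Unrated (None) sorts before any rated comment; rated ones descend.
    return (rating is not None, -(rating or 0))

ordered = sorted(comments, key=unrated_then_highest)
assert [name for name, _ in ordered] == ["b", "d", "e", "a", "c"]
```

Note how, under this sort, the very act of rating a comment (giving it any number) moves it out of the front "unrated" group, which is the mechanic Sunir describes.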

Now, while not every author thinks this way, certainly many do. I'm also willing to admit that a 4.0 or higher is actually a positive rating, because it's higher than the average rating on k5. Still, getting a 4.0 is difficult, which leads to a lot of frustration, especially because it's entirely unclear what will earn a favourable rating: the feedback channel is so poor. A number instead of a sentence? Sentence, please. -- SunirShah

This is a truly insightful comment. It's impossible to promote the creation of new content (or anything else) through punishment. How well would a whip-master cajole a herd of writers into producing literature? Not that comment rating is the same as being whipped; it's just an analogy. -- MaynardGelinas

Let's turn this around for a minute. The disposition of a comment is entirely an aspect of how the system displays it. Therefore the "reward/punishment" aspect of moderation is not a property of the moderation itself, but an emergent property of how moderation is represented. Scoop could be trivially modified such that unrated comments don't post until rated by, say, a very trusted user (VTU). In this case, moderation is a reward for good posts. Lack of moderation is the same as censoring the post. -- KarstenSelf 8 Apr 2001

Comments start out with an undefined rating, not a 5. The best way of putting it would be that they start at 0/0, and then, as users rate, they approach the comment's "true" rating, which exists all along. Of course, this assumes that an objective analysis of a comment can be made, although Wiki:ChristopherAlexander's work would suggest it is possible.

The other view is that at the beginning there is no rating, and as people rate, a consensus is approached. Either way, comments don't start at 5, so the system is not negative as such.

Otherwise, any rating system with an upper bound would be inherently negative. -- DanielThomas

Speaking of unrated comments: quite intentionally, the default rating is null. An unrated comment is not a 0, it's not a 3, it's not a 5, it's not 3.29. It's unrated. Slashdot suffers this bug: unrated comments are '1', so you can't tell whether a comment is a '1' because it hasn't been rated or because it's been rated both up and down. This is a very bad bug in Slashdot. -- KarstenSelf
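The distinction matters in code: representing "unrated" as null rather than as any numeric default keeps an unrated comment distinguishable from one whose ratings genuinely average to 1. A minimal Python sketch (the function name and representation are assumptions for illustration, not Scoop's code):

```python
from typing import Optional

def average_rating(ratings: list[float]) -> Optional[float]:
    """Return the mean of actual ratings, or None if unrated.

    Unrated is None, never a numeric default, so an unrated
    comment cannot be confused with a genuinely low-rated one.
    """
    if not ratings:
        return None
    return sum(ratings) / len(ratings)

assert average_rating([]) is None          # unrated: no number at all
assert average_rating([1.0]) == 1.0        # genuinely rated '1'
assert average_rating([0.0, 2.0]) == 1.0   # rated both up and down to '1'
```

Under Slashdot's scheme, all three cases above would display as '1', which is exactly the ambiguity Karsten describes.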


Ratings indicate trust

The other main purpose of comment rating is to provide an "objective" measure of an individual's commitment to good discussion, in order to select those with the highest commitment, so that they can be given the tools to help administer the discussions and keep them high-signal. -- RustyFoster

The problem with conferring trusted status upon people who write good posts is that it does not follow that a person who writes well is trustworthy (consider demagogues, or Peter writing as Locke in Ender's Game), nor that a person who does not write well is untrustworthy. Even on wikis, we know this from the WikiMindWipe. Indeed, the worst damage to a community comes from long-time members who have turned sour. The ones with the most invested are those most hurt, and consequently those most vindictive.

Really, using comment rating to indicate trust is what managerial contexts call "the folly of rewarding A while expecting B," and it is common in business. Consider a potato farm that wants to increase its profits, which means planting more seed. The owner tells the supervisor to plant more acres. The supervisor plants the seed at 3cm depth instead of 6cm, because he can cover six more acres an hour that way. Come harvest time, most of the potatoes have died from exposure. (This case was apparently common in communist Russia.)

And remember that people will always optimize towards the reward. Enter the whores. -- SunirShah

I consider the trust conferred by moderation to be a relatively minor aspect of the system. -- KarstenSelf 8 Apr 2001

Whoring was of course considered. Its potential is limited by two factors.

Trusted users don't gain posting privileges, just the ability to see spam. This is not exactly an attractive reward for most people. It's more of a burden than anything, unless you are a potential attacker. (See above)

Users can't see their mojo, so there's no running score to consider. I'm very firm on never opening mojo to anyone. Its use as a metric is to the system alone. So it only appears in the two trust-related situations, and then you still only have a slightly different boundary value to guess by. Besides this, there is a very limited range of possible mojo values, so you can't go and rack up the points to spend at a later date. -- anon.

Note that several well-known "dissidents" are also trusted users. There are fewer dissenters (by definition), but almost all of the known and consistent dissidents are in fact trusted. -- RustyFoster

Note that, from FairnessOfKuroshinCommentRating, we know that attacks like the Trolltalk SID can confer trusted status on anyone for free. Indeed, the only good way I know to give trust to people is through a TrustMetric of some sort; specifically, I would advocate a FunctionalAccessTrustMetric. -- SunirShah

It's important to note that the trust conferred by trusted status is also extremely limited. Trusted users have the ability to rate comments "0". Now, that doesn't mean they can make a comment's rating 0, no matter what. They get a single rating, like everyone else, but theirs can be lower than normal.

The reason the system can be so loose with its trust is simple: as long as you can maintain one "good" trusted user for every four "bad" trusted users, the attackers cannot gain the upper hand.

Assume a worst-case scenario. You have a group of ten users who have taken it upon themselves to gain trusted status and try to silence one person. So, they have a pact to follow one other user around and rate everything that user posts "0". Further assume that they have infinite free time, and can always be "first on the scene" when the attackee posts a comment. So no non-trusted users ever get a shot at rating these comments.

All trusted users get the "review hidden comments" link, and most of us use it from time to time, so methodically unfair rating would be quickly noticed by other TUs. Now, if there are ten "0" ratings on a comment, it takes only three other trusted users to bring that comment back into the visible range and foil the plot. If the "good" TUs rate 5 and the bad TUs rate 0, the rating ends up at 15/13, or about 1.15. This is not a "good" outcome, in that the comment has still been rated far lower than it probably deserved, but it will give other readers a crack at rating it fairly.
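The arithmetic of this worst case is easy to verify: ten 0 ratings from the attackers plus three 5 ratings from fair trusted users average to 15/13, roughly 1.15, which is back above the 1.0 visibility threshold. A quick check:

```python
# Worst-case scenario from above: ten attackers rate a comment 0,
# three fair trusted users rate it 5.
ratings = [0] * 10 + [5] * 3

average = sum(ratings) / len(ratings)

assert len(ratings) == 13
assert abs(average - 15 / 13) < 1e-9   # about 1.15
assert average >= 1.0                  # above the hidden threshold, so visible again
```

The same arithmetic underlies the 1-good-per-4-bad claim earlier: one 5 against four 0s averages to exactly 1.0, the edge of visibility.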

Also, now that ratings are open, it will be perfectly obvious who the abusers are, and other TUs can take measures to remove their trusted status. A single user who knows that abuse is going on could likely manage to strip a trusted user of that power in a very short time.

So, while a ring of mojo is possible, actually using it to launch an attack is quite a lot harder than it looks, because the trust gained is very limited and subject to peer review, and the power to up-rate is much stronger than the power to down-rate. Five times as strong, in fact.


Spam removal

The only function I will grant to Mojo without argument is the removal of spam. KuroShin is remarkably free of spam. True, it's partly because the contributors are of high quality, but Mojo itself plays a successful role. As a trusted user, I can see the hidden comments (rating < 1.0), and I can tell you they deserve to be there. In this role, Mojo essentially grants FunctionalAccess to kuro5hin's trusted users, distributing the administrative load from the site proprietors; thus there are more people who can dump cruft. Additionally, because people can make mistakes (value being a subjective function), others have the ability to lift a hidden comment back to visibility (rating ≥ 1.0). In this respect, Mojo provides limited PeerReview and ReversibleChange. -- SunirShah
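The visibility rule described here (hidden below a rating of 1.0, reviewable by trusted users, reversible by re-rating) can be sketched as follows. The function and constant are illustrative assumptions, not Scoop's actual code:

```python
HIDDEN_THRESHOLD = 1.0  # per the text: comments rated below 1.0 are hidden

def can_see(rating, trusted):
    """Visibility rule as described above (a sketch, not Scoop itself).

    rating is None for unrated comments. Trusted users can review
    hidden comments and re-rate them back into visibility.
    """
    if rating is None or rating >= HIDDEN_THRESHOLD:
        return True
    return trusted  # only trusted users see hidden comments

assert can_see(None, trusted=False)      # unrated: visible to everyone
assert not can_see(0.5, trusted=False)   # hidden from ordinary users
assert can_see(0.5, trusted=True)        # trusted users can review it
```

Because visibility is a pure function of the current rating, lifting a comment back above the threshold immediately restores it for all readers, which is the ReversibleChange property Sunir notes.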

There is no doubt, to me anyway, that the system as it stands has been successful in keeping the spam away; it's working far better than I thought it would. Now, it can't do that without a means of distributing extra power to those who have demonstrated a clear enough investment in the site to be trusted with it. So, trusted user acclamation is a necessary precondition for the spam-removal function. And the only way I can think of to identify people who are not likely to abuse the (albeit limited) ability to censor posts is to flag those who consistently contribute excellent material. -- RustyFoster

The real key is to identify the outlier Really Bad Elements that come along, and quickly. -- KarstenSelf

Further observations on effectiveness against spam. K5 has been exceptionally effective against several types. A brief taxonomy of various "low-value content" attacks:

  * Scripted attacks: moderation and mojo don't do much here; the defenses are elsewhere in the system.
  * Obscenity: typically less systemic than other forms of attack (it's sometimes appropriate, sometimes not), and K5 moderation tends to treat it fairly well.
  * Personal attacks: slightly more systemic (there tends to be a pattern), and the response is generally fairly strongly negative on both sides.
  * Abusive or junk posts: users who tend toward these rather rapidly settle down to untrusted status, and their posts are not visible.

While K5 has seen FP, goats.ex, Meept!, and other attacks, they've been very short-lived.

One feature which helps immensely here is the "view hidden comments" feature, which allows trusted users to see any posts that have drifted below the '1' threshold. Typically, this results in a dog-pile of moderations on these low-value posts, sometimes lifting a post above the threshold, but very often pinning it further below. An additional option to look for posts which have accrued any zero ratings would be helpful (actually, what's needed is a rewrite of the search function). -- KarstenSelf 8 Apr 2001

This is true. It's also more likely the badness comes from mistakes, not malice. I think it's more important to fix mistakes made in casual error, and to guide people who act in good faith but erroneously, rather than concentrate mostly on how to deal with potential threats. -- SunirShah

The real test is whether Mojo and moderation can scale. K5 has about 9500 users right now. How it scales past 100k will be of interest, though even growth to several tens of thousands will be telling. -- KarstenSelf

I don't think it's really possible to predict what will happen with a much larger scale. What happens will entirely depend on how you got there. Your best bet is to study the set of communities that have come before you, how they got where they were and what you can learn from their successes and failures. Over time, I hope MeatballWiki will be populated with a large set of patterns (or whatever) on how to solve community engineering problems. -- SunirShah

No, this is the whole point.

It's relatively easy to create a functioning community on a sufficiently small scale. That problem has been solved (or rather, solves itself), and doesn't interest me. What's tough is to provide a flexible system on a very large scale while protecting it from various forms of abuse (DenialOfService, a degraded SignalToNoiseRatio), promoting intelligent discussion, and discouraging hostile activities. Slashdot doesn't scale, in a social sense. Usenet does, reasonably well, but in a rather spotty manner, relying heavily on users' choices of client and other tools to do so.

If the aim were to create YADS (yet another discussion site), then K5 has been successful and we can all go home. The objective is to create a system, spanning multiple sites if possible, which promotes intelligent, high-value interchanges among groups of people. Usenet on steroids is my goal; make that an intelligent Usenet on steroids. Mind you, Usenet still works in certain quarters, be it by moderation, obscurity, or simply a focus of minds on a topic that happens to work. -- KarstenSelf


Reader participation and rating replies

It might make sense to prevent rating not only of one's own comments, but also of the comments in reply to them. -- MaynardGelinas

Seconded. Actually, Slashdot may have a point in preventing users from rating comments in stories they have commented on. Often the discussion weaves and wends through several disjoint threads. -- SunirShah

Not sure. I'd argue that this is a case to watch for abuse, as is moderation of editorial comments in the submission queue.

One thing I didn't see mentioned is how to promote rating by readers alone. One can argue that a rating by a reader, rather than by a writing and debating opponent, is of greater value to the forum. It might make sense to reward with mojo points those readers who take the trouble to rate. I suggested one way of doing this: limiting ratings to those who are not directly involved in writing a particular thread. However, it may make sense to actually give readers mojo points for valid ratings. Of course, you'll need a countermeasure to prevent readers from blindly rating all comments a 1 or a 5… so give mojo when a reader's ratings show a wide deviation from comment to comment. -- MaynardGelinas
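Maynard's deviation countermeasure could be sketched as a simple check on the spread of a reader's ratings. The function name and the minimum standard deviation are hypothetical parameters for illustration, not anything Scoop implements:

```python
import statistics

def earns_rating_mojo(ratings_given, min_stdev=0.5):
    """Sketch of the suggestion above: reward a reader with mojo only
    when their ratings vary from comment to comment, to discourage
    rubber-stamping everything a 1 or a 5 (min_stdev is hypothetical)."""
    if len(ratings_given) < 2:
        return False
    return statistics.stdev(ratings_given) >= min_stdev

assert not earns_rating_mojo([5, 5, 5, 5])   # rubber-stamping: no reward
assert earns_rating_mojo([1, 4, 5, 2])       # discriminating ratings: reward
```

A determined abuser could still alternate 1s and 5s to fake spread, so this would only be one signal among several, not a complete defense.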

As the three behaviours on kuro5hin are reading, writing, and moderating, maximizing content quality means encouraging good writers as well as good moderators. We can ignore the problem of trusting readers simply because they don't participate in the system. I would, however, like to additionally assess moderation quality and find a way to reward those who moderate well even though they may not write. -- KarstenSelf


Complexity

Many users already find the ratings to be too demanding, and a good chunk are likely to stop providing data altogether if it gets any more complex. This would have the overall effect of weakening the system, rather than strengthening it.

The second problem is computational overhead. We are discussing a real-world system, not an abstract ideal, so computation must be factored in. Many people already think the existing system is absurdly high-overhead, and adding to that would be difficult. -- RustyFoster

Keep your FeatureKarma off the floor. No more tax codes!

Hear, hear. It's too complicated as it is. :-) -- RustyFoster

Binary Ratings

Over on DailyKos?, a BinaryScoopRatingSystem?, where a "recommend" is equivalent to a 4 and a "troll" rating is equivalent to a 0, seems to be working quite well... that is, once TheCollective got their collective heads around it after the change from the 0-to-4 ScoopRatingSystem?. -- DanielThomas


CategoryRatingSystem CategoryKuroshin
