R P G A M E R . C O M   -   E D I T O R I A L S

Of Scoring and the Difficulty Thereof

Simon Seamann
STAFF EDITORIALIST



Introduction

The more angry words and bitterness assigned to this debate over numbers, the more it seems as though all parties involved have lost sight of the full complexity inherent in the scoring system. Having followed Points of View since its inception, first as a reader and now as a staff member, I feel it is important to remind everyone involved in this debate why it is an issue worth discussing. And, more importantly, why it is a debate that will never be resolved by current measures.

Scoring: Why Bother?

Scoring systems, by themselves, have no objective value. If I give you a review score of 8 on a 10-point scale, but with no context, there is no way you can accurately determine what that score is supposed to represent. Is it a high score? Low? Well, if I submit nine more reviews that all score 10, that 8 will seem quite low by comparison. Conversely, if those nine reviews all scored 2, the 8 will seem exceedingly great. In effect, the value of a scoring system is based entirely on its ability to differentiate between two items in a numerical fashion. Of course, a scoring system can also be given context simply by defining what each number is supposed to represent. Points of View has done this already, and it was clear that a 5 was average, 10 was perfect, and so on.
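The point about context can be put into numbers. What follows is only a rough sketch: the function name is made up for illustration, and using the mean of the other reviews as the yardstick is my own assumption rather than anything Points of View prescribes.

```python
from statistics import mean

def relative_standing(score, other_scores):
    """Return how far a score sits above (+) or below (-) the mean of the other scores."""
    return score - mean(other_scores)

# The same 8 measured against nine reviews that all scored 10 ...
against_tens = relative_standing(8, [10] * 9)
# ... and against nine reviews that all scored 2.
against_twos = relative_standing(8, [2] * 9)

# A negative result means the 8 reads as low; a positive one means it reads as high.
print(against_tens, against_twos)
```

The 8 itself never changes; only the scores around it do, which is the whole problem with a number presented alone.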

A subtle problem of logic arises, though, when such a system lacks variety. Put simply, a casual glance through the review archives will reveal that scores of 1-5 are almost completely absent, at least compared to the oceans of 7s and 8s. Roku mentioned in his editorial "[...] the majority of the reviewers used 5/10 as average." And yet, if this were true, then why would a change in scale be necessary? Where were all the 5s when such a score was supposed to represent a clear majority of all games? I also distinctly remember a sense of disgust throughout my career as a review reader at how 7 became the unofficial standard for average - that very disgust was the foundation of my own attempts at an objective reviewing style that ensured even the games I felt were the best on the market (i.e. my favorites) were scored no higher than 8 if they had any flaws whatsoever. This leads into one of the first problems with review scores.

Objective versus Sentimental

The overwhelming majority of RPGs are not afternoon affairs, to be played between instant messages and commercial breaks. They are investments of sorts, consuming significant amounts of time and money. Quite honestly, we expect a return on our investments greater than what we funneled into them - if we are merely breaking even, we have in fact lost the opportunity to invest that time and money in something that could have been better. No one is angrier than the person walking away from an investment that actually wasted time and money with little to no benefit.

But I know firsthand the psychologically comfortable feeling that comes from hasty rationalizations. "The battle system wasn't that bad." "The script was terrible, but there were some funny bits." "Music wasn't important in the game anyway." We have a psychological motivation to inflate our review scores, as by doing so we affirm to ourselves that the game we just completed was not as significant a waste of time and money. This is more than just a presumption - it is backed up by the very reviews whose scores are inflated. The text the numbers are supposed to be based on actually runs counter to the numbers usually assigned. Moreover, I often noticed that while a reviewer might be quite candid in the category scores inside the review (i.e. often using low scores for things such as Interaction, Music, etc.), the overall score would mysteriously jump up several ranks in the final analysis.

This "mysterious" variable is supposed to be Fun Factor, and Points of View encouraged its use in the final analysis under the old system since it lacked its own explicit category. My main concern with Fun Factor is that by making it a variable that applies to everything, it reduces the usefulness of all scores. For example, you may hold the opinion that watching paint dry is exceedingly fun, and as such your scores for the hypothetical game "Final Paint Dry Fantasy IV" will inexplicably be inflated. By the Points of View guidelines, you do not have to explain that you think drying paint is fun, so there is no real way to determine the source of the inflated scores. I do not enjoy watching paint dry, and yet it is pretty difficult to argue with a 9/10 score by itself. Some might reason I should be basing my decisions on the text of the review instead of on the final score, and I agree. However, the point is moot - if scoring were not an issue worth discussing, we would not be using numbers to begin with.

One of the least lamented changes under the current system was the amalgamation of Localization, Interface, and Fun Factor into the catch-all category of Interaction. I have hitherto remained silent on my distaste for this category primarily because Fun Factor was never a real issue for my own reviews - I strive to remain as objective as possible, allowing the reader to project their own sense of Fun Factor onto my scores rather than the other way around. I began to discover, however, that many readers apparently did not get the memo that Fun Factor was no longer supposed to influence the final score more than a single category can. Additionally, I played a series of games that had terrible interfaces and questionable localizations but remained fun (similar to Xenogears), and others that had acceptable interfaces and great localizations but were otherwise completely boring (similar to Vanguard Bandits). It makes little sense to take what makes games fun for most people in the first place - the battle system or plot - and smash it into two other things that are usually no more noteworthy than replay value.

The solution that I endorse the most would be giving Fun Factor its own category. Generally speaking, the final score will still be inflated (as Fun Factor will almost always be scored as a 5), but at least now we can tell why simply by looking at the numbers. This allows us to essentially remove the majority of a reviewer's bias from the overall score in cases where we fundamentally disagree with that person, as in the paint dry example. Having this new category will also necessitate an explanation from the reviewer of what he or she thinks is so fun about the game. Which, quite honestly, is the reason most reviewers write in the first place. Under the current system, we are supposed to sort of divine from the text and prior reviews what the reviewer enjoys in an RPG - this hypothetical system makes it explicit, and allows readers to understand the reviewer's bias from a single context.
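To show what "removing the bias" might look like in practice, here is a small sketch. Everything in it is hypothetical: the category scores are invented, and treating the overall score as a plain average of the categories is my assumption, not how Points of View actually calculates it.

```python
from statistics import mean

# Invented category scores on a 5-point scale for an imaginary review.
review = {
    "Battle System": 3,
    "Interface": 2,
    "Music": 2,
    "Plot": 3,
    "Fun Factor": 5,  # the reviewer's sentiment, now stated in the open
}

def overall(scores, exclude=()):
    """Average the category scores, skipping any category named in `exclude`."""
    kept = [value for name, value in scores.items() if name not in exclude]
    return mean(kept)

with_fun = overall(review)                              # every category counted
without_fun = overall(review, exclude=("Fun Factor",))  # reader discounts the sentiment

print(with_fun, without_fun)
```

A reader who does not share the reviewer's taste can drop the one category and see what is left, something that is impossible when the sentiment is spread invisibly across every number.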

One Step Forward, Two Steps Back

I was recently made aware that Roku's campaign to bring back the 10-point scale has resulted in a compromise of sorts with a 5.0 scale. Although I suppose a compromise in and of itself is praiseworthy, it should be clear by now that it does not fix the scoring problem, and really just obscures it under a false sense of progress. In practice, you can be sure that a score of 1-2.5 will rarely be used, just as 1-2 and 1-5 were rarely used before. I will also make the prediction now that we will see sentimental scoring raise the overall score to a 3.5 in most reader reviews, just as it raised it to a 7 before. This only prolongs our current predicament of guessing at biases while attempting to differentiate between "good," "great," "fantastic," and other such synonymous words.

Furthermore, while this change seems to make a wider range of scores available, this is a fairly moot point - the number of reviews with an overall score of 1-2.5 will not see a measurable increase, due to the basic fact that there is little incentive to finish a game that you do not enjoy playing and then take the time to write a review about it. Since the 5.0 scale does not allow fractions in the individual categories, the one place where more options would actually illustrate a point, the supposed victory is a hollow one.

Conclusion

Since we are unwilling to do away with the numbers altogether, the most sensible choice given the circumstances would be to create a new category labeled Fun Factor and use that as a hedge against the natural instinct to inflate scores. A return to the 10-point scale is not entirely necessary, but it would be useful for the one numerical section of the review that actually matters: the categories themselves. In the worst-case scenario of no further action taking place, hopefully this editorial has made everyone just a bit more aware of the difficulties of scoring, and of why starving Points of View over some perceived slight is never going to change or even address the underlying problem that caused the animosity. And for those who do take the time to review, perhaps it has made them just a bit more inclined to score objectively, so as to help give the scoring system itself a sense of actual value.




© 1998-2008 RPGamer All Rights Reserved