Game moderation reloaded

Y
Renaissance

OnceInALifetime

Joined
24 Sep 05
Moves
30579
07 Mar 08

Originally posted by HomerJSimpson
arriakis is rumored to be steve exeter. I would like proof that he isn't steve before voting for him.
Nonsense, here is the proof (bottom page 1). Thread 82784

G
Mr. Shield

Joined
02 Sep 04
Moves
174290
07 Mar 08
1 edit

My votes would go to:

Tony
Gatecrasher


I think at least 2 or 3 others would be needed to ensure proper analysis, and a little bit of speed (spreading the load out). Perhaps CMSmaster would be a good choice, too?

G
Whale watching

33°36'S 26°53'E

Joined
05 Feb 04
Moves
41150
07 Mar 08

Originally posted by gezza
What is the workload?
Gatecrasher:"After a week or so [...] I was about 60%-70% through the workload [...]"
How many hours/days work are we talking about?

I want to be stamped as clean at this time. I have completed 54 games over the last 2 1/2 years - If the last game mods had some sort of pretty tool to analyse, run that. I would rather deal with the ...[text shortened]... on. If I decide to give my name, I am sure there will be others who recognise me.
The workload is variable. cmsmaster suggests that the previous team could not cope with the number of complaints, but in the second half of 2007 we received very few complaints and every complaint received was investigated. There was seldom a backlog. I do agree more team members would be better to reduce workload, but too many can slow down effective decision making.

I worked hard on the case in question because it was so critical, and because so much evidence had been submitted. There were many analysed games, comments and observations on specific moves, and all of these had to be verified, replicated and evaluated for objectivity. Normally, a complaint is along the lines of "so-and-so's graph looks dodgy" and we would simply run our own checks.

A game mod should be prepared to spend several hours a week modding.

Automated tools have been developed which make life easier. Manually analyzing games with a specific engine is time-consuming, but sometimes unavoidable.

For the record, and without going into specifics that could help cheats avoid detection, I want to clarify the use of statistics by game mods.

The misuse of statistics in that blog entry (and subsequent discussion in an RHP forum thread) is indeed appalling. The test statistic produced by the tool cannot be converted into a "% chance of cheating" simply by subtracting it from 1. Rather, it is the probability that a genuine human player could innocently equal or exceed the match-up stats produced from a suspect's batch of games.

This can be likened to the tossing of coins. Toss a fair coin 10 times, and you will most likely get 5 heads. A test statistic using a cumulative binomial distribution would suggest a 62.3% chance of achieving 5 or more heads, a 37.7% chance of 6 or more, a 17.2% chance of 7 or more, a 5.5% chance of 8 or more, a 1.1% chance of 9 or more, and a 0.1% chance of getting 10 heads out of 10. But if we toss the coin 10 times and actually get 10 heads, that cannot be converted to mean there is a 99.9% chance that the coin was not fair. We have already established that the coin is fair. Similarly, to say there is a 75% chance that Tal used an engine to make his moves is absurd. We know he didn't.
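For anyone who wants to check the coin-toss figures, here is a minimal Python sketch of the cumulative binomial calculation. The function name `tail_probability` is made up for illustration; it is not part of the game mod tool, and nothing here says how that tool works internally.

```python
from math import comb


def tail_probability(n: int, k: int, p: float = 0.5) -> float:
    """Probability of getting k or more successes in n independent trials,
    each succeeding with probability p (cumulative binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Fair coin, 10 tosses: chance of "k or more heads" for k = 5..10.
    for k in range(5, 11):
        print(f"P({k} or more heads in 10 tosses) = {tail_probability(10, k):.1%}")
```

Each printed value is the chance that a fair coin does at least that well. None of them is the chance that the coin is biased.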

This is hypothesis testing. The null hypothesis is that the suspect is a strong human player. So how well do the findings fit the possibility that chance factors alone might be responsible for the outcome? What the tool measures is the probability of a type I error, that is, a false positive. If we reject the null hypothesis due to a type I error we are wrongly identifying the suspect as an engine user. To apply this we need to determine an acceptable significance level. Say we used 5%: it would mean that, on average, for every 20 tests on innocent players, one of them would get falsely banned. A 1% significance level would produce 1 false positive in 100, 0.1% gives 1 in 1000, and 0.01% gives 1 in 10000. What level of error is acceptable? What other evidence exists? What constitutes overwhelming evidence beyond reasonable doubt?
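The arithmetic behind those significance levels can be sketched as below. This assumes every test is run on an innocent player and that the tests are independent, which is a simplification. The function names are hypothetical and the numbers are only the ones quoted above.

```python
def expected_false_positives(significance_level: float, innocent_tests: int) -> float:
    """Average number of innocent players wrongly flagged when each test
    has the given chance of a false positive."""
    return significance_level * innocent_tests


def chance_of_any_false_positive(significance_level: float, innocent_tests: int) -> float:
    """Chance that at least one innocent player is wrongly flagged,
    assuming the tests are independent (an assumption of this sketch)."""
    return 1 - (1 - significance_level) ** innocent_tests


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.001, 0.0001):
        tests = round(1 / alpha)  # 20, 100, 1000, 10000
        print(
            f"significance {alpha:.2%}: about "
            f"{expected_false_positives(alpha, tests):.0f} false positive "
            f"per {tests} innocent players tested, "
            f"{chance_of_any_false_positive(alpha, tests):.0%} chance of at least one"
        )
```

Note that "1 in 20" is an average. Under these assumptions the chance of at least one false positive in 20 tests at the 5% level is roughly 64%, not 100%.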

Lastly, whether the writer of the blog has been rightly or wrongly accused of cheating is immaterial. He has every right to play here, and if innocent, he has nothing to fear. No circumstances excuse the fact that he has taken privileged game mod information, and the contents of private messages and pasted them on a site external to this one. And that he is using a tool designed exclusively for the use of RHP game mods for the sole purpose of objective game modding here at RHP, as a means to defend himself and as a weapon to attack others. That is a betrayal of trust. As the author of the tool I resent its misuse.

As with any statistical tool, data selection is critical. Biased data selection will give biased results. The tool can be used to "prove" anything as it relies heavily on the objectivity of the operator.
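To illustrate the point about data selection, here is a toy simulation. It is entirely hypothetical: the match rate, game counts and function names are invented for the example and have nothing to do with the real tool or any real player. It simulates one honest player whose moves match an engine at a fixed rate, then compares the average over all games with the average over a hand-picked batch of that player's "best" games.

```python
import random


def simulate_match_rates(n_games: int, moves_per_game: int,
                         true_match_rate: float, seed: int) -> list[float]:
    """Per-game fraction of moves matching the engine for one simulated
    honest player. All parameters are made-up illustration values."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < true_match_rate for _ in range(moves_per_game)) / moves_per_game
        for _ in range(n_games)
    ]


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


def cherry_picked_mean(rates: list[float], keep: int) -> float:
    """Average match rate over only the `keep` highest-matching games."""
    return mean(sorted(rates, reverse=True)[:keep])


if __name__ == "__main__":
    rates = simulate_match_rates(n_games=50, moves_per_game=30,
                                 true_match_rate=0.6, seed=1)
    print(f"all 50 games:         {mean(rates):.1%}")
    print(f"hand-picked top 10:   {cherry_picked_mean(rates, 10):.1%}")
```

The hand-picked batch can never average lower than the full set, so the same player looks more suspicious purely because of which games the operator chose to feed in.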

1...c5!

Your Kingside

Joined
28 Sep 01
Moves
40665
07 Mar 08

This is quite interesting. As an engineer familiar with inferential statistics and mathematics, I always wondered how the game mods decided that there was overwhelming evidence of cheating. A simple hypothesis test. Awesome! 😀

Child of the Novelty

San Antonio, Texas

Joined
08 Mar 04
Moves
618677
07 Mar 08

Originally posted by HomerJSimpson
arriakis is rumored to be steve exeter. he used to have an image in his profile that was linked to an image hosting service, the exact image he has now, with the name steve exeter in it. He did not answer the question when I asked him about it and he has since changed the image hosting of the picture. I would like proof that he isn't steve before voting for him.
Arrakis is not steve exeter.

l
Man of Steel

rushing to and fro

Joined
13 Aug 05
Moves
5930
07 Mar 08

Originally posted by Gatecrasher
...The misuse of statistics in that blog entry (and subsequent discussion in an RHP forum thread) is indeed appalling. The test statistic produced by the tool cannot be converted into a "% chance of cheating" simply by subtracting it from 1. Rather, it is the probability that a genuine human player could innocently equal or exceed the match-up stats produced from a suspect's batch of games....
LOL! If you were talking with a non-mathematically inclined individual and watched their eyes glaze over after you said, "the probability that a genuine human player could innocently equal or exceed the match-up stats produced from a suspect's batch of games....", how would you then explain all this gobbledygook? While it may not be mathematically correct, "% chance of cheating" basically expresses a layman's "equivalent" to what you are trying to say....

l
Man of Steel

rushing to and fro

Joined
13 Aug 05
Moves
5930
07 Mar 08

Originally posted by Gatecrasher
Lastly, whether the writer of the blog has been rightly or wrongly accused of cheating is immaterial. He has every right to play here, and if innocent, he has nothing to fear. No circumstances excuse the fact that he has taken privileged game mod information, and the contents of private messages and pasted them on a site external to this one. And that he ...[text shortened]... n to attack others. That is a betrayal of trust. As the author of the tool I resent its misuse.
Not knowing any of the players here, I don't particularly feel like I have a "side" to take. But I can see where the accused is coming from here. You are objecting to his use of "your" tool in defending himself. But the fact of the matter is that the system is set up such that he would otherwise never have had an opportunity to defend himself. And to make matters worse, in this particular case he'll never even have the opportunity to have what passes for a "fair trial" since the trial was suspended.

Instead he's being tried in the forums by people who are using the very same tools for offense which you object to him using for defense. If the shoe were on the other foot, how long would you tolerate having your name dragged through the mud in the forums?

London

Joined
04 Nov 05
Moves
12606
07 Mar 08

Originally posted by GalaxyShield
My votes would go to:

Tony
Gatecrasher


I think at least 2 or 3 others would be needed to ensure proper analysis, and a little bit of speed (spreading the load out). Perhaps CMSmaster would be a good choice, too?
I agree that Tony and Gatecrasher would be worth voting for. I also consider Dragon Fire a good potential candidate although he hasn't put his name forward and is on holiday...has some catching up to do when he gets back I reckon!

RHP Code Monkey

RHP HQ

Joined
21 Feb 01
Moves
2425
07 Mar 08

I'll let this run until early next week before closing the thread and moving the process on.

-Russ

wotagr8game

tbc

Joined
18 Feb 04
Moves
61941
07 Mar 08

I would volunteer if my computer wasn't so damned unstable! Fritz overheats my computer and causes it to crash. I'm yet to complete a full analysis on a game since I bought it. 🙁

TC

Joined
12 Aug 04
Moves
30813
07 Mar 08

Originally posted by Gatecrasher
The workload is variable. cmsmaster suggests that the previous team could not cope with the number of complaints, but in the second half of 2007 we received very few complaints and every complaint received was investigated. There was seldom a backlog. I do agree more team members would be better to reduce workload, but too many can slow down effective de ...[text shortened]... an be used to "prove" anything as it relies heavily on the objectivity of the operator.
Rec'd, and thanks for your clarifications.

P
Mystic Meg

tinyurl.com/3sbbwd4

Joined
27 Mar 03
Moves
17242
07 Mar 08

Originally posted by Russ
I'll let this run until early next week before closing the thread and moving the process on.

-Russ
Russ, I'm in for feedback, getting coffee, and answering PM's and public outcry over decisions. The basic stuff I did before.

P-

Naturally Right

Somewhere Else

Joined
22 Jun 04
Moves
42677
07 Mar 08

Originally posted by Phlabibit
Russ, I'm in for feedback, getting coffee, and answering PM's and public outcry over decisions. The basic stuff I did before.

P-
Can't we get someone with bigger breasts to do that?

RN
RHP Prophet

pursuing happiness

Joined
22 Feb 06
Moves
13669
07 Mar 08

Originally posted by GalaxyShield
My votes would go to:

Tony
Gatecrasher


I think at least 2 or 3 others would be needed to ensure proper analysis, and a little bit of speed (spreading the load out). Perhaps CMSmaster would be a good choice, too?
I think Tony and Gatecrasher would be fantastic choices.

RN
RHP Prophet

pursuing happiness

Joined
22 Feb 06
Moves
13669
07 Mar 08

Originally posted by Gatecrasher
The workload is variable. cmsmaster suggests that the previous team could not cope with the number of complaints, but in the second half of 2007 we received very few complaints and every complaint received was investigated. There was seldom a backlog. I do agree more team members would be better to reduce workload, but too many can slow down effective de ...[text shortened]... an be used to "prove" anything as it relies heavily on the objectivity of the operator.
I think that since Gatecrasher wrote the program that does the testing, he HAS to be a game mod.

What about David?