A word about your point system in Phase 2


Ronald C. Spencer
Posts: 68
Joined: Thu Sep 21, 2006 11:25 am
Location: Massachusetts

A word about your point system in Phase 2

Post by Ronald C. Spencer »

It seems a bit dishonest to me if you deduct points that were already determined Correct if one makes an error. In my view, if a person has 2,225 points and then makes a mistake, only the sensitivity and specificity values should change. If a person is determined to have made a Correct determination, then Correct is Correct and NOT Incorrect. Also, those little faded boxes with the little dots in them seem a bit unfair: when a person clicks on one and receives a Correct determination, and then later clicks on another and receives an Incorrect determination, that is a tad deceitful in my view. In a society which seems to know little of the meaning of Ethics these days, maybe it passes over the heads of some. Right will always be Right and Wrong will always be Wrong, and to twist those meanings smacks of so-called political correctness. I suspect lame excuses will be made for saying that which was Correct is no longer Correct and that Right is now Wrong, after people had already received a Correct determination and then lost that point because you are NOT satisfied with just affecting the sensitivity and specificity values. People who put 10-14 hours a day into helping you at least deserve to keep the REAL scores they accrued. Does anybody in here agree, or do you feel it is fair to lose a Correct determination after you worked so hard to help in this research, only to have the Correct determination taken away after you scored that point? To take away that which was earned is NOT science, at least in this astronomer's opinion.

elainekeefe
Posts: 190
Joined: Fri Aug 04, 2006 4:38 am
Location: Massachusetts, USA

Post by elainekeefe »

Hi Ronald,

Although I understand your frustration, and most certainly have felt that way myself many times, I also realize that a project of this magnitude is bound to have errors. I think the team works very hard to correct the errors we report. At the end of Phase 1, many people, including myself, had their scores changed to correct inaccuracies. I assume they will do the same at the end of Phase 2.

As far as taking away points from our total score, I think the thing we need to keep in mind is that our scores mean different things to us and to the team. We are concerned with accuracy; they are concerned with creating a curve from which they can determine which members' hits are likely to be most accurate. In the end, all movies will be looked at anyway.

Considering that most of us are probably encountering the same problems, I don't know that those incorrect CMs really make a statistical difference.

I think we are all perfectionists here, LOL!

Ronald C. Spencer
Posts: 68
Joined: Thu Sep 21, 2006 11:25 am
Location: Massachusetts

A word about your point system in phase 2

Post by Ronald C. Spencer »

Thanks for your response, Elaine, but as a scientist I must adhere to Right being Right and Wrong being Wrong; otherwise true scientific values in computation could be affected and the end result a bit spurious. On the other hand, I don't believe the fine people at Stardust mean to be unfair in any sense. You have made some good points. Happy Dusting, Elaine :)

Wolter
DustMod
Posts: 457
Joined: Mon May 22, 2006 2:23 am
Location: Enkhuizen, the Netherlands

Post by Wolter »

Where is it that your correctly scored points are withdrawn?
As I see it, a correctly answered CM scores a positive point.
An incorrectly answered CM scores a negative point.
Your total score is the sum of the above. So it acts as a kind of balance ;) Someone who answers as many CMs correctly as incorrectly simply remains at a total score of 0. More correct answers and you go up into positive figures; more incorrect ones and you go into the red.
Just dusting...

ERSTRS
Posts: 75
Joined: Sat Aug 26, 2006 5:39 am
Location: home

Phase 2, reply to Ronald C. Spencer

Post by ERSTRS »

Hello everyone, I've not posted since Phase 2 began because of my experiences at the end of Phase 1. I'm an old fuels lab tech gal who learned very early that Right will always be Right and Wrong will always be Wrong, as Spencer says. I might add to Spencer's post that Fair is always Fair, as well. But, as the years have passed, I've seen those younger than I come up believing that there are gray areas in between. That philosophy has permeated every aspect of our society, it seems, including StarDust@Home.

I worked under very strict scientific standards in labs for both West Virginia University and The Ohio State University decades ago. I was involved in the Oak Wilt research program at WVU, and am now a researcher in American Chestnut restoration. We abide by rigid standards of research. I expected the same sort of scientific standards when I joined as a duster in August 2006. That didn't prove to be the case. It is grossly unfair, by the standards I've always worked by, to subtract a point that one has earned. It is grossly unfair to award a duster 12,000 points at the end of the project without explanation. It is outrageous that a system hasn't been devised where one can be counted correct when one is correct.

That is why I didn't enter Phase 2. Another reason is that I finished Phase 1 with a Rank of #61, but when I first signed onto Phase 2, they had my Rank listed as #9929! This old lady nearly fainted away before they changed it to #145. Therefore, I wanted to wait until it got started so I could check the forums to see if standards have been elevated. They haven't been in my estimation. But I'm still reading the posts the friends I've made are posting. I had to reply to Spencer's.

And for those who continue to say that scores don't matter, StarDust@Home has said repeatedly that they DO. That is the only way they can determine the competence of the duster, they say, and they are right on that one.

Keep dusting friends, maybe the standard will be lifted and the program will finally reach scientific excellence after all.

Evelyn
ERSTRS

Ronald C. Spencer
Posts: 68
Joined: Thu Sep 21, 2006 11:25 am
Location: Massachusetts

A word about your point system in Phase 2

Post by Ronald C. Spencer »

Alas, a light breaks through amidst the gray area espoused by so many. Thank you, Evelyn! It puzzles me when some appear to be confused :? when the issue is really a clear one: a point earned is a point earned. I will still continue, though, because I love the research and it is a worthy cause.

bmendez
Stardust@home Team
Posts: 530
Joined: Wed Apr 19, 2006 11:28 am
Location: UC Berkeley Space Sciences Lab

Post by bmendez »

Ronald and Evelyn,

I simply must take issue with your complaints. You both make accusations that we at Stardust@home are purposefully deducting points from your scores as some kind of capricious and unfair act. This is plainly a ludicrous assertion.

The VM and its scoring system are a computer program (or really several programs interacting). Yes, they do experience bugs which are caused by things ranging from actual errors in the code to strange interactions with differing browsers and operating systems. But those bugs are certainly not intentional.

We have made great efforts to ensure that the system works correctly. But some errors are so subtle and rare that they only appear after thousands of users have used the system and reported them to us. We depend on these error reports and are very grateful for them. We have a limited staff here, after all.

If you have specific errors to report, please do so in the appropriate forum and thread.

The VM records a correct point for correctly identifying either a track or no track in a calibration movie (CMs make up roughly 20% of the focus movies you see). If there is a track (digitally inserted) in the CM and you miss it, a point is deducted. If there is no inserted track in a CM and you click on something in it, a point will also be deducted. The program knows whether you are correct because the movies come tagged with information about the type of movie and the coordinates of the inserted track (if present).
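The rule described above can be sketched in a few lines of Python. This is only an illustration and not the VM's actual code: the names `CalibrationMovie` and `score_response`, the pixel `tolerance`, and the choice to treat a click far from the inserted track as a miss are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Tuple

Point = Tuple[float, float]


@dataclass
class CalibrationMovie:
    # Coordinates of the digitally inserted track, or None for an empty CM.
    track_xy: Optional[Point]


def score_response(cm: CalibrationMovie,
                   click_xy: Optional[Point],
                   tolerance: float = 20.0) -> int:
    """Return +1 for a correct response to a CM and -1 for an incorrect one.

    click_xy is None when the duster answered "no track".
    tolerance is a hypothetical click radius, not a documented value.
    """
    if cm.track_xy is None:
        # No inserted track: only "no track" is correct.
        return 1 if click_xy is None else -1
    if click_xy is None:
        # A track was inserted and the duster missed it.
        return -1
    # A track was inserted and the duster clicked somewhere:
    # assume the click must land near the recorded coordinates.
    dx = click_xy[0] - cm.track_xy[0]
    dy = click_xy[1] - cm.track_xy[1]
    return 1 if hypot(dx, dy) <= tolerance else -1
```

A duster's total is then just the sum of these +1 and -1 values over all the CMs they have answered. The sketch also shows how a CM with wrong recorded coordinates would mark a good click as incorrect.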

We know of some examples where the program has been "wrong." These cases are the exception rather than the rule (there have been dozens of reports of the problems, while millions of searches have been performed). One problem is a bug where the database correctly records a duster's clicks but then incorrectly computes the score. This is the rarest and most difficult to track down bug so far found in the code.

Other cases have been found where the CMs simply have the wrong coordinates recorded for them. This is again a rare problem. The CMs are made by digitally inserting example tracks or candidate tracks into movies that have been prescreened and determined empty of likely features by the Team. Several thousand CMs are produced automatically (i.e. via computer code, not by hand, which would be more error-prone) so some bugs in the CM production code inevitably sneak in and produce bad CMs. But again they are rare and we remove them as soon as we are made aware of them. I personally sifted through a few hundred CMs before Phase 2 started to be sure they were good, as did several other team members. Clearly we didn't catch all the bad ones, but there were thousands and the bad ones are rare.

The sensitivity and specificity scores are used to weight clicks by users. So, when a user with a specificity of 90% views a real movie and clicks "no track" the click is essentially recorded as 9/10 of a click. When a user with a sensitivity of 75% clicks on a feature in a real movie, that click is essentially recorded as 75/100 of a click. Real movies with the most number of clicks are looked at first by the Team.
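To make the weighting concrete, here is a minimal Python sketch. It is not the Team's implementation: the function names and the tuple layout are made up for illustration, and it assumes that a "track" click is weighted by the user's sensitivity and a "no track" click by their specificity, as the paragraph above describes.

```python
from typing import Iterable, Tuple

# One response to a real movie: (clicked_track, sensitivity, specificity),
# with both rates expressed as fractions between 0 and 1.
Response = Tuple[bool, float, float]


def click_weight(clicked_track: bool,
                 sensitivity: float,
                 specificity: float) -> float:
    """Weight of a single click on a real movie."""
    # A click on a feature counts as `sensitivity` of a click;
    # a "no track" click counts as `specificity` of a click.
    return sensitivity if clicked_track else specificity


def movie_totals(responses: Iterable[Response]) -> Tuple[float, float]:
    """Return (weighted track clicks, weighted no-track clicks) for one movie."""
    track_total = 0.0
    no_track_total = 0.0
    for clicked_track, sensitivity, specificity in responses:
        weight = click_weight(clicked_track, sensitivity, specificity)
        if clicked_track:
            track_total += weight
        else:
            no_track_total += weight
    return track_total, no_track_total
```

With the numbers from the post, `click_weight(False, 0.75, 0.90)` gives 0.9 and `click_weight(True, 0.75, 0.90)` gives 0.75. Movies could then be sorted by their weighted track total so that the highest ones are reviewed first.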

Because the errors in CMs are rare and applied randomly (since movies are drawn from the database randomly), there is no statistical net effect on the overall project. In Phase 1, we had each real movie looked at by several hundred people, so the statistics were very, very good. The movies at the top of the list all had features in them worth investigating (i.e. there were no cracks, aerogel fragments, etc. in those movies).

There is one other way we have identified some candidates and that is from checking in with the "I think I've found a track, what do you think?" thread.

So far, this project has by and large worked as it was intended to and we are very pleased with the progress that has been made. This approach of using volunteers to search through the data rather than computer programs has proved useful. Most of the candidate tracks would most likely have been missed by a program rigidly looking for tracks like the simulated ones.

We are totally grateful for all of the efforts of the dusters and we want it to be clear that we hold them in the deepest respect and highest regard.

Thanks to all the dust,
-Bryan
"I am made from the dust of the stars, and the oceans flow in my veins"
- RUSH

DustSabre
Posts: 63
Joined: Sat Dec 09, 2006 4:51 pm

Post by DustSabre »

Well, you're correct that right is right and wrong is wrong, but in this case, the scoring seems to me to be a matter of opinion. Admittedly, I have to say that the way things are now seems to me to be the better way, since if somebody is screwing up, it will show. If, on the other hand, their score could never go down, then their score would just slowly increase every time they did succeed in doing something right, and it might eventually look like they had contributed a lot, when in actuality they just made a mess. In other words, it would allow an idiot to accrue a decent score over time while causing havoc by creating lots of false positives and negatives on real movies, messing up the project, while still making it look like they were doing good work.

Also, you could easily make a ton of points by just setting up a computer program that randomly answers the movies. Every time it correctly identified a "No track" on a calibration movie, the score would increase. If people's scores could never go down, then the computer's random answers wouldn't look much like cheating, but the computer could keep accumulating points for every CM it randomly got the right answer to. It wouldn't be long before the program would have a lot of points, while having actually done nothing to aid the project. Of course, the sensitivity of the program would be terrible, but that could be corrected by some primitive logic that helped it to get the right answer some of the time at least, enough to make it look legitimate.

In short, the present system prevents many forms of cheating that could otherwise eat out the project.
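A quick back-of-the-envelope check of this argument, written as a small Python sketch. The function name and the idea of summarising a clicker by a single probability `p_correct` of answering any CM correctly are simplifying assumptions for illustration, not anything taken from the project.

```python
from typing import Tuple


def expected_scores(n_cms: int, p_correct: float) -> Tuple[float, float]:
    """Expected score after n_cms calibration movies under two rules.

    Returns (net, never_down):
      net        -- present rule: +1 per correct CM, -1 per incorrect CM
      never_down -- hypothetical rule: +1 per correct CM, errors ignored
    """
    p_wrong = 1.0 - p_correct
    net = n_cms * (p_correct - p_wrong)
    never_down = n_cms * p_correct
    return net, never_down
```

For a coin-flipping program, `expected_scores(1000, 0.5)` returns `(0.0, 500.0)`: the present rule leaves it at about zero, while a rule that never deducts would hand it roughly 500 points for no real work.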

Nikita
DustMod
Posts: 994
Joined: Wed May 17, 2006 8:33 pm
Location: Indiana, USA

Post by Nikita »

We are not losing a correct score, nor do the points always continue to go up. I am not sure what other way would be best to calibrate our efficiency. We could have a percentage, right over wrong, but wrong answers would still affect the right. We could have two scores, right and wrong, but I think that would make it harder for the team to judge our accuracy, and it would be nearly impossible to judge the weight of our click on a real movie, which is the whole goal of the scoring.
The incorrect movies aside, the fact of the matter is that we are all getting scored the same, we all have an equal chance of getting the score. Also, consider this situation:
fjgiie starts phase 2 and nails the first CM. He has 1 right 0 wrong. Next, he accidentally clicks a wrong CM. now he's 1 right minus 1 wrong, thus 0. Then he has a big whoops and misses again for 1 right minus 2 wrong. Negative numbers time! (Just seeing if you are paying attention fjgiie, you, of course, would never have that situation!)
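That tally can be written out as a tiny Python sketch (the function name is made up, and it assumes each CM simply counts as +1 when right and -1 when wrong):

```python
from itertools import accumulate
from typing import Iterable, List


def running_score(outcomes: Iterable[bool]) -> List[int]:
    """Running total after each CM: +1 if answered correctly, -1 if not."""
    return list(accumulate(1 if correct else -1 for correct in outcomes))
```

One right answer followed by two wrong ones, `running_score([True, False, False])`, gives `[1, 0, -1]`. The first point is still counted in the sum; the later misses are simply added to it.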
Actually, our computer programmers have had their programs search accurately and they haven't been too bad, but they do miss the details that we are getting.
Are there ways to cheat? I'm sure there still are. But the team has done well to monitor and weed out problems. Quite frankly, no one who has achieved top scores, which is what a cheater would be going for, could do it with random clicks. I can't imagine someone spending the time our top dusters do to get a high score. Random clicks won't give you a high number, no matter how much time you spend on it, because of the subtraction of errors. Simply put, if someone wants to try to cheat and randomly click, have at it. Their score couldn't affect the ranks.
But what about the project? One person won't affect the project. If I go to the bathroom and my 4 year old decides to dust, he will affect my score, but anything he clicks on won't skew the movie score much. If 100 people look at a movie and one goofs, 1/100 is only 1%.
Does that make sense? I hope it helps.
Good luck dusting!
From dust we come

Jwb52z
Posts: 61
Joined: Thu Aug 03, 2006 5:05 am

Post by Jwb52z »

Ok, here's what I think is going on now. Evelyn and Ronald think that it's unfair to lower the total score based upon how many answers you get right in relation to how many you get wrong, as if they are somehow interconnected. There's a problem with this thinking. I know that as scientists, Evelyn and Ronald will understand this if they are truly as smart as they seem to be so far. I also realize that science is a different "side of the brain" thing from my interest in Literature, so that possibly does make a difference in their point of view about things. It's like this. You take a test in school. That test has a certain maximum score and each question is worth a certain amount. Now, when you miss a question, that lowers your score. The only difference with dusting is that there's no actual "maximum score" unless you could conceivably somehow look at every focus movie. Now, that being said, you cannot expect your score not to go down if you have missed some of the right answers that your "instructor" is looking for on your "test". The same goes for when you have to write a paper, and I know even scientists have to write papers. Back in High School, before everything had to be typed as a standard as it is now, you had classes where you had to write papers by hand, right? Well, it's the same idea with dusting. Your teacher would tell you things, such as "penmanship counts". That is the same thing with the scores here. It may not directly involve the ideas in your paper and how well you expressed them, but you can't expect to get the same grade as someone else who wrote an equally good paper if the teacher cannot READ it. I had a much more thought-out post in which I tried to say basically the same thing a few hours ago, but my electricity went out and I lost it before it posted. I hope you all get the gist of what I am saying.

jsmaje
Posts: 616
Joined: Tue Aug 15, 2006 8:39 am
Location: Manchester UK

Post by jsmaje »

I side with those in favour of the present scoring system. It simply shows how many CMs we've correctly identified on balance, and has a purely psychological effect: gaining certificates, racing for the top 100, maintaining or increasing our position etc.
Instead, the team are more concerned with (& the science depends on) how discriminating we are, determined via our 'sensitivity/specificity' figures:
bmendez wrote: The sensitivity and specificity scores are used to weight clicks by users. So, when a user with a specificity of 90% views a real movie and clicks "no track" the click is essentially recorded as 9/10 of a click. When a user with a sensitivity of 75% clicks on a feature in a real movie, that click is essentially recorded as 75/100 of a click. Real movies with the most number of clicks are looked at first by the Team.

If someone were to score an overall 10,000 points but their sensitivity/specificity were only 10%, then their (admittedly enormous) efforts will have been of little account.
