Posted: Thu Feb 07, 2008 6:01 am
In the thread "my Specificity and Sensitivity against the average values" the discussion has turned to the amount of work we have set before us.
fjgiie wrote: "around 3 million movies over the whole collector times 400 views is a billion looks, give or take a few million."
This brings a question to my mind: what is the method behind showing us the "real movies"? My educated (?) guess is that 100 (or even 50) viewings of a given movie should be more than enough to give a reliable estimate of whether it probably contains a track or most probably doesn't (sensitivity is a more important issue than specificity). So is there a mechanism in the software that stops showing a movie after a certain number of views? If not, shouldn't there be? After all, we read elsewhere that the number of active dusters has fallen precipitously since the initial wave of enthusiasm, so there is no point in wasting volunteer work on movies already scrutinized by many. There could also be cut-off points even earlier, stopping the showing of a given movie if, say, more than 20 dusters have seen it and at least half of them have clicked on it, or nobody at all has. The same applies, mutatis mutandis, to movies with many "bad focus" verdicts.
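To make the suggestion concrete, here is a rough sketch of the kind of cut-off rule I have in mind. It is only an illustration: the function name, the parameters, and the "half of the viewers" threshold for bad-focus verdicts are my own assumptions, and it says nothing about how the real software stores or counts views.

```python
def should_retire(views, clicks, bad_focus=0, max_views=100, early_views=20):
    """Hypothetical rule: return True if a movie could stop being shown.

    views      -- number of dusters who have seen the movie
    clicks     -- how many of them clicked on it (reported a track)
    bad_focus  -- how many of them marked it as "bad focus"
    All thresholds are illustrative assumptions, not project settings.
    """
    # Hard cap: enough viewings for a reliable estimate either way.
    if views >= max_views:
        return True
    # Early cut-offs apply only once more than `early_views` have looked.
    if views > early_views:
        # Nobody clicked, or at least half of the viewers clicked.
        if clicks == 0 or clicks * 2 >= views:
            return True
        # Assumed analogue for "bad focus": at least half say so.
        if bad_focus * 2 >= views:
            return True
    return False
```

With these numbers, a movie seen 21 times with no clicks would be retired, while one seen 21 times with 5 clicks would stay in circulation until the verdict becomes clearer or it reaches 100 views.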
Posted: Thu Feb 07, 2008 5:58 pm
Thanks for the question, jelfving.
To directly answer your question, as yet there are no automated limits set to remove a movie from the active search database. Typically, movies have been removed or added manually at intervals that depend largely on the progress of automated microscope scanning and then processing of those data.
In the last several months there have been no new scans performed (though that should be changing soon), and thus, to prevent the database from running out of data, we have left most movies searchable. Even though there are far fewer dusters searching right now than there were at the start of phase 1, we are still finding that all the movies are being searched plenty of times and have good statistics.
Once we begin the next phase with new data, and begin extracting and analyzing candidates, we expect that the number of dusters will increase again, as will the frequency of updates to the movies database. So we might never find a need to automate the process.
Thanks again for the question,