seeing the same movies?


Nikita
DustMod
Posts: 994
Joined: Wed May 17, 2006 8:33 pm
Location: Indiana, USA

Post by Nikita »

jsmaje,

OK, there is one thing to consider on this - there needs to be significant data for this sim to work. In my case, I have only clicked on 74 movies, all were unique. Therefore, as my data is entered, there are two interpretations:
1. I have not had any repeats.
2. I do not identify the same thing consistently to get repeats.

Before looking at my stats, #1 doesn't sound likely as we know that there will be duplicates in a random generator. So I don't sound like a very good duster, eh?

Let's look at my data:

Your Overall Score: 769
Total Movies Viewed: 2491
Your Rank: 1727 out of 23389
Specificity: 100.00%
Sensitivity: 98.71%

I haven't viewed many movies. Compared to the leaders, I have done next to nothing. But I am probably more representative of the traditional duster.

So, back to the two interpretations of the fact that I do not have any duplicates in my events. #1 begins to sound reasonable when we consider the total number of movies and the number I have seen. I certainly have lower odds of getting a duplicate than you or fjgiie, as I have pulled from the large pile of movies fewer times. So, I think we can safely say #1 is a factor.
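
To put rough numbers on #1: the chance of getting no repeats at all in a given number of random picks is the classic "birthday problem" calculation. Here is a minimal Python sketch, assuming every pick is uniformly random from the pool. The function name and the pool sizes in the loop are made-up placeholders, since the real size of the pool isn't stated here.

```python
def prob_no_repeats(draws, pool):
    """Probability that `draws` uniform random picks from `pool` items are all distinct."""
    if draws > pool:
        return 0.0  # more picks than items: a repeat is certain
    p = 1.0
    for k in range(draws):
        # the (k+1)-th pick must avoid the k items already picked
        p *= (pool - k) / pool
    return p


# Hypothetical pool sizes, for illustration only
for pool in (1_000, 10_000, 100_000):
    print(pool, round(prob_no_repeats(74, pool), 3))
```

The bigger the pool relative to the 74 picks, the closer the result gets to 1, which is the sense in which #1 is plausible.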

Now on to consistency, conclusion #2. Can I honestly say that I have always judged the movies the same every time I dust? Only a computer program can do that. Therefore, I can say that I may have seen duplicates and scored them differently. Many of us have expressed regrets over several of our first picks. We got smarter. I know I became more cautious as I have reviewed the movies that have been posted on the forum. Even if there was no learning curve, I can also admit that I do not judge the movies the same, as sometimes I am tired, in a hurry, distracted, focused, and so on. So it is very possible that I have seen duplicates and judged them differently as well.

Now on to another factor, the "bad focus" issue. I know I click that more than some as I have looked at some of the potential movies put on the forum and have considered them as bad focus. So what percentage would that be? I have no idea. Again, it varies depending on my mood.

The end conclusion is that I don't think I bring good data to this. I also think that the human judgement variables and the random input of numbers into the formula make it difficult to determine the validity. Perhaps CK would be willing to give input. There we have consistency built in and certainly enough "dips in the pool" to get random duplicates.

Now, my question is this: Is my assessment and conclusion valid?

You know, I really hated stats in college! :lol:
From dust we come

fjgiie
DustMod
Posts: 1253
Joined: Sat May 20, 2006 8:47 am
Location: Hampton, SC, US

Seeing Doubles

Post by fjgiie »

Nikita wrote:Now on to consistency, conclusion #2. Can I honestly say that I have always judged the movies the same every time I dust? Only a computer program can do that. Therefore, I can say that I may have seen duplicates and scored them differently. Many of us have expressed regrets over several of our first picks. We got smarter. I know I became more cautious as I have reviewed the movies that have been posted on the forum. Even if there was no learning curve, I can also admit that I do not judge the movies the same, as sometimes I am tired, in a hurry, distracted, focused, and so on. So it is very possible that I have seen duplicates and judged them differently as well.
The simulation gives more doubles than I have clicked on. Here are mine, sim first then my actual.
70-25, 50-25, 53-30, 64-30

What this tells me is that the doubles I should have found, I did not find. They must have been questionable in my mind and I only clicked the second time on about half of my doubles.
Nikita wrote:Now on to another factor, the "bad focus" issue. I know I click that more than some as I have looked at some of the potential movies put on the forum and have considered them as bad focus. So what percentage would that be? I have no idea. Again, it varies depending on my mood.
Bad focus will of course be variable for each duster. The bad focus that a machine would pick would be between 12% and 15% of all noncalibration movies. Mine was 12.68% and I judged every movie I could.

And for you Nikita, without any Pete's and repeats this sim is not for your events list, but for your enjoyment.

Nikita
DustMod
Posts: 994
Joined: Wed May 17, 2006 8:33 pm
Location: Indiana, USA

Post by Nikita »

Well, John did want to know what my results were, so I told him and why! I did hate stats, but I did learn a little from it. A measurement needs to be reliable and valid for the results to be accurate. In this case, I don't think it was for my stats. This is a test that would need more than my data to be a good indication.
I did enjoy the mental challenge, it beats the basic math I help my kids with! :lol:
From dust we come

jsmaje
Posts: 616
Joined: Tue Aug 15, 2006 8:39 am
Location: Manchester UK

Post by jsmaje »

Nikita wrote:The end conclusion is that I don't think I bring good data to this.
fjgiie wrote:And for you Nikita, without any Pete's [whazzat?] and repeats this sim is not for your events list...
Not at all! Nikita, your data are as good as anyone's. OK, you've not had any repeats, but as you say this is hardly surprising given the large size of the pool.
Unfortunately you haven't told us what the sim actually predicted! I'm sure it would indeed predict no, or only a small, chance of repeats. For example, I entered just the first 74 movies from my list, and 8 out of 10 times it did in fact predict 74 seen only once with no repeats, the average being 73.6 once, 0.2 twice, 0 thereafter.
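
For anyone curious about the mechanics, a repeat-count simulation of this general kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the sim's actual code: it assumes each click lands on a uniformly random movie from a pool, and the function name, the pool size used in the example, and the number of runs are all hypothetical.

```python
import random
from collections import Counter


def simulate_repeats(clicks, pool, runs=1000, seed=None):
    """Average number of movies seen once, twice, ... over `runs` trials."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(runs):
        # each click lands on a uniformly random movie from the pool
        seen = Counter(rng.randrange(pool) for _ in range(clicks))
        # count how many distinct movies were seen exactly k times
        totals.update(Counter(seen.values()))
    return {k: totals[k] / runs for k in sorted(totals)}


# Hypothetical pool size, for illustration only
print(simulate_repeats(clicks=74, pool=10_000, runs=1000))
```

The returned dictionary maps a repeat count (1 = seen once, 2 = seen twice, and so on) to its average frequency, which is the same shape as the "73.6 once, 0.2 twice" summary above.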

Yes, the % you consider 'bad focus' is personal and varies according to mood. All the sim asks is that you make an educated guess about your average figure over the whole time you've been dusting (it was because of this vagueness I first left it out, but then re-introduced it in response to fjgiie's comments, so at least you can see the effect of varying this figure; in fact, all the sim does is subtract the bad focus proportion from the total entered for the pool of reals before any further processing).
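
As a rough Python sketch of that one adjustment (the function and parameter names are invented for illustration, and rounding to a whole number of movies is an assumption, not something the sim is known to do):

```python
def effective_pool(pool_of_reals, bad_focus_pct):
    """Pool size left after removing the share of movies judged 'bad focus'."""
    if not 0 <= bad_focus_pct < 100:
        raise ValueError("bad_focus_pct must be in the range [0, 100)")
    # subtract the bad-focus proportion before any further processing
    return round(pool_of_reals * (1 - bad_focus_pct / 100))


# Hypothetical pool of 10,000 reals with a 15% bad-focus guess
print(effective_pool(10_000, 15))
```

A smaller effective pool means each click is drawn from fewer movies, so raising the bad focus figure raises the predicted repeat rates.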

The question of consistency is more tricky (and was Chuck Crisler's original question in this thread), since we've all had, and continue to have, different learning curves as well as moods. I'm not sure how this could be incorporated, except as another overall guesstimate, as for bad focus. But this would simply reduce all repeat rates, rather than model something like fjgiie's particularly low doubles:
fjgiie wrote:What this tells me is that the doubles I should have found, I did not find.
This could well be the explanation if the majority of triples & above were acquired early on in your personal 'learning curve'. Having later become more choosy, and since doubles are more likely than higher rates, it is this figure that will be particularly slow to grow, while the higher ones remain to haunt you!
This could be confirmed if the high repeat movies are mostly in the top half of your Events list.
But no, "there isn't a test afterwards", so no obligation to check!

John

PS: I agree that CK's data could be of interest here

fjgiie
DustMod
Posts: 1253
Joined: Sat May 20, 2006 8:47 am
Location: Hampton, SC, US

Where they were

Post by fjgiie »

Doubles and etc.- % down events list
1765266V1 70%, 100
1851152V1 80, 80
1869354V1 60, 70
1263732V1 70, 70
2220548V1 60, 80
2454054V1 90, 90 very next one
2965879V1 30, 40
4087315V1 70, 90
4366150V1 60, 70
4398717V1 10, 70
5670669V1 40, 50
5827299V1 80, 80
6314383V1 80, 90
6353665V1 90, 100
6623569V1 80, 90
7509204V1 90, 100
7723749V1 20, 40
7937104V1 90, 100
8153179V1 40, 100
8506668V1 10, 80
8937027V1 20, 30
8999016V1 80, 100
9252254V1 30, 100
9295021V1 90, 100
9383973V1 10, 20
9471219V1 50, 70
9588567V1 40, 60
9806342V1 50, 50
9855575V1 90, 90
9987464V1 40, 70

1839932V1 -3 40%, 60, 80
2263916V1 -3 30, 70, 80
-246463V1 - 3 40, 50, 60
3141306V1 -3 70, 75, 80
4870659V1 -3 40, 60, 70
4047608V1 -3 10, 40, 90
5831385V1 -3 10, 30, 50
5860552V1 -3 30, 30, 50
-604627V1 - 3 40, 50, 80
6125350V1 -3 20, 60, 90
6544999V1 -3 10, 70, 80
7376465V1 -3 70, 80, 90
7654784V1 -3 90, 90, 100
-772999V1 - 3 20, 50, 70


598677V1 - 4 30%, 40, 40, 70
6690749V1-4 10, 50, 80, 90
862370V1 - 4 30, 30, 30, 40

611858V1 - 5 30%, 50, 80, 80, 100

6065316V1-6 40%, 50, 50, 70, 80, 90

These appear random to me.
jsmaje wrote: without any Pete's [whazzat?] and repeats...
John, down here in the south a pete is sorta like a pant. A pant is a piece of cloth sewn into a tube that is usually used as a rag. Earlier two pants were sewn together to make a pair and worn on your legs.
The pete is the first and the repeat is the second all over again like déjà vu.

jsmaje
Posts: 616
Joined: Tue Aug 15, 2006 8:39 am
Location: Manchester UK

Post by jsmaje »

fjgiie wrote:These appear random to me.
Seems so, in which case I can't explain your present low frequency of doubles, other than, again, by chance.
And in fact, from your figures, they do seem to be growing at quite a pace: 31 vs. 25 only a week ago, with no increase in anything higher.

It would be nice to see the sim results of other high scorers like yourself to compare.

John

PS: How did you come up with those figures so quickly? - I wouldn't have known how to go about it.
PPS: I liked the pete's/repeats explanation. Pants of course means something rather different in the UK!

fjgiie
DustMod
Posts: 1253
Joined: Sat May 20, 2006 8:47 am
Location: Hampton, SC, US

Post by fjgiie »

jsmaje wrote:PS: How did you come up with those figures so quickly? - I wouldn't have known how to go about it.
Some may want to use Excel or MS Spreadsheet, but I copied my events page into MS Word. Word sorted the events numerically. Then I went through and made a blank row above all multiple events so as to mark them. Then I copied and pasted one of each multiple movie number into a separate list, with a 3 for triples, etc.

With this list of multiple movies and My Events page opened, I used Ctrl F and pasted each movie number into the space, and here is where the art comes in. The percentage down the events list is the location of the slider on the right side of the page with the highlighted movie in the middle of the page. For a triple just member, or remember 20%, 40%, 50% and type that on your list of multiples. For me it was not difficult with only 49 multiple events, but for a duster with more, the list can be right long.
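
For a longer list, the same steps could be done with a short script instead of Word and the slider. A minimal Python sketch, assuming the events page has already been copied into a plain list of movie numbers in the order they appear on the page. The function name and the sample IDs are hypothetical, and it reports the exact percentage rather than the slider's eyeballed nearest 10%.

```python
from collections import defaultdict


def repeat_positions(events):
    """Map each repeated movie ID to its positions, as a percentage down the list."""
    positions = defaultdict(list)
    n = len(events)
    for index, movie_id in enumerate(events, start=1):
        # position of this entry as a percentage of the way down the list
        positions[movie_id].append(round(100 * index / n))
    # keep only the movies that appear more than once
    return {m: p for m, p in positions.items() if len(p) > 1}


# Made-up IDs, for illustration only
sample = ["A1", "B2", "A1", "C3"]
print(repeat_positions(sample))
```

The length of each list of positions gives the 2, 3, 4 and so on for doubles, triples and above.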
