Phase 6 stats and Skill score 'jitter' simulation
Posted: Tue Jan 07, 2014 1:15 pm
Due to an injury at the beginning of August it hasn’t been comfortable for me to dust effectively, but I’ve used the time to revive the old ‘Top 100’ statistics site that some dusters once found of interest, even though barely more than 20 have been active lately. I have also written a simulation of the skill score ‘jitter’ issue that has been causing some frustration, which allows testing of certain means to ameliorate it.
As before, the stats site provides interactive graphs of power & skill score alone and in combination, weekly updates (each Wednesday with luck, except for an unavoidable gap of 6 weeks) of overall and personal duster performance, general activity, and the previous phase results. Keep an eye out for the occasional ISP in the background!
The sim demonstrates the known issue with the moving average window as currently implemented, in which all PM scores in the window contribute equally to the final skill score calculation*. This produces unexpected and unwelcome drops in average skill score when the latest PM track is correctly identified at the same moment that the oldest PM in the window (also correctly identified, but of higher value) is excluded. That the same number of rises over time occurs in the opposite circumstances generally goes unremarked, however!
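To make the effect concrete, here is a minimal Python sketch (not the sim's actual code). It applies the footnoted skill score formula to a deliberately tiny 3-PM window; the function name, the window size and the PM values are all made up for illustration.

```python
def skill_score(window):
    """Skill score over a list of (value, correct) PM results.

    Follows the footnoted formula: values of correctly identified PMs
    divided by the 'adjusted' total (value if correct, 90 - value if not).
    """
    earned = sum(v for v, ok in window if ok)
    adjusted = sum(v if ok else 90 - v for v, ok in window)
    return earned / adjusted

# Hypothetical window, oldest PM first: a hard PM (85) answered correctly,
# then an easy PM (5) and a medium PM (45) both missed.
window = [(85, True), (5, False), (45, False)]
before = skill_score(window)

# A new easy PM (5) is answered correctly; the oldest PM (85) drops out.
window = window[1:] + [(5, True)]
after = skill_score(window)

print(round(before, 3), round(after, 3))  # prints: 0.395 0.037
```

The latest answer was correct, yet the score fell, because the 85-point success that left the window was worth far more than the 5-point success that replaced it.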
In order to minimise the number and/or mean size of such unexpected ‘events’ the contributions from older PM scores that will eventually drop out of the window would need to be downgraded in some manner, i.e. carry less weight in the latest skill score calculation. To exclude only the oldest PM score(s) would simply be equivalent to narrowing the window.
Many so-called ‘weighting filter’ types have been devised for specific purposes in other applications such as financial market trending, image processing, etc. (see here & endless other web references). Filter-folk refer to the present equally-weighted average as having a (1) ‘simple’ filter (effectively none!), but the sim allows testing of three others that satisfy the above requirement, called (2) ‘weighted’, perhaps better thought of as ‘linear’ in contrast to (3) ‘exponential’ and its inverse (4) ‘logarithmic’; their respective profiles are plotted in fig 3 of the following post.
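As a rough illustration of how such profiles might be applied, the Python sketch below builds four sets of weights and uses them in a weighted version of the footnoted skill score. The exact curve shapes, the steepness constant K, the function names and the way the weights enter the formula are all assumptions for illustration; the sim's real profiles are the ones plotted in fig 3 of the following post.

```python
import math

K = 3.0  # assumed steepness of the exponential/logarithmic curves

def weights(n, kind="simple"):
    """Weights for n PMs, index 0 = oldest, n - 1 = newest (weight 1)."""
    if kind == "simple":
        return [1.0] * n
    t = [(i + 1) / n for i in range(n)]  # position in window, 1/n .. 1
    if kind == "linear":
        return t
    if kind == "exponential":
        # stays low for old PMs, rises steeply towards the newest
        return [math.expm1(K * x) / math.expm1(K) for x in t]
    if kind == "logarithmic":
        # inverse of the exponential curve: rises quickly, then flattens
        return [math.log1p(x * math.expm1(K)) / K for x in t]
    raise ValueError(f"unknown filter type: {kind}")

def weighted_skill(window, w):
    """Footnoted skill score with each PM's contribution scaled by its weight."""
    earned = sum(wi * v for wi, (v, ok) in zip(w, window) if ok)
    adjusted = sum(wi * (v if ok else 90 - v) for wi, (v, ok) in zip(w, window))
    return earned / adjusted

def step(kind):
    """Change in score when a correct easy PM replaces a correct hard one."""
    before = [(85, True), (5, False), (45, False)]  # toy 3-PM window
    after = before[1:] + [(5, True)]
    w = weights(3, kind)
    return weighted_skill(after, w) - weighted_skill(before, w)

for kind in ("simple", "linear", "exponential", "logarithmic"):
    print(kind, round(step(kind), 3))
```

In this toy case the step is smaller in magnitude under each tapered profile than under the simple one, since the departing PM was already carrying less weight. A 3-PM window is far too small to say anything about real behaviour; that is what the sim runs are for.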
Symmetric filters such as ‘gaussian’ & ‘moving-median’ provide superior data-smoothing, but would incur the penalty of a delayed response due to downgrading the most recent as well as oldest scores, perfect for other situations but likely to be even more frustrating for the present purpose.
Each sim run delivers 1000 PMs valued randomly from 5 - 85 in 5-point steps of difficulty, and emulates the skill of a theoretical average duster by being linearly biased from 90% for the easiest (5) downward to 10% for the most difficult (85), with a superimposed random element of ±10%. No improvement in skill, or memory/record of track coordinates, is taken into account. Skill score therefore approximates 0.5 throughout.
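A run of this kind can be sketched in a few lines of Python. This is an approximation of the description above, not the sim's own code: the function name and the seed parameter are invented, and the random element is assumed to be a uniform ±10% added to the chance of a correct answer.

```python
import random
from collections import deque

def simulate_run(n_pms=1000, window=100, seed=None):
    """Return the equally-weighted windowed skill score after each PM."""
    rng = random.Random(seed)
    recent = deque(maxlen=window)  # oldest PM falls out automatically
    scores = []
    for _ in range(n_pms):
        value = rng.randrange(5, 90, 5)  # 5, 10, ..., 85
        # Linear bias: 90% correct at value 5 down to 10% at value 85,
        # plus the assumed uniform random element of +/-10%.
        p_correct = (95 - value) / 100 + rng.uniform(-0.10, 0.10)
        correct = rng.random() < p_correct
        recent.append((value, correct))
        earned = sum(v for v, ok in recent if ok)
        adjusted = sum(v if ok else 90 - v for v, ok in recent)
        scores.append(earned / adjusted)
    return scores

scores = simulate_run(seed=1)
settled = scores[100:]  # skip the stretch where the window is still filling
print(round(sum(settled) / len(settled), 2))
```

With values drawn evenly from 5 to 85, the expected earned and missed totals balance, which is why the score hovers around 0.5.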
Demo mode reflects the present situation with a 100-PM-wide window and equally-weighted PM score contributions.
Custom mode allows for changes to window width (10 - 500 PMs), weighting filter type and %-width (10 - 100%).
My own results and conclusions using the sim are detailed in the following post.
Both programs have been tested using Windows 7 on a 16:9 aspect monitor, and the latest browser versions of Microsoft IE (9-11), Mozilla Firefox (22-26) & Google Chrome (31) (in which the window flickers annoyingly for some reason).
My apologies for any problems if you use a different OS or other browser/version, and for any unfound bugs.
“Stardust@Home Top 100 – phase 6” is available here
“SD@H Skill Score ‘Jitter’ Simulation” is available here
John
* Present skill score calculation:
total PM track values if correctly identified / total 'adjusted' values, where 'adjusted' values = PM values if correct (5 - 85) plus their inverse-values (90 minus value, i.e. 85 - 5) if incorrect.
NB: the ‘jitter’ phenomenon has nothing to do with the skill score formulation as such, merely the use of a ‘moving average window’. This was of course adopted because of other frustrating issues arising from the previous ‘cumulative average’ employed in phase 5.