Automatic analysis.

Discuss your experiences with and ideas about Stardust@home here.

Moderators: Stardust@home Team, DustMods

maurix
Posts: 4
Joined: Sat Aug 05, 2006 2:10 pm

Post by maurix »

Well, here is what I have done, or rather what I have started doing:

Divide the image into 32x32 blocks and compute the spectrum of each block with an FFT. From the high-frequency cut-off I can estimate how well focused each block is; I will call this parameter the focus level.

For each focus movie you can then plot the focus level against frame number and find its maximum.

Where many sub-blocks of an image agree on the frame of maximum focus level, we are at the surface, with a lot of dust in focus.

If we can't find the surface, or the focus is too low, we report bad focus.

Then we search for something in focus below the surface. If nothing is found (everything is blurred) we give the focus movie a 0; if something is found, we parameterize it and assign a value to each focus movie.
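A minimal sketch of that focus metric in Python (assuming each frame is a 2-D grayscale NumPy array; the 32x32 block size and the frequency cut-off are illustrative values, not calibrated ones):

```python
import numpy as np

def block_focus_levels(frame, block=32, cutoff=0.25):
    """Per-block focus level: fraction of FFT energy above a chosen
    spatial-frequency radius (the threshold is illustrative)."""
    fy, fx = np.meshgrid(np.fft.fftfreq(block), np.fft.fftfreq(block), indexing="ij")
    high = np.hypot(fy, fx) > cutoff           # high-frequency mask for one block
    h, w = frame.shape
    levels = np.zeros((h // block, w // block))
    for i in range(levels.shape[0]):
        for j in range(levels.shape[1]):
            tile = frame[i * block:(i + 1) * block, j * block:(j + 1) * block].astype(float)
            spectrum = np.abs(np.fft.fft2(tile)) ** 2
            total = spectrum.sum()
            levels[i, j] = spectrum[high].sum() / total if total > 0 else 0.0
    return levels

def surface_frame(movie, block=32):
    """Frame index where most blocks hit their maximum focus level,
    i.e. where the aerogel surface should be."""
    stack = np.array([block_focus_levels(f, block) for f in movie])  # (frames, rows, cols)
    best = stack.argmax(axis=0)                # per-block frame of sharpest focus
    return int(np.bincount(best.ravel()).argmax())
```

Anything that comes into sharp focus well below the surface frame would then be the interesting part.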
maurix
Posts: 4
Joined: Sat Aug 05, 2006 2:10 pm

Post by maurix »

Well, the software is easy and I have started on it. I was thinking of writing a bot to get a high score and to have a much better chance of finding a real track! :P

Maybe they could give us the images in a database, and with only 100 "volunteers" we could develop a good open-source program for this.

OK, OK, it's not my research, and it is fun seeing the images, that's true, but I think like a data analyst; professional deformation! Sorry!
josemiguel
Posts: 23
Joined: Sat Aug 05, 2006 11:29 am

Post by josemiguel »

From Stardust@home FAQs

Q. Why can't a computer program find the particles?

A. Before a "pattern recognition" computer program can identify the telltale signs of the impact of an interstellar dust particle in aerogel, it has to "learn" the pattern from existing examples of such impacts. Since interstellar dust has never before been captured in aerogel, no such examples exist! As a result, no computer program is able to recognize the pattern. In contrast, the human eye can recognize such impacts with just a minimal amount of training.
maurix
Posts: 4
Joined: Sat Aug 05, 2006 2:10 pm

Post by maurix »

Yes, that may be true, but surely a computer can identify "something strange" below the surface and discard the thousands of images where there is certainly nothing strange.
I don't quite agree with that answer. And I'm not saying that _only_ a computer must do this; I'm asking whether a computer can help us in the search!
icestation
Posts: 13
Joined: Sat Aug 05, 2006 9:51 am

Post by icestation »

icebike wrote: But it will take two years to fund, write, debug, and test that software, and in the meantime 10 thousand pairs of eyes are finding things faster.
I don't think it's a matter of funds or time. The project was launched in 1999 and was prepared even earlier by NASA, so they had plenty of time and money to write any program, if it were possible.

Computers can be used to discard the movies that most obviously contain no tracks, but with only a few particles to look for, NASA can't take the risk of missing them with an automated scan.
icebike

Post by icebike »

icestation wrote: Computers can be used to discard the movies that most obviously contain no tracks, but with only a few particles to look for, NASA can't take the risk of missing them with an automated scan.
I gotta agree with that point.

It would seem that many bad-focus movies could have been set aside with a minimal amount of processing, such as finding the average shade of gray and the darkest black, and doing some mathematical (mumble) analysis.
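Something like that first pass might look like this (a sketch; the contrast threshold is a made-up number that would have to be tuned against known good and bad movies):

```python
import numpy as np

def looks_out_of_focus(movie, min_contrast=20):
    """Flag a focus movie whose sharpest frame still shows almost no
    contrast between the average gray and the darkest black.
    The threshold is illustrative only."""
    best = 0.0
    for frame in movie:                        # each frame: 2-D grayscale array
        best = max(best, frame.mean() - frame.min())
    return best < min_contrast
```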
Chuck Crisler
Posts: 30
Joined: Wed Aug 02, 2006 9:44 am
Location: Windham, NH

most of these replies are wrong

Post by Chuck Crisler »

First, there are fascinating mathematical methods by which systems can 'learn' to identify enormously complex multi-dimensional patterns. The model I studied in school is Bayesian learning, named for the Rev. Thomas Bayes, whose theorem dates back to the 18th century. It does require a person to 'hand-hold' it through a learning process, but once calibrated it doesn't make 'mistakes'. Also, you always have the option of 're-calibrating' it once you learn something new, or of implementing an entirely different pattern-recognition algorithm. It definitely seems to me that NASA has missed the boat on this. Though there may not be another project planned exactly like this one, the framework to drive the recognition could be used on any and every project that has huge amounts of image data.
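For what it is worth, the core of such a learner is small. Here is a toy Gaussian naive-Bayes sketch over hand-labeled feature vectors (the features and labels are assumptions for illustration, not anything the project actually uses):

```python
import numpy as np

class NaiveBayes:
    """Toy Gaussian naive Bayes: learn per-class feature means and
    variances from hand-labeled examples, then score new movies."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes = np.unique(y)
        self.prior = np.array([(y == c).mean() for c in self.classes])
        self.mean = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # log prior + sum of per-feature Gaussian log-likelihoods
        log_post = np.log(self.prior) - 0.5 * (
            np.log(2 * np.pi * self.var).sum(axis=1)
            + (((X[:, None, :] - self.mean) ** 2) / self.var).sum(axis=2)
        )
        return self.classes[log_post.argmax(axis=1)]

# X_train: feature vectors (e.g. focus level, contrast, spot size) for
# hand-labeled movies; y_train: 0 = no track, 1 = track (hypothetical).
# model = NaiveBayes().fit(X_train, y_train); model.predict(X_new)
```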

I write software professionally (and have for 30 years - CADD, database, client/server, networking, video conferencing). The project would take me about 9-12 months (rough guess). There is a lot of free, publicly available work to draw on for the technical problems of pattern identification and for solving the complex evaluation equations.

Frankly, this would be a PERFECT master's/PhD thesis for someone at Berkeley!
Jwb52z
Posts: 61
Joined: Thu Aug 03, 2006 5:05 am

Post by Jwb52z »

You all may be very smart, but you sure can't pay attention very well. The moderators and the team have already stated that, within the constraints they have, it would not work the way you all insist it can, so there's no reason to keep on about it.
icebike

Post by icebike »

Jwb52z wrote: You all may be very smart, but you sure can't pay attention very well. The moderators and the team have already stated that, within the constraints they have, it would not work the way you all insist it can, so there's no reason to keep on about it.
Even Rocket scientists call plumbers.

Just because the moderators (who have no special training and are here to keep order) and the team (who are specialists in other areas) say it can't be done, doesn't mean that a specialist with training in automated image analysis couldn't do it.
Wolter
DustMod
Posts: 457
Joined: Mon May 22, 2006 2:23 am
Location: Enkhuizen, the Netherlands

Post by Wolter »

Indeed we, the moderators, are here only to dust the fora as well as the aerogel. :lol:

On a personal note, if a rocket scientist is able to call a plumber, wouldn't he/she also be able to call an image analyst?
In other words, I believe that the Stardust team (where I doubt you will find a rocket scientist) has discussed several options for handling this data. (There was time both before and during the trip to think about it.)
Since there are astronomers on the team, I assume they are familiar with automated image analysis.
They have come up with this solution: perhaps the best, perhaps the most cost-effective, perhaps the most psychologically rewarding one. So who am I (looking from the outside at a tiny portion of the possible problems inside) to claim that I could do this better?

Besides all of that, I think it is one of the greatest public science projects ever.

Happy hunting all. :D
Just dusting...
cthiker
Posts: 90
Joined: Fri May 19, 2006 10:35 am
Location: Woodbridge, CT

Re: most of these replies are wrong

Post by cthiker »

Chuck Crisler wrote: First, there are fascinating mathematical methods by which systems can 'learn' to identify enormously complex multi-dimensional patterns. The model I studied in school is Bayesian learning, named for the Rev. Thomas Bayes, whose theorem dates back to the 18th century. It does require a person to 'hand-hold' it through a learning process, but once calibrated it doesn't make 'mistakes'.
Hi Chuck!

As someone who also works with AI and neural-network systems (I'm using them for clinical differential diagnostic analysis on traumatic brain-injured patients), I agree that a Bayesian approach would seem to have merit. The true problem is always setting the "gold standard": the set of data from which to learn the desired outcome.

As noted by others, even the Team is reluctant to over-specify the limits of what counts as a "hit". I myself have already found several candidates (one mustering a first pass) that were not very close to any of the training models presented. Using my most near and dear massively parallel processor (my brain), I reasoned my way to some hits rather than simply observing them. Now, does that mean that my reasoning would be closely emulated across the 12K or so participants? Highly doubtful. But, as the old adage goes, an infinite number of monkeys pounding on an infinite number of keyboards for an infinite period of time will produce all the works of Shakespeare. Although finite in all dimensions, there is sufficient diversity among dusters that, when they come to a reasonably common conclusion, this actually increases the probability that the hit will successfully pan out! Such a model would, indeed, be very difficult to replicate programmatically.
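That "agreement raises the probability" point can be put in numbers (all the rates below are invented for illustration, not measured project figures):

```python
from math import comb

def consensus_posterior(k, n, prior=0.001, tpr=0.8, fpr=0.1):
    """Probability a movie really holds a track, given that k of n
    independent viewers flagged it. prior, tpr (hit rate on real
    tracks) and fpr (false-alarm rate on blanks) are illustrative."""
    like_track = comb(n, k) * tpr ** k * (1 - tpr) ** (n - k)
    like_blank = comb(n, k) * fpr ** k * (1 - fpr) ** (n - k)
    return prior * like_track / (prior * like_track + (1 - prior) * like_blank)

for k in range(5):
    print(f"{k} of 4 agree -> {consensus_posterior(k, 4):.4f}")
```

With those made-up numbers, a movie flagged by all four viewers jumps from a 0.1% prior to roughly an 80% chance of being real.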

The bottom line is that, aside from being relatively cheap (sorry, folks, but the reality is we're free... only the access system and its database incur cost), this approach provides an improvement over the assumptions required for any Bayesian review. Some time down the road, however, when we really begin to know what a "track" does and can look like, your approach would have far greater merit.

Hope that made sense...comments/questions??

Happy Dusting!!!
icebike

Re: most of these replies are wrong

Post by icebike »

cthiker wrote:
The bottom line is that, aside from being relatively cheap (sorry, folks, but the reality is we're free... only the access system and its database incur cost), this approach provides an improvement over the assumptions required for any Bayesian review. Some time down the road, however, when we really begin to know what a "track" does and can look like, your approach would have far greater merit.

Hope that made sense...comments/questions??

Happy Dusting!!!
Yes, we are Cheap, and Quick.
Quick to bring on line:
12,000 participants were trained in no time and started processing with a minimal software installation (training movies).
Quick to process the data:
I get the impression we are running ahead of the actual photography task.

Software would be slow to arrive but quick once running:
It would take (by Chuck's rather ambitious estimate) 12 months to get the code working.

But from then on it would work very fast, and could (with some effort) be reprogrammed to start scanning all over again if/when it is found that tracks look different than expected.
Working in parallel, banks of generic PCs (as Google uses) could probably whip through the entire database in a few hours.
icebike

Post by icebike »

Wolter wrote: the Stardust team (where I doubt you will find a rocket scientist) has discussed several options for handling this data.
I use 'rocket scientist' in the generic sense. I doubt there really is such a specialty, but if there is, they probably work for NASA too...
Wolter wrote:
Since there are astronomers on the team, I assume they are familiar with automated image analysis.
They have come up with this solution: perhaps the best, perhaps the most cost-effective,
Happy hunting all. :D
The images available to astronomers are vastly cleaner than these, and many a celestial object has been found by computer analysis of images of the same chunk of sky taken at different times. Some by accident, some because they were looking for something specific.

However, in the end, I think it all comes down to money. You have to pay high-priced programmers to develop and babysit any new software, costing piles of money and months of effort before a single movie can be examined. The technology is far from certain, and the knowledge needed is not commonly found and is therefore expensive.

Web development, by contrast, is a known science, and once set up it can run largely unattended for months or years, costing only bandwidth and storage.

This is not to belittle the efforts of the web designers, I think they have done a great job, and the model works very well.

I think it's all a cost issue, and what we have is a brilliant use of the limited funding.

The movies will still be available in digital form for anyone wanting to try their hand at automated detection.
decomite
Posts: 5
Joined: Thu Aug 17, 2006 1:47 pm

Re: most of these replies are wrong

Post by decomite »

As I said in another post below this one, I think two months would have been enough to write, debug, and test track-recognition software: you do not have to rewrite everything from scratch; a lot of tools are in the public domain (SNNS, Weka).
farpung
Posts: 22
Joined: Fri Aug 04, 2006 9:01 am
Location: Quebec

What is the real reason for using volunteers?

Post by farpung »

While we are on the subject of questioning whether the Stardust@home project is the most efficient and cost-effective approach... Perhaps it is mean-spirited of me, since I enjoy looking for tracks and contributing to this project, but I have to say I find the original justification for this approach a little dishonest.

I quote (http://stardustathome.ssl.berkeley.edu/about.php): "If we were doing this project twenty years ago, we would have searched for the tracks through a high-magnification microscope. Because the view of the microscope is so small, we would have to move the microscope more than 1.6 million times to search the whole collector. In each field of view, you would focus up and down by hand to look for the tracks. This is so much work, that even starting twenty years ago, we would still be doing it today!

"This is where you come in: By asking for help from talented volunteers like you from all over the world, we can do this project in months instead of years. Of course, we can't invite hundreds of people to our lab to do this search—we only have two microscopes!"

This statement is not logical and seems intentionally confusing. It compares the hypothetical situation of using 20-year-old technology (manual microscopy) and the 2 microscopes currently available (which the author says would take at least 20 years) with a volunteer-based system that will take only months using modern automated microscope techniques. But that is comparing apples and oranges, surely!
Viewing the aerogel manually is not an option when automated microscopy can already do the job in a short time. That is the real reason this project can be completed "in months rather than years." The "more than 20 years," "only 2 microscopes," and "manually focusing up and down 1.6 million times" are all red herrings!

The original plan was to have each movie viewed by a very small number of volunteers (4, I think) and to have it viewed again only if more than 1 flagged a track (that has changed a lot because of the vast number of viewers who have volunteered, and their enthusiasm).

How long would it actually take if a small number of staff were trained to search for tracks, as the volunteers are doing?
There are 700,000 movies. Looked at conservatively, if each movie were viewed 4 times (only once for "out-of-focus," but that movie is redone and reviewed) by well-trained viewers, that should be more than enough. If each viewer takes on average 10 seconds per viewing (that should be more than enough too) and works 8 hours a day, 5 days a week, a team of 6 viewers could complete the whole project in about 7.5 months, which is roughly the estimated time for the project using the Stardust@home system.
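The arithmetic checks out (a quick sketch using the figures above, plus the $20/hour rate mentioned below):

```python
movies = 700_000
viewings_per_movie = 4
seconds_per_viewing = 10
viewers = 6
hours_per_day = 8
workdays_per_month = 5 * 52 / 12      # 5-day weeks, about 21.7 workdays/month

viewer_hours = movies * viewings_per_movie * seconds_per_viewing / 3600
months = viewer_hours / (viewers * hours_per_day * workdays_per_month)
cost = viewer_hours * 20              # at $20/hour

print(f"{viewer_hours:,.0f} viewer-hours, {months:.1f} months, ${cost:,.0f}")
# -> 7,778 viewer-hours, 7.5 months, $155,556
```

That is within rounding of the ~$155,500 figure quoted below.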

Now, if it is objected that nobody would want to view the movies that intensively, for the same price 25 viewers (students, etc.) could be employed one day (or two half-days, or whatever) a week each.

I have no idea what kind of pay would attract the right sort of viewers (a large number of applicants could take basic training and be screened for skill), but I think $20 an hour might do it. That would mean about $155,500 for the viewing of the whole project. Subtract from that the reduced cost/time for programming, web design, answering forum questions, and managing the vast army of volunteers and the data they are producing (but not the cost of the moderators, because they are volunteers, bless them!), and the cost of employing a few viewers could be much less than $155,500. Compared to the reputed "millions" this Stardust project has cost, it would be very minor.

I suppose I should "shut the **** up" because what I am saying could discourage volunteers, but I still think that the initial advertising of this project ("months not years") was dishonest, albeit motivating.

Possibly the @home approach will save money and/or time, though not much. I think the real reason for using the @home approach is as an experiment: a trial run for future reference, for fun, to see if it can be done, and to engage a large number of people in the project and therefore in space research, catching our enthusiasm and interest, which all helps politically with future government funding of space research. And to give a large number of people a chance to have fun contributing to science. All of which are great and worthwhile aims. I just want to say that the real reason cannot be money or time.

Thanks for doing it this way, just the same. Your invitation to join the party and work to make it possible is much appreciated.