Thursday, July 27, 2006

The Silly we think ...

Working in the field of high energy physics nowadays has become more like digging for a streak of gold in a dusty mine. The only difference is that the golden streak is a new particle (or a pile-up of many basic elements, its decay products) and the mine is the enormous data sampled over ages. And more than anything, what needs your presence is this data mining.

And you just have to be clever ... not only in picking up the correct thing ... but also in picking it up correctly ... what one would call picking up smartly ...

Nature does speak to you ... but to understand it, you need an interpreter ... the detectors ... and just like any other interpreter, detectors add their own personal touch to everything!

In all this, what can save you is called statistics ... from detector calibration to validation of your result ... everything can be fitted nicely, but only if you have good statistics ... that is, a large enough data sample ...

Given that we now understand nature to a very great extent, state-of-the-art research in high energy physics asks questions about particles one can't even think of seeing in one's dreams ... what I mean is, they are unimaginably tiny ...

So we go for a bigger ... much bigger ... no, in fact a very much bigger data size ... and that makes life worse ... you will have to spend more time analysing it ... and believe it or not, running your simple code on the whole data may take even months to finish ...

And here, you have got to be smarter ... much smarter ... no, in fact very much smarter ... and that makes life even worse ... you will have to go for clusters with more computers ... computers with many processors ... and processors with more speed ... what one technically calls n-node clusters with multicore processors ... (And there comes GRID computing, but more on this later.)

But hardware always has a limit ... the limit of its own physics ... (does that remind you of Ouroboros, the famous tail-eating snake?) ... so you have got to invoke software possibilities ... that is nothing but your brain ... (it is like biology coming to the rescue of the Ouroboros, by taking its life away before it eats itself up whole) ... so you devise smart ways to reduce the data size you wish to tackle ...

It is called skimming ... that means collecting only the part of the data that is meaningful to you ... when I started adopting this idea, I thought it would be great to store only the useful events in a separate data file and use this file instead of the whole data for future analysis ... and now I am gauging just how silly I was ...

Isn't it possible to make a data file, called an index file, which stores indices saying which events in the full data are useful and which are not? ... Won't it take less space and less effort than storing the same data events again separately? ...
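To make the index-file idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the event fields, the cut in `is_useful`, the JSON file format and the file names are hypothetical, and the "full data" is just a list held in memory. It is not how any particular analysis framework does it.

```python
import json
import random
import tempfile
from pathlib import Path


def make_events(n, seed=0):
    """Stand-in for the full data sample: one dict per event (hypothetical fields)."""
    rng = random.Random(seed)
    return [
        {"id": i, "energy": rng.uniform(0.0, 10.0), "payload": "x" * 200}
        for i in range(n)
    ]


def is_useful(event):
    """Hypothetical selection cut; a real analysis would apply its own."""
    return event["energy"] > 9.0


def write_index(events, path):
    """Store only the positions of the selected events, not the events themselves."""
    indices = [i for i, ev in enumerate(events) if is_useful(ev)]
    path.write_text(json.dumps(indices))
    return indices


def read_skim(events, path):
    """Replay the skim: look up the selected events in the full sample by index."""
    return [events[i] for i in json.loads(path.read_text())]


events = make_events(10_000)
with tempfile.TemporaryDirectory() as tmp:
    index_path = Path(tmp) / "skim.idx.json"
    copy_path = Path(tmp) / "skim.copy.json"

    # Option 1: the index file (positions only).
    indices = write_index(events, index_path)
    # Option 2: the "silly" way, copying the selected events into a new file.
    copy_path.write_text(json.dumps([events[i] for i in indices]))

    skim = read_skim(events, index_path)
    index_bytes = index_path.stat().st_size
    copy_bytes = copy_path.stat().st_size

print(len(skim), "events selected")
print("index file:", index_bytes, "bytes; copied events:", copy_bytes, "bytes")
```

The catch is that the index file is only useful as long as the full data stays available and can be read event by event, so it saves space and writing effort, but not the need to keep the original sample around.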

Research is not just about thinking ... it is all about smart thinking ...

After all, successful people don't do different things, they do things differently.

5 comments:

Shashikant Kore said...

Do you get your code reviewed by somebody? Many a time, a few minor changes improve the speed of computation drastically. These tricks are like black magic and are not available in books.

samudrika said...

...and I thought you guys had all the fun.

Nikhil Joshi said...

shashi:

hey...thanks for this suggestion. I never did that...will try doing it in future :)

But the time scale I mention is the typical one, nothing specific to my code ... one event is about 35 KB in size and we have more than 600 million events to analyse each time :)

samudrika:

we did have ... even making mistakes can be fun, you know ... you become a master :)

... but, yah...you are right .... sometimes we got to study when on tours :P

Anonymous said...

Great site, lots of useful information here.

Anonymous said...

Hey, what a great site, keep up the work, it's excellent.