Thursday, May 25, 2006

Randomized Algorithms

Since my mind has been wandering away from my work, I couldn't think of anything fruitful to report..:).. Interestingly, it has been my observation that many findings stem from simple facts that we experience and work through in day-to-day life; they just haven't been documented and thought through in an academic sense. Recently, I heard about applying randomized algorithms to certain technical problems.

The question is this: if you have 1000 apples and one of them is bad, how will you detect it? The obvious solution would be to check the apples one by one until you find the odd one out. If you get lucky, you will detect it very quickly. On a bad day, you may end up looking at 999 apples before finding that the 1000th apple was the bad one. How does one solve this problem better? Though people have come up with clever ways to attack it, once the state space explodes, it becomes a very difficult problem to solve.

Probability comes to the rescue. The question is now twisted: if you have time to pick only 10 apples before you have to move on, which 10 apples would you pick? What is the probability that you will succeed in finding the odd apple out by picking only 10? It has been studied probabilistically that a smaller sample, picked at random from a given group, can still detect the odd one out with high probability. I don't want to get into the math of it. In everyday life, we apply the same principle in a lot of cases where we face complex decisions, because trying to analyse all possibilities could be a nightmare and we don't have that much time. In the big picture, when there are more uncertainties, we are probably better off solving a smaller set of finite problems than trying to solve the bigger problem itself...
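As a minimal sketch of the idea (assuming the apples are modeled as a simple list of booleans where True marks the bad one), the random-sampling check might look like this in Python:

```python
import random

def sample_check(apples, sample_size=10):
    """Pick `sample_size` apples at random and report whether
    any bad apple (True) turned up in the sample."""
    picks = random.sample(range(len(apples)), sample_size)
    return any(apples[i] for i in picks)

# 1000 apples, one of them bad, at a random position
apples = [False] * 1000
apples[random.randrange(1000)] = True

print(sample_check(apples, sample_size=10))  # True only if the bad apple was caught
```

Each call inspects only 10 apples instead of up to 999, trading certainty for speed.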

5 comments:

BrainWaves said...

That is what I call postponing my decision until I absolutely have to decide. And what elders call "cross the bridge when you come to it." :))

It is certainly a very interesting topic. Or one of those things we take for granted.

Manohar said...

any links to the actual theory? i'm intrigued.

Suresh Sankaralingam said...

Sorry for the late reply, dude. I don't have any specific links, since some of the latest algorithms (specifically networking scheduler algorithms) use this as a premise.. I will let you know if I find anything concrete.

bumblebee said...

I couldn't agree more. We are in the midst of a re-org at work and are not very clear about our team's objectives. I feel that to do something of value, we need a goal that is finite, something we can wrap our arms around. I believe that unless we get into the details, all we will ever do is sit and talk ambiguously about macro-goals and fail to do anything constructive. Many times, functional decomposition works.

Mad Max said...

Got here through Meera's blog...

Very, very interesting article, and a nice, subtle connection to reality.. I'm kind of intrigued by the example quoted here... here is the cause of my confusion...

Since you have mentioned that the task at hand is to find one rotten apple out of 1000... does randomization work any better than deterministic algorithms?..

Let us look at using the entire sample space... the reason I bring this point up is that randomized algorithms are based on the concept of expected values, which always assign probabilities... now, if the case were detecting, say, 100 bad apples out of 1000, then randomized algorithms would be far more efficient, especially if ordering is not an issue...
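As a quick back-of-the-envelope sketch of that point: the chance that a random sample of k apples catches at least one of b bad apples among n is 1 - C(n-b, k)/C(n, k), which in Python looks like:

```python
from math import comb

def hit_probability(n, b, k):
    """Probability that a random sample of k apples (drawn without replacement)
    from n apples containing b bad ones includes at least one bad apple."""
    return 1 - comb(n - b, k) / comb(n, k)

print(hit_probability(1000, 1, 10))    # ~0.01  -- a single bad apple is hard to catch
print(hit_probability(1000, 100, 10))  # ~0.65  -- many bad apples are caught easily
```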

However, it is my understanding that you need a fair bit of cross-sectional variation for randomized algorithms to actually work... hence it might not be the most efficient way to sort out one bad apple from a thousand...

Anyway, that was a tangent... my next point is that we can choose a subset of 10 apples from the 1000 and then check the probability that 1 out of the 10 is bad.. that's a far easier problem to solve... however, it is computationally intensive... plus, if you are also randomizing over cuts of the sample space, then the number of runs required would be quite large, right?.. we would need several rounds of sampling with replacement and would estimate the probabilities from them... the distribution of such hit rates can then be used to draw inferences... this, to my knowledge, is what we call bootstrap methods / repeated sampling, etc... (see the sketch below)
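A small sketch of that repeated-sampling idea, assuming each round independently draws 10 apples and the hit rate is estimated by averaging over many rounds:

```python
import random

def estimate_hit_rate(n=1000, bad=100, sample_size=10, rounds=10_000):
    """Repeatedly draw a random sample of apples and record how often
    at least one bad apple shows up; the average is the empirical hit rate."""
    apples = [True] * bad + [False] * (n - bad)  # True marks a bad apple
    hits = 0
    for _ in range(rounds):
        sample = random.sample(apples, sample_size)
        hits += any(sample)
    return hits / rounds

print(estimate_hit_rate())        # hovers around the exact ~0.65 for 100 bad apples
print(estimate_hit_rate(bad=1))   # hovers around the exact 0.01 for a single bad apple
```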

Anyway, this is just a matter of clarification, because my understanding of randomized algorithms is quite limited.. a clarification would be appreciated.. thanks.