    zbrkanac Posts: 5, Reputation: 1
    New Member
     
    #1

    May 23, 2013, 11:49 AM
    probability of multiple drawings
    Let's assume we have an urn with 20,000 numbers (1-20,000). Suppose we make 40 drawings of 20 numbers each, where after each drawing the numbers are recorded and returned to the urn.
    How do I calculate the probability that no number will be drawn twice, 1 number will be drawn twice, 2 numbers will be drawn twice, etc.?
    In other words, what is the probability for a number to be drawn twice under such circumstances?

    Thank you in advance for your help.
    zoran
    Curlyben Posts: 18,514, Reputation: 1860
    BossMan
     
    #2

    May 23, 2013, 12:06 PM
    What do YOU think?
    While we're happy to HELP, we won't do all the work for you.
    Show us what you have done and where you are having problems.
    ebaines Posts: 12,131, Reputation: 1307
    Expert
     
    #3

    May 24, 2013, 06:17 AM
    This is a very complicated calculation and hence I assume it is not homework. It's pretty straightforward to calculate the probability of no duplicates at all throughout all 40 drawings:

    1. For drawing 1, obviously there are no duplicates
    2. For drawing 2 there are no duplicates if all 20 numbers pulled are from the 19980 that were not chosen in the first round. The probability of this is

    $$\frac{P(19980, 20)}{P(20000, 20)} \approx 0.9802$$

    where P(a,b) means the number of permutations of a items taken b at a time.

    3. For drawing 3 there are no duplicates if all 20 numbers pulled are from the 19960 that have not been chosen in either of the first two rounds. The probability of this is:

    $$\frac{P(19960, 20)}{P(20000, 20)} \approx 0.9607$$

    So the probability of making it through round 3 with no duplicates is 0.9802 x 0.9607 = 0.9417.

    4. Continue on like this. For the 40 rounds you end up with:

    $$\prod_{k=1}^{39} \frac{P(20000 - 20k, 20)}{P(20000, 20)}$$

    This turns out to be a very small number: approximately 0.000000135.
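
    The round-by-round product described in steps 1 to 4 can be checked numerically. The following is a minimal Python sketch, assuming Python 3.8 or later for `math.perm`; the function names are illustrative and not part of the original post.

```python
import math

N = 20_000      # numbers in the urn
DRAW = 20       # numbers pulled per drawing
ROUNDS = 40     # number of drawings


def round_clean_probability(k: int) -> float:
    """P(a drawing avoids the 20*k distinct numbers seen in k earlier clean drawings)."""
    return math.perm(N - DRAW * k, DRAW) / math.perm(N, DRAW)


def no_duplicate_probability(rounds: int = ROUNDS) -> float:
    """Multiply the per-round probabilities for drawings 2..rounds."""
    prob = 1.0
    for k in range(1, rounds):
        prob *= round_clean_probability(k)
    return prob


print(round(round_clean_probability(1), 4))   # drawing 2
print(round(round_clean_probability(2), 4))   # drawing 3
print(no_duplicate_probability())             # all 40 drawings
```

    Python integers are arbitrary precision, so the large permutation counts are divided exactly before the result is converted to a float.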

    Your next question is about the probability of exactly one number being pulled twice. This would be quite a complicated calculation. But please clarify something: do you want to include the probability that one number will be drawn more than 2 times, or not?
    zbrkanac Posts: 5, Reputation: 1
    New Member
     
    #4

    May 24, 2013, 01:44 PM
    Hi ebaines

    Thank you so much for this explanation.
    I think I understand it, although I would need to calculate it "manually".
    Intuitively, I wasn't expecting such a negligible probability of having no duplicates.

    Re: the second part (thanks for asking). I don't think I formulated the second part of my question well. I am interested in knowing what to expect from my drawings.

    It looks like I can expect to see zero numbers drawn twice with probability close to zero.
    That implies that the probability of seeing one or more numbers drawn twice or more is close to 1 (99%).

    I would like to know how to break this 99% into smaller chunks.

    This would include the probability of one number drawn twice,
    two numbers drawn twice,
    three numbers drawn twice, etc.,
    and then one number drawn three times, etc.

    Intuitively, the chances of one number being drawn three times are quite small, probably around 1% or less.

    I guess only a few outcomes would account for most of the probability. So when I think about it, I am not sure how to ask my question most precisely. What I would like to know is which outcomes are most probable. I guess this is getting more complicated.

    thanks

    zoran


    What is the probability of seeing two numbers drawn twice, or three numbers drawn twice?

    The probability of one number being drawn three times must be very close to 0, although I am not exactly sure how to calculate it.



    ebaines Posts: 12,131, Reputation: 1307
    Expert
     
    #5

    May 28, 2013, 06:25 AM
    One way to approach this is to determine the expected number of duplicates that will occur, and the standard deviation as well. Let's start with the probability that any particular number (say, 37) will be drawn precisely twice. That probability is:

    $$\binom{40}{2} \left(\frac{20}{20000}\right)^{2} \left(1 - \frac{20}{20000}\right)^{38} \approx 0.000751$$


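    Given that there are 20K numbers to choose from, the expected value of the number of duplicates and the standard deviation can be found using binomial probability theory:

    $$E = 20000 \times 0.000751 \approx 15.02, \qquad \sigma = \sqrt{20000 \times 0.000751 \times (1 - 0.000751)} \approx 3.87$$
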

    Thus on average you can expect 15 duplicates with a standard deviation of 3.87. See the attached plot of these results. Basically you have a 95% chance that the number of duplicates will be between 8 and 22.

    Applying this same method to triplets, the expected number of triplets is 0.19 with a standard deviation of 0.436. The probability of zero triplets occurring is about 83%, the probability of one triplet is about 16%, the probability of two is about 1.5%, and it is essentially 0 for any more than that.
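
    The binomial calculation for duplicates and triplets can be reproduced with a few lines of Python. This is a sketch under the same modelling assumption as the post, namely that each of the 20,000 numbers is treated as an independent trial, which is an approximation; the function names are illustrative only.

```python
import math

N = 20_000          # numbers in the urn
ROUNDS = 40         # drawings
P_HIT = 20 / N      # chance a given number appears in one drawing


def p_exactly(times: int) -> float:
    """Binomial probability that one given number appears in exactly `times` drawings."""
    return math.comb(ROUNDS, times) * P_HIT**times * (1 - P_HIT) ** (ROUNDS - times)


def count_stats(p: float):
    """Mean and standard deviation of a Binomial(N, p) count."""
    return N * p, math.sqrt(N * p * (1 - p))


def count_pmf(p: float, k: int) -> float:
    """P(exactly k of the N numbers qualify) under the same binomial model."""
    return math.comb(N, k) * p**k * (1 - p) ** (N - k)


for times, label in ((2, "duplicates"), (3, "triplets")):
    p = p_exactly(times)
    mean, sd = count_stats(p)
    print(label, "p =", p, "mean =", mean, "sd =", sd)

# Distribution of the number of triplets (0, 1 or 2 of them).
p3 = p_exactly(3)
print([count_pmf(p3, k) for k in range(3)])
```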
    [Attached image: plot of these results]
    zbrkanac Posts: 5, Reputation: 1
    New Member
     
    #6

    May 29, 2013, 05:46 PM
    Hi ebaines,

    Thank you so much!
    This makes sense.
    Looking into it, I think the last equation should be
    20x40x0.000751=0.6008
    as we are drawing 20x40=800 times.
    (Expecting 0.6008 duplicates fits better with my intuitive guess.)

    I really appreciate your help.
    Thanks,
    zoran
    ebaines Posts: 12,131, Reputation: 1307
    Expert
     
    #7

    May 30, 2013, 06:11 AM
    Quote Originally Posted by zbrkanac View Post
    looking into it I think that last equation should be
    20x40x0.000751=0.6008
    as we are drawing 20x40=800 times
    (Expecting 0.6008 duplicates fits better with intuitive guess)
    The equation I had is correct. I'm afraid your intuition is leading you astray - the probability of having just 0 or 1 duplicates in 800 draws is incredibly small. To test this I ran a Monte Carlo simulation of this game 25 times and got the following number of duplicates:

    16, 11, 15, 22, 13, 15, 20, 18, 14, 21, 25, 16, 18, 15, 10, 20, 9, 16, 16, 14, 15, 20, 14, 17, 15

    Results ranged from a low of 9 to a high of 25 with a mean of 16.2 and a standard deviation of 3.66, which is pretty close to my earlier calculation.
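
    A simulation along these lines is easy to sketch in Python. The post does not say how the original Monte Carlo run was implemented or exactly how duplicates were counted, so the version below is an assumption: it counts the numbers that show up in exactly two of the 40 drawings. It also recomputes the summary statistics of the 25 reported results.

```python
import random
import statistics
from collections import Counter

N, DRAW, ROUNDS = 20_000, 20, 40


def play_once(rng: random.Random) -> int:
    """Run 40 drawings of 20 distinct numbers; count numbers seen in exactly two drawings."""
    seen = Counter()
    for _ in range(ROUNDS):
        # sample() draws without replacement, so a drawing never repeats a number.
        seen.update(rng.sample(range(1, N + 1), DRAW))
    return sum(1 for c in seen.values() if c == 2)


def simulate(games: int, seed: int = 0) -> list:
    """Play `games` independent games with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [play_once(rng) for _ in range(games)]


# The 25 results reported in the post, for comparison.
reported = [16, 11, 15, 22, 13, 15, 20, 18, 14, 21, 25, 16, 18,
            15, 10, 20, 9, 16, 16, 14, 15, 20, 14, 17, 15]
print(statistics.mean(reported), statistics.pstdev(reported))

results = simulate(200)
print(statistics.mean(results), statistics.pstdev(results))
```

    Using more games than the original 25 narrows the sampling error of the simulated mean, which makes the comparison with the binomial estimate easier to judge.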

    The problem with intuition is that you are probably thinking the odds of a duplicate should be around 800/20000 = 0.04. But that's not the right way to think about it. Instead, consider the probability of no matches, which as shown earlier is very small; hence the probability of there being at least some matches is quite high.

    An analogous situation is the old problem of determining the probability that at least two people in a crowd share a birthday. How many randomly selected people do you need in a room for the odds to be favorable that at least two have the same birthday? The "intuitive" answer is 365/2 = 182.5, but the actual answer is 23: given a room of 23 randomly selected people, the odds of at least one duplicate birthday are 51%. If you have 50 people in the room, the odds of at least one duplicate are about 97%. With 50 people you know that at most 50/365 = 14% of all dates can be represented, so it may seem incredible that with so few people you are virtually guaranteed to have duplicates. If you do the same calculation with your game, it turns out that the odds of a duplicate number occurring hit 50% at around the 168th number drawn! Seems incredible, but true.
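
    The birthday calculation can be written as a short loop. The sketch below covers the classical version with 365 equally likely birthdays; the function names are illustrative and not from the original post.

```python
def p_shared(n_items: int, n_slots: int) -> float:
    """P(at least two of n_items land in the same one of n_slots equally likely slots)."""
    p_all_distinct = 1.0
    for i in range(n_items):
        # The (i+1)-th item must avoid the i slots already taken.
        p_all_distinct *= (n_slots - i) / n_slots
    return 1.0 - p_all_distinct


def first_n_over_half(n_slots: int) -> int:
    """Smallest group size whose collision probability reaches 50%."""
    n = 1
    while p_shared(n, n_slots) < 0.5:
        n += 1
    return n


print(first_n_over_half(365))
print(round(p_shared(23, 365), 3), round(p_shared(50, 365), 3))
```

    Calling the same functions with `n_slots=20000` models numbers drawn one at a time with replacement. That is only a simplification of the urn game, where the 20 numbers within a single drawing cannot repeat, so the two answers need not agree exactly.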
    zbrkanac Posts: 5, Reputation: 1
    New Member
     
    #8

    Jun 5, 2013, 03:31 PM
    Hi ebaines,

    Yes, on one hand this seems quite simple, but I still have difficulty wrapping my head around it and seeing the results clearly.

    I have an additional, similar problem with only 2 drawings. The first time, I take 440 numbers out of 20,000 and put them back. The second time, I take 220 out. What is the expectation: how many numbers will I see twice, i.e., how many of the numbers I drew the first time will I draw again? I was trying to play with the binomial formula, but I am not sure whether it is applicable to two uneven drawings or how to plug the numbers into the formula, i.e., how many trials I have, how many successes, what the probability of success is, etc.

    Could the solution be really simple? I.e.,
    count only the second drawing and ask: what is the probability that one of the 440 numbers drawn first will be drawn again in 220 draws? 440/20,000 x 220 = 4.84? SD ~ 2.2

    Intuitively this seems like an OK number.

    thanks zoran
    ebaines Posts: 12,131, Reputation: 1307
    Expert
     
    #9

    Jun 6, 2013, 04:23 AM
    What you've calculated is not the probability that a number will be repeated but rather the expected number of repeats that will occur. Remember that probabilities are always between 0 and 1. But with that correction, yes: 440/20000 x 220 = 4.84 is the expected number of repeats. The standard deviation is

    $$\sqrt{220 \times 0.022 \times (1 - 0.022)} \approx 2.18$$
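
    The expected number of repeats and its spread for the two-drawing problem can be computed directly. The sketch below shows two standard deviations: the binomial one, which treats the 220 second-round picks as independent, and a hypergeometric one, which accounts for the second drawing being made without replacement. Which of the two the thread intended is an assumption on my part; the variable names are illustrative.

```python
import math

N = 20_000     # numbers in the urn
FIRST = 440    # numbers recorded in the first drawing
SECOND = 220   # numbers pulled in the second drawing

p = FIRST / N                      # chance one second-round number was seen before
expected_repeats = SECOND * p      # same mean under both models

# Binomial approximation: the 220 second-round picks are treated as independent.
sd_binomial = math.sqrt(SECOND * p * (1 - p))

# Hypergeometric form: applies the finite-population correction (N - n) / (N - 1).
sd_hypergeometric = sd_binomial * math.sqrt((N - SECOND) / (N - 1))

print(expected_repeats, round(sd_binomial, 2), round(sd_hypergeometric, 2))
```

    With only 220 draws out of 20,000 the correction factor is close to 1, so the two standard deviations differ only slightly.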
    zbrkanac Posts: 5, Reputation: 1
    New Member
     
    #10

    Jun 6, 2013, 09:37 AM
    Great,
    Thank you!
