5 of our 301 active users have their birthday today. I'm not sure what the odds of that are. I don't even know if this is the most birthday-laden day of the year (in terms of active The-W.com users). On the one hand, it seems uncommon that 1 in 60 users would have a birthday on the same day, when you consider that there are 366 days a person can be born on. But on the other hand, you've probably heard that if you have just 23 people in the same room, there's about a 50% chance that two of them will have the same birthday. You might be able to adapt that to see what the odds of 5 people having the same birthday would be, but I'm not sure how to go about it.
At any rate, happy birthday to the five of you on this randomly significant (or possibly not) day.
EDIT: Found an Excel function that claims to solve this problem, but (1) I don't have Excel and (2) when I tried to calculate the terms manually they were too big for OpenOffice, which apparently has a maximum integer limit around 7 * 10^307. Best I was able to get was that the odds of 5 people in a group of 100 having the same birthday were 0.000000454%. The odds for a group of 301 people would obviously be higher, but I have no idea how much.
Obviously no one else cares, so all I'll say is that I ran some simulations, and it appears that the odds of at least 5 people in a group of 301 having the same birthday are something like 44%. So I was obviously mistaken about its rarity.
My combination math is a little rusty, but if I recall, it can go something like this.
Think about it this way. You have 365 boxes and 301 birthday cards. 5 cards go in the 1st box, the other 296 can go in any of the other 364 boxes. (I don't remember if this takes into account that no other 5 cards end up in the same box. It probably doesn't and would make the ultimate percentage even lower)
The 5 cards can be picked C(301,5) ways, and the other 296 can go 364^296 ways. The total possible ways would be 365^301, so the odds for that one box come to C(301,5) * 364^296 / 365^301.
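For anyone who wants to check the arithmetic without running into a spreadsheet's number limits, here is a minimal Python sketch of the box-and-cards count above. Python integers don't overflow, so the C(301,5), 364^296 and 365^301 terms can be plugged in directly. The function names are made up for illustration. The last step, which turns the one-day figure into an "any day of the year" estimate, assumes the 365 days behave roughly independently, and that is only an approximation.

```python
from math import comb, exp

DAYS = 365     # boxes (Feb 29 ignored, as in the post above)
PEOPLE = 301   # birthday cards

def p_exact(k, people=PEOPLE, days=DAYS):
    """Chance that one particular day gets exactly k of the birthdays."""
    # C(people, k) ways to pick the k cards, (days-1)^(people-k) ways to
    # place the rest, out of days^people total arrangements.
    return comb(people, k) * (days - 1) ** (people - k) / days ** people

def p_at_least(k, people=PEOPLE, days=DAYS):
    """Chance that one particular day gets k or more birthdays."""
    return 1.0 - sum(p_exact(j, people, days) for j in range(k))

one_day_exact = p_exact(5)
one_day_at_least = p_at_least(5)

# Rough estimate for "some day has 5 or more": treat the 365 days as
# independent (an assumption, they are not quite independent).
any_day_estimate = 1.0 - exp(-DAYS * one_day_at_least)

print(f"exactly 5 on one given day:    {one_day_exact:.4%}")
print(f"5 or more on one given day:    {one_day_at_least:.4%}")
print(f"5 or more on some day (rough): {any_day_estimate:.1%}")
```

The first number is the one the box formula gives, and it only covers one specific day with exactly 5 birthdays. Once any of the 365 days is allowed to be the crowded one, the rough estimate ends up far higher, which is why the formula and a simulation over all days can both be right.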
As for my simulation: In an OpenOffice spreadsheet, let columns D to KR represent the 301 people. In each cell in row 2 in those columns, enter the formula =randbetween(1;365), which will give you a random number between 1 and 365. This number represents the day of the person's birthday.
Column C will be the mode of the list of numbers (i.e. the day that is the most frequent birthday among those 301 people). C2 will have the formula =mode(d2:kr2).
Column B will be the frequency of the mode (i.e. how many people out of the 301 have a birthday on the day that has the most birthdays). Cell B2 has the formula =countif(D2:KR2; C2)
You can then paste these 303 columns down as many rows as you want / your computer will allow. Each row represents one random group of 301 people.
Then in cell A1 you can calculate how many of those random groups have a day in which at least 5 people have the same birthday, divided by how many rows you're using. The formula would be =countif(b2:b10001;">=5")/count(b2:b10001) [if you're going down 10000 rows, for instance].
You can then force the thing to manually recalculate if you want to update all the random numbers. When I do so, I get a percentage between 43.91% and 44.29%.
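If the spreadsheet is too heavy for your machine, the same experiment can be sketched in a few lines of Python. This is not the spreadsheet itself, just an equivalent under the same assumptions (365 equally likely days, 301 people per group, 10000 groups). The function names and the seed parameter are my own choices for illustration.

```python
import random
from collections import Counter

def max_shared_birthdays(rng, people=301, days=365):
    """One random group: size of the most crowded birthday (column B)."""
    birthdays = [rng.randint(1, days) for _ in range(people)]  # like =randbetween(1;365)
    return max(Counter(birthdays).values())                    # frequency of the mode

def estimate(trials=10_000, threshold=5, people=301, days=365, seed=None):
    """Share of random groups in which some day has at least `threshold` birthdays."""
    rng = random.Random(seed)
    hits = sum(
        max_shared_birthdays(rng, people, days) >= threshold
        for _ in range(trials)
    )
    return hits / trials  # like =countif(...;">=5")/count(...)

print(f"{estimate():.2%}")
```

Each call to estimate() plays the role of one forced recalculation, so the result will wobble a little from run to run unless you pass a fixed seed.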
Up to 10 people can download the spreadsheet in question with the 10000 rows at http://rapidshare.com/files/383649857/birthday.ods.html (it is something like 13 MB in size, and as I've said it's in OpenOffice [.ods] format). Note that when I tried it on a less robust system, it was too memory intensive and crashed the computer; and even on my good computer with 4 GB of RAM, it takes a couple of minutes to update.
I really need to look through this when I'm more awake tomorrow, but looking at what Dr. Rick said near the bottom, the odds of exactly 3 people having the same birthday out of 30 people are 1%. I don't think my 0.13% is far off with just about 10 times more people and 2 more people to match.
I'm not 100% sure my math is right. It's been at least 10 years since my last calculus class, but I think I'm close. I'm not saying you're wrong, because I could be way off, but 5 in 301 at 44% seems WAY high. But that's why it's called the Birthday Paradox, right?
No, really, in the logs we ARE over 2,000,000. For some reason, there is a discrepancy between the logs and the database. I'm not sure how the 14,000 page view difference between the database views and the log files happened.