Breaking cryptographic Algorithms with the Help of the Paradox of Birth
Table of Contents (Show / Hide)
We’ve got 23 babies. Which means there’s a 50% chance two of them have the same birthday? Huh? Let me explain. Each one of these babies has a random birthday, meaning that they each have the same chance of having been born on any 1 of the 365 days in a year. So, what are the odds that two of them share the same birthday? We’ll keep it simple: no twins, no leap years, no patterns that suggested parents made their babies at non-random, specific times. The odds aren’t 23 out of 365, which would be about 6%. The odds are actually 50%. There’s a 50% chance that out of 23 random people, 2 have the same exact birthday.
Which seems impossible given that we’ve got so few babies and so many possible birthdays. And if we have just 50 babies, the odds of a birthday match jump up to 97%. At 75, it’s 99.97%, making it a virtual certainty that when you get 75 babies together -- or 75 things with birthdays, they don’t have to be babies -- two of them will share a birthday. The Birthday Paradox is a veridical paradox. It’s surprising and absurd-sounding, but we have the math to prove it’s true. How is it possible that so few babies can have such a high chance of sharing a birthday? And how is it that despite 365 possibilities, we only need a sixth of that number to be pretty sure there’s a match?
The easiest way to do this isn’t to calculate the probability that two of any number of babies share the same birthday; it’s to calculate the probability that they don’t. To do that, we assume that Baby #1 has been born. So, the probability of this baby having a birthday is 1, or 365/365. Then we multiply that by Baby #2’s probability of not sharing that birthday: 364/365. We multiply that result by Baby #3’s probability of not sharing a birthday with either Baby #1 or Baby #2, which is 363/365… and we keep doing this for as many babies as we want to calculate the odds for. At Baby #23, we’re multiplying by 343/365. To simplify this calculation, we can write it as 364!
We’ll start with 364 because 365 over 365 is just 1, over 342! Because 365 minus our 23 babies gives us 342, times 365 days out of the year raised to the 22nd power which is our 23 babies minus baby one again. and the result of the whole equation gives us .492703, or about 49.3%. Again, that’s the probability we don’t have a birthday match with 23 babies, so we subtract this from 1 to find the chance that we do, which is 0.507297, which is about 50.7%. The trick here is that every baby is being evaluated against each other. it’s not whether Babies #2 through #23 share a birthday just with Baby #1 -- to have a 50% match with a single, pre-defined birthday, like a 50% chance of finding a match with Baby #5’s specific birthday, we’d need a pool of 253 babies which sounds about right and isn’t particularly surprising.
This is whether any two babies share any birthday with one another. The high probability, 50%, with a low baby count, 23, is surprising, but the math works. And if we keep extending the series and the results show that with 100 babies fewer than a third of the possible birthdays in a given year the chances of a match are 99.99997%, which means the chances of not having a birthday overlap would be just .00003%, or 3 in 10 million. If we replace our birthday babies with online passwords, we’ve got the basis for a type of hack known as the Birthday Attack. When you make a password for a website, that password is crunched into a fixed-length hash value that stores and identifies the combination of characters you input. So like, “12345” becomes, “827ccb0eea8a706c4c34a16891f84e7b.”
Simple. When the MD5 message-digest algorithm was a standard for encryption, its 128-bit, 32-hexadecimal hashes were vulnerable to a hack based on my babies. The goal was to find and force a collision -- when two hashes have the same exact value regardless of what that value is. like if Vsauce2.doc and Kevin.doc had the same hash value I could change information in one and affect the other or I could eventually use collisions like this to decode the encryption algorithm itself and learn how it works.
And the best way to do that wasn’t a trial-and-error, or “brute force” attempt to guess a specific match in the 3.4 × 10^38 possible outcomes in a 128-bit hash. It was putting the Birthday Paradox to use. Hackers developed an algorithm based on the math of the birthday paradox to more quickly cause hash collisions and ultimately crack one of the most widely-used cryptographic algorithms of its time. So, probability surrounding birthdays helped lead to the improvement of internet security.
URL :
News ID : 3460