The Birthday Paradox: an interesting probability problem involving “statistically independent” events

During this past week’s statistics tutorials, we discussed (among other things) the concept of statistical independence, and focused attention on some important implications of statistical independence for probability distributions such as the binomial and normal distributions.

Here, I’d like to call everyone’s attention to an interesting (non-finance) probability problem related to statistical independence. Specifically, consider the so-called “Birthday Paradox”. The Birthday Paradox pertains to the probability that in a set of randomly chosen people, some pair of them will have the same birthday. Counter-intuitively, in a group of 23 randomly chosen people, there is slightly more than a 50% probability that some pair of them will both have been born on the same day.

To compute the probability that two people in a group of n people have the same birthday, we disregard variations in the distribution, such as leap years, twins, seasonal or weekday variations, and assume that the 365 possible birthdays are equally likely.[1] We also assume that birth dates are statistically independent events. Consequently, the probability of two randomly chosen people not sharing the same birthday is 364/365. According to the combinatorial equation, the number of unique pairs in a group of n people is n!/[2!(n-2)!] = n(n-1)/2. Treating these pairs as if they were independent of one another (an approximation, though a close one for groups of this size), the probability that no pair in a group of n people shares the same birthday is approximately p(n) = (364/365)^[n(n-1)/2]. The event of at least two of the n persons having the same birthday is complementary to all n birthdays being different. Therefore, its probability is approximately p’(n) = 1 – (364/365)^[n(n-1)/2].
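As a rough numerical check, the pairwise formula above can be compared with the exact probability that all n birthdays differ, (365/365)(364/365)...((365-n+1)/365). The exact product is not part of the discussion above; it is included here only as a benchmark, and the function names are my own. A minimal Python sketch, assuming uniform and independent birthdays:

```python
from math import prod


def p_shared_approx(n: int, days: int = 365) -> float:
    """Pairwise formula from the text: 1 - (364/365)^(n(n-1)/2)."""
    pairs = n * (n - 1) // 2  # number of unique pairs among n people
    return 1.0 - ((days - 1) / days) ** pairs


def p_shared_exact(n: int, days: int = 365) -> float:
    """Benchmark (not in the text): 1 - P(all n birthdays are distinct)."""
    if n > days:
        return 1.0  # pigeonhole: more people than days forces a match
    p_all_distinct = prod((days - k) / days for k in range(n))
    return 1.0 - p_all_distinct


# Print both probabilities side by side for a few group sizes.
for n in (10, 23, 26, 50):
    print(n, round(p_shared_approx(n), 4), round(p_shared_exact(n), 4))
```

For n = 23 the pairwise formula gives roughly 0.500 and the exact product roughly 0.507, so both agree that 23 people is where the probability first exceeds 50%.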

Given these assumptions, suppose that we are interested in determining how many randomly chosen people are needed in order for there to be a 50% probability that at least two persons share the same birthday. In other words, we are interested in finding the value of n which causes p(n) to equal 0.50. Therefore, 0.50 = (364/365)^[n(n-1)/2]; taking natural logs of both sides and rearranging, we obtain (ln 0.50)/(ln 364/365) = n(n-1)/2. Solving for n, we obtain 505.304 = n(n -1); therefore, n is approximately equal to 23.[2]
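The algebra above can be reproduced in a few lines of Python. This is only a sketch of the same calculation: take logs, then solve the quadratic n(n-1) = c for its positive root. The function name and its default arguments are my own choices.

```python
from math import ceil, log, sqrt


def people_needed(target: float = 0.50, days: int = 365) -> float:
    """Solve 1 - ((days-1)/days)^(n(n-1)/2) = target for (real-valued) n."""
    # Number of pairs required: ln(1 - target) / ln(364/365)
    pairs_needed = log(1.0 - target) / log((days - 1) / days)
    c = 2.0 * pairs_needed  # n(n - 1) = c
    # Positive root of n^2 - n - c = 0
    return (1.0 + sqrt(1.0 + 4.0 * c)) / 2.0


n_star = people_needed()
print(n_star * (n_star - 1))  # should reproduce n(n-1) = 505.304
print(n_star, ceil(n_star))   # real-valued solution and the whole number of people
```

The real-valued solution falls just below 23, which is why 23 is the smallest whole number of people for which the probability reaches 50%.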

The following graph illustrates how the probability that a pair of people share the same birthday varies as the number of people in the sample increases:

[1] It is worthwhile noting that real-life birthday distributions are not uniform since not all dates are equally likely. For example, in the northern hemisphere, many children are born in the summer, especially during the months of August and September. In the United States, many children are conceived around the holidays of Christmas and New Year’s Day. Also, because hospitals rarely schedule C-sections and induced labor on the weekend, more Americans are born on Mondays and Tuesdays than on weekends; where many of the people share a birth year (e.g., a class in a school), this creates a tendency toward particular dates. These factors tend to increase the chance of identical birth dates since a denser subset has more possible pairs (in the extreme case when everyone was born on three days of the week, there would obviously be many identical birthdays!).

[2] Note that since 26 students are enrolled in Finance 4366 this semester, the probability that two Spring 2021 Finance 4366 students share the same birthday is p’(26) = 1 – (364/365)^[26(25)/2] = 59%; given footnote 1’s caveats, it is likely that the class contains one or more shared-birthday pairs.

Z Table Extra Credit Assignment (due 11 a.m. CT on Tuesday, February 2)

Here’s an extra credit opportunity for Finance 4366. Working on your own (i.e., this is not a group project; credit will only be given for spreadsheets that are uniquely your own), build your own “z” table in Excel, patterned after the table located at http://fin4366.garven.com/stdnormal.pdf. The top row should have values ranging from 0.00 to 0.09, and the first column should have z values ranging from -3.0 to +3.0, in increments of 0.1.

Quite conveniently, Excel has the standard normal distribution function built right in; e.g., if you type "=normsdist(z)", Excel returns the cumulative probability associated with whatever z value you provide. Not surprisingly, if you type "=normsdist(0)", .5 is returned since half of the area under the curve lies to the left of the expected value E(z) = 0. Similarly, if you type "=normsdist(1)", then .8413 is returned because 84.13% of the area under the curve lies to the left of z = 1. Perhaps you recall from your QBA course that about 68.26% of the area under the curve lies between z = -1 and z = +1; this "confidence interval" of +/- one standard deviation away from the mean (E(z) = 0) is calculated in Excel with the following code: "=normsdist(1)-normsdist(-1)", and so forth.
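If you would like an independent check on the values Excel returns, the standard normal cumulative distribution can be written in terms of the error function: Φ(z) = 0.5[1 + erf(z/√2)]. The Python sketch below is only a cross-check, not a substitute for the Excel assignment; the function name norm_s_dist is my own and is not an Excel or library name.

```python
from math import erf, sqrt


def norm_s_dist(z: float) -> float:
    """Standard normal CDF via the error function (hypothetical helper name)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


print(round(norm_s_dist(0.0), 4))                      # area to the left of z = 0
print(round(norm_s_dist(1.0), 4))                      # area to the left of z = 1
print(round(norm_s_dist(1.0) - norm_s_dist(-1.0), 4))  # area between z = -1 and z = +1
```

The last line prints 0.6827. The 68.26% figure comes from subtracting the four-decimal table entries (.8413 - .1587), so the two differ only by rounding.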

The grade you earn on this extra credit assignment will replace your lowest quiz grade, provided that your lowest quiz grade is lower than your extra credit grade. The deadline is 11 a.m. CT on Tuesday, February 2.

You can turn your spreadsheet for this extra credit assignment in at https://baylor.instructure.com/courses/132766/assignments/1020537.

2021/2022 Department of Finance, Insurance, and Real Estate Scholarship Application

The Department of Finance, Insurance, and Real Estate is accepting scholarship applications for the 2021/2022 academic year until 5:00pm March 1, 2021. Follow the link below to access the application for all available scholarships:

2021/2022 Finance, Insurance, and Real Estate Scholarship Application

Please read the scholarship information carefully and note that you must already be admitted to the business school; you must have a declared major in Finance, Risk Management and Insurance, or Real Estate, and you must have already completed 6 hours of upper-level course work.

Calculus, Probability and Statistics, and a preview of future topics in Finance 4366

Probability and statistics, along with the basic calculus principles covered last Thursday, are foundational for the theory of pricing and managing risk with financial derivatives, which is what this course is all about. During yesterday’s class meeting, we introduced discrete and continuous probability distributions, calculated parameters such as expected value, variance, standard deviation, covariance, and correlation, and applied these concepts to measure expected returns and risks for portfolios comprising risky assets. During tomorrow’s class meeting, we will take a deeper dive into discrete and continuous probability distributions, in which the binomial and normal distributions will be showcased.

On Tuesday, February 2, we will introduce and describe the nature of financial derivatives, and motivate their study with examples of forwards, futures, and options. Derivatives are so named because they derive their values from one or more underlying assets. Underlying assets typically involve traded financial assets such as stocks, bonds, currencies, or other derivatives, but derivatives can derive value from pretty much anything. For example, the Chicago Mercantile Exchange (CME) offers exchange-traded weather futures and options contracts (see “Market Futures: Introduction To Weather Derivatives”). There are also so-called “prediction” markets in which derivatives based upon the outcome of political events are actively traded (see “Prediction Market”).

Besides introducing financial derivatives and discussing various institutional aspects of markets in which they are traded, we’ll consider various properties of forward and option contracts, since virtually all financial derivatives feature payoffs that are isomorphic to either or both schemes. For example, a futures contract is simply an exchange-traded version of a forward contract. Similarly, since swaps involve exchanges between counterparties of payment streams over time, these instruments essentially represent a series of forward contracts. In the option space, besides traded stock options, many corporate securities feature “embedded” options; e.g., a convertible bond represents a combination of a non-convertible bond plus a call option on company stock. Similarly, when a company makes an investment, so-called “real” options to expand or abandon the investment at some future date are often present.

Perhaps the most important (pre-Midterm 1) idea that we’ll introduce is the concept of a so-called “arbitrage-free” price for a financial derivative. While details will follow, the basic idea is that one can replicate the payoffs on a forward or option by forming a portfolio comprising the underlying asset and a riskless bond. This portfolio is called the “replicating” portfolio, since, by design, it replicates the payoffs on the forward or option. Since the forward or option and its replicating portfolio produce the same payoffs, they must also have the same value. However, suppose the replicating portfolio (forward or option) is more expensive than the forward or option (replicating portfolio). If this occurs, then one can earn a riskless arbitrage profit by simply selling the replicating portfolio (forward or option) and buying the forward or option (replicating portfolio). However, competition will ensure that opportunities for riskless arbitrage profits vanish quickly. Thus the forward or option will be priced such that one cannot earn arbitrage profit from playing this game.
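To make the replication idea concrete, here is a minimal Python sketch for the simplest case, a forward contract. It rests on assumptions that are mine rather than part of the discussion above: the underlying asset pays no income, borrowing and lending occur at a single continuously compounded riskless rate r, and the numbers (spot price of 100, r = 5%, one year to delivery, a quoted forward price of 107) are purely hypothetical. Under those assumptions, buying the asset today with borrowed money costs nothing up front and pays S_T - S_0·e^(rT) at delivery, which matches a long forward with delivery price S_0·e^(rT).

```python
from math import exp


def forward_price(spot: float, r: float, t: float) -> float:
    """Arbitrage-free forward price, assuming the asset pays no income."""
    return spot * exp(r * t)


def long_forward_payoff(s_t: float, delivery_price: float) -> float:
    """Payoff at delivery from a long forward position."""
    return s_t - delivery_price


def replicating_payoff(s_t: float, spot: float, r: float, t: float) -> float:
    """Buy the asset at `spot` with borrowed money; at T, hold the asset and repay the loan."""
    return s_t - spot * exp(r * t)


def arbitrage_profit(quoted: float, spot: float, r: float, t: float) -> float:
    """Riskless profit at T from shorting a forward quoted at `quoted` and holding the
    replicating portfolio: (quoted - S_T) + (S_T - F*) = quoted - F*.
    A negative value means the opposite trade is the profitable one."""
    return quoted - forward_price(spot, r, t)


spot, r, t = 100.0, 0.05, 1.0  # hypothetical inputs
f_star = forward_price(spot, r, t)

# The forward and its replicating portfolio pay the same amount for any terminal price.
for s_t in (80.0, 100.0, 120.0):
    print(s_t, long_forward_payoff(s_t, f_star), replicating_payoff(s_t, spot, r, t))

print(round(f_star, 4))
print(round(arbitrage_profit(107.0, spot, r, t), 4))
```

Because the profit quoted - F* does not depend on the terminal price S_T, it is riskless; it disappears only when the quoted forward price equals F*.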

Also featured as one of “50 Things That Made the Modern Economy”: The Index Fund

Besides insurance, Tim Harford also features the index fund in his “Fifty Things That Made the Modern Economy” radio and podcast series. This 9-minute long podcast lays out the history of the development of the index fund in particular and the evolution of so-called passive portfolio strategies in general. Much of the content of this podcast is sourced from Vanguard founder Jack Bogle’s September 2011 WSJ article entitled “How the Index Fund Was Born” (available at https://www.wsj.com/articles/SB10001424053111904583204576544681577401622). Here’s the description of this podcast:

“Warren Buffett is the world’s most successful investor. In a letter he wrote to his wife, advising her how to invest after he dies, he offers some clear advice: put almost everything into “a very low-cost S&P 500 index fund”. Index funds passively track the market as a whole by buying a little of everything, rather than trying to beat the market with clever stock picks – the kind of clever stock picks that Warren Buffett himself has been making for more than half a century. Index funds now seem completely natural. But as recently as 1976 they didn’t exist. And, as Tim Harford explains, they have become very important indeed – and not only to Mrs. Buffett.”

Warren Buffett is one of the world’s great investors. His advice? Invest in an index fund

Insurance featured as one of “50 Things That Made the Modern Economy”

From November 2016 through October 2017, Financial Times writer Tim Harford presented an economic history documentary radio and podcast series called 50 Things That Made the Modern Economy. This same information is available in book form under the title “Fifty Inventions That Shaped the Modern Economy”. While I recommend listening to the entire series of podcasts (as well as reading the book), I would like to call your attention to Mr. Harford’s episode on the topic of insurance, which I link below. This 9-minute long podcast lays out the history of the development of the various institutions which exist today for the sharing and trading of risk, including markets for financial derivatives as well as for insurance.

“Legally and culturally, there’s a clear distinction between gambling and insurance. Economically, the difference is not so easy to see. Both the gambler and the insurer agree that money will change hands depending on what transpires in some unknowable future. Today the biggest insurance market of all – financial derivatives – blurs the line between insuring and gambling more than ever. Tim Harford tells the story of insurance; an idea as old as gambling but one which is fundamental to the way the modern economy works.”

Week 2 readings, quiz, and problem set

Here’s a friendly reminder that the following readings are due tomorrow:

1. The New Religion of Risk Management, by Peter Bernstein
2. Normal and standard normal distribution, by James R. Garven
3. Mean and Variance of a Two-Asset Portfolio, by James R. Garven

Keep in mind that Quiz 2, which is based on these readings, must be completed prior to the start of class tomorrow. I have also changed the due date for Problem Set 1 from tomorrow (Tuesday, January 26) to Thursday, January 28. You should consider all other due dates listed on Canvas and on the course website as pretty much set in stone for the remainder of the semester.

Going forward, I will typically not post reminders like this concerning Finance 4366 assignment deadlines; however, you’ll be “good to go” in Finance 4366 if you faithfully follow the guidelines listed in my “How to know whether you are on track with Finance 4366 assignments” posting.