Finding the Most Probable
Distribution Part 1
After tests, students ponder their class grades while teachers seem
to design tests that yield bell-shaped grade histograms. In this two-part series,
I delve into the math and Python solution for this intriguing problem of finding
the most probable distribution.
For this video (Part 1), we first derive the number of ways (permutations) N students
can be distributed into a grade distribution, noting that the grade distribution with
the largest number of permutations is the most probable one. Next, we realize that
when the number of students (N) is large, the factorials become too large to
represent as 64-bit floating-point numbers (171! already overflows). Thus, we turn to deriving
Stirling’s approximation (ln N! ≈ N ln N - N), which involves doing the opposite of
the traditional Riemann sum. That is, we approximate the sum of rectangular areas
by an integral. After integrating by parts, dropping small terms, and grouping,
we arrive at a tractable approximation for the natural log of the number of permutations
of a grade distribution: -N ∑ p ln p, where the sum runs over the possible scores and
p is the fraction of students with a given score.
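
As a rough illustration of the steps above, here is a minimal Python sketch. It is not necessarily the code used in the video: the function names and the sample grade counts are assumptions made for this example. It checks where the factorial stops fitting in a 64-bit float, computes the log of the number of permutations exactly (using the standard library's math.lgamma so that no factorial is ever formed), and compares that with the -N ∑ p ln p approximation.

```python
import math


def factorial_overflows_float64(n):
    """Return True if n! is too large to convert to a 64-bit float."""
    try:
        float(math.factorial(n))
    except OverflowError:
        return True
    return False


def log_permutations_exact(counts):
    """ln( N! / (n_1! * n_2! * ... ) ), where counts[i] students share score i.

    Uses lgamma(n + 1) = ln(n!) so large N does not overflow.
    """
    n_total = sum(counts)
    return math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in counts)


def log_permutations_stirling(counts):
    """Approximation -N * sum(p * ln p), with p = n_i / N.

    Empty bins are skipped, since p * ln p tends to 0 as p tends to 0.
    """
    n_total = sum(counts)
    fractions = [n / n_total for n in counts if n > 0]
    return -n_total * sum(p * math.log(p) for p in fractions)


if __name__ == "__main__":
    # Hypothetical grade histograms for N = 1000 students over five scores.
    peaked = [50, 200, 500, 200, 50]
    uniform = [200, 200, 200, 200, 200]

    print("170! overflows float64:", factorial_overflows_float64(170))
    print("171! overflows float64:", factorial_overflows_float64(171))

    for name, counts in (("peaked", peaked), ("uniform", uniform)):
        exact = log_permutations_exact(counts)
        approx = log_permutations_stirling(counts)
        print(f"{name}: exact ln W = {exact:.1f}, approximation = {approx:.1f}")
```

With no constraint other than the total number of students, the uniform histogram has more permutations than the peaked one. The approximation drops the smaller logarithmic terms, so it is close to the exact value for large N but not equal to it.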