Deviate with Style

Noah Pessin
May 5

Hey Everybody,

The third topic I’d like to discuss from my fall Data Science elective is standard deviation and z-score. You may have heard one or both of these terms, but what do they actually mean?

Standard deviation measures how spread out a dataset is around its mean. It is closely tied to the bell curve, the shape of a normal distribution, where each standard deviation away from the mean marks off a known share of the data. According to the bell, about 34% of the dataset falls within 1 standard deviation of the mean on each side (so 68% total). Then about 13.5% on each side falls between 1 and 2 standard deviations from the mean, about 2.35% on each side falls between 2 and 3, and the last 0.15% or so on each side lies more than 3 standard deviations out. In other words, the standard deviation tells you how far from the mean each section of the bell sits.
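If you want to check those percentages yourself, here is a small sketch using Python's built-in statistics module. The helper name band_share is my own, and keep in mind that the figures in the bell above are rounded from the 68-95-99.7 rule, so the exact values come out close to them but not identical.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)

def band_share(lo, hi):
    """Share of the data between lo and hi standard deviations above the mean."""
    return curve.cdf(hi) - curve.cdf(lo)

# One side of the bell, split into the same sections described above.
bands = {
    "0 to 1": band_share(0, 1),
    "1 to 2": band_share(1, 2),
    "2 to 3": band_share(2, 3),
    "beyond 3": 1 - curve.cdf(3),
}

for name, share in bands.items():
    print(f"{name} standard deviations: {share:.2%} on each side")
```

The four sections on one side add up to 50%, which is a nice sanity check that nothing got left out of the bell.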

So how do you calculate standard deviation?

This is the formula for standard deviation:

σ = sqrt( Σ(X-μ)^2 / n )

A lot’s going on here. The little zero with the tail on top (σ) that everything is equal to is standard deviation. X represents each data point and the u with a tail (μ) is the mean of the dataset. The E-looking symbol (Σ) means that you add up all the values of the expression next to it: (X-μ)^2. The n represents the number of data points your set has. Lastly, you take the square root of the whole thing.
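To see the formula in action, here is a minimal sketch that follows it step by step. The function name standard_deviation and the list of scores are made up for illustration. Because it divides by n, this is the population version of standard deviation, which is what Python's built-in statistics.pstdev computes too.

```python
from math import sqrt
from statistics import pstdev

def standard_deviation(data):
    """Population standard deviation: sqrt of the average squared distance from the mean."""
    n = len(data)                                   # number of data points
    mu = sum(data) / n                              # the mean
    squared_diffs = [(x - mu) ** 2 for x in data]   # (X - mu)^2 for each point
    return sqrt(sum(squared_diffs) / n)             # add them up, divide by n, square root

# Made-up test scores, just for illustration.
scores = [80, 85, 88, 91, 96]

print(standard_deviation(scores))
print(pstdev(scores))  # the built-in version should agree
```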

Now, what does z-score have to do with it?

A z-score tells you how many standard deviations a specific value sits above or below the mean. It’s used to calculate percentile and probability: by going to the website z-table.com, you can match any z-score with a percentage representing a percentile.

The equation is z = (x-μ)/σ. With x being a specific value in a dataset, μ being the mean, and σ being the standard deviation (hint hint), it outputs a z-score.

One example of this process: I got a score of 91 on my math test. The class average was an 88 according to the teacher. I asked everybody what their score was and, using the formula for standard deviation above, found that σ=5. Then, using the z-score equation, I calculated my z-score to be (91-88)/5 = 0.6. I went to the z-table and found the corresponding value to be 0.7257, or 72.57%. This meant that I scored higher than about 73% of my classmates, assuming the scores roughly follow a bell curve. If I were to find the percentiles for scores of 92 and 90 (78.81% and 65.54%) and subtract one from the other, I would get the percent chance that someone in the class scores between 90 and 92, which is a rough estimate of the chance of landing around a 91. That ends up being a 13.27% chance.
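If you’d rather skip the table lookup, the same math test example can be sketched in Python. The function names z_score and percentile are my own, and I’m letting statistics.NormalDist stand in for the z-table, which assumes the class scores follow a bell curve.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """How many standard deviations x sits above (or below) the mean."""
    return (x - mu) / sigma

def percentile(x, mu, sigma):
    """Share of a normal distribution that falls below x (the z-table lookup)."""
    return NormalDist().cdf(z_score(x, mu, sigma))

mu, sigma = 88, 5  # class average and standard deviation from the example

z_91 = z_score(91, mu, sigma)
p_91 = percentile(91, mu, sigma)

# Chance of a score between 90 and 92: subtract one percentile from the other.
between_90_and_92 = percentile(92, mu, sigma) - percentile(90, mu, sigma)

print(f"z-score for 91: {z_91:.1f}")
print(f"percentile for 91: {p_91:.2%}")
print(f"chance of a score between 90 and 92: {between_90_and_92:.2%}")
```

Swap in your own score, average, and standard deviation to see where you land.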

It may sound confusing, but try this process out yourself and you’ll be a pro in no time.

Z you later!

Noah
