X-Bar in Statistics: Theory & Formula
A cow visits an x-bar to give the bartender samples of milk. The bartender asks the cow, ‘What is the mean of this milk?’ The cow replies, ‘Trying to esti-mate mu!’ This lesson explains x-bars and their role in estimating parameters.
What Is the X-Bar?
A young boy was overheard asking his mother these questions:
How tall is a professional basketball player?
How many calories are in a scoop of chocolate chip ice cream?
How much money does a schoolteacher make?
All of these questions can be answered using statistics. Statistics is the science of collecting and analyzing numerical data gathered from a representative sample in order to infer the true mean or proportion of a population.
Obviously, professional basketball players aren’t all the same height, every scoop of chocolate chip ice cream contains slightly more or fewer calories than the next, and teachers in America do not all earn exactly the same income.
We can take samples of data from the populations they represent and calculate a single value called a statistic. We can then use that value to estimate a characteristic of the whole population, called a parameter. The x-bar is the symbol (or expression) used to represent the sample mean, a statistic, and that mean is used to estimate the true population mean, mu, a parameter.
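As a minimal sketch of the calculation, the sample mean is the sum of the observations divided by how many there are: x-bar = (x1 + x2 + ... + xn) / n. The heights below are made-up values for illustration only, and the function name x_bar is our own.

```python
# Hypothetical sample of player heights in inches (made-up values, not real data).
sample_heights = [78, 80, 75, 82, 79, 77]

def x_bar(sample):
    """Return the sample mean: the sum of the observations divided by their count."""
    return sum(sample) / len(sample)

sample_mean = x_bar(sample_heights)
print(sample_mean)  # 471 / 6 = 78.5
```

The printed value is a statistic: it describes only these six players, and we use it as an estimate of mu, the mean height of the whole population.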
To find the average height of professional basketball players (the population), we don’t need to measure every player, just some of them (the sample). How do we select which ones to measure? How many players make a large enough sample? How a researcher makes these decisions influences the inferences that can be made. After all, an anonymous but oft-cited saying about statistics is that any analysis is only as good as the data on which it is based!
Random Samples and Sample Size Matter
Accurate sample means come from samples that are randomized and include a sufficient number of observations. Statistical inference is only appropriate for randomly selected data. Randomizing ensures that the results of analyzing our data are subject to the laws of probability.
Simple random sampling (SRS) is one type of sampling method; that is to say, it is a procedure for selecting the sample that will represent the population. In an SRS, every member of the population has the same chance of being chosen. SRS is simple and reliable, and so it is often used when selecting a sample. There are various strategies used to obtain a random sample. A researcher could write the name of every professional basketball player on a slip of paper, place the slips in a hat and, without looking, draw out a random sample. Alternatively, the researcher could assign each player a number and use a random number table or random number generator to select the sample.
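The "names in a hat" idea can be sketched in a few lines of Python. This is only an illustration: the roster of 450 numbered players is hypothetical, and the helper name simple_random_sample and the seed parameter are our own choices (the seed just makes the draw repeatable).

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every member is equally likely to be picked."""
    rng = random.Random(seed)          # a seeded generator gives a repeatable draw
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical roster: the player names and the count of 450 are placeholders.
players = [f"Player {i}" for i in range(1, 451)]

chosen = simple_random_sample(players, 30, seed=1)
print(len(chosen))   # 30
print(chosen[:5])    # the first five names drawn
```

Because the draw is made without replacement, no player can appear in the sample twice, just as a slip of paper can only be pulled from the hat once.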
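In addition to how samples are selected, sample size is also important. The central limit theorem states that as the sample size increases, the sampling distribution of the sample mean approaches a normal distribution, whatever the shape of the population's own distribution (as long as its variance is finite). A common rule of thumb is that samples of at least 30 observations are large enough for this approximation to work well.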
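The effect of sample size on x-bar can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is a made-up, strongly skewed set of 10,000 values, and the helper name sample_means is our own. The sketch draws many samples of size 2 and of size 30 and compares how widely the resulting x-bars are spread.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples simple random samples and return the x-bar of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]

# Hypothetical, strongly skewed population (exponential values), not real data.
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]
mu = statistics.mean(population)  # the parameter we pretend not to know

means_small = sample_means(population, 2, 1000)    # x-bars from samples of 2
means_large = sample_means(population, 30, 1000)   # x-bars from samples of 30

spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)
print(mu, spread_small, spread_large)
```

With samples of 30, the x-bars cluster much more tightly around mu than with samples of 2, and a histogram of means_large looks far closer to a bell curve than the skewed population does.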
To estimate the calories in a scoop of chocolate chip ice cream, we need to ensure the sample is random and of sufficient size. We use a trucking service to randomly select 30 ice cream trucks in transport across America on some Thursday. We ask local colleges to send scientists to stop these trucks, and each scientist takes 1 scoop of chocolate chip ice cream from each of 20 different containers on a truck. The scientists measure each scoop, then place the scoop in a calorie counter thingy that measures the calories of each (now melted) scoop of ice cream. Each scientist then calls in a single statistic: the sample mean, x-bar, of the 20 scoops.
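The ice cream scenario can be mimicked in code to show that each truck reports its own x-bar. The calorie figures are invented for illustration: the assumed true mean of 140 calories and spread of 10 are not measurements, and the function name truck_x_bar is our own.

```python
import random

# Hypothetical values chosen only for this sketch, not real nutrition data.
TRUE_MEAN_CALORIES = 140
CALORIE_SPREAD = 10

def truck_x_bar(rng, scoops=20):
    """Simulate measuring 20 scoops on one truck and return their sample mean."""
    calories = [rng.gauss(TRUE_MEAN_CALORIES, CALORIE_SPREAD) for _ in range(scoops)]
    return sum(calories) / len(calories)

rng = random.Random(7)
x_bars = [truck_x_bar(rng) for _ in range(30)]   # one x-bar called in per truck

print(len(x_bars))               # 30
print(min(x_bars), max(x_bars))  # the x-bars differ from truck to truck
```

No two trucks report exactly the same x-bar, yet all of them land near the value being estimated, which is what makes the sample mean useful as an estimate of mu.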
[Figure: x-bar from population]