Pre-phase: What is statistics?

In my opinion, statistics is ALL about estimation: estimating the probability of some events out of the universe of all possible events.

Test of significance

When people first learn about statistics, they probably learn it in Stats 101, where the professor tells them how to do a t-test or a chi-square test to decide whether a certain result is significant or not. Well, this is an estimation too. These tests estimate how likely it is that you would see a result at least as extreme as yours if there were really no effect at all. For example, if you do a two-sample t-test and the p-value is 0.01, that is saying: if the two samples really came from populations with the same mean, a difference this large would show up only about 1% of the time. So when you say the two samples are significantly different, the chance of a false alarm of this kind is small. Strictly speaking, the p-value is not the probability that your conclusion is wrong, but a small p-value does mean that "no difference" explains the data poorly.
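To make the "a test is an estimate" idea concrete, here is a minimal sketch. Note that it swaps in a permutation test rather than the t-test discussed above, because a permutation test shows the estimation step directly: it shuffles the group labels many times and counts how often chance alone produces a difference in means at least as large as the observed one. The function name `permutation_p_value` and the two small samples are made up for illustration.

```python
import random


def permutation_p_value(a, b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b.

    Illustrative helper (not from any library): it approximates the
    permutation distribution by random shuffling.
    """
    rng = random.Random(seed)
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Under "no difference", group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        left, right = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical measurements, invented for this example.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.7, 5.5]
group_b = [4.2, 4.8, 4.5, 5.0, 4.4, 4.9, 4.1, 4.6]

p = permutation_p_value(group_a, group_b)
print(f"estimated p-value: {p:.4f}")
```

The returned number is itself an estimate: run it with a different seed or more permutations and it will move a little, which is the point of this section.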

Then why is there a whole area of statistics if the only goal of statistics is to estimate?

It's because there are so many models, and each has its own strengths when estimating a probability. There are parametric and non-parametric models for estimating a probability distribution over a continuous or discrete domain. There are also graphical models and multivariate models, for when your dataset has more than one variable and you want to estimate conditional probabilities.

When you are estimating something, there are also many ways to measure how good the estimator is. There are always trade-offs between the properties of an estimator: insisting that your estimator be unbiased often costs you variance, and accepting a little bias can reduce it.
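A small simulation can show this trade-off. The sketch below assumes normally distributed data and compares three estimators of the variance that differ only in what they divide the sum of squared deviations by: n-1 (the usual unbiased one), n, and n+1. The function names, sample size, and true variance are choices made for this illustration, not anything prescribed.

```python
import random


def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def compare_variance_estimators(n=5, true_var=4.0, trials=20_000, seed=0):
    """Return {divisor_name: (bias, variance, mse)} from a simulation."""
    rng = random.Random(seed)
    sigma = true_var ** 0.5
    divisors = {"n-1": n - 1, "n": n, "n+1": n + 1}
    estimates = {name: [] for name in divisors}
    for _ in range(trials):
        s = sum_sq_dev([rng.gauss(0.0, sigma) for _ in range(n)])
        for name, d in divisors.items():
            estimates[name].append(s / d)
    results = {}
    for name, vals in estimates.items():
        avg = sum(vals) / len(vals)
        bias = avg - true_var                                   # systematic error
        var = sum((v - avg) ** 2 for v in vals) / len(vals)     # spread of the estimator
        mse = sum((v - true_var) ** 2 for v in vals) / len(vals)
        results[name] = (bias, var, mse)
    return results


results = compare_variance_estimators()
for name, (bias, var, mse) in results.items():
    print(f"divide by {name}: bias={bias:+.3f} variance={var:.3f} mse={mse:.3f}")
```

Under these assumptions, dividing by n-1 should come out with bias near zero but the largest spread, while dividing by n+1 is biased downward yet has the smallest mean squared error.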

There are many questions to ask when you want to estimate something.

Would you like an estimator that is generally good but can occasionally make a very bad mistake, or would you like an estimator that is not as good on average but is guaranteed not to make a very bad mistake?

Would you like an estimator that is unbiased as the sample size goes to infinity but has high variance, or would you like an estimator that is a little biased but has very small variance?
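The first of these two questions can be illustrated with the sample mean and the sample median as estimators of the center of a distribution. This is a sketch under assumptions chosen for the example: clean data are standard normal, and a "contaminated" point is drawn from a normal with standard deviation 50. The function `simulate_center_estimators` and its parameters are hypothetical names.

```python
import random
from statistics import mean, median


def simulate_center_estimators(contamination, n=25, trials=5_000, seed=0):
    """Compare mean and median when a fraction of points are wild outliers.

    The true center is 0 in every case, so each estimate is its own error.
    """
    rng = random.Random(seed)
    mean_errs, median_errs = [], []
    for _ in range(trials):
        sample = []
        for _ in range(n):
            if rng.random() < contamination:
                sample.append(rng.gauss(0.0, 50.0))  # outlier
            else:
                sample.append(rng.gauss(0.0, 1.0))   # clean point
        mean_errs.append(mean(sample))
        median_errs.append(median(sample))

    def rmse(errs):
        return (sum(e * e for e in errs) / len(errs)) ** 0.5

    return {
        "mean_rmse": rmse(mean_errs),
        "median_rmse": rmse(median_errs),
        "mean_worst": max(abs(e) for e in mean_errs),
        "median_worst": max(abs(e) for e in median_errs),
    }


clean = simulate_center_estimators(contamination=0.0)
dirty = simulate_center_estimators(contamination=0.1)
print("clean data:       ", clean)
print("10% contaminated: ", dirty)
```

On clean data the mean should have the smaller typical error, which makes it the "generally good" choice. With outliers mixed in, the mean's worst-case error grows sharply while the median's stays modest, which is the kind of guarantee the question is asking about.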


So before you get into the field of statistics, these questions are definitely important to keep in mind, and when you use statistics to solve problems in research, you'll always have to state how and why you chose a particular estimator.