Monthly Archives: May 2012

Pre-phase: What is statistics?

In my opinion, statistics is ALL about estimation: estimating the probability of some events happening over the universe of all events.

Test of significance

When people first learn about statistics, they probably learn it in Stats 101, where the professor shows them how to do a t-test or a chi-square test to decide whether a certain result is significant or not. Well, this is an estimation too: these tests estimate the probability of seeing a result at least as extreme as yours if there were actually no effect. For example, if you run a two-sample t-test and the p-value is 0.01, it says that if the two samples really had the same mean, a difference this large would show up only about 1% of the time. Strictly speaking, that is not the probability that you are wrong when you call the difference significant, but it does mean the data would be quite surprising if there were no real difference.
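To make the p-value idea concrete, here is a minimal sketch in base Matlab. It uses a permutation test rather than the t-test itself (an assumption, made so that no toolbox is needed), and the two samples are made-up random data. It shuffles the group labels many times and counts how often the shuffled difference in means is at least as large as the observed one; that fraction is an estimate of the p-value.

```matlab
% Permutation-test sketch on hypothetical data (not the t-test itself).
x = randn(20, 1);             % sample 1: drawn with mean 0
y = randn(20, 1) + 1;         % sample 2: drawn with mean 1
observed = abs(mean(x) - mean(y));

pooled = [x; y];
nPerm  = 5000;                % number of label shuffles (arbitrary choice)
count  = 0;
for k = 1:nPerm
    idx = randperm(numel(pooled));          % shuffle the group labels
    px  = pooled(idx(1:numel(x)));          % pretend "sample 1"
    py  = pooled(idx(numel(x)+1:end));      % pretend "sample 2"
    if abs(mean(px) - mean(py)) >= observed
        count = count + 1;
    end
end

pValue = count / nPerm;       % fraction of shuffles at least as extreme
fprintf('estimated p-value: %.4f\n', pValue);
```

No random seed is set, so the number printed will change a little from run to run.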

Then why is there a whole area of statistics if the only goal of statistics is to estimate?

It's because there are so many models, each with its own strengths when estimating a probability. There are parametric and non-parametric methods for estimating a probability distribution over a continuous or discrete interval. There are also graphical models and multivariate models, for when your dataset has more than one variable and you want to estimate conditional probabilities.

When you are estimating something, there are also many ways to measure how good the estimator is. There is often a trade-off between the properties of an estimator: for example, if you insist that your estimator be unbiased, you may have to accept a higher variance than a slightly biased one would have.
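As a rough illustration of this trade-off, the sketch below simulates estimating the variance of normally distributed data with two divisors: n-1 (unbiased) and n+1 (biased, but with less spread). The sample size, number of trials, and variable names are all made up for the example.

```matlab
% Compare two variance estimators on simulated normal data (true variance 1).
n      = 10;                      % sample size (hypothetical)
trials = 20000;                   % number of simulated samples
X      = randn(n, trials);        % each column is one sample
ss     = var(X, 0, 1) * (n - 1);  % sum of squared deviations per sample

unbiased = ss / (n - 1);          % the usual unbiased estimator
shrunk   = ss / (n + 1);          % biased toward zero, but less spread out

biasUnbiased = mean(unbiased) - 1;
biasShrunk   = mean(shrunk)   - 1;
mseUnbiased  = mean((unbiased - 1).^2);   % mean squared error
mseShrunk    = mean((shrunk   - 1).^2);

fprintf('n-1: bias %+.3f, variance %.3f, MSE %.3f\n', ...
        biasUnbiased, var(unbiased), mseUnbiased);
fprintf('n+1: bias %+.3f, variance %.3f, MSE %.3f\n', ...
        biasShrunk, var(shrunk), mseShrunk);
```

For normal data, the n+1 version should show a clearly negative bias but a smaller variance and a smaller mean squared error, so which one is "better" depends on what you care about.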

There are many questions to ask when you want to estimate something.

Would you like an estimator that is generally good but can sometimes make a very bad mistake, or would you like an estimator that is not as good but is guaranteed not to make a very bad mistake?

Would you like an estimator that is unbiased only as the sample size goes to infinity and has high variance, or would you like an estimator that is a little biased but has very small variance?

etc.

So before you get into the field of statistics, these questions are definitely important to keep in mind, and when you use statistics to solve problems in research, you'll always have to state how and why you chose a particular estimator.

 

Matlab: Double precision problem

In Matlab, when you try to compare two numbers, the comparison sometimes doesn't give you the answer you expected. When you compare two integers,

a=1;
b=1;
a==b

will give you 1.

But when you compare doubles, sometimes it doesn't work. In simple cases, if you do

a=0.001;
b=0.001;
a==b

will still give you 1. But if you save a to a file and use textread(filename) to read the value back, the value may still look like 0.001, but if you do

a==0.001

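It might give you 0. This is most likely not a bug: 0.001 has no exact binary representation, so the double read back from the file can differ from the literal 0.001 in its last few bits even though both display as 0.001. Some people fix it by doing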

abs(a-0.001)<0.000001

This basically means that if a and 0.001 are very close, they are treated as equal.
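Here is a small sketch of the tolerance comparison. Since the contents of the file are not known, it uses 0.1 + 0.2 versus 0.3 as a stand-in for two doubles that display the same but differ in the last bit. The tolerance scaled with eps is an extra variant added for illustration, and its factor of 10 is an arbitrary choice.

```matlab
a = 0.1 + 0.2;     % stand-in for a value that picked up rounding error
b = 0.3;

exactEqual = (a == b);                 % exact comparison
absClose   = abs(a - b) < 0.000001;    % fixed absolute tolerance

% Tolerance scaled to the size of the numbers (hypothetical variant).
% eps(x) is the spacing between x and the next larger double.
tol      = 10 * eps(max(abs(a), abs(b)));
relClose = abs(a - b) <= tol;

fprintf('exact: %d, absolute: %d, scaled: %d\n', exactEqual, absClose, relClose);
```

The exact comparison comes out false while both tolerance checks come out true. A fixed tolerance such as 0.000001 is fine for numbers around 0.001, but it is too loose for very small numbers and too strict for very large ones, which is where a scaled tolerance may be the safer choice.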

I personally have a quicker fix.

a+1==0.001+1.

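The likely reason is that adding 1 rounds the result to the precision of numbers near 1 (about 2.2e-16), which throws away the tiny difference in the last bits of the two values. So this is really a tolerance check in disguise, with a tolerance you don't get to choose, and it will still say the values are different whenever they differ by noticeably more than that.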
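A quick check of the +1 comparison, again using 0.1 + 0.2 versus 0.3 as a stand-in for the file value (an assumption, since the original file is not available). It suggests that the trick behaves like a tolerance of roughly eps(1), about 2.2e-16: last-bit differences disappear, but slightly larger ones do not.

```matlab
a = 0.1 + 0.2;      % differs from 0.3 only in the last bit
b = 0.3;
plainEqual = (a == b);            % exact comparison
trickEqual = (a + 1 == b + 1);    % the +1 comparison

% A difference of 1e-12 is far below the 0.000001 tolerance,
% but well above eps(1), so the +1 comparison still sees it.
c = 0.001 + 1e-12;
trickOnC = (c + 1 == 0.001 + 1);
tolOnC   = abs(c - 0.001) < 0.000001;

fprintf('plain: %d, trick: %d\n', plainEqual, trickEqual);
fprintf('c with trick: %d, c with tolerance: %d\n', trickOnC, tolOnC);
```

So the +1 comparison is quick to type, but the explicit abs(...) < tolerance form lets you decide how close is close enough.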