Sunday, October 23, 2011

Error bar judgment

Error bars are useful because they help us judge how much of the data is signal and how much is noise. We want to pay attention to aspects of the data that are real (i.e. outside of the error) and discount aspects that are due to chance (i.e. within the error). Error bars that reflect +/- 1 standard error are surprisingly aggressive: with Gaussian noise, the true value falls outside them roughly a third of the time (see below).


% We are going to measure 40 different conditions.
% For the first twenty conditions, the true signal will be 0.
% For the second twenty conditions, the true signal will be 1.
numconditions = 40;

% We will make 30 different measurements for each condition.
n = 30;

% Let's perform a simulation, visualize the results,
% and then do it again (ad nauseam).
while 1

  % these are the true signal values
  signal = [zeros(1,numconditions/2) ones(1,numconditions/2)];
  % this is the noise (random Gaussian noise)
  noise = randn(n,numconditions);
  % these are the measurements
  measurement = bsxfun(@plus,signal,noise);
  % given the measurements, let's calculate the mean
  % and standard error for each condition.
  mn = mean(measurement,1);
  se = std(measurement,[],1)/sqrt(n);

  % now, let's visualize the results
  figure(999); clf; hold on;
  axis([0 numconditions+1 -1 2]);
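  title('Black is the mean; red is the standard error; blue is the true signal');

  % minimal sketch of the plotting step (an assumption, using built-in functions)
  plot(1:numconditions,signal,'b-');        % true signal
  errorbar(1:numconditions,mn,se,'r.');     % +/- 1 standard error
  plot(1:numconditions,mn,'k.');            % mean

  % wait for a key press, then run the next simulation
  pause;

end
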
If you use +/- 1 standard error to visualize results, it may subjectively look like there are differences between conditions, even though there aren't any (e.g. among the first twenty conditions, whose true signal is identical). For this reason, it may be useful to instead plot error bars that reflect +/- 2 standard errors.
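
One way to try this is to scale the standard error by a multiplier before plotting. The following is only a minimal sketch: it regenerates one data set in the same way as the simulation above, and the name k is introduced here purely for illustration.

```matlab
% Simulate one data set (20 conditions with true signal 0, 20 with true signal 1).
numconditions = 40;
n = 30;
signal = [zeros(1,numconditions/2) ones(1,numconditions/2)];
measurement = bsxfun(@plus,signal,randn(n,numconditions));
mn = mean(measurement,1);
se = std(measurement,[],1)/sqrt(n);

% k is the number of standard errors each error bar spans (hypothetical name).
k = 2;

figure; hold on;
plot(1:numconditions,signal,'b-');        % true signal in blue
errorbar(1:numconditions,mn,k*se,'r.');   % +/- k standard errors in red
plot(1:numconditions,mn,'k.');            % mean in black
axis([0 numconditions+1 -1 2]);
title(sprintf('Error bars show +/- %d standard errors',k));
```

Setting k = 1 reproduces the narrower error bars, which makes it easy to compare the two views on the same data.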

It is quite common to find measurements that are two or more error bars away from the true value. Of course, this is completely expected given the nature of standard errors (e.g. about 5% of the time, you will find a data point that is more than 2 standard errors away from the true value, so with 40 conditions we expect about 2 such points in each simulation).
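
The simulation above can be extended to tally this directly. The sketch below is not part of the original analysis: it repeats the simulation many times and counts how often the mean lands more than 1 or 2 standard errors from the true signal. The names numsims, count1, count2, and z are assumptions made for this example.

```matlab
% Count how often the sample mean lands more than 1 or 2 standard errors
% from the true signal, pooling over many simulated data sets.
numconditions = 40;
n = 30;
numsims = 1000;   % number of repeated simulations (hypothetical choice)
signal = [zeros(1,numconditions/2) ones(1,numconditions/2)];

count1 = 0;   % means more than 1 standard error from the truth
count2 = 0;   % means more than 2 standard errors from the truth
for p=1:numsims
  measurement = bsxfun(@plus,signal,randn(n,numconditions));
  mn = mean(measurement,1);
  se = std(measurement,[],1)/sqrt(n);
  z = abs(mn - signal) ./ se;   % distance from the truth in units of standard error
  count1 = count1 + sum(z > 1);
  count2 = count2 + sum(z > 2);
end

total = numsims*numconditions;
fprintf('more than 1 SE away: %.1f%%\n',100*count1/total);
fprintf('more than 2 SE away: %.1f%%\n',100*count2/total);
```

With Gaussian noise, the first percentage should come out at roughly a third and the second at roughly 5%, although the exact numbers vary from run to run.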
