**CODE**

% Let's distill the distinction between accuracy and reliability

% down to its core and look at a very simple example.

figure(999); clf; hold on;

h1 = scatter(1,7,'ro');

h2 = scatter(1,4,'bo');

h3 = errorbar2(1,4,1,'v','b-');

axis([0 2 0 12]);

legend([h1 h2 h3],{'True model' 'Estimated model' 'Error bars'});

ylabel('Value');

set(gca,'XTick',[]);

% In this example, we have a single number indicated by the red dot,

% and we are trying to match this number with a model. Through some

% means we have estimated a specific model, and the prediction of the

% model is indicated by the blue dot. Moreover, through some means we

% have estimated error bars on the model's prediction, and this is

% indicated by the blue line.

% Now let's consider the accuracy and reliability of the estimated

% model. Accuracy corresponds to how close the estimated model is

% to the true model. Reliability corresponds to how stable the

% estimated model is, i.e. how little it varies (small error bars).

h4 = drawarrow([1.3 4.5],[1.03 4.52],'k-',[],10);

h5 = text(1.33,4.5,'Reliability');

h6 = plot([.95 .9 .9 .95],[7 7 4 4],'k-');

h7 = text(.88,5.5,'Accuracy','HorizontalAlignment','Right');
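
% To put numbers on this picture, here is a minimal sketch that reuses the
% values plotted above (true value 7, estimate 4, and the 1 passed to
% errorbar2). The variable names are illustrative and not part of the
% original code.
toytrue = 7;                        % value of the true model (red dot)
toyest  = 4;                        % value of the estimated model (blue dot)
toyerr  = 1;                        % size of the error bar on the estimate
toymiss = abs(toyest - toytrue);    % distance from the truth (smaller = more accurate)
fprintf('distance from truth = %d, error bar = %d\n',toymiss,toyerr);
% The estimate misses the true value by several error bars, so in this toy
% picture the estimate is fairly stable yet still far from the truth.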

% Accuracy and reliability are not the same thing, although they do bear

% certain relationships to one another. For example, if reliability is

% low, then it is likely that accuracy is low. (Imagine that the error bar

% on a given model is very large. Then, we would expect that any given

% estimate of the model would not be well matched to the true model.)

% Conversely, if accuracy is high, then it is likely that reliability

% is also high. (If a model estimate predicts responses extremely

% well, then it is likely that the parameters of the model are well

% estimated.)

%

% However, it is important to keep in mind that a model can have

% high reliability but low accuracy. To see how this can occur,

% let's examine each possible configuration of accuracy and

% reliability.
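
% One way to make these relationships concrete is the standard split of mean
% squared error into squared bias plus variance. This framing is not in the
% original text and is offered only as a sketch; the true value, the offset,
% and all variable names below are hypothetical.
bvtrue  = 7;                                    % hypothetical true value
bvshift = -3;                                   % hypothetical systematic offset of the estimator
bvest   = bvtrue + bvshift + randn(1,1000);     % estimates from 1000 simulated datasets
bvmse   = mean((bvest - bvtrue).^2);            % overall inaccuracy
bvbias  = mean(bvest) - bvtrue;                 % systematic part of the error
bvvar   = var(bvest,1);                         % variability of the estimate (normalized by N)
fprintf('MSE = %.3f, bias^2 + variance = %.3f\n',bvmse,bvbias^2 + bvvar);
% Under this view, low reliability (large variance) makes the total error
% large, whereas high reliability (small variance) still leaves room for a
% large error if the bias is large.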

% CASE 1: MODEL IS RELIABLE AND ACCURATE.

% In this case, there are enough data to obtain good estimates of

% the parameters of the model, and the model is a good description

% of the data. Let's see an example (quadratic model fitted to

% quadratic data).

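% simulate 100 noisy samples from a quadratic (noise SD = 6)

x = rand(1,100)*14 - 8;

y = -x.^2 + 2*x + 4 + 6*randn(1,100);

% fit a quadratic model (regressors: x.^2, x, and a constant term)

rec = fitprfstatic([x.^2; x; ones(1,length(x))]',y',0,0,[],100,[],[],[],@calccod);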

figure(998); clf; hold on;

h1 = scatter(x,y,'k.');

ax = axis;

xx = linspace(ax(1),ax(2),100);

X = [xx.^2; xx; ones(1,length(xx))]';

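% evaluate each set of fitted parameters (one row of rec.params) on the grid

modelfits = zeros(size(rec.params,1),length(xx));

for p=1:size(rec.params,1)

  modelfits(p,:) = X*rec.params(p,:)';

end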

mn = median(modelfits,1);

se = stdquartile(modelfits,1,1);

h2 = errorbar3(xx,mn,se,'v',[.8 .8 1]);

h3 = plot(xx,mn,'b-');

h4 = plot(xx,-xx.^2 + 2*xx + 4,'r-');

uistack(h1,'top');

xlabel('x'); ylabel('y');

legend([h1 h4 h3 h2],{'Data' 'True model' 'Estimated model' 'Error bars'});

title('Model is reliable and accurate');
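
% The error bars above come from fitprfstatic and stdquartile. As a rough
% sketch of one common way such error bars can be obtained, the following
% bootstraps the fit using only built-in functions (polyfit, polyval, randi).
% Bootstrapping is an assumption here and not necessarily what fitprfstatic
% does internally. All variable names are illustrative.
bx = rand(1,100)*14 - 8;                          % same design as above
by = -bx.^2 + 2*bx + 4 + 6*randn(1,100);          % quadratic data, noise SD = 6
bgrid = linspace(min(bx),max(bx),50);             % grid on which to evaluate the fits
bfits = zeros(100,length(bgrid));
for b=1:100
  ix = randi(length(bx),1,length(bx));            % draw data points with replacement
  bfits(b,:) = polyval(polyfit(bx(ix),by(ix),2),bgrid);   % refit and evaluate
end
bmn = median(bfits,1);                            % estimated model
bse = std(bfits,0,1);                             % spread across bootstraps (plain SD, not stdquartile)
bmiss = abs(bmn - (-bgrid.^2 + 2*bgrid + 4));     % distance from the true model
fprintf('mean error bar = %.2f, mean distance from truth = %.2f\n',mean(bse),mean(bmiss));
% Both numbers should come out small relative to the noise SD of 6, which is
% what "reliable and accurate" looks like numerically.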

% CASE 2: MODEL IS RELIABLE BUT INACCURATE.

% In this case, there are enough data to obtain good estimates of

% the parameters of the model, but the model is a bad description

% of the data. Let's see an example (linear model fitted to

% quadratic data).

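% simulate 100 noisy samples from a quadratic (noise SD = 1)

x = rand(1,100)*10 - 5;

y = x.^2 - 3*x + 4 + 1*randn(1,100);

% fit a linear model (regressors: x and a constant term; no x.^2 term)

rec = fitprfstatic([x; ones(1,length(x))]',y',0,0,[],100,[],[],[],@calccod);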

figure(997); clf; hold on;

h1 = scatter(x,y,'k.');

ax = axis;

xx = linspace(ax(1),ax(2),100);

X = [xx; ones(1,length(xx))]';

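% evaluate each set of fitted parameters (one row of rec.params) on the grid

modelfits = zeros(size(rec.params,1),length(xx));

for p=1:size(rec.params,1)

  modelfits(p,:) = X*rec.params(p,:)';

end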

mn = median(modelfits,1);

se = stdquartile(modelfits,1,1);

h2 = errorbar3(xx,mn,se,'v',[.8 .8 1]);

h3 = plot(xx,mn,'b-');

h4 = plot(xx,xx.^2 - 3*xx + 4,'r-');

uistack(h1,'top');

xlabel('x'); ylabel('y');

legend([h1 h4 h3 h2],{'Data' 'True model' 'Estimated model' 'Error bars'});

title('Model is reliable but inaccurate');
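
% A quick numerical check of why this model is inaccurate despite its tight
% error bars (a sketch using built-in functions only; variable names are
% illustrative): the residuals of a straight-line fit to these data are not
% noise-like but follow the curvature that the model lacks.
rx = rand(1,100)*10 - 5;
ry = rx.^2 - 3*rx + 4 + 1*randn(1,100);           % same generating model as above
rres = ry - polyval(polyfit(rx,ry,1),rx);         % residuals of the linear fit
rrms = sqrt(mean(rres.^2));                       % typical size of the misfit
rcc = corrcoef(rres,rx.^2);                       % do the residuals track the missing x.^2 term?
fprintf('RMS residual = %.2f (noise SD is 1), corr with x.^2 = %.2f\n',rrms,rcc(1,2));
% An RMS residual far above the noise SD, together with a strong correlation
% with x.^2, points to systematic error that collecting more data would not
% remove.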

% CASE 3: MODEL IS UNRELIABLE BUT ACCURATE.

% This is not a likely situation. Suppose there are insufficient data to

% obtain good estimates of the parameters of a model. This implies that

% the parameters would fluctuate widely from dataset to dataset, which in

% turn implies that the predictions of the model would also fluctuate widely

% from dataset to dataset. Thus, for any given dataset, it would be unlikely

% that the predictions of the estimated model would be well matched to the data.
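
% The argument above can be checked by simulation. The following sketch
% (built-in functions only; the sample sizes and all variable names are
% illustrative choices) refits a quadratic to many independent datasets and
% compares very small datasets with larger ones. Note that it measures
% distance from the true model, which is known in a simulation.
c3true = @(x) -x.^2 + 2*x + 4;                    % same quadratic as in case 1
c3grid = linspace(-8,6,50);                       % grid on which to compare predictions
c3n = [5 100];                                    % few vs. many data points
c3nsim = 200;                                     % number of simulated datasets
c3spread = zeros(1,2);
c3error = zeros(1,2);
for k=1:2
  c3fits = zeros(c3nsim,length(c3grid));
  for s=1:c3nsim
    xs = rand(1,c3n(k))*14 - 8;
    ys = c3true(xs) + 6*randn(1,c3n(k));
    c3fits(s,:) = polyval(polyfit(xs,ys,2),c3grid);
  end
  c3spread(k) = mean(std(c3fits,0,1));            % dataset-to-dataset fluctuation of predictions
  c3dev = c3fits - repmat(c3true(c3grid),c3nsim,1);
  c3error(k) = sqrt(mean(c3dev(:).^2));           % RMS distance from the true model
end
fprintf('n = %d: spread = %.2f, error = %.2f\n',[c3n; c3spread; c3error]);
% With few data points, both the fluctuation and the error should be much
% larger, which is why "unreliable but accurate" is an unlikely combination.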

% CASE 4: MODEL IS UNRELIABLE AND INACCURATE.

% In this case, there are insufficient data to obtain good estimates of

% the parameters of the model, and this supplies a plausible explanation

% for why the model does not describe the data well. (Of course, it could

% be the case that even with sufficient data, the estimated model would

% still be a poor description of the data; see case 2 above.) Let's see

% an example of an unreliable and inaccurate model (Gaussian model

% fitted to Gaussian data, but only a few noisy data points are available).

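% simulate only 20 samples from a Gaussian, with noise SD = 10

x = linspace(1,100,20);

y = evalgaussian1d([40 10 10 2],x) + 10*randn(1,20);

% model specification, apparently {initial seed, [lower; upper] bounds, model function}

% (this reading of the cell array is inferred from its contents)

model = {[30 20 5 0] [-Inf 0 -Inf -Inf; Inf Inf Inf Inf] @(pp,xx) evalgaussian1d(pp,xx)};

rec = fitprfstatic(x',y',model,[],[],100,[],[],[],@calccod);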

figure(996); clf; hold on;

h1 = scatter(x,y,'k.');

ax = axis;

xx = linspace(ax(1),ax(2),100);

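% evaluate each set of fitted parameters (one row of rec.params) on the grid

modelfits = zeros(size(rec.params,1),length(xx));

for p=1:size(rec.params,1)

  modelfits(p,:) = evalgaussian1d(rec.params(p,:),xx);

end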

mn = median(modelfits,1);

se = stdquartile(modelfits,1,1);

h2 = errorbar3(xx,mn,se,'v',[.8 .8 1]);

h3 = plot(xx,mn,'b-');

h4 = plot(xx,evalgaussian1d([40 10 10 2],xx),'r-');

uistack(h1,'top');

xlabel('x'); ylabel('y');

legend([h1 h4 h3 h2],{'Data' 'True model' 'Estimated model' 'Error bars'});

title('Model is unreliable and inaccurate');
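
% To see what "sufficient data" would change, here is a sketch that fits the
% same kind of Gaussian to 20 versus 2000 data points and looks at how much
% the fitted peak position varies across simulated datasets. It uses the
% built-in fminsearch without bounds rather than fitprfstatic, and the
% function g4 is a stand-in that assumes evalgaussian1d takes parameters in
% the order [mean sd gain offset]. All names and sizes are illustrative.
g4 = @(p,x) p(3)*exp(-(x-p(1)).^2/(2*p(2)^2)) + p(4);
g4n = [20 2000];                                  % few vs. many data points
g4opt = optimset('Display','off');
g4peak = zeros(30,2);
for k=1:2
  gx = linspace(1,100,g4n(k));
  for s=1:30
    gy = g4([40 10 10 2],gx) + 10*randn(1,g4n(k));     % same noise SD as above
    g4cost = @(p) sum((g4(p,gx) - gy).^2);             % squared error of a candidate fit
    pfit = fminsearch(g4cost,[30 20 5 0],g4opt);       % same initial seed as above
    g4peak(s,k) = pfit(1);                             % fitted peak position
  end
end
g4mad = median(abs(g4peak - repmat(median(g4peak,1),30,1)),1);   % robust spread per condition
fprintf('spread of fitted peak position: n=20: %.2f, n=2000: %.2f\n',g4mad(1),g4mad(2));
% With 2000 points the fitted peak position should vary far less, which
% suggests that the poor fit above stems from the amount of data and not
% from the form of the model.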
