Understanding and improving error-correcting output coding
Error-correcting output coding (ECOC) is a method for converting a k-class supervised learning problem into a large number L of two-class supervised learning problems and then combining the results of these L evaluations. Previous research has shown that ECOC can dramatically improve the classification accuracy of supervised learning algorithms that learn to classify data points into one of k > 2 classes.
An investigation of why the ECOC technique works, particularly when employed
with decision tree learning algorithms, is presented.
It is shown that the ECOC method is a compact form of "voting" among
multiple hypotheses. The success of the voting depends on the errors committed
by each of the L learned binary functions being substantially uncorrelated.
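The voting view above can be made concrete with a small sketch. The code matrix and its dimensions below are hypothetical illustrations, not the codes used in the thesis: each class is assigned an L-bit codeword, the L binary learners predict one bit each, and the final class is the one whose codeword is nearest in Hamming distance, so a few bit errors are "outvoted."

```python
import numpy as np

# Hypothetical 4-class ECOC code matrix with L = 7 binary problems
# (rows = class codewords, columns = binary learning problems).
CODE = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 0, 1, 1, 0],
])

def ecoc_decode(bits):
    """Return the class whose codeword is closest in Hamming distance
    to the L bits predicted by the binary learners."""
    dists = (CODE != np.asarray(bits)).sum(axis=1)
    return int(dists.argmin())

print(ecoc_decode([1, 0, 1, 1, 0, 1, 1]))  # exact codeword of class 2 -> 2
print(ecoc_decode([1, 1, 1, 1, 0, 1, 0]))  # two bits flipped, still -> 2
```

Because the minimum Hamming distance between any two rows of this matrix is 4, decoding tolerates some incorrect binary predictions, which is exactly why uncorrelated errors among the L functions matter.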
By employing the statistical notions of bias and variance, the generalization
errors of ECOC are decomposed into bias and variance errors. Like any voting
method, ECOC reduces variance errors. However, unlike homogeneous voting, which
simply combines multiple runs of the same learning algorithm, ECOC can also
reduce bias errors. It is shown that the bias errors in the individual binary
functions are uncorrelated, and that this uncorrelatedness results from the
non-local behavior of the learning algorithm in splitting the feature space.
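Why voting over uncorrelated errors reduces error can be seen with a short calculation (an illustrative sketch, not from the thesis): if L voters each err independently with probability p < 0.5, the probability that a majority errs simultaneously shrinks as L grows.

```python
from math import comb

def majority_vote_error(p, L):
    """Probability that a strict majority of L independent voters,
    each erring with probability p, err at the same time."""
    m = L // 2 + 1  # smallest number of wrong votes that flips the outcome
    return sum(comb(L, k) * p**k * (1 - p)**(L - k) for k in range(m, L + 1))

# Illustrative per-voter error rate of 0.3; error drops steadily with L.
for L in (1, 5, 11, 21):
    print(L, majority_vote_error(0.3, L))
```

This is the variance-reduction effect shared by all voting methods; ECOC's additional bias reduction depends on the individual binary functions having uncorrelated bias errors, as stated above.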
ECOC is also extended to provide class probability information. The problem
of computing these class probabilities can be formulated as an over-constrained system
of linear equations. Least squares methods are applied to solve these equations.
The accuracy of the resulting posterior probabilities is demonstrated on data with
overlapping classes and on a simple reject-option task.
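The over-constrained linear system mentioned above can be sketched as follows. The code matrix and probabilities here are hypothetical: if each binary learner outputs an estimate of the probability that its bit is 1, then those L outputs should equal CODE.T @ p for the unknown class-probability vector p, giving L equations (plus a sum-to-one constraint) in k unknowns, solvable by least squares.

```python
import numpy as np

# Hypothetical 3-class code matrix with L = 5 binary problems (rows = classes).
CODE = np.array([
    [0, 0, 1, 1, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
], dtype=float)

def class_probabilities(binary_outputs):
    """Recover class probabilities p from the L binary-learner outputs b
    by least squares on the over-constrained system CODE.T @ p = b,
    with one extra equation enforcing sum(p) = 1."""
    A = np.vstack([CODE.T, np.ones(CODE.shape[0])])
    b = np.append(binary_outputs, 1.0)
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.clip(p, 0.0, 1.0)  # clamp small numerical over/undershoots

# If the binary outputs are exactly consistent with p = (0.7, 0.2, 0.1),
# least squares recovers that vector.
b = CODE.T @ np.array([0.7, 0.2, 0.1])
print(class_probabilities(b))
```

With noisy binary outputs the system has no exact solution, which is where the least-squares formulation earns its keep: it returns the probability vector that best explains all L outputs at once.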
Advisor: Dietterich, Thomas G.
School: Oregon State University
School Location: USA - Oregon
Source Type: Master's Thesis
Keywords: error-correcting codes; information theory
Date of Publication: 02/20/1995