Understanding and improving error-correcting output coding

by Kong, Eun Bae

Abstract (Summary)
Error-correcting output coding (ECOC) is a method for converting a k-class supervised learning problem into a large number L of two-class supervised learning problems and then combining the results of these L evaluations. Previous research has shown that ECOC can dramatically improve the classification accuracy of supervised learning algorithms that learn to classify data points into one of k ≥ 2 classes. An investigation of why the ECOC technique works, particularly when employed with decision tree learning algorithms, is presented.

It is shown that the ECOC method is a compact form of "voting" among multiple hypotheses. The success of the voting depends on the errors committed by each of the L learned binary functions being substantially uncorrelated.
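The voting view can be made concrete with a small decoding sketch. This is a hypothetical illustration (the codewords below are invented, not the thesis's code matrix): each class is assigned an L-bit codeword, the L binary classifiers produce a bit vector, and the predicted class is the one whose codeword is nearest in Hamming distance, so a few uncorrelated bit errors are voted away.

```python
# Hypothetical 7-bit codewords for k = 4 classes; every pair of codewords
# differs in 4 bits, so any single-bit error is corrected.
CODE = {
    "A": (0, 0, 0, 0, 0, 0, 0),
    "B": (0, 1, 1, 1, 1, 0, 0),
    "C": (1, 0, 1, 1, 0, 1, 0),
    "D": (1, 1, 0, 1, 0, 0, 1),
}

def hamming(u, v):
    # Number of bit positions where the two codewords disagree.
    return sum(a != b for a, b in zip(u, v))

def decode(bits):
    # Predict the class whose codeword is closest to the L observed bits.
    return min(CODE, key=lambda c: hamming(CODE[c], bits))
```

For example, `decode((0, 1, 1, 1, 1, 0, 1))` returns `"B"` even though one of the seven binary classifiers got its bit wrong.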

By employing the statistical notions of bias and variance, the generalization errors of ECOC are decomposed into bias and variance errors. Like any voting method, ECOC reduces variance errors. However, unlike homogeneous voting, which simply combines multiple runs of the same learning algorithm, ECOC can also reduce bias errors. It is shown that the bias errors in the individual functions are uncorrelated and that this results from the non-local behavior of the learning algorithm in splitting the feature space.

ECOC is also extended to provide class probability information. The problem of computing these class probabilities can be formulated as an over-constrained system of linear equations, and least squares methods are applied to solve these equations. The accuracy of the posterior probabilities is demonstrated with overlapping classes and a simple reject option task.
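The least-squares formulation can be sketched as follows. In this hedged illustration (the code matrix and classifier outputs are invented, not taken from the thesis), each of the L binary learners estimates the total probability of the classes whose codeword bit is 1, giving an over-constrained linear system M p = f for the k class probabilities p, solved here with ordinary least squares.

```python
import numpy as np

# Hypothetical L x k code matrix (L = 5 binary problems, k = 3 classes):
# column j is class j's codeword; row l says which classes the l-th
# binary classifier groups together as its "positive" superclass.
M = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
], dtype=float)

# Hypothetical (noisy) outputs of the five binary classifiers.
f = np.array([0.62, 0.25, 0.15, 0.85, 0.78])

# Least-squares solution of the over-constrained system M p = f.
p, *_ = np.linalg.lstsq(M, f, rcond=None)
p = np.clip(p, 0.0, None)
p /= p.sum()  # renormalize so the estimates form a probability vector
```

Because the system has more equations than unknowns, the noisy classifier outputs are reconciled rather than trusted individually; here the recovered posterior assigns the largest probability to the first class.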

Bibliographical Information:

Advisor: Dietterich, Thomas G.
School: Oregon State University
School Location: USA - Oregon
Source Type: Master's Thesis
Keywords: error correcting codes, information theory
Date of Publication: 02/20/1995
