Centre for Discrete and Applicable Mathematics
CDAM Research Report, LSE-CDAM-2005-22, December 2005
Using a similarity measure for credible classification
M. Anthony, P. L. Hammer, E. Subasi, M. Subasi
This paper concerns classification by Boolean functions. We investigate the classification accuracy obtained by standard classification techniques on unseen points (elements of the domain {0, 1}^n, for some n) that are similar, in particular senses, to the points that have been observed as training observations. Explicitly, we use a new measure of how similar a point x ∈ {0, 1}^n is to a set of such points to restrict the domain of points on which we offer a classification. For points that are sufficiently dissimilar, no classification is given. We report on experimental results which indicate that the classification accuracies obtained on the resulting restricted domains are better than those obtained without restriction. These experiments involve a number of standard data-sets and classification techniques. We also compare the classification accuracies with those obtained when the domain on which classification is given is restricted by using the Hamming distance instead.
The full contents of this report are available as a PDF file (455 kB).
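The idea of withholding a classification on dissimilar points can be illustrated with the Hamming-distance baseline mentioned in the abstract. The following is only a minimal sketch, not the method of the report: the similarity measure introduced there is not defined on this page, so the sketch uses the Hamming distance to the nearest training point, a nearest-neighbour rule as a stand-in for the "standard classification techniques", and a hypothetical threshold parameter `max_distance`. All function names are illustrative assumptions.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, ...]  # an element of {0, 1}^n


def hamming(x: Point, y: Point) -> int:
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def classify_with_abstention(
    x: Point,
    training: List[Tuple[Point, int]],
    max_distance: int,  # hypothetical threshold, not taken from the report
) -> Optional[int]:
    """Return the label of the nearest training point (stand-in classifier),
    or None when x is further than max_distance from every training point."""
    nearest_point, nearest_label = min(
        training, key=lambda pair: hamming(x, pair[0])
    )
    if hamming(x, nearest_point) > max_distance:
        return None  # x is too dissimilar: no classification is offered
    return nearest_label


# Toy training set in {0, 1}^4 (made up for illustration only)
training = [((0, 0, 0, 0), 0), ((1, 1, 1, 1), 1)]
close_result = classify_with_abstention((0, 0, 0, 1), training, max_distance=1)
far_result = classify_with_abstention((0, 0, 1, 1), training, max_distance=1)
```

In this toy setting the point (0, 0, 0, 1) lies within distance 1 of a training point and receives a label, while (0, 0, 1, 1) is at distance 2 from both training points and is left unclassified. Accuracy is then measured only on the points that do receive a label.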
Alternatively, if you would like to get a free hard copy of this report, please send the number of this report, LSE-CDAM-2005-22, together with your name and postal address to:
CDAM Research Reports Series
Centre for Discrete and Applicable Mathematics
London School of Economics
Houghton Street
London WC2A 2AE, U.K.
Phone: +44(0)-20-7955 7494. Fax: +44(0)-20-7955 6877. Email: info@maths.lse.ac.uk