On Decomposition for Incomplete Data

Rafal Latkowski

rlatkows@mimuw.edu.pl
Institute of Computer Science, Warsaw University
ul. Banacha 2, 02-097 Warsaw, Poland

Abstract

In this paper we present a data decomposition method that avoids
the need to reason over data with missing attribute values.
The method can be combined with any classifier induction
algorithm. The original incomplete data is decomposed into data
subsets without missing values. Next, classifier induction
methods are applied to these subsets. Finally, a conflict-resolving
method is used to obtain the final classification from the partial
classifiers. We provide an empirical evaluation of the accuracy
and model size of the decomposition method under various
decomposition criteria on data with naturally occurring missing
values. We also present experiments on data with synthetic missing
values to examine the properties of the proposed method under a
varying ratio of incompleteness.
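
The three steps named in the abstract (decomposition, induction on
complete subsets, conflict resolution) can be illustrated with a
minimal Python sketch. It is not the implementation evaluated in the
paper: the decomposition criterion (one subset per observed pattern of
present attributes), the stand-in inducer (a lookup table), the
conflict resolution (majority vote), the toy data, and all names below
are assumptions made only for illustration.

```python
from collections import Counter, defaultdict

MISSING = None  # assumed marker for a missing attribute value


def present_attrs(values):
    """Indices of the attributes that have a value in this object."""
    return tuple(i for i, v in enumerate(values) if v is not MISSING)


def decompose(rows):
    """Assumed criterion: one subset per observed pattern of present attributes.

    Each subset holds every object that has values for all attributes in
    the pattern, projected onto those attributes, so it has no missing values.
    """
    patterns = {present_attrs(values) for values, _ in rows}
    return {
        pattern: [
            (tuple(values[i] for i in pattern), label)
            for values, label in rows
            if all(values[i] is not MISSING for i in pattern)
        ]
        for pattern in patterns
    }


def induce(subset):
    """Stand-in inducer; any classifier induction algorithm could be used here.

    Predicts the majority label seen for an exact value tuple and falls
    back to the majority label of the whole subset.
    """
    by_values = defaultdict(Counter)
    overall = Counter()
    for values, label in subset:
        by_values[values][label] += 1
        overall[label] += 1
    default = overall.most_common(1)[0][0]
    table = {v: c.most_common(1)[0][0] for v, c in by_values.items()}
    return lambda values: table.get(values, default)


def train(rows):
    """Induce one partial classifier per complete subset."""
    return {pattern: induce(subset) for pattern, subset in decompose(rows).items()}


def classify(classifiers, values):
    """Assumed conflict resolution: majority vote of applicable partial classifiers."""
    present = set(present_attrs(values))
    votes = Counter()
    for pattern, clf in classifiers.items():
        if set(pattern) <= present:  # classifier needs only attributes we have
            votes[clf(tuple(values[i] for i in pattern))] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy data: (attribute values, decision); None marks a missing value.
rows = [
    (("sunny", "hot"), "no"),
    (("rainy", None), "yes"),
    (("sunny", None), "no"),
    ((None, "cool"), "yes"),
    (("rainy", "cool"), "yes"),
]
classifiers = train(rows)
print(classify(classifiers, ("rainy", None)))
print(classify(classifiers, (None, "hot")))
```

In this sketch no partial classifier ever sees a missing value: an
object with missing attributes is simply routed to the classifiers
whose attribute patterns it satisfies.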