Data Decomposition and Decision Rule Joining for Classification of Data with Missing Values

Rafal Latkowski(1) and Michal Mikolajczyk(2)

(1) Warsaw University, Institute of Computer Science,
    ul. Banacha 2, 02-097 Warszawa, Poland,
    R.Latkowski@mimuw.edu.pl
(2) Warsaw University, Institute of Mathematics,
    ul. Banacha 2, 02-097 Warszawa, Poland,
    M.Mikolajczyk@mimuw.edu.pl

Abstract

In this paper we present a new approach to handling incomplete
information and reducing classifier complexity. We describe a
method, called D3RJ, that performs data decomposition and
decision rule joining to avoid the necessity of reasoning with
missing attribute values. As a consequence, a more complex
reasoning process is needed than in the case of known algorithms
for induction of decision rules. The original incomplete data
table is decomposed into sub-tables without missing values. Next,
methods for induction of decision rules are applied to these sub-tables.
Finally, an algorithm for decision rule joining is used to obtain
the final rule set from partial rule sets. Using the D3RJ method,
it is possible to obtain a smaller set of rules and better
classification accuracy than with standard decision rule
induction methods. We provide an empirical evaluation of the
accuracy and model size of the D3RJ method on data with naturally
occurring missing values.
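
The decomposition step can be illustrated with a minimal sketch.
It assumes one simple strategy that the abstract does not itself
specify: rows are grouped by the set of attributes for which they
have known values, and each group becomes a sub-table restricted
to those attributes. The names decompose, MISSING and the
"decision" column are hypothetical and chosen only for this
illustration; the decomposition actually used by D3RJ may differ.

```python
from collections import defaultdict

MISSING = None  # assumed marker for a missing attribute value


def decompose(rows, attributes, decision="decision"):
    """Split an incomplete table into sub-tables without missing values.

    Rows are grouped by the tuple of attributes that have known values;
    each sub-table keeps only those attributes plus the decision column.
    This grouping is an illustrative assumption, not the paper's method.
    """
    subtables = defaultdict(list)
    for row in rows:
        present = tuple(a for a in attributes if row.get(a) is not MISSING)
        complete_row = {a: row[a] for a in present}
        complete_row[decision] = row[decision]
        subtables[present].append(complete_row)
    return dict(subtables)


# A toy incomplete decision table (invented data, for illustration only).
rows = [
    {"a": 1, "b": 0, "decision": "yes"},
    {"a": 1, "b": None, "decision": "no"},
    {"a": None, "b": 1, "decision": "yes"},
    {"a": 0, "b": 1, "decision": "no"},
]

tables = decompose(rows, ["a", "b"])
for attrs, table in tables.items():
    print(attrs, table)
```

Each resulting sub-table is complete, so any standard rule
induction method could be run on it separately; the partial rule
sets would then still have to be joined into a final rule set.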