Wróblewski J., 1998. Covering with reducts - a fast algorithm for rule generation. Proc. of RSCTC'98, Warsaw, Poland. Springer-Verlag, Berlin Heidelberg 1998, pp. 402 - 407.
ABSTRACT
We discuss a searching method for synthesis of approximate description of decision classes in large data tables (decision tables). In a rough set approach to knowledge discovery problems, a set of rules is generated basing on training data using a notion of reduct. Because a problem of finding short reducts is NP-hard, we have to use several approximation techniques.
A covering approach to the problem of generating rules based on information system is presented in this article. A new, efficient algorithm for finding local reducts for each object in data table is described, as well as its parallelization and some optimization notes. A problem of working with tolerances in our algorithm is discussed. Some experimental results generated on large data tables (concerned with real applications) are presented.
A local reduct for given object x is a minimal subset of attributes which is sufficient to discern between the object x and any other object with another decision value. We may use the notion of local reduct to produce rules: an object x and a local reduct determines one rule.
Our algorithm realizes the following objective: assuming the information system is consistent, find a family of subsets R1, R2, ... Rk of attributes such that for any object from the information system at least one Rj is a local reduct. We look for possibly small family R1,... Rk, i.e. we prefer these subsets which cover possibly many objects. We assume, that these subsets reflect regularities in data and generate more general rules - it means better classification of new samples and less memory required to store rules.
Results show, that our new method is relatively fast, even for large data tables (e.g. "Satellite image" data: 4492 obj., 36 attr. - comp. time 13 sec.), especially when we are interested in just a covering of objects. Actually, when we cover objects by at least one reduct, an average number of reducts covering an object is usually equal to about 3.5.