The dominance-based rough set approach (DRSA) is an extension of rough set theory for multi-criteria decision analysis (MCDA), introduced by Greco, Matarazzo and Słowiński.[1] [2] [3] The main change compared to the classical rough sets is the substitution for the indiscernibility relation by a dominance relation, which permits one to deal with inconsistencies typical to consideration of criteria and preference-ordered decision classes.
Multicriteria classification (sorting) is one of the problems considered within MCDA and can be stated as follows: given a set of objects evaluated by a set of criteria (attributes with preference-order domains), assign these objects to some pre-defined and preference-ordered decision classes, such that each object is assigned to exactly one class. Due to the preference ordering, improvement of evaluations of an object on the criteria should not worsen its class assignment. The sorting problem is very similar to the problem of classification, however, in the latter, the objects are evaluated by regular attributes and the decision classes are not necessarily preference ordered. The problem of multicriteria classification is also referred to as ordinal classification problem with monotonicity constraints and often appears in real-life application when ordinal and monotone properties follow from the domain knowledge about the problem.
As an illustrative example, consider the problem of evaluation in a high school. The director of the school wants to assign students (objects) to three classes: bad, medium and good (notice that class good is preferred to medium and medium is preferred to bad). Each student is described by three criteria: level in Physics, Mathematics and Literature, each taking one of three possible values bad, medium and good. Criteria are preference-ordered and improving the level from one of the subjects should not result in worse global evaluation (class).
As a more serious example, consider classification of bank clients, from the viewpoint of bankruptcy risk, into classes safe and risky. This may involve such characteristics as "return on equity (ROE)", "return on investment (ROI)" and "return on sales (ROS)". The domains of these attributes are not simply ordered but involve a preference order since, from the viewpoint of bank managers, greater values of ROE, ROI or ROS are better for clients being analysed for bankruptcy risk . Thus, these attributes are criteria. Neglecting this information in knowledge discovery may lead to wrong conclusions.
In DRSA, data are often presented using a particular form of decision table. Formally, a DRSA decision table is a 4-tuple
S=\langleU,Q,V,f\rangle
U
Q
V=cup{}qVq
Vq
q
f\colonU x Q\toV
f(x,q)\inVq
(x,q)\inU x Q
Q
C ≠ \emptyset
d
f(x,q)
x
q\inC
f(x,d)
It is assumed that the domain of a criterion
q\inQ
\succeqq
x\succeqqy
x
y
q
q
Vq\subseteqR
\geq
x\succeqqy\ifff(x,q)\geqf(y,q)
Vq
Let
T=\{1,\ldots,n\}
Vd
n
Vd=T
U
n
bf{Cl}=\{Clt,t\inT\}
Clt=\{x\inU\colonf(x,d)=t\}
x\inU
Clt,t\inT
r,s\inT
r\geqs
Clr
Cls
\geq | |
Cl | |
t |
=cupsCls
\leq | |
Cl | |
t= |
cupsCls t\inT
We say that
x
y
P\subseteqC
xDpy
x
y
P
x\succeqqy,\forallq\inP
P\subseteqC
DP
P\subseteqC
x\inU
+(x) | |
D | |
P |
=\{y\inU\colonyDpx\}
-(x) | |
D | |
P |
=\{y\inU\colonxDpy\}
represent P-dominating set and P-dominated set with respect to
x\inU
The key idea of the rough set philosophy is approximation of one knowledge by another knowledge. In DRSA, the knowledge being approximated is a collection of upward and downward unions of decision classes and the "granules of knowledge" used for approximation are P-dominating and P-dominated sets.
The P-lower and the P-upper approximation of
\geq | |
Cl | |
t |
,t\inT
P\subseteqC
\geq | |
\underline{P}(Cl | |
t |
)
\geq | |
\overline{P}(Cl | |
t |
)
\geq | |
\underline{P}(Cl | |
t |
)=\{x\inU\colon
+(x) | |
D | |
P |
\subseteq
\geq | |
Cl | |
t |
\}
\geq | |
\overline{P}(Cl | |
t |
)=\{x\inU\colon
-(x) | |
D | |
P |
\cap
\geq | |
Cl | |
t |
≠ \emptyset\}
Analogously, the P-lower and the P-upper approximation of
\leq | |
Cl | |
t |
,t\inT
P\subseteqC
\leq | |
\underline{P}(Cl | |
t |
)
\leq | |
\overline{P}(Cl | |
t |
)
\leq | |
\underline{P}(Cl | |
t |
)=\{x\inU\colon
-(x) | |
D | |
P |
\subseteq
\leq | |
Cl | |
t |
\}
\leq | |
\overline{P}(Cl | |
t |
)=\{x\inU\colon
+(x) | |
D | |
P |
\cap
\leq | |
Cl | |
t |
≠ \emptyset\}
Lower approximations group the objects which certainly belong to class union
\ge | |
Cl | |
t |
\le | |
Cl | |
t |
x\inU
\ge | |
\underline{P}(Cl | |
t) |
\le | |
\underline{P}(Cl | |
t) |
U
y\inU
x
\ge | |
Cl | |
t |
\le | |
Cl | |
t |
\ge | |
Cl | |
t |
\le | |
Cl | |
t |
x\inU
\ge | |
\overline{P}(Cl | |
t) |
\le | |
\overline{P}(Cl | |
t) |
y\inU
x
\ge | |
Cl | |
t |
\le | |
Cl | |
t |
The P-lower and P-upper approximations defined as above satisfy the following properties for all
t\inT
P\subseteqC
\geq | |
\underline{P}(Cl | |
t |
)\subseteq
\geq | |
Cl | |
t |
\subseteq
\geq | |
\overline{P}(Cl | |
t |
)
\leq | |
\underline{P}(Cl | |
t |
)\subseteq
\leq | |
Cl | |
t |
\subseteq
\leq | |
\overline{P}(Cl | |
t |
)
The P-boundaries (P-doubtful regions) of
\geq | |
Cl | |
t |
\leq | |
Cl | |
t |
BnP(Cl
\geq | |
t |
)=
\geq | |
\overline{P}(Cl | |
t |
\geq | |
)-\underline{P}(Cl | |
t |
)
BnP(Cl
\leq | |
t |
)=
\leq | |
\overline{P}(Cl | |
t |
\leq | |
)-\underline{P}(Cl | |
t |
)
The ratio
\gammaP(bf{Cl})=
| |||||||||||||||
|U| |
defines the quality of approximation of the partition
bf{Cl}
P
Every minimal subset
P\subseteqC
\gammaP(Cl)=\gammaC(Cl)
C
REDCl(P)
On the basis of the approximations obtained by means of the dominance relations, it is possible to induce a generalized description of the preferential information contained in the decision table, in terms of decision rules. The decision rules are expressions of the form if [condition] then [consequent], that represent a form of dependency between condition criteria and decision criteria. Procedures for generating decision rules from a decision table use an inductive learning principle. We can distinguish three types of rules: certain, possible and approximate. Certain rules are generated from lower approximations of unions of classes; possible rules are generated from upper approximations of unions of classes and approximate rules are generated from boundary regions.
Certain rules has the following form:
if
f(x,q1)\geqr1
f(x,q2)\geqr2
\ldotsf(x,qp)\geqrp
x\in
\geq | |
Cl | |
t |
if
f(x,q1)\leqr1
f(x,q2)\leqr2
\ldotsf(x,qp)\leqrp
x\in
\leq | |
Cl | |
t |
Possible rules has a similar syntax, however the consequent part of the rule has the form:
x
\geq | |
Cl | |
t |
x
\leq | |
Cl | |
t |
Finally, approximate rules has the syntax:
if
f(x,q1)\geqr1
f(x,q2)\geqr2
\ldotsf(x,qk)\geqrk
f(x,qk+1)\leqrk+1
f(x,qk+2)\leqrk+2
\ldotsf(x,qp)\leqrp
x\inCls\cupCls+1\cupClt
The certain, possible and approximate rules represent certain, possible and ambiguous knowledge extracted from the decision table.
Each decision rule should be minimal. Since a decision rule is an implication, by a minimal decision rule we understand such an implication that there is no other implication with an antecedent of at least the same weakness (in other words, rule using a subset of elementary conditions or/and weaker elementary conditions) and a consequent of at least the same strength (in other words, rule assigning objects to the same union or sub-union of classes).
A set of decision rules is complete if it is able to cover all objects from the decision table in such a way that consistent objects are re-classified to their original classes and inconsistent objects are classified to clusters of classes referring to this inconsistency. We call minimal each set of decision rules that is complete and non-redundant, i.e. exclusion of any rule from this set makes it non-complete.One of three induction strategies can be adopted to obtain a set of decision rules:[4]
The most popular rule induction algorithm for dominance-based rough set approach is DOMLEM,[5] which generates minimal set of rules.
Consider the following problem of high school students’ evaluations:
x1 | medium | medium | bad | bad | ||
---|---|---|---|---|---|---|
x2 | good | medium | bad | medium | ||
x3 | medium | good | bad | medium | ||
x4 | bad | medium | good | bad | ||
x5 | bad | bad | medium | bad | ||
x6 | bad | medium | medium | medium | ||
x7 | good | good | bad | good | ||
x8 | good | medium | medium | medium | ||
x9 | medium | medium | good | good | ||
x10 | good | medium | good | good |
Each object (student) is described by three criteria
q1,q2,q3
Cl1=\{bad\}
Cl2=\{medium\}
Cl3=\{good\}
\leq | |
Cl | |
1 |
\leq | |
Cl | |
2 |
\geq | |
Cl | |
2 |
\geq | |
Cl | |
3 |
Notice that evaluations of objects
x4
x6
x4
x6
Therefore, lower approximations of class unions consist of the following objects:
\leq | |
\underline{P}(Cl | |
1 |
)=\{x1,x5\}
\leq | |
\underline{P}(Cl | |
2 |
)=\{x1,x2,x3,x4,x5,x6,x8\}=
\leq | |
Cl | |
2 |
\geq | |
\underline{P}(Cl | |
2 |
)=\{x2,x3,x7,x8,x9,x10\}
\geq | |
\underline{P}(Cl | |
3 |
)=\{x7,x9,x10\}=
\geq | |
Cl | |
3 |
Thus, only classes
\leq | |
Cl | |
1 |
\geq | |
Cl | |
2 |
\leq | |
\overline{P}(Cl | |
1 |
)=\{x1,x4,x5,x6\}
\geq | |
\overline{P}(Cl | |
2 |
)=\{x2,x3,x4,x6,x7,x8,x9,x10\}
while their boundary regions are:
BnP(Cl
\leq | |
1 |
)=BnP(Cl
\geq | |
2 |
)=\{x4,x6\}
Of course, since
\leq | |
Cl | |
2 |
\geq | |
Cl | |
3 |
\leq | |
\overline{P}(Cl | |
2 |
\leq | |
)=Cl | |
2 |
\geq | |
\overline{P}(Cl | |
3 |
\geq | |
)=Cl | |
3 |
BnP(Cl
\leq | |
2 |
)=BnP(Cl
\geq | |
3 |
)=\emptyset
The following minimal set of 10 rules can be induced from the decision table:
Physics\leqbad
student\leqbad
Literature\leqbad
Physics\leqmedium
Math\leqmedium
student\leqbad
Math\leqbad
student\leqmedium
Literature\leqmedium
Physics\leqmedium
student\leqmedium
Math\leqmedium
Literature\leqbad
student\leqmedium
Literature\geqgood
Math\geqmedium
student\geqgood
Physics\geqgood
Math\geqgood
student\geqgood
Math\geqgood
student\geqmedium
Physics\geqgood
student\geqmedium
Math\leqbad
Physics\geqmedium
student=bad\lormedium
The last rule is approximate, while the rest are certain.
The other two problems considered within multi-criteria decision analysis, multicriteria choice and ranking problems, can also be solved using dominance-based rough set approach. This is done by converting the decision table into pairwise comparison table (PCT).[1]
The definitions of rough approximations are based on a strict application of the dominance principle. However, when defining non-ambiguous objects, it is reasonable to accept a limited proportion of negative examples, particularly for large decision tables. Such extended version of DRSA is called Variable-Consistency DRSA model (VC-DRSA)[6]
In real-life data, particularly for large datasets, the notions of rough approximations were found to be excessively restrictive. Therefore, an extension of DRSA, based on stochastic model (Stochastic DRSA), which allows inconsistencies to some degree, has been introduced.[7] Having stated the probabilistic model for ordinal classification problems with monotonicity constraints, the concepts of lower approximations are extended to thestochastic case. The method is based on estimating the conditional probabilities using the nonparametric maximum likelihood method which leadsto the problem of isotonic regression.
Stochastic dominance-based rough sets can also be regarded as a sort of variable-consistency model.
4eMka2 is a decision support system for multiple criteria classification problems based on dominance-based rough sets (DRSA). JAMM is a much more advanced successor of 4eMka2. Both systems are freely available for non-profit purposes on the Laboratory of Intelligent Decision Support Systems (IDSS) website.