ARTICLE
Year: 2012 | Volume: 58 | Issue: 2 | Page: 138-154
ERANN: An Algorithm to Extract Symbolic Rules from Trained Artificial Neural Networks
SM Kamruzzaman1, Md. Abdul Hamid2, AM Jehad Sarkar3
1 Department of Electronics Engineering, Hankuk University of Foreign Studies, Yongin-si, Kyonggi-do, 449-791, Korea
2 Department of Information and Communications Engineering, Hankuk University of Foreign Studies, Yongin-si, Kyonggi-do, 449-791, Korea
3 Digital Information Engineering, Hankuk University of Foreign Studies, Yongin-si, Kyonggi-do, 449-791, Korea
Date of Web Publication: 16-May-2012
Correspondence Address: S M Kamruzzaman Department of Electronics Engineering, Hankuk University of Foreign Studies, Yongin-si, Kyonggi-do, 449-791 Korea
 DOI: 10.4103/0377-2063.96181
Abstract
This paper presents ERANN, an algorithm to extract symbolic rules from trained artificial neural networks (ANNs). In many applications, it is desirable to extract knowledge from ANNs so that users can better understand how the networks solve their problems. Although ANNs usually achieve high classification accuracy, their results can be incomprehensible because the knowledge embedded in them is distributed over the activation functions and the connection weights. Extracting rules from trained ANNs addresses this problem. The proposed algorithm is based on a standard three-layer feedforward ANN with four-phase training. Extensive experimental studies on a set of benchmark classification problems, including breast cancer, iris, diabetes, wine, season, golf playing, and lenses classification, demonstrate the applicability of the proposed method. The extracted rules are comparable with those of other methods in terms of the number of rules, the average number of conditions per rule, and rule accuracy. The proposed method achieved accuracies of 96.28%, 98.67%, 76.56%, 91.01%, 100%, 100%, and 100% on the above problems, respectively. These results are among the best when compared with those reported in related previous studies.
Keywords: Backpropagation, Clustering algorithm, Constructive algorithm, Continuous activation function, Pruning algorithm, Rule extraction algorithm, Symbolic rules
How to cite this article: Kamruzzaman S M, Hamid M, Jehad Sarkar A M. ERANN: An Algorithm to Extract Symbolic Rules from Trained Artificial Neural Networks. IETE J Res 2012;58:138-54
1. Introduction
Artificial neural networks (ANNs) have become a powerful tool for solving some classes of problems, especially problems that are hard to tackle with expert systems, heuristics, or deterministic algorithms. ANNs have been successfully applied to a variety of real-world problems [1],[2],[3],[4],[5]. However, an inherent defect of ANNs is that the learned knowledge is masked in a large number of connection weights, which leads to poor transparency of knowledge and poor explanation ability [6]. To compensate for this defect, researchers are interested in developing a humanly understandable representation for ANNs. This can be achieved by extracting rules from trained ANNs, because rule extraction provides a description of the inner workings of the network that is easy for human beings to understand. This is why developing algorithms to extract rules from trained ANNs has been a hot research topic for several years [2, 3, 6-18]. It is an extensively studied topic, and detailed surveys are available in [6],[8]. Rule extraction from ANNs solves two fundamental problems: it gives insight into the logic behind the network and, in many cases, it improves the network's ability to generalize the acquired knowledge [16]. It is therefore desirable to have a set of rules that explains how ANNs solve a given problem.
One of the main criteria for describing a rule extraction scheme is how the algorithm makes use of the existing ANN. There are three approaches to rule extraction from ANNs: pedagogical, decompositional, and eclectic. Pedagogical approaches treat a neural network as a black box: they are not concerned with the internal structure of the network and extract rules by looking only at the input variables and output activations. There is no need to examine the behavior of any nodes within the network [17,18]. This approach aims at extracting symbolic rules that map the input-output relationship as closely as possible to the way the ANN understands the relationship. The number of extracted rules and their form do not directly correspond to the number of connection weights or the architecture of the ANN [19]. Decompositional methods, on the other hand, investigate hidden nodes and weight matrices to produce rules that follow the internal working of the network. Decompositional rule extraction algorithms directly interpret the response of each node in the network, sometimes assigning linguistic meaning to the nodes [20,21]. Finally, rule extraction techniques that do not fall clearly into one of the above categories are called eclectic, because they draw on both the pedagogical and decompositional approaches. The eclectic approach is characterized by any use of knowledge concerning the internal architecture and/or weight vectors of a trained ANN to complement a symbolic learning algorithm [22].
Besides good classification accuracy, a small number of rules is also a very important criterion for rule extraction algorithms, because it makes the rules easier for human experts to understand. A number of works in the literature explain the functionality of ANNs by extracting rules from trained ANNs. The main problem of most of the existing work is that the number of hidden nodes in the ANN is determined manually. Thus, the prediction accuracy and the rules extracted from trained ANNs may not be optimal, since the performance of ANNs depends greatly on their architectures. Furthermore, rules extracted by existing algorithms are not simple and, as a result, are difficult for users to understand.
This paper proposes a new algorithm, called ERANN (extraction of rules from ANNs), to extract symbolic rules from trained ANNs. A standard three-layer feedforward ANN with four-phase training is the basis of the proposed algorithm. In the first phase, the number of hidden nodes in ANNs is determined automatically by a weight freezing-based constructive algorithm. In the second phase, irrelevant connections and nodes are removed from trained ANNs without sacrificing the predictive accuracy of ANNs. The continuous activation values of the hidden nodes are then discretized by using an efficient heuristic clustering algorithm in the third phase. Finally, rules are extracted from compact ANNs by examining the discretized activation values of the hidden nodes.
The prominent feature of ERANN is that it does not require many user-specified parameters for extracting rules. In addition, an efficient clustering algorithm is used in ERANN to discretize the continuous values of hidden nodes so that rules can be extracted easily from the discretized values.
The rest of the paper is organized as follows. Section 2 describes the related work. The proposed algorithm is presented in section 3. We discuss the performance studies in section 4. The performance comparisons are reported in section 5. Finally, in section 6, we conclude the paper.
2. Related Work
Because of the strong research interest, a number of algorithms for extracting rules from trained ANNs have been developed. Towell and Shavlik described two methods for extracting rules from ANNs in [23]. The first method is the subset algorithm [24], which searches for subsets of connections to a unit whose summed weight exceeds the bias of that node. The major problem with subset algorithms is that the cost of finding all subsets grows with the size of the power set of the links to each node. The second method, the MofN algorithm [25], is an improvement of the subset method that is designed to explicitly search for M-of-N rules from knowledge-based ANNs. It checks a group of connections, instead of a single connection, to find their contribution to a node's activation. This is done by clustering the connections of the ANN. The problems of MofN are that it uses a threshold activation function, which is not continuous, and a fixed number of hidden nodes, which requires prior knowledge of the problem to be solved.
Craven and Shavlik proposed a method that uses sampling and queries in [26]. Instead of searching for rules from the ANN, the problem of rule extraction is viewed as a learning task. The target concept is the function computed by the network, and the ANN input features are the inputs for the learning task. Conjunctive rules are then extracted from the ANN. Liu and Tan proposed a simple and fast algorithm, X2R, in [27] that can be applied to both numeric and discrete data for generating rules. X2R can generate concise rules from raw datasets by using first-order information. It can generate perfect rules in the sense that the error rate of the rules is not worse than the inconsistency rate found in the original data. The problem of X2R is that the rules it generates are order sensitive, i.e., the generated rules must be fired in sequence. Setiono and Liu presented a novel way to understand ANNs by extracting rules with a three-phase algorithm in [28]. A weight decay backpropagation network is built in the first phase so that important connections are reflected by their larger weights. In the second phase, the network is pruned in such a way that insignificant connections are deleted while its predictive accuracy is maintained. In the third phase, rules are extracted by recursively discretizing the hidden unit activation values. The problem of the three-phase algorithm is that the algorithm used to discretize the output values of hidden nodes is not efficient.
Setiono and Liu proposed a rule extraction algorithm named NeuroRule in [29], which can extract symbolic classification rules from a pruned network with a single hidden layer in two steps. First, the rules that explain the network outputs are extracted in terms of the discretized activation values of the hidden units. Second, the rules that explain the discretized hidden unit activation values are extracted in terms of the network inputs. When the two sets of rules are merged, a DNF representation of the network classification is obtained. Setiono and Leow proposed a new method, rule extraction from function approximating neural networks (REFANN), for extracting rules from trained ANNs for nonlinear regression in [30]. It is shown that REFANN can produce rules that are almost as accurate as the original ANNs from which the rules are extracted. For some problems, REFANN extracts few rules that represent useful knowledge for explaining the problems easily. REFANN approximates the nonlinear hyperbolic tangent activation function of the hidden nodes by a simple three-piece or five-piece linear function. It then generates rules in the form of linear equations from trained ANNs. The problem of REFANN is that it needs to divide the continuous hidden node activation into a three-piece or five-piece linear function, which may not be possible for complex problems.
Kamruzzaman and Islam proposed a new algorithm, REANN, to extract rules from trained ANNs for medical diagnosis problems in [31]. That paper investigates the rule extraction process for only three datasets (breast cancer, diabetes, and lenses), whereas our current work investigates the applicability of the proposed approach to a wide variety of real-world problems. In addition, the constructive algorithm used in REANN is not efficient. Etchells and Lisboa [32] proposed an orthogonal search-based methodology for rule extraction (OSRE) that fits Boolean rules to any analytical smooth decision surface, which is typical of several neural network models. The OSRE methodology was tested on four validation datasets, namely the three Monks' problems and Wisconsin Breast Cancer, generating interpretable rules that accurately replicate the binary inferences made by the response surfaces.
Jin and Sendhoff provide a review of the existing research on Pareto-based multiobjective learning algorithms in [33]. They illustrate Pareto-based multiobjective machine learning (PMML) on three benchmark problems (breast cancer, iris, and diabetes) and show that the Pareto-based multiobjective approach can address important topics in machine learning, such as generating interpretable models, model selection for generalization, and ensemble extraction. Finally, they compare three Pareto-based approaches to the extraction of neural ensembles and indicate that the method that trades off accuracy and complexity can provide reliable results.
Kahramanli and Allahverdi presented a new method, RAIS, in [34] that uses an artificial immune systems algorithm to extract rules from a trained adaptive neural network. Data from two real-time problems, breast cancer and ECG, were investigated to determine the applicability of the proposed method. Finally, Chorowski and Zurada presented a novel eclectic approach to rule extraction from ANNs in [16], named LOcal Rule Extraction, suited for multilayer perceptron networks with discrete (logical or categorical) inputs. The extracted rules mimic network behavior on the training set and relax this condition on the remaining input space.
In summary, the problems of existing algorithms are that they use a predefined and fixed number of hidden nodes, their clustering algorithms are not efficient, they are computationally expensive, and they cannot produce concise and order-insensitive rules. To overcome these limitations, we propose an algorithm to extract symbolic rules from trained ANNs, which is described in detail in the next section.
3. The ERANN Algorithm
There are several possible approaches to understanding the representations learned by ANNs. Our approach to understanding trained ANNs is to extract symbolic rules from them with a four-phase algorithm. The aim of this section is to introduce a new rule extraction algorithm named extraction of rules from ANNs (ERANN) to extract symbolic rules from trained ANNs for understanding how an ANN solves a given problem. In comparison with other existing algorithms in the literature, the major advantages of ERANN include the following: (i) it can determine near optimal ANN architectures automatically by using a constructive-pruning strategy; (ii) it uses an efficient method to discretize the output values of hidden nodes; (iii) it is computationally inexpensive; and (iv) the extracted rules are concise, comprehensible, order insensitive, and highly accurate.
The major steps of ERANN are summarized in [Figure 1] and explained further as follows:
Step 1: Create an initial ANN architecture having three layers: an input, an output, and a hidden layer. The numbers of nodes in the input and output layers are the same as the number of inputs (attributes) and the number of outputs (classes) of the problem, respectively. The number of nodes in the hidden layer starts at one and is determined automatically by a weight freezing-based constructive algorithm, explained in subsection 3.1.
Step 2: Remove redundant connection weights between input and hidden layer nodes and between hidden and output layer nodes by using a basic pruning algorithm, explained in subsection 3.2. A node is pruned if all the connection weights to and from the node are pruned. When pruning is completed, the ANN architecture contains only important nodes and connection weights. This architecture is saved for the next step.
Step 3: Discretize the continuous output values of hidden nodes by using an efficient heuristic clustering algorithm, explained in subsection 3.3. Discretization is needed because rules are not readily extractable from ANNs while the hidden node outputs remain continuous.
Step 4: Extract the symbolic rules that map the input-output relationships. The task of rule extraction is accomplished in three phases. In the first phase, rules are extracted by using the rule extraction algorithm, explained in subsection 3.4, to describe the outputs of the ANN in terms of the discretized output values of the hidden nodes. In the second phase, rules are extracted to describe the discretized output values of the hidden nodes in terms of the inputs. Finally, in the third phase, the rules extracted in the first and second phases are combined.
It may be seen that ERANN is very straightforward. It consists of four phases that are carried out sequentially. In the following subsections, each phase is described in detail, and the reasons for using the particular techniques in each phase are explained. The rules extracted by ERANN are compact and comprehensible, and do not involve any weight values. The accuracy of the rules extracted from pruned ANNs is as high as the accuracy of the original ANNs.
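The third phase of Step 4, combining the two rule sets, can be illustrated with a minimal sketch. It assumes a simplified representation that is not taken from the paper: every condition is an equality test, rules over hidden nodes are keyed by a hypothetical (hidden node, cluster index) pair, and the function name `merge_rules` is illustrative only.

```python
def merge_rules(output_rules, hidden_rules):
    """Substitute input-level conditions for hidden-node conditions.

    output_rules: list of ({hidden_node: cluster_index}, class_label)
    hidden_rules: {(hidden_node, cluster_index): [ {input_attr: value}, ... ]}
    Returns a list of ({input_attr: value}, class_label).
    """
    merged = []
    for hidden_conds, label in output_rules:
        partial = [{}]  # conjunctions of input conditions built so far
        for key in hidden_conds.items():
            partial = [
                {**acc, **conds}
                for acc in partial
                for conds in hidden_rules.get(key, [])
                # drop combinations that demand two values for one attribute
                if all(acc.get(a, v) == v for a, v in conds.items())
            ]
        merged.extend((conds, label) for conds in partial)
    return merged


# Toy example: one hidden node (index 0) with two discrete values (0 and 1).
output_rules = [({0: 0}, "c1"), ({0: 1}, "c2")]
hidden_rules = {
    (0, 0): [{"x1": "low"}],
    (0, 1): [{"x1": "high"}, {"x2": "high"}],
}
merged = merge_rules(output_rules, hidden_rules)
```

The merged rules refer only to input attributes and class labels, which matches the statement that the final rules do not involve any weight values.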
3.1 Constructive Algorithm
Constructive algorithms offer an attractive framework for the incremental construction of near-minimal ANN architectures. These algorithms start with a small network (usually a single node) and dynamically grow the network by adding and training nodes as needed until a satisfactory solution is found. One drawback of the traditional backpropagation algorithm is the need to determine the number of nodes in the hidden layer prior to training. To overcome this difficulty, many algorithms that construct a network dynamically have been proposed [35],[36],[37]. The most well-known constructive algorithms are dynamic node creation [38], the feedforward neural network construction algorithm (FNNCA), and the cascade correlation algorithm [39].
The constructive algorithm used in ERANN is based on the FNNCA proposed in [40] . In FNNCA, the training process is stopped when the classification accuracy on the training set is 100%. However, it is not possible to get 100% classification accuracy for most of the benchmark classification problems. In addition, higher classification accuracy on the training set does not guarantee the higher generalization ability, i.e., classification accuracy on the testing set.
The training time is an important issue in designing ANNs. One approach to reducing the number of connection weights to be trained is to train only a few weights rather than all weights in a network and keep the remaining connection weights fixed, commonly known as weight freezing. The idea behind the weight freezing-based constructive algorithm is to freeze the input connection weights of a hidden node when its output does not change much over a few successive training epochs. Theoretical and experimental studies reveal that some hidden nodes of an ANN maintain almost constant output after some training epochs, while others continue to change during the whole training period [41,42]. Our constructive algorithm therefore freezes the input connection weights of a hidden node when its output does not change much in successive training epochs. This weight freezing method can be considered a combination of two extremes: training all the weights of the ANN, and training only the weights of the newly added hidden node [41].
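As an illustration of the freezing idea, the sketch below decides whether a hidden node's output has stopped changing. The paper does not state the exact measure of change in this passage, so the mean absolute change per pattern, the `window` of epochs, and the tolerance `tol` are all assumptions, as is the function name `should_freeze`.

```python
def should_freeze(outputs_by_epoch, window=3, tol=1e-3):
    """Return True if a hidden node's output is nearly constant.

    outputs_by_epoch: one list per training epoch, holding the node's
    activation for every training pattern (same pattern order each epoch).
    The node qualifies when the mean absolute change between consecutive
    epochs stays below `tol` for the last `window` epoch transitions.
    """
    if len(outputs_by_epoch) < window + 1:
        return False  # not enough history yet
    recent = outputs_by_epoch[-(window + 1):]
    for prev, curr in zip(recent, recent[1:]):
        change = sum(abs(c - p) for p, c in zip(prev, curr)) / len(curr)
        if change >= tol:
            return False
    return True


stable = [[0.5, -0.2], [0.5001, -0.2001], [0.5001, -0.2], [0.5002, -0.2001]]
moving = [[0.1, 0.1], [0.3, 0.2], [0.5, 0.4], [0.7, 0.6]]
```

Once `should_freeze` returns True for a node, its input weights would be excluded from further backpropagation updates, while the remaining weights continue to train.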
The major steps of constructive algorithm used in ERANN are summarized in [Figure 2] and explained further as follows:
Step 1: Create an initial ANN consisting of three layers, i.e., an input, an output, and a hidden layer. The number of nodes in the input and output layers is the same as the number of inputs and outputs of the problem. Initially, the hidden layer contains only one node. Randomly initialize connection weights between input layer to hidden layer and hidden layer to output layer within a certain small range.
Step 2: Train the network on the training set by using the backpropagation algorithm until the training error E does not reduce significantly over the next few training epochs τ. Here, τ is a user-defined positive integer.
Step 3: Compute the training error E of the ANN. If E is unacceptable (i.e., too large), assume that the ANN has an inappropriate architecture and go to the next step. Otherwise, stop the training process. The training error E of the ANN is calculated according to the following equations:

$$E = \frac{1}{2}\sum_{i=1}^{k}\sum_{p=1}^{C}\left(S_{pi} - t_{pi}\right)^{2} \tag{1}$$

where k is the number of patterns, C is the number of output nodes, $t_{pi}$ is the target value for pattern $x_i$ at output node p, and $S_{pi}$ is the output of the network at output node p:

$$S_{pi} = \sigma\!\left(\sum_{m=1}^{h}\delta\!\left(x_i^{T} w_m\right) v_{pm}\right) \tag{2}$$

where h is the number of hidden nodes in the network, $x_i$ is an n-dimensional input pattern, i = 1, 2, …, k, $w_m$ is an n-dimensional vector of weights for the arcs connecting the input layer and the m-th hidden node, m = 1, 2, …, h, and $v_m$ is a C-dimensional vector of weights for the arcs connecting the m-th hidden node and the output layer. We have chosen the logistic sigmoid function $\sigma(y) = 1/(1 + e^{-y})$ as the activation function for the output layer and the hyperbolic tangent function $\delta(y) = (e^{y} - e^{-y})/(e^{y} + e^{-y})$ as the activation function for the hidden layer.

Step 5: Freeze the input connection weights of that node.
Step 6: Add one node to the hidden layer. Randomly initialize the connection weights of the arcs connecting this new hidden node with input layer and output layer nodes and go to step 2.
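The quantities used in Steps 2 and 3 can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the training error is taken to be half the summed squared error, bias terms are omitted, and the stall test (an error drop smaller than `min_drop` over τ epochs) is one possible reading of "does not significantly reduce". The names `forward`, `training_error`, and `has_stalled` are hypothetical.

```python
import math


def sigmoid(y):
    """Logistic sigmoid, used for the output layer."""
    return 1.0 / (1.0 + math.exp(-y))


def forward(x, W, V):
    """Network outputs for one input pattern x.

    W[m] holds the input weights of hidden node m (length n).
    V[p][m] is the weight from hidden node m to output node p.
    """
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(w, x))) for w in W]
    return [sigmoid(sum(v * h for v, h in zip(v_p, hidden))) for v_p in V]


def training_error(X, T, W, V):
    """Half the summed squared error over all patterns and output nodes."""
    return 0.5 * sum(
        (s - t) ** 2
        for x, t_vec in zip(X, T)
        for s, t in zip(forward(x, W, V), t_vec)
    )


def has_stalled(errors, tau, min_drop=1e-4):
    """True if the error fell by less than min_drop over the last tau epochs."""
    if len(errors) <= tau:
        return False
    return errors[-tau - 1] - errors[-1] < min_drop


# One hidden node, one output node, all weights zero: the output is 0.5.
err = training_error([[1.0, 2.0]], [[1.0]], W=[[0.0, 0.0]], V=[[0.0]])
```

In a constructive loop, `has_stalled` would end Step 2, `training_error` would drive the decision in Step 3, and a new row would be appended to `W` (and a new column to each row of `V`) in Step 6.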
3.2 Pruning Algorithm
Pruning offers an approach for dynamically determining an appropriate network topology. Pruning techniques begin by training a larger than necessary network and then eliminate weights and nodes that are deemed redundant [43] . As the nodes of the hidden layer are determined automatically by constructive algorithm in ERANN, the aim of this pruning algorithm used here is to remove as many unnecessary connections as possible. A node is pruned if all the connections to and from the node are pruned. Typically, methods for removing weights from the network involve adding a penalty term to the error function. It is expected that by adding a penalty term to the error function, unnecessary connections will have small weights, and therefore pruning can reduce the complexity of the network significantly. The simplest and most commonly used penalty term is the sum of the squared weights. The penalty function consists of two terms; the first term is to discourage the use of unnecessary connection weights and the second term is to prevent the connection weights from taking excessively large values.
Given a set of input patterns $x_i \in R^n$, i = 1, 2, …, k, let $w_m$ be an n-dimensional vector of weights for the arcs connecting the input layer and the m-th hidden node, m = 1, 2, …, h. The weight of the connection from the l-th input node to the m-th hidden node is denoted by $w_{ml}$. Let $v_m$ be a C-dimensional vector of weights for the arcs connecting the m-th hidden node and the output layer. The weight of the connection from the m-th hidden node to the p-th output node is denoted by $v_{pm}$. It has been suggested that faster convergence can be achieved by minimizing the cross entropy function instead of the squared error function [44].
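The backpropagation algorithm is applied to update the weights (w, v) and minimize the following error function:

$$\theta(w, v) = F(w, v) + P(w, v) \tag{3}$$

where F(w, v) is the cross entropy function

$$F(w, v) = -\sum_{i=1}^{k}\sum_{p=1}^{C}\left[t_{pi}\log S_{pi} + \left(1 - t_{pi}\right)\log\left(1 - S_{pi}\right)\right] \tag{4}$$

where $S_{pi}$ is the output of the network

$$S_{pi} = \sigma\!\left(\sum_{m=1}^{h}\delta\!\left(x_i^{T} w_m\right) v_{pm}\right) \tag{5}$$

and P(w, v) is the penalty term

$$P(w, v) = \varepsilon_1\left(\sum_{m=1}^{h}\sum_{l=1}^{n}\frac{\beta w_{ml}^{2}}{1 + \beta w_{ml}^{2}} + \sum_{m=1}^{h}\sum_{p=1}^{C}\frac{\beta v_{pm}^{2}}{1 + \beta v_{pm}^{2}}\right) + \varepsilon_2\left(\sum_{m=1}^{h}\sum_{l=1}^{n} w_{ml}^{2} + \sum_{m=1}^{h}\sum_{p=1}^{C} v_{pm}^{2}\right) \tag{6}$$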
The values for the weight decay parameters, ε1 , ε2 > 0, must be chosen to reflect the relative importance of the accuracy of the network vs its complexity. More weights may be removed from the network at the cost of a decrease in the network accuracy with larger values of these two parameters. They also determine the range of values where the penalty for each weight in the network is approximately equal to ε1 . The parameter β> 0 determines the steepness of the error function near the origin.
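The roles of ε1, ε2, and β can be seen in a small numerical sketch. It assumes the two-term penalty commonly used with this kind of pruning: a saturating term βw²/(1 + βw²) weighted by ε1, plus a quadratic weight-decay term weighted by ε2. That form is an assumption of this sketch, as are the function name `penalty` and the parameter values in the example.

```python
def penalty(W, V, eps1, eps2, beta):
    """Two-term weight penalty over all connection weights.

    W: input-to-hidden weights, V: hidden-to-output weights (lists of rows).
    The saturating term approaches eps1 per weight as |w| grows, which
    discourages keeping unnecessary connections; the quadratic term keeps
    growing and so prevents weights from becoming excessively large.
    """
    weights = [w for row in W for w in row] + [v for row in V for v in row]
    saturating = sum(beta * w * w / (1.0 + beta * w * w) for w in weights)
    quadratic = sum(w * w for w in weights)
    return eps1 * saturating + eps2 * quadratic


# With eps2 = 0, a single large weight costs close to eps1.
large = penalty([[100.0]], [], eps1=0.1, eps2=0.0, beta=10.0)
# A weight of 1 with beta = 1 sits halfway up the saturating term.
half = penalty([[1.0]], [[]], eps1=1.0, eps2=0.0, beta=1.0)
```

A larger β makes the saturating term rise more steeply near zero, so even small nonzero weights are penalized almost as much as large ones.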
This pruning algorithm removes the connections of the ANN according to the magnitudes of their weights. As the eventual goal of ERANN is to obtain a set of simple rules that describe the classification process, it is important that all unnecessary connections and nodes be removed. In order to remove as many connections as possible, the weights of the network must be prevented from taking values that are too large [45]. At the same time, the weights of irrelevant connections should be encouraged to converge to zero. The penalty function is found to be particularly suitable for these purposes.
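The effect of magnitude-based removal, including the rule that a node is pruned once all of its connections are pruned, can be sketched as follows. The single fixed `threshold` is a simplifying assumption of this sketch and stands in for the pruning conditions used by the algorithm; the function name `prune_by_magnitude` is hypothetical.

```python
def prune_by_magnitude(W, V, threshold):
    """Zero out small weights and list hidden nodes left without connections.

    W[m][l]: weight from input l to hidden node m.
    V[p][m]: weight from hidden node m to output node p.
    Returns (W_pruned, V_pruned, removable_hidden_nodes).
    """
    Wp = [[w if abs(w) > threshold else 0.0 for w in row] for row in W]
    Vp = [[v if abs(v) > threshold else 0.0 for v in row] for row in V]
    removable = [
        m for m in range(len(Wp))
        if all(w == 0.0 for w in Wp[m]) and all(row[m] == 0.0 for row in Vp)
    ]
    return Wp, Vp, removable


W = [[0.9, 0.01], [0.02, -0.03]]   # two hidden nodes, two inputs
V = [[1.2, 0.04]]                  # one output node
Wp, Vp, removable = prune_by_magnitude(W, V, threshold=0.05)
```

Here the second hidden node loses every incoming and outgoing connection and can be dropped, while the first keeps one input connection and its output connection. In the full procedure, each removal round is followed by retraining and an accuracy check.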
The steps of the weight-pruning algorithm are summarized in [Figure 3], which are explained further as follows:
Step 1: Train the network to meet a prespecified accuracy level with the condition (7) satisfied by all correctly classified input patterns.

$$\left|e_{pi}\right| = \left|S_{pi} - t_{pi}\right| \le \eta_1, \quad p = 1, 2, \ldots, C \tag{7}$$

Let η1 and η2 be positive scalars such that η1 + η2 < 0.5 (η1 is the error tolerance, η2 is a threshold that determines if a weight can be removed), where η1 ∈ [0, 0.5). Let (w, v) be the weights of this network.
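Step 2: Remove connections between input nodes and hidden nodes and between hidden nodes and output nodes. This task is accomplished in two phases. In the first phase, connections between input nodes and hidden nodes are removed: each weight $w_{ml}$ in the network that satisfies condition (8) is removed from the network. In the second phase, connections between hidden nodes and output nodes are removed: each weight $v_{pm}$ in the network that satisfies condition (9) is removed from the network.
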
Step 4: Retrain the network and calculate the classification accuracy of the network.
Step 5: If classification rate of the network falls below an acceptable level, then stop and use the previous setting of network weights. Otherwise, go to step 2.
The pruning algorithm used in ERANN is intended to reduce the amount of training time. Although it can no longer be guaranteed that the retrained pruned ANN will give the same accuracy rate as the original ANN, the experiments show that many weights can be eliminated simultaneously without deteriorating the performance of the ANN. The two conditions (8) and (9) for pruning depend on the weights of the connections between input and hidden nodes and between hidden and output nodes. It is imperative that, during training, these weights be prevented from getting too large. At the same time, small weights should be encouraged to decay rapidly to zero.
3.3 Heuristic Clustering Algorithm
The process of grouping a set of physical or abstract objects into classes of similar objects is called clustering. A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. A cluster of data objects can be treated collectively as one group in many applications [46]. A large number of clustering algorithms exist in the literature, such as k-means and k-medoids [47,48]. The choice of clustering algorithm depends both on the type of data available and on the particular purpose and application.
After applying pruning algorithm in ERANN, the ANN architecture produced by constructive algorithm contains only important connections and nodes. Nevertheless, rules are not readily extractable because the hidden node activation values are continuous. The discretization of these values paves the way for rule extraction. It is found that some hidden nodes of an ANN maintain almost constant output while other nodes change continuously during the whole training process [42] . In ERANN, no clustering algorithm is used when hidden nodes maintain almost constant output. If the outputs of hidden nodes do not maintain constant value, a heuristic clustering algorithm is used.
The aim of the clustering algorithm is to discretize the output values of hidden nodes. The algorithm places candidates for discrete values such that the distance between them is at least a threshold value ε. A very small ε will always guarantee that the network with discrete activation values has the same accuracy as the original network with continuous activation values. The algorithm can then be run again with a larger value of ε to reduce the number of clusters. The steps of the heuristic clustering algorithm are summarized in [Figure 4].
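A minimal sketch of a threshold-based discretization of this kind is given below. It assumes a single pass in which each activation joins the nearest existing candidate if it lies within ε of it and otherwise starts a new candidate, after which every candidate is replaced by the average of its members. The single-pass scheme and the function name `discretize_activations` are assumptions of this sketch.

```python
def discretize_activations(activations, eps):
    """Cluster the activation values of one hidden node.

    Returns (centres, labels): the discrete value of each cluster and,
    for every input activation, the index of the cluster it fell into.
    """
    reps, sums, counts, labels = [], [], [], []
    for a in activations:
        # index of the nearest existing candidate, or None if there is none
        j = min(range(len(reps)), key=lambda i: abs(a - reps[i]), default=None)
        if j is not None and abs(a - reps[j]) <= eps:
            sums[j] += a
            counts[j] += 1
        else:
            reps.append(a)
            sums.append(a)
            counts.append(1)
            j = len(reps) - 1
        labels.append(j)
    centres = [s / c for s, c in zip(sums, counts)]
    return centres, labels


acts = [-0.95, 0.9, -0.9, 0.1, 0.95, 0.15]
centres, labels = discretize_activations(acts, eps=0.2)
coarse_centres, coarse_labels = discretize_activations(acts, eps=2.0)
```

With ε = 0.2 the six activations fall into three clusters; with ε = 2.0 they collapse into one. After discretization, the network accuracy would be rechecked with each activation replaced by its cluster centre, and ε adjusted if the accuracy drops.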



3.4 Rule Extraction Algorithm
Classification rules are sought in many areas from automatic knowledge acquisition [49,50] to data mining [51,52] and ANN rule extraction because of some of their attractive features. They are explicit, understandable, and verifiable by domain experts, and can be modified, extended, and passed on as modular knowledge. The rule extraction algorithm is composed of three major functions:
- Rule Extraction: This function iteratively generates the shortest rules and removes/marks the patterns covered by each rule until all patterns are covered by the rules;
- Rule Clustering: Rules are clustered in terms of their class levels; and
- Rule Pruning: Redundant or more specific rules in each cluster are removed.
A default rule should be chosen to accommodate possible unclassifiable patterns. If rules are clustered, the choice of the default rule is based on clusters of rules. The steps of the rule extraction algorithm are summarized in [Figure 5] and explained further as follows:
Step 1: Extract rule
i = 0;
while (data are not empty/marked) {
    extract Ri to cover the current pattern and differentiate it from patterns in other categories;
    remove/mark all the patterns covered by Ri;
    i = i + 1;
}
The core of this step is a greedy algorithm that finds the shortest rule, based on first-order information, that can differentiate the pattern under consideration from the patterns of other classes. It then iteratively extracts the shortest rules and removes the patterns covered by each rule until all patterns are covered by the rules.
Step 2: Cluster rule
Cluster rules according to their class levels. Rules extracted in step 1 are grouped in terms of their class levels. In each rule cluster, redundant rules are eliminated; specific rules are replaced by more general rules.
Step 3: Prune rule
Replace specific rules with more general ones;
Remove noise rules;
Eliminate redundant rules;
Step 4: Check whether all patterns are covered by any rules. If yes, then stop, otherwise continue.
Step 5: Determine a default rule.
A default rule is chosen when no rule can be applied to a pattern.
The rule extraction algorithm exploits the first-order information in the data and finds the shortest sufficient conditions for a rule of a class that can differentiate it from patterns of other classes. It can extract concise and perfect rules in the sense that the error rate of the rules is not worse than the inconsistency rate found in the original data. The novelty of this algorithm is that the rules it extracts are order insensitive, i.e., the rules need not fire sequentially.
4. Performance Studies
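The covering loop of Step 1 and the use of a default rule can be sketched as follows. This sketch makes several assumptions: patterns are tuples of discrete values, a rule is a conjunction of equality tests, and an exhaustive search by number of conditions stands in for the first-order heuristic, so it is suitable only for small toy data. The rule clustering and pruning steps are omitted, and all function names are hypothetical.

```python
from itertools import combinations


def matches(rule, pattern):
    """A rule is a dict {attribute_index: value}; it fires if all tests hold."""
    return all(pattern[a] == v for a, v in rule.items())


def shortest_rule(pattern, label, data):
    """Fewest conditions from `pattern` that exclude all other-class patterns."""
    others = [p for p, c in data if c != label]
    for size in range(1, len(pattern) + 1):
        for attrs in combinations(range(len(pattern)), size):
            rule = {a: pattern[a] for a in attrs}
            if not any(matches(rule, p) for p in others):
                return rule
    return None  # inconsistent data: same pattern appears with another class


def extract_rules(data):
    """Cover every pattern with a shortest rule, removing covered patterns."""
    rules, uncovered = [], list(data)
    while uncovered:
        pattern, label = uncovered[0]
        rule = shortest_rule(pattern, label, data)
        if rule is None:
            uncovered.pop(0)  # left to the default rule
            continue
        if (rule, label) not in rules:
            rules.append((rule, label))
        uncovered = [(p, c) for p, c in uncovered
                     if not (c == label and matches(rule, p))]
    return rules


def classify(pattern, rules, default):
    """Return the class of the first matching rule, else the default class."""
    for rule, label in rules:
        if matches(rule, pattern):
            return label
    return default


# Toy data: (outlook, humidity) -> play
data = [
    (("sunny", "high"), "no"),
    (("sunny", "normal"), "yes"),
    (("rain", "high"), "yes"),
    (("rain", "normal"), "yes"),
]
rules = extract_rules(data)
```

Because every rule excludes all training patterns of the other classes, the rules can be tried in any order on the training data and still give the same class, which is what order insensitivity means here. A pattern matched by no rule, such as `("overcast", "high")`, falls through to the default class passed to `classify`.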
This section evaluates the performance of ERANN on a set of well-known benchmark classification problems including breast cancer, iris, diabetes, wine, season, golf playing, and lenses which are widely used in machine learning and ANN research. The datasets representing all the problems were real-world data.
4.1 Dataset Description
The following subsections briefly describe the datasets used in this study. The characteristics of the datasets are summarized in [Table 1]. Detailed descriptions of the datasets are available in [53,54].
The breast cancer data: The purpose of this problem is to diagnose a breast tumor as either benign or malignant based on cell descriptions gathered by microscopic examination. The dataset representing this problem contained 699 examples. Each example consisted of a nine-element real-valued vector. This is a two-class problem. All inputs are continuous; 65.5% of the examples are benign. This makes for an entropy of 0.93 bits per example. The input attributes are the clump thickness (A1), the uniformity of cell size (A2) and cell shape (A3), the amount of marginal adhesion (A4), single epithelial cell size (A5), bare nuclei (A6), bland chromatin (A7), normal nucleoli (A8), and mitosis (A9). This dataset was created based on the "Breast Cancer Wisconsin" problem dataset from the University of California, Irvine (UCI) repository of machine-learning databases. All inputs are normalized; to be more exact, A1, ..., A9 ∈ {0.1, 0.2, ..., 0.8, 0.9, 1.0}. The two outputs are complementary binary values, i.e., if the first output is 1, which means "benign," then the second output is 0. Otherwise, the first output is 0, which means "malignant," and the second output is 1.
The iris data: This is perhaps the best-known database to be found in the pattern recognition literature. The dataset contains three classes of 50 instances each, where each class refers to a type of Iris plant. Four attributes are used to predict the Iris class, i.e., sepal length (A1), sepal width (A2), petal length (A3), and petal width (A4), all in centimeters. Among the three classes, class 1 is linearly separable from the other two classes, and classes 2 and 3 are not linearly separable from each other. To ease knowledge extraction, we reformulate the data with three outputs, where class 1 is represented by {1, 0, 0}, class 2 by {0, 1, 0}, and class 3 by {0, 0, 1}.
The diabetes data: The objective of this problem is to diagnose whether a Pima Indian individual is diabetes positive or not based on his/her personal data, including age, number of times pregnant, and the results of medical examinations (e.g., blood pressure, body mass index, result of glucose tolerance test, etc.). There are 768 examples in the dataset, each of which consisted of an eight-element real-valued vector. This is a two-class problem. The eight attributes are number of times pregnant (A1), plasma glucose concentration (A2), blood pressure (A3), triceps skin fold thickness (A4), two-hour serum insulin (A5), body mass index (A6), diabetes pedigree function (A7), and age (A8). In this database, 268 instances are positive (output equals 1) and 500 instances are negative (output equals 0). All inputs are continuous and 65.1% of the examples are diabetes negative, giving an entropy of 0.93 bits per example. This dataset was created based on the "Pima Indians Diabetes" problem dataset from the UCI repository of machine learning databases.
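The three-output reformulation of the iris classes can be written as a small helper. This is a minimal sketch; the function name `one_hot` and the 1-based class index are assumptions made for illustration.

```python
def one_hot(class_index, n_classes=3):
    """Encode a 1-based class index as a target vector with a single 1."""
    return [1 if i == class_index else 0 for i in range(1, n_classes + 1)]

# Target vectors for the three iris classes.
targets = {c: one_hot(c) for c in (1, 2, 3)}
print(targets)  # {1: [1, 0, 0], 2: [0, 1, 0], 3: [0, 0, 1]}
```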
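The entropy figures quoted for the two-class medical datasets follow from the class proportions. The sketch below recomputes them with the Shannon entropy formula; the function name `class_entropy` is an assumption, while the proportions are those given in the dataset descriptions.

```python
import math

def class_entropy(proportions):
    """Shannon entropy, in bits, of a class distribution given as proportions."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)

breast_cancer = class_entropy([0.655, 0.345])      # 65.5% benign
diabetes = class_entropy([500 / 768, 268 / 768])   # 500 negative, 268 positive
print(round(breast_cancer, 2), round(diabetes, 2))  # 0.93 0.93
```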
The wine data: In a classification context, this is a well-posed problem with "well-behaved" class structures. It is a good dataset for first testing of a new classifier, but not very challenging. These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines. Number of instances is 178 and number of attributes is 13. All attributes are continuous. This was a three-class problem.
The season data: The season dataset contains discrete data only. There are 11 examples in the dataset, each of which consisted of three elements. These are weather, tree, and temperature. This was a four-class problem.
The golf-playing data: This is a small illustrative dataset that uses weather information to decide whether or not to play golf. There are 14 examples in the dataset, each of which consisted of four elements. The dataset has two nominal attributes, outlook (with values sunny, overcast, and rain) and windy (with values true and false), and two continuous-valued ones, temperature and humidity. The golf-playing dataset therefore contains both numeric and discrete data. This was a two-class problem.
The lenses data: This problem uses a database for fitting contact lenses. The database is complete and noise free and contains 24 examples. These examples highly simplified the problem. The attributes do not fully describe all the factors affecting the decision as to which type, if any, to fit. All attributes are nominal. This was a three-class problem: the patient should be fitted with hard contact lenses, with soft contact lenses, or with no contact lenses.
4.2 Experimental Setup

In this study, all datasets representing the problems were divided into two sets: the training set and the testing set. The numbers of examples in the training set and testing set were chosen to be the same as those in other works, in order to make the comparison with those works possible. The sizes of the training and testing datasets used in this study are as follows:
Breast cancer data: The first 350 examples are used for the training set and the remaining 349 for the testing set.
Iris data: The first 75 examples are used for the training set and the remaining 75 for the testing set.
Diabetes data: The first 384 examples are used for the training set and the remaining 384 for the testing set.
Wine data: The first 89 examples are used for the training set and the remaining 89 for the testing set.
Season data: The first 6 examples are used for the training set and the remaining 5 for the testing set.
Golf playing data: The first 7 examples are used for the training set and the remaining 7 for the testing set.
Lenses data: The first 12 examples are used for the training set and the remaining 12 for the testing set.
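The splits listed above can be reproduced with a simple slice. In this sketch the dataset sizes and training-set sizes are taken from the text, while the names `SPLITS` and `split_first_n` and the use of integer placeholders instead of real examples are assumptions.

```python
# (dataset size, training-set size) for each problem, as listed in the text.
SPLITS = {
    "breast cancer": (699, 350),
    "iris": (150, 75),
    "diabetes": (768, 384),
    "wine": (178, 89),
    "season": (11, 6),
    "golf playing": (14, 7),
    "lenses": (24, 12),
}

def split_first_n(examples, n_train):
    """Use the first n_train examples for training and the rest for testing."""
    return examples[:n_train], examples[n_train:]

for name, (total, n_train) in SPLITS.items():
    # Integer placeholders stand in for the real examples.
    train, test = split_first_n(list(range(total)), n_train)
    print(f"{name}: {len(train)} training, {len(test)} testing")
```

A split of this kind depends on the order of the examples in the file, which the text does not specify.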
4.3 Experimental Results
[Table 2], [Table 3], [Table 4], [Table 5], [Table 6], [Table 7], and [Table 8] show the ANN architectures produced by ERANN and the training epochs over 10 independent runs on the seven benchmark classification problems. The initial architecture was selected before applying the constructive algorithm, which was used to determine the number of nodes in the hidden layer. The intermediate architecture was the outcome of the constructive algorithm, and the final architecture was the outcome of the pruning algorithm used in ERANN. The minimum (min), maximum (max), average (mean), and standard deviation (S. dev) are shown for each architecture and for the training epochs of every dataset. The results show that ERANN can automatically determine compact ANN architectures. For example, for the breast cancer data, ERANN produces a compact architecture. The average numbers of nodes and connections were 6.8 and 5.8, respectively; in most of the 10 runs, 5 to 6 input nodes were pruned.
[Figure 6] and [Figure 7] show the smallest of the pruned networks over 10 runs for the breast cancer and diabetes problems. The pruned network for the breast cancer problem has only 1 hidden node and 5 connections. The accuracies of this network on the training data and testing data were 96.275% and 93.429%, respectively. In this example, only three input attributes, A1, A6, and A9, were important, and only three discrete values of the hidden node activation were needed to maintain the accuracy of the network. The discrete values found by the heuristic clustering algorithm were 0.987, −0.986, and 0.004. Of the 350 training patterns, 238 have the first value, 106 have the second value, and the remaining 6 have the third value. The weight of the connection from the hidden node to the first output node was 3.0354 and to the second output node was −3.0354.
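Replacing continuous hidden node activations by a few discrete values can be sketched as a greedy one-dimensional clustering. This is an illustrative stand-in, not necessarily the heuristic clustering algorithm used by ERANN: the function names, the threshold `eps`, and the activation values are all hypothetical.

```python
def cluster_activations(values, eps):
    """Greedy 1-D clustering: a value joins the first cluster whose running
    mean is within eps of it; otherwise it starts a new cluster.
    Returns (cluster means, member counts)."""
    sums, counts = [], []
    for v in values:
        for k in range(len(sums)):
            if abs(v - sums[k] / counts[k]) <= eps:
                sums[k] += v
                counts[k] += 1
                break
        else:  # no existing cluster is close enough
            sums.append(v)
            counts.append(1)
    means = [s / c for s, c in zip(sums, counts)]
    return means, counts

def discretize(value, means):
    """Replace an activation by the nearest cluster mean."""
    return min(means, key=lambda m: abs(value - m))

# Hypothetical activation values, for illustration only.
activations = [0.99, 0.98, -0.99, 0.01, 0.985, -0.98, 0.0]
means, counts = cluster_activations(activations, eps=0.3)
print(len(means), counts)  # 3 [3, 2, 2]
```

In the breast cancer example above, the analogous outcome is three discrete values covering 238, 106, and 6 of the 350 training patterns.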
The pruned network for the diabetes problem has only 2 hidden nodes. No input nodes were pruned by the pruning algorithm. One hidden node was pruned, as all the connections to and from this node were pruned. The accuracies on the training data and testing data were 76.30% and 75.52%, respectively. The weights of the connections from the first hidden node to the first and second output nodes were −1.153 and 1.153, respectively, and those from the second hidden node to the first and second output nodes were −32.078 and 32.084, respectively.
[Figure 8] shows the training error over time for the breast cancer problem. The training error decreased, remained almost constant for a long time after some training epochs, and then fluctuated. The fluctuation was due to the pruning process. Because the network was retrained after the pruning process was completed, the training error again settled at an almost constant value.
[Figure 9] and [Figure 10] show the training error over time for the diabetes problem. The training error decreased and then remained almost constant; after some training epochs, it decreased further when additional hidden nodes were added. The fluctuation was due to connection pruning, and the error finally settled at an almost constant value after the pruned network was retrained. The training error for the diabetes data with weight freezing is shown in [Figure 10]. Weight freezing is done when the error becomes constant, as shown in [Figure 10].
Figure 10: Training time error for diabetes problem with weight freezing.
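The condition "when the error becomes constant" can be made concrete with a plateau test on the training error history. The text does not give the exact criterion, so the window length, the tolerance, the function name `error_has_plateaued`, and the error values below are assumptions made for illustration.

```python
def error_has_plateaued(errors, window=5, tol=1e-3):
    """True when the training error changed by less than tol over the
    last `window` epochs. Both parameters are illustrative assumptions."""
    if len(errors) <= window:
        return False  # not enough history yet
    return abs(errors[-1 - window] - errors[-1]) < tol

# Hypothetical training error histories.
still_falling = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02]
flat = [0.5, 0.3, 0.2, 0.15, 0.1499, 0.1498, 0.1498, 0.1497, 0.1497, 0.1497]

print(error_has_plateaued(still_falling))  # False
print(error_has_plateaued(flat))           # True
```

A training loop could call such a test after each epoch and freeze the weights of existing hidden nodes once it returns true.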
4.4 Performance Metric
We chose the number of extracted rules, the average number of conditions per rule, and the classification accuracy as the performance metrics to evaluate the rules extracted by the proposed scheme. Accuracy measures the ability of the classifier to produce correct classifications. Accuracy is calculated as follows.
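A minimal sketch of the accuracy computation, assuming the usual definition of classification accuracy as the percentage of examples whose predicted class equals the actual class. The function name `accuracy_percent` and the label vectors are hypothetical.

```python
def accuracy_percent(predicted, actual):
    """Percentage of examples whose predicted class equals the actual class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

# Hypothetical labels: 74 of 75 test examples classified correctly.
actual = [0] * 75
predicted = [0] * 74 + [1]
print(round(accuracy_percent(predicted, actual), 2))  # 98.67
```

Under this definition, 74 correct classifications out of 75 give 98.67%, which coincides with the accuracy reported for the iris data if that figure was measured on the 75-example testing set.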

4.5 Extracted Rules
The number of rules extracted by ERANN, the average number of conditions per rule, and the accuracy of the rules are presented in [Table 9], but that table does not show the rules in terms of the original attributes. The aim of this subsection is to show what kinds of rules ERANN extracts in terms of the original attributes for the different problems. The number of conditions per rule and the number of rules extracted are also shown.
Table 9: Number of extracted rules, average number of conditions, and rules accuracies
The breast cancer data

[Table 9] shows the number of extracted rules, average conditions for a rule, and the rules accuracy for the seven benchmark problems. In most of the cases, ERANN produces fewer rules with better accuracy. It was observed that two to three rules were sufficient to solve the problems. On the other hand, one to three conditions for a rule were needed in all cases. The accuracies were 100% for three datasets including season, golf playing, and lenses classification. These datasets have a lower number of examples.
5. Comparisons
This section compares the experimental results of ERANN with the results of other works. A rule with many conditions is harder to understand than a rule with fewer conditions. Too many rules also hamper a human reader's understanding of the data under examination. In addition to understandability, rules that do not generalize are of little use. Hence, the comparison is performed along three dimensions: number of rules, average number of conditions per rule, and accuracy. The primary aim of this work is not an exhaustive comparison between ERANN and all other works, but an evaluation of ERANN in order to gain a deeper understanding of rule generation.
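Two of the three comparison dimensions can be read directly off a rule set. The sketch below assumes a simple representation in which each rule is a pair of a condition dictionary and a class label; the helper names and the example rules are hypothetical and are not the rules extracted by ERANN.

```python
def summarize_rules(rules):
    """Return (number of rules, average number of conditions per rule)."""
    n_rules = len(rules)
    avg_conditions = sum(len(conditions) for conditions, _ in rules) / n_rules
    return n_rules, avg_conditions

def classify(example, rules, default):
    """Return the class of a rule whose conditions all hold, else `default`.
    Assumes the rules do not conflict, so evaluation order does not matter."""
    for conditions, label in rules:
        if all(example[attr] == value for attr, value in conditions.items()):
            return label
    return default

# Hypothetical rule set over golf-like attributes, for illustration only.
rules = [
    ({"outlook": "overcast"}, "play"),
    ({"outlook": "sunny", "windy": False}, "play"),
]

print(summarize_rules(rules))  # (2, 1.5)
print(classify({"outlook": "rain", "windy": True}, rules, "don't play"))  # don't play
```

The third dimension, accuracy, is then the share of examples for which `classify` returns the actual class.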
[Table 10] compares the ERANN results for the breast cancer problem with those produced by the PMML [33], RAIS [34], NN RULES [29], DT RULES [29], C4.5 [49], NN-C4.5 [55], OC1 [55], and CART [56] algorithms. ERANN achieved the best performance with 96.28% accuracy. NN RULES was a close second with 96% accuracy, but ERANN generated two rules whereas NN RULES generated four.
Table 10: Performance comparison of ERANN with other algorithms for the breast cancer data
[Table 11] compares the ERANN results for the iris data with those produced by the PMML, NN RULES, DT RULES, BIO RE [26], Partial RE [26], and Full RE [26] algorithms. ERANN achieved 98.67% accuracy, and NN RULES was a close second with 97.33% accuracy. Here, ERANN and NN RULES extracted the same number of rules.
Table 11: Performance comparison of ERANN with other algorithms for iris data
[Table 12] compares the ERANN results for the diabetes data with those produced by the PMML, NN RULES, C4.5, NN-C4.5, OC1, and CART algorithms. ERANN achieved 76.56% accuracy, and NN-C4.5 was a close second with 76.4% accuracy. Due to its high noise level, the diabetes problem is one of the most challenging problems in our experiments. ERANN outperformed all the other algorithms.
Table 12: Performance comparison of ERANN with other algorithms for diabetes data
[Table 13] shows the ERANN results for the wine data. ERANN achieved 91.01% accuracy on the wine data by generating three rules. No detailed previous results were found for comparison on this dataset. [Table 14] compares the ERANN results for the season data with those produced by RULES [57] and X2R [27]. All three algorithms achieved 100% accuracy. This is possible because the number of examples is low. ERANN extracted five rules, whereas RULES extracted seven and X2R six.
Table 14: Performance comparison of ERANN with other algorithms for season data
[Table 15] compares the ERANN results for the golf playing data with those produced by RULES [57], RULES-2 [58], and X2R [27]. All four algorithms achieved 100% accuracy because of the low number of examples. ERANN extracted three rules, whereas RULES extracted eight and RULES-2 extracted 14. Finally, [Table 16] compares the ERANN results for the lenses data with those produced by PRISM [59]. Both algorithms achieved 100% accuracy because of the low number of examples. ERANN extracted eight rules, whereas PRISM extracted nine.
Table 15: Performance comparison of ERANN with other algorithms for golf playing data
Table 16: Performance comparison of ERANN with other algorithm for lenses data
[Figure 11] shows a graphical comparison of the number of rules for various algorithms. In most cases, ERANN extracted fewer rules than the other works on the seven benchmark classification problems. [Figure 12] shows a graphical comparison of the number of conditions per rule for various algorithms. The number of conditions per rule is again encouraging. ERANN and NN RULES emphasize the use of parallel features, while DT RULES focuses on individual features; that is why the rules extracted by ERANN and NN RULES have more conditions than those extracted by DT RULES.
Figure 12: Comparison of number of conditions per rule for various algorithms.
6. Conclusions
This work is an attempt to open up the black boxes of ANNs by generating symbolic rules from them through the proposed rule extraction algorithm, ERANN. The algorithm can extract concise rules from a standard feedforward ANN. An important feature of the rule extraction algorithm is its recursive nature. The rules are concise, comprehensible, order insensitive, and do not involve any weight values. The accuracy of the rules from a pruned network is as high as the accuracy of the fully connected network. Extensive experiments have been carried out in this study to evaluate how well ERANN performed on seven benchmark classification problems, namely breast cancer, iris, diabetes, wine, season, golf playing, and lenses classification, in comparison with other algorithms. In almost all cases, ERANN outperformed the others.
7. Acknowledgment
This work was supported by Hankuk University of Foreign Studies Research Fund of 2012.
References
| 1. | K Saito, and R Nakano, "Medical diagnosis expert system based on PDP model," in Proceedings of IEEE International Conference on Neural Networks, New York: IEEE Press; pp. 1255-62, 1988.  |
| 2. | R Setiono, W K Leow, and J M Zurada, "Extraction of rules from artificial neural networks for nonlinear regression," IEEE Transactions on Neural Networks, Vol. 13, pp. 564-77, 2002.  |
| 3. | B Baesens, R Setiono, C Mues, and J Vanthienen, "Using neural network rule extraction and decision tables for credit-risk evaluation," J Management Science, Vol. 49, no. 3, pp. 312-29, 2003.  |
| 4. | T Hou, C Su, and H Chang, "Using neural networks and immune algorithms to find the optimal parameters for an IC wire bonding process," Expert System with Applications, Vol. 34, pp. 427-36, 2008.  |
| 5. | T Q Huynh, and J A Reggia, "Guiding hidden layer representations for improved rule extraction from neural networks," IEEE Transactions on Neural Networks, Vol. 22, no. 2, pp. 264-75, 2011.  |
| 6. | R Andrews, J Diederich, and A B Tickle, "Survey and critique of techniques for extracting rules from trained artificial neural networks," Knowledge Based System, Vol. 8, no. 6, pp. 373-89, 1995.  |
| 7. | S H Huang, and H Xing, "Extract intelligible and concise fuzzy rules from neural networks," Fuzzy Sets and Systems, Vol. 132, no. 2, pp. 233-43, 2002.  |
| 8. | A Darbari, "Rule extraction from trained ANN: A survey," Technical Report, Department of Computer Science, Dresden University of Technology, Dresden, Germany, 2000.  |
| 9. | Z H Zhou, "Rule extraction: Using neural networks or for neural networks?" Journal of Computer Science and Technology, Vol. 19, no. 2, pp. 249-53, 2004.  |
| 10. | H Jacobsson, "Rule extraction from recurrent neural networks: A taxonomy and review," Neural Computation, Vol. 17, no. 6, pp. 1223-63, 2005.  |
| 11. | E R Hruschka, and N F Ebecken, "Extracting rules from multilayer perceptrons in classification problems: A clustering-based approach," Neurocomputing, Vol. 70, pp. 384-97, 2006.  |
| 12. | S Bader, S Holldobler, and V Mayer-Eichberger, "Extracting propositional rules from feed-forward neural networks: A new decompositional approach," in Proceedings of 3rd International Workshop on Neural-Symbolic Learning and Reasoning, pp. 1-6, 2007.  |
| 13. | R Setiono, B Baesens, and C Mues, "Recursive neural network rule extraction for data with mixed attributes," IEEE Transactions on Neural Networks, Vol. 19, no. 2, pp. 299-307, 2008.  |
| 14. | R Nayak, "Generating rules with predicates, terms and variables from the pruned neural networks," Neural Network, Vol. 22, no. 4, pp. 405-14, May 2009.  |
| 15. | J Guerreiro, and D Trigueiros, "A unified approach to the extraction of rules from artificial neural networks and support vector machines," in Proceedings of International Conference Advanced Data Mining and Applications, ADMA Part II, LNCS 6441, pp. 34-42, 2010.  |
| 16. | J Chorowski, and J M Zurada, "Extracting rules from neural networks as decision diagrams," to appear in the IEEE Transactions on Neural Networks, 2011.  |
| 17. | I Taha, and J Ghosh, "Three techniques for extracting rules from feedforward networks," Intelligent Engineering Systems Through Artificial Neural Networks, Vol. 6, pp. 23-8, 1996.  |
| 18. | H Tsukimoto, "Extracting rules from trained neural networks," IEEE Transactions on Neural Networks, Vol. 11, no. 2, pp. 377-89, Mar. 2000.  |
| 19. | E W Saad, and D C II Wunsch, "Neural network explanation using inversion," Neural Networks, Vol. 20, pp. 78-93, 2007.  |
| 20. | R Setiono, and H Liu, "Neurolinear: From neural networks to oblique decision rules," Neurocomputing, Vol. 17, no. 1, pp. 1-24, 1997.  |
| 21. | A S d'Avila Garcez, K Broda, and D M Gabbay, "Symbolic knowledge from trained neural networks: A sound approach," Artificial Intelligence, Vol. 125, no. 1, pp. 155-207, 2001.  |
| 22. | E Keedwell, A Narayanan, and D Savic, "Creating rules from trained neural networks using genetic algorithms," International Journal of Computers Systems and Signals, Vol. 1, no. 1, pp. 30-42, 2000.  |
| 23. | G G Towell, and J W Shavlik, "Extracting refined rules from knowledge-based system neural networks," Machine Learning, Vol. 13, pp. 71-101, 1993.  |
| 24. | L Fu, "Rule learning by searching on adapted nets," in Proceedings of the 9th National Conference on Artificial Intelligence, The MIT Press, Menlo Park, CA, pp. 590-5, 1991.  |
| 25. | G G Towell, and J W Shavlik, "Knowledge-based artificial neural networks," Artificial Intelligence, Vol. 70, pp. 119-65, 1994.  |
| 26. | M W Craven, and J W Shavlik, "Using sampling and queries to extract rules from trained neural networks," in Proceedings of the 11th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, CA, 1994.  |
| 27. | H Liu, and S T Tan, "X2R: A fast rule generator," in Proceedings of IEEE International Conference on Systems, Man and Cybernetics, Vancouver, CA, 1995.  |
| 28. | R Setiono, and H Liu, "Understanding neural networks via rule extraction," in Proceedings of the 14th International Joint Conference on Artificial Intelligence, 1995, pp. 480-5.  |
| 29. | R Setiono, and H Liu, "Symbolic presentation of neural networks," IEEE Computer, pp. 71-77, 1996.  |
| 30. | R Setiono, and W K Leow, "FERNN: An algorithm for fast extraction of rules from neural networks," Applied Intelligence, Vol. 12, pp. 15-25, 2000.  |
| 31. | S M Kamruzzaman, and M M Islam, "An algorithm to extract rules from artificial neural networks for medical diagnosis problems," International Journal of Information Technology, Vol. 12, no. 8, pp. 41-59, 2006.  |
| 32. | T A Etchells, and P J Lisboa, "Orthogonal search-based rule extraction (OSRE) for trained neural networks: A practical and efficient approach," IEEE Transactions on Neural Networks, Vol. 17, no. 2, pp. 374-84, Mar. 2006.  |
| 33. | Y Jin, and B Sendhoff, "Pareto-based multiobjective machine learning: An overview and case studies," IEEE Transactions on System Man and Cybernetics.-Part C: Applications and Reviews, Vol. 38, no. 3, pp. 397-415, 2008.  |
| 34. | H Kahramanli, and N Allahverdi, "Rule extraction from trained adaptive neural networks using artificial immune systems," Expert Systems with Applications, Vol. 36, pp. 1513-22, 2009.  |
| 35. | T Y Kwok, and D Y Yeung, "Constructive algorithms for structured learning in feedforward neural networks for regression problems," IEEE Transactions on Neural Networks, Vol. 8, pp. 630-45, 1997.  |
| 36. | M M Islam, X Yao, and K Murase, "A constructive algorithm for training cooperative neural network ensembles," IEEE Transactions on Neural Networks, Vol. 14, pp. 820-34, 2003.  |
| 37. | R Parekh, J Yang, and V Honavar, "Constructive neural network learning algorithms for pattern classification," IEEE Transactions on Neural Networks, Vol. 11, pp. 436-52, 2000.  |
| 38. | T Ash, "Dynamic node creation in backpropagation networks," Connection Science, Vol. 1, pp. 365-75, 1989.  |
| 39. | S E Fahlman, and C Lebiere, "The cascade-correlation learning architecture," Advances in Neural Information Processing System, In: D S Touretzky, Editor. San Mateo, CA: Morgan Kaufmann; pp. 524-32, 1990.  |
| 40. | R Setiono, and L C Hui, "Use of quasi-Newton method in a feedforward neural network construction algorithm," IEEE Transactions on Neural Networks, Vol. 6, pp. 273-7, 1995.  |
| 41. | M M Islam, M A Akhand, M A Rahman, and K Murase, "Weight freezing to reduce training time in designing artificial neural networks," in Proceedings of 5th International Conference on Computer and Information Technology, EWU, Dhaka, pp. 132-6, 2002.  |
| 42. | M M Islam, and K Murase, "A new algorithm to design compact two hidden-layer artificial neural networks," Neural Networks, Vol. 14, pp. 1265-78, 2001.  |
| 43. | J Sietsma, and RJ Dow, "Neural net pruning-why and how?," in Proceedings of IEEE International Conference on Neural Networks, Vol. 1, pp. 325-33, 1988.  |
| 44. | A V Ooyen, and B Nienhuis, "Improving the convergence of backpropagation algorithm," Neural Networks, Vol. 5, pp. 465-71, 1992.  |
| 45. | R Reed, "Pruning algorithms-A survey," IEEE Transactions on Neural Networks, Vol. 4, pp. 740-7, 1993.  |
| 46. | J Han, and M Kamber, "Data Mining: Concepts and Techniques," CA: Morgan Kaufmann Publisher: 2001.  |
| 47. | L Kaufman, and P J Rousseeuw, "Finding Groups in Data: An Introduction to Cluster Analysis," Hoboken, New Jersey: John Wiley and Sons; 1990.  |
| 48. | T N Raymond, and J Han, "Efficient and effective clustering methods for spatial data mining," in Proceedings of VLDB Conference, Santiago, Chile, pp. 144-55, 1994.  |
| 49. | J R Quinlan, "C4.5: Programs for Machine Learning," San Mateo, CA: Morgan Kaufmann; 1993.  |
| 50. | S Russell, and P Norvig, "Artificial Intelligence: A Modern Approach," Upper Saddle River, New Jersey: Prentice Hall; 1995.  |
| 51. | R Agrawal, T Imielinski, and A Swami, "Database mining: A performance perspective," IEEE Transactions on Knowledge and Data Engineering, Vol. 5, pp. 914-25, 1993.  |
| 52. | S J Yen, and A L Chen, "An efficient algorithm for deriving compact rules from databases," in Proceedings of the 4th International Conference on Database Systems for Advanced Applications, pp. 364-71, 1995.  |
| 53. | P M Murphy, and D W Aha, UCI Repository of Machine Learning Databases, (Machine-Readable Data Repository); Department of Information and Computer Science, University of California, Irvine, CA, USA, 1998. Available online: http://archive.ics.uci.edu/ml/. [Last cited on 2010 Jan 01].  |
| 54. | L Prechelt, "PROBEN1-A set of neural network benchmark problems and benchmarking rules," Fakultät Inf., Univ. Karlsruhe, Karlsruhe, Germany, Technical Report, 1994.  |
| 55. | R Setiono, "Techniques for extracting rules from artificial neural networks," Plenary lecture presented at the 5th International Conference on Soft Computing and Information Systems, Iizuka, Japan, October 1998.  |
| 56. | L Breiman, J Friedman, R Olshen, and C Stone, "Classification and Regression Trees," Wadsworth and Brooks, Monterey, CA, 1984.  |
| 57. | D T Pham, and M S Aksoy, "Rules: A simple rule extraction system," Expert Systems with Applications, Vol. 8, pp. 59-65, 1995.  |
| 58. | D T Pham, and M S Aksoy, "An algorithm for automatic rule induction," Artificial Intelligence in Engineering, Vol. 8, pp. 277-82, 1994.  |
| 59. | J Cendrowska, "PRISM: An algorithm for inducting modular rules," International Journal of Man-Machine Studies, Vol. 27, pp. 349-70, 1987.  |
Authors
S. M. Kamruzzaman received the B. S. in Electrical and Electronic Engineering from Dhaka University of Engineering and Technology, Bangladesh, in 1997 and the M. S. in Computer Science and Engineering from Bangladesh University of Engineering and Technology in 2005. Since September 2007, he has been working toward his Ph.D. degree in Electronics Engineering at Hankuk University of Foreign Studies, Korea. From March 1998 to December 2004, he was a lecturer and an assistant professor at the Department of Computer Science and Engineering (CSE), International Islamic University Chittagong, Bangladesh. From January 2005 to July 2006, he was an assistant professor at the CSE Department, Manarat International University, Bangladesh. In August 2006, he moved to the Department of Information and Communication Engineering as an assistant professor at the University of Rajshahi, Bangladesh. His research interests include neural networks, data mining, communication protocols, ad hoc networks, and cognitive radio networks. He is a student member of the IEEE.
Md. Abdul Hamid received his Bachelor of Engineering degree in Computer and Information Engineering in 2001 from International Islamic University Malaysia (IIUM). In 2002, he joined the Computer Science and Engineering department, Asian University of Bangladesh, Dhaka, as a lecturer. He received the Ph.D. degree from the Computer Engineering department of Kyung Hee University, South Korea, in August 2009. In September 2009, he joined the department of Information and Communications Engineering at Hankuk University of Foreign Studies (HUFS), South Korea, as a lecturer. Currently, he is working as an assistant professor in the same department at HUFS. He is a TPC member of TNS-2011 and ICCIT-2011, and a member of KSII (Korean Society for Internet Information). His research interests include artificial intelligence and wireless sensor, mesh, ad hoc, and opportunistic networks.
A. M. Jehad Sarkar received B.S. and M.S. degrees in Computer Science from National University, Bangladesh, in 1999 and 2000, respectively. He received the Ph.D. in Computer Engineering from Kyung Hee University, Korea, in 2010. Since 2012, he has been an Assistant Professor at the Department of Digital Information Engineering, Hankuk University of Foreign Studies, Korea. From May 2000 to August 2006, he served as a Principal Software Engineer in ReliSource Technologies Ltd., Bangladesh. From April 2000 to April 2005, he served as a Senior Software Engineer in TigerIT Ltd., Bangladesh. His research interests are Activity Recognition, Web mining, Data mining, and Cloud computing.