Weighted Rule-Based Algorithmic Tool for Image Classification

Harpreet Singh1, Sumeet Dua1, Hilary W. Thompson2

1Data Mining Research Laboratory, Louisiana Tech University, Ruston, LA 71270, USA. 2LSU Eye Center, LSU Health Sciences Center, New Orleans, LA 70112-2234.
{pradeep, sdua}@latech.edu, {hthomp2}@lsuhsc.edu

Introduction

AIMS (Associative Image Classification System) is an association-rule-based image classification and image retrieval tool. The objective of image classification is to assign a class to each image by grouping images with homogeneous characteristics on the basis of gray or color level. AIMS provides optional user interfaces for performing real-time classification on regular camera images or on digital mammograms. Users can choose any image already in the database or provide a new query image for classification. The tool uses a weighted association-rule-based algorithm, WAR-BC [1], for feature extraction and classification. To demonstrate the advantage of the WAR-BC algorithm, the user is also given the option of classifying with raw (texture) features using different classifiers. Two types of texture features, wavelet-based and statistical co-occurrence-based (Haralick), are extracted to test the scalability and robustness of the tool.

Methodology

Each image is divided into non-overlapping segments of size N×N (N varies from domain to domain). Once the image has been segmented into blocks, texture features based on wavelets or on the co-occurrence matrix are extracted from each segment, depending on the type of image (color or grayscale). To extract wavelet features, a wavelet decomposition of each segment is performed and moments of the coefficients of the HH, HL, and LH bands are computed, yielding three wavelet features [2]. For color images, these three wavelet features are combined with the mean red, green, and blue values of the segment, giving a feature vector of length k=6. For co-occurrence-based features, eight of the fourteen Haralick features [3] are used, giving a feature vector of length k=8. Each vector is given a unique segment ID, used as its transaction ID (TID), which in our case is the number of the segment from which the features were extracted, e.g. TID 1 (f1, f2, f3, …, fk) and TID 2 (f1, f2, f3, …, fk), where k=6 or k=8. Fig. 1 shows the feature extraction process. Table 1 lists the texture features used in AIMS for classification.
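The wavelet branch of the feature extraction described above can be sketched as follows. This is a minimal illustration, not the AIMS implementation: it assumes a single-level Haar wavelet, takes the "moment" of a band to be the square root of its mean squared coefficient, and uses hypothetical function names. The wavelet family, decomposition level, and exact moment used by the tool are not stated on this page.

```python
import numpy as np

def split_into_blocks(image, n):
    """Yield (segment_id, block) for non-overlapping n x n blocks in row-major order.
    Rows and columns that do not fill a whole block are dropped (an assumption)."""
    rows, cols = image.shape[:2]
    seg_id = 1
    for r in range(0, rows - n + 1, n):
        for c in range(0, cols - n + 1, n):
            yield seg_id, image[r:r + n, c:c + n]
            seg_id += 1

def haar_detail_bands(block):
    """One-level 2-D Haar transform of a block with even side length.
    Returns the three detail bands (band naming conventions vary between texts)."""
    b = block.astype(float)
    lo = (b[:, 0::2] + b[:, 1::2]) / 2.0   # low-pass across column pairs
    hi = (b[:, 0::2] - b[:, 1::2]) / 2.0   # high-pass across column pairs
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return hl, lh, hh

def band_moment(band):
    """Assumed moment: square root of the mean squared coefficient."""
    return float(np.sqrt(np.mean(band ** 2)))

def segment_features(rgb_block):
    """Feature vector of length k=6: three wavelet moments plus mean R, G, B."""
    gray = rgb_block.mean(axis=2)
    wavelet = [band_moment(b) for b in haar_detail_bands(gray)]
    color = [float(rgb_block[:, :, ch].mean()) for ch in range(3)]
    return wavelet + color

def image_to_transactions(rgb_image, n=4):
    """Map each segment ID (TID) to its feature vector."""
    return {tid: segment_features(block)
            for tid, block in split_into_blocks(rgb_image, n)}
```

Calling `image_to_transactions` on an 8×8 color image with `n=4` yields four transactions (TID 1 to TID 4), each holding six features. In practice the continuous feature values would still need to be discretized into items before rule mining, a step this sketch leaves out.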

 

 

Once the features have been extracted from each segment, the image can be treated as a transaction database in which each transaction (one row of the database) holds the features extracted from one segment. The next step is to uncover the isomorphisms present by using association rules. An association rule has the form f1(1134), f2(2124) → f8(8074), with, for example, Support = 40% and Confidence = 80%, where support and confidence are given by the following formulas.

Support {f1(1134), f2(2124) → f8(8074)} = (No. of transactions containing f1(1134), f2(2124), and f8(8074)) / (Total no. of transactions)

Confidence {f1(1134), f2(2124) → f8(8074)} = (No. of transactions containing f1(1134), f2(2124), and f8(8074)) / (No. of transactions containing f1(1134) and f2(2124))
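As a worked illustration of the two formulas, the following sketch computes support and confidence over a small, made-up transaction database. The item strings other than those in the rule above (for example `f8(8100)`) are invented purely so that the example reproduces the 40% and 80% figures, and the function names are hypothetical.

```python
def count_containing(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t)

def support(transactions, antecedent, consequent):
    """Fraction of all transactions containing antecedent and consequent together."""
    both = count_containing(transactions, set(antecedent) | set(consequent))
    return both / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, the fraction that also
    contain the consequent."""
    ante = count_containing(transactions, antecedent)
    both = count_containing(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0

# Ten illustrative transactions (one per image segment).
transactions = (
    [{"f1(1134)", "f2(2124)", "f8(8074)"}] * 4    # antecedent and consequent
    + [{"f1(1134)", "f2(2124)", "f8(8100)"}]      # antecedent only
    + [{"f1(1200)", "f2(2200)", "f8(8074)"}] * 5  # antecedent absent
)

antecedent = {"f1(1134)", "f2(2124)"}
consequent = {"f8(8074)"}
rule_support = support(transactions, antecedent, consequent)        # 4 / 10
rule_confidence = confidence(transactions, antecedent, consequent)  # 4 / 5
```

Four of the ten transactions contain all three items, so the support is 40%. Five contain the antecedent, so the confidence is 4/5 = 80%.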


Rules from the individual images in each class are combined to form a class-level rule set. These class-level rule sets are then combined to form a global rule set. Weights are assigned to the rules according to their intra-class (vertical) and inter-class (horizontal) presence. Fig. 2 shows the overall algorithmic strategy behind the tool.

 

To classify a query image, its rules are matched against the rule set, the weights of the matching rules are calculated, and their scores are added. The image is assigned to the class with the highest cumulative sum. Fig. 3 shows the query mechanism behind the tool.
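The weighting and scoring steps can be sketched as below. The weighting formula here is an illustrative assumption, not the WAR-BC formula, which is defined in [1] and not reproduced on this page. In this sketch the vertical term is the fraction of a class's images whose rule set contains the rule, and the horizontal term is one over the number of classes in which the rule appears. All names are hypothetical.

```python
from collections import defaultdict

def build_weights(class_image_rules):
    """class_image_rules maps class name -> list of per-image rule sets.
    Returns {(class, rule): weight} using an ASSUMED weighting scheme."""
    vertical = {}
    classes_with_rule = defaultdict(set)
    for cls, image_rule_sets in class_image_rules.items():
        counts = defaultdict(int)
        for rules in image_rule_sets:
            for rule in rules:
                counts[rule] += 1
        for rule, n in counts.items():
            # Intra-class (vertical) presence: share of the class's images with the rule.
            vertical[(cls, rule)] = n / len(image_rule_sets)
            classes_with_rule[rule].add(cls)
    # Inter-class (horizontal) presence: rules shared by many classes count less.
    return {(cls, rule): v / len(classes_with_rule[rule])
            for (cls, rule), v in vertical.items()}

def classify(query_rules, weights, classes):
    """Sum the weights of matching rules per class; return the best class and all scores."""
    scores = {cls: sum(weights.get((cls, r), 0.0) for r in query_rules)
              for cls in classes}
    return max(scores, key=scores.get), scores

# Toy training data: rules are opaque strings here.
training = {
    "normal": [{"A->B", "C->D"}, {"A->B"}],
    "benign": [{"C->D", "E->F"}, {"E->F"}],
}
weights = build_weights(training)
label, scores = classify({"A->B", "C->D"}, weights, list(training))
```

In this toy example the query's rules accumulate a score of 1.25 for "normal" and 0.25 for "benign", so the query is assigned to "normal". Any other monotone combination of the vertical and horizontal terms would fit the same scoring loop.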

 

Results

The base algorithm used for AIMS, WAR-BC, has been tested on two image domains: scenic data and medical data. A well-known mammography data set, MIAS [4], is used for the mammogram classification experiments. It consists of 322 mammograms, of which 208 are normal, 63 are benign, and 51 are malignant. To make an accurate comparison with existing techniques, we use the same training/testing split (90/10). Our technique (WAR-BC) outperforms the others in the 10-fold setting. For scenic images we used a subset of the Corel image data set with 10 classes of 100 images each, for a total of 1000 images. Fig. 4 and Fig. 5 show the results on the medical data set, while Fig. 6 and Fig. 7 show the results for the scenic data.

 

Tool Functionality and GUI

The tool, which was developed in MATLAB, has a simple, user-friendly graphical user interface (GUI), and the user does not need any background knowledge to use it. The first screen lets the user choose between mammogram and scene classification. On choosing mammogram classification, the user is shown three images, one from each of the classes normal, benign, and malignant, and can select any one of them as the query image. The tool classifies this query image and returns the images closest to it from the mammogram database. The user can then pick any of these returned images as a new query image, or go back to the main menu and start with a query image from another class. For natural scene classification the user is provided with three options:
  1. Option 1: This option lets the user choose an image from a class on which the algorithm has been trained beforehand. The WAR-BC algorithm has already been trained on 10 classes with 50 images each. The query images offered to the user also come from these 10 classes and are picked at random from a pool of 500 images (50 from each of the 10 classes) that were not used during training. The user clicks on any of the representative images to use it as a query; the tool then classifies this image and returns the images closest to it from the training image database.
  2. Option 2: In this option the user can choose an image from a class on which the algorithm has not been trained. The query images used here are picked at random from another data set of 300 images (3 classes with 100 images each), and the tool is forced to classify each of them into one of the 10 base classes used in Option 1. This option shows the scalability of the tool.
  3. Option 3: This option lets the user provide a completely new query image and watch each part of the tool's classification module work on it. The tool extracts features, generates rules, and classifies the image into one of the ten base classes. Here the user is also given the option of using raw features instead of association rules for classification with different classifiers.

Below is the workflow of AIMS:

 

Download

The AIMS tutorial and MATLAB executable are available for download.

References

  • [1] S. Dua, H. Singh, and H. Thompson, “Associative classification of mammograms using weighted rules,” Expert Systems with Applications, vol. 36, no. 5, pp. 9250-9259, 2009.
  • [2] Y. Chen, J. Bi, and J. Z. Wang, “MILES: Multiple-Instance Learning via Embedded Instance Selection,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 1931-1947, 2006.
  • [3] R. M. Haralick, K. Shanmugam, and I. Dinstein, “Textural features for image classification,” IEEE Trans. on SMC, IEEE SMC Society, Piscataway, NJ, Nov. 1973, pp. 610-621.
  • [4] MIAS Database, The PCCV Project: Benchmarking Vision Systems, http://peipa.essex.ac.uk/info/mias.htm.

This site is maintained by the Data Mining Research Laboratory. Webmaster: Alan E. Alex & Image Master: Pradeep Chowriappa