Digital Library
 
 
  Detecting noisy instances with the rule-based classification model
 
 
Title: Detecting noisy instances with the rule-based classification model
Authors: Taghi M. Khoshgoftaar
Naeem Seliya
Kehan Gao
Published in: Intelligent data analysis
Pagination: Volume 9 (2005), no. 4, pages 347-364
Date: 2005-08-29
Abstract: The performance of a classification model is invariably affected by the characteristics of the measurement data it is built on. If the quality of the data is generally poor, the classification model will perform poorly. The number of noisy instances present in a given dataset is a good reflection of the quality of the data. Detecting and removing noisy data instances will improve the quality of the data, and consequently the performance of the classification model. This study presents an attractive and user-friendly approach for detecting data noise based on Boolean rules generated from the measurement data. The approach follows a simple and replicable procedure that analyzes the rules to detect mislabeled noisy instances in the training dataset. Such instances are treated as data noise and are removed to obtain a clean dataset. A case study of a software measurement dataset with known noisy instances is used to demonstrate the effectiveness of the approach. The dataset is obtained from a NASA software project developed for real-time predictions based on simulations. It is empirically demonstrated that the proposed approach is extremely effective in detecting noise in the dataset; in fact, it detected 100% of the known noisy instances. The proposed approach is compared with noise filtering based on five classification filters and an ensemble filter of five classifiers. We also demonstrate that the proposed approach shows excellent promise in detecting noisy instances in six independent, real-world software measurement datasets with unknown noisy instances.
Publisher: IOS Press
Source file: Elektronische Wetenschappelijke Tijdschriften
 
 

 
 Koninklijke Bibliotheek - National Library of the Netherlands