Digital Library
 
 
  Class noise detection using frequent itemsets
 
 
Title: Class noise detection using frequent itemsets
Authors: Van Hulse, Jason; Khoshgoftaar, Taghi M.
Published in: Intelligent data analysis
Pagination: Volume 10 (2006), no. 6, pages 487-507
Date: 2006-11-27
Abstract: The presence of a substantial number of noisy instances in a given dataset may adversely affect the hypothesis learnt from that data. Removing noisy instances prior to the construction of a classifier has been shown to improve the classification ability of a learner on new data. This paper introduces a novel technique for identifying observations with class noise in a dataset using frequent itemsets. For the given dataset, each instance is assigned a NoiseFactor, indicating a relative likelihood that it contains class noise. A frequent itemset is a set of instances with common attribute values which contains at least as many instances as a user-defined minimum support threshold. Consequently, the set of frequent itemsets contains information related to the structure and dependence between the attributes. Each frequent itemset is assigned a class, based on the proportion of instances within the itemset from each class. Instances that are contained in itemsets that have a large proportion of instances from the other class are identified as noisy. The technique proposed in this paper is analyzed in numerous case studies using real-world software measurement datasets with either inherent or injected noise. A comparison is provided with two well-known techniques for the identification of class noise: Classification Filter and Ensemble Filter. The results demonstrate that this new algorithm is very effective at identifying instances with class noise.
Publisher: IOS Press
Source: Elektronische Wetenschappelijke Tijdschriften (Electronic Scientific Journals)
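
Illustration: the abstract describes the idea only in outline, so the following is a minimal sketch under stated assumptions, not the authors' algorithm. It assumes categorical attributes, enumerates itemsets by brute force rather than with a dedicated frequent-itemset miner, and scores each instance as the mean share of opposite-class instances across the frequent itemsets that cover it. That scoring rule, the function name noise_factors, and the parameters min_support and max_len are hypothetical; the paper's actual NoiseFactor definition is not given in this record.

```python
from collections import defaultdict
from itertools import combinations


def noise_factors(rows, labels, min_support=2, max_len=2):
    """Score each instance by how often it disagrees with its itemsets.

    Assumption (not from the paper): the score is the mean fraction of
    differently-labelled instances over all frequent itemsets covering
    the instance. Higher means more likely to carry class noise.
    """
    # Map each itemset, a tuple of (attribute_index, value) pairs,
    # to the indices of the instances that contain it.
    cover = defaultdict(list)
    for i, row in enumerate(rows):
        items = list(enumerate(row))
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                cover[itemset].append(i)

    totals = [0.0] * len(rows)
    counts = [0] * len(rows)
    for members in cover.values():
        # Keep only itemsets that meet the minimum support threshold.
        if len(members) < min_support:
            continue
        for i in members:
            other = sum(1 for j in members if labels[j] != labels[i])
            totals[i] += other / len(members)
            counts[i] += 1

    # Instances covered by no frequent itemset get a score of 0.0.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Toy data: instance 2 shares attribute values with instances 0 and 1
# but carries the other class label.
rows = [("a", "x"), ("a", "x"), ("a", "x"),
        ("b", "y"), ("b", "y"), ("b", "y")]
labels = [0, 0, 1, 1, 1, 1]
scores = noise_factors(rows, labels)
```

On this toy data the highest score falls on instance 2, the one whose label disagrees with the other instances sharing its attribute values. Brute-force enumeration grows quickly with the number of attributes, so anything beyond a small example would need a proper frequent-itemset miner.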
 
 

 
 Koninklijke Bibliotheek - National Library of the Netherlands