Digital Library
 
  Optimal bin number for equal frequency discretizations in supervized learning
 
 
Title: Optimal bin number for equal frequency discretizations in supervized learning
Author: Marc Boulle
Published in: Intelligent data analysis
Pagination: Volume 9 (2005), no. 2, pages 175-188
Publication date: 2005-05-23
Abstract: While real data often comes in mixed format, discrete and continuous, many supervised induction algorithms require discrete data. Although efficient supervised discretization methods are available, the unsupervised Equal Frequency discretization method is still widely used by the statistician both for data exploration and data preparation. In this paper, we propose an automatic method, based on a Bayesian approach, to optimize the number of bins for Equal Frequency discretizations in the context of supervised learning. We introduce a space of Equal Frequency discretization models and a prior distribution defined on this model space. This results in the definition of a Bayes optimal evaluation criterion for Equal Frequency discretizations. We then propose an optimal search algorithm whose run-time is super-linear in the sample size. Extensive comparative experiments demonstrate that the method works quite well in many cases.
Publisher: IOS Press
Source: Elektronische Wetenschappelijke Tijdschriften (Electronic Scientific Journals)
 
 

 
 Koninklijke Bibliotheek - National Library of the Netherlands