Digital Library
 
 
  Iterative cross-training: An algorithm for web page categorization
 
 
Title: Iterative cross-training: An algorithm for web page categorization
Authors: Nuanwan Soonthornphisaj
Boonserm Kijsirikul
Published in: Intelligent data analysis
Pagination: Volume 7 (2003), no. 3, pages 233-253
Year: 2003-08-05
Abstract: The goal of Web page categorization is to classify Web documents into a certain number of predefined categories. Previous work in this area employed a large number of labeled training documents for supervised learning. The problem is that labeled training documents are difficult to create. Although manually categorizing documents to create training data is laborious, unlabeled documents are easy to collect. Therefore, a new machine learning algorithm is investigated to overcome these difficulties and make effective use of unlabeled documents. We propose a novel approach called Iterative Cross-Training (ICT). In this paper, we apply the algorithm to Web page categorization on three data sets. The performance of ICT was evaluated and compared with that of supervised learning algorithms, Co-Training, and Expectation Maximization. We found ICT to be an effective approach for the Web page categorization task.
Publisher: IOS Press
Source file: Elektronische Wetenschappelijke Tijdschriften (Electronic Scientific Journals)
 
 

 
 Koninklijke Bibliotheek - National Library of the Netherlands