  Protein sequence analysis using relational soft clustering algorithms
 
 
Title: Protein sequence analysis using relational soft clustering algorithms
Authors: Maji, Pradipta; Pal, Sankar K.
Appeared in: International journal of computer mathematics
Paging: Volume 84 (2007), no. 5, pages 599-617
Year: 2007-05
Contents: To recognize functional sites within a protein sequence, the non-numerical attributes of the sequence need encoding prior to using a pattern recognition algorithm. The success of recognition depends on the efficient coding of the biological information contained in the sequence. In this regard, a bio-basis function maps a non-numerical sequence space to a numerical feature space, based on an amino acid mutation matrix. In effect, the biological content in a sequence can be maximally utilized for analysis. One of the important issues for the bio-basis function is how to select a minimum set of bio-bases with maximum information. In this paper, we present two relational soft clustering algorithms, named rough c-medoids and fuzzy-possibilistic c-medoids, to select the most informative bio-bases. While both fuzzy and possibilistic memberships of fuzzy-possibilistic c-medoids avoid the noise sensitivity defect of fuzzy c-medoids and the coincident clusters problem of possibilistic c-medoids, the concept of lower and upper boundaries of rough c-medoids deals with uncertainty, vagueness, and incompleteness in class definition of biological data. The concept of 'degree of resemblance', based on non-gapped pairwise homology alignment score, circumvents the initialization and local minima problems of both c-medoids algorithms. In effect, it enables efficient selection of a minimum set of most informative bio-bases. The effectiveness of the algorithms, along with a comparison with other algorithms, has been demonstrated on HIV (human immunodeficiency virus) protein datasets.
Publisher: Taylor & Francis
Source file: Elektronische Wetenschappelijke Tijdschriften
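
Illustration: the abstract's two core ideas, a bio-basis function built on a non-gapped homology score and medoid-based selection of bio-bases, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The scoring function `toy_score` is a hypothetical stand-in for an amino acid mutation matrix, the exponential form of `bio_basis` is the commonly cited form of the bio-basis function rather than one quoted in this record, and `c_medoids` is plain hard c-medoids, not the rough or fuzzy-possibilistic variants the paper proposes. All function and parameter names are illustrative.

```python
import math


def toy_score(a, b):
    # Hypothetical stand-in for a mutation-matrix entry:
    # +2 for identical residues, -1 otherwise (NOT a real matrix).
    return 2 if a == b else -1


def homology(x, y, score=toy_score):
    """Non-gapped pairwise homology score: sum of per-position scores."""
    if len(x) != len(y):
        raise ValueError("non-gapped alignment needs equal-length sequences")
    return sum(score(a, b) for a, b in zip(x, y))


def bio_basis(x, basis, gamma=1.0, score=toy_score):
    """Assumed form: exp(gamma * (h(x, b) - h(b, b)) / h(b, b))."""
    h_xb = homology(x, basis, score)
    h_bb = homology(basis, basis, score)
    return math.exp(gamma * (h_xb - h_bb) / h_bb)


def c_medoids(seqs, initial_medoids, max_iter=20):
    """Plain hard c-medoids using homology as the similarity measure."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        # Assignment step: each sequence joins its most similar medoid.
        clusters = [[] for _ in medoids]
        for s in seqs:
            best = max(range(len(medoids)),
                       key=lambda i: homology(s, medoids[i]))
            clusters[best].append(s)
        # Update step: the new medoid is the member with the highest
        # total similarity to the other members of its cluster.
        new_medoids = []
        for old, members in zip(medoids, clusters):
            if not members:
                new_medoids.append(old)  # keep medoid of an empty cluster
                continue
            new_medoids.append(
                max(members,
                    key=lambda c: sum(homology(s, c) for s in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


if __name__ == "__main__":
    seqs = ["AAAB", "AAAA", "AABA", "CCCD", "CCCC", "CCDC"]
    bases = c_medoids(seqs, ["AAAB", "CCCD"])
    # Each sequence becomes a numerical feature vector over the bio-bases.
    features = [[bio_basis(s, b) for b in bases] for s in seqs]
    print(bases)
    print(features[0])
```

In this sketch the selected medoids play the role of bio-bases, and each sequence is mapped to one feature per bio-basis. A real analysis would replace `toy_score` with an actual mutation matrix and the hard assignment with the soft memberships described in the abstract.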
 
 

 Koninklijke Bibliotheek - National Library of the Netherlands