Debnath Bhattacharyya, Poulami Das, Debashis Ganguly, Kheyali Mitra, Purnendu Das, Samir Kumar Bandyopadhyay, Tai-hoon Kim
Published in:
International journal of signal processing, image processing and pattern recognition
Pagination:
Volume 1 (2008), no. 1, pages 55-62
Year:
2008
Abstract:
The main purpose of communication is to transfer information from one corner of the world to another. This information is typically stored in documents or files created as requirements arise, so the ad hoc manner of their creation and storage leaves them unstructured. As a consequence, data retrieval and modification become difficult. Data that is required frequently should follow a consistent pattern; otherwise, problems such as retrieval of erroneous data, modification anomalies, and long retrieval times may increase. One solution to these problems is unstructured document categorization, in which the collected unstructured documents are categorized according to given constraints. This paper reviews techniques that have appeared in the literature for achieving such categorization, including text and data mining, genetic algorithms, lexical chaining, and binarization methods.