Algorithms for mining frequent itemsets in static and dynamic datasets
Title:
Algorithms for mining frequent itemsets in static and dynamic datasets
Authors:
Hernández-León, R.; Hernández-Palancar, J.; Carrasco-Ochoa, Jesús A.; Martínez-Trinidad, José Fco.
Published in:
Intelligent data analysis
Pagination:
Volume 14 (2010), no. 3, pages 419-435
Year:
2010-05-20
Abstract:
In this paper, two algorithms for mining frequent itemsets in large sparse datasets are proposed. The first, named Compressed Arrays (CA), processes datasets that do not change over time (static datasets). The second, named Dynamic Compressed Arrays (DCA) and based on the ideas of the first, processes datasets that change over time through the addition or deletion of transactions (dynamic datasets). Both algorithms introduce a novel way of using equivalence classes of itemsets: they perform a breadth-first search through the classes and store the class prefix support in compressed arrays, which allows fast computation of itemset support. Moreover, unlike previous algorithms for dynamic datasets, which store the full dataset in main memory without reusing the current frequent itemsets, the DCA algorithm stores the current frequent itemsets in binary files, grouped into equivalence classes, and reuses them to calculate the new frequent itemsets.
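The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' CA implementation, whose details are not given in this record. It assumes that a "compressed array" can be approximated by a mapping from word index to the non-zero 32-bit words of a transaction bit vector, and that an equivalence class is the set of k-itemsets sharing the same (k-1)-item prefix. All names (`compress`, `intersect`, `support`, `mine_frequent_itemsets`, `WORD_BITS`) are hypothetical.

```python
from collections import defaultdict

WORD_BITS = 32  # assumed word size; the record does not specify one


def compress(tids):
    """Build a sparse bit vector: word index -> bitmask, non-zero words only."""
    words = defaultdict(int)
    for tid in tids:
        words[tid // WORD_BITS] |= 1 << (tid % WORD_BITS)
    return dict(words)


def intersect(a, b):
    """AND two sparse bit vectors, dropping words that become zero."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter array
    out = {}
    for idx, word in a.items():
        both = word & b.get(idx, 0)
        if both:
            out[idx] = both
    return out


def support(arr):
    """Number of transactions represented by a sparse bit vector."""
    return sum(bin(word).count("1") for word in arr.values())


def mine_frequent_itemsets(transactions, min_support):
    """Level-wise (breadth-first) search over prefix equivalence classes."""
    tids = defaultdict(list)
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tids[item].append(tid)

    # Level 1: frequent single items, kept in sorted order.
    level = []
    for item in sorted(tids):
        arr = compress(tids[item])
        if support(arr) >= min_support:
            level.append(((item,), arr))

    result = {}
    while level:
        for itemset, arr in level:
            result[itemset] = support(arr)

        # Group the current level into classes that share a prefix.
        classes = defaultdict(list)
        for itemset, arr in level:
            classes[itemset[:-1]].append((itemset, arr))

        # Join pairs inside each class to form the next level.
        next_level = []
        for members in classes.values():
            for i, (left, left_arr) in enumerate(members):
                for right, right_arr in members[i + 1:]:
                    arr = intersect(left_arr, right_arr)
                    if support(arr) >= min_support:
                        next_level.append((left + (right[-1],), arr))
        level = next_level
    return result
```

Under these assumptions, calling `mine_frequent_itemsets([{"a", "b"}, {"a"}], 2)` reports only the itemset `("a",)`, with support 2. Keeping only non-zero words is what makes the representation compact on sparse data, since most words of an item's bit vector would otherwise be zero.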