Statistical supports for mining sequential patterns and improving the incremental update process on data streams
Title:
Statistical supports for mining sequential patterns and improving the incremental update process on data streams
Authors:
Laur, Pierre-Alain; Symphor, Jean-Emile; Nock, Richard; Poncelet, Pascal
Published in:
Intelligent data analysis
Pagination:
Volume 11 (2007), no. 1, pages 29-47
Year:
2007-03-21
Abstract:
Recently, the knowledge extraction community has taken a closer look at new models in which data arrive in a timely manner as a fast and continuous flow, i.e., data streams. Because only part of the stream can be stored, mining data streams for sequential patterns and updating previously found frequent patterns must cope with uncertainty. In this paper, we introduce a new statistical approach that biases the initial support for sequential patterns. This approach has the advantage of maximizing either the precision or the recall, as chosen by the user, while limiting the degradation of the other criterion. Moreover, these statistical supports help build statistical borders, which are the relevant sets of frequent patterns to use in an incremental mining process. From the statistical standpoint, theoretical results show that the technique is not far from the optimum. Experiments performed on sequential patterns demonstrate the value of this approach and the potential of such techniques.