Annalisa Appice, Anna Ciampi, Donato Malerba
Advances in pervasive computing and sensor technologies have paved the way for the explosive growth of ubiquitous geo-physical data streams. Managing these massive, unbounded streams of sensor data poses several challenges, including the real-time application of summarization techniques, which should allow this volume of georeferenced and timestamped data to be stored and queried on a server with limited memory. To address this issue, we have designed a summarization technique, called SUMATRA, which segments the stream into windows, computes summaries window by window, and stores these summaries in a database. The summaries discovered for each window are trend clusters: clusters of georeferenced data that vary according to a similar trend along the time horizon of the window. Several compression techniques are also investigated to derive a compact but accurate representation of these trends for storage in the database, and a learning strategy is designed to automatically choose the best trend compression technique. Finally, an in-network modality for tree-based trend cluster discovery is investigated in order to achieve an effective aggregation scheme that drastically reduces the number of bytes transmitted across the network and thereby prolongs the network lifespan. This scheme is mapped onto the routing structure of a tree-based wireless sensor network (WSN) topology. Experiments performed on several data streams from real sensor networks assess the summarization capability, accuracy, and efficiency of the proposed summarization scheme.
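The window-by-window idea described in the abstract can be illustrated with a minimal sketch. This is not the SUMATRA algorithm itself: the function names (`windows`, `trend_clusters`), the tolerance parameter `delta`, the greedy grouping rule, and the use of the per-time-point mean as the trend are all assumptions made for illustration, and the spatial closeness of sensors, which the trend cluster definition relies on, is deliberately left out.

```python
from statistics import mean


def windows(snapshots, size):
    """Yield consecutive, non-overlapping windows of `size` snapshots.

    Each snapshot is a dict mapping a sensor id to its reading at one
    time point. A trailing incomplete window is dropped.
    """
    window = []
    for snap in snapshots:
        window.append(snap)
        if len(window) == size:
            yield window
            window = []


def trend_clusters(window, delta):
    """Greedily group sensors whose readings stay within `delta` of a
    cluster's seed series at every time point of the window.

    Returns a list of (members, trend) pairs, where the trend is the
    per-time-point mean of the members' readings. Hypothetical
    simplification: spatial neighbourhood is ignored.
    """
    sensors = sorted(window[0])
    series = {s: [snap[s] for snap in window] for s in sensors}
    clusters = []  # list of (seed_series, members)
    for s in sensors:
        for seed, members in clusters:
            if all(abs(a - b) <= delta for a, b in zip(series[s], seed)):
                members.append(s)
                break
        else:
            # No existing cluster fits: this sensor seeds a new one.
            clusters.append((series[s], [s]))
    return [
        (members, [mean(series[m][t] for m in members) for t in range(len(window))])
        for _, members in clusters
    ]


# Toy stream: three sensors, two time points, one window of size 2.
stream = [
    {"a": 1.0, "b": 1.5, "c": 5.0},
    {"a": 2.0, "b": 2.5, "c": 4.0},
]
summaries = [trend_clusters(w, delta=0.5) for w in windows(stream, size=2)]
```

In this toy run, sensors `a` and `b` follow the same rising trend and are summarized together, while `c` forms its own cluster, so the window is stored as two trends rather than three raw series.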