Everyone is invited to attend her talk. Her abstract follows.
The emergence of streaming data or “data in motion” has motivated the development of new “streaming” algorithms that provide up-to-date answers to continuous queries; that is,
queries that are issued once and then run continuously as new data streams in. For example, in the context of network traffic management, continuous queries over streaming Netflow data may be used to detect anomalies in the network as they happen (e.g.,
performance degradation, onset of an attack).
The main objective of this thesis is to demonstrate with an illustrative example how the continuous stream of answers produced by existing streaming algorithms can be visualized
in an effective and meaningful manner. In particular, such visualization efforts should capture the dynamics inherent in the answers to such continuous queries. Our illustrative example concerns the class of frequent
itemset mining algorithms for streaming data, a generalization of the well-known algorithms for finding frequent items (e.g., top-k) in streaming data. Using the output of these streaming algorithms as input, our proposed visualization method combines
two previously considered types of flow diagrams (i.e., Sankey and alluvial diagrams) to depict the temporal evolution of frequent itemsets. We apply our visualization method to real-world streaming data (e.g., unsampled Netflow data from a stub network,
DNS records from a university network) and illustrate its basic features and capabilities.
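
As a rough illustration of the kind of input such a flow diagram might consume, the Python sketch below counts frequent itemsets in each window of a toy stream and links itemsets across consecutive windows. It is not the method of the thesis: the brute-force counting, the function names `frequent_itemsets` and `window_flows`, the subset-based linking rule, and the toy records are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(window, min_support, max_size=2):
    """Count itemsets (up to max_size items) in one window of transactions
    and keep those seen in at least min_support transactions.
    Brute-force enumeration; a real streaming algorithm would approximate this."""
    counts = Counter()
    for transaction in window:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def window_flows(prev, curr):
    """Link itemsets of two consecutive windows (hypothetical rule):
    draw a flow when one itemset contains the other, weighted by the
    smaller of the two supports."""
    flows = []
    for a, support_a in prev.items():
        for b, support_b in curr.items():
            if a <= b or b <= a:
                flows.append((a, b, min(support_a, support_b)))
    return flows


# Toy stream split into two windows; each record is a set of attribute values.
windows = [
    [("dns", "udp"), ("dns", "udp"), ("http", "tcp")],
    [("dns", "udp"), ("dns", "udp"), ("http", "tcp"), ("http", "tcp")],
]

per_window = [frequent_itemsets(w, min_support=2) for w in windows]
flows = window_flows(per_window[0], per_window[1])

for source, target, weight in flows:
    print(sorted(source), "->", sorted(target), "weight", weight)
```

Each resulting `(source, target, weight)` triple corresponds to one band between adjacent time steps in a Sankey- or alluvial-style diagram; how the bands are actually defined and drawn in the thesis is not specified here.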