Stream Anomaly Detection

Technology Overview

Altometrics’s Stream Anomaly Detection provides a new way of thinking about anomaly detection.

The Big Idea

The big idea here is that the Stream Anomaly Detection engine lets you see the forest, not just the trees.

Key Differences

Unlike traditional anomaly detection methods, which focus on finding individual data points that differ from the others, the Stream Anomaly Detection engine is designed to find anomalies in data flows. The image below highlights this distinction.

At the top, you have the traditional approach to anomaly detection.  The red arrows point to anomalous data (red dots), and the black dots indicate “normal” data used as the basis for comparison.

The lower portion of the image, however, shows the Stream Anomaly Detection approach. The red arrow indicates that the portion of the flow highlighted in red isn’t the same “shape” as the rest of the data flow. In other words, the structure of this region of data indicates a larger problem that may not be visible by looking at individual data points. Rather than looking for individually anomalous data points, stream-flow anomaly detection finds issues emerging in the flow of data. This has a number of major advantages over traditional approaches, but more importantly, it can be used in conjunction with traditional approaches to provide deeper insight.
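To make the distinction concrete, here is a minimal sketch in Python. It is not the Stream Anomaly Detection engine's algorithm, which this overview does not describe; the function names, the window size, and the thresholds are all illustrative assumptions. A repeating signal flatlines for a stretch: every individual value stays well inside the normal range, so a point-wise check finds nothing, while a check on the "shape" of each window (here, simply its spread) flags the flat region.

```python
import math
import statistics


def point_anomalies(values, z_threshold=3.0):
    """Traditional check: flag individual points far from the overall mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [
        i for i, v in enumerate(values)
        if std > 0 and abs(v - mean) / std > z_threshold
    ]


def shape_anomalies(values, window=20, tolerance=0.5):
    """Flow check (illustrative): flag windows whose spread differs from the
    typical window spread by more than `tolerance` times that typical value."""
    spreads = [
        statistics.pstdev(values[start:start + window])
        for start in range(0, len(values) - window + 1, window)
    ]
    typical = statistics.median(spreads)
    return [
        k for k, spread in enumerate(spreads)
        if abs(spread - typical) > tolerance * typical
    ]


# Synthetic flow: a sine wave with a period of 20 samples...
signal = [math.sin(2 * math.pi * t / 20) for t in range(400)]
# ...that flatlines at 0.0 for two periods. Each value is still "normal".
for t in range(200, 240):
    signal[t] = 0.0

point_hits = point_anomalies(signal)   # no individual point stands out
shape_hits = shape_anomalies(signal)   # indices of the windows that flatlined
print("point-wise hits:", point_hits)
print("window hits:", shape_hits)
```

The window spread is only one possible stand-in for "shape"; the same structure works with any per-window summary, and the two checks can run side by side, which matches the point above about combining both approaches.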

Advantages

This technology provides several capabilities that are extremely difficult (or impossible) to achieve with traditional anomaly detection techniques.

Taking the Broad View

As mentioned above, the Stream Anomaly Detection approach opens an opportunity to look for meaning in the way data changes.

Learning From History

One key element of the technology is that it learns from history.  When the system begins to monitor a data stream, it spends some time first learning what is normal for that flow (this can also be accomplished by pre-loading it with historical normals, if desired).  As data flows through, a concept of normal is established, unique to each flow of data.  Over time, as data flow is monitored, the engine notes when a data flow begins to deviate from normal and can trigger events, raise alerts, and so on, depending on the nature of the anomaly that begins to emerge.
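The learn-then-monitor cycle described above can be sketched as follows. This is a hedged illustration, not the engine's implementation: the class name `StreamBaseline`, the `warmup` and `threshold` parameters, and the use of a running mean and standard deviation as the notion of "normal" are all assumptions made for the example. Each flow gets its own baseline, the baseline can be pre-loaded from history, and deviations are reported only once the learning period is over.

```python
import math


class StreamBaseline:
    """Hypothetical per-flow baseline: learns a running mean and standard
    deviation, then reports values that deviate from that learned normal."""

    def __init__(self, warmup=50, threshold=4.0, history=None):
        self.warmup = warmup        # observations to learn before alerting
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations
        for value in history or []:  # optional pre-loading of historical data
            self._learn(value)

    def _learn(self, value):
        # Welford's online update: constant memory per flow.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def observe(self, value):
        """Return True if `value` deviates from the learned normal."""
        if self.count < self.warmup:
            self._learn(value)      # still learning: never alert
            return False
        std = math.sqrt(self.m2 / self.count)
        deviates = std > 0 and abs(value - self.mean) > self.threshold * std
        if not deviates:
            self._learn(value)      # keep adapting to normal behaviour
        return deviates


# One baseline per flow; the flow names are made up for the example.
baselines = {"requests": StreamBaseline(), "latency": StreamBaseline()}

alerts = []
for i in range(100):
    samples = {"requests": 10 + i % 5, "latency": 200 + i % 3}
    for flow, value in samples.items():
        if baselines[flow].observe(value):
            alerts.append((flow, i, value))

# A sudden jump on one flow, after its normal has been learned.
spike_detected = baselines["requests"].observe(30)

# Pre-loading history skips the live learning period.
preloaded = StreamBaseline(history=[10 + i % 5 for i in range(50)])
preloaded_detects = preloaded.observe(30)
```

In a real deployment the boolean result would be replaced by whatever action suits the anomaly, such as triggering an event or raising an alert.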

High Volume at Speed

Because of the way the engine works, it can process millions or billions of data points in real time. For example, applied to application performance management in a data center, it is possible to monitor all of the data flowing into and out of the data center in real time, without impeding it.

Technical Information

For more technical information, please see the technical discussion.