How Health Log Analytics generates alerts
Health Log Analytics identifies patterns in your log data and learns pattern behavior. When HLA's AI engine detects anomalous behavior, it sends an event to the ServiceNow Event Management application. As an operator, you can use these predictive alerts to handle emerging IT issues before they impact users.
Log anomaly detection
Anomalies are abnormal or unexpected behavior that occurs when activity deviates from an established baseline. There are many kinds of anomalies. For example, the system can track the baseline rate (the average number of events per minute) for a specific log pattern. When a typically inactive log suddenly generates a spike in events, the system detects the deviation from the baseline and generates an alert.
Health Log Analytics uses various methods to detect anomalies and generate alerts.
Alert metrics
Health Log Analytics monitors multiple metrics in the log stream to detect anomalous behavior. Each metric is associated with a unique source: the combination of service instance and component. When the system identifies an anomalous pattern for a metric, it generates an alert.
As an operator, you can provide feedback about the generated alerts. Your feedback "teaches" Health Log Analytics that a specific alert is significant or irrelevant to you. The application then either raises the priority of the alert metric or mutes it to reduce noise.
- A significant alert is more likely to be included in a Log Analytics group when the associated metric behaves anomalously. For more information, see Mark an alert as significant in Health Log Analytics.
- Mute an alert for a specified source to eliminate distracting new alerts for unimportant issues. When a metric is muted, Health Log Analytics removes the current alert and any other alerts based on that metric from the feed. It also stops generating new alerts from that metric. For more information, see Mute an unimportant alert in Health Log Analytics.
- When the situation changes, you can return a significant metric to its default significance. You can also reactivate a muted metric to cause the system to start generating alerts again. For more information, see Restore normal importance to an alert metric in Health Log Analytics.
Lexical keywords
Health Log Analytics scans your logs for words that can indicate important issues. Lexical keywords such as "crashed" or "failed" signal a condition that can merit attention.
The system sets a threshold for each lexical keyword that is based on what it considers the normal occurrence pattern and frequency of that keyword in your logs. When it scans your logs, it finds all occurrences of the keyword. If the number exceeds the threshold, it generates an alert. For more information, see View the lexical keywords that generate alerts in Health Log Analytics.
For information about managing global keywords, see Add, edit, or delete lexical keywords in Health Log Analytics. To create or delete keywords for a specific source type, see Configure source type capabilities in Health Log Analytics.
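The two baseline-driven checks described above, an event-rate spike for a log pattern (under Log anomaly detection) and a lexical keyword that exceeds its normal frequency, can be sketched together because they share the same machinery: bucket the log stream per minute, learn a threshold per metric from a training window, and alert when the latest bucket exceeds it. The Python below is an added illustration, not Health Log Analytics code. The page does not describe HLA's models, so the baseline rule (mean plus three standard deviations over ten training minutes, floored at three events per minute), the crude pattern extraction (digits replaced by <n>), the per-minute bucketing and the sample log lines are all assumptions made for this example.

```python
"""Baseline-driven detection of event-rate spikes and lexical keyword bursts.

Added illustration, not Health Log Analytics code. The baseline rule, the
pattern extraction and the sample logs below are assumptions for the example.
"""
import re
import statistics
from collections import Counter, defaultdict
from datetime import datetime, timedelta

LINE_RE = re.compile(r"^(\S+) (\S+)/(\S+) (.*)$")  # "<iso time> <service>/<component> <message>"
KEYWORDS = ("failed", "crashed")  # lexical keywords named on the page
MIN_THRESHOLD = 3.0  # assumption: floor so a quiet log going 0 -> 1 is not a "spike"


def parse(line):
    """Split a constructed log line into (minute bucket, source, pattern, message).
    Source is service instance + component; the pattern replaces digits with <n>."""
    ts, service, component, msg = LINE_RE.match(line).groups()
    minute = datetime.fromisoformat(ts).replace(second=0, microsecond=0)
    return minute, f"{service}/{component}", re.sub(r"\d+", "<n>", msg), msg


def build_logs():
    """Constructed sample (not real data): ten quiet training minutes with steady
    health checks and an occasional retry, then one minute in which retries burst."""
    start = datetime(2024, 5, 1, 12, 0)
    lines = []
    for i in range(11):
        t = start + timedelta(minutes=i)
        for j in range(3):  # three health checks every minute, including the burst minute
            lines.append(f"{(t + timedelta(seconds=15 * j)).isoformat()} checkout/api GET /health 200 in {10 + j} ms")
        if i < 10 and i % 3 == 0:  # one retry in minutes 0, 3, 6 and 9: rare but normal
            lines.append(f"{t.isoformat()} checkout/api retry to payments failed, attempt 1")
    burst = start + timedelta(minutes=10)
    for j in range(20):  # the anomaly: 20 retries in the final minute
        lines.append(f"{(burst + timedelta(seconds=2 * j)).isoformat()} checkout/api retry to payments failed, attempt {j + 1}")
    return lines


def bucket_counts(lines):
    """Per-minute counts keyed by (source, pattern) and by (source, keyword)."""
    per_pattern = defaultdict(Counter)
    per_keyword = defaultdict(Counter)
    minutes = set()
    for line in lines:
        minute, source, pattern, msg = parse(line)
        minutes.add(minute)
        per_pattern[(source, pattern)][minute] += 1
        for kw in KEYWORDS:
            if re.search(rf"\b{kw}\b", msg, re.IGNORECASE):
                per_keyword[(source, kw)][minute] += 1
    return per_pattern, per_keyword, sorted(minutes)


def baseline_threshold(series):
    """Assumed baseline rule: mean + 3 population standard deviations, floored at MIN_THRESHOLD."""
    return max(MIN_THRESHOLD, statistics.fmean(series) + 3 * statistics.pstdev(series))


def detect(lines, train_minutes=10):
    """Learn a threshold per metric from the first train_minutes, then flag every
    metric whose count in the latest minute exceeds its threshold."""
    per_pattern, per_keyword, minutes = bucket_counts(lines)
    train, current = minutes[:train_minutes], minutes[-1]
    alerts = []
    for kind, table in (("rate", per_pattern), ("keyword", per_keyword)):
        for (source, metric), counts in table.items():
            threshold = baseline_threshold([counts[m] for m in train])
            observed = counts[current]
            if observed > threshold:
                alerts.append({"type": kind, "source": source, "metric": metric,
                               "observed": observed, "threshold": round(threshold, 2)})
    return alerts


if __name__ == "__main__":
    for alert in detect(build_logs()):
        print(alert)
```

Run stand-alone, the script reports two alerts for the final minute: a rate alert for the pattern "retry to payments failed, attempt <n>" and a keyword alert for "failed", both observed 20 times against a learned threshold of 3. The health-check pattern, which runs at exactly its baseline rate, stays silent, which is the point of learning a threshold per metric rather than applying one global number.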
Correlations
Log correlators are keys or values in log data that Health Log Analytics uses to detect correlations between alerts. For example, a log correlator could reveal that the ID of a particular network device occurs at the same time in multiple warnings across different service instances, pointing to a shared underlying cause. For more information, see Identifying related alerts in log data by using log correlators in Health Log Analytics.
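The device-ID example above amounts to a join: extract candidate key/value tokens from each alert, then group alerts that share a token, occur close together in time and come from different service instances. The Python below is an added illustration, not Health Log Analytics code. The page does not specify how HLA extracts correlators or what counts as "simultaneous", so the key=value token pattern, the five-minute window, the requirement of at least two distinct sources and the sample alerts are assumptions for this example.

```python
"""Grouping alerts by a shared log correlator across service instances.

Added illustration, not Health Log Analytics code. The token pattern, the
time window and the sample alerts are assumptions made for the example.
"""
import re
from collections import defaultdict
from datetime import datetime, timedelta

CORRELATOR_RE = re.compile(r"\b(\w+)=([\w.:-]+)")  # assumption: correlators appear as key=value tokens
WINDOW = timedelta(minutes=5)                       # assumption: alerts this close count as simultaneous

# Constructed alerts (not real HLA output). "source" is service instance/component.
ALERTS = [
    {"id": 1, "time": "2024-05-01T12:03:00", "source": "checkout/api", "message": "upstream timeout device_id=sw-17 port=eth3"},
    {"id": 2, "time": "2024-05-01T12:04:10", "source": "inventory/db", "message": "replication lag device_id=sw-17"},
    {"id": 3, "time": "2024-05-01T12:04:40", "source": "search/index", "message": "link flap device_id=sw-17 port=eth1"},
    {"id": 4, "time": "2024-05-01T12:30:00", "source": "billing/worker", "message": "retry exhausted device_id=sw-17"},  # same device, but 26 min later
    {"id": 5, "time": "2024-05-01T12:04:00", "source": "checkout/api", "message": "cache miss burst node=cache-2"},
    {"id": 6, "time": "2024-05-01T12:04:30", "source": "checkout/api", "message": "cache miss burst node=cache-2"},  # same source only
]


def extract_correlators(message):
    """Return the set of (key, value) tokens found in an alert message."""
    return set(CORRELATOR_RE.findall(message))


def correlate(alerts, window=WINDOW, min_sources=2):
    """Group alerts that share a correlator within `window` and span at least
    `min_sources` distinct sources. Returns one dict per group."""
    by_corr = defaultdict(list)
    for alert in alerts:
        when = datetime.fromisoformat(alert["time"])
        for corr in extract_correlators(alert["message"]):
            by_corr[corr].append((when, alert))

    groups = []
    for corr, items in by_corr.items():
        items.sort(key=lambda item: item[0])
        i = 0
        while i < len(items):  # open a window at each alert and pull in everything that fits
            start = items[i][0]
            members = [a for t, a in items[i:] if t - start <= window]
            sources = sorted({a["source"] for a in members})
            if len(sources) >= min_sources:
                groups.append({"correlator": corr, "sources": sources,
                               "alert_ids": [a["id"] for a in members]})
                i += len(members)  # these alerts are grouped; continue after them
            else:
                i += 1             # single-source or isolated: not a correlation
    return groups


if __name__ == "__main__":
    for group in correlate(ALERTS):
        print(group)
```

Run stand-alone, the script prints one group: device_id=sw-17 tying alerts 1, 2 and 3 across checkout, inventory and search. Alert 4 names the same device but falls outside the window, and the two cache-2 alerts come from a single source, so neither becomes a correlation; those two exclusions are what keep a correlator from simply re-creating the noise it is meant to reduce.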
Advanced alert filtering
Add advanced log alert filters to scan alerts for conditions that you specify. The filters reduce noise by dropping alerts that do not indicate a significant issue. While developing a filter, you can test, update, publish, or activate the filter at any time. For more information, see Create advanced log alert filters.
Custom alert rules
Define a Log Analytics alert rule when you encounter log data that should generate an alert. The alert rule generates an alert for a specified metric with a threshold that you specify and sets the properties of the generated alert. For more information, see Alert rules in Health Log Analytics.