Elasticsearch data input configuration fields
Summary of Elasticsearch data input configuration fields
This guide details the configuration fields available when setting up an Elasticsearch data input in ServiceNow, specifically for Health Log Analytics. It explains how to connect and stream log data from Elasticsearch clusters to your ServiceNow instance using MID Servers or MID Server clusters. The configuration supports authentication methods, cross-cluster search, filtering, and performance tuning options.
Basic configuration
- Name and Description: Required fields to identify and describe the data input.
- Execute on: Choose whether to run on a specific MID Server or a failover MID Server cluster. Only MID Servers supporting basic authentication are listed; mTLS is not supported for log ingestion.
- MID Server or Cluster: Specify which MID Server or cluster pulls log data. Clusters provide failover protection by moving tasks if one MID Server fails. Log ingestion is enabled automatically if not already active.
- Service instance: Bind the data input to a ServiceNow service instance, which must be set to Operational. Create one if none exists.
- Status and operational info: Read-only fields show the data input’s status, transport protocol, source count, disable time, and last log time.
Transport configuration
- Server URL: Required URL to access the Elasticsearch cluster.
- Max connections per route and max scroll slices: Control the number of connections per node and parallel queries for efficient data retrieval.
- Proxy settings: Optional HTTP proxy host and port for requests.
- Authentication methods: Supports Basic auth, API key, client certificate, or AWS credentials. Relevant credential fields appear based on the chosen method.
- MID certificate policy check: Option to enable SSL/TLS encryption of log data by applying MID Server certificate policies.
Query settings
- From/To dates: Define the time range for reading data. Setting a past "From" date may cause large data reads and congestion.
- Cross-cluster search: Enable searching across multiple Elasticsearch clusters, specifying which clusters or all remote clusters to include.
- Index prefix: Required prefix to specify which Elasticsearch indices to read from.
- Use minimal privileges: Option to read logs directly from indices with read-only privileges, affecting how cross-cluster data collection works.
- Document timestamp field and format: Specify the timestamp field and format (default is Unix epoch milliseconds) used for sorting logs.
- Term filters: JSON map to filter specific log terms (e.g., severity levels), improving data relevance.
- Max documents per query: Limits documents fetched per query to manage load.
- Sliced-scrolling and search-after APIs: Toggle between APIs for efficient retrieval; sliced-scrolling suits historical data, search-after suits real-time data.
- Index time-suffix format: Define date format for time-based indices; leave empty when using aliases.
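
As a rough illustration of time-based index names, the sketch below joins an index prefix with a daily date suffix. It assumes the form's `uuuu.MM.dd` pattern maps to Python's `%Y.%m.%d`; the helper name `daily_index_names` and the `logstash-` prefix are hypothetical, and the data input may discover indices differently.

```python
from datetime import date, timedelta


def daily_index_names(prefix, start, end):
    """Return one index name per day from start to end, inclusive.

    Hypothetical helper for illustration only; it is not part of the product.
    """
    names = []
    day = start
    while day <= end:
        # %Y.%m.%d mirrors the uuuu.MM.dd suffix pattern (assumed mapping)
        names.append(f"{prefix}{day.strftime('%Y.%m.%d')}")
        day += timedelta(days=1)
    return names


if __name__ == "__main__":
    for name in daily_index_names("logstash-", date(2023, 5, 14), date(2023, 5, 16)):
        print(name)
```

When an alias is used instead of dated index names, no suffix is involved, which is why the field is left empty in that case.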
Advanced configuration
- Timeouts and intervals: Set timeouts for requests, index discovery intervals for new data, and scroll context lifetime for data reads.
- Event processing: Configure CPU core usage and batch queue sizes to balance throughput and resource consumption.
- Timezone and character encoding: Define default timezone for events without timezone info and character encoding for the input.
- Sub sampling ratios: Control event batching to reduce or limit the number of events ingested.
- Sleep interval: Set delay between queries when no data is returned, optimizing resource use.
- Max log message length: Limit the size of log messages in bytes.
- Delay in reading current timestamp: Offset for querying recent data to capture delayed logs, important for multi-cluster environments.
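
The two sub sampling ratios can be pictured with a small sketch. It assumes that events are counted in fixed groups and that the last event of each group is the one dropped or kept; which event is actually chosen, and the treatment of ratios below 2 as "no sampling", are assumptions made for illustration only.

```python
def apply_drop_ratio(events, ratio):
    """Discard one event out of every `ratio` events (assumed: the last of each group)."""
    if ratio < 2:
        # Assumption: a ratio below 2 disables sub sampling
        return list(events)
    return [e for i, e in enumerate(events, start=1) if i % ratio != 0]


def apply_receive_ratio(events, ratio):
    """Keep one event out of every `ratio` events (assumed: the last of each group)."""
    if ratio < 2:
        # Assumption: a ratio below 2 disables sub sampling
        return list(events)
    return [e for i, e in enumerate(events, start=1) if i % ratio == 0]


if __name__ == "__main__":
    sample = list(range(1, 11))
    print(apply_drop_ratio(sample, 5))
    print(apply_receive_ratio(sample, 5))
```

With a ratio of 5, the drop variant keeps four of every five events, while the receive variant keeps only one of every five.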
Practical benefits for ServiceNow customers
This comprehensive configuration enables customers to reliably ingest Elasticsearch logs into ServiceNow for advanced Health Log Analytics. It supports secure authentication, failover through MID Server clusters, precise filtering, and optimized data retrieval methods. Customers can customize data input behavior to balance performance and resource use, ensuring timely and accurate log data ingestion for monitoring and analysis.
Description of the fields on the Elasticsearch data input configuration form.
Basic configuration
| Field | Description |
|---|---|
| Name | Name of the new data input. This field is required. |
| Description | Description of the data input. |
| Execute on | Option to determine whether to use a specific MID Server or a MID Server cluster. This feature is supported in the Health Log Analytics application, Version 26.0.17 - February 2023 and later, available from the ServiceNow Store. |
| MID | (Only when the Execute on field is set to Specific MID Server) MID Server to which log data from Elasticsearch indices is pulled. Note: This field is required. |
| MID Server Cluster | (Only when the Execute on field is set to Specific MID Server Cluster) The MID Server cluster to which the log data is pulled. The data input runs on a single MID Server in the cluster until that MID Server fails. The system then moves all the data input tasks to the next available MID Server in the cluster according to the configured order. This feature is supported in the Health Log Analytics application, Version 26.0.17 - February 2023 and later, available from the ServiceNow Store. This field is required. Note: For more information about MID Server clusters, see Configure a MID Server cluster. |
| Service instance | The service instance to which to bind the log data. This field is required. Note: If no relevant service instance exists, create a service instance and add CIs to it. Set the status of the new service instance to Operational. |
The following fields show read-only information:
| Field | Description |
|---|---|
| Status | Status of the data input. |
| Transport | Protocol used to stream the log data. This data input uses Elastic to stream log data to your instance. |
| Sources count | The number of log sources this data input has created. |
| Disabled since | The time when the data input stopped or failed. |
| Last log time | The time when the last log streamed in the data input. |
Transport configuration
| Field | Description |
|---|---|
| Server URL | URL used to access the cluster. This field is required. |
| Max connections per route | Maximum number of connections to be opened per node. Default: 2. |
| Max scroll slices | The number of shards configured for the relevant index in Elasticsearch. This number tells Elastic how many parallel queries to execute in each polling request. |
| Proxy host | Host name of the HTTP proxy through which requests are sent. |
| Proxy port | Port of the HTTP proxy through which requests are sent. |
| Authentication method | The authentication method used to authenticate the data input to Elasticsearch. The options are: Basic auth, apiKey, or client certificate. Note: When you select the required authentication method, the corresponding credentials fields display on the form. |
| Basic auth credentials | User name and password used to connect to the Elasticsearch search engine. Note: Fill in either this field or the AWS credentials field. |
| AWS credentials | AWS credentials to use to connect to the AWS-hosted Elasticsearch search engine. Note: Fill in either this field or the Basic auth credentials field. |
| AWS region | AWS region where the Elasticsearch cluster runs. |
| API key credentials | The API key used to connect to the Elasticsearch search engine. |
| Client certificate | The client certificate used to connect to the Elasticsearch search engine. |
| Use MID certificate policy check | Option to enable the MID certificate policy check. Select this option if you want to ship your logs encrypted using SSL/TLS. Then navigate to and add the MID certificate policy check to the table. For more information, see MID Server certificate check policies. |
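
To show how Max scroll slices relates to parallel queries, the following minimal sketch builds one request body per slice, in the shape that Elasticsearch sliced scrolling expects. The helper name `build_slice_bodies`, the `match_all` query, and the page size are illustrative assumptions; the data input builds its own requests internally.

```python
def build_slice_bodies(max_slices, page_size=1000):
    """Return one search body per slice for a sliced scroll.

    Hypothetical helper for illustration only; it is not part of the product.
    """
    if max_slices < 1:
        raise ValueError("max_slices must be at least 1")
    if max_slices == 1:
        # A single slice is just a plain scroll, so no slice clause is needed
        return [{"size": page_size, "query": {"match_all": {}}}]
    return [
        {
            # Each slice covers a disjoint part of the data and can be scrolled in parallel
            "slice": {"id": slice_id, "max": max_slices},
            "size": page_size,
            "query": {"match_all": {}},
        }
        for slice_id in range(max_slices)
    ]


if __name__ == "__main__":
    for body in build_slice_bodies(3):
        print(body["slice"])
```

Setting the slice count to match the number of shards, as the field description suggests, lets each parallel query line up with one shard.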
Query settings
| Field | Description | Example |
|---|---|---|
| From/To | From and to dates and times for reading the data. | From: 1970-01-01 15:59:59 To: 2300-01-01 15:59:59 |
| Use cross-cluster search | Option for searching for data across Elasticsearch clusters. When this check box is selected, the Clusters to search field displays. Note: Your settings in the Use minimal privileges check box and the Delay in reading current timestamp (seconds) field on the Advanced configuration form affect how data is collected across multiple clusters. | |
| Clusters to search | The Elasticsearch clusters to search. This field displays only when the Use cross-cluster search check box is selected. Either specify the clusters to include or include all remote clusters. | east,west,south |
| Index prefix | Prefix for the Elasticsearch indices to read from. The data input reads only from indices with this prefix. This field is required. | only-read-these-indices-* |
| Use minimal privileges | Option for reading log data directly from the Elasticsearch indices with the configured prefix. For additional information about streaming logs using the Elasticsearch data input, see the Stream logs using Elasticsearch data input - Advanced guide [KB1080162] article in the Now Support Knowledge Base. | |
| Document timestamp field | Timestamp field in documents stored in the read indices. This field is required. | |
| Timestamp field format | Format of the timestamp field in the documents. If no format is specified, the default Unix epoch time format is used, in milliseconds. For example: 1684168407000 (May 15, 2023 4:33:27 PM UTC) | yyyy-MM-dd'T'HH:mm:ss.SSSSSSS'Z' |
| Term filters | JSON map of the terms to filter. Note: Avoid using the term query for text fields. If the target field is mapped as both text and keyword, reference the keyword by using fieldname.keyword. | {"severity": ["error", "warning"]} |
| Max documents per query | Maximum number of documents fetched in a single query. | |
| Sliced-scrolling tiebreaker | Value used to slice the data. Each slice is scrolled in parallel. Default: _id | |
| Search-after tiebreaker | Unique value per document to use as tiebreaker when sorting log entries by timestamp. | |
| Use search-after API | Option for toggling between using sliced-scrolling and search-after APIs. Note: Sliced-scrolling APIs are preferable when reading historical data, while search-after APIs are better for reading real-time data. | |
| Index time-suffix format | Format of the time suffix when using time-based index names, such as [logstash-]YYYY.MM.DD. When using aliases, leave this field empty. | uuuu.MM.dd |
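
The query settings above can be pictured as pieces of a single Elasticsearch search request. The sketch below is one plausible way to combine the Term filters map, the document timestamp field, the search-after tiebreaker, and the maximum documents per query. How the data input actually translates these fields is not documented here, so the function name `build_search_after_body` and the field names `@timestamp` and `log_id` are assumptions.

```python
def build_search_after_body(term_filters, timestamp_field, tiebreaker_field,
                            max_docs, last_sort_values=None):
    """Build a search body that pages through documents with search_after.

    Hypothetical helper for illustration only; it is not part of the product.
    """
    # Each entry of the Term filters map becomes a "terms" clause (assumed translation)
    filters = [{"terms": {field: values}} for field, values in term_filters.items()]
    body = {
        "size": max_docs,
        "query": {"bool": {"filter": filters}},
        # Sort by timestamp, then by a per-document unique field to break ties
        "sort": [{timestamp_field: "asc"}, {tiebreaker_field: "asc"}],
    }
    if last_sort_values is not None:
        # Resume after the sort values of the last document from the previous page
        body["search_after"] = list(last_sort_values)
    return body


if __name__ == "__main__":
    first_page = build_search_after_body(
        {"severity": ["error", "warning"]}, "@timestamp", "log_id", 500
    )
    next_page = build_search_after_body(
        {"severity": ["error", "warning"]}, "@timestamp", "log_id", 500,
        last_sort_values=[1684168407000, "abc-123"],
    )
    print(first_page)
    print(next_page["search_after"])
```

The tiebreaker matters because several log entries can share the same timestamp; without a second, unique sort key, paging could skip or repeat documents.
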
Advanced configuration
| Field | Description |
|---|---|
| Data reading timeout (milliseconds) | The duration of time, in milliseconds, before a request to the Elasticsearch cluster times out. |
| Index discovery interval (seconds) | The number of seconds between intermittent MID Server requests to the Elasticsearch cluster for new indices from which to read data. |
| Scroll context time (milliseconds) | The lifetime of the created scroll when using the scroll API to read data from Elasticsearch. For more information, see the Elasticsearch scroll API documentation. |
| Event processor workers | The maximum number of CPU cores used in parallel to process events fetched from Elasticsearch. A higher setting increases the data input throughput at the cost of higher CPU usage. |
| Worker queue size | The maximum number of batches to queue for processing. A higher setting increases throughput, at the cost of higher RAM usage. |
| Default timezone | The default timezone if the event date and time doesn't include timezone information. |
| Sub sample drop ratio | The number of events to batch together, out of which one will be discarded. This setting is used to reduce the number of fetched events. |
| Sub sample receive ratio | The number of events to batch together, out of which all but one will be discarded. This setting is used to decrease the number of received events. |
| Character encoding | The character encoding for this data input. |
| Sleep interval (seconds) | The interval, in seconds, to wait before querying again after a query has returned no data. |
| Max length in bytes | The maximum length, in bytes, of log messages. |
| Delay in reading current timestamp (seconds) | The number of seconds before current time to query to include delayed data. The configured number of seconds is subtracted from the current time for reading the last timestamp. Note: If this value is 0 and data is collected from multiple clusters simultaneously, the query may not include data that was sent with a delay on one of the clusters. |
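
The delay setting amounts to simple arithmetic: the upper bound of each query is the current time minus the configured number of seconds. The sketch below shows that calculation and converts the result to Unix epoch milliseconds. The function names `query_upper_bound` and `to_epoch_millis` are hypothetical and do not reflect the data input's internal code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def query_upper_bound(delay_seconds, now=None):
    """Return the latest timestamp a query should read, given the configured delay."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must not be negative")
    if now is None:
        now = datetime.now(timezone.utc)
    # Subtract the delay so that late-arriving documents have time to be indexed
    return now - timedelta(seconds=delay_seconds)


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to Unix epoch milliseconds."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


if __name__ == "__main__":
    fixed_now = datetime(2023, 5, 15, 16, 33, 27, tzinfo=timezone.utc)
    bound = query_upper_bound(30, now=fixed_now)
    print(bound.isoformat(), to_epoch_millis(bound))
```

A delay of 0 reads right up to the current time, which is why documents that arrive late from one of several clusters can be missed.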