Apache Kafka data input configuration fields

  • Release version: Xanadu
  • Updated August 1, 2024

    Description of the fields on the Apache Kafka data input configuration form.

    Basic configuration

    Field Description
    Name Name of the new data input. This field is required.
    Description Description of the data input.
    Execute on Option to determine whether to use a specific MID Server or a MID Server cluster.

    This feature is supported in the Health Log Analytics application, Version 26.0.17 - February 2023 and later, available from the ServiceNow Store.

    MID Server

    (Only when the Execute on field is set to Specific MID Server)

    MID Server to which log data from Apache Kafka is pulled.
    Note:
    • You can select only MID Servers that support basic authentication. MID Servers that support mTLS are not listed.
    • The default maximum number of data inputs streaming logs to a single MID Server is 10. You can modify this number in the MID Server properties.
    • If log ingestion is not enabled for the selected MID Server, Health Log Analytics enables it automatically.
    This field is required.
    MID Server Cluster

    (Only when the Execute on field is set to Specific MID Server Cluster)

    The MID Server cluster to which the log data is pulled.

    The data input runs on a single MID Server in the cluster until that MID Server fails. The system then moves all the data input tasks to the next available MID Server in the cluster according to the configured order.

    This feature is supported in the Health Log Analytics application, Version 26.0.17 - February 2023 and later, available from the ServiceNow Store.

    Note:
    • Health Log Analytics supports only failover MID Server clusters. In these clusters, multiple MID Servers are grouped together for failover protection. When selecting a cluster from the data input form, the MID Server Clusters list displays only failover clusters.
    • The MID Server cluster must include only MID Servers that support basic authentication. mTLS is not supported for log ingestion.
    • Log ingestion must be enabled for each MID Server in the cluster. If log ingestion is not enabled for the active MID Server, Health Log Analytics enables it automatically.
    • The default maximum number of data inputs streaming logs to a single MID Server is 10. A cluster passes capacity validation if it contains at least one MID Server with fewer than 10 data inputs running on it, even when that MID Server is down.
    For more information about MID Server clusters, see Configure a MID Server cluster.

    This field is required.

    Service instance The service instance to which to bind the log data.
    Note:
    If no relevant service instance exists, create a service instance and add CIs to it. Set the status of the new service instance to Operational.
    This field is required.
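
    The cluster rules in the MID Server Cluster notes above (capacity validation and ordered failover) can be sketched as follows. This is a minimal illustration, not ServiceNow code: the `MidServer` class, its fields, and both function names are hypothetical, and treating "available" as "currently up" is an assumption. Only the limit of 10 data inputs comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_INPUTS_PER_MID = 10  # documented default; modifiable in MID Server properties


@dataclass
class MidServer:
    """Hypothetical stand-in for one MID Server in a failover cluster."""
    name: str
    order: int           # configured failover order (lower runs first)
    is_up: bool
    running_inputs: int  # data inputs currently streaming to this server


def passes_capacity_validation(cluster: List[MidServer]) -> bool:
    # At least one member must be below the limit. Per the note above,
    # this counts even when that member is currently down.
    return any(m.running_inputs < MAX_INPUTS_PER_MID for m in cluster)


def pick_active_server(cluster: List[MidServer]) -> Optional[MidServer]:
    # The data input runs on a single server; after a failure it moves to
    # the next available server in the configured order.
    # Assumption: "available" means the server is up.
    for server in sorted(cluster, key=lambda m: m.order):
        if server.is_up:
            return server
    return None


cluster = [
    MidServer("mid-a", order=1, is_up=False, running_inputs=3),
    MidServer("mid-b", order=2, is_up=True, running_inputs=10),
]
```

    In this example the cluster passes validation because mid-a has fewer than 10 data inputs, even though mid-a is down and the data input would actually run on mid-b.
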
    The following fields show read-only information:
    Field | Description
    Status | Status of the data input.
    Transport | Protocol used to stream the log data. This data input uses Apache Kafka to stream log data to your instance.
    Sources count | The number of log sources this data input has created.
    Disabled since | The time when the data input stopped or failed.
    Last log time | The time when the last log was streamed in the data input.
    Error message | The streaming error. This field is populated automatically. It displays only when a streaming error has occurred.

    Table 1. Query Settings tab
    Field | Description | Example
    From | Starting date and time for reading the data. Data older than this date and time is not read. This field is required. | Now -1 week

    Note:
    Setting this value to a date far in the past might require the system to read large amounts of data, causing congestion.
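
    As a rough illustration of how a relative From value such as "Now -1 week" acts as a cutoff, consider the sketch below. The function names are hypothetical, and whether an event stamped exactly at the cutoff is read is an assumption (treated as read here).

```python
from datetime import datetime, timedelta, timezone


def from_cutoff(now: datetime, weeks_back: int = 1) -> datetime:
    """Return the earliest timestamp to read, e.g. 'Now -1 week'."""
    return now - timedelta(weeks=weeks_back)


def is_read(event_time: datetime, cutoff: datetime) -> bool:
    # Data older than the From value is not read.
    # Assumption: an event exactly at the cutoff is still read.
    return event_time >= cutoff


now = datetime(2024, 8, 1, 12, 0, tzinfo=timezone.utc)
cutoff = from_cutoff(now)
```

    The further back the cutoff is moved, the more events pass `is_read`, which is why an early From value can cause congestion.
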
    Table 2. Transport tab
    Field | Description | Example
    Kafka node names | A comma-separated list in the format HOST:PORT,HOST:PORT. The list does not have to include all the servers in the Apache Kafka cluster. | 123.4.5.6:9092,123.3.4.5:9093
    Topics | A comma-separated list of topics to which the data input must subscribe. This field is required. | FirstTopic,SecondTopic,ThirdTopic
    Kafka credentials | Reference to the Apache Kafka credentials. See the note after this table. | None
    Group Id | The name of the Apache Kafka Consumer Group. | logs

    Note on Kafka credentials:
    You can display the Kafka SSL credentials form by selecting the information icon and then selecting Open Record. The form enables you to choose the security protocol used for authenticating with Apache Kafka from the following options:

    • SSL - SSL channel.
    • SASL_SSL - SASL authenticated, SSL channel.
    • SASL_PLAINTEXT - SASL authenticated, non-encrypted channel.

    For a description of the fields on the Kafka SSL credentials form, see Kafka SSL credentials fields.
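
    To check Transport tab values before entering them, the field formats can be validated with a short script. The sketch below is an illustration only: the function names are hypothetical, and the mapping of form fields to the standard Kafka consumer properties `bootstrap.servers`, `group.id`, and `security.protocol` is an assumption that this document does not confirm.

```python
from typing import Dict, List, Tuple


def parse_nodes(value: str) -> List[Tuple[str, int]]:
    """Parse 'HOST:PORT,HOST:PORT' into (host, port) pairs."""
    nodes = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        host, sep, port = item.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected HOST:PORT, got {item!r}")
        nodes.append((host, int(port)))
    return nodes


def parse_topics(value: str) -> List[str]:
    """Split the comma-separated Topics field; it must not be empty."""
    topics = [t.strip() for t in value.split(",") if t.strip()]
    if not topics:
        raise ValueError("Topics is a required field")
    return topics


def to_consumer_properties(nodes: str, group_id: str, protocol: str) -> Dict[str, str]:
    # Assumption: the form fields correspond to these standard Kafka
    # consumer property names; the form's real mapping is not documented here.
    return {
        "bootstrap.servers": ",".join(f"{h}:{p}" for h, p in parse_nodes(nodes)),
        "group.id": group_id,
        "security.protocol": protocol,
    }


props = to_consumer_properties("123.4.5.6:9092, 123.3.4.5:9093", "logs", "SASL_SSL")
topics = parse_topics("FirstTopic,SecondTopic,ThirdTopic")
```

    A malformed entry such as `broker1` (no port) raises `ValueError`, which catches the problem before the data input is saved.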

    Advanced configuration

    Table 3. Advanced configuration form
    Field | Description | Default value
    Timeout | The time, in milliseconds, spent waiting in the poll if data is not available in the topics. | 500
    Node discovery timeout | The time, in milliseconds, before node discovery times out. | 30
    Default timezone | The default timezone if the log doesn't include timezone information. | GMT
    Sub sample drop ratio | The number of events to batch together, out of which one will be discarded. This setting is used to reduce the number of fetched events. | -1
    Sub sample receive ratio | The number of events to batch together, out of which all but one will be discarded. This setting is used to decrease the number of received events. | -1
    Max length in bytes | The maximum length, in bytes, of events. | 32766
    Character encoding | The character encoding for this data input. | UTF-8
    Drop if queue is full | Option to discard logs if the MID Server is under load. | False
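
    The two sub sample ratios are easy to confuse, so the sketch below contrasts them on a small stream. It is an illustration under stated assumptions, not the product's implementation: the function names are hypothetical, the choice of which event in each batch is dropped or kept is an assumption, and so is reading the default of -1 as "sampling disabled".

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def drop_one_per_batch(events: Iterable[T], ratio: int) -> Iterator[T]:
    """Sub sample drop ratio: out of every `ratio` events, discard one."""
    if ratio <= 0:  # assumption: -1 (the default) disables sampling
        yield from events
        return
    for i, event in enumerate(events, start=1):
        if i % ratio != 0:  # assumption: the last event of each batch is dropped
            yield event


def keep_one_per_batch(events: Iterable[T], ratio: int) -> Iterator[T]:
    """Sub sample receive ratio: out of every `ratio` events, keep one."""
    if ratio <= 0:  # assumption: -1 (the default) disables sampling
        yield from events
        return
    for i, event in enumerate(events, start=1):
        if (i - 1) % ratio == 0:  # assumption: the first event of each batch is kept
            yield event


dropped = list(drop_one_per_batch(range(1, 11), 5))
kept = list(keep_one_per_batch(range(1, 11), 5))
```

    With a ratio of 5 over ten events, the drop variant passes eight events through and the receive variant passes two, so the receive ratio reduces volume far more aggressively for the same number.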