Using Stream Connect for Apache Kafka
Summary of Using Stream Connect for Apache Kafka
ServiceNow® Stream Connect for Apache Kafka enables you to integrate your Apache Kafka environment with your ServiceNow instance, facilitating high-volume, low-latency streaming of data between ServiceNow and external systems. This integration is built on the Hermes Messaging Service, which manages data flows and supports both producing and consuming Kafka events seamlessly within ServiceNow.
Note: Using Stream Connect for Apache Kafka requires subscriptions to Workflow Data Fabric and Stream Connect for Apache Kafka.
Key Features
- Producers and Consumers: Stream Connect provides multiple producers (Workflow Studio Kafka Producer step and ProducerV2 API) to publish Kafka events, and several consumer types (Kafka Message trigger, ETL Consumer, Transform Map Consumer, Script Consumer) to read and process events.
- Integration with Workflow Studio: Build low-code flows that produce and consume Kafka messages. Kafka Message trigger automatically creates necessary Kafka streams and subscriptions, simplifying event-driven automation.
- Support for Apache Avro Format: Import and create Avro schemas for efficient message serialization, reducing payload size and simplifying integration. Schemas can be imported from the Confluent Registry or created via JSON.
- Data Import and Transformation: Use existing Robust Transform Engine (RTE) or transform map configurations to process Kafka data through ETL or Transform Map consumers, or apply custom logic with the Script Consumer for advanced use cases.
- Message Replication: Configure and manage Kafka message replications directly within ServiceNow using a MID Server, eliminating the need for additional replication services and automating certificate generation.
- Message Handling and Reliability: Undelivered and unprocessed messages are stored in dedicated tables for retry and automatic cleanup, ensuring message delivery reliability.
- Compression Support: Configure compression (NONE, GZIP, LZ4) for producers to optimize data transmission. Compressed messages can be consumed only if they use GZIP or LZ4 compression.
- Domain Separation and Namespace Management: Use topic namespaces to logically organize Kafka topics and control access based on domains in domain-separated instances.
- Monitoring and Reporting: View detailed statistics on producers, consumers, subscriptions, and message processing performance through the Stream Connect dashboard.
Practical Benefits for ServiceNow Customers
- Scale and Performance: Handle large volumes of Kafka events with low latency, enabling real-time data synchronization and event-driven workflows.
- Simplified Integration: Leverage low-code Workflow Studio triggers and actions to easily consume and publish Kafka messages without deep technical Kafka expertise.
- Reuse Existing Configurations: Utilize current RTE and transform map setups to import Kafka data, reducing development effort and maintaining consistency.
- Reliable Data Flow: Benefit from automated retry mechanisms and message tracking to ensure data integrity and minimal loss.
- Secure and Organized: Manage Kafka topics and access via namespaces and domain separation to maintain security and governance in multi-domain environments.
Implementation Essentials
- Enable the ServiceNow Stream Connect Installer plugin to activate licensed components.
- Create Kafka streams to define data flows for ETL, Transform Map, or Script consumers as needed.
- Configure subscriptions automatically via Workflow Studio triggers or manually for other consumers to monitor Kafka topics and partitions.
- Set system properties to customize producer compression and consumer timeout behavior for optimized performance.
By leveraging ServiceNow Stream Connect for Apache Kafka, customers can seamlessly integrate Kafka event streaming into their ServiceNow workflows, enhancing automation, data synchronization, and operational efficiency across their enterprise systems.
Connect your Apache Kafka environment to your ServiceNow instance with ServiceNow® Stream Connect for Apache Kafka.
Apache Kafka is a distributed event-streaming platform that provides a unified way to exchange data across multiple systems. Stream Connect for Apache Kafka links your Kafka environment to your ServiceNow instance, enabling you to stream data between your instance and your external systems.
Benefits
- Publish and process Kafka events at scale. Publish events to your Kafka environment from your ServiceNow instance and consume Kafka events from your external systems at a high volume with low latency.
- Build flows that produce and consume Kafka events. Stream Connect is integrated with Workflow Studio, providing a low-code way to publish and process Kafka messages.
- Import data from your Kafka environment and process that data using your existing Robust Transform Engine (RTE) or transform map configurations.
- Configure a consumer that uses your own scripts to process data from a Kafka topic.
- Monitor your consumers' performance with detailed reporting of statistics and performance metrics.
Components
Stream Connect has the following components.
- Producers
A producer publishes events to a Kafka environment. Stream Connect has two producers.
- Kafka Producer step in Workflow Studio
- ProducerV2 API
- Consumers
A consumer reads and processes events from a Kafka environment. Stream Connect has several consumers.
- Kafka Message trigger in Workflow Studio
- Extract Transform Load (ETL) Consumer
- Transform Map Consumer
- Script Consumer
- Topics and topic namespaces
Events are organized and stored in topics, and each topic stores events of the same type. Topics are divided into partitions. Each event has a key, and events with the same key are stored in the same partition.
Topics link to a topic namespace. You can use namespaces to organize topics in logical ways. For example, you can group topics together based on which Kafka cluster they come from. You can also use namespaces to configure which domains can access which topics on a domain-separated instance. For more information, see Managing namespaces and topics in Hermes.
- Subscriptions
A subscription is a record associated with a consumer. It stores configuration information about the consumer, such as the name of the Kafka topic to consume messages from and the number of partitions the topic has. The subscription record is created when a Kafka stream is activated.
Each subscription record has several metrics that enable you to view the performance of the consumer reading from the topic. For more information, see Viewing Kafka subscriptions and statistics.
- Partition groups
A partition group is a set of topic partitions. For example, if a topic has six partitions, they can be divided into three partition groups, with two partitions in each group.
- Kafka consumer job
A job that regularly checks Hermes for any new events in a topic. The job picks a free partition group and retrieves its subscription. The subscription gives the topic name, and the job checks the partitions for messages for that topic.
- Kafka streams
A Kafka stream is a record that defines the data stream for a consumer. If you're using the Kafka Message trigger in Workflow Studio, the Kafka stream is automatically created for you. If you're using a different consumer, you’ll need to create one manually.
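The partitioning ideas described under topics and partition groups above can be illustrated with a minimal sketch. It assumes a simple stable hash (CRC32) for mapping keys to partitions and contiguous, evenly sized partition groups. Both are assumptions made for illustration: Kafka's default partitioner uses a different hash function, and the source does not say how Stream Connect assigns partitions to groups. The function names are hypothetical.

```python
import zlib


def partition_for_key(key: str, num_partitions: int) -> int:
    """Map an event key to a partition index with a stable hash.

    CRC32 is used only for illustration; Kafka's default partitioner
    uses a different hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_groups(num_partitions: int, num_groups: int) -> list[list[int]]:
    """Split partition indexes 0..num_partitions-1 into contiguous groups."""
    if num_groups <= 0 or num_partitions % num_groups != 0:
        raise ValueError("num_partitions must divide evenly into num_groups")
    size = num_partitions // num_groups
    return [list(range(start, start + size)) for start in range(0, num_partitions, size)]


# Six partitions divided into three groups of two, as in the example above.
groups = partition_groups(6, 3)
print(groups)  # [[0, 1], [2, 3], [4, 5]]

# The same key always maps to the same partition.
print(partition_for_key("INC0010001", 6) == partition_for_key("INC0010001", 6))  # True
```

Because the hash is deterministic, every event with a given key lands in the same partition, which is what keeps related events together.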
To link your Kafka environment to your ServiceNow instance, Stream Connect uses the Hermes Messaging Service. The Hermes Messaging Service enables your instance to produce and consume large volumes of Kafka events. It manages the flow of data between your Kafka environment and your instance. For more information, see Hermes Messaging Service.
The following diagram shows some of the key components of Stream Connect.
Stream Connect and Workflow Studio
Build flows that produce and consume Kafka events with Stream Connect and Workflow Studio. Stream Connect has a flow trigger for consuming Kafka events and an action step for producing them.
Use the Kafka Message trigger to create flows that process Kafka events. You can build a flow that consumes data from Kafka and inserts it into a table, or uses spokes to communicate the data to third-party environments.
The trigger is enabled when the flow is activated. After it's activated, the trigger starts the flow whenever there's a message in the specified Kafka topic. When you use the Kafka Message trigger, you don't need to create a Kafka stream or subscription record. The system automatically creates both when the flow is activated. Messages are read from the topic as long as the flow is active.
Use the Kafka Producer step to create actions that publish events to a topic in your Kafka environment. For example, you can use the step to create a message about an update on an incident in ServiceNow, then push the message to a topic in your Kafka environment.
Support for messages in an Avro format
Import and create schemas to send and receive messages in an Apache Avro format. Using an Avro format can reduce the size of the payload and simplify your integration with your local Kafka instance.
You can import Avro schemas directly from the Confluent Registry, or you can create your own schemas using a JSON file or a JSON-formatted string. The schemas are stored in ServiceNow and enable your producers and consumers to convert plain-text messages to an Avro format and back. For details, see Schema management in Stream Connect.
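As an illustration of what a JSON-formatted Avro schema looks like, the following sketch builds a record schema and serializes it to a JSON string. The record name, namespace, and field names are hypothetical and describe an incident-update event only for the sake of example. The sketch does not show how the schema is registered in ServiceNow.

```python
import json

# Hypothetical schema for an incident-update event; all names are assumptions.
incident_update_schema = {
    "type": "record",
    "name": "IncidentUpdate",
    "namespace": "com.example.servicenow",
    "fields": [
        {"name": "number", "type": "string"},
        {"name": "state", "type": "string"},
        {"name": "priority", "type": "int"},
        # Optional field: a union with "null" first and a null default.
        {"name": "short_description", "type": ["null", "string"], "default": None},
    ],
}

# Serialize to the JSON-formatted string form of the schema.
schema_json = json.dumps(incident_update_schema, indent=2)
print(schema_json)
```

An Avro record schema declares every field and its type up front, so the encoded message does not need to repeat field names. That is the main reason an Avro payload can be smaller than the equivalent plain-text JSON.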
ETL, Transform Map, and Script Consumers
Import data from your Kafka environment using your existing RTE or transform map configurations. The Extract Transform Load (ETL) and Transform Map consumers simplify your data imports by providing an efficient way to take a payload from a Kafka message, transform the data, and insert or update a record in a table. You can switch from a scheduled data import to one using Stream Connect and process the data with the same configurations.
You can also use the Script Consumer to process data from your Kafka environment. The Script Consumer is for more advanced use cases, such as when the data in the message isn't structured, or it requires data lookups using code.
When you configure an Extract Transform Load (ETL) Consumer, a Transform Map Consumer, or a Script Consumer, you also need to create a Kafka stream.
ProducerV2 API
Publish events to a Kafka topic with the ProducerV2 API.
Stream Connect Message Replication
You can replicate data between your Kafka environment and ServiceNow with Stream Connect Message Replication.
Stream Connect Message Replication enables you to configure and manage message replications directly from your ServiceNow instance. It uses a MID Server to run the data replications, so you don't need to configure or host additional replication services. It also simplifies the message replication setup by automatically generating the required certificates.
For more information, see Stream Connect Message Replication.
Unprocessed and undelivered messages
If a message can't be delivered, it’s stored in the Kafka Undelivered Messages [sys_kafka_undelivered_messages] table. A scheduled job, Kafka Producer Retry, regularly reads this table and tries to redeliver any messages.
If a batch of messages can't be processed because it has timed out, it’s stored in the Kafka Unprocessed Messages [sys_kafka_unprocessed_messages] table. The time-out for a message batch can be set with the com.glide.kafka_consumer.timeout property. The default value is 60 seconds. This table is a rotated table, so it cleans records automatically.
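The retry pattern for undelivered messages can be sketched as follows. This is a generic illustration, not the Kafka Producer Retry job itself: an in-memory list stands in for the undelivered messages table, and the `send` callback, the function name, and the sample messages are all hypothetical.

```python
from typing import Callable


def retry_undelivered(table: list[dict], send: Callable[[dict], bool]) -> int:
    """Try to redeliver every stored message; keep the ones that still fail.

    Returns the number of messages delivered on this pass.
    """
    still_undelivered = []
    delivered = 0
    for message in table:
        if send(message):
            delivered += 1
        else:
            still_undelivered.append(message)
    table[:] = still_undelivered  # update the stand-in "table" in place
    return delivered


# Hypothetical messages and a fake sender that fails for one topic.
undelivered = [
    {"topic": "incidents", "value": "INC0001 updated"},
    {"topic": "offline-topic", "value": "cannot reach broker"},
]
count = retry_undelivered(undelivered, lambda m: m["topic"] != "offline-topic")
print(count, len(undelivered))  # 1 1
```

Messages that fail again stay in the table, so the next scheduled pass picks them up.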
Producer compression formats
- NONE
- GZIP
- LZ4
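To show why compression can help with transmission size, the sketch below compresses a repetitive JSON payload with GZIP using Python's standard library and restores it. The payload is made up for illustration. LZ4 is not shown because it is not part of the Python standard library, and the sketch does not show how compression is configured in Stream Connect.

```python
import gzip
import json

# Hypothetical payload: repeated structure tends to compress well.
payload = json.dumps(
    [{"number": f"INC{i:07d}", "state": "In Progress"} for i in range(200)]
).encode("utf-8")

compressed = gzip.compress(payload)
restored = gzip.decompress(compressed)

# Compare the sizes before and after compression.
print(len(payload), len(compressed))
```

The gain depends on the data: small or already compressed payloads may see little benefit, which is one reason NONE remains an option.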
Domain separation
Use Stream Connect topic namespaces to configure which domains can access a Kafka topic on a domain-separated instance. Group topics into ServiceNow namespaces, then link the namespaces to specific domains. For more information, see Domain separation and Stream Connect.
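The relationship between topics, namespaces, and domains can be pictured with a small lookup sketch. All namespace, topic, and domain names are hypothetical, and the check is a simplified model: it ignores domain hierarchies and does not reflect how ServiceNow actually evaluates access.

```python
# Hypothetical namespace, topic, and domain names.
NAMESPACE_TOPICS = {
    "emea_cluster": {"emea.incidents", "emea.changes"},
    "us_cluster": {"us.incidents"},
}
NAMESPACE_DOMAINS = {
    "emea_cluster": {"TOP/EMEA"},
    "us_cluster": {"TOP/US", "TOP/Global"},
}


def domain_can_access(domain: str, topic: str) -> bool:
    """Return True if any namespace containing the topic is linked to the domain."""
    return any(
        topic in topics and domain in NAMESPACE_DOMAINS.get(ns, set())
        for ns, topics in NAMESPACE_TOPICS.items()
    )


print(domain_can_access("TOP/EMEA", "emea.incidents"))  # True
print(domain_can_access("TOP/EMEA", "us.incidents"))    # False
```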
Architecture diagram
The following diagram shows key components of Stream Connect, how they relate to ServiceNow and third-party applications, and how they connect to your Kafka environment through Hermes.
Plugin
Stream Connect requires the ServiceNow Stream Connect Installer [com.glide.hub.stream_connect.installer] plugin. This plugin enables the licensed components for working with message-based streaming data in Stream Connect.