Exploring Service Observability

Release version: Yokohama

Updated July 31, 2025

3 minutes to read

Summary of Exploring Service Observability

Service Observability enables operations teams to efficiently triage and manage incidents within complex, distributed production environments by integrating telemetry from external Application Performance Monitoring (APM) systems with Configuration Management Database (CMDB) data. This integration is presented within a unified workflow in the Service Operations Workspace (SOW), allowing users to view service health metrics alongside related configuration item (CI) details.

Supported APM vendors include Amazon CloudWatch, AppDynamics, Datadog, Dynatrace, Microsoft Azure Monitor, New Relic, Prometheus (on-premise), SolarWinds (on-premise), and Splunk Observability. Supported databases include MySQL, PostgreSQL (except with Splunk), and Amazon RDS (via CloudWatch).

By mapping CMDB services to APM metrics using tags, Service Observability correlates data such as host or database metrics with associated service CIs. This correlation helps operators identify the root cause of service issues directly within the SOW, without switching tools.

Key Features

Centralized Data Integration: Combines external APM telemetry with CMDB service data for comprehensive visibility.
Service-to-Metric Mapping: Uses tags to connect APM metrics to corresponding CMDB services, enhancing context.
Unified Interface: Displays health metrics, incidents, alerts, and changes related to services within the SOW.
Role-Based Access: System admins and Service Observability admins configure connections, mappings, and dashboards; operators and managers use the tool for incident triage and health monitoring.
Customizable Dashboards: Admins can tailor dashboard templates to suit organizational needs for displaying metrics.

How It Works

For Admins:

Identify and register critical services to monitor based on business priority.
Connect Service Observability to existing APM instances.
Map services to APM data using tags for accurate metric association.
Customize dashboard templates to optimize metric visualization.

For Operators and Managers:

Detect service issues via alerts or dashboards in the SOW.
Review overall service health metrics, incidents, alerts, and recent changes.
Drill down into detailed metrics on the Observability tab to identify underlying problematic entities.
Determine ownership of affected entities to initiate remediation effectively.

Benefits

Improved Agility and Reliability: Centralizes critical signals and bridges workflows by integrating diverse data sources.
Reduced Mean Time to Resolution (MTTR): Facilitates faster root cause analysis by presenting combined metrics and related CI information.
Comprehensive Incident Context: Enables viewing of service health, alerts, and related changes in one place for informed decision-making.
Customization: Allows administrators to tailor dashboard templates to meet specific operational requirements.

Service Observability helps operations teams triage and manage incidents in a complex and distributed production system. It combines external application performance monitoring (APM) systems' telemetry with related data from the Configuration Management Database (CMDB) and displays both in a single workflow in the Service Operations Workspace (SOW).

Service Observability overview

Service Observability displays health metrics in the SOW related to a given service. Metrics can be ingested from an external APM system and displayed alongside information for related configuration items in the CMDB.

Service Observability supports the following APM vendors:

Amazon CloudWatch
AppDynamics
Datadog
Dynatrace
Microsoft Azure Monitor
New Relic
Prometheus (on-premise)
SolarWinds (on-premise)
Splunk Observability

Service Observability supports the following databases:

MySQL
PostgreSQL (not supported with Splunk)
Amazon RDS (Relational Database Service), through Amazon CloudWatch

After connecting an APM instance to Service Observability, map services in the CMDB to APM metrics using existing tags.

With this data mapping, Service Observability displays APM metrics for entities such as hosts or databases, along with details about the related CIs. Operators use these metrics and contextual information, including current incidents and alerts, to assess service health.

For example, say you use Dynatrace to monitor your checkout service, and the metrics from its database and host carry the tag checkout-service to denote requests coming from that service. When you map the checkout service CI to the APM data tagged with checkout-service, Service Observability retrieves the metrics for those databases and hosts, retrieves the CIs related to the service, and displays them together. Operators can pinpoint issues on entities related to the service and narrow down the mitigation process without leaving the SOW.
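The tag-based mapping in this example can be pictured with a minimal Python sketch. It is an illustration only, not the product's implementation or API: the MetricSeries and ServiceMapping classes, the metrics_for_service function, and the entity and metric names are all hypothetical. It assumes each APM metric carries a set of tags and that a mapping pairs one service CI with one tag.

```python
from dataclasses import dataclass, field


@dataclass
class MetricSeries:
    """One metric reported by an APM for one entity (hypothetical shape)."""
    entity: str        # for example, a host or database name
    entity_type: str   # "host" or "database"
    name: str          # metric name, for example "cpu.utilization"
    value: float
    tags: set = field(default_factory=set)


@dataclass
class ServiceMapping:
    """Pairs a CMDB service CI with the APM tag that identifies its data."""
    service_ci: str
    tag: str


def metrics_for_service(mapping, metrics):
    """Return the metrics that carry the mapped tag, grouped by entity."""
    grouped = {}
    for metric in metrics:
        if mapping.tag in metric.tags:
            grouped.setdefault(metric.entity, []).append(metric)
    return grouped


# Hypothetical APM data: two entities tagged for checkout, one for search.
apm_metrics = [
    MetricSeries("db-01", "database", "query.latency_ms", 240.0, {"checkout-service"}),
    MetricSeries("host-07", "host", "cpu.utilization", 0.91, {"checkout-service", "prod"}),
    MetricSeries("host-12", "host", "cpu.utilization", 0.35, {"search-service"}),
]

mapping = ServiceMapping(service_ci="Checkout Service", tag="checkout-service")
result = metrics_for_service(mapping, apm_metrics)
print(sorted(result))  # prints ['db-01', 'host-07']
```

Only the entities whose metrics carry the checkout-service tag are returned, which mirrors how the mapping limits the displayed metrics to entities related to the mapped service.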

Service Observability users

Table 1. Users
User	Description
System admin	Version 1.5 only. System admins configure users and teams, register services to be monitored, connect Service Observability to APMs, and then map those services to that data. They can also view the data in the SOW.
Service Observability admin	Version 1.6.x and later. Service Observability admins can configure users and teams, connect Service Observability to APMs, and then map services to that data. They can also view the data in the SOW. Admins can also customize dashboard templates used to display metrics and related information.
Operator/operations manager	Operators use Service Observability when triaging incidents in the SOW. They can view basic health metrics for a service, along with related incidents, alerts, and changes. They can get more detailed information by navigating to the Observability tab to view additional service metrics, along with metrics from related entities, such as a host or database. Note: These users must belong to an `srm` group type to see all data.

Service Observability workflow

Admins configure Service Observability by registering services, connecting APM instances, and then mapping the services to the APM data. Operators use Service Observability to determine whether a related entity is causing the issues that surface in a service's performance.

As an admin, you:

Determine the services to be monitored by Service Observability based on business criticality.
Connect existing APM instances to Service Observability.
Map services to APM metric data based on the tags already applied to that data in the APM.
Customize the templates used to display metric charts.

As an operator or manager, you:

Spot an issue with a service while working in the SOW, for example, from an alert, the Service dashboard, or Express List, then navigate to the Service Details page.
View overall health metrics for the service, along with related incidents, alerts, and changes. If one of the metrics seems unhealthy, navigate to the Observability tab.
View more detailed service metrics, as well as information from related entities, to start root cause investigation. If the issue turns out to be further down the system's stack, identify the owner of that entity to start remediation.
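The operator steps above amount to a drill-down: find the entities whose metrics look unhealthy, then look up who owns them. The following Python sketch shows that reasoning under stated assumptions. It is not how the SOW evaluates health: the EntityHealth class, the per-metric threshold, the ci_owners lookup, and all names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EntityHealth:
    """A single metric reading for a service or a related entity (hypothetical)."""
    entity: str
    metric: str
    value: float
    threshold: float  # assumption: a value above this counts as unhealthy


def unhealthy_entities(readings):
    """Return the readings whose value exceeds their threshold."""
    return [r for r in readings if r.value > r.threshold]


def owners_to_contact(readings, ci_owners):
    """Map each unhealthy entity to its owning group, or "unassigned"."""
    return {
        r.entity: ci_owners.get(r.entity, "unassigned")
        for r in unhealthy_entities(readings)
    }


# Hypothetical readings for a service and two related entities.
readings = [
    EntityHealth("checkout-service", "error_rate", 0.08, 0.05),
    EntityHealth("host-07", "cpu.utilization", 0.42, 0.85),
    EntityHealth("db-01", "query.latency_ms", 950.0, 300.0),
]

# Hypothetical ownership data, standing in for CI details from the CMDB.
ci_owners = {"checkout-service": "Checkout Team", "db-01": "Database Operations"}

contacts = owners_to_contact(readings, ci_owners)
print(contacts)
```

In this sketch the service's error rate is elevated, the host looks healthy, and the database latency is well over its threshold, so the database owner is the likely starting point for remediation.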

Service Observability benefits


Benefit	Feature	Users
Centralize critical signals and bridge workflows to increase agility and reliability: connect data from external APMs, map that data to CMDB services, and view combined data in the SOW.	Connect a Service Observability data source; Create and manage data mappings	Admins
Increase efficiency and reduce mean time to resolution (MTTR). View combined metrics from entities associated with a service to begin to determine blast radius and ownership of an incident.	View service health metrics	Operators
See related changes to the system and alerts associated with a service in one place.	View overall service health	Operators
Customize dashboard templates.	Customize Service Observability dashboard templates	Admins

What to explore next

To learn more about configuring and using Service Observability, see: