Now LLM Service updates

  • Release version: Yokohama
  • Updated May 27, 2026
    Summarized using AI
    This content was generated using new OpenAI-powered functionality. Results are provided on an as is basis and are not guaranteed to be accurate or complete.

    Summary of Now LLM Service updates

    The Now LLM Service offers access to specialized large language models (LLMs) developed or enhanced by ServiceNow, as well as selected open-source LLMs from the community and partners. These models support a variety of language-related tasks within the ServiceNow platform, such as conversational interactions, code generation, flow recommendations, summarization, and automation. Model cards provide detailed information about each LLM's intended use, training data, and limitations, helping customers understand the appropriate applications for each model.


    Key Features

    • Variety of Specialized Models: Includes text-to-text, text-to-code, text-to-flow, flow recommendation models, and both large and small language models tailored for different ServiceNow use cases like Virtual Agent, Now Assist, AI search, and automation workflows.
    • Model Cards: Provide transparency about model capabilities, use cases, and limitations, enabling informed selection and deployment within ServiceNow applications.
    • Multilingual Support: Expanded language support to include English, German, French, Japanese, Dutch, French Canadian, Spanish, Brazilian Portuguese, and Italian to better serve global teams.
    • JSON Output Format: New support for JSON-formatted responses ensures structured, consistent outputs that reduce errors and improve integration efficiency, lowering token consumption and operational costs.
    • Enhanced Instruction Following: Improved model training leads to more accurate interpretation and execution of user commands, delivering precise and actionable responses.
    • Model Consolidation: Recent releases unify multiple ServiceNow-related AI tasks into a single model architecture, simplifying deployment while improving performance.
    • Ongoing Model Updates: Regular enhancements include expanded context windows (up to 32K tokens), better multilingual capabilities (notably Japanese), and robust content moderation to ensure safe, reliable AI usage.

    June 2026 Release Highlights

    • Updated third-party default models to GAIC 13.1.2; teams should ensure updates are applied.
    • Changed default reasoning effort setting for GPT-5 Mini from "none" to "minimal"; regression and functional testing recommended.
    • Retirement and backend redirection of Claude Sonnet 4.0 models to Claude 4.5 Sonnet versions; no team action needed.
    • Discontinuation of the Now LLM long-term support (LTS) SKU; no testing or migration required by teams.

    Previous Key Releases

    • May 2025: Introduction of an advanced 12B parameter small language model (SLM) with enhanced instruction adherence, doubled context window (32K tokens), improved multilingual performance, and optimized workflow capabilities.
    • March 2025: Release of a powerful 12B parameter general-purpose SLM fine-tuned for case summarization, text-to-code, and content moderation with consolidated deployment and improved instruction following.
    • November 2024: Added support for additional languages, JSON output format for structured responses, deterministic and lower token consumption responses, and enhanced instruction following accuracy.

    Practical Implications for ServiceNow Customers

    Customers leveraging the Now LLM Service can expect improved AI-driven automation, conversational AI, and coding assistance with greater accuracy, efficiency, and language support. The introduction of JSON output simplifies integration with ServiceNow workflows, reducing errors and operational costs. Regular updates require teams to stay current with model versions and configuration changes, such as reasoning effort settings, to maintain optimal performance. The retirement of certain SKUs and model redirects are managed by ServiceNow, minimizing disruption.

    The Now LLM Service provides access to specialized large language models (LLMs) that are developed by ServiceNow. It also provides access to open-source LLMs that are selected, configured, or enhanced by ServiceNow, from the ServiceNow community and partners. Review these reference materials and model cards for additional information about the Now LLM Service and about the models used.

    Model cards

    Large language models (LLMs) are complex machine-learning models that are trained on large datasets like websites and documentation to perform language-related tasks, such as text generation for case summaries and resolution notes.

    Model cards explain the specific model's context, intended use, training data, limitations, and other important information.

    These model cards are for skills that use the Now LLM Service. There are certain skills, such as Now Assist Multi-Turn Catalog Ordering, that use Azure OpenAI instead. To see what LLM a skill is using, you can check the skill list in the Now Assist Admin console and review the LLM service column.

    Model card for ServiceNow text-to-text LLM

    Model used for conversational use cases, such as Virtual Agent topic execution and conversational catalog, and for agent assist use cases, such as alert analysis, AI search, and incident, case, and chat summarization.

    Model card for ServiceNow text-to-code LLM
    Model used for code generation.
    Model card for ServiceNow flow next-best-action LLM
    Model used for flow recommendations.
    Model card for ServiceNow text-to-flow LLM
    Model used for flow generation.
    Model card for ServiceNow text-to-text SLM
    Model used for Now Assist Guardian, text-to-cypher, and other use cases that demand rapid inference and high throughput.
    Model card for ServiceNow large language model
    Model used for AI-driven solutions to support natural language understanding, automation, and decision support.
    This model card is available in Yokohama patch 1 and later.
    Model card for ServiceNow small language model
    Model used for enterprise AI applications by enhancing text-based automation and content generation within ServiceNow workflows.
    This model card is available in Yokohama patch 1 and later.
    Model card for ServiceNow third party large language model
    Model used for AI-driven solutions for text generation, summarization, and conversational AI.
    This model card is available in Yokohama patch 1 and later.

    June 2026

    The June release includes updates to third-party model defaults, a change to the default reasoning effort setting for GPT-5 Mini, and the retirement of the Now LLM long-term support (LTS) SKU.

    • Third-party default model version update: Teams that did not update their third-party default model versions to the latest available versions in the May release must do so in the June release. GAIC 13.1.2 is the required version for this update.

    • GPT-5 Mini — reasoning_effort default change: The default reasoning_effort setting for GPT-5 Mini has changed from none to minimal. This change is included in GAIC Snapshot 14.0.0, which is compatible with Now Assist for Platform 12.0.0.

      Teams using GPT-5 Mini should run regression and functional testing to confirm that the new default works as expected. If you explicitly set reasoning_effort in your generative AI config additional properties, smoke test to verify there are no unexpected effects. If you have reasoning_effort: none set in additional properties, update the value to minimal and run regression and functional testing.

    • Claude Sonnet 4.0 retirement: Claude Sonnet 4.0 references are being redirected on the backend. No team action is required.

      • claude_large / Claude 4.0 Sonnet redirects to Claude 4.5 Sonnet
      • claude_small / Claude 4.0 Sonnet redirects to Claude 4.5 Sonnet
    • Now LLM LTS SKU retired: ServiceNow no longer offers the LTS model SKU. There are no testing requirements or expectations for teams related to this retirement.
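The GPT-5 Mini guidance above can be sketched as a small decision helper. This is a minimal illustration only: it assumes the additional properties are available as a plain dictionary, and the function name `plan_reasoning_effort_update` and the returned keys are hypothetical, not part of any ServiceNow API.

```python
from typing import Any, Dict

OLD_DEFAULT = "none"     # default before the June release
NEW_DEFAULT = "minimal"  # default after the June release


def plan_reasoning_effort_update(additional_properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return updated properties and the testing level the release note recommends.

    Hypothetical helper: the dictionary shape is an assumption for illustration.
    """
    updated = dict(additional_properties)  # copy so the caller's dict is untouched
    value = updated.get("reasoning_effort")

    if value is None:
        # Nothing set explicitly: the config inherits the new default.
        testing = "regression and functional testing"
    elif str(value).lower() == OLD_DEFAULT:
        # Explicit "none": update the value to "minimal", then retest fully.
        updated["reasoning_effort"] = NEW_DEFAULT
        testing = "regression and functional testing"
    else:
        # Any other explicit value: verify there are no unexpected effects.
        testing = "smoke test"

    return {"properties": updated, "recommended_testing": testing}


plan = plan_reasoning_effort_update({"reasoning_effort": "none"})
print(plan["properties"]["reasoning_effort"], "|", plan["recommended_testing"])
```

The helper only reports what the release note asks for. Applying the change and running the tests remain manual steps in your instance.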

    May 2025

    An advanced 12B general-purpose small language model (SLM) was released, with a single, high-performance architecture that supports a wide range of tasks in the ServiceNow context. Fine-tuned on Mistral-Nemo-12B-Instruct, this model is designed and optimized for tasks like Agent Assist, Text-to-Flow, Text-to-Cypher, Safety & Content Moderation, and Text-to-Code.

    Key Enhancements:
    • Enhanced instruction adherence: Improved the model’s capability to accurately interpret and follow user instructions, so that it can better understand and execute complex commands, leading to more precise and reliable outcomes than previous releases.
    • Increased context window: Increased the context window from 16K to 32K tokens, enabling the model to better understand long-form inputs, maintain coherence over extended interactions, and support more complex tasks with richer contextual awareness.
    • Improved multilingual proficiency: Boosted performance across languages compared to previous releases, with notable enhancements in Japanese processing.
    • Optimized for ServiceNow workflow-related capabilities: Extended support coverage for Text-to-Flow and improved the performance of Text-to-Code, Text-to-Cypher, and related tasks.
    • Continuously enhanced model deployment consolidation: Integrates ServiceNow-related tasks into a single model, reducing system complexity while elevating overall performance.
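To make the effect of the larger context window concrete, the following sketch estimates how many chunks a long input would need under each window size. It rests on assumptions not stated in the release note: that "16K" and "32K" mean 16 × 1024 and 32 × 1024 tokens, that input and output share the window, and that the token count comes from whatever tokenizer you use. The function name `chunks_needed` is hypothetical.

```python
import math

OLD_WINDOW = 16 * 1024  # assumed meaning of "16K" tokens
NEW_WINDOW = 32 * 1024  # assumed meaning of "32K" tokens


def chunks_needed(input_tokens: int, context_window: int, reserved_for_output: int) -> int:
    """Estimate how many chunks an input must be split into to fit the window."""
    budget = context_window - reserved_for_output  # tokens left for the input
    if budget <= 0:
        raise ValueError("reserved_for_output must be smaller than the context window")
    return max(1, math.ceil(input_tokens / budget))


# A 20,000-token input with 2,000 tokens reserved for the response.
print(chunks_needed(20_000, OLD_WINDOW, 2_000))  # 2
print(chunks_needed(20_000, NEW_WINDOW, 2_000))  # 1
```

Under these assumptions, an input that previously had to be split in two fits in a single request, which is one way the larger window helps keep long interactions coherent.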

    March 2025

    A powerful 12B general-purpose small language model (SLM) was released, designed to enhance a wide range of applications, including text-to-code and agent use cases. Fine-tuned on Mistral-Nemo-12B, it streamlines deployment and consolidates multiple functionalities into a single architecture.

    Key Enhancements:
    • Optimized to fulfill use cases: Enhances case summarization, chat summarization, resolution notes, and knowledge base generation across supported languages, including improvements in Japanese quality.
    • Superior text-to-code and text-to-cypher performance: Delivers major advancements in Glide JavaScript and generic JavaScript editing and generation, along with improved accuracy in query generation and execution for structured databases.
    • Robust content moderation and safety: Provides stronger protection against adversarial prompts, jail-breaking attempts, and harmful content generation, ensuring safer deployment with built-in content filtering.
    • Unified model deployment: Integrates ServiceNow-related tasks into a single model, thereby reducing system complexity while elevating overall performance.
    • Improved instruction adherence: Delivers better instruction following and consistency across varying levels of prompt and instruction strictness than the current text-to-text NowLLM.

    November 2024

    Several key improvements were added to the Now LLM Service that are aimed at enhancing performance and quality.

    • Multilingual support: Now LLM Service supports 8 additional languages, enabling global teams to use the model in their native languages.

      The supported languages are: English, German, French, Japanese, Dutch, French Canadian, Spanish, Brazilian Portuguese, and Italian.

    • JSON format support: The model now provides output in JSON format, making it easier for developers to integrate with various applications and automate workflows seamlessly.
      • Deterministic responses: JSON mode ensures structured, consistent output, which improves predictability and reliability when integrating with applications.
      • Error reduction: Unlike free-form text mode, JSON responses are less prone to format errors or stray characters, minimizing integration issues.
      • Lower token consumption: The fixed structure of JSON can reduce token usage, making it more efficient and cost-effective for applications with high response frequency.
    • Improvements in instruction following: The model has been fine-tuned to understand and follow instructions more precisely. This enables the model to deliver more to-the-point and actionable responses, helping users get the information they need faster and more efficiently.
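As an illustration of why JSON-formatted output is easier to integrate than free-form text, the sketch below validates a response before downstream code uses it. The response string, the key names `summary` and `priority`, and the helper `parse_json_response` are all hypothetical and are not taken from the Now LLM Service. Only Python's standard `json` module is used.

```python
import json
from typing import Any, Dict, Iterable


def parse_json_response(raw: str, required_keys: Iterable[str]) -> Dict[str, Any]:
    """Parse a JSON-mode model response and check that the expected keys exist."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Free-form text or stray characters end up here.
        raise ValueError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"response is missing keys: {missing}")
    return data


# Hypothetical model output, for illustration only.
example = '{"summary": "User cannot log in to VPN.", "priority": "high"}'
result = parse_json_response(example, ["summary", "priority"])
print(result["priority"])  # high
```

Because the structure is fixed, the calling code can fail fast on a malformed response instead of scraping values out of prose.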