Customizing semantic metadata

  • Release version: Australia
  • Updated April 17, 2026

    Semantic metadata (descriptions, labels, and usage instructions) controls how Query Generation interprets natural-language questions. Customize this metadata to improve accuracy for your organization's terminology and data.

    Both entities (tables) and dimensions (fields) in the semantic layer have three metadata fields that Query Generation uses when processing a question:

    Semantic Label
    A short name or alias for the entity or dimension. Functions as a search keyword that helps AI search identify the correct table or field when a user's question does not match the default field label.
    Semantic Description
    A natural-language description of what the entity or dimension represents in business terms. Provides context that helps the system distinguish between similar tables or fields.
    Semantic Usage Instructions
    Instructions injected directly into the LLM prompt when the entity or dimension is selected. Teaches the system how to query the data — for example, which operators to use, how to handle abbreviations, or how to expand hierarchical values.

    When to customize

    Metadata is auto-generated and works well in its baseline state. Customize it only when you observe consistent, repeatable problems:

    • The system selects the wrong table or cannot find one — edit descriptions and labels
    • The wrong field is selected or a field is missing — edit descriptions and labels
    • Your organization uses different terminology than the auto-generated labels — edit labels
    • The right table or field is selected but the query is constructed incorrectly — edit usage instructions
    Note:
    There are thousands of dimensions. Do not try to review them all. Focus on fields that appear in failing queries.

    Two methods for customizing metadata

    Two methods are available. Editing the configuration tables is considered the default approach.

    Table 1. Metadata customization methods

    Method: Edit records on the Semantic Table Configuration and Semantic Column Configuration tables
    Tables: Semantic Table Configuration [sn_query_gen_table_config] and Semantic Column Configuration [sn_query_gen_column_config]
    When to use: Update set compatible. Customizations can be included in update sets and transferred between instances. Overrides values set in the Entity and Dimension tables. Supports entity and dimension descriptions and dimension usage instructions. Doesn't support entity labels or entity-level usage instructions.

    Method: Edit records on the Entity and Dimension tables
    Tables: Entity [sn_query_gen_entity] and Dimension [sn_query_gen_dimension]
    When to use: Simpler. Suitable for testing on a single instance. These customizations cannot be transferred between instances. However, this is the only method for changing entity labels and entity-level usage instructions. These edits could be overridden when an upgrade updates the table.

    Note:
    Both methods require the sn_query_gen.admin role or higher.
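
    The override relationship between the two methods can be pictured as a precedence lookup. The following is a minimal sketch only: the dictionaries stand in for records, the key names (label, description, usage_instructions) are hypothetical placeholders rather than real column names, and treating an empty configuration value as "no override" is an assumption. It does not describe how Query Generation resolves values internally.

```python
# Fields that the configuration tables can override, per Table 1.
# Key names are hypothetical placeholders, not real column names.
CONFIG_OVERRIDABLE = {
    "entity": {"description"},
    "dimension": {"description", "usage_instructions"},
}


def effective_metadata(kind, base_record, config_record=None):
    """Return the metadata in effect for one entity or dimension.

    base_record   -- values from the Entity or Dimension table
    config_record -- values from the matching configuration table, if any
    """
    effective = dict(base_record)
    for field in CONFIG_OVERRIDABLE[kind]:
        override = (config_record or {}).get(field)
        if override:  # assumption: empty or missing values keep the base value
            effective[field] = override
    return effective


incident = effective_metadata(
    "entity",
    base_record={"label": "Incident", "description": "Incident table"},
    config_record={
        "label": "Ticket",  # ignored: entity labels are not overridable here
        "description": "IT incidents, outages, and service disruptions",
    },
)
print(incident)
```

    In this sketch the entity keeps its original label, because entity labels can be changed only on the Entity table, while the description comes from the configuration record.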

    Writing effective descriptions

    Follow these guidelines when writing semantic descriptions:

    • Keep descriptions to 1–2 sentences focused on how users refer to this data
    • Include common synonyms and abbreviations your users would say
    • Avoid full paragraphs. Concise descriptions match better than verbose ones
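
    If you maintain many descriptions, a small script can flag drafts that drift away from these guidelines before you save them. The sketch below is a hypothetical helper, not part of the product. Its sentence splitting is deliberately naive (abbreviations such as "e.g." count as sentence ends), so treat its warnings as hints.

```python
import re


def check_description(text, max_sentences=2):
    """Return a list of warnings for a draft semantic description."""
    warnings = []
    stripped = text.strip()
    # Naive split: a sentence ends at ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", stripped) if s]
    if len(sentences) > max_sentences:
        warnings.append(
            f"{len(sentences)} sentences; aim for {max_sentences} or fewer"
        )
    if "\n\n" in stripped:
        warnings.append("contains a paragraph break")
    return warnings


draft = "IT incidents, outages, service disruptions, and IT support tickets"
print(check_description(draft))
```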

    Writing effective usage instructions

    Follow these guidelines when writing semantic usage instructions:

    • Be specific and structured. Include rules, examples, and edge cases
    • Use numbered steps or labeled sections if the logic is complex
    • Include example user questions and what the query should look like
    • For free-text fields, specify matching strategy (CONTAINS, exact match) and expansion rules
    • Keep instructions focused on one entity or field's needs

    Usage instructions vs. Segments

    Usage instructions teach the LLM how to query a field dynamically, allowing the LLM to handle many scenarios based on your rules. Segments hard-code specific filter values. For fields like free-text location, usage instructions are the correct approach because you cannot predefine every possible location query. Segments are better for fixed business terminology such as "Sev1" = priority 1.

    Semantic description

    For an Incident entity, instead of using just "Incident table", use a description like "IT incidents, outages, service disruptions, and IT support tickets" to include terminology your users actually say.

    Data conventions in usage instructions

    For a State field that stores abbreviations:

    "Values in this field may be full state names or two-letter abbreviations (for example, 'California' or 'CA'). Always query for both forms. For country names, also include common aliases (for example, 'United Kingdom' OR 'UK')."

    Complex query logic in usage instructions

    These usage instructions are for a free-text Approximate Location field with hierarchical expansion and synonym handling:

    Field: Approximate Location - Usage Instruction
    The Approximate Location field is a non-normalized, free-text string. All queries must use case-insensitive CONTAINS matching and handle hierarchical expansion.
    Core Query Logic
    1. Identify the geographic entity (Neighborhood, City, State, Country, Region, or Continent).
    2. Expand broader entities into explicit lists of sub-entities before querying.
    3. Use OR logic for all expanded terms and synonyms.
    Expansion Rules
    • Continents: Expand to a list of all major countries within that continent.
    • Regions: Expand to relevant states or countries (for example, "Atlantic Coast" -> NC, VA, FL, etc.).
    • States/Provinces: Include both full names and standard abbreviations (for example, "North Carolina" OR "NC").
    • Countries: Include common aliases (for example, "UK" OR "United Kingdom").
    Implementation Pattern
    1. Extract geographic intent.
    2. Classify entity level.
    3. Expand downward (Continent > Country) or include synonyms (State > Abbreviation).
    4. Construct a single query string using OR-based CONTAINS filters.

    Notice the pattern: the instruction defines rules and examples, and the LLM can handle many related queries dynamically based on the rules you provide. You teach the logic once, and the model applies it to any input.

    Examples of user questions converted to queries
    • User: "Show me things in Japan" → Query: location CONTAINS "Japan"
    • User: "Show me things in Asian countries" → Decomposition: Asia -> [Japan, China, Indonesia...] → Query: location CONTAINS "Japan" OR location CONTAINS "China" OR location CONTAINS "Indonesia"...
    • User: "Show me things in North Carolina" → Query: location CONTAINS "North Carolina" OR location CONTAINS "NC"
    • User: "Show me everything in Europe" → Decomposition: Europe -> [UK, Spain, France, Germany...] → Query: location CONTAINS "UK" OR location CONTAINS "United Kingdom" OR location CONTAINS "Spain" OR location CONTAINS "France"...
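
    To make the expansion logic concrete, the examples above can be reproduced with a few lines of code. This is an illustrative sketch of the pattern that the usage instruction describes, not how Query Generation is implemented: the LLM draws on its own knowledge rather than fixed lists, and the lookup tables, function names, and the location field name here are assumptions chosen to mirror the examples. The continent lists are intentionally incomplete.

```python
# Illustrative lookup tables (assumptions, deliberately incomplete).
CONTINENTS = {
    "Asia": ["Japan", "China", "Indonesia"],
    "Europe": ["UK", "Spain", "France", "Germany"],
}
SYNONYMS = {
    "UK": ["United Kingdom"],
    "North Carolina": ["NC"],
}


def expand(place):
    """Expand a continent into countries, then add synonyms for each term."""
    terms = []
    for name in CONTINENTS.get(place, [place]):
        terms.append(name)
        terms.extend(SYNONYMS.get(name, []))
    return terms


def build_query(place, field="location"):
    """Join the expanded terms into one OR-based CONTAINS filter string."""
    return " OR ".join(f'{field} CONTAINS "{term}"' for term in expand(place))


print(build_query("Japan"))
print(build_query("North Carolina"))
print(build_query("Europe"))
```

    The rule is written once in expand, and every input passes through it. This is the same property that makes usage instructions a better fit than Segments for free-text fields.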