Using Group By for classification
The Group By feature in classification enables ServiceNow customers to train and maintain a single classification solution that spans multiple data segments, such as geographical locations or business domains. By specifying a group-by field (which must be categorical), the system automatically creates individual child models for each group value, allowing more granular and relevant predictions within one unified solution.
Key Features
- Group By Parameter in API: When creating a classification solution via APIs, include the groupby parameter to specify the categorical field by which data is segmented.
- Multiple Child Models: The system generates a separate model for each group value, but only for groups with enough data to meet the minimum record criteria.
- Prediction Routing: Incoming prediction requests are routed to the appropriate child model based on the group-by field value in the input data.
- Batch Predictions: Currently, batch prediction is not supported for Group By solutions.
- Use Case Example: For a global company with US and Europe support centers, a single Group By solution can create distinct models for US and European incidents, simplifying management across locations and domains.
- API Usage: Sample code demonstrates how to define datasets, create classification solutions with Group By, submit training jobs, and perform predictions using GlideRecord or data maps.
Key Outcomes
- Efficient Model Management: Manage multiple classification models under one solution, reducing overhead compared to managing separate solutions per segment.
- Improved Prediction Accuracy: Models tailored to specific groups (like location or domain) can yield more accurate classifications.
- Flexible Application: Supports scenarios where data is naturally segmented, such as different countries or business units, enabling tailored machine learning within the same framework.
Use APIs to simultaneously submit multiple classification solutions for training based on the Group By field.
You can use the optional Group By capability to train and maintain one classification solution that covers more than one data area, such as geographical location or domain.
To train a solution using Group By, add the groupby parameter (groupByFieldName in the example on this page) when you create a classification solution definition using APIs. The groupby parameter accepts only categorical columns. An individual child model is trained on the subset of data belonging to each groupby value, and only those child solutions that meet the minimum records criteria set for the capability are created. Prediction calls are routed to the corresponding Group By model based on the Group By value present in the prediction input. Batch predictions are not supported.
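The partitioning and minimum-records behavior described above can be sketched in plain JavaScript. This is only an illustration of the idea and does not use the sn_ml API. The names partitionByGroup, selectTrainableGroups, and MIN_RECORDS are hypothetical, and the threshold of 3 is an assumption chosen for the sample data, not the platform's actual minimum.

```javascript
// Illustrative sketch only: plain JavaScript, not the sn_ml API.
// MIN_RECORDS is a hypothetical threshold; the real minimum is set by the platform.
const MIN_RECORDS = 3;

// Split records into one bucket per distinct value of the group-by field.
function partitionByGroup(records, groupByField) {
  const groups = new Map();
  for (const record of records) {
    const key = record[groupByField];
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(record);
  }
  return groups;
}

// Keep only the group values that have enough records to train a child model.
function selectTrainableGroups(groups, minRecords) {
  const trainable = [];
  for (const [key, rows] of groups) {
    if (rows.length >= minRecords) {
      trainable.push(key);
    }
  }
  return trainable;
}

// Made-up sample data for the sketch.
const sampleRecords = [
  { assignment_group: 'Network', category: 'network' },
  { assignment_group: 'Hardware', category: 'hardware' },
  { assignment_group: 'Network', category: 'network' },
  { assignment_group: 'Database', category: 'database' },
  { assignment_group: 'Hardware', category: 'hardware' },
  { assignment_group: 'Network', category: 'inquiry' },
  { assignment_group: 'Hardware', category: 'hardware' }
];

const groups = partitionByGroup(sampleRecords, 'assignment_group');
const trainableGroups = selectTrainableGroups(groups, MIN_RECORDS);
console.log(trainableGroups);
```

In this sample, the Database group has a single record, so under the assumed threshold it would not get a child model, while Network and Hardware would.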
A Group By scenario for geographical locations
Let's say your global company uses classification to route incoming records, with one support center in the US and one in Europe. You want one model for your United States incidents and another model for your European incidents. You can set this up in one of two ways:
- Create and train two separate ML classification solution definitions, one filtered to US incidents only and one filtered to European incidents only.
- Create a single classification solution and use the groupby parameter to group by country location, so that US records train a US model and European records train a European model. Then, based on the incoming incident, the system identifies which model to use to predict the classification category.
The second approach also works when the models belong to different domains, such as healthcare or finance, and it is especially beneficial if you have several country locations or domains to maintain.
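As a rough mental model of the routing step in this scenario, the following plain JavaScript sketch picks a child model from the group-by value on the incoming record. It does not call the sn_ml API. The childModels object, the routePrediction function, the country field name, and the keyword rules inside each stand-in model are all hypothetical. How the platform responds when no child model exists for a value is not covered here, so returning null is an assumption of the sketch.

```javascript
// Illustrative sketch only: plain JavaScript stand-ins, not the sn_ml API.
// Each "child model" is a hypothetical function that returns a category.
const childModels = {
  US: (incident) => (/vpn/i.test(incident.short_description) ? 'network' : 'inquiry'),
  Europe: (incident) => (/vpn/i.test(incident.short_description) ? 'connectivity' : 'request')
};

// Route a prediction request to the child model that matches the group-by value.
function routePrediction(models, groupByField, input) {
  const groupValue = input[groupByField];
  const hasModel = Object.prototype.hasOwnProperty.call(models, groupValue);
  if (!hasModel) {
    // Assumption: this sketch returns null when no child model matches.
    return null;
  }
  return { group: groupValue, predictedValue: models[groupValue](input) };
}

const usResult = routePrediction(childModels, 'country', {
  country: 'US',
  short_description: 'VPN is down'
});
const europeResult = routePrediction(childModels, 'country', {
  country: 'Europe',
  short_description: 'Need a new laptop'
});
const unknownResult = routePrediction(childModels, 'country', {
  country: 'APAC',
  short_description: 'VPN is down'
});
console.log(usResult, europeResult, unknownResult);
```

The caller works with a single entry point, and the group-by value on the record decides which model answers. That is the management benefit of one Group By solution over separately maintained solutions.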
Example usage for training and prediction using Group By via API
// Define the training dataset
var myIncidentData = new sn_ml.DatasetDefinition({
    'tableName': 'incident',
    'fieldNames': ['category', 'short_description', 'assignment_group', 'description', 'priority'],
    'encodedQuery': 'activeANYTHING'
});
// Define the solution; groupByFieldName is the categorical Group By field
var mySolution = new sn_ml.ClassificationSolution({
    'label': 'solution label',
    'dataset': myIncidentData,
    'groupByFieldName': 'assignment_group',
    'predictedFieldName': 'category',
    'inputFieldNames': ['short_description', 'description', 'priority']
});
//Add solution definition
var solution_gr = sn_ml.ClassificationSolutionStore.add(mySolution);
//Get existing solution
var my_unique_name = sn_ml.ClassificationSolutionStore.get('solution name');
// submit training job
var solutionVersion = my_unique_name.submitTrainingJob();
// Run prediction
var input = new GlideRecord("incident");
input.get("sys_id");
// configure optional parameters
var options = {};
options.apply_threshold = false;
var mlSolution = sn_ml.ClassificationSolutionStore.get('solution name');
//Prediction using glide record
var results = mlSolution.getActiveVersion().predict(input, options);
//Prediction using map (include the Group By field so the request can be routed)
var mapResults = mlSolution.getActiveVersion().predict([{
    'short_description': input.getValue('short_description'),
    'assignment_group': input.getValue('assignment_group')
}], options);
For more context regarding this example and the general usage of Machine Learning APIs, see the links in the Related Content section on this page.