This article provides best practices and recommendations for retraining a model.

About Retraining

Although machine learning and statistical tools cannot guarantee future performance, following a few simple guidelines can improve results and reduce the clutter that comes from creating additional models instead of retraining those already in place. The following sections provide general guidelines and best practices for retraining both trained and deployed models.

For information about how to retrain a model, see Retraining a Model.

Estimating the Number of Trainings Needed

Estimating how many trainings a model needs is more art than science, and it depends on machine learning technology in general rather than on the product. Fine-tuning any model takes multiple trainings, typically one to three, including tuning which attributes to exclude. When the model starts to degrade, another training produces a better result, and the cycle then repeats at gradually longer intervals. Each training (version) typically takes longer to degrade because the training date range grows as time elapses.

Identifying a Degraded Model

Elapsed time and additional traffic are the two main factors in model degradation. A model in training is only a snapshot, whereas a deployed model is live. If the score for your deployed model is lower than the score for your trained model, the model has degraded and needs retraining.
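
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, not part of the product: the function name is_degraded, the tolerance parameter, and the example scores are all hypothetical.

```python
def is_degraded(trained_score: float, deployed_score: float,
                tolerance: float = 0.0) -> bool:
    """Return True when the deployed score has fallen below the trained score.

    `tolerance` is a hypothetical allowance for small drops that may be noise.
    """
    return (trained_score - deployed_score) > tolerance


# Example scores are made up for illustration.
print(is_degraded(trained_score=0.82, deployed_score=0.74))  # True
print(is_degraded(0.82, 0.81, tolerance=0.02))                # False
```

A nonzero tolerance is one possible way to avoid reacting to day-to-day fluctuation; the right value, if any, depends on your data.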

Deploying a Newly-Trained Version with a Higher Score

When a currently-deployed version of a model has a low performance score and you retrain to create a new version that has a higher score, deploy the newly-trained version to ensure that the best version of the model is deployed.

Creating a second, similar model in an attempt to achieve a higher score is not recommended. Doing so is unlikely to provide additional benefit and can add clutter to your implementation.

When a Newly-Trained Version has a Lower Score

When the score for a deployed model drops, the model is starting to degrade and needs to be retrained to improve performance. The following list illustrates an example in which a model begins to degrade and is retrained, but the retrained model does not achieve a higher score.

  • The performance for a deployed model using live data starts to degrade, indicating a need for retraining.
  • The model is retrained but the F1 score for the retrained model appears to be the same as the original, degraded score. After a few days, the score decreases even more.

In this scenario, do not deploy the newly-trained version in place of the current one. Continue adjusting and retraining until the score for the newly-trained version is higher than the degraded score of the currently deployed version. Deploy the new version only when its score is higher.
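
As a rough sketch of this decision rule, the following Python computes an F1 score from prediction counts and deploys only when the retrained version strictly beats the degraded score. It assumes the standard F1 definition (the harmonic mean of precision and recall); the function names and example counts are hypothetical and do not reflect how the product calculates or stores scores.

```python
def f1_score(true_positives: int, false_positives: int,
             false_negatives: int) -> float:
    """Standard F1, written as 2*TP / (2*TP + FP + FN)."""
    denominator = 2 * true_positives + false_positives + false_negatives
    return 0.0 if denominator == 0 else 2 * true_positives / denominator


def should_deploy(new_score: float, degraded_score: float) -> bool:
    """Deploy only when the retrained version strictly beats the degraded score."""
    return new_score > degraded_score


# Hypothetical counts for the degraded deployed version and a retrained one.
degraded = f1_score(true_positives=60, false_positives=40, false_negatives=40)
retrained = f1_score(true_positives=60, false_positives=40, false_negatives=40)

# Equal scores: keep adjusting and retraining rather than deploying.
print(should_deploy(retrained, degraded))  # False
```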

Low F1 Scores and Rapid Degradation

If a model has a low F1 score when trained and then degrades rapidly, this typically points to one or both of the following issues:

  • The behavior of your visitors changed or was inconsistent during the timeframe used for training, compared with their present behavior.
  • You might need additional visitor attributes for a more complete view of visitor behavior.

Retraining Frequency

The need to retrain models is not specific to the Predict product; it applies to machine learning in general. The frequency of retraining depends on your data, which differs greatly between businesses. In general, longer training date ranges tend to produce models that degrade more slowly. Retraining is typically called for when model quality degrades below a predetermined score that your organization considers acceptable.
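
A threshold-based trigger of this kind can be sketched as follows. The function name first_day_below, the daily score list, and the 0.75 threshold are hypothetical values chosen for illustration; your organization sets its own acceptable score.

```python
from typing import Optional, Sequence


def first_day_below(scores: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first score below `threshold`, or None if none is."""
    for day, score in enumerate(scores):
        if score < threshold:
            return day
    return None


# Hypothetical daily scores for a deployed model.
daily_scores = [0.81, 0.80, 0.78, 0.74, 0.69]
print(first_day_below(daily_scores, threshold=0.75))  # 3
```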

Retraining vs. Deleting

Consider retraining a model before deleting it. When you delete a model, you lose the training history. When you retrain, each training can have a different configuration in terms of time frame, excluded attributes, and data added over time. You can then view the differences between versions (individual trainings) to decide which version to deploy.

How Global Markets Impact Model Training

Global events that affect markets, such as the COVID-19 pandemic of 2020, affect models to some extent. When the world, businesses, and visitor behavior shift rapidly, a model that normally takes months to degrade can degrade much faster. Although this can make it difficult to get a clear picture of your visitor behavior, you can easily retrain a new version and deploy it without the manually intensive process of repeated retuning, which consumes valuable engineering and data staff resources.

How Marketing Campaigns Impact Model Training

Models are sometimes trained while other events are under way, such as a television or radio ad campaign, a major holiday, unseasonable heat, mass riots, or political upheavals.

Once the event ends or no longer applies, your current model may not perform as well as it did during training. At this point, data science becomes an important factor in your decision-making process.

In addition to training your models, you can control for these events by holding them constant, or you can ensure that data attributes representing the events are in place so that the model can use that data.
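
One way to represent an event as a data attribute is to flag each record that falls within the event's date range. The sketch below assumes records are plain dictionaries with a "date" key; that record shape, the function name add_event_flag, and the attribute name campaign_active are hypothetical and are not product features.

```python
from datetime import date


def add_event_flag(records, start, end, flag_name="campaign_active"):
    """Return copies of `records` with a boolean attribute marking event dates."""
    return [
        {**record, flag_name: start <= record["date"] <= end}
        for record in records
    ]


# Hypothetical visitor records and campaign dates.
records = [
    {"date": date(2020, 11, 26), "visits": 3},
    {"date": date(2020, 12, 5), "visits": 1},
]
flagged = add_event_flag(records, start=date(2020, 11, 25), end=date(2020, 11, 30))
print([r["campaign_active"] for r in flagged])  # [True, False]
```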

Data Distribution Considerations

For a model to predict accurately, the data it scores must have a distribution similar to that of the data on which the model was trained. Because data distribution drifts over time, model deployment is not a one-time task but a continuous process. It is a best practice to continuously monitor your incoming data and retrain your model on newer data when the data distribution has deviated from the original training data distribution. If monitoring data to detect changes in distribution has a high overhead, you can simply retrain your model periodically, such as daily, weekly, or monthly.
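
As one possible way to monitor for drift, the sketch below computes a Population Stability Index (PSI) between the training values and the incoming values of a single numeric attribute. PSI is a swapped-in, general-purpose technique and is not described as a product feature here. The function name, the bin count, and the 0.2 cutoff are illustrative assumptions.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index of `actual` relative to `expected`."""
    low, high = min(expected), max(expected)
    width = (high - low) / bins or 1.0  # guard against a zero-width range

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical attribute values: training data versus shifted incoming data.
training = [i / 100 for i in range(100)]
incoming = [0.5 + i / 200 for i in range(100)]

DRIFT_CUTOFF = 0.2  # assumed cutoff for illustration, not a product setting
print(psi(training, incoming) > DRIFT_CUTOFF)  # True
```

A value near zero suggests the two distributions are similar; larger values suggest drift and may be a signal to retrain on newer data.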

Additional Information

For information about how to retrain a model, see Retraining a Model.