This article describes target, output, and excluded attributes in Tealium Predict ML and how to select the right target attribute to use in your models.

Target, Output, and Exclusion Attributes

When you define a model, each attribute in your Tealium AudienceStream CDP profile is reviewed automatically to determine the top attributes that have a predictive relationship with the action you want to predict.

  • Target Attribute
    A target attribute is an AudienceStream attribute that represents the visitor behavior that you want to predict with a Tealium Predict model. For example, a boolean visit attribute named "Has Purchased" with a value of true signals that a purchase event occurred during a visit, while a value of false means that no purchase event occurred during the visit.

    The target attribute must be either a boolean (flag) or a badge attribute and must be Visit- or Visitor-scoped. When building a model, you can select a target attribute from the Visit- or Visitor-level boolean attributes that display in the drop-down list.

    Properly structured booleans default to false and are enriched to true. Visit booleans are reset to false after each visit, while Visitor booleans are reset to false depending on the attribute configuration.

    Do not change the configuration of an existing attribute. If you need a different configuration, create a new attribute to collect the required data.

  • Output Attribute
    The output attribute is an attribute created as a result of model training.
  • Exclusion Attributes
    You can exclude attributes that are not relevant to the results you are seeking. These are referred to as Exclusion Attributes. When deciding which attributes to exclude, Tealium recommends that you first train the model with no attributes excluded to gain initial insights.
    Training with no attributes excluded shows which attributes the model finds most relevant and can lead you to consider introducing new AudienceStream attributes to help future model trainings.
    After the initial training, you can exclude attributes whose values do not recur outside of the training period. After you exclude these types of attributes, your training F1 score may be lower when you retrain; however, your model produces better results when deployed. Examples of such attributes include:
    • Attributes based on dates of visit or dates of purchase. These attributes do not repeat their values outside of the training period.
    • Attributes based on unique user information, such as a User ID or Analytics ID. These attributes do not apply to other users outside of the training period.
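
As an illustration of the two kinds of exclusion candidates described above, the following Python sketch flags attributes whose values are dates or are nearly unique per record. It is not part of Tealium Predict: the function name `exclusion_candidates`, the `unique_ratio` cutoff, and the sample attribute names are all assumptions made for this example, and the data is assumed to have been exported as a list of dictionaries.

```python
from datetime import date


def exclusion_candidates(rows, unique_ratio=0.9):
    """Return attribute names that look date-based or ID-like.

    Hypothetical helper: `rows` is a list of dicts (one per visitor record),
    and `unique_ratio` is an arbitrary cutoff chosen for illustration.
    """
    if not rows:
        return []
    candidates = []
    for name in rows[0]:
        values = [row[name] for row in rows if row.get(name) is not None]
        if not values:
            continue
        # Date-valued attributes do not repeat outside the training period.
        is_date = all(isinstance(v, date) for v in values)
        # Nearly unique values (such as a User ID) do not generalize to other users.
        is_id_like = len(set(values)) / len(values) >= unique_ratio
        if is_date or is_id_like:
            candidates.append(name)
    return candidates


# Hypothetical sample records; attribute names are illustrative only.
rows = [
    {"user_id": "u-001", "last_purchase_date": date(2024, 1, 3), "device_type": "mobile"},
    {"user_id": "u-002", "last_purchase_date": date(2024, 1, 3), "device_type": "desktop"},
    {"user_id": "u-003", "last_purchase_date": date(2024, 1, 5), "device_type": "mobile"},
    {"user_id": "u-004", "last_purchase_date": date(2024, 1, 9), "device_type": "mobile"},
]
print(exclusion_candidates(rows))
```

A heuristic like this only suggests candidates for manual review. For example, a continuous numeric attribute can also have nearly unique values without being an identifier.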

Determining Attribute Readiness

When you create a model using Tealium Predict, you can choose from a list of potential target attributes from your AudienceStream dataset. Each attribute is given a rating of Ready or Not Ready to signify whether the attribute is ready for training. The rating displays next to each attribute in the drop-down list of available attributes when you click + New Model. This feature helps you determine, in advance, whether a candidate attribute is Ready or Not Ready to use as a target attribute, as shown in the following example.

[Image: predict_v2_target_attr_ready.png]

This rating simplifies model creation by clarifying which of your flags and badges are ready to be used to create models. A rating of Not Ready does not mean that the attribute is problematic in other contexts, but that it is currently deemed insufficient for successful training of a Tealium Predict model.

Machine learning generally requires a high volume of data to succeed, and models provide better results when trained on a larger amount of data.

The following two factors determine whether a target attribute is Ready or Not Ready:

  • The volume of data for the attribute
  • The distribution of that volume between true and false values

Both the true and false groups must be above a minimum threshold. For example, for the dates that span the Training Date Range, the median daily count of True visitors and the median daily count of False visitors must each be greater than or equal to 200. This threshold is intentionally set as low as possible to provide the most options possible for target attributes. A model with a target attribute that is labeled Not Ready typically fails during the training process due to insufficient data for the model.
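
The threshold rule above can be sketched in Python as follows. This is an illustrative approximation only, not Tealium's implementation: the function name `is_ready` and the sample daily counts are assumptions, and the sketch assumes you already have one True count and one False count per day in the Training Date Range.

```python
from statistics import median

# Threshold stated in this article for the median daily counts.
MIN_MEDIAN_DAILY_COUNT = 200


def is_ready(daily_true_counts, daily_false_counts, threshold=MIN_MEDIAN_DAILY_COUNT):
    """Return True when both groups meet the median daily threshold.

    Hypothetical helper: each argument is a list with one visitor count
    per day in the training date range.
    """
    if not daily_true_counts or not daily_false_counts:
        return False
    return (
        median(daily_true_counts) >= threshold
        and median(daily_false_counts) >= threshold
    )


# Hypothetical daily visitor counts for a seven-day training range.
true_counts = [180, 220, 250, 210, 190, 230, 240]
false_counts = [5200, 4800, 5100, 5000, 4950, 5300, 5050]
print(is_ready(true_counts, false_counts))  # True: medians are 220 and 5050
```

Because the check uses the median rather than the mean, a few unusually busy days do not make an otherwise sparse attribute appear ready.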

If the target attribute you want to use is labeled as Not Ready, try one of the following solutions:

  • Wait for more data to accumulate. The problem often resolves itself over a longer time period.
  • Identify ways to drive additional traffic to the AudienceStream data source.
  • Add additional data sources to your AudienceStream profile so that the volume of daily data increases for this target.