This document defines general statistical modeling terminology, terms specific to Tealium products, and terms used in the Tealium Predict interface.
In AudienceStream, attributes allow you to define important characteristics that represent visitor habits, preferences, actions, and engagements with your brand.
In the Tealium Predict product, you can choose any Boolean or Badge attribute as the "Target" attribute. To do this, you must have already set up enrichments in AudienceStream that set the attribute to true or assign the badge when the visitor performs the action of interest.
By default, all AudienceStream attributes are used by Predict during training to prepare a model. You can optionally choose to exclude some attributes from your model so that they are not considered.
In Tealium products, an audience is a group of visitor profiles that share a set of attribute conditions and is used to trigger vendor actions (connectors) in real time. The more conditions you use to create an audience, the more specific the audience.
AudienceStream is a Tealium product consisting of omnichannel customer segmentation and a real-time action engine. AudienceStream takes the data that flows into the Tealium EventStream product and creates visitor profiles that represent the most important attributes of a visitor's engagements with your brand. Visitor profiles are then segmented by shared behaviors to create targeted audiences that fuel your marketing technology stack in real time via connectors.
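The effect of adding conditions can be illustrated with a minimal sketch. This is not Tealium's API: the profile fields (`cart_abandoner`, `lifetime_orders`) and the `build_audience` helper are hypothetical names chosen for illustration, and an audience is modeled simply as the profiles that satisfy every condition.

```python
# Hypothetical visitor profiles; the attribute names are illustrative only.
profiles = [
    {"id": "a", "cart_abandoner": True, "lifetime_orders": 0},
    {"id": "b", "cart_abandoner": True, "lifetime_orders": 3},
    {"id": "c", "cart_abandoner": False, "lifetime_orders": 5},
]


def build_audience(profiles, conditions):
    """Return the ids of profiles that satisfy every condition."""
    return [p["id"] for p in profiles if all(cond(p) for cond in conditions)]


# One condition yields a broad audience.
broad = build_audience(profiles, [lambda p: p["cart_abandoner"]])

# Adding a second condition narrows the same audience.
narrow = build_audience(
    profiles,
    [lambda p: p["cart_abandoner"], lambda p: p["lifetime_orders"] > 0],
)
```

Here `broad` contains two profiles and `narrow` contains one, which mirrors the point that each additional condition makes the audience more specific.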
In Tealium Predict, the confusion matrix, also known as an error matrix, is a performance measurement reported for a trained model in the Model Explorer that compares actual and predicted values. In industry terms, a confusion matrix uses a set of test data for which true values are known and then displays actual and predicted values in table format to allow you to visualize the performance of a given algorithm.
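As a rough illustration of the industry definition, the sketch below tallies a confusion matrix for a binary outcome from test data with known true values. The labels are made-up sample data and the `confusion_matrix` function is a hypothetical helper, not part of Tealium Predict.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count the four actual/predicted combinations for binary labels."""
    counts = Counter(zip(actual, predicted))
    return {
        "tp": counts[(True, True)],    # predicted positive, actually positive
        "fn": counts[(True, False)],   # predicted negative, actually positive
        "fp": counts[(False, True)],   # predicted positive, actually negative
        "tn": counts[(False, False)],  # predicted negative, actually negative
    }


# Made-up test data: True means the visitor performed the action of interest.
actual = [True, True, True, False, False, False, False, True]
predicted = [True, False, True, False, True, False, False, True]

matrix = confusion_matrix(actual, predicted)
print(matrix)
```

Arranged as a two-by-two table, these four counts show at a glance where the model's predictions agree with the actual values and where they do not.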
In Tealium Predict, data collection refers to a predetermined time range in which data is collected to use with the product.
In the Tealium iQ Tag Management product, a data layer is the foundation of tag management that defines the attributes of your website, such as site language or page names, and other user behaviors that you want to track, such as purchases and logins.
In industry terms, a Data Scientist is typically an analytical expert who uses skills in technology and social science to look for trends and manage data, applying industry knowledge, contextual understanding, and skepticism of existing assumptions to reveal solutions to business challenges.
In Tealium Predict, a deployed model is a model that has been "trained" and then deployed to populate prediction values into your customer profiles in AudienceStream.
A hyperparameter is a parameter that is automatically set before the learning process for a model begins. Though these parameters are tunable and can directly affect how well a model trains, no action is required on your part.
Machine learning is a subfield of artificial intelligence that focuses on enabling machines to learn without human guidance by recognizing patterns. Machine learning uses a set of predetermined rules to remember the patterns, analyze output, and create a model that explains the patterns and guides future behavior. In cases where you know what data you want, machine learning accelerates the path to acquiring it. In cases where you do not know exactly what you want or which pattern to identify, machine learning can find a pattern and reveal results that you can use to move forward with acquiring the data you need.
In Tealium Predict, a model represents the behavior you are predicting within a specific timeframe, such as a purchase, conversion, or any customer behavior tracked in AudienceStream. A model is created using an algorithm and the results are used to explain patterns and predict future outcomes.
In Tealium Predict, the Model Explorer is an interactive section of the product interface where you can view performance measurements for each model in each stage and fine-tune the model with actionable items from the interface, such as retrain or deploy.
In Tealium Predict, the model strength is a combination of metrics and acceptable thresholds used to assign a score to a model. These scores are used to determine model quality and performance and the ability of the model to perform in the real world. The model score displays in the Model Explorer as "Excellent", "Good", "Fair", or "Poor".
In Tealium Predict, the probability distribution is a performance graph reported for a trained model in the Model Explorer. This graph shows how well the model separates the cases in which a visitor did return and perform the action of interest from the cases in which the visitor did not. In industry terms, a probability distribution is a mathematical function that gives the probabilities of the different possible outcomes of an experiment, and thus the probability of a predetermined event occurring.
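One way to picture this kind of separation is to bucket predicted probabilities separately for each actual outcome. The sketch below uses made-up scores and a hypothetical `score_histogram` helper. It is not how Tealium Predict renders its graph.

```python
def score_histogram(scores, bins=4):
    """Count scores in equal-width buckets over the range 0.0 to 1.0."""
    counts = [0] * bins
    for score in scores:
        index = min(int(score * bins), bins - 1)  # keep a score of 1.0 in the last bucket
        counts[index] += 1
    return counts


# Made-up predicted probabilities, grouped by what the visitor actually did.
did_action = [0.9, 0.85, 0.7, 0.6]
did_not_act = [0.1, 0.2, 0.3, 0.55]

positive_hist = score_histogram(did_action)
negative_hist = score_histogram(did_not_act)
print(positive_hist, negative_hist)
```

When the two histograms pile up at opposite ends of the range with little overlap, the model is separating the two groups well. Heavy overlap in the middle buckets suggests weaker separation.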
In Tealium Predict, the ROC/AUC (receiver operating characteristic, area under the curve) is a performance measurement reported for a trained model in the Model Explorer. In industry terms, the ROC curve plots the true positive rate against the false positive rate across classification thresholds, and the AUC summarizes that curve as a single value. The true positive rate is calculated as the number of true positives divided by the sum of the number of true positives and the number of false negatives. It describes how well a model predicts the positive class when the actual outcome is positive. The true positive rate is also referred to as sensitivity.
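The sketch below illustrates the general technique: it computes the true positive rate, then builds ROC points (false positive rate, true positive rate) by sweeping a threshold over the scores and measures the area under them with the trapezoidal rule. The labels, scores, and function names are assumptions for illustration. The source does not describe how Tealium Predict computes this measurement internally.

```python
def true_positive_rate(tp, fn):
    """Sensitivity: true positives over all actual positives."""
    return tp / (tp + fn)


def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if not y and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a sequence of (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up data: 1 means the visitor performed the action of interest.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

auc = area_under_curve(roc_points(labels, scores))
print(round(auc, 3))
```

An AUC near 1.0 means positives almost always receive higher scores than negatives, while a value near 0.5 is no better than random ranking.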
Over time, and without additional training, the prediction accuracy of a model degrades. When you retrain a Tealium Predict model with new data, prediction accuracy increases and the model remains accurate over a longer period of time.
In Tealium Predict, training refers to the stage in which a model consumes and analyzes data for a predetermined period of time to be used for predictions. The size and quality of the data used during this stage play an important part in the accuracy of results when you deploy the model.
In Tealium Predict, a trained version is a single instance of training a model. Every machine learning model is trained with data in order to accurately make predictions.