Working with AudienceStore and EventStore

by Community Manager on 03-10-2017 04:22 PM - edited on 08-04-2017 11:33 PM by Community Manager

This article shows how to access your AudienceStore and EventStore services, which store your unstructured audience and event data as compressed JSON files on Tealium's Amazon S3 bucket.

These services must be activated for your account. Contact your account manager for more info.

Prerequisites

  • Active AudienceStore Connector (see the Setup Guide)

or

DataAccess Console

You can browse the data files associated with each Audience and Stream via the DataAccess Console. This is an easy way to verify that data is flowing through the system and to download a sample file to get familiar with the format.

To access files via the DataAccess Console:

  1. Navigate to Act > DataAccess Console.
  2. Click AudienceStore or EventStore from the left-side navigation.
  3. Select the number of weeks of data to display.
  4. Select the name of the AudienceStore Action or Stream.
  5. Click Reload.
  6. Click a date to expand the list of file details.
  7. Find the file you want and click Download.

The compressed file (.gz or .gzip) is saved to your computer, where you can open it with an unzip utility.

Once you have your Amazon S3 credentials, you can also download your files with a GUI-based FTP client that supports S3, or programmatically with tools such as the AWS CLI (Command Line Interface) or the AWS SDK for Python (Boto).

As of August 4, 2017, downloaded JSON files have a .gz extension. Files downloaded before August 4, 2017 have the .gzip extension.
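To get a quick look at a downloaded file without unpacking it, a small shell helper can stream the first few lines to the terminal. This is a minimal sketch: the function name peek_gz and the file name in the usage comment are placeholders, not part of any Tealium tool, and it assumes the file contains line breaks (a file stored as one long line would print in full).

```shell
# peek_gz is a hypothetical helper, not part of any Tealium or AWS tool.
# It prints the first N lines (default 5) of a gzip-compressed file
# without writing an uncompressed copy to disk.
peek_gz() {
    file="$1"
    count="${2:-5}"
    # Reading from stdin lets gzip ignore the file extension,
    # so both .gz and .gzip files work.
    gzip -dc < "$file" | head -n "$count"
}

# Usage (the file name is a placeholder):
#   peek_gz ./downloaded-file.gz 3
```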

Retrieving Data Files

Your data files can also be accessed via third-party tools such as an FTP client or the Amazon S3 command line interface. To give these tools access to your files, you need the connection credentials for your bucket.

To get the Amazon S3 Access Key:

  1. Navigate to Act > DataAccess Console.
  2. Click AudienceStore or EventStore from the left-side navigation.
  3. Click Get Amazon Access Key.

The following fields are displayed: Access Key ID, Secret Access Key, and Path. For security purposes, the Secret Access Key is displayed only once, so store it securely for later use. If you lose this value you can generate a new one, but doing so invalidates all connections that used the old value.

FTP Clients with Amazon S3 Support

We recommend using a desktop application for a more convenient method of downloading a large number of AudienceStore/EventStore files.

Here are some Windows and Mac clients that work with Amazon S3.

  • Windows: Cyberduck, CrossFTP
  • Mac: Cyberduck, CrossFTP, Transmit

The primary benefit of using a GUI-based FTP client with S3 support is that you can point-and-click on individual files and folders to download from Amazon S3. 

Below is a sample screenshot showing how to configure the connection in Cyberduck. Note that the configuration wizard has no input for the Secret Access Key. Instead, you are prompted for it on the first connection attempt, and it is then saved for future use.

[Screenshot: Cyberduck connection configuration (cyberduck.png)]

Amazon Command Line Interface

For more technical users, the Amazon Command Line Interface (CLI) can be installed to give you full control over accessing your data files. The primary benefit of using the Amazon CLI is the ability to customize it for your specific needs, such as syncing and automating file retrieval from Amazon S3.

Some example uses:

  • Initial bulk download of all historical log files
  • Schedule hourly incremental download to grab only the newest generated log files
  • Synchronize a local folder on your desktop or server to a remote folder on S3 so that they contain exactly the same log file content
  • Download files before and/or after a certain LastModified date

To install Amazon CLI, follow the instructions at:

http://docs.aws.amazon.com/cli/latest/userguide/installing.html

You will be prompted for your Access Key ID and Secret Access Key when you run aws configure (you can leave Region Name and Output Format blank).
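For unattended use, such as a scheduled job, the AWS CLI can also read credentials from environment variables instead of the interactive aws configure prompts. The sketch below uses the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables; the values shown are placeholders to replace with the ones from the DataAccess Console.

```shell
# Placeholder values: substitute the Access Key ID and Secret Access Key
# shown in the DataAccess Console. Avoid committing real values to
# version control.
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"

# aws commands run from this shell session now pick up these credentials.
```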

Once the CLI is configured, you can query your bucket with the s3api commands, which are used in most of the remaining examples. The synchronization examples use the higher-level aws s3 command instead.

List Objects in S3

The "list-objects" method allows you to list objects in your S3 directory. This is needed to get the "key" for each object so that you can download individual files.

List All Objects in Root folder:

aws s3api list-objects --bucket uconnect.tealiumiq.com \
    --prefix {account}/{profile}/

 

List All Objects in events folder:

aws s3api list-objects --bucket uconnect.tealiumiq.com \
    --prefix {account}/{profile}/events/

 

List All Objects in specific folder:

aws s3api list-objects --bucket uconnect.tealiumiq.com \
    --prefix {account}/{profile}/events/{stream}/
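The full list-objects response is verbose. The CLI's --query option accepts a JMESPath expression that can reduce the output to object keys, for example only the keys modified on or after a given date. The following is a sketch under assumptions: the function names are hypothetical, the date is an example value, and the LastModified filter relies on the CLI comparing timestamp strings, which is worth verifying against your CLI version.

```shell
# Hypothetical helper: build a JMESPath expression that keeps only the
# keys of objects whose LastModified is on or after the given date.
modified_since_query() {
    printf "Contents[?LastModified>='%s'].Key" "$1"
}

# Hypothetical helper: print the matching keys under a prefix as plain text.
# $1 = prefix such as {account}/{profile}/events/{stream}/
# $2 = date such as 2017-08-01
list_keys_since() {
    aws s3api list-objects --bucket uconnect.tealiumiq.com \
        --prefix "$1" \
        --query "$(modified_since_query "$2")" \
        --output text
}

# Usage (requires configured credentials):
#   list_keys_since "{account}/{profile}/events/{stream}/" 2017-08-01
```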

Get Single Object

The "get-object" method will download one specific remote key to a local location on your desktop or server.

aws s3api get-object --bucket uconnect.tealiumiq.com \
    --key {account}/{profile}/events/{stream}/{filename}.gz ./{filename}.gz

 

The "--key" component is made up of this format:

 {account}/{profile}/events/{stream}/{filename}.gz
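Since the key follows a fixed pattern, it can be assembled from its parts before calling get-object. The sketch below illustrates this; build_key and download_file are hypothetical helper names, and the argument values in the usage comment are placeholders.

```shell
# build_key and download_file are hypothetical helpers.
# Assemble a key from account, profile, stream, and file name (without .gz).
build_key() {
    printf '%s/%s/events/%s/%s.gz' "$1" "$2" "$3" "$4"
}

# Download one file into the current directory under its original name.
download_file() {
    key=$(build_key "$1" "$2" "$3" "$4")
    aws s3api get-object --bucket uconnect.tealiumiq.com \
        --key "$key" "./$4.gz"
}

# Usage (all four arguments are placeholders):
#   download_file my-account my-profile my-stream my-file
```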

Synchronize Local and Remote folders

The "sync" method takes a remote folder on Amazon S3 and synchronizes it with a local folder on your desktop or server. In this example we synchronize a specific remote Stream folder to a local folder on the desktop.

The "--dryrun" argument shows which files would be synchronized without downloading them. To perform the actual download, remove the "--dryrun" argument.

aws s3 sync s3://uconnect.tealiumiq.com/{account}/{profile}/events/{stream}/ \
    ~/Desktop/temp --dryrun

Lastly, you can restrict the "sync" method to files that match a pattern by combining the "--exclude" and "--include" arguments.

aws s3 sync s3://uconnect.tealiumiq.com/{account}/{profile}/events/{stream}/ \
    ~/Desktop/temp --exclude "*" --include "*2015.06.14*" --dryrun

In this example, only the files that match the wildcard filter of "*2015.06.14*" will be downloaded.
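To automate the hourly incremental download listed among the example uses, the sync command can be wrapped in a small script and run from a scheduler such as cron. This is a sketch only: the function names, the local folder, and the dotted date format in file names (inferred from the 2015.06.14 example) are assumptions.

```shell
# Hypothetical helper: build an --include pattern for a date written
# as YYYY.MM.DD, matching the style of the 2015.06.14 example.
include_pattern() {
    printf '*%s*' "$1"
}

# Hypothetical helper: sync one day's files for a stream to a local folder.
# $1 = S3 path such as {account}/{profile}/events/{stream}/
# $2 = local folder
# $3 = date as YYYY.MM.DD (defaults to today)
sync_day() {
    day="${3:-$(date +%Y.%m.%d)}"
    aws s3 sync "s3://uconnect.tealiumiq.com/$1" "$2" \
        --exclude "*" --include "$(include_pattern "$day")"
}

# Usage (requires configured credentials):
#   sync_day "{account}/{profile}/events/{stream}/" ~/Desktop/temp 2015.06.14
```

Because sync copies only files that are new or changed compared with the local folder, running such a script every hour (for example from a crontab entry like `0 * * * * /path/to/script.sh`, where the path is hypothetical) should retrieve only the newly generated files.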