Structured Data vs. Semi Structured Data

by kathleen_jo on 09-23-2016 02:25 PM - edited on 03-07-2017 12:06 PM by Community Manager

This document describes the differences between structured data and semi structured data and how they relate to DataAccess.

What is data modeling?

Before comparing structured data and semi structured data, we first need to define data modeling. Data modeling establishes the logical structure of a database: it determines how data is stored, organized, and manipulated. DataAccess provides two types of data models, structured data and semi structured data.

What is Structured Data?

Structured data is highly organized, predictable, and easy to search. The data usually sits in fixed fields within predetermined records and can be related to other records within its structure. Examples of structured data include a relational database or even the table below:

first_name   last_name   order_id   order_total
Kathleen     Jo          123456     12.34
John         Doe         098765     98.76
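
As a rough illustration of why fixed fields make data easy to search, the table above can be loaded into a relational database and queried by column. This is a minimal sketch using Python's built-in sqlite3 module. The table name `orders` and the column types are assumptions made for the example, not part of DataAccess.

```python
import sqlite3

# In-memory database; the "orders" table name and column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " first_name TEXT NOT NULL,"
    " last_name TEXT NOT NULL,"
    " order_id TEXT PRIMARY KEY,"   # TEXT keeps the leading zero in 098765
    " order_total REAL NOT NULL)"
)

# Every row must supply the same four fields, in the same order.
rows = [
    ("Kathleen", "Jo", "123456", 12.34),
    ("John", "Doe", "098765", 98.76),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)

# Because the fields are fixed, filtering on a column is a one-line query.
big_orders = conn.execute(
    "SELECT first_name, order_id FROM orders WHERE order_total > ?", (50,)
).fetchall()
print(big_orders)  # [('John', '098765')]
```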

What is Semi Structured Data?

Semi structured data does not have the same level of organization and predictability as structured data. The data does not reside in fixed fields or records, but it does contain elements that separate the data into various hierarchies. Examples of semi structured data are:

  • JSON (this is the structure that DataAccess uses by default)
  • XML
  • .csv files
  • tab delimited files

 

For example, the same two orders expressed as JSON:

[
     {
          "first_name"  : "Kathleen",
          "last_name"   : "Jo",
          "order_id"    : "123456",
          "order_total" : "12.34"
     },
     {
          "first_name"  : "John",
          "last_name"   : "Doe",
          "order_id"    : "098765",
          "order_total" : "98.76"
     }
]
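
To make the "no fixed fields" point concrete, here is a small sketch that parses records like the ones above with Python's json module. The nested `shipping` field on the second record is a hypothetical addition for illustration only. It shows that one record can carry extra, nested elements that another record lacks.

```python
import json

# The "shipping" object is hypothetical; it is not part of the original example.
raw = """
[
  {"first_name": "Kathleen", "last_name": "Jo",
   "order_id": "123456", "order_total": "12.34"},
  {"first_name": "John", "last_name": "Doe",
   "order_id": "098765", "order_total": "98.76",
   "shipping": {"city": "Springfield"}}
]
"""

records = json.loads(raw)

# Fields shared by both records, and fields only the second record has.
common = set(records[0]) & set(records[1])
extra = set(records[1]) - set(records[0])

# Nested (hierarchical) values have to be reached key by key,
# and the code must tolerate records where the key is absent.
city = records[1].get("shipping", {}).get("city")
missing_city = records[0].get("shipping", {}).get("city")

print(sorted(common))
print(extra, city, missing_city)
```

Nothing in the format itself stops a record from adding or omitting a field, which is why a consumer has to check for each field rather than rely on a fixed layout.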

 

Why are the differences important?

Structured data is easier to...

  • upload
  • extract
  • load
  • store
  • query
  • and analyze

...because of its high degree of organization and fixed fields. 

Because semi structured data is less organized, the tasks above are harder to accomplish, and the data requires an ETL (extract, transform, load) process into a system such as Hadoop before it can be used.
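
The "transform" step of such a process can be sketched in a few lines. The example below is only an illustration under stated assumptions: the `COLUMNS` tuple, the `to_row` helper, and the `coupon` field are hypothetical, and this is not how DataAccess or Hadoop performs the work. It maps loosely shaped records onto a fixed set of columns so that they can be loaded into a structured store.

```python
# Target layout for the structured store (an assumption for this example).
COLUMNS = ("first_name", "last_name", "order_id", "order_total")


def to_row(record):
    """Map one loosely shaped record onto the fixed column order.

    Missing fields become None; fields outside COLUMNS are dropped.
    """
    row = {name: record.get(name) for name in COLUMNS}
    if row["order_total"] is not None:
        # Totals arrive as strings in the semi structured input.
        row["order_total"] = float(row["order_total"])
    return tuple(row[name] for name in COLUMNS)


# Second record lacks last_name and carries a hypothetical extra field.
records = [
    {"first_name": "Kathleen", "last_name": "Jo",
     "order_id": "123456", "order_total": "12.34"},
    {"first_name": "John", "order_id": "098765",
     "order_total": "98.76", "coupon": "SPRING"},
]

rows = [to_row(r) for r in records]
for row in rows:
    print(row)
```

Every output row has the same four positions regardless of what the input record contained, which is the property that makes the later load and query steps straightforward.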

DataAccess, Structured Data, and Semi Structured Data

Below, please find a chart describing the different DataAccess offerings. 

                       Event Data    Audience Data   Platform
Semi Structured Data   EventStore    AudienceStore   Amazon S3
Structured Data        EventDB       AudienceDB      Amazon Redshift
Customer Collected     EventDirect   Webhook         Customer Host