Visit Intelligence Data Structure
The table below describes the structure of the daily file that is shared with partners. The file contains all the information gathered by Cuebiq's top data providers. It is delivered in TSV format, with one field per column.
Standard Visit Intelligence | High Volume Visit Intelligence
Fields Included |
Timestamp (of Visit) | Timestamp (of first time seen at POI)
Device ID | Device ID
POI | POI (Place ID, Place Name, Geoset)
Dwell Time |
Use Cases |
|
|
Cuebiq reserves the right to add new fields to the ones above. New fields will be added at the end of the existing list.
Example of Visit Intelligence data from a TSV file:
2019-02-15T11:08:14-07:00 e132698c-5676-4161-b201-22d63de12306 17.13 82c306f2-2be6-4362-a173-45a097a53ca2 BEST_BUY BEST_BUY_ANALYTICS
2019-02-15T17:31:49-05:00 8B0EC37F-DD8E-407B-AB0C-2650D45B5BE3 34.55 4ad6c0e1-e16a-47ed-82c6-3b63d76f0144 Sam's Club SAM_CLUB_ANALYTICS
2019-02-15T09:05:37-05:00 8b0ec09d-d9b0-4792-9b46-8d5ed8e60c75 62.23 1589dcce-b3b5-4434-bda8-046c39cd40f7 Fifth Third Bank & ATM FIFTH_THIRD_BANK_ANALYTICS
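The following is a minimal sketch of parsing one row, assuming the six columns appear in the order suggested by the example rows above (timestamp, device ID, dwell time, place ID, place name, geoset). The `Visit` class and its field names are illustrative, not an official schema. Rows are split on tabs only, because place names such as "Sam's Club" contain spaces, and any extra trailing columns are ignored so that newly appended fields do not break the parser.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Visit:
    # Field names are illustrative; the column order is inferred from the example rows.
    timestamp: datetime   # local time with a UTC offset
    device_id: str
    dwell_minutes: float
    place_id: str
    place_name: str
    geoset: str


def parse_visit(line: str) -> Visit:
    """Parse one tab-delimited row; extra trailing columns are ignored."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}")
    ts, device_id, dwell, place_id, place_name, geoset = fields[:6]
    return Visit(
        timestamp=datetime.fromisoformat(ts),
        device_id=device_id,
        dwell_minutes=float(dwell),
        place_id=place_id,
        place_name=place_name,
        geoset=geoset,
    )


row = "\t".join([
    "2019-02-15T17:31:49-05:00",
    "8B0EC37F-DD8E-407B-AB0C-2650D45B5BE3",
    "34.55",
    "4ad6c0e1-e16a-47ed-82c6-3b63d76f0144",
    "Sam's Club",
    "SAM_CLUB_ANALYTICS",
])
visit = parse_visit(row)
print(visit.place_name, visit.dwell_minutes)
```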
Data Access for a Feed in Production
In a production setting where customers purchase live, ongoing data, Cuebiq generates Visit Intelligence data daily and shares the data feed via an Amazon S3 bucket.
A Cuebiq representative will generate a pair of AWS access keys and provide access to the S3 path where the data feed will be stored.
Data is shared according to the following rules:
- Data is added to the S3 bucket daily, with .gzip compression
- Files are broken into subparts in order to optimize performance
- Files are delimited by tab
- Each file contains a set of data points, which can be delayed by 1 to 7 days
- Each file is automatically removed 30 days after its creation
- Files can be downloaded more than once within the 30-day window
Cuebiq reserves the right to monitor any activity occurring on Cuebiq-owned S3 buckets. Users will be notified of misuse (e.g. downloading the same file multiple times, trying to access a path different from the one provided, or performing actions unrelated to data download). In severe cases, misuse will be sanctioned by deactivating the user's account.
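As an illustration of the delivery rules above, the sketch below reads every gzip-compressed, tab-delimited part file from a local folder that has already been downloaded. The UTF-8 encoding and the `part-00000.gz` file name used in the demo are assumptions; the actual file names in a feed may differ.

```python
import csv
import gzip
import tempfile
from pathlib import Path
from typing import Iterator, List


def iter_rows(folder: Path) -> Iterator[List[str]]:
    """Yield rows from every gzip-compressed part file in a local folder."""
    for part in sorted(folder.iterdir()):
        if not part.is_file():
            continue
        # Text mode with newline="" lets the csv module handle line endings.
        with gzip.open(part, "rt", encoding="utf-8", newline="") as handle:
            # QUOTE_NONE keeps any quote characters in the data as literal text.
            reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                yield row


# Demo with a temporary file standing in for a downloaded part file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "part-00000.gz"  # hypothetical file name
    with gzip.open(sample, "wt", encoding="utf-8") as out:
        out.write(
            "2019-02-15T17:31:49-05:00\t8B0EC37F-DD8E-407B-AB0C-2650D45B5BE3\t"
            "34.55\t4ad6c0e1-e16a-47ed-82c6-3b63d76f0144\tSam's Club\t"
            "SAM_CLUB_ANALYTICS\n"
        )
    rows = list(iter_rows(Path(tmp)))

print(len(rows), rows[0][4])
```

Because files are removed 30 days after creation, keeping a local copy and reading from it avoids downloading the same file repeatedly.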
Data Partition Details
The folder names in Amazon s3 will represent the day on which the data was processed, for example:
2019020100 - meaning the data was processed on February 1, 2019
2019020200 - meaning the data was processed on February 2, 2019
2019020300 - meaning the data was processed on February 3, 2019
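The folder names above can be converted to dates with a small helper. This is a sketch that interprets only the first eight digits (YYYYMMDD); the meaning of the trailing "00" is not stated here, so the helper simply ignores it. The function name `folder_to_date` is illustrative.

```python
from datetime import date, datetime


def folder_to_date(folder_name: str) -> date:
    """Return the processing date encoded in a folder name such as '2019020100'."""
    if len(folder_name) != 10 or not folder_name.isdigit():
        raise ValueError(f"unexpected folder name: {folder_name!r}")
    # Only the first eight digits (YYYYMMDD) are interpreted.
    return datetime.strptime(folder_name[:8], "%Y%m%d").date()


print(folder_to_date("2019020100"))  # 2019-02-01
```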
Cuebiq processes data at 12:00 AM UTC. There can also be some lag in the data, because it is first batched on the device before being sent to Cuebiq's servers. As a result, the timestamps in each dated folder may not fall only on that day and can include some data from other dates.
As a rule of thumb, more than 90% of the data for a given day can be found in the folder for that day and the folder for the following day. For example, to find the data for January 1, 2019, the best practice is to look at the folders named:
2019010100 - meaning the data was processed on January 1, 2019
2019010200 - meaning the data was processed on January 2, 2019
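The rule of thumb can be expressed as a helper that lists the folders to read for a given day. This is a sketch: the function name and the `extra_days` parameter are illustrative, and the trailing "00" is copied from the folder names shown above.

```python
from datetime import date, timedelta
from typing import List


def folders_for_day(day: date, extra_days: int = 1) -> List[str]:
    """Folder names to read for `day`: that day plus the following day(s)."""
    return [
        (day + timedelta(days=offset)).strftime("%Y%m%d") + "00"
        for offset in range(extra_days + 1)
    ]


print(folders_for_day(date(2019, 1, 1)))  # ['2019010100', '2019010200']
```

Raising `extra_days` reads more folders and should capture more late-arriving rows, at the cost of processing more files.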
Confirming Data Access to Cuebiq’s s3
To confirm that access to Cuebiq's S3 bucket has been established correctly, a quick check can be run using the AWS CLI.
Installation documentation for the AWS CLI can be found here - https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html
Once installed, open a command line and run the command below to confirm that the AWS CLI was installed correctly:
aws --version
Once the AWS CLI is confirmed to be installed, configure a profile with the command below:
aws configure --profile cuebiq_data
You will then be prompted to enter the below fields:
AWS Access Key ID [None]: xxxxx
AWS Secret Access Key [None]: xxxxx
Default region name [None]:
Default output format [None]:
Fill in the access key and secret key with the credentials provided by your Cuebiq representative. Default region name and Default output format can remain blank.
Once a profile has been configured, use the command below to test that access to Cuebiq's S3 bucket works as expected. The --profile option selects the profile configured in the previous step:
aws s3 ls s3://<cuebiq_path_provided_goes_here>/ --profile cuebiq_data
The S3 path must match exactly what Cuebiq has provided. If everything is configured correctly, the folders within Cuebiq's S3 bucket will be listed.
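Once access is confirmed, one dated folder can be copied locally by calling the AWS CLI from a script. The sketch below builds an `aws s3 sync` command for the `cuebiq_data` profile; it assumes the dated folders sit directly under the provided path, and the function names and local destination are illustrative. The path placeholder is left as-is and must be replaced with the path provided by Cuebiq.

```python
import subprocess
from typing import List


def build_sync_command(s3_path: str, folder: str, dest: str,
                       profile: str = "cuebiq_data") -> List[str]:
    """Build an `aws s3 sync` command that copies one dated folder locally."""
    source = f"{s3_path.rstrip('/')}/{folder}/"
    return ["aws", "s3", "sync", source, dest, "--profile", profile]


def download_folder(s3_path: str, folder: str, dest: str) -> None:
    """Run the sync; raises CalledProcessError if the AWS CLI reports a failure."""
    subprocess.run(build_sync_command(s3_path, folder, dest), check=True)


# Placeholder path: substitute the S3 path provided by Cuebiq.
cmd = build_sync_command(
    "s3://<cuebiq_path_provided_goes_here>", "2019020100", "./vi_data/2019020100"
)
print(" ".join(cmd))
```

`aws s3 sync` skips files that already exist unchanged in the destination, which helps avoid downloading the same file more than necessary.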
FAQs
Q: At what cadence is data uploaded in a feed?
A: Data is uploaded to Amazon S3 daily, early in the morning Eastern Standard Time (EST). Each morning, the previous day's data is added. For example, on the morning of January 2, 2019, the data from January 1, 2019 would be added to Amazon S3.
Q: Why do I see rows in the data with timestamps that are on different days from the date listed on the folder in Amazon s3?
A: Cuebiq processes data at 12:00 AM UTC. There can also be some lag in the data, because it is first batched on the device before being sent to Cuebiq's servers. As a result, the timestamps in each dated folder may not fall only on that day and can include some data from other dates.
As a rule of thumb, more than 90% of the data for a given day can be found in the folder for that day and the folder for the following day. For example, to find the data for January 1, 2019, the best practice is to look at the folders named:
2019010100 - meaning the data was processed on January 1, 2019
2019010200 - meaning the data was processed on January 2, 2019
Q: Why do I see some rows in the data with timestamps located in the far future or far past?
A: Cuebiq collects date/time as it’s stored on the user’s device, without additional modification. So if a user has set a date/time in the far future/past, Cuebiq would represent it in the data as such.
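Rows with implausible device clocks can be flagged before analysis. The sketch below is one possible check, not a Cuebiq recommendation: it treats a row as plausible if its local date falls within a window around the folder's processing date. The 7-day look-back mirrors the delay mentioned in the delivery rules, and the 1-day look-ahead (to allow for time zones ahead of UTC) is an assumption; both thresholds and the function name are illustrative.

```python
from datetime import date, datetime, timedelta


def is_plausible(timestamp: str, processed_on: date, max_lag_days: int = 7) -> bool:
    """True if the visit's local date lies within the expected window.

    The window runs from `max_lag_days` before the processing date to one day
    after it. Both bounds are illustrative thresholds.
    """
    visit_day = datetime.fromisoformat(timestamp).date()
    earliest = processed_on - timedelta(days=max_lag_days)
    latest = processed_on + timedelta(days=1)
    return earliest <= visit_day <= latest


processed = date(2019, 2, 16)
print(is_plausible("2019-02-15T11:08:14-07:00", processed))  # True
print(is_plausible("2031-01-01T00:00:00+00:00", processed))  # False
```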
Q: What unit of time is dwell time expressed in?
A: Dwell time is expressed in minutes, with up to two decimal places.
Q: What format is the timestamp in?
A: Timestamps in Visit Intelligence are stored as ISO-8601. Times are expressed in local time, together with a time zone offset in hours and minutes. A time zone offset of "+hh:mm" indicates that the date/time uses a local time zone which is "hh" hours and "mm" minutes ahead of UTC. A time zone offset of "-hh:mm" indicates that the date/time uses a local time zone which is "hh" hours and "mm" minutes behind UTC.
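As a short illustration of the offset notation, the snippet below parses one timestamp from the example data with Python's standard library and converts it to UTC. A local time of 11:08:14 with an offset of "-07:00" is seven hours behind UTC, so the UTC time is 18:08:14.

```python
from datetime import datetime, timezone

# Local time with its UTC offset, as delivered in the feed.
local = datetime.fromisoformat("2019-02-15T11:08:14-07:00")

# The same instant expressed in UTC.
utc = local.astimezone(timezone.utc)

print(utc.isoformat())  # 2019-02-15T18:08:14+00:00
```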
Q: What is the Cuebiq PlaceID?
A: Cuebiq creates an internal identifier for each POI in the Cuebiq POI database. This identifier can be used to distinguish between visits to specific POIs within a brand.
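For example, visits to a single brand can be broken down by individual location using the place ID. The sketch below assumes the place ID is the fourth column and the place name the fifth, as in the example rows; the function name and the sample device and place IDs are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence


def visits_per_place(rows: Iterable[Sequence[str]], place_name: str) -> Counter:
    """Count visits per place ID for one brand.

    Assumes place ID is the 4th column and place name the 5th.
    """
    return Counter(row[3] for row in rows if row[4] == place_name)


# Hypothetical rows: device IDs and place IDs are placeholders.
rows = [
    ["2019-02-15T11:08:14-07:00", "device-1", "17.13", "place-a", "BEST_BUY", "BEST_BUY_ANALYTICS"],
    ["2019-02-15T12:30:00-07:00", "device-2", "8.50", "place-b", "BEST_BUY", "BEST_BUY_ANALYTICS"],
    ["2019-02-15T13:45:10-07:00", "device-3", "22.00", "place-a", "BEST_BUY", "BEST_BUY_ANALYTICS"],
    ["2019-02-15T17:31:49-05:00", "device-4", "34.55", "place-c", "Sam's Club", "SAM_CLUB_ANALYTICS"],
]

counts = visits_per_place(rows, "BEST_BUY")
print(counts.most_common())  # [('place-a', 2), ('place-b', 1)]
```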
Q: How does Cuebiq source its placeID info?
A: Cuebiq utilizes a proprietary database of POI info, containing data on most major brands that can be found in the United States.
Q: What kind of geographic information can Cuebiq provide on the POIs in Visit Intelligence?
A: Cuebiq can provide a Places mapping file to be used in conjunction with the VI feed. This mapping file will allow a customer to map each POI in their VI to a state/zipcode.
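A minimal sketch of using such a mapping file is shown below. The layout of the Places mapping file is not described here, so the three-column, tab-delimited layout (place ID, state, zip code, no header row), the function name, and the sample values are all hypothetical and should be adjusted to the file actually delivered.

```python
import csv
import io
from typing import Dict, TextIO, Tuple


def load_place_mapping(handle: TextIO) -> Dict[str, Tuple[str, str]]:
    """Map place ID -> (state, zip code).

    Hypothetical layout: place_id <TAB> state <TAB> zipcode, no header row.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row[0]: (row[1], row[2]) for row in reader}


# Placeholder mapping content standing in for the delivered file.
mapping = load_place_mapping(io.StringIO("place-a\tOH\t45202\nplace-b\tNY\t10001\n"))

# Look up the location of a visit by its place ID; unknown IDs yield empty values.
state, zipcode = mapping.get("place-a", ("", ""))
print(state, zipcode)
```

Keeping zip codes as strings preserves any leading zeros.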
Q: If I have a list of custom POIs, not specific to any brand, can Cuebiq still provide a feed of Visit Intelligence?
A: Yes, Cuebiq can incorporate these custom POIs into its database so that they can be used in Visit Intelligence.
Q: Can additional columns such as DMA, state, or zip code be added to Visit Intelligence?
A: Cuebiq's place_name column in Visit Intelligence can be customized to any value. Therefore, place_name can carry a value such as the DMA, state, or zip code of the POI.
Q: What time is a feed in production uploaded each day?
A: Upload times can vary by day, but uploads usually occur in the early morning (EST), and data will always be uploaded within 24 hours.
Q: How many files are uploaded per day? What is the size per file?
A: File count and file size can vary per feed, and can also vary per day for each feed. However in general, customers can expect a few hundred part files to be uploaded per day, with no individual file greater than 250MB compressed.
Q: I’m not able to access data stored in Cuebiq’s s3. What’s causing this?
A: Please confirm that the access keys delivered for Cuebiq's feed are configured correctly. If issues persist, please contact your Cuebiq representative.
Q: Can the Visit Intelligence feed be delivered to a storage location outside of Amazon s3?
A: At the moment, Cuebiq primarily supports Amazon S3 as the method of creating and storing feed data. Using Amazon S3 ensures the fastest turnaround time for feed setup and is the most technically feasible option. If an alternative solution is required, please consult your Cuebiq representative.