UPDATE (04/13): Video recording from VDD 2022 on how to build Azure Data Factory (ADF) pipelines for reading Sitecore data
UPDATE (05/17): Sample ADF Function Code: https://github.com/brodbor/SitecoreVDD2022ADF
Sitecore xConnect analytics data is one of the most important assets an organization has. Applying a data-driven learning approach to historical data lets us uncover patterns that would otherwise stay hidden and helps drive future marketing solutions. As the discussion around data privacy and cookie tracking becomes more prevalent, it is critical to rely on tools that enable analytics at scale, with higher accuracy and controllable user privacy. Sitecore xConnect is well suited to tracking, storing, personalizing, and reporting on analytics data.
Sitecore Tracker can track user engagement levels, and combined with the ability to track contacts’ online and offline interactions on the client or server side, it gives us a recipe for building a comprehensive analytics system. The decoupled architecture of xConnect allows us to build our solution on the latest cloud technology with a low-code approach.
We will look at how to prepare and normalize your xDB analytics data in the cloud and use it for deep analysis with ML algorithms, or simply as a source for Power BI visualization.
Connecting Azure Data Factory to Sitecore xConnect
The process starts by installing the Azure Data Factory integration runtime on your Sitecore xConnect instance. The design consists of the ETL, storage, and visualization platforms. The main reason for selecting Azure was the SaaS Data Factory solution, which allowed me to offload xConnect security and move the xConnect OData stream into the cloud. In my example, I’ve used SQL Database as the storage system, but I would expect to see Hadoop or Spark clusters in a more complex implementation. The visualization of the data, in my example, was done through Power BI.
Example of a connection string in Azure Data Factory that pulls OData content from the previous day:
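As a rough illustration of the idea, and not the exact string used in the pipeline, the Python sketch below builds an OData query URL that selects records from the previous UTC day. The host name, the `Interactions` entity set, and the `StartDateTime` property are assumptions made for the example; check them against your own xConnect OData metadata. In ADF itself, the same date window would normally be produced with pipeline expressions rather than Python.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote


def previous_day_odata_url(base_url, now=None):
    """Build an OData URL filtered to the previous UTC day.

    Assumptions (not taken from the article): the entity set is called
    'Interactions' and exposes a 'StartDateTime' property.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Midnight (UTC) at the start of the current day is the upper bound.
    today = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    yesterday = today - timedelta(days=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    odata_filter = (
        f"StartDateTime ge {yesterday.strftime(fmt)} "
        f"and StartDateTime lt {today.strftime(fmt)}"
    )
    # Percent-encode spaces and colons so the filter is safe in a URL.
    return f"{base_url.rstrip('/')}/Interactions?$filter={quote(odata_filter)}"


if __name__ == "__main__":
    # Hypothetical host name, used only for illustration.
    print(previous_day_odata_url("https://xconnect.example.local/odata"))
```

Using a half-open window (greater than or equal to yesterday's midnight, less than today's midnight) means that consecutive daily runs neither overlap nor skip records.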
Normalizing Sitecore Analytics Data
Normalizing the xConnect OData stream could have been the most complex part of this implementation, but thanks to Sitecore’s out-of-the-box OData API endpoint and the Azure Data Factory SaaS platform, this task came down to a small amount of configuration in the Azure Data Factory interface. Data Factory can offload the TLS-secured stream and consume the private xConnect API endpoint in just a few clicks.
We take the denormalized JSON input shown below and convert it to the normalized structure shown on the right side.
The normalization process does not require code development and can be done using the ADF web interface alone. We are able to transform data for interaction events and use ADF to normalize each event type to its own relational data store.
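To make the transformation concrete, here is a minimal Python sketch of the same idea outside ADF: nested interaction documents are split into one flat list of interaction rows plus one list of rows per event type, each keyed back to its interaction. The field names (`Id`, `ContactId`, `StartDateTime`, `Events`, `Type`) and the sample values are hypothetical and only stand in for whatever your xConnect OData feed actually returns; the pipeline described here does this through the ADF interface, not through code.

```python
from collections import defaultdict

# Hypothetical denormalized input: one interaction with nested events.
SAMPLE = [
    {
        "Id": "i-1",
        "ContactId": "c-1",
        "StartDateTime": "2022-04-12T09:15:00Z",
        "Events": [
            {"Type": "PageViewEvent", "Url": "/products", "Duration": 42},
            {"Type": "PageViewEvent", "Url": "/contact", "Duration": 17},
            {"Type": "Goal", "Name": "Newsletter signup"},
        ],
    }
]


def normalize_interactions(interactions):
    """Split nested interactions into flat rows, one table per event type."""
    interaction_rows = []
    event_tables = defaultdict(list)
    for interaction in interactions:
        # Parent row: scalar fields only, no nested collections.
        interaction_rows.append(
            {
                "InteractionId": interaction["Id"],
                "ContactId": interaction["ContactId"],
                "StartDateTime": interaction["StartDateTime"],
            }
        )
        for event in interaction.get("Events", []):
            # Child row: carries the parent key so the tables can be joined.
            row = {"InteractionId": interaction["Id"]}
            row.update({k: v for k, v in event.items() if k != "Type"})
            event_tables[event["Type"]].append(row)
    return interaction_rows, dict(event_tables)


if __name__ == "__main__":
    rows, tables = normalize_interactions(SAMPLE)
    print(rows)
    for event_type, event_rows in tables.items():
        print(event_type, event_rows)
```

Keeping `InteractionId` on every event row is what lets each event-type table be joined back to its parent interaction later, for example in a Power BI data model.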
Visualizing xConnect Data
Now that we have our data in normalized form in the cloud, we can use Power BI to create reports and share them with our stakeholders. We could also merge different data sources to create a 360-degree view of contact interactions with our brand.