Real-Time News From Build

22 May, 2024 |
Brian

Brian is the person behind the dcode.bi site. He is keen on helping others be better at what they do around data and intelligence.

Microsoft Build has started, and the keynotes covering the main releases have been broadcast. And what a lineup of new things from the Real-Time world.

Here are the consolidated highlights from the news around the Real-Time experience in Fabric.

  1. New name for the Real-Time experience
  2. Real-Time Hub
  3. Real-Time Dashboards
  4. Copilot in KQL Querysets

New name for the Real-Time experience

First of all, the Real-Time Analytics group of services is now called Real-Time Intelligence. This makes sense, as we see even more services and possibilities to work with time series data in every aspect of a data and business intelligence project.

Real Time Intelligence

The name also implies that Real-Time Intelligence might get more services in the future (this is my own guess). We see a lot of services from the other groups in Fabric - more self-service, more ease of use, more overview and so on. We might see new aspects of RTI (the new short form for Real-Time Intelligence) in the future.

Real-Time Hub announced

The Real-Time Hub, found as a main menu item to the left in Fabric, is the new high-level overview of what is going on in the Fabric tenant around streaming data, event-driven processes and loading of data in Real-Time Intelligence.

Real Time Hub

This hub gives you an overview in 3 main elements:

  1. Data Streams
  2. Microsoft Sources
  3. Fabric Events

Data Streams

This menu item under the Real-Time Hub shows you all the streaming items you have access to work with. It will show you both the tables containing the data and the streams feeding those tables.

There are several options to filter the view:

Real Time Hub filter

  • Stream Type - here you can filter on tables or streams
  • Owner - filter on who has created the item
  • Item - filter on the name of the items found in the workspaces
  • Location - filter on the name of the workspace where the items are stored

When clicking one of the items on the list, I’m guided to the information page of that specific item.

Here I can see specific information for that item, like Owner, Location, Type, Parent item, Upstream item, Downstream item and Stream profile. The last one, found to the far right in the screenshot below, gives an indication of the inlet and outlet of data for that specific item.

Real Time Hub Item

You will also find a button to preview the data in the selected item. Clicking this button will guide you to an overview of the inlet/outlet for that item and, below it, a table with a subset of the data from that item.

Real Time Hub example data

Microsoft sources

This part of the Real-Time Hub shows you the possible Microsoft sources you could configure to work as a streaming source for your data.

You will be presented with a list of elements from the workspaces you have access to. These elements are all candidates to configure as a streaming dataset into Fabric. These sources can also be outside of Fabric, for example items you have access to in a resource group in Azure or other places.

Below is an example of such a list (blurred on purpose):

Real Time Hub - Microsoft sources

The list contains information about the item type, the Source type, Subscription, Resource Group and Region. The last 3 columns show the information found in Azure.

From this list it is now possible to configure the stream directly. Just hover over one of the items and select the ->] icon.

From here you can put in the needed information - below is an example of a SQL Server CDC stream being configured. In this way, creating new streams has become much easier.

Real Time hub - new stream

Fabric events

This might be one of my favorite new things in this release - the Fabric Events.

The new feature enables users to create activities that fire based on events happening in Fabric or, for now, in an Azure Blob storage container.

Previously in Synapse Analytics, we could trigger a pipeline to run when a file was saved in a blob storage - and we’ve been wishing for this to be available in Fabric. Well, now we can!

Real Time hub - event driven activities

The Azure Blob Storage Events cover the above-mentioned “missing link” from Synapse Analytics - plus a bit more.

With triggers of Created, Deleted, Renamed, and a whole lot more, we now have the ability to start a new process based on things happening in the configured blob storage.

The entire list for now is:

  • Blob Created
  • Blob Deleted
  • Blob Renamed
  • Blob Tier Changed
  • Directory Created
  • Directory Renamed
  • Directory Deleted
  • Async Operation Initiated
  • Blob Inventory Policy Completed
  • Lifecycle Policy Completed

For all event types, you will find the available information fields - like Source, Subject, Time, Type etc. The example below shows the information fields from the Blob Created event:

The Storage Diagnostics field gives you a “batchId”

Blob Created information fields
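To make the information fields a bit more concrete, here is a minimal Python sketch of how a downstream process might read a Blob Created event. Everything in the sample payload is made up for illustration, and the field layout (source, subject, type, time, and a data section holding the storage diagnostics) is my assumption based on the CloudEvents shape that Azure Storage events normally use. The function name describe_blob_event is hypothetical.

```python
import json

# Illustrative payload only: all values are made up, and the layout is assumed
# to follow the CloudEvents shape normally used by Azure Storage events.
SAMPLE_EVENT = """
{
  "source": "/subscriptions/sub-id/resourceGroups/rg/providers/Microsoft.Storage/storageAccounts/demo",
  "subject": "/blobServices/default/containers/landing/blobs/sales/2024/05/orders.csv",
  "type": "Microsoft.Storage.BlobCreated",
  "time": "2024-05-22T10:15:00Z",
  "data": {"storageDiagnostics": {"batchId": "hypothetical-batch-id"}}
}
"""


def describe_blob_event(raw: str) -> dict:
    """Pull the fields a trigger typically cares about out of one event."""
    event = json.loads(raw)
    # The subject is assumed to look like:
    # /blobServices/default/containers/<container>/blobs/<path>
    _, _, tail = event["subject"].partition("/containers/")
    container, _, blob_path = tail.partition("/blobs/")
    diagnostics = event.get("data", {}).get("storageDiagnostics", {})
    return {
        "event_type": event["type"],
        "container": container,
        "blob_path": blob_path,
        "time": event["time"],
        "batch_id": diagnostics.get("batchId"),
    }


if __name__ == "__main__":
    print(describe_blob_event(SAMPLE_EVENT))
```

Splitting the subject into container and blob path is usually the first thing you need, for instance to only react to files landing in one specific folder.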

The Fabric Events cover events happening in Fabric, and this is completely new. We can now start processes based on events happening directly in Fabric.

The complete list for now is:

  • Item Create Succeeded
  • Item Create Failed
  • Item Update Succeeded
  • Item Update Failed
  • Item Delete Succeeded
  • Item Delete Failed
  • Item Read Succeeded
  • Item Read Failed

An item in Fabric is an element created inside a workspace. It is not the workspace itself, but the items inside the workspace. So if you create a new workspace, the trigger is not (for now) available to start a new process. But whenever an item is created, deleted, read or updated, you can now trigger a new process.

This process could be to check if the item corresponds to the agreed naming convention or other stuff - if you can think it, you can do it now.
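As a rough idea of what such a naming convention check could look like, here is a small Python sketch. The convention itself (a type prefix like lh_ or nb_ followed by lowercase words separated by underscores) is purely hypothetical, and so are the names NAME_RULES and follows_convention. How the item type and item name reach the function depends on how you wire up the event, which is not shown here.

```python
import re

# Hypothetical convention: <type prefix>_<lowercase words joined by underscores>.
# Replace the prefixes and patterns with whatever your team has agreed on.
NAME_RULES = {
    "Lakehouse": re.compile(r"^lh_[a-z0-9]+(_[a-z0-9]+)*$"),
    "Notebook": re.compile(r"^nb_[a-z0-9]+(_[a-z0-9]+)*$"),
}


def follows_convention(item_type: str, item_name: str) -> bool:
    """Return True when the name matches the rule for its item type."""
    rule = NAME_RULES.get(item_type)
    if rule is None:
        # No rule defined for this item type, so nothing to complain about.
        return True
    return rule.fullmatch(item_name) is not None


if __name__ == "__main__":
    for item_type, name in [("Lakehouse", "lh_sales_raw"), ("Notebook", "My Notebook")]:
        status = "ok" if follows_convention(item_type, name) else "violates convention"
        print(f"{item_type} '{name}': {status}")
```

A process started by an Item Create Succeeded event could run a check like this and notify the owner when the name does not match.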

Imagine the possibilities now given with this new feature in Fabric. We can now do even more event-based triggers than ever before. When an event happens, you can trigger another event.

I think we should also take care not to create infinite loops. For instance, if an item is created and that event triggers the run of a notebook which creates another item - then we have a problem.
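One simple way to guard against this kind of loop is to mark everything the automation creates and skip those items when the next event arrives. The Python sketch below only simulates the idea with a queue of item names. The auto_ prefix and the function names are hypothetical, and nothing here calls a Fabric API.

```python
from collections import deque

# Hypothetical marker: every item created by the automation gets this prefix.
AUTOMATION_PREFIX = "auto_"


def should_react(item_name: str) -> bool:
    """Only react to items that were not created by the automation itself."""
    return not item_name.startswith(AUTOMATION_PREFIX)


def simulate(initial_items: list[str], max_events: int = 1000) -> list[str]:
    """Replay 'item created' events and return every item name that was seen."""
    queue = deque(initial_items)
    seen = []
    while queue and len(seen) < max_events:
        name = queue.popleft()
        seen.append(name)
        if should_react(name):
            # The triggered process creates a follow-up item, which would
            # raise a new 'item created' event of its own.
            queue.append(f"{AUTOMATION_PREFIX}{name}_audit")
    return seen


if __name__ == "__main__":
    print(simulate(["lh_sales"]))
```

Without the should_react check, every created item would create another one and the simulation would only stop at max_events. With it, the chain ends after one follow-up item.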

Real-Time Dashboards

With this release, Microsoft introduces its second visualisation tool alongside the big Power BI service.

The Real-Time Dashboards are built for the use case where you need instant information based on a data stream and can’t wait 30 seconds for Power BI to reload.

The engine behind the dashboard is a KQL Database, on which you create KQL statements to build the dataset for each individual visualisation.

To configure your dashboard, you need at least two things:

  1. A Datasource
  2. A Base query

The datasource

This is a guided step - either choose OneLake Data Hub (this includes the built-in KQL Database) or a standalone Azure Data Explorer cluster.

Base Query

This is where the fun begins - create your KQL query which will support the visualisation (a so-called Tile in the dashboard). The thing to remember for this step is to create a base query which only returns the bare minimum of data needed for the visual - the more data it returns, the more compute you will end up paying for. Yes, each query is run every time the dashboard updates - so be careful when designing your queries.
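To get a feeling for how quickly this adds up, here is a small back-of-the-envelope calculation in Python. It assumes that every tile runs one query on every refresh and that the dashboard stays open for the whole period, which is a simplification of my own. The function name and the numbers are only for illustration.

```python
def query_runs_per_day(tiles: int, refresh_seconds: int, open_hours: float = 24.0) -> int:
    """Estimate how many queries one open dashboard fires in a day.

    Assumes one query per tile per refresh (a simplification).
    """
    open_seconds = int(open_hours * 3600)
    refreshes = open_seconds // refresh_seconds
    return tiles * refreshes


if __name__ == "__main__":
    # 8 tiles refreshing every 30 seconds, dashboard open around the clock
    print(query_runs_per_day(tiles=8, refresh_seconds=30))
    # Same dashboard, but only open during an 8 hour working day
    print(query_runs_per_day(tiles=8, refresh_seconds=30, open_hours=8))
```

Multiply that by the number of people who keep the dashboard open, and a lean base query starts to matter.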

Real Time Dashboard - base query

Notice at the top - you get help to design the query with the default parameters _startTime and _endTime. You can reference these parameters in your query, as they are part of every dashboard and tile as default filter parameters.

A very simple example, where I filter the pickup datetime with a between statement from _startTime to _endTime and do a simple row count by VendorID, looks like this:

NYCTaxi
| where tpep_pickup_datetime between (_startTime .. _endTime)
| summarize count() by VendorID

Also remember to give your Base Query a Variable name:

Base Query - done

Now you can build the dashboard as you see fit - when the data is added to a tile, you can add any visualisation you need from a dropdown of elements and design the look and feel to your liking.

I will not go down the detailed path of the dashboard - so here is an example (copied from this link, as Data Explorer Dashboards have almost the same build and features) to give you an idea of how things could turn out.

Real Time Dashboard

Copilot in KQL Querysets

We also get Copilot to help us create the awesome KQL Querysets in Real-Time Intelligence.

Real Time Copilot

We can now just ask Copilot to help us write the KQL queries. As always, take care to read and understand the code given, as Copilot is great, but not flawless.

For everyone who wants to try this feature, you need a dedicated Fabric Capacity of at least F64 to run Copilot in KQL Querysets - it does not work on a trial capacity or a Power BI Premium capacity.

To sum up the news

I will be diving into more details on all the new elements in the following days on this blog - so keep an eye out, or just sign up for the newsletter and I will make sure to send you an update on new posts.

The news from this year’s Microsoft Build around Real-Time Intelligence is perhaps the biggest release in the area since Fabric. We can now do what we could in Synapse Analytics and a lot more around event-driven processes, we can get a complete overview of the streams and datasets for time series data, and we can build dashboards which update based on KQL data and querysets.

I hope you enjoyed this post - if you have any comments, please leave them in the chat below. I would love to hear from you.
