5-SECOND SUMMARY:
- AI has picked up pace dramatically and is increasingly becoming an indispensable part of everyday life and work.
- The most prominent example is OpenAI’s now-famous language model, ChatGPT.
- As data analytics moves into the era of AI, Microsoft announced a new product in May 2023 called Microsoft Fabric, which immediately caught the attention of the data and analytics community. Discover what Microsoft Fabric is and how it can impact your company.
What is Microsoft Fabric?
In essence, Microsoft Fabric is a suite of integrated analytics tools and services that work together to provide a unified, end-to-end analytics experience. It is a SaaS (Software as a Service) product, which means users carry minimal administrative responsibility: they don’t have to worry much about provisioning resources, the underlying infrastructure, and so on. Instead, they can get real business value out of their organizational data within minutes.
The great advantage and novelty that Fabric brings is the concept of capacity. Until now, combining analytics tools and services from multiple vendors would very often lead to suboptimal resource utilization, which in turn meant wasted resources and, in the end, unnecessary costs for users. With that in mind, introducing capacities as a single pool of compute that powers all the services Fabric offers has the potential to significantly increase resource utilization and reduce costs.
On top of that, Fabric introduces the capacity concepts of bursting and smoothing. Without going into too much detail, bursting lets a workload temporarily use more compute than the purchased capacity provides so that the job finishes faster, while smoothing spreads the resulting consumption over a longer time window.
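To make the idea of bursting and smoothing more tangible, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the capacity size, the burst rate and the 600-second window are all hypothetical, and the even averaging used here is a simplification rather than Fabric’s actual accounting policy.

```python
def smoothed_rate(total_cu_seconds: float, window_seconds: float) -> float:
    """Average compute rate (in CUs) when usage is spread evenly over a window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return total_cu_seconds / window_seconds


def utilization(rate_cu: float, capacity_cu: float) -> float:
    """Fraction of the purchased capacity that a given compute rate consumes."""
    if capacity_cu <= 0:
        raise ValueError("capacity_cu must be positive")
    return rate_cu / capacity_cu


# Hypothetical numbers, chosen only for illustration.
CAPACITY_CU = 64        # purchased capacity, in capacity units (CU)
BURST_RATE_CU = 256     # rate the job actually runs at while bursting
JOB_SECONDS = 60        # how long the burst lasts
WINDOW_SECONDS = 600    # assumed smoothing window, not Fabric's real policy

# Total compute the job consumed, in CU-seconds.
job_cu_seconds = BURST_RATE_CU * JOB_SECONDS

# Utilization at the moment of the burst versus after smoothing.
peak = utilization(BURST_RATE_CU, CAPACITY_CU)
smoothed = utilization(smoothed_rate(job_cu_seconds, WINDOW_SECONDS), CAPACITY_CU)

print(f"peak utilization while bursting: {peak:.0%}")
print(f"utilization after smoothing:     {smoothed:.0%}")
```

With these made-up numbers, the job briefly runs at four times the purchased capacity, yet once its consumption is averaged over the assumed window it accounts for only 40 percent of that capacity.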
Fabric Components
As mentioned earlier, Fabric is a complete analytics experience, which means it covers all the analytics requirements a business may have. The main experiences that make up the Fabric ecosystem are the following:
• Synapse Data Warehouse
• Synapse Data Engineering
• Synapse Data Science
• Synapse Real-Time Analytics
• Data Factory
• Power BI
• Data Activator
All of them are already well-established services, except Data Activator. It is a no-code experience that detects and monitors patterns in your organizational data in real time, so that it can trigger corresponding actions when specified data patterns are identified.
At the core of Microsoft Fabric lies OneLake, a unified logical data lake that stores all organizational data. It is built on Azure Data Lake Storage Gen2 and is the data counterpart of Microsoft 365’s OneDrive.
OneLake brings a number of advantages to the table. All data stored within OneLake is easily accessible by all the analytics engines that power the listed experiences.
OneLake is also a multi-cloud data lake that can incorporate data from different cloud providers, such as Amazon or Google. All that makes OneLake a truly powerful novelty brought to us by Microsoft.
Framework
Our framework was built specifically to accelerate your projects: it reduces the time and effort of building a solution on Fabric from scratch and establishes a data foundation that follows best practices right from the start. With this framework, you get a ready-designed solution that gathers all your information, then cleans and organizes it so that your reports have full data lineage and data quality is assured. At a high level, the entire process is divided into several phases:
Ingestion phase
Data is extracted from a variety of data sources and loaded into the Fabric Data Warehouse according to the specified load logic: full load or incremental load. Newly loaded data can then take advantage of the optimized infrastructure of Microsoft Fabric.
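The difference between the two load logics can be sketched in a few lines of plain Python. This is a simplified illustration, not the framework’s actual implementation: the row shape, the `id` and `modified_at` column names and the watermark approach are assumptions made for the example.

```python
from datetime import datetime


def full_load(source_rows):
    """Full load: the target is replaced with a complete copy of the source."""
    return list(source_rows)


def incremental_load(source_rows, target_rows, watermark,
                     key="id", modified_col="modified_at"):
    """Incremental load: upsert only rows modified after the last watermark.

    Returns the merged target rows and the new watermark to store for next run.
    """
    changed = [r for r in source_rows if r[modified_col] > watermark]
    merged = {r[key]: r for r in target_rows}
    for row in changed:
        merged[row[key]] = row  # insert new keys, overwrite existing ones
    new_watermark = max((r[modified_col] for r in changed), default=watermark)
    return list(merged.values()), new_watermark


# Hypothetical sample data.
source = [
    {"id": 1, "modified_at": datetime(2023, 5, 1), "status": "open"},
    {"id": 2, "modified_at": datetime(2023, 5, 3), "status": "closed"},
    {"id": 3, "modified_at": datetime(2023, 5, 4), "status": "open"},
]
target = [
    {"id": 1, "modified_at": datetime(2023, 5, 1), "status": "open"},
    {"id": 2, "modified_at": datetime(2023, 5, 2), "status": "open"},
]
last_watermark = datetime(2023, 5, 2)

target, last_watermark = incremental_load(source, target, last_watermark)
print(len(target), last_watermark)
```

In this sketch, only rows 2 and 3 are picked up by the incremental run, because row 1 has not changed since the stored watermark. A full load would copy all three rows regardless.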
Processing phase
Raw data ingested in the previous phase is transformed and enriched to meet the business requirements. The result is a set of dimension and fact tables that make up the data model, which can later be used for reporting in Power BI to support better business decisions.
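As a rough illustration of what this transformation step produces, the following Python sketch splits flat raw records into one dimension table and one fact table. The record fields, table names and surrogate-key scheme are hypothetical and serve only to show the shape of the result.

```python
def build_star_schema(raw_orders):
    """Split flat order records into a customer dimension and a sales fact table."""
    customer_keys = {}   # natural key (customer name) -> surrogate key
    dim_customer = []
    fact_sales = []

    for order in raw_orders:
        name = order["customer"]
        if name not in customer_keys:
            # First time this customer appears: assign the next surrogate key.
            customer_keys[name] = len(customer_keys) + 1
            dim_customer.append({
                "customer_key": customer_keys[name],
                "customer_name": name,
                "country": order["country"],
            })
        # The fact row keeps measures plus a reference to the dimension row.
        fact_sales.append({
            "order_id": order["order_id"],
            "customer_key": customer_keys[name],
            "amount": order["quantity"] * order["unit_price"],
        })

    return dim_customer, fact_sales


# Hypothetical raw data as it might arrive from the ingestion phase.
raw_orders = [
    {"order_id": 101, "customer": "Contoso", "country": "US", "quantity": 2, "unit_price": 50},
    {"order_id": 102, "customer": "Fabrikam", "country": "DE", "quantity": 1, "unit_price": 80},
    {"order_id": 103, "customer": "Contoso", "country": "US", "quantity": 3, "unit_price": 50},
]

dim_customer, fact_sales = build_star_schema(raw_orders)
print(dim_customer)
print(fact_sales)
```

Descriptive attributes end up once per customer in the dimension table, while every order becomes a fact row that carries only the measures and a key pointing to its customer.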
DevOps phase
Version control and continuous integration (CI) can be handled with Azure DevOps, and continuous delivery (CD) with the deployment pipelines of Power BI, following best practices used in modern software development.
Monitoring phase
Monitoring can currently be done through the Monitoring Hub and the Microsoft Fabric Capacity Metrics app, which unifies telemetry from all Fabric workload experiences into a single set of turnkey analytics for monitoring the performance of workloads and their usage compared to purchased capacity. It’s also possible to connect a Fabric workspace to an Azure Log Analytics workspace, which lets you track log metrics and create your own custom visualizations to help follow your workload performance.
Additional advantages and current limitations
It is worth mentioning several other advantages that Fabric brings, as well as some of its current limitations.
One novelty that users received very well is the Direct Lake connection mode in Power BI. It combines the benefits of the existing modes, namely the speed of Import mode and the up-to-date data of DirectQuery mode, without their drawbacks, such as the data duplication that comes with Import mode.
Another valuable feature is the automation of several maintenance tasks that have to be set up manually on other analytics platforms. Frequent work with data tends to generate many new files, and if they are not dealt with, performance issues can quickly follow. Fortunately, users do not need to develop additional mechanisms for this, because Fabric has built-in features and capabilities that handle it.
Furthermore, centralized administration and governance of all organizational data artefacts take the burden of security and compliance worries off users’ shoulders.
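To show the kind of housekeeping that such automation saves you from, here is a small Python sketch that plans how many small files could be merged into larger ones. It is a generic illustration of small-file compaction, not a description of how Fabric does it internally. The function name, thresholds and file sizes are hypothetical.

```python
def plan_compaction(file_sizes_mb, target_mb=128, small_threshold_mb=32):
    """Group small files into batches whose combined size stays within target_mb.

    Files at or above small_threshold_mb are left alone. Batches containing a
    single file are dropped, since rewriting one file on its own gains nothing.
    """
    small_files = sorted(s for s in file_sizes_mb if s < small_threshold_mb)
    batches, current, current_size = [], [], 0

    for size in small_files:
        # Close the current batch when the next file would exceed the target.
        if current and current_size + size > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size

    if current:
        batches.append(current)

    return [batch for batch in batches if len(batch) > 1]


# Hypothetical file sizes (in MB) after many small writes.
sizes = [1, 2, 200, 3, 150, 30, 30, 30, 30, 30]
plan = plan_compaction(sizes)

print(f"{len(sizes)} files before, {len(plan)} merge batch(es) planned")
for batch in plan:
    print(f"merge {len(batch)} files into one file of {sum(batch)} MB")
```

A manually built solution would also need scheduling, retries and cleanup of the old files, which is exactly the effort the built-in capabilities are meant to remove.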
Finally, an advantage that cannot be left out of the list is the AI copilot, which will empower professional developers to build simple to complex dataflows and pipelines using natural language.
On the other hand, there are still certain features that are either lacking or waiting to be included in the Fabric suite of offerings.
Regarding Fabric monitoring, there is a plan to offer a read-only database of detailed diagnostic logs, which will further improve monitoring capabilities and help optimize the utilization of your entire workload.
Not all Fabric items currently support version control and/or deployment through deployment pipelines. Examples include data pipeline items and Data Warehouse items, which are currently not deployable.
Considering how recent Fabric is as a product, it is to be expected that those limitations will be overcome within a short time span.
Another thing worth mentioning for current Synapse users wondering whether they can easily migrate to Fabric: at the moment, there is no simple lift-and-shift migration option available for entire workloads.
Final Thoughts
In conclusion, it is clear that Microsoft Fabric brings a few novel concepts that have the potential to greatly impact the trajectory of the analytics industry. With it, Microsoft looks to solidify its leading position in the enterprise data analytics world. The idea is to fulfill all business analytics requirements in an easier, faster, more efficient and less costly way. Microsoft Fabric certainly looks to be a step in the right direction.
Data Analytics Engineer