Azure

Dedicated SQL Pools Vs Serverless SQL Pools In Azure Synapse Analytics

In Azure Synapse Analytics you will be frequently crossing over a term called SQL pools. It is good to know the difference and the working functionalities of both of them.

No requirement will be similar to the one before and the end users may need different types of usage for each project. Microsoft has kept that in mind when designing Azure Synapse analytics and made sure end users can use different approach to manage their compute availability for different scenarios. Compute and storage are the basic building blocks but in Synapse analytics both are maintained separately which enables us to scale compute independently than the data stored in the system.

Serverless SQL Pool

A Serverless SQL Pool is an auto scale compute environment in which we can utilize the TSQL capabilities to query the ADLS directly. It follows a simple model wherein the Serverless SQL Pool acts as compute engine and ADLS servers as a storage. All the Synapse analytics workspace that we create will be having a Serverless SQL Pool endpoint as a default configuration. Serverless SQL Pools are useful in querying data from ADLS (including parquet, CSV formats). It’s very popular than a dedicated SQL pool for the fact that it has greater control in billing that will help the customer in approaching the budget and forecasting. It imposes the pay per query model, and it helps the customers for adhoc querying. Since Serverless SQL Pools don’t own local storage all the queries are routed to external endpoints and the reading from storage might affect query performance.

Dedicated SQL Pool

Dedicated SQL Pool is a powerful MPP (Massively Parallel Processing) distributed query engine that is apt for big data and data warehousing workloads. The performance is based on DWU (Data Warehouse Units) that we select when we create the resource. Minimum DWU is 100 and there will be one compute node for that. We can select to double our DWU to 200 which will give us two compute nodes and double the performance from single node. The best part is we have the option to pause our Dedicated SQL Pool resources to suspend costs and resume it when we needed. For example, we can process a batch of data and then pause it once it is completed. Later when we need the data only for reading, we can simply scale down to 100 DWUs instead of pausing completely. Please note that the storage costs apply even when we pause the Dedicated SQL Pool.

References

Microsoft official documentation

Tags
Show More

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Close