[SQL] Why an insert on a certain table is slow?

It is a good topic database question to discuss. There are many reasons why an insert on a certain table takes a little longer than usual. The online article that I found has listed a few reasons. Let me look into it:

  • Existence of INSERT TRIGGERs on the table
  • Lots of enforced constraints that have to be checked (usually foreign keys)
  • Page splits in the clustered index when a row is inserted in the middle of the table
  • Updating all the related non-clustered indexes
  • Blocking from other activities on the table
  • Poor IO write response time

In the same article, some of the replies suggested doing the following:

  • Reduce the batch size from 10000 to something smaller, like 2000 or 1000. In SSIS, reducing the batch size according to the allocated buffer memory size is a great help. It may take a while but at least it lowers the chances of getting batch processing errors that resulting from the failed insertion.
  • Try turning on IO Stats to see just how much IO the FK lookups are taking.
  • Run the query with SET STATISTICS IO ON and SET STATISTICS TIME ON to see the execution time and read/write io for each insert.

References

https://dba.stackexchange.com/questions/482/how-can-i-tell-why-an-insert-on-a-certain-table-is-slow

[SSIS] Implement Timer Delay

Every day I am working and learning new things in SSIS development. Today’s topic is relatively short because it is a simple task to complete.

Background

I was testing my new SSIS solution project for generating CSV data files. There was an update statement I wanted to perform at the end of the SSIS packages execution. However, it kept failing because the SQL Task returned a count that did not match my set condition. The reason was that the package started its PackageStart event, checked the condition using a SQL query, and returned the result while the previous package logged the PackageEnd event.

Therefore, the result turned unmatched with the condition, and the SSIS solution failed. I am glad that I could test the fail scenario because it was a script that I wrote to set the SSIS DTS Task to failure if the condition was unmatched.

How to Resolve?

The instant idea that came into my mind was, is there something we can do to delay the package from calling the SQL query from the SQL Task? I searched online for the syntax to do it. SSIS Toolbox does not come with a timer component.

To achieve that, I used the SQL Task component right before the SQL Task component to run a SQL query for condition checking. The additional SQL Task will run the WAITFOR statement. The DELAY argument specifies the amount of time to pass before proceeding. I set for 20 seconds for testing. For detail about the WAITFOR statement, refer to the reference link below.

It is a pretty cool SQL statement that I learned today.

Reference

  

[SQL] Analyze Execution Plan Graphical Components

SQL Execution Plan is an Actual Plan generated after executing a simple SELECT statement. Before execution, turn on the Actual Execution Plan and press “Execute” or F5 to execute a SELECT statement. When the execution is completed, I will see three tabs; Result, Message, Execution Plan.

Upper Part

The upper part of the SQL execution plan shows the cost of this query relative to the overall batch cost in the case of generating the execution plan for multiple queries in the same session. In some cases, the SQL Server Query Optimizer suggests an index, For example, a missing index, that it finds will enhance the performance of the query in a specific percentage that will be displayed as a recommendation.

How to Read?

To read the SQL Execution Plan correctly, the execution flow starts from the right to the left, and from top to bottom, with the last operator at the left, which is the SELECT operator in most queries. It contains the final result of the query.

Read more: https://www.sqlshack.com/how-to-analyze-sql-execution-plan-graphical-components/

Arrow in Execution Plan

SQL Execution Plan Arrow represents how many rows are processed or run by that operator. The arrow does not represent how much data has been sent from one operator to another operator. The thickness of the arrow indicates the amount of data passed between the operators. Hover the arrow to find out more information.

Arrow Points Out Performance Issue

The thickness of the arrow can also be an indication of a performance issue. For example, if the execution plan shows thick arrows, the number of the rows that are passed through the arrows is large at the beginning of the plan, and the number of rows passed through the last arrow to the SELECT statement and returned by the query is small then a scan operation is performed incorrectly to a table or an index that should be fixed.

I am trying to understand the above line.

What is the percentage representing?

Below each operator, a number as a percentage representing the SQL Server Query Optimizer estimation for the operator’s cost relative to the overall query cost will be displayed.

I have a few things that I do not know how to read it yet.

  • I/O and CPU costs.
  • subtree cost of the operator

[SSIS] AcquireConnection method call failed

This is an SQL Server Integration Services (SSIS) topic. A few months ago, I received an user request to perform a series of data sync from the source into our data warehouse in UAT environment. Once I received the access to the server to perform the task, I opened the SSIS solution from the Visual Studio, and load the packages that I needed to execute.

You may wonder why I have to manually execute the packages instead of using the SQL Agent to trigger the SSISDB that will execute these packages. Yes, I can do so. However, I was told by my colleague that some of the solutions have not been re-deployed to SSISDB after the password updates. These solutions are connected to the Oracle and MSSQL databases.

Therefore, I chose to execute it via Visual Studio.

Problem Faced

Upon execution, the SSIS hit error. The error message contained a line of words, AcquireConnection method call failed. The whole error message was rather confusing. I asked my colleague what happened to the connection after the password updates. Firstly, we checked with the database administrator that the user account is not locked. We did some Google searches and found out that our UAT environment is not 64-bit solutions.

Therefore, the SSIS solution that runs on 64-bit gave problem.

What I Did to Resolve?

I think many people mentioned it in the Internet, set the Run64bitRuntime at the project configuration property to False would help because my UAT environment is running on 32-bit version.

Hope this post can help some of you who are searching for solutions. There are plenty of different solutions out there. If the environment is using 32-bit version, running SSIS packages that connect to Oracle database that runs on 32-bit version too, this might help.

[SSIS] The ‘Microsoft.ACE.OLEDB.15.0’ provider is not registered on the local machine

This is an SQL Server Integration Services (SSIS) topic. I am developing an SSIS package recently, and it requires me to write some of the C# codes for the Script Component that reads the columns from an Excel file and writes into the System.Object. After that, the data from the System.Object will be used to insert into my local database.

It does sound quite straightforward if you think that I read from a proper tab and proper columns in the Excel file. The data will be inserted correctly into the table in my local database based on the column names.

However, I am not familiar with the C# scripting during the development, and I spent quite sometimes to understand how the BufferWrapper.cs and ComponentWrapper.cs files work. I knew that my main code will be residing at the main.cs file. I cover the story of my discovery in another post; you may want to look if you are a beginner or have no experience using Script Component in SSIS.

What is My Problem?

Besides that, I faced another problem on my local machine when I tried to execute the SSIS package. The error returns as “The ‘Microsoft.ACE.OLEDB.15.0’ provider is not registered on the local machine”. It has an issue on running a 32bit or 64bit application. I am going to share about my solution in this post.

At first glance, I thought that I have to check the drive of ‘Microsoft.ACE.OLEDB.15.0’ provider in my local machine. I did not manage to find the information from the Control Panel’s Programs and Features. Next, I went to the OCDC Data Source Adminstrator to check, and found that the excel files are using 64-bit.

What I Did Next?

The first thing that comes into my mind is to set the Debug Option – Run64BitRuntime to True in the property page.

It did not immediately resolve my problem when I executed the package again. It gave the same error. I googled again to find other solutions, including changing the Excel file extension from .xlsx to .xls format and executing it using an older Microsoft Excel version. It seemed not working.

In the test server environment, I am able to execute the package using .xlsx file extension. Hence, I changed the Excel file connection manager to point it to the .xlsx file instead. Then, I saw the dropdown option to select the Excel version. I gave a try to change its version from Microsoft Excel 2013 to Microsoft Excel 2016 (for my local machine setup).

Execute The Package Again

Tada, the execution was successfully done, and I checked the database table to ensure the logic are working, the control flow is correctly executed, and the expected result is met. My solution is basically set the 64-bit runtime to True in the Debugging Option and the Excel version.

There are plenty of other solutions provided to resolve different use cases that give the same or similar error in SSIS. I hope this post may help to give you some ideas on how to fix it.

Use the Azure portal to manage Azure SQL Database

As part of the project requirements to know Microsoft Azure and Azure SQL, I begin the Azure SQL self-learning sessions. The first topic is Introduction to Azure SQL. This post is used as my study notes, and the content is useful for understanding.

Azure SQL database provides database services in Azure. It’s similar to SQL Server, except that it runs in the cloud. Azure SQL Database is a PaaS offering from Microsoft. I can create a managed database server in the cloud, and then deploy the databases on this server. I can manage the Azure SQL database using the Azure portal.

Azure SQL Database is available with several options: Single DatabaseElastic Pool, and Managed Instance

1. Single Database

This option enables you to quickly set up and run a single SQL Server database. You create and run a database server in the cloud, and you access your database through this server. Microsoft manages the server, so all you have to do is configure the database, create your tables, and populate them with your data. You can scale the database if you need additional storage space, memory, or processing power. By default, resources are pre-allocated, and you’re charged per hour for the resources you’ve requested. You can also specify a serverless configuration.

In this configuration, Microsoft creates its own server, which might be shared by a number of databases belonging to other Azure subscribers. Microsoft ensures the privacy of your database.

Azure SQL Database Use Cases

  • Modern cloud applications that need to use the latest stable SQL Server features.
  • Applications that require high availability.
  • Systems with a variable load, that need the database server to scale up and down quickly.

Azure SQL Database Business benefits

  • Azure SQL Database automatically updates and patches the SQL Server software to ensure that you are always running the latest and most secure version of the service.
  • The scalability features of Azure SQL Database ensure that you can increase the resources available to store and process data without having to perform a costly manual upgrade.
  • The service provides high availability guarantees, to ensure that your databases are available at least 99.99% of the time.
  • Databases can be replicated to different regions to provide additional assurance and disaster recovery.
  • Advanced threat protection provides advanced security capabilities, such as vulnerability assessments, to help detect and remediate potential security problems with your databases. It continuously monitors your database for suspicious activities and provides immediate security alerts on potential vulnerabilities, SQL injection attacks, and anomalous database access patterns.
  • Auditing tracks database events and writes them to an audit log in your Azure storage account. Auditing can help you maintain regulatory compliance, understand database activity, and gain insight into discrepancies and anomalies that might indicate business concerns or suspected security violations.
  • SQL Database helps secure your data by providing encryption. For data in motion, it uses Transport Layer Security (TLS). For data at rest, it uses Transparent Data Encryption (TDE). For data in use, it uses Always Encrypted.

For more readings on SQL encryption refers to the references section.

2. Elastic Pool

This option is similar to Single Database, except that by default multiple databases can share the same resources, such as memory, data storage space, and processing power through multiple-tenancy. The resources are referred to as a pool. You create the pool, and only your databases can use the pool. This model is useful if you have databases with resource requirements that vary over time, and can help you to reduce costs.

Elastic Pool enables you to use the resources available in the pool, and then release the resources once processing has completed.

For example, your payroll database might require plenty of CPU power at the end of each month as you handle payroll processing, but at other times the database might become much less active. You might have another database that is used for running reports. 

3. SQL Database managed instance

The Single Database and Elastic Pool options restrict some of the administrative features available to SQL Server. Managed instance effectively runs a fully controllable instance of SQL Server in the cloud. You can install multiple databases on the same instance. You have complete control over this instance, much as you would for an on-premises server. The Managed instance service automates backups, software patching, database monitoring, and other general tasks, but you have full control over security and resource allocation for your databases. 

Managed instances depend on other Azure services such as Azure Storage for backups, Azure Event Hubs for telemetry, Azure Active Directory for authentication, Azure Key Vault for Transparent Data Encryption (TDE) and a couple of Azure platform services that provide security and supportability features. The managed instances make connections to these services.

All communications are encrypted and signed using certificates. To check the trustworthiness of communicating parties, managed instances constantly verify these certificates through certificate revocation lists. If the certificates are revoked, the managed instance closes the connections to protect the data.

Managed Instance Use Cases

Consider Azure SQL Database managed instance if you want to lift-and-shift an on-premises SQL Server instance and all its databases to the cloud, without incurring the management overhead of running SQL Server on a virtual machine.

SQL Database managed instance provides features not available with the Single Database or Elastic Pool options. 

If your system uses features such as linked servers, Service Broker (a message processing system that can be used to distribute work across servers), or Database Mail (which enables your database to send email messages to users), then you should use managed instance.

Managed Instance Business benefits

  • SQL Database managed instance provides all the management and security benefits.
  • Managed instance deployment enables a system administrator to spend less time on administrative tasks because the SQL Database service either performs them for you or greatly simplifies those tasks.
  • Automated tasks include operating system and database management system software installation and patching, dynamic instance resizing and configuration, backups, database replication (including system databases), high availability configuration, and configuration of health and performance monitoring data streams.
  • Managed instance has near 100% compatibility with SQL Server Enterprise Edition, running on-premises.
  • The SQL Database managed instance deployment option supports traditional SQL Server Database engine logins and logins integrated with Azure Active Directory (AD).

Typical configuration tasks such as increasing the database size, creating a new database, and deleting an existing database are done using the Azure portal.

Azure Portal

I can use the Azure portal to dynamically manage and adjust resources such as the data storage size and the number of cores available for database processing. These tasks would require the support of a system administrator if I were running the database on-premises. I can even connect to databases and run queries.

For those who have not seen SQL Server Management Studio (SSMS), below is the GUI that I get from the same website. I use SSMS often in my work.

Azure Data Studio is an open-source, cross-platform tool for querying and working with various Azure data sources, including SQL Server and Azure SQL. I briefly used it in one of my projects because I used Ubuntu as the development machine. It has changed since then. Now, I saw it has an additional feature called Notebook. Notebooks allow you to mix runnable code cells and formatted text in one place.

Languages

The Azure SQL is still using the Transact-SQL (T-SQL) language and the Tabular Data Stream (TDS) protocol. Azure SQL supports REST APIs for the management of SQL Managed Instances, and SQL databases. The popular command-line interface such as sqlcmd and bcp still available in the Azure SQL service. Also, Azure CLI and Azure PowerShell cmdlets are supported for specific Azure SQL service scenarios. Tools like sqlcmd and az are pre-installed in Azure Cloud Shell.

Still remember that I shared about the purchase models (vCores or database transaction unit DTU-based options), the service tiers (general purpose, business critical or Hyperscale), the compute tier (provisioned compute or serverless compute), data max size, and the hardware information at the beginning of the Introduction to Azure SQL? These are the required pre-deployment planning.

Deployment

For Azure SQL Database and Azure SQL Managed Instance, there are essentially five panes in the Azure portal to fill in during a deployment. All of the services are built on the Azure backbone known as Azure Service Fabric. Understanding the back end of how some of these services are deployed and scaled on Azure Service Fabric will help understanding various behaviors that one might see.

In the Microsoft Learn’s website, it shares a few of these options.

1. Server

When create an Azure SQL managed instance, supplying the server name is the same as in SQL Server. For databases and elastic pools, an Azure SQL Database server is required. This is a logical server that acts as a central administrative point for a single or pooled database. It includes logins, firewall rules, auditing rules, threat detection policies, and failover groups.

This logical server does not expose any instance-level access or features as Azure SQL Managed Instance. For Azure SQL Database servers, the server name must be unique across all of Azure.

2. Compute and storage

Generally, if migrating, you should use size that is similar to what you use on-premises. Data migration Assistance SKU recommender estimates the number of vCores and data max size based on current workload. Data max size is not necessarily the size of your data today. It is the maximum amount of data space that can be allocated for your database.

3. Networking Configuration

When you deploy a SQL database, currently the default is No access. You can then choose to select a public endpoint or private endpoint. You can also select Add current client IP address if you want to be able to connect from the IP address from the client computer that you used to deploy Azure SQL Database.

With Azure SQL Managed Instance, you deploy it inside an Azure virtual network and a subnet that is dedicated to managed instances. This enables you to have a secure, private IP address. Azure SQL Managed Instance provides the ability to connect an on-premises network to a managed instance, connect a managed instance to a linked server or other on-premises data store, and connect a managed instance to other resources.

4. Data Source

In Azure SQL Database, upon deployment you have the option to select the AdventureWorksLT database as the sample in the Azure portal.

In Azure SQL Managed Instance, you deploy the instance first and then databases inside it. There is no option to have the sample database upon deployment.

You can also deploy a blank database or create a database that’s based on the restore from a geo-replicated backup.

5. Data Collations

In Azure SQL Database, you cannot set the server collation. It’s set at the default (and most common) collation of SQL_Latin1_General_CP1_CI_AS, but you can set the database collation.

In Azure SQL Managed Instance, you can set the server collation upon the creation of the instance. It cannot be changed later, but can modify the collations on a database and column level.

I get to learn what the SQL_Latin1_General_CP1_CI_AS database collation means.

  • SQL means it’s a SQL Server collation (as opposed to a Windows or binary collation).
  • Latin1_General specifies the alphabet/language to use when sorting.
  • CP1 references the code page that the collation uses.
  • CI means it will be case-insensitive. CS would mean case-sensitive.
  • AS means it will be accent-sensitive. AI would mean accent-insensitive.

6. Opt-in For Azure Defender

It is chargeable service that helps identifying or mitigating potential database vulnerabilities and threat detection.

Resource Governance

There are three technologies are primarily used to govern your usage of resources in Azure SQL

  • Windows job objects allow a group of processes to be managed and governed as a unit. Job objects are used to govern the file’s virtual memory commit, working set caps, CPU affinity, and rate caps. You can use the sys.dm_os_job_object dynamic management view to see the limits in place.
  • Resource Governor is a SQL Server feature that helps users (and in this case, Azure) govern resources like CPU, physical I/O, and memory. Azure SQL Managed Instance also allows user-defined workload groups and pools for Resource Governor.
  • File Server Resource Manager is available in Windows Server and is used to govern file directory quotas, which are used to manage Data max size.

Verification

For Azure SQL Managed Instance and Azure SQL Database, the first thing you might do is check the status of the database or instance with the Azure portal or the Azure CLI.

For Azure SQL Managed Instance, you then might check the error log, which is a common thing to do in SQL Server on-premises or in an Azure VM. This capability is not available in Azure SQL Database.

Next, confirm that your network is configured properly, obtain the server name, and connect in a tool like SQL Server Management Studio or Azure Data Studio. 

SELECT @@VERSION
SELECT * FROM sys.databases
SELECT * FROM sys.objects
SELECT * FROM sys.dm_os_schedulers
SELECT * FROM sys.dm_os_sys_info
SELECT * FROM sys.dm_os_process_memory --Not supported in Azure SQL Database
SELECT * FROM sys.dm_exec_requests
SELECT SERVERPROPERTY('EngineEdition')
SELECT * FROM sys.dm_user_db_resource_governance -- Available only in Azure SQL Database and SQL Managed Instance
SELECT * FROM sys.dm_instance_resource_governance -- Available only in Azure SQL Managed Instance
SELECT * FROM sys.dm_os_job_object -- Available only in Azure SQL Database and SQL Managed Instance

Below is some of the explanation of the last 3 lines of SQL statements.

The last three queries are available only in Azure SQL Database and/or Azure SQL Managed Instance. The first, sys.dm_user_db_resource_governance, will return the configuration and capacity settings used by resource governance mechanisms in the current database or elastic pool. You can get similar information for an Azure SQL Managed Instance with the second, sys.dm_instance_resource_governance. The third, sys.dm_os_job_object, will return a single row that describes the configuration of the job object that manages the SQL Server process, as well as resource consumption statistics.

Knowledge Check

Which of the following options have limits depending on your deployment option and service tier?

The answer is size of tempdb, max log size and backup retention.

For verifying deployments, some new queries are specific to Azure SQL Database and Azure SQL Managed Instance. Which of the following queries are available only for Azure SQL platform as a service (PaaS)?

The answer is SELECT * FROM sys.dm_user_db_resource_governance

References

[SQL] Azure SQL Deployment Options

As part of the project requirements to know Microsoft Azure and Azure SQL, I begin the Azure SQL self-learning sessions. The first topic is Introduction to Azure SQL. There are some histories before it re-branded and called Azure SQL. History can Google it online.

Deployment Options

The summary diagram of the deployment options for Azure SQL that I get from Microsoft’s website tells a little bit about Azure SQL.

1. SQL virtual machines (VMs)

From what I understand, the SQL virtual machines (VMs) is infrastructure as a service (IaaS) where I take responsibility for updating and patching the OS and SQL Server. That means I have access to the full capabilities of SQL Server. Azure can help automate backups and security patches.

Some considerations for optimally deploying and managing SQL Server on virtual machines:

  • Self-install SQL Server on an Azure VM can use SQL Server IaaS Agent Extension for licensing flexibility and to enable automatic backups and updates.
  • Consider memory-optimized or storage-optimized virtual machine sizes for maximum performance.
  • Integrate your virtual machines to on-premises networks by using Azure virtual networks.

Versionless SQL

Versionless SQL is an additional significant difference between IaaS and PaaS. Unlike IaaS, which is tied to a specific SQL Server version, like SQL Server 2019, SQL Database and SQL Managed Instance are versionless. The main “branch” of the SQL Server engine codebase powers SQL Server 2019, SQL Database, and SQL Managed Instance. PaaS services allow Microsoft to continually update SQL databases/instances. Microsoft rolls out fixes and features as appropriate. I do not have control over these updates, according to Microsoft’s website. Besides that, the SQL server version will not be shown if I execute the below syntax.

SELECT @@VERSION

2. Azure SQL Managed Instance

Meanwhile, the rest of the deployment options is a platform as a service (PaaS) deployment. These PaaS Azure SQL deployment options contain a fully managed database engine that automates most of the database management functions, like upgrading, patching, backups, and monitoring. 

Azure SQL Managed Instance is the only PaaS service that supports instance-scoped features such as common language runtime (CLR), SQL Server Agent, Service Broker, Database Mail, linked servers, distributed transactions (preview), and Machine Learning Services.

Key features of Azure SQL managed instance:

  • Scalability, high availability and business continuity.
  • Automated backups and long term backup retention.
  • Geo-replication.
  • Automatic tuning, built-in monitoring and built-in intelligence.
  • Network security and advanced security.

The marketplace pre-installed SQL Server on Windows or Linux. Microsoft Azure offers PostgreSQL, MariaDB and MySQL too

3. SQL Database

SQL Database is a PaaS deployment option of Azure SQL that abstracts both the OS and the SQL Server instance away from users. This deployment option allows you to just get a database and start developing applications. SQL Database is also the only deployment option that supports scenarios that require unlimited database storage (hyperscale) and autoscaling for unpredictable workloads (serverless). SQL Database has the industry’s highest availability SLA. It provides other intelligent capabilities related to monitoring and performance, partly because Microsoft manages instances.

Example use case is AccuWeather. It has been analyzing and predicting weather and Azure allows access to its big data, machine learning and AI capabilities.

Additional Options: Elastic Database Pool

Besides all the above, the SQL Database deployment option has additional two options if I have multiple instances or databases. Elastic database pools allow you to share resources among multiple instances and databases and optimize your costs. From Microsoft’s website, I extracted the explanation of SQL Database elastic pools and SQL Managed instance pools. Refer below:

SQL Database elastic pools allow you to host many databases within a single set of provisioned SQL Database resources. This option is ideal for a software as a service (SaaS) application or provider because you can manage and monitor performance in a simplified way for many databases.

SQL Managed Instance pools allow you to host multiple managed instances and share resources. You can pre-provision compute resources. Doing so can reduce overall deployment time to make migrations easier. You can also host smaller managed instances in an instance pool than you can in a single managed instance. 

Purchase Model

It has two options, based on virtual cores (vCore-based) and based on database transaction units (DTU-based). The DTU-based is not available for Azure SQL Managed Instance. Microsoft’s website recommends to use vCore-based model that allows independently select compute and storage resources, while DTU-based model is bundled measure of compute, storage and I/O resources.

Microsoft recommends to use vCore-based model, I pay for:

  • Computer resources – Service tier, number of vCores, and memory, generation of hardware.
  • The type and amount of data and log storage.
  • Backup storage location.
    • Read-access geo-redundant storage (RA-GRS)
    • Zone-redundant storage (ZRS)
    • Locally-redundant storage (LRS)

Service Tier for Performance and Availability

Three tiers are available in the vCore model:

  • General Purpose: Suitable for most business workloads. Offers budget-oriented, balanced, and scalable compute and storage options.
  • Business Critical: Suitable for business applications with low-latency response requirements. Offers the highest resilience to failures by using several isolated replicas. This tier is the only one that can use In-Memory OLTP to improve performance.
  • Hyperscale: Suitable for business workloads with highly scalable storage (100 TB+) and read-scale requirements. From a performance and cost perspective, this tier falls between General Purpose and Business Critical. Hyperscale is currently available only for single databases in Azure SQL Database.

Compute Tier

It needs to look into compute tier if choose General Purpose and the vCore-based model.

  • Provisioned compute is meant for more regular usage patterns with higher average compute utilization over time, or for multiple databases that use elastic pools. Provisioned compute provides a fixed amount of resources over time to ensure optimal performance and is billed for those resources regardless of usage. In provisioned compute, you need to manage the sizing of compute resources for your workload.
  • Serverless compute is meant for intermittent, unpredictable usage with lower average compute utilization over time. Serverless provides automatic compute scaling to simplify performance management and is billed only for the amount of compute used. Serverless also supports automatic pausing and resuming to help further price optimize. When your database is paused, you pay only for storage.

Hardware

The default hardware generation at this time is referred to as Gen5 hardware. Other available hardware options are:

  • Fsv2-series (compute-optimized)
  • M-series (memory-optimized)
  • DC-series (confidential computing)

Knowledge Check

Only SQL Server in an Azure virtual machine gives access to the underlying OS. In the PaaS deployment options, the OS is abstracted away from you, meaning Azure manages it, but you cannot access it.

Consider this scenario: you want to migrate to the cloud, but you use a third-party application that requires access to the operating system (OS). Which Azure SQL deployment option will be easiest to use?

The answer is SQL Server in an Azure virtual machine.

Consider this scenario: you want to migrate to the cloud and remove some of the management associated with SQL Server, but your application uses CLR and Service Broker capabilities from SQL Server. Which Azure SQL deployment option will be easiest to use?

The answer is Azure SQL Managed Instance.

You’re moving an application and database to Azure, but your database is currently 62 TB and will continue to grow. You don’t currently use any instance-scoped features. Which Azure SQL deployment option will be easiest to use?

The answer is Azure SQL Database, single database.

Imagine you have Azure SQL Database with the serverless compute tier database deployed and an autopause delay of two hours. After two hours of no activity, what happens to your database and incurred charges?

The answer is Azure will pause your database to stop compute costs, and charge you only for storage.

References

[SQL] Datetime and Datetime2

Lately, I was working on updating the data marts in the testing environment, and I found one of the data marts was using the data type, datetime2 in the MSSQL. Therefore, I went to do some reading about the differences between datetime and datetime2.

Introduction

Format:

  • Datetime – YYYY-MM-DD hh:mm:ss[.nnn]
  • Datetime2 – YYYY-MM-DD hh:mm:ss[.nnnnnnn]

The datetime needs 8 bytes of storage and it has a range from 1753-01-01 to 9999-12-31. The datetime2 datatype uses 6 to 8 bytes depending on the millisecond precision. It ranges from 0001-01-01 00:00:00.0000000 through 9999-12-31 23:59:59.9999999.

Also, I understand that data from Oracle are using datetime2. Therefore, some tables that store source data from Oracle may use data type datetime2 in the table schema.

Problem faced in ETL

Occasionally, I faced errors when I tried to do Data Conversion of a column with the data type, datetime2 to database timestamp (datetime) in the SSIS package. For example, it tried to convert a datetime2 value, 2021-05-23 15:38:44:6666667, to datetime. The error randomly happened to my SSIS package, and I could not figure out why it happened. In one of the SSIS packages, the Data Conversion worked well with this example without any changes. However, another SSIS package worked well after I deleted the data flow arrow linking from Data Conversion to OLE DB Destination for data insertion.

Cast datetime2 to datetime

On the same problem, I did try to use T-SQL to perform data conversion of a datetime2 value, 2021-05-23 15:38:44:6666667, to datetime in a SELECT statement in SQL Server Management Studio to datetime. It returns the result without any errors. I was expecting it may throw truncation error because of the milliseconds value in this record. If I pre-transformed the data in the SELECT statement before I begin to load data for actual transformation, does it still consider as ETL or it is ELT?

If you have encountered the same problem before and know how to fix it, please share it with me in the comment box below. Appreciated your help. Thank you 🙂

References

Azure Data Studio

This is a quick introduction to Azure Data Studio. I briefly used the Azure Data Studio in 2018 when I was using Ubuntu OS on one of my working laptops. I used Azure Data Studio as a replacement for DBeaver, a SQL database administration tool for Linux OSes. Almost three years now, Azure Data Studio has become more mature, and it has better features that I just found out. Let me share more about it in this post.

Installation

The installation guide and the download link can be found directly from Microsoft’s website. It is a free tool, cross-platform database tool for data professionals who use on-premises or cloud data platform on Windows, Linux or macOS. Windows’ installation can be completed using the wizard, no hassles at all, for macOS, makes sure the Azure Data Studio available in the Launchpad, drag Azure Data Studio.app to the Applications folder, and for Linux is completed using the Terminal and running some commands.

The interface looks as above. What is new to me is the Notebook. It is like a Jupyter Notebook that I can run or execute codes and use for analysis. Notebooks allow you to mix runnable code cells and formatted text in one place. I have not tried it yet. I will share again if I do.

Set Up Connection

I have an Azure SQL database created in the Azure Sandbox, and I used the Azure Data Studio to connect it. When I launched the Azure Data Studio, it prompted me to set up the connection to a server. Refer to the above screenshot to fill up the information. The server name can be found from the Azure Portal.

After it gets connected, I can see the server connection on the left pane with the server name, database name and user name. If I do not specify the database during the connection, I will see a list of databases on this server. The interface looks similar to the SQL Server Management Studio.

Run A Query

On the right side, the interface looks like the above screenshot. I can make a new query by clicking on the New Query. For example, I want to check the database server version. I can write the SQL script in the query editor and click Run. I will check again whether pressing F5 works the same.

That is all about it. I will update again once I try out the Notebook.

Reference

[SQL] Why use “where 1=2”

Sometimes ago, I went to an interview and one of the interview questions was, what do I get after running a SQL statement below:

select * from table where 1=2

I should have quickly Googled it since it was a video interview. But, I did not do so and did not answer this question. After the interview, I did Googled it and posted my answer on my Facebook. Facebook today’s memories reminded me that I shared this interview question and its answer. I would like to share one of the examples from stackoverflow.com.

example

CREATE TABLE EMPL_DEMO AS SELECT * FROM employees WHERE 1=2

EXPLANATION

The above is an example of how where 1=2 is used in table creation. A respondent replied that this is a command used to copy table structure to another table, such as EMPL_DEMO. It will have the same columns as the employees’ table, except for the keys or constraints.

The 1=2 always evaluates to False that prevents from copying any records from the table.

I hope this is simple enough for quick understanding and learning about SQL. You can check out other SQL topics in my blog from this link. If you have any feedback, please leave me a message in the comment box below.

reference

%d bloggers like this: