Data Warehouse Design

The data warehouse is the core of a Business Intelligence system. As a database administrator in my previous job, I managed various data warehouses that provided meaningful insights for business decision making, data analysis and reporting, such as sales reports and sales trends. A data warehouse is not a product; it is an environment that holds current and historical data, both structured and unstructured. Data warehousing is the process of transforming data into information and making it available to business users. The data is ingested, cleaned and transformed so that business users can access the processed data with Business Intelligence tools such as Tableau, Power BI or SQL clients. Note that data warehousing is not the same as data mining.

Types of Data Warehouse

Enterprise Data Warehouse:

  • It is a centralized warehouse.
  • It provides decision support service across the enterprise.
  • It offers a unified approach for organizing and representing data.
  • It provides the ability to classify data according to the subject and give access according to those divisions.

Operational Data Store:

  • A data store that is required when neither the data warehouse nor the OLTP systems can support an organization's reporting needs.
  • Unlike a data warehouse, it is refreshed in near real time.

Data Mart:

  • A data mart is a subset of the data warehouse.
  • It is specially designed for a particular line of business, such as sales or finance.
  • Data can be collected directly from the sources.

Star Schema vs. Snowflake Schema

Knowing the differences between star and snowflake schema is one of the most common interview questions. Star schema is a mature modelling approach widely adopted by relational data warehouses. It requires modellers to classify their model tables as either dimension or fact. Generally, dimension tables contain a relatively small number of rows. Fact tables, on the other hand, can hold a huge number of rows and continue to grow over time.

The snowflake schema is another type of multidimensional model. It contains the fact table, the dimension tables, and one or more additional tables for each dimension table. The snowflake schema is a normalized form of the star schema, which reduces redundancy and saves significant storage. The trade-off is that queries need a larger number of joins between the tables, so accessing the data is generally more complex and slower than with a star schema.
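
To make the difference concrete, below is a minimal sketch of the same product dimension in both designs, using hypothetical table and column names (DimProduct, DimCategory and the columns are illustrative, not from any specific product):

CREATE TABLE DimProduct_Star (
    ProductKey   INT PRIMARY KEY,
    ProductName  NVARCHAR(100),
    CategoryName NVARCHAR(50)   -- star: denormalized, repeated for every product in the category
);

CREATE TABLE DimCategory (
    CategoryKey  INT PRIMARY KEY,
    CategoryName NVARCHAR(50)
);

CREATE TABLE DimProduct_Snowflake (
    ProductKey   INT PRIMARY KEY,
    ProductName  NVARCHAR(100),
    CategoryKey  INT REFERENCES DimCategory (CategoryKey)  -- snowflake: normalized into a second table
);

A query for category totals against the snowflake design needs one extra join (fact table to DimProduct_Snowflake to DimCategory), which is exactly the trade-off described above.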

Metadata

Metadata is also known as a data dictionary or data specification document. It provides details of the data such as the name of each column (attribute), the data type of the column (string, number, etc.), the order of the columns, and which columns are nullable. Datasets are made up of data objects (entities). After we identify the metadata, we can design the conceptual, logical and physical data models.
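
In SQL Server, much of this metadata can be read straight from the catalog views. A small sketch (the table name is a placeholder):

-- Column name, position, data type and nullability for one table
SELECT COLUMN_NAME, ORDINAL_POSITION, DATA_TYPE, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'MyTable'   -- placeholder table name
ORDER BY ORDINAL_POSITION;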

Relationship and Cardinality

Relationship refers to how the tables (entities) are linked together. According to the Visual Paradigm website (a data modelling tool website), cardinality defines the possible number of occurrences in one entity which is associated with the number of occurrences in another. A Google search for cardinality in general terms returns the definition "the number of elements in a set or other grouping, as a property of that grouping". For example, ONE team has MANY players. The Lucidchart page in the references shows the different types of relationships used in an ER (entity-relationship) diagram.

Generally, the common types of cardinality are:

  • One to one
  • One to many
  • Many to many

In terms of query optimization, cardinality refers to the data in a column of a table, specifically how many unique values it contains. If you have done data profiling with Microsoft Power BI, for example, you will notice that Power BI provides summary statistics showing whether the attributes contain unique values. This information helps with planning queries and optimizing execution plans.
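
You can also profile cardinality yourself with a simple query; a sketch against a hypothetical dbo.Orders table:

-- Compare unique values to total rows: the closer they are, the higher the cardinality
SELECT COUNT(DISTINCT CustomerID) AS UniqueValues,
       COUNT(*) AS TotalRows
FROM dbo.Orders;   -- hypothetical table and column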

Conceptual, Logical and Physical Data Model

A general understanding of the three data models is that a business analyst uses the conceptual and logical models to model the business objects that exist in the system, while a database designer or database engineer elaborates the conceptual and logical ER models to produce the physical model, which represents the physical database structure ready for database creation.

ERD feature     Conceptual   Logical    Physical
Entity (name)   Yes          Yes        Yes
Relationship    Yes          Yes        Yes
Column          -            Yes        Yes
Column’s Type   -            Optional   Yes
Primary Key     -            -          Yes
Foreign Key     -            -          Yes

Comparison of the conceptual, logical and physical data models.

In summary,

Enterprise data modelling, metadata management and data warehousing are part of Data Governance (DMBOK), which I mentioned in my previous posts. I learned it in my current job, and I think it is useful for anyone who is serious about building an enterprise data warehouse. Part of my job scope includes:

  • Design and develop semantic data models.
  • Design metadata and harmonized code references.
  • Identify business keys and relationships among the tables.
  • Involve all stakeholders, including business users, in the data warehouse implementation process.

References:
https://www.datacamp.com/community/blog/data-lakes-vs-data-warehouses
https://www.guru99.com/data-warehousing.html
https://www.geeksforgeeks.org/difference-between-snowflake-schema-and-fact-constellation-schema/
https://www.lucidchart.com/pages/ER-diagram-symbols-and-meaning

SQL Server: Reading the SQL Server log files using TSQL

Recently a colleague asked if I could assist with extracting the SQL Server log files using an SSIS ETL package. The log files are stored on the database server and get purged whenever the server restarts. My colleague wants to keep the server log files in the data warehouse for further analysis and IT auditing.

System Stored Procedure sp_readerrorlog

This stored procedure allows you to read the contents of the SQL Server error log files directly from a query window and lets you search for specific keywords when reading the error file. It is not new; it has been available since SQL Server 2005.

Example of using the stored procedure

This procedure takes four parameters:

  1. Value of error log file you want to read: 0 = current, 1 = Archive #1, 2 = Archive #2, etc…
  2. Log file type: 1 or NULL = error log, 2 = SQL Agent log
  3. Search string 1: String one you want to search for
  4. Search string 2: String two you want to search for to further refine the results

If you do not pass any parameters, this will return the contents of the current error log.

CREATE PROC [sys].[sp_readerrorlog]( 
   @p1 INT = 0, 
   @p2 INT = NULL, 
   @p3 VARCHAR(255) = NULL, 
   @p4 VARCHAR(255) = NULL) 
AS 
BEGIN 

   IF (NOT IS_SRVROLEMEMBER(N'securityadmin') = 1) 
   BEGIN 
      RAISERROR(15003,-1,-1, N'securityadmin') 
      RETURN (1) 
   END 
    
   IF (@p2 IS NULL) 
       EXEC sys.xp_readerrorlog @p1 
   ELSE 
       EXEC sys.xp_readerrorlog @p1,@p2,@p3,@p4 
END 

-- Call the stored procedure as below
EXEC sp_readerrorlog 6 

Note: the example here shows the values in single quotes, but for later versions of SQL Server you may need to use double quotes or you might get an error.
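
Based on the parameter list above, here is a sketch of a filtered call that searches the current error log for entries containing both strings:

-- Current error log (0), error log type (1), two search strings to refine the results
EXEC sp_readerrorlog 0, 1, 'Login', 'failed'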

In Conclusion,

I will run it on my local machine to verify and extract the SQL Server logs from SQL Server Management Studio and the SSIS ETL package. I will keep developing this article to show the results of the above stored procedure.

References: https://www.mssqltips.com/sqlservertip/1476/reading-the-sql-server-log-files-using-tsql/

Difference between OLTP and OLAP

Transaction processing vs. analytical processing.

What is OLTP?

OLTP stands for Online Transaction Processing. OLTP systems are the source of data for OLAP systems. The primary purpose of a typical OLTP system is to handle the operational database and record routine business data (through insertion, update or deletion requests); it is not designed to act as a data warehouse for analysis. It usually involves simple queries. Online transaction processing provides transaction-oriented applications in a 3-tier architecture. A typical OLTP database stores current (and detailed) data. The entity model (table model) is the schema used for storing such transactional databases, usually normalized to 3rd Normal Form (3NF). An OLTP system needs regular data backups without fail because it stores all the business's functional data.

What is OLAP?

OLAP stands for Online Analytical Processing. An OLAP database reads historical data that was captured by one or more OLTP systems, and it rarely writes into the database. It usually involves large, complex queries used for business decisions, and aggregation across multiple databases is often required. An OLAP system allows users to view a variety of summaries of stored multidimensional data and to extract information from a bigger database (typically the data warehouse) for decision making. It can be relatively slow when it involves a large dataset. Data is denormalized to improve query performance when aggregation is required. OLAP systems do not need the same rigorous backups: if data recovery is needed, the data can be reloaded from the OLTP source or its backups.
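
To make the contrast concrete, here is a minimal sketch with hypothetical tables: the OLTP statement touches a single row, while the OLAP query scans and aggregates many.

-- OLTP: short, transaction-oriented, single-row work
UPDATE dbo.Orders
SET OrderStatus = 'Shipped'
WHERE OrderID = 10308;   -- hypothetical table and key

-- OLAP: large scan with aggregation for decision support
SELECT YEAR(OrderDate) AS OrderYear, SUM(SalesAmount) AS TotalSales
FROM dw.FactSales        -- hypothetical fact table
GROUP BY YEAR(OrderDate);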

The distinct difference between OLTP and OLAP also relates to data analysis, data mining and Big Data. Each set of databases has its own data structures and algorithms: those that work for operational, transactional data in OLTP, and those that work for data analytics in OLAP.

Moving forward, I will compile some useful SQL interview questions on my blog. I hope to get feedback from readers who work on databases as well.

SQL Server: Find MAX value from multiple columns

A community member from the .NET Developer Community Singapore shared an excellent article about using MAX(DateTime) in SSIS to get unique records when there are duplicates. I often use the same approach when writing SQL statements in SQL Server Management Studio. However, the article mentions that the SSIS DateTime variable is missing milliseconds, so most of the time you will get duplicated records. How did the writer resolve the issue? You can follow the link to read the article, which also contains a further reference link worth reading.

Recently, I explored finding the maximum value across multiple DateTime columns in a SQL table. As it was part of the data extraction requirements, I came across a good example with the execution plan; you can refer to the link below for the online article. It shows how to find the maximum or minimum value from different columns of the same data type in a table.

Below is a snippet of the SQL statements to create a temporary table with values:

IF (OBJECT_ID('tempdb..##TestTable') IS NOT NULL)
	DROP TABLE ##TestTable

CREATE TABLE ##TestTable
(
	ID INT IDENTITY(1,1) PRIMARY KEY,
	Name NVARCHAR(40),
	UpdateByApp1Date DATETIME,
	UpdateByApp2Date DATETIME,
	UpdateByApp3Date DATETIME

)

INSERT INTO ##TestTable(Name, UpdateByApp1Date, UpdateByApp2Date, UpdateByApp3Date )
VALUES('ABC', '2015-08-05','2015-08-04', '2015-08-06'),
	  ('NewCopmany', '2014-07-05','2012-12-09', '2015-08-14'),
	  ('MyCompany', '2015-03-05','2015-01-14', '2015-07-26')
	  
SELECT * FROM ##TestTable

Based on the above SQL statements, you can see that the temporary table called TestTable has 5 columns, 3 of which share the same data type, DateTime. As the title suggests, the expected result is a column that contains the maximum date value alongside the ID column. The writer provides 3 different solutions.

Solution 1: Using MAX with a VALUES derived table

SELECT 
   ID, 
   (SELECT MAX(LastUpdateDate)
      FROM (VALUES (UpdateByApp1Date),(UpdateByApp2Date),(UpdateByApp3Date)) AS UpdateDate(LastUpdateDate)) 
   AS LastUpdateDate
FROM ##TestTable

Solution 2: Using UNPIVOT

SELECT ID, MAX(UpdateDate) AS LastUpdateDate 
FROM ##TestTable
UNPIVOT ( UpdateDate FOR DateVal IN ( UpdateByApp1Date, UpdateByApp2Date, UpdateByApp3Date ) ) AS u
GROUP BY ID, Name 

Solution 3: Using UNION

SELECT ID, MAX(UpdateDate) AS LastUpdateDate
FROM
(
	SELECT ID, UpdateByApp1Date AS UpdateDate
	FROM ##TestTable
	UNION 
	SELECT ID, UpdateByApp2Date AS UpdateDate
	FROM ##TestTable
	UNION 
	SELECT ID, UpdateByApp3Date AS UpdateDate
	FROM ##TestTable
) ud
GROUP BY ID

The writer provides a performance comparison for these solutions and notes that solution 1 performs best. You can refer to the execution plans and the explanation of the findings in the article. Choose the solution that best fits your query, table structure, and so on. For more information, see the references below.

References:
https://www.mssqltips.com/sqlservertip/4067/find-max-value-from-multiple-columns-in-a-sql-server-table/
https://stackoverflow.com/questions/71022/sql-max-of-multiple-columns

SQL Server: LEFT JOIN conditions in ON vs WHERE clause

One of the new tasks in my project required me to check the difference between placing conditions in the ON clause versus the WHERE clause of a SQL Server LEFT JOIN. Before I go into the details and examples, let me begin with the basic function of the LEFT JOIN keyword in SQL Server.

LEFT JOIN

The simplest way I can put it: the LEFT JOIN keyword returns all records from the left table and the matched records from the right table. In the resultset, the query shows NULL where there is no matching record in the right table. The syntax for LEFT JOIN:

SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name

Here is a simple example adapted from w3schools:

SELECT c.CustomerName, o.OrderID
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID

Customers is table1 and Orders is table2. In the example above, I used aliases: c for Customers and o for Orders. SQL aliases give a table, or a column in a table, a temporary name. The two tables join on the same key, in this example CustomerID. For each row from the Customers table, the query compares it with the rows from the Orders table. If a CustomerID exists in both tables, the column values from the two tables are combined to form a new row, which is then included in the resultset. If a row from the Customers table has no matching row in the Orders table, the query combines the column values from the Customers row with NULL for each column from the Orders table. Sometimes I use the Venn diagrams from Stack Overflow to help me identify the resultset I want from my query.

A sample resultset is shown below. As you can see, it contains NULL values in the OrderID column, which means those customers have no associated orders.

CustomerName                         OrderID
Alfreds Futterkiste                  NULL
Ana Trujillo Emparedados y helados   10308
Antonio Moreno Taquería              10365
Around the Horn                      10355
Around the Horn                      10383
B’s Beverages                        10289
Berglunds snabbköp                   10278
Berglunds snabbköp                   10280
Berglunds snabbköp                   10384

You can perform a LEFT JOIN with more than one table, as the sketch below shows. Some databases use the LEFT OUTER JOIN keyword.
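
A sketch of chaining a second LEFT JOIN, assuming a hypothetical Shippers table keyed by ShipperID:

SELECT c.CustomerName, o.OrderID, s.ShipperName
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
LEFT JOIN Shippers s ON o.ShipperID = s.ShipperID;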

With WHERE condition

The WHERE clause is used to limit the resultset, returning only customers that have ordered something. For example, the table below excludes the customer that has no OrderID.

SELECT c.CustomerName, o.OrderID
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE o.OrderID IS NOT NULL

CustomerName                         OrderID
Ana Trujillo Emparedados y helados   10308
Antonio Moreno Taquería              10365
Around the Horn                      10355
Around the Horn                      10383
B’s Beverages                        10289
Berglunds snabbköp                   10278
Berglunds snabbköp                   10280
Berglunds snabbköp                   10384

LEFT JOIN: conditions in ON vs. WHERE clause

Let's observe the difference between the above query and the query below:

SELECT c.CustomerName, o.OrderID
FROM Customers c
LEFT JOIN Orders o ON c.CustomerID = o.CustomerID AND c.Country = 'Mexico'

When the condition moves from the WHERE clause to the ON clause, as shown above, the query still returns all customer records. However, only the customers from Mexico have associated order information; every other customer shows NULL in the OrderID column. The resultset is shown below:

CustomerName                         OrderID
Alfreds Futterkiste                  NULL
Ana Trujillo Emparedados y helados   10308
Antonio Moreno Taquería              10365
Around the Horn                      NULL
B’s Beverages                        NULL
Berglunds snabbköp                   NULL
Blauer See Delikatessen              NULL
Blondel père et fils                 NULL
Bon app’                             NULL

This write-up shows how to use the SQL Server LEFT JOIN keyword to retrieve data from multiple related tables.

Note: with an INNER JOIN, a condition in the ON clause is functionally equivalent to the same condition placed in the WHERE clause.

Reference: https://www.sqlservertutorial.net/sql-server-basics/sql-server-left-join/

SQL Server: Temporal Table

I am working on a project that requires the ETL to generate daily delta changes. The delta change is based on the last extracted date-time stored in a log table, and it filters the records from the main table that meet the criteria. Based on performance testing done by my teammate, this was not optimized for the millions of records we need to handle daily. My teammate told me that my manager suggested using a temporal table.

SQL Server 2016 introduced support for temporal tables, also known as system-versioned temporal tables. A temporal table is a database feature that brings built-in support for providing information about the data stored in the table at any point in time, rather than only the data that is correct at the current moment.

The evolution of data tracking mechanisms started with Change Tracking (CT), then Change Data Capture (CDC), and now temporal tables. I have no experience using temporal tables, so I did some research to find out more.

Introduction

A temporal table allows SQL Server to maintain and manage the history of the data in the table automatically and enables historical time analysis. It is a useful feature for auditing SQL Server data. Every temporal table has two explicitly defined period columns, each with a datetime2 data type, commonly named SysStartTime and SysEndTime (stored in the UTC time zone). Together they define the date-time range during which each row version was valid.

How does it work?

By querying the table, we can easily see how the data transitions through its various states within specific date-time intervals. In addition to the period columns, a temporal table also contains a reference to another table with a mirrored schema. The system uses this table to automatically store the previous version of the row each time a row in the temporal table gets updated or deleted. This additional table is referred to as the history table, while the main table that stores the current (actual) row versions is referred to as the current table, or simply the temporal table. During temporal table creation, users can specify an existing history table (which must be schema compliant) or let the system create a default history table.

Why temporal tables

Below are the use cases of the temporal tables besides auditing SQL Server data changes and performing data forensics when necessary:

  • Rebuilding state of the data as of any time in the past.
  • Calculating trends over time (historical trend analysis).
  • Maintaining a slowly changing dimension for decision support applications.
  • Recovering from accidental data changes and application errors.

Examples

From the Microsoft website, here is a simple example to illustrate a scenario with employee information in a database.

CREATE TABLE dbo.Employee
(
  [EmployeeID] int NOT NULL PRIMARY KEY CLUSTERED
  , [Name] nvarchar(100) NOT NULL
  , [Position] varchar(100) NOT NULL
  , [Department] varchar(100) NOT NULL
  , [Address] nvarchar(1024) NOT NULL
  , [AnnualSalary] decimal (10,2) NOT NULL
  , [ValidFrom] datetime2 GENERATED ALWAYS AS ROW START
  , [ValidTo] datetime2 GENERATED ALWAYS AS ROW END
  , PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
 )
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));
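
Once the table is populated, the history can be queried with the SYSTEM_TIME clause. A sketch against the dbo.Employee table above (the timestamp and key value are illustrative):

-- State of a row as of a point in time (reads the current and history tables)
SELECT EmployeeID, Position, AnnualSalary, ValidFrom, ValidTo
FROM dbo.Employee
FOR SYSTEM_TIME AS OF '2021-01-01 00:00:00'
WHERE EmployeeID = 1;

-- All versions of a row over time
SELECT EmployeeID, Position, AnnualSalary, ValidFrom, ValidTo
FROM dbo.Employee
FOR SYSTEM_TIME ALL
WHERE EmployeeID = 1
ORDER BY ValidFrom;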

I would like to share another example from sqlshack.com that creates an employee temporal table.

CREATE TABLE dbo.employee(
  EMPNO   INT,
  ENAME    VARCHAR(10),
  JOB      VARCHAR(9),
  MGR      INT,
  HIREDATE DATE,
  SAL      NUMERIC(7,2),
  COMM     NUMERIC(7,2),
  DEPTNO   INT,
     CONSTRAINT EMP_PK PRIMARY KEY (EMPNO),
   SysStartTime datetime2 GENERATED ALWAYS AS ROW START NOT NULL
  ,SysEndTime datetime2 GENERATED ALWAYS AS ROW END NOT NULL
  ,PERIOD FOR SYSTEM_TIME (SysStartTime,SysEndTime)) WITH (SYSTEM_VERSIONING = ON);

SQL Server creates the history table with the following naming format: dbo.MSSQL_TemporalHistoryFor_<Object_ID>.

There is no direct drop option on a temporal table

To drop a temporal table, we first need to turn off the SYSTEM_VERSIONING setting using an ALTER TABLE statement. For example:

ALTER TABLE [dbo].employee SET ( SYSTEM_VERSIONING = OFF )
GO

Once the SQL statement has executed, the temporal table and its history table become regular tables. Then we can use the DROP statement to delete them, as shown in the sketch below.
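
Putting the whole sequence together, a hedged sketch (the history table name is whatever the system generated for your table, following the naming format above):

ALTER TABLE [dbo].employee SET ( SYSTEM_VERSIONING = OFF );
DROP TABLE dbo.employee;
DROP TABLE dbo.MSSQL_TemporalHistoryFor_123456789;   -- replace with your actual history table name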

Temporal, Temporary Table and Table Variable

A temporal table is not the same as a temporary table or a table variable. I found articles that explain how those two are created in SQL Server; refer to the article written by Pinal Dave, "SQL SERVER – How to Create Table Variable and Temporary Table?". Another article shares a performance comparison of the temporary table and the table variable, and I think it is also a good read: "SQL SERVER – Table Variable or Temp Table – Performance Comparison – INSERT".

Summary

In this article, I covered a basic introduction to temporal tables, how they work, and some examples of creating one. Data retrieval becomes simple using the SYSTEM_TIME clause along with its associated sub-clauses. The history table can grow big, as every transaction on the temporal table takes up space; periodically purging data from the history table helps to release the space.
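
If your SQL Server version supports it (SQL Server 2017 and later, and Azure SQL Database), a retention period can purge old history automatically; a sketch assuming the dbo.Employee example above:

-- Automatically clean up history rows older than six months
ALTER TABLE dbo.Employee
SET (SYSTEM_VERSIONING = ON (HISTORY_RETENTION_PERIOD = 6 MONTHS));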

I will continuously update this entry whenever I make any successful use cases in my current project. I will use the available dataset to perform the walkthrough and explain the concept.

References:
https://www.sqlshack.com/temporal-tables-in-sql-server/
https://docs.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver15

Big Data, A Long Introduction

One of my colleagues from Business Operations texted me one morning and asked where she could get insights into some of the terminology, the difference between SQL and NoSQL, and how to decide which type of database to use. Instantly, I replied, "get it from me!" I was pretty confident that I could give her an answer, and I wanted to explain databases in a more interesting way.

What is SQL?

Structured Query Language (SQL) is a computer language for database management systems and data manipulation. SQL is used to perform insertions, updates and deletions, and it allows us to access and modify data. Data is stored in a relational model, with rows and columns: rows contain all of the information about one specific entry, and columns are the separate data points.
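
A few one-line examples of those operations, with an illustrative dbo.Customers table (assuming CustomerID is an identity key):

INSERT INTO dbo.Customers (CustomerName) VALUES ('Alice');           -- insertion
UPDATE dbo.Customers SET CustomerName = 'Bob' WHERE CustomerID = 1;  -- update
DELETE FROM dbo.Customers WHERE CustomerID = 1;                      -- deletion
SELECT CustomerID, CustomerName FROM dbo.Customers;                  -- retrieval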

What is NoSQL?

NoSQL encompasses a wide range of database technologies designed to cater to the demands of modern apps. NoSQL databases store a wide range of data types, each with a different data storage model. The main ones are document, graph, key-value and columnar.

Apps such as Facebook, Twitter, search engines (web) and IoT applications generate huge amounts of data, both structured and unstructured. The best examples of unstructured data are photos and videos. Therefore, a different method is needed to store the data: NoSQL databases do not store data in a rows-and-columns (table) format.

Differences between SQL and NoSQL

There are a lot of websites that list the differences; I referred to the guru99 NoSQL tutorial (https://www.guru99.com/nosql-tutorial.html).

NoSQL databases are also known as schema-less databases. The guru99 tutorial uses the term "dynamic schema", which means the same thing: there is no fixed schema that locks in the same number of columns (fields) for every data entry. NoSQL data is allowed to have a different number of columns as records are added.

Another major difference is scalability: SQL databases scale vertically, while NoSQL databases scale horizontally. Let me explain what that means.

Relational databases are designed to run on a single server in order to maintain the integrity of the table mappings and avoid the problems of distributed computing. To upsize the system, we often look at more RAM, more CPU and more HDD, upgrading the hardware specification. This is scaling up, or vertical scaling, and it is expensive.

NoSQL databases are non-relational, making them easy to scale out, or scale horizontally: they run on multiple servers that work together, each sharing part of the load, and this can be done on inexpensive commodity hardware.

Question: SQL or NoSQL?

Referring to this article, the choice between SQL and NoSQL cannot be concluded from their differences alone, but from the project requirements. If your application has a fixed structure and does not need frequent modifications, SQL is preferable. Conversely, if you have applications where the data changes frequently and grows rapidly, as in Big Data analytics, NoSQL is the best option. And remember, SQL is not dead and can never be superseded by NoSQL or any other database technology.

In short, it depends on the type of application, the project requirements, and the type of query results needed.

Big Data

Big data refers not just to the total amount of data generated and stored electronically (volume) but also to specific datasets that are so large, in both size and complexity, that algorithms are required to extract useful information from them. Example sources include search engine data, healthcare data and real-time data. In my previous article, What is Big Data?, I shared that Big Data has 3 V's:

  • Volume of data. Amount of data from myriad sources.
  • Variety of data. Types of data; structured, semi-structured and unstructured.
  • Velocity of data. The speed and time at which the Big Data is generated.

Based on all the above, we have covered 2 of the 3 V's: volume and variety. Velocity is how fast data is generated and processed. There are more V's out there, and some are relevant to Big Data's description. During my visit to Big Data World 2018 in Singapore, I realized that my understanding of Big Data was limited to volume and variety. In this blog, I am going to write more.

Storing Big Data

Unstructured data cannot be stored in a normal RDBMS, and Big Data often involves real-time data with real-time processing requirements, so different storage is needed.

Hadoop Distributed File System (HDFS)

HDFS provides efficient and reliable storage for big data across many computers. It is one of the most popular distributed file systems (DFS), and it stores both unstructured and semi-structured data for data analysis.

Big Data Analytics

There are not many tools for NoSQL analytics on the market at the moment. One popular method of dealing with Big Data is MapReduce: the data is divided into small chunks and each chunk is processed individually. In other words, MapReduce spreads the required processing or queries over many computers (many processors).

Big Data is not limited to search engines and healthcare. It can also be data from e-commerce websites, where we want to perform targeted advertising and provide recommendation systems of the kind we often see on websites such as Amazon, Spotify or Netflix.

Big Data Security

Securing a network and the data it holds is a key issue. Basic measures such as firewalls and encryption should be taken to safeguard networks against unauthorized access.

Big Data and AI

While the smart home has become a reality in recent years, the successful invention of smart vehicles that can drive in auto mode gives us big hope that one day the smart city can be realized. Countries such as Singapore, Korea and China, and European countries such as Ireland and the UK, are planning smart cities, using IoT implementations and Big Data management techniques to develop them.

I am looking forward.

Reference:
Dawn E. Holmes (2017) Big Data A Very Short Introduction.

[SQL] Grant User Permissions

The task I was doing today required me to prepare a SQL script that grants a user permission to execute the user-defined functions and stored procedures. While SQL Server Management Studio allows us to set permissions with a few clicks, it is good to prepare them as a script so we can re-use them when we need to run them against the development and production databases.

Here is a simple script to grant a user execute permission on stored procedures. It applies to all stored procedures in the selected database:

/* GRANT EXECUTE TO THE USER OR ROLE */
GRANT EXECUTE TO <user>
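
A common variation is to grant execute to a role once and then add users to the role; a sketch with placeholder names:

/* CREATE A ROLE, GRANT EXECUTE TO IT, THEN ADD MEMBERS */
CREATE ROLE db_executor;
GRANT EXECUTE TO db_executor;
ALTER ROLE db_executor ADD MEMBER [<user_name>];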

Next, I found out that my application needs permissions set for the user-defined functions. Searching for a script to automate the process would take about the same time as writing the statements manually one by one. The script below generates the GRANT statements:

/* GENERATE GRANT STATEMENTS FOR THE USER-DEFINED FUNCTIONS */
-- Scalar functions ('FN') take EXECUTE; table-valued functions ('TF') take SELECT instead
SELECT CASE WHEN type = 'FN'
            THEN 'GRANT EXECUTE ON [' + SCHEMA_NAME(schema_id) + '].[' + [name] + '] TO MyRole;'
            ELSE 'GRANT SELECT ON [' + SCHEMA_NAME(schema_id) + '].[' + [name] + '] TO MyRole;'
       END
FROM sys.objects
WHERE type IN ('TF','FN');

There is another script which could help,

/* grant execute on all stored procedures, one by one, using a cursor */
declare @name varchar(100), @qry varchar(2000)
declare cursor_temp cursor
for select name from dbo.sysobjects where OBJECTPROPERTY(id, 'IsProcedure') = 1
open cursor_temp
fetch next from cursor_temp into @name
while @@fetch_status = 0
begin
set @qry = 'grant execute on [dbo].[' + @name + '] to [<user_name>]'
exec (@qry)
fetch next from cursor_temp into @name
end
close cursor_temp
deallocate cursor_temp

I hope this helps.

SQL Server Export Data into Excel

How do you export data from SQL Server Management Studio into an Excel file?

I used to save the data into a .csv file and transfer it to my local machine via FTP, or use copy and paste to move the data into Excel directly from the database. That works when you have control of the servers and security is not a big concern. Another method is using SSIS.

Due to various reasons, my client did not allow Excel to be installed on the database server and recommended I use the Export Data function in SSMS. I had seldom used the Export Data function. I gave it a try after being guided.

The steps are simple, guided and straightforward to follow. Here are the steps:
1. Right click the database > Tasks > Export Data.

2. Choose a data source, i.e. the source location. For this example, select SQL Server Native Client 11.0. Enter the user credentials if you are using SQL authentication; otherwise, continue with Windows Authentication.

Select the database which you want to export the data from.

3. Choose a destination, i.e. where you want to save or export the data to. For the destination, choose Microsoft Excel, then enter the Excel file path where you want to save the file. As for the Excel version, I believe it depends on the driver version installed. If you want to keep the first row as column names, check the checkbox.

Then, we can proceed with Next button.

4. Specify table copy or query. Here, we can either select which table or tables to copy into Excel in full, or write a SQL query to specify the dataset to be exported. Let's see how to export a table from SSMS.

Choose the first option, Copy data from one or more tables or views. This copies the entire dataset from the selected tables or views into the Excel file. Select the tables by checking the checkboxes under Source.

The destination will name each Excel sheet after the table's name. If you want to change the name, click on the destination table's name and edit it. If you do not want to export the entire table, go back and choose the second option to write a query that specifies the dataset.

Write your SQL statement in the next screen; in my case, I selected the top 1000 rows from the chosen table. You can rename the table name at the Destination in the following screen, as described earlier.
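
For example, a query along these lines (the table name is a placeholder):

SELECT TOP 1000 * FROM dbo.MyTable;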

The table name will appear as the Excel sheet name, one sheet per table. As you know, Excel has a maximum number of rows: 65,536 for Excel 2003 and 1,048,576 for Excel 2007. I believe the maximum number of columns is not a concern here.

Next, we move to the Edit Mappings… button, which is important when we want to control the format of the Excel columns.

In Edit Mappings, you can select the data type of each field from the dropdown list under the Type column, then click OK to save the mappings. You can view the dataset by clicking the Preview button; a limited number of rows will display in the next screen.

After editing the mappings and previewing the dataset, it is time to review the data type mapping.

5. Review Data Type Mapping. Click Next button to proceed and run the package.

6. Run Package. There is not much to choose here: either run immediately or cancel the process. Click Next, then Finish to complete the wizard; it will run the package and save the data into the Excel file.

Lastly, navigate to your Excel file path to retrieve the Excel file and check the data inside it.