Using Python with Microsoft Reporting Services (SSRS)

Using Python with SQL Server 2017 Reporting Services (SSRS) should not be an obstacle, since the framework and the technology are the same as with the R language. With SQL Server 2017, Python gained full and functional support for native SSRS.

The usability and functionality of Python is simply immense. This blog post is dedicated to using Python for data science in SSRS.

Most of the samples will be derived from AdventureWorks; the SQL Server sample databases are available on Microsoft's GitHub.

Selecting Python results with SSRS

The external procedure sp_execute_external_script running the Python language within the SQL Server database does not need ODBC drivers or Python modules such as pyodbc or sqlalchemy to extract or write data between the SQL Server engine and the Python engine. The only module needed is pandas, since the communication requires data frames for exposing data from SQL Server to Python and back.

-- Check Python runtime
EXECUTE sp_execute_external_script
       @language =N'Python'
      ,@script=N'OutputDataSet = InputDataSet'
      ,@input_data_1 = N'SELECT 1 AS result'
WITH RESULT SETS
((
result INT
))
GO

Please note: Python code, as specified with the @script parameter, must be valid and is sensitive to indentation, so a single missing (or excessive) indent might cause the complete query to fail.

We will now introduce an input parameter for better data selection in Python; the same parameter will later be used as an input parameter in SSRS. The T-SQL query with the Python script:

-- query parametrization
DECLARE @MaritalStatusIN CHAR(1) = 'S' -- S/M Single/Married

EXECUTE sp_execute_external_script
@language =N'Python',
@script = N'
import pandas as pd
df = InputDataSet
df_marital = df[''MaritalStatus''] == MaritalStatus
df_gen = df[df_marital]
correlation = df_gen.corr(method=''pearson'')
OutputDataSet = pd.DataFrame(correlation, columns=["nof","age"])',
@input_data_1 = N'SELECT 
               COUNT(*) AS nof,MaritalStatus,age 
               FROM AdventureWorksDW2014.dbo.vTargetMail
               WHERE age < 100
               GROUP BY maritalstatus,age'
,@params = N'@MaritalStatus CHAR(1)'
,@MaritalStatus = @MaritalStatusIN

WITH RESULT SETS 
(( 
   CountObs FLOAT
  ,Age FLOAT 
));
GO

For the purposes of SSRS, I will encapsulate the code above in a stored procedure with one input parameter – the marital status – which is propagated into the Python script.
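A minimal sketch of such a wrapper (the exact procedure from the post is part of the GitHub code) could look like this:

CREATE OR ALTER PROCEDURE dbo.SelectingPythonResults
(
    @MaritalStatusIN CHAR(1) -- S/M Single/Married
)
AS
BEGIN
EXECUTE sp_execute_external_script
     @language = N'Python'
    ,@script = N'
import pandas as pd
df = InputDataSet
df_marital = df[df["MaritalStatus"] == MaritalStatus]
correlation = df_marital.corr(method="pearson")
OutputDataSet = pd.DataFrame(correlation, columns=["nof","age"])'
    ,@input_data_1 = N'SELECT COUNT(*) AS nof, MaritalStatus, age
                       FROM AdventureWorksDW2014.dbo.vTargetMail
                       WHERE age < 100
                       GROUP BY MaritalStatus, age'
    ,@params = N'@MaritalStatus CHAR(1)'
    ,@MaritalStatus = @MaritalStatusIN
WITH RESULT SETS
((
     CountObs FLOAT
    ,Age FLOAT
));
END;
GO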

EXEC SelectingPythonResults
      @MaritalStatusIN = 'S'

The results are presented through SSRS, with the option of selecting the parameter values.

[Screenshot: 01. Selecting Python Results – SQL Server 2017 Reporting Services]

Predicting and scoring with Python in SSRS

In addition to selecting filtered parameters in SSRS, we can also check how our prediction model is working.

For this purpose, we will create a slightly different subset and try to predict the number of cars owned by the customers in the dataset, based on their age and daily commute distance. The query for the dataset is the following:

--- Query for scoring the model
SELECT 
   age 
  ,NumberCarsOwned
  ,CAST(REPLACE(LEFT(CommuteDistance,2),'-','') AS TINYINT) as CommuteDistance
FROM AdventureWorksDW2014.dbo.vTargetMail
WHERE
    age < 100

[Screenshot: query results in SSMS]

We will run a simple linear regression with NumberCarsOwned as the dependent (y) variable and age and CommuteDistance as the X variables. We will store a couple of different versions of the prediction model in the database, trained on different train/test splits of the original dataset; the split parameter (0.2, 0.3, 0.4 and 0.5) is passed to scikit-learn's train_test_split as test_size. All four models will be serialized and stored in a table.

Let’s create a table to store serialized models:

-- creating table for storing models
CREATE TABLE PredictingWithPy_models
(model VARBINARY(MAX)
,modelName VARCHAR(100)
,trainSize FLOAT
)

Now we will use the scikit-learn Python module to train the model on the different splits of the original dataset.

CREATE OR ALTER PROCEDURE [dbo].[RunningPredictionsPy] 
(
  @size FLOAT --- test_size for train_test_split: 0.2, 0.3, 0.4 or 0.5
 ,@name VARCHAR(100) 
 ,@trained_model varbinary(max) OUTPUT
)
AS
BEGIN
EXEC sp_execute_external_script
  @language = N'Python'
 ,@script = N'
import numpy as np
import pandas as pd
import pickle
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

##Create SciKit-Learn linear regression model
X = df2[["age", "CommuteDistance"]]
y = np.ravel(df2[["nofCars"]])
name = name
##Create training (and testing) variables based on test_size
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=size)

## CreateLinear Model
SKL_lr = LinearRegression()
linRegObj = SKL_lr.fit(X_train, y_train)

##Serialize model
trained_model = pickle.dumps(linRegObj)'

 ,@input_data_1 = N'SELECT age
        ,NumberCarsOwned AS nofCars
        ,CAST(REPLACE(LEFT(CommuteDistance,2),''-'','''') AS TINYINT) AS CommuteDistance
     FROM AdventureWorksDW2014.dbo.vTargetMail
     WHERE age < 100' -- columns aliased to match the Python script
 ,@input_data_1_name = N'df2'
 ,@params = N'@trained_model varbinary(max) OUTPUT, @size FLOAT, @name VARCHAR(100)'
 ,@trained_model = @trained_model OUTPUT
 ,@size = @size
 ,@name = @name
END;
GO

And we execute this procedure to store the trained models in the previously created table:

DECLARE @model VARBINARY(MAX);
EXEC [RunningPredictionsPy] 
    @size = 0.2  
   ,@name = 'Ln_20'
   ,@trained_model = @model OUTPUT
INSERT INTO PredictingWithPy_models (model, modelName, trainSize) 
VALUES(@model, 'Ln_20', 0.2);
GO

We repeat this for 30, 40 and 50 percent, following the same pattern (sketched below; the complete script is available on GitHub). At the end, the table with stored models should hold four rows with serialized models.
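For example, the next call:

DECLARE @model VARBINARY(MAX);
EXEC [RunningPredictionsPy] 
    @size = 0.3  
   ,@name = 'Ln_30'
   ,@trained_model = @model OUTPUT
INSERT INTO PredictingWithPy_models (model, modelName, trainSize) 
VALUES(@model, 'Ln_30', 0.3);
GO
-- and likewise with @size = 0.4 / 'Ln_40' and @size = 0.5 / 'Ln_50'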

[Screenshot: table with four serialized models in SSMS]

Now that we have trained the models, we want to predict the results, using SSRS.

For this purpose, we will need another stored procedure that takes the X variables (age and CommuteDistance) as input parameters, along with the selected model, and returns the prediction (the Y variable – the number of cars owned).

The following procedure will be used in SSRS:

-- CREATE Procedure to predict the number of cars owned
CREATE OR ALTER PROCEDURE [dbo].[RunningPredictionWithValesPy] 
(
   @model varchar(100)
  ,@age INT
  ,@commuteDistance INT
)
AS
BEGIN

DECLARE @modelIN VARBINARY(MAX) = 
    (SELECT model FROM PredictingWithPy_models WHERE modelName = @model)
-- Create a T-SQL Query 
DECLARE @q NVARCHAR(MAX) = N'SELECT '+CAST(@age AS VARCHAR(5))+' AS age, 
         '+ CAST(@commuteDistance AS VARCHAR(5))+' AS CommuteDistance'

-- Store the new values for prediction in temp table
DROP TABLE IF EXISTS #t
CREATE TABLE #t (age INT, CommuteDistance INT)
INSERT INTO #t (age, CommuteDistance)
EXEC sp_executesql @q

EXEC sp_execute_external_script
@language = N'Python'
,@script = N'
import pickle
import numpy as np
import pandas as pd
from sklearn import metrics

##Deserialize model
mod = pickle.loads(modelIN)
X = InputDataSet[["age", "CommuteDistance"]]

##Predict; works for a single row or for many rows at once (bulk prediction)
predArray = mod.predict(X)

OutputDataSet = pd.DataFrame(data = predArray, columns = ["predictions"])' 
  ,@input_data_1 = N'SELECT * FROM #t'
  ,@input_data_1_name = N'InputDataSet'
  ,@params = N'@modelIN varbinary(max)'
  ,@modelIN = @modelIN
WITH RESULT SETS 
((
prediction_Score FLOAT
));
END
GO

To check that the procedure works, the following T-SQL query runs a prediction against new data:

EXEC [RunningPredictionWithValesPy]
   @model = 'Ln_30'
  ,@age = 44
  ,@commuteDistance = 1

We can integrate the procedure into Reporting Services (SSRS) and have the combination of input values displayed together with the selected model (test set size):

[Screenshot: 02. Running Predictions – SQL Server 2017 Reporting Services]

with the prediction results shown in SSRS or in SSMS; both return the same results:

[Screenshot: prediction results in SSMS]

 

All reports featured here will also be available at GitHub.

Visualizing results with Python in SSRS

As we have created four different models, we would also like to have the accuracy of the models visually represented in SSRS.

Showing plots created with Python is not as straightforward as with the R language.

The following procedure extracts the data from the database and generates a plot that can be visualized in SSRS.

CREATE OR ALTER PROCEDURE [dbo].[VisualizeWithPyR2] 
(
     @inputVariable VARCHAR(100)
)
AS
BEGIN

DECLARE @q NVARCHAR(MAX) = N'SELECT '+CAST(@inputVariable AS VARCHAR(50))+' AS val1 FROM vTargetMail'
-- Store the values in temp table
DROP TABLE IF EXISTS #t
CREATE TABLE #t (val1 FLOAT)
INSERT INTO #t (val1)
EXEC sp_executesql @q

EXEC sp_execute_external_script
  @language = N'Python'
  ,@script = N'
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use(''Agg'')   # non-interactive backend for rendering on the server
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 5))
plt.plot(plotdataset)
fig.savefig(''C:\\PyGraphs\\firstGraph.png'')
OutputDataSet = pd.DataFrame(data =[1], columns = ["plot"])' 
,@input_data_1 = N'SELECT * FROM #t'
,@input_data_1_name = N'plotdataset'
WITH RESULT SETS 
((
    plot INT
));
END
GO

 

So far, I have found this to be the simplest way – saving the plot on the same server where SSRS is running and exposing the disk path to SSRS. Python stores the graph to disk with the following command:

fig = plt.figure(figsize=(12, 5))
plt.plot(plotdataset)
fig.savefig('C:\\PyGraphs\\firstGraph.png')

Resulting in SSRS as:

[Screenshot: 03. Visualizing with Python – SQL Server 2017 Reporting Services]

In Report Builder, one additional setting worth mentioning is the image properties, where the path to the saved plot must match the one used in the Python code. The savefig location is referenced as the external source for the image, as: file://C://PyGraphs//firstGraph.png.

[Screenshot: Image Properties dialog in Report Builder]

 

All four linear regression models can be evaluated against the test sample using R² and the accompanying Python code:

import pandas as pd
import pyodbc
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# X and y as prepared earlier: age and CommuteDistance as predictors, nofCars as target
def R2EvalMod(test_sizeIN):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_sizeIN)
    SKL_lr = LinearRegression()
    linRegObj = SKL_lr.fit(X_train, y_train)
    predArray = linRegObj.predict(X_test)
    R2 = r2_score(y_test, predArray)
    return R2

sql_conn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER=TOMAZK\\MSSQLSERVER2017;DATABASE=SQLPY;Trusted_Connection=yes') 
query3 = '''
SELECT trainSize FROM [dbo].[PredictingWithPy_models]
'''
df3 = pd.read_sql(query3, sql_conn)

R2List = []
for i in range(len(df3)):
    z = df3.at[i, 'trainSize']
    R2 = R2EvalMod(z)
    R2List.append((z, R2))

labels = ['trainsize', 'R2']
df33 = pd.DataFrame.from_records(R2List, columns=labels)

Since a higher R² means a better model (R² = 1 − SSres/SStot; it ranges up to 1 and can even be negative for models that fit worse than the mean), the results show that R² is best when using 30% of the dataset for testing and 70% for training.

[Screenshot: R² values by train size in Excel]

 

As always, code is available at GitHub.

Happy Pythoning.


SQL Saturday Slovenia 2018

SQL Saturday Slovenia 2018 is coming back to town. And it will be, as always, awesome. I prefer not to use superlatives, but the schedule speaks for itself.

[Screenshot: session schedule]

The event will take place on Saturday, 8th December 2018, at the Faculty of Computer and Information Science of the University of Ljubljana, Večna pot 113, Ljubljana.

Recap of statistics so far:

  • 133 submitted sessions
  • 55 speakers from Europe and USA
  • Over 30 sessions submitted by speakers originating from Germany, followed by UK speakers with a count of just over 20
  • More than 50% of the selected speakers are Microsoft MVPs

Two pre-conference workshops will also be held on Friday, 7th of December.

You are kindly invited to join the pre-conference day with one of the workshops; there are still some places available. And also to take part in the event on Saturday.

Ljubljana in December will be, as always, festive and beautiful. Take your time, register for the event, and after the event enjoy the fairs, events, new-year's feeling and lights, mulled wine and much more.

In addition, a special thanks to our sponsors, supporting the event:

 

See you on Friday, 7th and on Saturday, 8th of December, 2018!


Running R scripts within in-database SQL Server Machine Learning

Having all the R functions, all libraries and any kind of definitions (URL, links, working directories, environments, memory, etc) in one file is nothing new, but sometimes a lifesaver.

The R function source is the way to achieve this. Storing all definitions, functions and classes in one place can help enterprises achieve faster installation and safer environment usage.

So the idea is simple. Stack all the needed functions, classes, libraries and configurations you want to use in a specific environment and save them in an R file. Create a file with all the needed setups of libraries and functions, as seen below. We will call this file util.R.

[Screenshot: util.R in RStudio]

And the rest is just to refer to this file in any R environment or R script with a simple call to source.

setwd("C:\\MyFiles\\R_setup")
source("util.R")

Any new R script will have now all the functions and libraries included by default. Testing it with the simple function call is straightforward:

x <- c(1,2,3,4,5)
y <- c(2,3,4,5,6)

# mysummary() and sss() are functions defined in util.R
mysummary(x)
sss(x,y)

and the result is, as expected

[Screenshot: function results in RStudio]

In my previous blog posts, I have mentioned calling R scripts from SQL Server tables, external files and elsewhere. Using the R source function is another way to pull predetermined and preinstalled functions into the SQL Server in-database Machine Learning Services (or Machine Learning Server), which makes setting up identical environments on client machines in an enterprise much easier and faster. With just one update of the file, all client machines read the same definitions.

Using the same util.R file, I placed it in a location where the SQL Server R workers have been granted access. Using sp_execute_external_script and the external R file, this can be simplified:

USE AdventureWorks;
/* SELECT n FROM (VALUES(1),(2),(3),(4)) AS tbl(n) */
exec sp_execute_external_script
  @language = N'R'
 ,@script = N'
#have access to this folder for SQL SERVER /R workers  
   source("C:\\MyFiles\\R_setup\\util.R") 
    x <- InputDataSet
    x <- as.numeric(x)
   out <- addOne(x)
  OutputDataSet <- data.frame(as.numeric(out))'
,@input_data_1 = N'SELECT n FROM (VALUES(1)) AS tbl(n)'
WITH RESULT SETS
((
FunctionResult VARCHAR(100)
))

The result is the same as if run from any other R environment, but this time I am running it from SQL Server.

[Screenshot: results in SSMS]

There are a couple of thoughts to point out, though:

1) have the util.R file, or any external R file, validated (!)

2) all the libraries that the util.R file refers to must be installed. If not, the external file will not load. This can simply be achieved by using:

if (!is.element('AGD', installed.packages()[,1])) {
    install.packages('AGD')
}
library(AGD)

instead of just referencing the package with library(AGD).

3) there should be more IF statements regarding the R core engine, checking the availability and compatibility of new packages.

4) not all packages can be used with Microsoft R engine and the current R version.

 

Enjoy R and SQL Server and happy coding.


First release and update dates of R Packages statistics

R has been around a long time and the packages have evolved through the years as well, from the initial releases and updates to completely new packages. Like many open-source and community-driven languages, R is no exception. And getting the first release dates of R packages requires a little bit of web scraping and is lots of fun.

CRAN – the Comprehensive R Archive Network – has invested a lot of people, rules and hours of work to make the packages available to the general public in a tidy, ready-to-use and easy-to-use fashion.

 

Last R Package updates

First, let's check the last package update dates. By loading rvest and getting the data from the CRAN website https://cran.r-project.org/web/packages/available_packages_by_date.html we are able to turn the HTML table into a usable data.frame in R.

library(rvest)
library(ggplot2)

url = 'https://cran.r-project.org/web/packages/available_packages_by_date.html'

CRANpage <- read_html(url)
tbls <- html_nodes(CRANpage, "table") # since HTML is in table; no need to scrape td/tr elements
table1 <- html_table(tbls[1], fill = TRUE)
dd <- data.frame(table1[1])

#house cleaning
dd$Date <- as.Date(dd$Date)

### simple graph
ggplot(dd, aes(x=Date)) +
geom_dotplot(binwidth =12) + 
labs(x = "Dates", y = "Number of packages updates by Year of last update") +
scale_x_date(date_breaks= "2 years", date_labels = "%Y/%m", limits = as.Date(c("2005-01-01", "2018-10-10")))

Based on this graph, we can see that many of the R packages have been updated in the past year or two.

[Plot: package updates by date]

So, how many? We run the following statement:

library(dplyr)
library(lubridate)

# updates by year
dd_y <- dd %>%
mutate( PYear= year(Date)) %>%
select (PYear) %>%
group_by(PYear) %>%
summarise(
  nof = n()
)

with the results:

# A tibble: 14 x 2
   PYear   nof
   <dbl> <int>
 1  2005     1
 2  2006     4
 3  2007     1
 4  2008    10
 5  2009    25
 6  2010    32
 7  2011    65
 8  2012   464
 9  2013   575
10  2014   764
11  2015  1158
12  2016  1772
13  2017  2683
14  2018  5583

So out of 13137 packages (on October 7th, 2018), 5583 were updated in 2018 and an additional 2683 in 2017.

By running simple statistics:

dd_y %>% 
mutate(cumsum = cumsum(nof)
,percY = nof/cumsum(nof)
,percC = cumsum(nof)/sum(nof))

 

we can see how active many of the packages have been in terms of updates.

# A tibble: 14 x 5
   PYear   nof cumsum percY     percC
   <dbl> <int>  <int> <dbl>     <dbl>
 1  2005     1      1 1.00  0.0000761
 2  2006     4      5 0.800 0.000381 
 3  2007     1      6 0.167 0.000457 
 4  2008    10     16 0.625 0.00122  
 5  2009    25     41 0.610 0.00312  
 6  2010    32     73 0.438 0.00556  
 7  2011    65    138 0.471 0.0105   
 8  2012   464    602 0.771 0.0458   
 9  2013   575   1177 0.489 0.0896   
10  2014   764   1941 0.394 0.148    
11  2015  1158   3099 0.374 0.236    
12  2016  1772   4871 0.364 0.371    
13  2017  2683   7554 0.355 0.575    
14  2018  5583  13137 0.425 1.00 

So the majority (about 2/3) of the packages have been actively updated in the last two years (in order to keep up with the latest R engine updates). A simple correlation also supports this:

#simple correlation
cor(dd_y)[1,2]

with the value of 0.77.

Funny question: is there any correlation between package updates and the month?

And the answer is: No 🙂

dd_ym <- dd %>%
mutate( PYear= year(Date)
,month_name = month(Date, label = FALSE)) %>%
select (PYear,month_name) %>%
group_by(PYear,month_name) %>%
summarise(
nof = n()
)
cor(dd_ym)[1,2]

with a correlation coefficient of -0.06. So month does not carry any particular importance. But since the year 2018 is not over yet, the comparison might be slightly unfair. To further check the distribution of R package updates over months, I have excluded the year 2018 and everything before 2011:

#check distribution over months
dd_ym2010 <- dd_ym %>%
  filter(PYear > 2010 & PYear < 2018)

boxplot(dd_ym2010$nof~dd_ym2010$month_name, 
main="R Packages update over months", xlab = "Month", 
ylab="Number of Packages")

and we can see the boxplot:

[Plot: boxplot of R package updates over months]

So more updates come in the autumn. But the resulting correlation:

cor(dd_ym2010)[2,3]

is still just 0.155, making it hard to draw any concrete conclusions. Adding the year 2018 would skew the picture and add several outliers, since 2018 is still a running year (as of writing this blog post).

 

Initial dates of R Package Release

To get the complete picture – not just the last updates, but the first or initial release dates of all the packages – some further digging was involved. Again, the dates and number of updates were scraped from the CRAN archive web pages in order to prepare these statistics.

A loop over all the package archives results in the final data frame.

###########################
### Get initial Dates #####
###########################

rm(list = Filter(exists, c("packageNames")))
packageNames <- dd$Package

# rm(df_first)
#create a dataframe to keep the data types in order
df_first <- data.frame(name=c("TK_NA")
              ,firstRelease=c(as.Date("1900-12-31"))
              ,nofUpdates=c(0))

for (i in 1:length(packageNames)){
     url1 <- 'https://cran.r-project.org/src/contrib/Archive/'
     name1 <- packageNames[i]
     url2 <- paste0(url1,name1,'/')

ifErrorPass <- tryCatch(read_html(url2), error=function(e) e) 
if(inherits(ifErrorPass, "error")) next # if package does not have archive!!!

   cp <- read_html(url2)
   t2 <- html_nodes(cp, "table") 
   t2 <- html_table(t2[1], fill = TRUE)
   rm(list = Filter(exists, c("dd2")))
   dd2 <- data.frame(t2[1])
   dat <- dd2$Last.modified
   dat <- as.Date(dat, format = '%Y-%m-%d')
   firstRelease <- dat[order(format(as.Date(dat),"%Y%m%d"))[1]]
   numberOfUpdates <- length(dat) 
   df_first <- rbind(df_first,data.frame(name=name1,firstRelease=as.Date(firstRelease, format='%Y-%m-%d'),nofUpdates=numberOfUpdates))
}

# clean my initial row when creating data.frame
myData = df_first[df_first$firstRelease > '1900-12-31',]

After leaving this part running for roughly 10 minutes, the code successfully scraped all the archives of the CRAN web repository. But not all packages have an archive folder yet, which should mean that there are no updates yet for these packages (correct me if I am wrong, thanks). So some additional data wrangling was needed:

# add missing packages that did not fall into archive folder on CRAN

myDataNonArchive <- dd$Package[!dd$Package %in% myData$name]
myDataNonArchive2 <- cbind(dd[dd$Package %in% myDataNonArchive,c(2,1)],1)

names(myData) <- c("Name","firstRelease","nofUpdates")
names(myDataNonArchive2) <- c("Name","firstRelease","nofUpdates")

finalArchive <- data.frame(rbind(myData, myDataNonArchive2))

And the final graph of the initial release year of packages can be plotted:

hist(year(finalArchive$firstRelease),
main = paste("Histogram of First year of R Package Release")
,xlab="Year",ylab="Number of Packages"
,col="lightblue", border="Black"
,xlim = c(1995, 2020), las=1, ylim=c(0,3000))

And the graph:

[Plot: histogram of the first year of R package release]

With the following numbers (focusing only on past years):

[Screenshot: release-year counts in Excel]

We can conclude that in 2018 we might not see as positive a trend in new package development as in the past years (this is my personal view and conclusion). Another indicator is that the number of updates in 2018 – the year of a major R upgrade – for packages released in 2018 is declining in comparison with previous years. I guess the years 2016 and 2017 were the “data science years” and golden years for R.

 

As always, complete code is available at Github.

 

 

Happy R-coding!


T-SQL job title generator

While writing a sample random function in T-SQL, I thought: why not write a job title generator for the T-SQL domain only? You might have seen the so-called bulls**t job title generators and similar, but this one is SQL Server specific.

So, why not come up with random yet funny T-SQL job titles? Making it, I have to tell you, was fun. I was simply hitting that F5 button in SSMS to get a new job title generated and laughing out loud.

Following this laughter, I have created the following website:

http://tsqljobtitlegenerator.azurewebsites.net/index.php

which encapsulates the T-SQL code explained in this blog post. You can either copy/paste the T-SQL code and run it in your SSMS, or visit the T-SQL job title generator website.

The code

To stage some data, I have created a set of random words consisting of three parts: the first is IT slang/jargon, the second is a list of interesting SQL Server technologies, functions, sets and other words, and the last one holds the more sophisticated, new-age words. So here is the data set:

DROP TABLE IF EXISTS roles;
GO

CREATE TABLE roles
(id int identity(1,1) NOT NULL
,title VARCHAR(100)
);


INSERT into roles(title)

SELECT 'Analyst'
UNION ALL SELECT 'Project Manager'
UNION ALL SELECT 'Expert'
UNION ALL SELECT 'Manager'
UNION ALL SELECT 'Person'
UNION ALL SELECT 'Artist'
UNION ALL SELECT 'Tamer'
UNION ALL SELECT 'Developer'
UNION ALL SELECT 'Administrator'
UNION ALL SELECT 'Technologist'
UNION ALL SELECT 'Uploader'
UNION ALL SELECT 'Downloader'
UNION ALL SELECT 'Sherpa'
UNION ALL SELECT 'Philosopher'
UNION ALL SELECT 'Designer'
UNION ALL SELECT 'Legend'
UNION ALL SELECT 'Evangelist'
UNION ALL SELECT 'Hero'
UNION ALL SELECT 'Guru'
UNION ALL SELECT 'Director'
UNION ALL SELECT 'Slayer'
UNION ALL SELECT 'Composer'
UNION ALL SELECT 'Reader'
UNION ALL SELECT 'Outliner'
UNION ALL SELECT 'Proof-reader'
UNION ALL SELECT 'Assistant'
UNION ALL SELECT 'Operator'
UNION ALL SELECT 'Coffee Maker'
UNION ALL SELECT 'Pizza re-heater'
UNION ALL SELECT 'Banana Cutter'
UNION ALL SELECT 'Tester'
UNION ALL SELECT 'Deep tester'
UNION ALL SELECT 'Backward tester'
UNION ALL SELECT 'Office hater'
UNION ALL SELECT 'Hater'
UNION ALL SELECT 'Warrior'
UNION ALL SELECT 'Junkie'
UNION ALL SELECT 'Wizard'
UNION ALL SELECT 'Leader'
UNION ALL SELECT 'King'
UNION ALL SELECT 'Approver'
UNION ALL SELECT 'Engineer'
UNION ALL SELECT 'Architect'
UNION ALL SELECT 'Rockstar'
UNION ALL SELECT 'Ninja'
UNION ALL SELECT 'Python Coder'
UNION ALL SELECT 'R and Python Hater'
UNION ALL SELECT 'C# Lover'
UNION ALL SELECT 'Java evangelist'
UNION ALL SELECT 'Ninja'
UNION ALL SELECT 'Captain'
UNION ALL SELECT 'Strategist'
UNION ALL SELECT 'Consultant'
UNION ALL SELECT 'Organizer'
UNION ALL SELECT 'Coffee spiller'
UNION ALL SELECT 'Endorser'
UNION ALL SELECT 'Cow'
UNION ALL SELECT 'Dog'
UNION ALL SELECT 'Cheever'
UNION ALL SELECT 'Lazy'
UNION ALL SELECT 'Fanboy'
UNION ALL SELECT 'Copy/Paster'
UNION ALL SELECT 'Researcher'
UNION ALL SELECT 'Cloner sheep'
UNION ALL SELECT 'Copy cat'
UNION ALL SELECT 'Shadower'
UNION ALL SELECT 'Guerilla'
UNION ALL SELECT 'Bullshiter'
UNION ALL SELECT 'Updater'
UNION ALL SELECT 'F5key presser'
UNION ALL SELECT 'Helper'
UNION ALL SELECT 'Knows everything'
UNION ALL SELECT 'Coffee Addict'
UNION ALL SELECT 'ASAP Doer'
UNION ALL SELECT 'Complicator'
UNION ALL SELECT 'Helpdesk dispatcher'
UNION ALL SELECT 'His Awesomeness'
UNION ALL SELECT 'Hers Awesomeness'
UNION ALL SELECT 'Advanced Copy/paster'
UNION ALL SELECT 'Stackover subscriber'
UNION ALL SELECT 'Over-engineering'


DROP TABLE IF EXISTS sqlstuff;

CREATE table sqlstuff
(ID INT IDENTITY(1,1) NOT NULL
,title VARCHAR(100)
)

INSERT INTO sqlstuff (title)
SELECT 'Cardinality Estimator'
UNION ALL SELECT 'Stored Procedure'
UNION ALL SELECT 'Data Masking'
UNION ALL SELECT 'High Availability'
UNION ALL SELECT 'Database Durability'
UNION ALL SELECT 'Memory Optimized table'
UNION ALL SELECT 'User Defined Function'
UNION ALL SELECT 'Stale Statistics'
UNION ALL SELECT 'Azure'
UNION ALL SELECT 'Power BI'
UNION ALL SELECT 'Machine Learning service'
UNION ALL SELECT 'Reporting Service'
UNION ALL SELECT 'Notification Service'
UNION ALL SELECT 'Analysis Service'
UNION ALL SELECT 'Clustered Index'
UNION ALL SELECT 'Database Snapshot'
UNION ALL SELECT 'Query Store'
UNION ALL SELECT 'DBCC Check'
UNION ALL SELECT 'B-Tree'
UNION ALL SELECT 'Query Optimizer'
UNION ALL SELECT 'Linked Server'
UNION ALL SELECT 'Trigger'
UNION ALL SELECT 'Replication'
UNION ALL SELECT 'Resource Governor'
UNION ALL SELECT 'Maintenance Plan'
UNION ALL SELECT 'Server Log'
UNION ALL SELECT 'SQL Server Agent'
UNION ALL SELECT 'Extended Event'
UNION ALL SELECT 'Profiler'
UNION ALL SELECT 'Server Role'
UNION ALL SELECT 'Auditing'
UNION ALL SELECT 'Credentials'
UNION ALL SELECT 'Database Backup'
UNION ALL SELECT 'Extended Properties'
UNION ALL SELECT 'Log Shipping'
UNION ALL SELECT 'Database Mirroring'
UNION ALL SELECT 'Availability Group'
UNION ALL SELECT 'PowerShell'
UNION ALL SELECT 'Parameter Sniffing'
UNION ALL SELECT 'ANSI Default'
UNION ALL SELECT 'Service Broker'
UNION ALL SELECT 'Compatibility Level'
UNION ALL SELECT 'Containment Type'
UNION ALL SELECT 'Recovery Model'
UNION ALL SELECT 'Collation'
UNION ALL SELECT 'Primary Filegroup'
UNION ALL SELECT 'Database Log Backup'
UNION ALL SELECT 'Bulk Insert'
UNION ALL SELECT 'Left Join'
UNION ALL SELECT 'U-SQL'
UNION ALL SELECT 'Azure SQL Server'
UNION ALL SELECT 'MicroContainer'
UNION ALL SELECT 'Pandas Data-frame'
UNION ALL SELECT 'Numpy Array'
UNION ALL SELECT 'Parametrization'
UNION ALL SELECT 'Slow Query'
UNION ALL SELECT 'Long running query'
UNION ALL SELECT 'Nested Query'
UNION ALL SELECT 'R ggplot library'
UNION ALL SELECT 'SARGable Query'
UNION ALL SELECT 'WHERE clause'
UNION ALL SELECT 'WHILE loop'
UNION ALL SELECT 'DELETE statement'
UNION ALL SELECT 'CI/CD'
UNION ALL SELECT 'SQL Server 6.0'
UNION ALL SELECT 'Execution Plan'
UNION ALL SELECT 'String Aggregation'
UNION ALL SELECT 'Dynamic View Management'
UNION ALL SELECT 'User Defined Table'
UNION ALL SELECT 'Fortran OLEDB'
UNION ALL SELECT 'SQL Server 2017'
UNION ALL SELECT 'Cumulative Updates'
UNION ALL SELECT 'Monitoring resources'
UNION ALL SELECT 'Activity Monitor'


DROP TABLE IF EXISTS Fancystuff

CREATE table Fancystuff
(ID INT IDENTITY(1,1) NOT NULL
,title VARCHAR(100)
)


INSERT INTO Fancystuff

SELECT 'Regional'
UNION ALL SELECT 'Group'
UNION ALL SELECT 'Only the best'
UNION ALL SELECT 'Insane'
UNION ALL SELECT 'Qualitative'
UNION ALL SELECT 'Virtuous'
UNION ALL SELECT 'Senior'
UNION ALL SELECT 'Junior'
UNION ALL SELECT 'In-House'
UNION ALL SELECT 'Outsourced'
UNION ALL SELECT 'Magnificent'
UNION ALL SELECT 'Evolutionary'
UNION ALL SELECT 'Customer'
UNION ALL SELECT 'Product'
UNION ALL SELECT 'Forward'
UNION ALL SELECT 'Future'
UNION ALL SELECT 'Dynamic'
UNION ALL SELECT 'Corporate'
UNION ALL SELECT 'Legacy'
UNION ALL SELECT 'Investor'
UNION ALL SELECT 'Direct'
UNION ALL SELECT 'International'
UNION ALL SELECT 'Over-seas'
UNION ALL SELECT 'Internal'
UNION ALL SELECT 'Human'
UNION ALL SELECT 'Creative'
UNION ALL SELECT 'Volunteer'
UNION ALL SELECT 'Lead'
UNION ALL SELECT '4 Stages of'
UNION ALL SELECT 'Complete'
UNION ALL SELECT 'Most Advanced'
UNION ALL SELECT 'State of the art'
UNION ALL SELECT 'Super high'
UNION ALL SELECT 'First Class'
UNION ALL SELECT 'Powerful'
UNION ALL SELECT 'Data'
UNION ALL SELECT 'Head of'
UNION ALL SELECT 'Master of'
UNION ALL SELECT 'Chief of'
UNION ALL SELECT 'Officer'
UNION ALL SELECT 'Lead'
UNION ALL SELECT 'Specialist'

Once I have this, I can have endless possibilities to merge all the datasets together.

With the following CTE, I will generate the random title:

----------------------------------
----- Generating T-SQL job title
-----------------------------------

;WITH Fancy AS
(
    SELECT TOP 1 title
    FROM Fancystuff
    ORDER BY NEWID()
),
SQLy AS
(
    SELECT TOP 1 title
    FROM sqlstuff
    ORDER BY NEWID()
),
Roley AS
(
    SELECT TOP 1 title
    FROM roles
    ORDER BY NEWID()
)
SELECT
    CONCAT(f.title, ' ', s.title, ' ', r.title) AS TSQLJobGenerator
FROM Fancy AS f
CROSS JOIN SQLy AS s
CROSS JOIN Roley AS r

 

I can't help laughing, as every time I hit that F5 button, I get a funny combination.

A couple of my random titles – some go beyond my imagination:

[Screenshots: three examples of generated job titles in SSMS]

What is your favorite T-SQL job title that you get with this title generator?

 

And this one is taken from the website:

[Screenshot: T-SQL Job Title Generator website]

 

As always, code is also available on GitHub.

Enjoy T-SQLing!


T-SQL date and time manipulation

Manipulating date and time in T-SQL is a daily and very common task that every DBA, SQL developer, BI developer and data scientist will come across. Over the years, I have accumulated so many simple date and/or time manipulation combinations of different functions that it is time to put them together.

Don't expect to find here anything you haven't used or seen – especially if you are a long-time T-SQL developer. The point is to have a post with a lot of examples of date and time manipulation in one place. By no means is this a definitive list, but it should be quite substantial, and the code in the GitHub repository will be updated.

The list will be updated on my GitHub, so this blog post might not include everything. In all of the following examples I will be using the GETDATE() function to get the current datetime, unless the example uses stored dates. Therefore, some of the examples or screen-prints will differ from yours.


I will be primarily using following T-SQL functions:

  • dateadd
  • datediff
  • year
  • month
  • day
  • format
  • cast
  • convert
  • parsename
  • and some string manipulation functions like: substring, replicate, left, right.

Starting with simple date

The most common functions are surely those extracting the year, month and day from your date.

SELECT
   GETDATE() AS RightNow
 ,YEAR(GETDATE()) AS Year_RightNow
 ,MONTH(GETDATE()) AS Month_RightNow
 ,DAY(GETDATE()) AS Day_RightNow

[Screenshot: query results]

For deciding whether the year is a leap or non-leap year, the following examples will help you get along:

-- Is Leap Year (we check for presence/existence of 29.Feb)
SELECT
    ISDATE('2019/02/29') AS CheckLeapYear -- if 1, date exists and this year is a leap year; if 0 date does not exists and is not leap year
   ,CASE WHEN (YEAR(GETDATE()) % 4 = 0 AND YEAR(GETDATE()) % 100 <> 0) 
  OR YEAR(GETDATE()) % 400 = 0 THEN 'Leap Year' ELSE 'Non Leap Year' END AS CheckLeapYear

with the following results:

[Screenshot: query results]

The first example simply checks whether the 29th of February exists for the given year. But for a general SQL statement, the second variant with modulo arithmetic over the year is much easier to execute; for example, 2000 (divisible by 400) is a leap year, while 1900 (divisible by 100 but not by 400) is not.

The following block of examples will give you the desired dates for a given time period, mainly using the functions DATENAME, DATEPART, DATEDIFF, CAST, CONVERT, FORMAT and DAY, MONTH, WEEK, YEAR, EOMONTH.

Simply run and examine the examples to get the hang of them.

-- Name of Days, Months
SELECT 
DATENAME(WEEKDAY, GETDATE()) AS [DayName]
,DATENAME(MONTH, GETDATE()) AS [MonthName]
,DATEPART(WEEK, GETDATE()) AS [WeekNumber]
,DATEPART(ISO_WEEK, GETDATE()) AS [ISO_WeekNumber]

-- Using Date format or FORMAT or CAST / CONVERT
SELECT
CAST(GETDATE() AS DATE) AS Date_RightNow
,FORMAT(GETDATE(), 'dd/MM/yyyy') AS DATE_dd_MM_yyyy
,FORMAT(GETDATE(), 'yyyy-MM-dd') AS DATE_yyyy_MM_dd
,FORMAT(GETDATE(), 'MM-dd') AS DATE_MM_dd
,FORMAT(GETDATE(), 'dd/MM/yyyy', 'en-US' ) AS DATE_US 
,FORMAT(GETDATE(), 'dd/MM/yyyy', 'sl-SI' ) AS DATE_SLO

-- Days
SELECT
DATEADD(DAY,DATEDIFF(DAY,1,GETDATE()),0) AS Yesterday
,DATEADD(DAY,DATEDIFF(DAY,0,GETDATE()),0) AS Today
,DATEADD(DAY,DATEDIFF(DAY,0,GETDATE()),1) AS Tomorrow

-- Weeks
SELECT
DATEADD(WEEK,DATEDIFF(WEEK,7,GETDATE()),0) AS LastWeek_startOf
,DATEADD(WEEK,DATEDIFF(WEEK,0,GETDATE()),0) AS ThisWeek_startOf
,DATEADD(WEEK,DATEDIFF(WEEK,0,GETDATE()),7) AS NextWeek_startOf

-- Months (works for all months; with 30 or 31 days, or with February)
SELECT
DATEADD(MONTH,DATEDIFF(MONTH,31,GETDATE()),0) AS LastMonth_startOf
,DATEADD(MONTH,DATEDIFF(MONTH,0,GETDATE()),0) AS ThisMonth_startOf
,DATEADD(MONTH,DATEDIFF(MONTH,-1,GETDATE()),0) AS NextMonth_startOf

SELECT 
EOMONTH(GETDATE()) AS CurrentMonthEnd
,EOMONTH(GETDATE(), -1) AS PreviousMonthEnd
,EOMONTH(GETDATE(), 1) AS NextMonthEnd

-- Years (works with leap years)
SELECT
DATEADD(year, DATEDIFF(year, 365, (GETDATE())), 0) AS LastYear_startOf
,DATEADD(year, DATEDIFF(year, 0, (GETDATE())), 0) AS ThisYear_startOf
,DATEADD(year, DATEDIFF(year, -1, (GETDATE())), 0) AS NextYear_startOf

And the results are as following:

[Screenshot: query results]

In all of the examples I have added the extraction of the current, previous and next part of the date (day, week, month, year).

Getting dates in particular point-in-time

Sometimes one needs a particular time in the future or past – setting a particular point in time. Therefore, I have added a simple example for finding an exact time from now.

SELECT
DATEADD(MONTH, DATEDIFF(MONTH, -1, (GETDATE())), DAY(GETDATE()-1)) AS OneMonthFromNow
,CAST(DATEADD(MONTH, 1, CAST(GETDATE() AS DATE)) AS DATETIME) AS OneMonthFromNow


SELECT
CAST(DATEADD(YEAR, 1, CAST(GETDATE() AS DATE)) AS DATETIME) AS OneYearFromNow
,DATEADD(DAY, DATEDIFF(DAY, 0, (GETDATE())), 365) AS OneYearFromNow

Such point-in-time boundaries can also be used, for example, with the BETWEEN operator (a small sketch follows the results below).

The result of the query is:

[Screenshot: query results]
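A minimal sketch, filtering a hypothetical dbo.Orders table for the rows falling within the next month:

SELECT
    OrderID
   ,OrderDate
FROM dbo.Orders -- hypothetical table, for illustration only
WHERE OrderDate BETWEEN GETDATE()
                    AND CAST(DATEADD(MONTH, 1, CAST(GETDATE() AS DATE)) AS DATETIME)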

 

Differences in dates

For solving and calculating differences between dates, I have added a simple “countdown” with the remaining number of days until the end of the month or year.

-- Number of days until ...
SELECT
	 (DATEDIFF(DAY, GETDATE(), DATEADD(MONTH, DATEDIFF(MONTH, -1, (GETDATE())), 0)))-1 AS NumberOfDAysUntilEndOfMonth
	,(DATEDIFF(DAY, GETDATE(), DATEADD(YEAR, DATEDIFF(YEAR, -1, (GETDATE())), 0)))-1 AS NumberOfDAysUntilEndOfYear

The calculation can be adapted to hours, quarters or years.

[Screenshot: query results]

A more interesting and sometimes not that straightforward business case is the calculation of working days – finding the difference between two dates and counting only the working days.

The following example does this with simple query:

-- Number of business / working days between two dates
SET DATEFIRST 1;
DECLARE @dayFrom SMALLDATETIME = '20180901'
DECLARE @dayTO SMALLDATETIME = '20180905'

SELECT
(DATEDIFF(DAY, @dayFrom, @dayTO) + 1)-(DATEDIFF(WEEK, @dayFrom, @dayTO) * 2)-(CASE WHEN DATEPART(WEEKDAY, @dayFrom) = 7 THEN 1 ELSE 0 END)-(CASE WHEN DATEPART(WEEKDAY, @dayTO) = 6 THEN 1 ELSE 0 END) AS NumberOfWorkingDays

And since I have set the week to start on Monday (SET DATEFIRST 1), and 1st of September 2018 was a Saturday, one can easily verify this: 5 calendar days minus the weekend gives 3 working days (Monday the 3rd through Wednesday the 5th). But the date range can be extended, and then the problem is not that trivial anymore.

[Screenshot: query results]

Making this trivial problem a little bit more complicated, another typical business case is to calculate (sum/count) working hours between two dates. Here the trickery can go from simple to over-complicated; the following example uses a CTE (Common Table Expression) with recursion.

So the calculation is fairly simple:

-- Number of working hours between two dates
SET DATEFIRST 1;
DECLARE @dayFromDateTime SMALLDATETIME = '2018-09-01 12:33:11.245'
DECLARE @dayTODateTime SMALLDATETIME = '2018-09-05 09:33:32.256'
DECLARE @hourFrom INT = 8
DECLARE @hourTo INT = 16

;WITH cte
AS
(SELECT
DATEADD(MINUTE, -1, @dayFromDateTime) AS StartDate
,0 AS WorkDayFlag
,0 AS WorkHourFlag

UNION ALL

SELECT
DATEADD(MINUTE, 1, StartDate) AS StartDate
,CASE WHEN DATEPART(WEEKDAY, DATEADD(MINUTE, 1, StartDate)) IN (1,2,3,4,5) THEN 1 ELSE 0 END AS WorkDayFlag
,CASE WHEN DATEPART(HOUR, DATEADD(MINUTE, 1, StartDAte)) BETWEEN @hourFrom AND @hourTo-1 THEN 1 ELSE 0 END AS WorkHourFlag
FROM cte
WHERE
StartDate <= @dayTODateTime
)
SELECT
SUM(CASE WHEN WorkDayFlag = 1 AND WorkHourFlag = 1 THEN 1 ELSE 0 END) AS nofWorkingMinutes
,SUM(CASE WHEN WorkDayFlag = 1 AND WorkHourFlag = 1 THEN 1 ELSE 0 END)*1.0/60 AS nofWorkingHours
FROM cte 
OPTION (MAXRECURSION 10000)

 

I have also parameterized when business hours begin and end within a given day (@hourFrom, @hourTo). Please note that I do not generally endorse the usage of the MAXRECURSION option, but for small examples the query hint is very effective.

The result of the query is:

[Screenshot: query results]

If you will be running such calculations extensively, I strongly suggest converting the solution into a UDF and adding a rules table with the list of holidays, working days / weekends, the start of the week, correct times (time zone, UTC) and any other business rules that should be applied; a sketch of such a function follows.
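A minimal sketch of such a rules table and function; dbo.Holidays and dbo.fn_WorkingDays are hypothetical names, and the weekday checks assume the session runs with SET DATEFIRST 1, as above:

CREATE TABLE dbo.Holidays
(HolidayDate DATE PRIMARY KEY
,HolidayName VARCHAR(100)
);
GO

CREATE OR ALTER FUNCTION dbo.fn_WorkingDays
(
    @dayFrom DATE
   ,@dayTo DATE
)
RETURNS INT
AS
BEGIN
    RETURN
       (DATEDIFF(DAY, @dayFrom, @dayTo) + 1)
     - (DATEDIFF(WEEK, @dayFrom, @dayTo) * 2)
     - (CASE WHEN DATEPART(WEEKDAY, @dayFrom) = 7 THEN 1 ELSE 0 END)
     - (CASE WHEN DATEPART(WEEKDAY, @dayTo) = 6 THEN 1 ELSE 0 END)
     - (SELECT COUNT(*)
          FROM dbo.Holidays
         WHERE HolidayDate BETWEEN @dayFrom AND @dayTo
           AND DATEPART(WEEKDAY, HolidayDate) NOT IN (6, 7)) -- subtract only holidays falling on weekdays
END;
GO

SELECT dbo.fn_WorkingDays('20180901', '20180905') AS NumberOfWorkingDays;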

Date intervals

Finding correct intervals is always part of solving a business need. When querying the database, one will for sure come across intervals in a start and end date/time format.

Here is an example collection of date intervals for the current week, month and quarter; the script can be extended to broader (or narrower) intervals.

SELECT 
DATEADD(WEEK, DATEDIFF(WEEK, '19000101', GETDATE()), '18991231') as FromCurrentWeek
,DATEADD(WEEK, DATEDIFF(WEEK, '18991231', GETDATE()), '19000106') as ToCurrentWeek

SELECT 
CAST(DATEADD(DAY, 1, EOMONTH(GETDATE(),-1)) AS DATETIME) AS FromCurrentMonth
,CAST(EOMONTH(GETDATE()) AS DATETIME) AS ToCurrentMonth

SELECT 
DATEADD(QUARTER, DATEDIFF(QUARTER, 0, (GETDATE())), 0) as FromStartCurrentQuarter
,DATEADD(DAY,-1,DATEADD(QUARTER, DATEDIFF(QUARTER, -1, (GETDATE())), 0)) as ToEndCurrentQuarter

--Last days
SELECT 
DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,GETDATE()),0)) AS LastDayOfPreviousMonth
,DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,GETDATE())+1,0)) AS LastDayOfCurrentMonth
,DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,GETDATE())+2,0)) AS LastDayOfNextMonth

And the results of the query, based on the GETDATE() function:

[Screenshot: query results]


Support Date functions

With the following couple of examples you can control your current session settings, e.g. language, start of the week (DATEFIRST) and date format (DATEFORMAT); based on these settings, the behaviour of inserts or selects against any datetime data type can be altered and controlled.

-- Set language
SET LANGUAGE us_english;
SELECT DATENAME(WEEKDAY, '20180904') [US_English];
 
SET LANGUAGE Slovenian;
SELECT DATENAME(WEEKDAY, '20180904') [Slovenian];

-- Starting week with day
SET LANGUAGE Slovenian;  
SELECT @@DATEFIRST AS weekStart;  

SET LANGUAGE us_english;  
SELECT @@DATEFIRST AS WeekStart;  

-- datefirst or @@datefirst
SET DATEFIRST 1; -- week starts with monday

-- Example with  September 5th, 2018
-- Possible formats:  mdy, dmy, ymd, ydm, myd, and dym. 
DROP TABLE IF EXISTS #dateformats
CREATE TABLE #dateformats (dd SMALLDATETIME ,ddFormat VARCHAR(10))

SET DATEFORMAT mdy;
INSERT INTO #dateformats  VALUES ('09/06/2018', 'mdy');
SET DATEFORMAT ymd;
INSERT INTO #dateformats  VALUES ('2018/09/07', 'ymd');
SET DATEFORMAT dmy;
INSERT INTO #dateformats  VALUES ('08/09/2018', 'dmy');

SELECT 
	 dd AS InsertedDate
	,ddFormat AS InsertedDateFormat
FROM #dateformats

The SET DATEFORMAT example shows that if datetime values arrive in different formats – year-month-day or day-month-year – simply controlling SET DATEFORMAT handles the date formats without any further CASTing or CONVERTing.

[Screenshot: query results]
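One detail worth adding here: for the datetime data type, the unseparated form yyyymmdd is always interpreted the same way, regardless of the SET DATEFORMAT (or language) setting, which makes it the safest choice for date literals in scripts:

SET DATEFORMAT dmy;
SELECT CAST('20180906' AS DATETIME) AS UnambiguousLiteral; -- September 6th, 2018
SET DATEFORMAT mdy;
SELECT CAST('20180906' AS DATETIME) AS UnambiguousLiteral; -- still September 6th, 2018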

Starting Simple Time

For manipulating time, either as part of a datetime data type or as standalone time, the following examples will help you get going. Simple DATEPART extractions (HOUR, MINUTE, SECOND or MILLISECOND) will pull out parts of the time for any further use.

CONVERT and CAST are also frequently used for converting time to the desired format.

SELECT
  GETDATE() AS RightNow
 ,DATEPART(HOUR,GETDATE()) AS Hour_RightNow
 ,DATEPART(MINUTE, GETDATE()) AS Minute_RightNow
 ,DATEPART(SECOND, GETDATE()) AS Second_RightNow
 ,DATEPART(MILLISECOND, GETDATE()) AS MilliSecond_RightNow

-- Using Date format or FORMAT or CAST / CONVERT To get only time
SELECT
  CAST(GETDATE() AS TIME) AS TIME_RightNow
 ,CONVERT(CHAR,GETDATE(),14) AS TIME_withConvert14
 ,CONVERT(TIME, GETDATE()) AS Time_ConvertONly
 ,CAST(CONVERT(CHAR(8),GETDATE(),114) AS DATETIME) AS Time_WithConvert114_AandDate
 ,CONVERT(VARCHAR(12),GETDATE(),114) AS Time_standardFormat
 ,GETUTCDATE() AS TimeUTC_RightNow
 ,SYSDATETIME() AS SystemDateTime
 ,CONVERT(VARCHAR(10), CAST(GETDATE() AS TIME(0)), 100) AS SimpleAM_PM

SELECT
  FORMAT(cast(GETDATE() AS TIME), N'hh\.mm') AS timeFormatDot
 ,FORMAT(cast(GETDATE() AS TIME), N'hh\:mm') AS timeFormatColon
 ,FORMAT(CAST(GETDATE() AS TIME), 'hh\:mm\:ss') AS standardTimeFormat

 

The query outputs the following results; the FORMAT function may be the fastest and easiest way to handle time formats for the desired output.

[Screenshot: query results]

Converting to time formats

Time can be stored in different data types, which might present a problem. Let's first take a look at integers.

The following example shows how to manipulate time formats when the time parts are stored as integers.

-- seconds/milliseconds to time format
DECLARE @MilliSec INT = 55433
DECLARE @Sec INT = 532

SELECT 
CONVERT(VARCHAR(10),DATEADD(ms,@MilliSec,0),114) AS MilliSecToTime --format hh:mm:ss:nnn
,CONVERT(VARCHAR(10),DATEADD(s,@Sec,0),114) AS SecToTime --format hh:mm:ss:nnn

-- Converting seconds to time readable format
DECLARE @seconds INT = 10000

SELECT
@seconds AS NumberOfSeconds
,@seconds/86400 AS NumberOfDays
,@seconds/3600 As NumberOfHours
,@seconds/60 AS NumberMinutes
,CONVERT(VARCHAR, DATEADD(ms, @seconds * 1000, 0), 114) AS FormatedTime

Both examples convert integers into readable and usable time formats.

[Screenshot: query results]

VARCHAR / DECIMAL / FLOAT to time formats

Oh. Yes. This happens more often than you think. Having datetime information stored in DECIMAL or FLOAT is not that infrequent a case. Let us not go into the bad designs, but rather propose an idea on how to tackle such problems.

With decimal data type:

-- Using Decimal data type
DECLARE @test_DTY TABLE
(id int
,KA2_DATE decimal (8,0)
,KA2_TIME decimal (6,0)
)

INSERT INTO @test_DTY
SELECT 1, 20180905, 110951
UNION ALL SELECT 2, 20180905, 113407
UNION ALL SELECT 3, 20180905, 063407

SELECT
id
,KA2_DATE AS OriginalDate
,KA2_TIME AS OriginalTime --Note: leading zeros will not be present, as this is a decimal data type

,RIGHT(CAST(KA2_DATE AS VARCHAR(8)),2) + '/' + SUBSTRING(CAST(KA2_DATE AS VARCHAR(8)),5,2) + '/' + LEFT(CAST(KA2_DATE AS VARCHAR(8)),4) AS DateFormated_dd_MM_yyyy
,LEFT(CAST(REPLICATE('0', 6 - LEN(CAST(KA2_TIME AS VARCHAR(6)))) AS VARCHAR(1)) + CAST(CAST(KA2_TIME AS VARCHAR(6)) AS VARCHAR(6)),2) + ':' 
+ SUBSTRING(CAST(REPLICATE('0', 6 - LEN(CAST(KA2_TIME AS VARCHAR(6)))) AS VARCHAR(1)) + CAST(CAST(KA2_TIME AS VARCHAR(6)) AS VARCHAR(6)),3,2) + ':' 
+ RIGHT(CAST(REPLICATE('0', 6 - LEN(CAST(KA2_TIME AS VARCHAR(6)))) AS VARCHAR(1)) + CAST(CAST(KA2_TIME AS VARCHAR(6)) AS VARCHAR(6)),2) AS Time_formatted_hh_mm_ss
FROM @test_DTY

you can see the results are presented in more readable and easier-to-handle formats for any further use in your T-SQL query.

[Screenshot: query results]

Another frequent yet uncomfortable case is the VARCHAR data type. With PARSENAME and other string manipulation functions (SUBSTRING, CHARINDEX, PATINDEX, …), conversion to datetime formats can be done. Even though it is not performance-efficient, sometimes you just cannot change the table design.

-- Using String data Type (Varchar)
DECLARE @temp TABLE (Ddate VARCHAR(20))

INSERT INTO @temp (Ddate)
SELECT '3.11.2017 14:55:53' 
UNION ALL SELECT '12.11.2018 22:39:49' 
UNION ALL SELECT '12.8.2018 22:39:49' 
UNION ALL SELECT '2.3.2018 22:39:49' 
UNION ALL SELECT '12.8.2018 7:39:49' 
UNION ALL SELECT '12.8.2018 7:09:49'

SELECT 
Ddate AS OriginalDateFormatInVarchar
,PARSENAME(Ddate,3) AS DayFromVarchar
,PARSENAME(Ddate,2) AS MonthFromVarchar
,LEFT(PARSENAME(Ddate,1),4) AS YearFromVarchar
,REPLACE(SUBSTRING(Ddate, CHARINDEX(' ', Ddate), CHARINDEX(':', Ddate)),':','') AS TimeFromVarchar

,CAST(CAST(LEFT(PARSENAME(Ddate,1),4) AS CHAR(4)) + REPLICATE ('0', 2 - LEN(CAST(PARSENAME(Ddate,2) AS VARCHAR(2)))) + 
CAST(PARSENAME(Ddate,2) AS VARCHAR(2)) + REPLICATE ('0', 2 - LEN(CAST(PARSENAME(Ddate,3) AS VARCHAR(2)))) + 
CAST(PARSENAME(Ddate,3) AS VARCHAR(2)) AS INT) AS DateFormattedFromVarchar --Leading zeros corrected!

,CAST(CAST(LEFT(PARSENAME(Ddate,1),4) AS CHAR(4)) + REPLICATE ('0', 2 - LEN(CAST(PARSENAME(Ddate,2) AS VARCHAR(2)))) + 
CAST(PARSENAME(Ddate,2) AS VARCHAR(2)) + REPLICATE ('0', 2 - LEN(CAST(PARSENAME(Ddate,3) AS VARCHAR(2)))) + 
CAST(PARSENAME(Ddate,3) AS VARCHAR(2)) AS SMALLDATETIME) AS DateFormattedFromVarcharToSmallDateTime

,CAST(CAST(left(PARSENAME(Ddate,1),4) AS CHAR(4)) + REPLICATE ('0', 2 - LEN(CAST(PARSENAME(Ddate,2) AS VARCHAR(2)))) + 
CAST(PARSENAME(Ddate,2) AS VARCHAR(2)) + REPLICATE ('0', 2 - LEN(CAST(PARSENAME(Ddate,3) AS VARCHAR(2)))) + 
CAST(PARSENAME(Ddate,3) AS VARCHAR(2)) + REPLICATE('0', 6 - LEN(
CAST(LTRIM(RTRIM(REPLACE(SUBSTRING(Ddate, CHARINDEX(' ', Ddate), CHARINDEX(':', Ddate)),':',''))) AS VARCHAR(6)))) +
CAST(LTRIM(RTRIM(REPLACE(SUBSTRING(Ddate, CHARINDEX(' ', Ddate), CHARINDEX(':', Ddate)),':',''))) AS VARCHAR(6))
AS VARCHAR(14)) AS DateAndTimeFormatFromVarchar

FROM @temp
ORDER BY 8 ASC

Using the query, the end result is the time (and date) represented in different formats for any further use with T-SQL queries or datetime functions, like BETWEEN and logical operators.

[Screenshot: query results]

Fiscal Years (#Update 1; Sept. 6th,2018)

For any given time (using GETDATE()), getting the start and end of the fiscal year is another important boundary for P&L or financial reports.

-- a) it starts on April 1 and runs until March 31 of the next year
-- b) it starts on October 1 and runs until September 30 of the next year

SELECT
-- Fiscal Year Apr-Mar
CASE WHEN MONTH(GETDATE()) >= 4 THEN YEAR(GETDATE()) ELSE DATEPART(year,DATEADD(Year,-1,GETDATE())) END AS StartOfFiscalYearYear
,CASE WHEN MONTH(GETDATE()) >= 4 THEN YEAR(GETDATE())+1 ELSE YEAR(GETDATE()) END AS EndOfFiscalYearYear
,CASE WHEN MONTH(GETDATE()) >= 4 THEN CAST(CONCAT(CAST(YEAR(GETDATE()) AS VARCHAR),'/04/01') AS DATE) ELSE CAST(CONCAT(CAST(DATEPART(year,DATEADD(Year,-1,GETDATE())) AS VARCHAR),'/04/01') AS DATE) END AS StartOfFiscalYearDateTime
,CASE WHEN MONTH(GETDATE()) >= 4 THEN CAST(CONCAT(CAST(YEAR(dateadd(year,1,getdate())) AS VARCHAR),'/03/31') AS DATE) ELSE CAST(CONCAT(CAST(YEAR(GETDATE()) AS VARCHAR),'/03/31') AS DATE) END AS EndOfFiscalYearDateTime
-- Fiscal Year Oct-Sep
,CASE WHEN MONTH(GETDATE()) >= 10 THEN YEAR(GETDATE()) ELSE DATEPART(year,DATEADD(Year,-1,GETDATE())) END AS StartOfFiscalYearYear
,CASE WHEN MONTH(GETDATE()) >= 10 THEN YEAR(GETDATE())+1 ELSE YEAR(GETDATE()) END AS EndOfFiscalYearYear
,CASE WHEN MONTH(GETDATE()) >= 10 THEN CAST(CONCAT(CAST(YEAR(GETDATE()) AS VARCHAR),'/10/01') AS DATE) ELSE CAST(CONCAT(CAST(DATEPART(year,DATEADD(Year,-1,GETDATE())) AS VARCHAR),'/10/01') AS DATE) END AS StartOfFiscalYearDateTime
,CASE WHEN MONTH(GETDATE()) >= 10 THEN CAST(CONCAT(CAST(YEAR(dateadd(year,1,getdate())) AS VARCHAR),'/09/30') AS DATE) ELSE CAST(CONCAT(CAST(YEAR(GETDATE()) AS VARCHAR),'/09/30') AS DATE) END AS EndOfFiscalYearDateTime

Using the two different fiscal year boundaries, and given that today is September 6th, 2018, the result is as follows:

[Screenshot: query results]

Thank you for the comments and suggestions.

 

As always, all the code is available at the Github.

Happy T-SQLing.


Top SQL Server bloggers of 2018

The SQLShack community website by ApexSQL announced the list of top SQL Server bloggers on 1st of September 2018.


The list of top SQL Server bloggers is absolutely worth checking for entry-level learning, for troubleshooting and for advanced topics. There are also many other top lists of bloggers – PowerShell and others.

[Screenshot: Top SQL Server bloggers list]

 

The list is sorted by relevance based on Alexa ranking. My blog also managed to get on the list; I am humbled and honoured to be recognised for my community work. Congrats to all the SQL fellows, the SQL family and all the SQL lovers. Thank you also to the SQLShack community for doing this.


Happy Reading!
