Passing two SQL queries to sp_execute_external_script

Recently, I got a question on one of my previous blog posts, if there is possibility to pass two queries in same run-time as an argument to external procedure sp_execute_external_script.

Some of the  arguments of the procedure sp_execute_external_script are enumerated. This is valid for the inputting dataset and as the name of argument @input_data_1 suggests, one can easily (and this is valid doubt) think, there can also be @input_data_2 argument, and so on. Unfortunately, this is not true.  External procedure can hold only one T-SQL dataset, inserted through this parameter.

There are many reasons for that, one would be the cost of sending several datasets to external process and back, so inadvertently, this forces user to rethink and pre-prepare the dataset (meaning, do all the data munging beforehand), prior to sending it into external procedure.

But there are workarounds on how to pass additional query/queries to sp_execute_external_script. I am not advocating this, and I strongly disagree with such usage, but here it is.

First I will create two small datasets, using T-SQL

USE SQLR;
GO

DROP TABLE IF EXISTS dataset;
GO

CREATE TABLE dataset
(ID INT IDENTITY(1,1) NOT NULL
,v1 INT
,v2 INT
CONSTRAINT pk_dataset PRIMARY KEY (id)
)

SET NOCOUNT ON;
GO

INSERT INTO dataset(v1,v2)
SELECT TOP 1
 (SELECT TOP 1 number FROM master..spt_values WHERE type IN ('EOB') ORDER BY NEWID()) AS V1
,(SELECT TOP 1 number FROM master..spt_values WHERE type IN ('EOD') ORDER BY NEWID()) AS v2
FROM master..spt_values
GO 50

This dataset will be used directly into @input_data_1 argument. The next one will be used through R code:

CREATE TABLE external_dataset
(ID INT IDENTITY(1,1) NOT NULL
,v1 INT
CONSTRAINT pk_external_dataset PRIMARY KEY (id)
)

SET NOCOUNT ON;
GO

INSERT INTO external_dataset(v1)
SELECT TOP 1
 (SELECT TOP 1 number FROM master..spt_values WHERE type IN ('EOB') ORDER BY NEWID()) AS V1
FROM master..spt_values
GO 50

Normally,  one would use a single dataset like:

EXEC sp_execute_external_script
     @language = N'R'
    ,@script = N'OutputDataSet <- data.frame(MySet);'
    ,@input_data_1 = N'SELECT TOP 5 v1, v2 FROM dataset;'
    ,@input_data_1_name = N'MySet'
WITH RESULT SETS
((
 Val1 INT
 ,Val2 INT
))

But by “injecting” the  ODBC into R code, we can allow external procedure, to get back to your SQL Server and get additional dataset.

This can be done by following:

EXECUTE AS USER = 'RR'; 
GO

DECLARE @Rscript NVARCHAR(MAX)
SET @Rscript = '
   library(RODBC)
   myconn <-odbcDriverConnect("driver={SQL Server};
         Server=SICN-KASTRUN;database=SQLR;uid=RR;pwd=Read!2$16")
  External_source <- sqlQuery(myconn, "SELECT v1 AS v3 
                    FROM external_dataset")
  close(myconn) 
  Myset <- data.frame(MySet)
   #Merge both datasets
   mergeDataSet <- data.frame(cbind(Myset, External_source));'

EXEC sp_execute_external_script
    @language = N'R'
   ,@script = @Rscript
   ,@input_data_1 = N'SELECT v1, v2 FROM dataset;'
   ,@input_data_1_name = N'MySet'
   ,@output_data_1_name = N'mergeDataSet'
WITH RESULT SETS
((
    Val1 INT
   ,Val2 INT
   ,Val3 INT
))

REVERT;
GO

And the result will be merged two datasets, in total three columns:

2017-07-25 22_04_48-Two_data_sets_R_sp_execute_external_script.sql - SICN-KASTRUN.SQLR (SPAR_si01017

which correspond to two datasets:

-- Check the results!
SELECT * FROM dataset
SELECT * FROM external_dataset

There are, as already mentioned, several opposing factors to this approach, and I would not recommend this. Some are:

  • validating and keeping R code in one place
  • performance issues
  • additional costs of data transferring
  • using ODBC connectors
  • installing additional R packages (in my case RODBC package)
  • keeping different datasets in one place
  • security issues
  • additional login/user settings
  • firewall inbound/outbound rules setting

This, of course, can also be achieved with *.XDF file formats, if they are stored locally or on server as a files.

As always, code is available at Github.

Happy R-SQLing! 🙂

Advertisements

5 thoughts on “Passing two SQL queries to sp_execute_external_script

  1. Thank you for this. I had wondered if that would work. It does seem like poor design to not allow access to other sql queries or even the use of variables in the input data sql query – unless you know of a way of doing so?

    Like

    • Hi Jonathan,

      your question is spot on. Was this done as part of poor or well thought design. To be honest, I can not answer that. But I will only explain, how I see it. My opinion is, that was thought over and the way it is, makes sense. First of all, doing and running data mapping, data joining, data cleaning on SQL Server side, will make your data more prune to error, consistent and less traffic will be generated to external engine and this will for sure be faster (I have written a blog post on checking performances: doing data wrangling with T-SQL or data wrangling with R. and doing this with T-SQL was faster, because less data was sent to external engine). So if you can join two queries or two datasets in R code, you will be able to join two (or more) datasets before sending the data to R; with T-SQL. On the other hand, if you look into RevoScaleR package, you will find function rxMerge, that is used for merging different datasets in R or Microsoft R Server environment (that is, “outside” of SQL Server database) and this also tells me, that guys at Microsoft though about this well. rxMerge enables you to join to XDF datasets. The last point into favor of well thought design is, there is possibility to use parameters, where you can pass additional information into R code (such as models, or it can also be some coding lists or ad-hoc information).

      Best, Tomaž

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s