Handling required and missing R packages in Microsoft R Services

I have seen several times that executing R code with the procedure sp_execute_external_script failed because of a missing library or missing library dependencies.

In general, this problem is not solved out of the box. It can be handled either by maintaining a list of the libraries installed for Microsoft R Services, or by simply adding a piece of “pre-code” R code that does the check for you.

In both cases you end up with additional code, but it can be a useful check when you install libraries in production and run the check before running any relevant R code.
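
For the first option, a minimal sketch of how such a list could be produced in R follows. It uses only base R's installed.packages(); the names pkg_matrix and pkg_list are illustrative, and the assignment to OutputDataSet assumes the code is run through sp_execute_external_script.

```r
# Sketch: build a table of the packages visible to this R instance.
# installed.packages() returns a character matrix with one row per package.
pkg_matrix <- installed.packages()[, c("Package", "Version", "LibPath"), drop = FALSE]

# pkg_list is a hypothetical name; a data frame can be returned to SQL Server.
pkg_list <- as.data.frame(pkg_matrix, stringsAsFactors = FALSE)
rownames(pkg_list) <- NULL

# When run via sp_execute_external_script, OutputDataSet is the returned result set.
OutputDataSet <- pkg_list
```

The result could be stored in a table and compared against the libraries a script asks for.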

Let’s start with a simple piece of R code:

USE WideWorldImporters;
GO

EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script=N'library(Hmisc) 
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
                            type="pearson")$P)
                    OutputDataSet<-df'
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((SupplierID NVARCHAR(200)
                    ,UnitPackageID NVARCHAR(200)
                    ,OuterPackageID NVARCHAR(200)));

 

In my case, this code returns an error message saying that R Services is missing a library needed to execute the R code, namely Hmisc.

[Image: Error_installed_library_01]

So the first step in solving this issue is to declare a variable for the R script and pass it to the @script parameter of the procedure sp_execute_external_script.

DECLARE @OutScript NVARCHAR(MAX) 
SET @OutScript =N'library(Hmisc) 
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
                     type="pearson")$P)
                    OutputDataSet<-df'

EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script= @OutScript
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((
                         SupplierID    NVARCHAR(200)
                        ,UnitPackageID NVARCHAR(200)
                        ,OuterPackageID NVARCHAR(200)
                    ));

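Now we need to search the script string for the following patterns:

  • library(c(package1,package2))
  • library(package1), library(package2)
  • library(package1)

Each of these library() calls means that one or more libraries must be installed. Note that library is an ordinary R function, not a reserved word, so a plain string search for "library(" is enough to find the calls.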
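
As an illustration of the same search, here is a minimal sketch in R that uses a regular expression instead of the CHARINDEX-based T-SQL used in this post. The function name extract_packages is hypothetical, and the pattern only covers the plain library(package) form, with or without quotes; it deliberately ignores the c(...) form.

```r
# Hypothetical helper: pull package names out of library() calls in R source text.
extract_packages <- function(script) {
  # Matches library(pkg), library("pkg") and library('pkg'), allowing spaces.
  pattern <- "library\\s*\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?\\s*\\)"
  matches <- regmatches(script, gregexpr(pattern, script, perl = TRUE))[[1]]
  # Keep only the captured package name and drop duplicates.
  unique(sub(pattern, "\\1", matches, perl = TRUE))
}

script <- 'library(Hmisc)
           library("test123")
           df <- data.frame(x = 1)'

pkgs <- extract_packages(script)
print(pkgs)
```

A script without any library() call yields an empty character vector, so the result can be passed on to an installation step without special handling.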

DECLARE @OutScript NVARCHAR(MAX) 
SET @OutScript =N'library(Hmisc)
                  library(test123) 
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
                      type="pearson")$P)
                    OutputDataSet<-df'

/*
***************************************************************
START: CODE FOR CHECKING AND INSTALLING MISSING R LIBRARIES
***************************************************************
*/


DECLARE @Tally TABLE (num TINYINT,R_Code NVARCHAR(MAX))
INSERT INTO @Tally VALUES (1,@OutScript)
DECLARE @libstatement NVARCHAR(MAX)
DECLARE @cmdstatement NVARCHAR(MAX)

;WITH CTE_R(num,R_Code, libname)
AS
(
SELECT
      1 AS num,
      RIGHT(R_Code, LEN(R_Code) - CHARINDEX(')', R_Code, 0)) AS  R_Code, 
      substring(R_Code, CHARINDEX('library(', R_Code, 0) + 0, CHARINDEX(')',
 R_Code, 0) - CHARINDEX('library(', R_Code, 0) + 1) AS libname
FROM @Tally
WHERE  
        CHARINDEX('(', R_Code, 0) > 0 
    AND CHARINDEX('library(',R_Code,0) > 0

UNION ALL

SELECT
     1 AS num,
     RIGHT(R_Code, LEN(R_Code) - CHARINDEX(')', R_Code, 0)) AS  R_Code,
     substring(R_Code, CHARINDEX('library(', R_Code, 0) + 0, CHARINDEX(')', 
R_Code, 0) - CHARINDEX('library(', R_Code, 0) + 1) AS libname
FROM CTE_R
WHERE 
    CHARINDEX('(', R_Code, 0) > 0 
AND CHARINDEX('library(',R_Code,0) > 0

)
, fin AS
(
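-- Build one install.packages() call per library found; the string literal is
-- kept on a single line so that no line break ends up in the generated command.
SELECT TOP 1 stuff((SELECT ' install.packages(''''' + 
REPLACE(REPLACE(REPLACE(c1.libname,'library',''),')',''),'(','') + ''''', dependencies = T)'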
              FROM CTE_R AS c1 
              WHERE 
                    c1.num = c2.num
              FOR XML PATH (''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'') AS lib_stat
              FROM CTE_R AS c2
)
SELECT 
        @libstatement = lib_stat 
FROM fin

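-- Keep the path on one line: inside a T-SQL string literal, a backslash followed
-- by a line break is a line continuation and would swallow the backslash.
SET @cmdstatement = 'EXEC xp_cmdshell ''"C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\R_SERVICES\bin\R.EXE" cmd -e ' + @libstatement + ''''
EXEC SP_EXECUTESQL @cmdstatement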

/*
***************************************************************
END: CODE FOR CHECKING AND INSTALLING MISSING R LIBRARIES
***************************************************************
*/


EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script= @OutScript
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((
                         SupplierID    NVARCHAR(200)
                        ,UnitPackageID NVARCHAR(200)
                        ,OuterPackageID NVARCHAR(200)
                    ));

 

In this case the call succeeds and returns the correct R results, and sp_execute_external_script no longer returns an error about missing libraries.

I added a “fake” library called test123 to test whether every library in the script is picked up for installation.

In the end, the script generates the following xp_cmdshell command (on one line):

EXEC xp_cmdshell '"C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\R_SERVICES\bin\R.EXE" cmd -e install.packages(''Hmisc'') install.packages(''test123'')'

You might also see the following error in the output of the xp_cmdshell command:

[Image: Error_installed_library_02]

In this case, go to the following location and grant write permission on that folder and its subfolders.

[Image: Error_installed_library_03]
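
Before changing permissions, it may help to check from R which library folders the current account can write to. The following is a minimal sketch using base R's .libPaths() and file.access(); the name lib_check is illustrative, and on Windows the result of file.access() should be treated as indicative only.

```r
# Sketch: report which library folders the current account can write to.
lib_paths <- .libPaths()

# file.access() returns 0 when access is allowed and -1 otherwise;
# mode = 2 tests for write permission.
writable <- file.access(lib_paths, mode = 2) == 0

# lib_check is a hypothetical name for the resulting overview.
lib_check <- data.frame(lib_path = lib_paths,
                        writable = unname(writable),
                        stringsAsFactors = FALSE)
print(lib_check)
```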

Finally, there is a more elegant and simpler way: write the R code so that it checks by itself whether a library is installed, installs it together with its dependencies when it is missing, and then loads it:

if(!is.element("Hmisc", rownames(installed.packages())))
{install.packages("Hmisc", dependencies = TRUE)}
library("Hmisc")

So the original code can simply be changed to:

USE WideWorldImporters;
GO

EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script=N'if(!is.element("Hmisc", rownames(installed.packages())))
                      {install.packages("Hmisc", dependencies = TRUE)}
                    library("Hmisc")
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
                            type="pearson")$P)
                    OutputDataSet<-df'
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((SupplierID NVARCHAR(200)
                    ,UnitPackageID NVARCHAR(200)
                    ,OuterPackageID NVARCHAR(200)));
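
When a script needs several libraries, the same check can be wrapped in a small function. This is a minimal sketch, not part of the original code: ensure_packages and its installer and load arguments are hypothetical names, and the installer argument exists only so that the function can be tried out without network access.

```r
# Hypothetical helper: install whichever requested packages are missing,
# then (optionally) load all of them.
ensure_packages <- function(pkgs, installer = utils::install.packages, load = TRUE) {
  to_install <- setdiff(pkgs, rownames(utils::installed.packages()))
  if (length(to_install) > 0) {
    installer(to_install, dependencies = TRUE)
  }
  if (load) {
    for (pkg in pkgs) {
      # character.only = TRUE makes library() read the name from the variable.
      library(pkg, character.only = TRUE)
    }
  }
  # Return the names that had to be installed, without printing them.
  invisible(to_install)
}

# Possible usage at the top of an R script:
# ensure_packages(c("Hmisc"))
```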

Unfortunately, one cannot always count on developers, data scientists, or other authors of the code to consistently add such a simple check for library installation.

Code is available at GitHub.

Happy R-TSQLing!
