Handling required and missing R packages in Microsoft R Services

I have seen several times, that execution of the R code with procedure sp_execute_external_script was not valid due to missing library or library dependencies.

Problem is – in general – not solved out of the box. But can be solved using and maintaining a list of installed libraries used by Microsoft R services or by simply create a “pre-code” R code to do a check for you.

In both cases, user will end up with additional code, but it might be a good check if you are installing library in production and you run such check prior to running any relevant R code.

Let’s start with simple R code:

USE WideWorldImporters;
GO

EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script=N'library(Hmisc) 
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
                            type="pearson")$P)
                    OutputDataSet<-df'
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((SupplierID NVARCHAR(200)
                    ,UnitPackageID NVARCHAR(200)
                    ,OuterPackageID NVARCHAR(200)));

 

This code will in my case return error message, that R Service is missing a specific library in order to execute R code – in my case library Hmisc.

Error_installed_library_01.Wide

So the first step to solve this issue is to declare variable for R script and parametrize  @script parameter for procedure sp_execute_external_script.

DECLARE @OutScript NVARCHAR(MAX) 
SET @OutScript =N'library(Hmisc) 
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
                     type="pearson")$P)
                    OutputDataSet<-df'

EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script= @OutScript
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((
                         SupplierID    NVARCHAR(200)
                        ,UnitPackageID NVARCHAR(200)
                        ,OuterPackageID NVARCHAR(200)
                    ));

Now we need to do a string search for following patterns:

  • library (c(package1,package2))
  • library (package1),  library(package2)
  • library(package1)

All these reserved R words denote the need for particular library/libraries installation.

DECLARE @OutScript NVARCHAR(MAX) 
SET @OutScript =N'library(Hmisc)
                  library(test123) 
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
                      type="pearson")$P)
                    OutputDataSet<-df'

/*
***************************************************************
START: CODE FOR CHECKING AND INSTALLING MISSING R LIBRARIES
***************************************************************
*/


DECLARE @Tally TABLE (num TINYINT,R_Code NVARCHAR(MAX))
INSERT INTO @Tally VALUES (1,@OutScript)
DECLARE @libstatement NVARCHAR(MAX)
DECLARE @cmdstatement NVARCHAR(MAX)

;WITH CTE_R(num,R_Code, libname)
AS
(
SELECT
      1 AS num,
      RIGHT(R_Code, LEN(R_Code) - CHARINDEX(')', R_Code, 0)) AS  R_Code, 
      substring(R_Code, CHARINDEX('library(', R_Code, 0) + 0, CHARINDEX(')',
 R_Code, 0) - CHARINDEX('library(', R_Code, 0) + 1) AS libname
FROM @Tally
WHERE  
        CHARINDEX('(', R_Code, 0) > 0 
    AND CHARINDEX('library(',R_Code,0) > 0

UNION ALL

SELECT
     1 AS num,
     RIGHT(R_Code, LEN(R_Code) - CHARINDEX(')', R_Code, 0)) AS  R_Code,
     substring(R_Code, CHARINDEX('library(', R_Code, 0) + 0, CHARINDEX(')', 
R_Code, 0) - CHARINDEX('library(', R_Code, 0) + 1) AS libname
FROM CTE_R
WHERE 
    CHARINDEX('(', R_Code, 0) > 0 
AND CHARINDEX('library(',R_Code,0) > 0

)
, fin AS
(
SELECT TOP 1 stuff((SELECT ' install.packages(''''' + 
REPLACE(REPLACE(REPLACE(c1.libname,'library',''),')',''),'(','') + '''''
           , dependencies = T)'
              FROM CTE_R AS c1 
              WHERE 
                    c1.num = c2.num
              FOR XML PATH (''),TYPE).value('.','NVARCHAR(MAX)'),1,1,'') AS lib_stat
              FROM CTE_R AS c2
)
SELECT 
        @libstatement = lib_stat 
FROM fin

SET @cmdstatement = 'EXEC xp_cmdshell ''"C:\Program Files\Microsoft SQL Server\
MSSQL13.MSSQLSERVER\R_SERVICES\bin\R.EXE" cmd -e ' + @libstatement + ''''
EXEC SP_EXECUTESQL @cmdstatement

/*
***************************************************************
END: CODE FOR CHECKING AND INSTALLING MISSING R LIBRARIES
***************************************************************
*/


EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script= @OutScript
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((
                         SupplierID    NVARCHAR(200)
                        ,UnitPackageID NVARCHAR(200)
                        ,OuterPackageID NVARCHAR(200)
                    ));

 

Result in this case will be successful with correct R results and sp_execute_external_script will not return error for missing libraries.

I added a “fake” library called test123 for testing purposes if all the libraries will be installed successfully.

At the end the script generated xp_cmdshell command (in one line):

EXEC xp_cmdshell '"C:\Program Files\Microsoft SQL Server\
MSSQL13.MSSQLSERVER\R_SERVICES\bin\R.EXE" cmd 
-e install.packages(''Hmisc'') install.packages(''test123'')'

You might also experience the following error in the output of xp_cmdshell command:

Error_installed_library_02.Wide

In this case, go to the following location and enable write permission for this folder and subfolders.

Error_installed_library_03.Wide

For the end, you will always find more elegant and easy way to write R code that will inadvertently check the installation (as well as version and dependencies for library) with following R code:

if(!is.element("Hmisc", installed.packages()))
{install.packages("Hmisc", dependencies = T)
}else{library("Hmisc")}

So the original code can simply be changed to:

USE WideWorldImporters;
GO

EXECUTE sp_execute_external_script    
           @language = N'R'
          ,@script=N'if(!is.element("Hmisc", installed.packages()))
                      {install.packages("Hmisc", dependencies = T)
                        }else{library("Hmisc")}
                    df <- data.frame(rcorr(as.matrix(sp_RStats_query), 
 type="pearson")$P)
                    OutputDataSet<-df'
          ,@input_data_1 = N'SELECT 
                     SupplierID
                    ,UnitPackageID
                    ,OuterPackageID
                FROM [Warehouse].[StockItems]'
          ,@input_data_1_name = N'sp_RStats_query'
    WITH RESULT SETS ((SupplierID NVARCHAR(200)
                    ,UnitPackageID NVARCHAR(200)
                    ,OuterPackageID NVARCHAR(200)));

But unfortunately, one can not always count on the consistence of developer or data scientists or author of the code, that they will always add a simple check for library installation.

Code is available at GitHub.

Happy R-TSQLing!

Tagged with: , , , , ,
Posted in Uncategorized
5 comments on “Handling required and missing R packages in Microsoft R Services
  1. […] article was first published on R – TomazTsql, and kindly contributed to […]

    Like

  2. […] article was first published on R – TomazTsql, and kindly contributed to […]

    Like

  3. […] article was first published on R – TomazTsql, and kindly contributed to […]

    Like

  4. […] Tomaz Kastrun has a script to check and install missing packages in SQL Server R Services code: […]

    Like

  5. […] have written couple of blog posts on R packages (here | here ) and this blog post is sort of a preset of all the most needed packages for data science, […]

    Like

Leave a comment

Follow TomazTsql on WordPress.com
Programs I Use: SQL Search
Programs I Use: R Studio
Programs I Use: Plan Explorer
Rdeči Noski – Charity

Rdeči noski

100% of donations made here go to charity, no deductions, no fees. For CLOWNDOCTORS - encouraging more joy and happiness to children staying in hospitals (http://www.rednoses.eu/red-noses-organisations/slovenia/)

€2.00

Top SQL Server Bloggers 2018
TomazTsql

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

Discover WordPress

A daily selection of the best content published on WordPress, collected for you by humans who love to read.

Revolutions

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

tenbulls.co.uk

tenbulls.co.uk - attaining enlightenment with the Microsoft Data and Cloud Platforms with a sprinkling of Open Source and supporting technologies!

SQL DBA with A Beard

He's a SQL DBA and he has a beard

Reeves Smith's SQL & BI Blog

A blog about SQL Server and the Microsoft Business Intelligence stack with some random Non-Microsoft tools thrown in for good measure.

SQL Server

for Application Developers

Business Analytics 3.0

Data Driven Business Models

SQL Database Engine Blog

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

Search Msdn

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

R-bloggers

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Data Until I Die!

Data for Life :)

Paul Turley's SQL Server BI Blog

sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Power BI, Dashboards & Visualization since 2009

Grant Fritchey

Intimidating Databases and Code

Madhivanan's SQL blog

A modern business theme

Alessandro Alpi's Blog

DevOps could be the disease you die with, but don’t die of.

Paul te Braak

Business Intelligence Blog

Sql Insane Asylum (A Blog by Pat Wright)

Information about SQL (PostgreSQL & SQL Server) from the Asylum.

Gareth's Blog

A blog about Life, SQL & Everything ...