First thoughs on R integration in SQL Server 2016

This blog post is a very personal confession, might be delicate but it must be understood as a  proposal or plan for improvement.

Update #1: Typing error in title: correct title is First thoughts on R integration in SQL Server 2016 (update: June 24, 2016)

Before the start, let me put some background and context. I have been working with statistics and database for quite some time now and used different tools for statistical purposes. Starting with SPSS, Lisrel, Pajek, continuing with SPSS Modeler and then going to Weka and R together, very heavly used SQL Server SSAS  and working with Orange and in past 2 years working with SAS  and SAP Hana Analytics. With all the products I have used – and I am not here rank or make comparison which one is better or which one I prefer – some things remain the same. Getting to know your data to a detail is the most important one, where as  applying the statistical knowledge to it the result of it.

So I have had opportunity to work with different data sets, for different companies, on different projects and with different tools. And I am – looking back – very happy to have done it. So this thoughts are not me complaining about any of the products or solutions, but merely looking on improvements based on my year long experiences.

Working with R in SQL Server has been my personal thing for couple of years. The story goes back 4 years ago, when I started working on building a web based engine for creating marketing campaigns based on input parameters and desired goals. Once the business/marketing department decided on their new campaign, using a web based (in house build) program, they selected product groups and customer segment and some minor additional information (time frame, validity of campaign, legal stuff,..) they clicked okey and that was it. But in the background the stored procedure generated long T-SQL that has pushed (1) data into SQL Server engine and (2) selected proper model that feed the data to R engine.  So going back to year 2012, I have build framework for communication SQL Server with R engine. Most of the framework is available at SQLServerCentral web page.  The framework had some security flaws and issues (calling external procedure, storing data in external files) and mostly performance issues (scaling out wasn’t possible, R was solemnly dependent on RAM of the server), but overall the framework was working with putting it into production and project put into production.

Back then I wrote my framework in a way that I would be used as easily as possible, storing and handling R code and T-SQL code in SQL Server database. Most of the problems I had was, that SSMS couldn’t debug or validate R code and to some extent, converting R results back to table was also proven not to be as easy as one might think. But overall, I was still in the middle – between the platform and putting the solution into production – mainly to control the input data and results (outputted data in form of recommendations or campaign lists).

After working with R integration in SQL Server 2016 for the past couple of months, I have to admit, I was happy that security issues were solved. And performance issues were solved with RevoScale library. Both are very very welcoming features. Come to think of it, two main issues did Microsoft solved. And looking from my aspect of building my own framework, security issues were really and badly needed to be solved. Same logic applies to workload and working with large datasets. Performance has proven really great. In addition, you have to keep in mind that with R Integration Microsoft is – rather then keeping them separated – bringing different roles of people even closer. So DBA, Analysts and developers are now and can now be all part of one data process. Integration with Azure is also very positive and good thing, come to think of it as an extension to your flavor of landscape. Support to different databases is also what will in future be very helpful – but admitting, I haven’t test it yet.

What I expect in the future for R integration that Microsoft will do is:

  • support R in Analysis Services
  • better support R in Integration Services
  • stronger R integration with DQS and MDM
  • native integration of R into transact SQL (calling SP every time for every kind of statistics might be tiring)
  • have some DMV created for R profiler, for R debugging and for R performance monitoring
  • bring support for HTML files and JS / JQuery scripts into SSRS or PowerBI
  • improve usage of R exports with file table
  • open server Endpoints Objects for R in order to use full extension of R.
  • improve Resource Governor for R in Standard edition
  • questions on auditing, profiling of R code from T-SQL and
  • security (couple of questions as well)

I have most probably forgot on couple of points for improvement. But this was more technical, now let’s talk about user experience point of view:

  • think about the support of R debugger / validation of code in SSMS
  • in fact, bring R into SSMS (!) instead of VS or create RTMS (R Tools for SQL Server Management Studio)
  • not limit R exports only on data frames or images but support also vectors, lists, etc

Of course, not to hasten myself too much, I forgot to ask myself, what actually the purpose of integration was.

2016-06-23 11_48_40-sql-server-2016-r-services-7-638.jpg (JPEG Image, 656 × 333 pixels)

Is it really only about making the super predictive analytics or is it making statistical tool as well? If latter, than I would strongly suggest to recap my open suggestions.

So, for the end, I will serve you with a hint 🙂 I am preparing and building bunch of scripts (from my previous experience and from the framework that I have built) to help all the data scientists and developers to use R much easier. But (!) – mathematical and statistical background still is very much a something I would put as highly recommended to have when digging into R.

Follow the blog where I will keep you updated on the scripts and where the code will be available.


Happy R-SQLing 🙂

Tagged with: , , , , ,
Posted in Uncategorized
One comment on “First thoughs on R integration in SQL Server 2016
  1. Good points! and thumbs up for supporting R on analysis services.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Follow TomazTsql on
Programs I Use: SQL Search
Programs I Use: R Studio
Programs I Use: Plan Explorer
Programs I use: Scraper API
Rdeči Noski – Charity

Rdeči noski

100% of donations made here go to charity, no deductions, no fees. For CLOWNDOCTORS - encouraging more joy and happiness to children staying in hospitals (


Top SQL Server Bloggers 2018

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond


A daily selection of the best content published on WordPress, collected for you by humans who love to read.


Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond - attaining enlightenment with the Microsoft Data and Cloud Platforms with a sprinkling of Open Source and supporting technologies!

SQL DBA with A Beard

He's a SQL DBA and he has a beard

Reeves Smith's SQL & BI Blog

A blog about SQL Server and the Microsoft Business Intelligence stack with some random Non-Microsoft tools thrown in for good measure.

SQL Server

for Application Developers

Business Analytics 3.0

Data Driven Business Models

SQL Database Engine Blog

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

Search Msdn

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond


Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

Ms SQL Girl

Julie Koesmarno's Journey In Data, BI and SQL World


R news and tutorials contributed by hundreds of R bloggers

Data Until I Die!

Data for Life :)

Paul Turley's SQL Server BI Blog

sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Power BI, Dashboards & Visualization since 2009

Grant Fritchey

Intimidating Databases and Code

Madhivanan's SQL blog

A modern business theme

Alessandro Alpi's Blog

DevOps could be the disease you die with, but don’t die of.

Paul te Braak

Business Intelligence Blog

Sql Server Insane Asylum (A Blog by Pat Wright)

Information about SQL Server from the Asylum.

Gareth's Blog

A blog about Life, SQL & Everything ...

SQLPam's Blog

Life changes fast and this is where I occasionally take time to ponder what I have learned and experienced. A lot of focus will be on SQL and the SQL community – but life varies.

%d bloggers like this: