“Climate spiral” – Plotting GISS Surface Temperature

NASA has been collecting surface temperature for more than over 100 years, and the GISS Surface Temperature analysis from 1888 onward. It is an estimate of global surface temperature change.

The temperature analysis schema was defined in the late 1970s by James Hansen and the complete analysis and method are documented in Hansen and Lebedeff (1987). To put it simply is a method of global temperature change that is used for comparison with one-dimensional global climate models.

Data

The website (https://data.giss.nasa.gov/gistemp/ ) offers loads of data and I am using the Global-mean monthly, seasonal, and annual means, 1880-present, updated through most recent month: TXTCSV

It need some prior data preparation, namely min and max values, as it is required by radarchart function.

library(ggradar)
library(fmsb)
library(scales)
library(RColorBrewer)

#data txt and preparation
df <-read.csv("Documents/GLB.Ts+dSST.csv",header = TRUE, sep = ",", skip = 1, dec="." )[1:13]
df <- as.data.frame(sapply(df[1:143,], as.numeric))
df_months <- names(df)[2:13]
df_years <- df$Year
rownames(df) <- df_years
df <- df[,2:13]

# adding max min
max_min <- data.frame(
  Jan = c(1.4, -0.85), Feb = c(1.4, -0.85), Mar = c(1.4, -0.85),
  Apr = c(1.4, -0.85), May = c(1.4, -0.85), Jun = c(1.4, -0.85),
  Jul = c(1.4, -0.85), Aug = c(1.4, -0.85), Sep = c(1.4, -0.85),
  Oct = c(1.4, -0.85), Nov = c(1.4, -0.85), Dec = c(1.4, -0.85)
)
rownames(max_min) <- c("Max", "Min")
#merging
df <- rbind(max_min, df)

# Set graphic colors
nb.cols <- length(df_years)
mycolors <- colorRampPalette(brewer.pal(8, "Set2"))(nb.cols)
colors_border <- mycolors  
colors_in <- alpha(mycolors, 0.3) 

Data visualisation

Adding the radarchart and looping through the years:

for (i in 1:length(df_years)){
  y <- df_years[1:i]
  df_tmp <- df[rownames(df)%in%y,1:12]
  df_tmp <- rbind(max_min, df_tmp)
  radarchart( df_tmp, maxmin=TRUE, axistype=1,seg=3,vlabels = df_months,
              plwd=0.5 , plty=1,centerzero=FALSE,caxislabels = c(-1, 0, 1, 1.4),
              cglcol="grey", cglty=2, axislabcol="black",  
              vlcex=1.2,
              title= paste0("GISS Surface temperature for years until ", tail(y,1)) )  
  legend(x=-0.35, y=0.15, legend = tail(y,1), bty = "n", pch=30 , col=colors_in , text.col ="black", cex=1.3, pt.cex=3)
}

Changes through time stretch from 1888 until 2023. The scale of this radar charts go from -1ºC to 1.4ºC (with 0 and 1 as reference lines).

Year 1900

Year 1950

Year 2000

Year 2022

The spikes and bursts in temperature over the past 30 years are mind-boggling 😦

As always, code is available on Github in  Useless_R_function repository. The sample file in this repository is here (filename: Climate_spiral.R) Check the repository for future updates.

Happy R-coding and stay healthy!

Advertisement
Tagged with: , , , , ,
Posted in thoughts, Useless R functions

Little useless-useful R functions – Old phone converted from text to numbers

We all remember this, right?

What a nightmare was to type a short message on these keypads. So, image writing Hello on this keypad, you had to press: 4433555555666 to get the letters “hello”.

44 = h
33 = e
555 = l
555 = l
666 = o

So creating converter would be great to troll on your friends 🙂

By using this useless function, you can now convert text into numbers.

SMSconverter <- function(tt){
  st <- NULL
  tti <- unlist(strsplit(paste0(tt, " "), ""))
  # check if input string are letters
  if (!grepl("[^A-Za-z]", tti[1]) == TRUE){
    for (i in 1:nchar(tt)){
      lt <- substr(tt,i,i)
      if (lt != " "){
        rn <- substr(rownames(which(mm == lt, arr.ind = T)),2,2)
        rep <- which(mm == lt, arr.ind = T)[2]
        st <- c(st, replicate(rep, rn))
      } else { st <- c(st, "0") }
    }
    print(paste0(st, collapse = ""))
  }
  
  # check if input string are numbers
  if(!grepl("\\D", tti[1]) == TRUE){
    tti <- unlist(strsplit(paste0(tt, ""), ""))
    st <- NULL
    tti <- unlist(strsplit(as.character(tt), ""))
    tmp <- rle(tti)
    for (i in 1:length(tmp$lengths)){
      rpt <- tmp$lengths[i]
      row_cnt <- tmp$values[i]
      lt <- mm[as.integer(row_cnt)-1,rpt]
      st <- c(st, lt)  
    }
    print(paste(st, collapse=""))
  }
}

You can run the function as:

SMSconverter("text")

and it will output you the sequence of numbers, corresponding to the text: 833998.

The function is created to convert the text to numbers and numbers to text. Yet, the nondeterministic nature of numbers prevents correct conversion in the latter case.

Let me give you an example, with my name “tomaz”. The corresponding number conversion is 8666629999.

8 = t
6666 = can be: 6, 666 or 66, 66 or 666, 6 with solutions: m,o or nn or o,m
2 = a
9999 = z

So solutions can be tmoaz, tnnaz or tomaz. Well, one might need to use also the spell checker. Anyways, the conversion fails when there are multiple letters from the same key.

As always, complete code with helper datasets is available on Github in  Useless_R_function repository. The sample file in this repository is here (filename: Convert_text_to_number.R) Check the repository for future updates.

Happy R-coding and stay healthy!

Tagged with: , , ,
Posted in Uncategorized, Useless R functions

T-SQL Tuesday #162 Invitation – Data Science in the time of ChatGPT

This is a great opportunity and I am honoured to be hosting this months T-SQL Tuesday blogging invitation. With the invitation of Steve, we have agreed to post topic on Data science.

I will be receiving all of your answers on blog posts and twitter (make sure to add #tsql2sday).

Data Science in the time of Chat GPT

Instead of writing and asking Data science questions, let’s discuss the aspects of Data science with the presence of Chat GPT 4.0.

By now, it is known to everyone that Chat GPT is a language model (LLM – Large Language Model) that is based on the GPT (Generative Pre-trained Transformer) architecture. It uses deep learning algorithms to like neural nets with billions of weights and transformers, that generated the sequence of tokens, that make up a piece of text.Transformers introduce the concept of “paying attention” to generally build better sequence of text. It operates primarily with probabilities of words and their sequence and therefore it is also good for human-like responses to natural language queries, making it great for a conversation-like experience.

There are many of the caveats hidden in the processing of text, adjustments of weights, functions (different and tweaked versions of Relu), additional corpora and billions of text for model training and many additional texts.

I have prepared two groups of questions. And I will not go into debate, if the end of data science is near, nor will go into debate, if the AGI (artificial general intelligence) will completely replace the role of data scientists. What I want to hear from you is simply how did you embrace (if at all) the use of Chat GPT, and what were your first impressions. And mostly, how did it help you (if at all), what did you use it for, and have you encountered any traps?

Usage and working along Chat GPT

Imagine using SQL, R, Python, Julia, or Scala, for your daily data science work. And you can practically ask Chat GPT anything and it will return you a relatively coherent and good answer. If you need an explanation, it will excel. Where and what have you used it for? Here is a short list, that might get you started:

  1. Explain the data science algorithn?
  2. Help tune or create SQL code to query big data
  3. Prepare R, Python, Scala code for exploring the data
  4. Help you prepare the training of the model in desired language
  5. Prepare the code for hyperparameter tunning and cross-validation
  6. Ask for data visualization for given dataset
  7. Help create dashboard
  8. Create code for model deployment, model re-training or model consumption
  9. Ask for preparing custom functions and algorithm/function adjustments?

Now, that you have added and found the list of where and how it did help you, I would like to understand, how did this help you? Feel free to make a general comparison and add some explanations. And lastly, of course, add, if this has in any kind of way compromise your work as a data scientist (in a term of embracing it in – a positive way, or in terms of a negative experience).

Responsible usage

We have seen many controversies around Chat GPT emerge. Some European Union countries have banned it, and some will so be doing it too. And the question is not only its use (as the end of humanity and empathy) but also the misuse of personal data, privacy issues and leaking of relevant, corporate information.

Have you considered responsible usage of Chat GPT? Here is again the short list for helping you:

  1. The use of personal data retrieved from the model
  2. Inserting sensitive (personal or company) data
  3. Explaining the section of R, Python, Scala code, that is the property of your enterprise

Instead of this, have you tried using it more responsibly:

  1. Using pseudo code for explanation of the algorithm
  2. Using mock data rather than real data
  3. Giving pseudo-code in order to receive the documentation
  4. Skipping on sensible data (SQL schema, model information, sensible data)

So which cases have you come across? Did it have any consequences for you? Which other responsible use of Chat GPT have you also done?

My takeaways

ChatGPT offers interesting answers (based on my experience and search), and it is the next step from a google search of Stackoverflow. In other words, it gives you a more focused answer. When exploring and searching forums, you might find several different solutions for a single problem, whereas here, you have to ask for another solution. And respectively, it can give you answer faster, in comparison to browsing the web. In both cases, both sides have their advantages and disadvantages, but non will assure you, that the answer is correct!

I embrace this technology as an additional learning source. But I personally do not use it as my daily driver, despite trying it out a couple of times (with mixed results; working and nonworking/useless/meaningless). It can be super helpful for entry/junior positions, but the more experienced you are, the more abstract data science work you and the more complicated topics you cover, less frequently you will presumably use it.

Tagged with: , , ,
Posted in thoughts, Uncategorized

Little useless-useful R functions – Transforming dataframe to markdown table

Writing markdown documents outside RStudio (using the usual set of packages) has benefits and struggles. Huge struggle is transforming dataframe results into markdown table, using hypens and pipes. Ugghhh…

This useless functions takes R dataframe as input and prints out dataframe wrapped in markdown table.

iris[1:3,1:5]
# Result:
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa

And if we want this result in Markdown, it should look like:

 |Sepal.Length|Sepal.Width|Petal.Length|Petal.Width|Species|
 |---|---|---|---|---|
 |5.1|3.5|1.4|0.2|setosa|
 |4.9|3|1.4|0.2|setosa|
 |4.7|3.2|1.3|0.2|setosa|

This can be directly used in any editor. (Sidenote: if you decide to use Latex, same function can be created, just different ASCII chars needs to be used – ampersend and backslash).

The super lazy function 🙂

df_2_MD <- function(your_df){
  cn <- as.character(names(your_df))
  headr <- paste0(c("", cn),  sep = "|", collapse='')
  sepr <- paste0(c('|', rep(paste0(c(rep('-',3), "|"), collapse=''),length(cn))), collapse ='')
  st <- "|"
    for (i in 1:nrow(your_df)){
      for(j in 1:ncol(your_df)){
        if (j%%ncol(your_df) == 0) {
          st <- paste0(st, as.character(your_df[i,j]), "|", "\n", "" , "|", collapse = '')
        } else {
        st <- paste0(st, as.character(your_df[i,j]), "|", collapse = '')
        }
      }
    }
  fin <- paste0(c(headr, sepr, substr(st,1,nchar(st)-1)), collapse="\n")
  cat(fin)
}  


# run function
short_iris <- iris[1:3,1:5]
df_2_MD(short_iris)

As always, code is available on Github in  Useless_R_function repository. The sample file in this repository is here (filename: Dataframe_to_markdown.R) Check the repository for future updates.

Happy R-coding and stay healthy!

Tagged with: , , ,
Posted in Uncategorized, Useless R functions

Little useless-useful R functions – Using xspline to create wacky signatures

Nothing short of wacky usage of plot() function with xspline to interpolate the points, but still a “parameter” short of Bezier’s curve. 🙂

Given two random vectors, you can generate a plot that, xspline will smooth out the plot and give it a “signaturey” look.

The function itself is straightforward:

inscrp <- function(rep){
          x <- rnorm(rep)
          y <- rnorm(rep)
          plot(x,y, pch = 1, col = "white",  xaxt='n',  yaxt='n', ann=FALSE, frame.plot=FALSE)
          xspline(x,y, 1, draw = TRUE, col="blue")
}

but the results are splendid 🙂

Run the following:

par(mfrow = c(2,1))

inscrp(10)
inscrp(20)

par(mfrow = c(1,1))

And you will get either a signature kind of look or a doctor’s drug prescription. The main point is, both are impossible to read 🙂

Adding more points (e.g.: above 50) to inscrp() function might result in a Picasso drawing 🙂

As always, code is available on the Github in the same Useless_R_function repository. Check Github for future updates.

Happy R-coding and stay healthy!“

Tagged with: , , , ,
Posted in Uncategorized, Useless R functions

Tips for organising your R code

Keeping your R code organised is not as straightforward as one might think. Just think about the libraries, variables, functions, and many more. All these objects can be defined and later rewritten, some might get obsolete during the process.

This process is proven to be even more crucial when you are part of a larger group of engineers, and scientists, who collaborate with you.

Motivation

The most important step toward code reproducibility is to keep code organised, and atomic, storing it by layers or components and keeping up-to-date documentation. Because R language is a scripting language, organising files into directories and subdirectories is important for later re-usage, and collaboration with different departments in an organisation.

Ideally, the names of files and subdirectories are self-explanatory, so that one can tell at a glance what data files contain, what scripts do, and what came from what.

The following R tips are based on frequent problems, many organisations are facing. All R samples and themes are created to be fictional and can be attached to your organisational environment. Dataset used is the iris dataset, used to show the custom theme and use of functions. All images unless otherwise noted are by the author.

1. Organising R files

You can always call R files, libraries, functions and settings from a different file. This gives a great segue to creating a folder structure, where each developer can clone or access and get all necessary files, themes, and functions that the organisation is pushing.

Organising R folders and R scripts

Structuring R files, functions, data and many more is an essential step toward reproducibility.

Using Projects is a great place to start (also available in Posit — RStudio), but you can always create your own structure, that will help you with code and file organisation.

2. Installing and attaching R libraries

Installing and attaching R libraries is in almost all cases part of the R code. Whenever you are writing R code, there will be a point, that you will be referencing to an external library.

You don’t want to install and attach single or multiple libraries as the sample of the pseudo-code below.

install.package("a")
install.package("b")
library(a)
library(b)

Instead, you can create a string vector with libraries names and install them if they do not exist and attach them with a shorter code:

required_Packages_Install <- c("ggplot2", "caret", "leaflet", "plotly", "magick")

for(Package in required_Packages_Install){
  if(!require(Package,character.only = TRUE)) { 
      install.packages(Package, dependencies=TRUE)
  }
  library(Package,character.only = TRUE)
}

In many enterprise environments, you might have issues with installing some packages on your local disk. These packages may contain *.zip or *.exe files and the security policy will deny them. In this case, the best solution is to install dedicated folder(s), as introduced in “Organise R files”. You will have to add the installation path and loading path for the packages.

The next step is to use a TXT file and write down all the packages needed for an R project/script. Let’s create a requirements.txt file (just like with YAML, Python,…) and put it inside package names.

Consider two R packages to help you achieve installation from the requirements.txt file. These two are requiRements and versions. Both are similar if your package list is stored in requirements.txt, but the versions function will also take the package version as input, which brings a whole new capability. On the other hand, the base function install.packages() gives you the possibility to specify in detail the arguments as reposlib (path), destdir and many system variables. But these will be essential to store packages at the desired location.

If you want to simplify the process, you can always create a ZIP file of all working packages and restore and install it at any given time. In this case, it is advised to add the R version in the ZIP file as well.

Protip: use library() instead of require(). The first one will fail and give you a warning, whereas, the require() will silently fail, causing you later failures in the code.

3. Use the corporate themes

Every corporate environment should follow the theme, with predefined colours, table design, pixel-perfect diagrams and positions. With the ggplot package, you can create a theme, that will follow your design guidelines.

Furthermore, adding a condition to your theme, that the same colours will always reflect the same KPI, can also be achieved with the themes.

library(ggplot2)
iris <- iris
ggplot(data = iris, aes(Sepal.Length)) + geom_bar(color="grey", fill="red") +
  labs(x = "Length of Sepal", 
       y = "Count of flowers", 
       title = "Number of flowers \nby sepal length",
       caption = "Source: IRIS Dataset \nBase R Package")
theme_organisation <- function(){
font <- "Times New Roman"
  theme(
    panel.grid.major.x = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    panel.grid.minor.y = element_blank(),    
    panel.border = element_rect(colour = "black", fill = NA, linetype = 3, size = 0.5),
    panel.background = element_rect(fill = "#05a6f0"),
    legend.position = "bottom",
    plot.title = element_text( family = font, size = 20, face = 'bold', hjust = 0,vjust = 2),               
    axis.text = element_text(family = font,size = 9),                
    axis.text.x = element_text(margin=margin(5, b = 10))
  )
}
ggplot(data = iris, aes(Sepal.Length)) + geom_bar(color="grey", fill="red") +
labs(x = "Length of Sepal", 
     y = "Count of flowers", 
     title = "Number of flowers \nby sepal length",
     caption = "Source: IRIS Dataset \nBase R Package") +
  theme_organisation()
library(magick)
logo <- image_read("../Useless_R_functions/image/myiriscompany.png")
#adding the logo
grid::grid.raster(logo, x = 0.1, y = 0.02, just = c('left', 'bottom'), width = unit(1.9, 'inches'))

Besides graphs, tables can also follow a similar theme. R package flextable offers a great framework for creating tables with astonishing formats, layouts, cell formats and plotting capabilities.

library(flextable)
myiris <- flextable(head(iris), 
                         col_keys = c("Species", "Petal.Width", "Sepal.Length", "Sepal.Width" ))
myiris <- color(myiris, ~ Sepal.Length > 4.5, ~ Sepal.Length, color = "red")
myiris <- add_header_row(
  x = myiris, values = c("Name and Petals", "Measures on Sepal"),colwidths = c(2, 2))
myiris

With both packages, you will be able to create corporative reports, with capabilities to export them to different tools or formats (word, PDF, PowerPoint, HTML, and others).

4. Use coding practices and never forget to document

There are many sections, that will improve your code readability and reusability. I have grouped them into scopes, that each delivers better code.

Documenting code

  • Starting your code with an annotated description of what the code does when it is run will help you when you have to look at or change it in the future. Give the author name, date and, change log.
  • Loading all of the dependencies, packages and files in accordance with your file structure. Also, add the global environments and R engine version. In addition, a nice way to do this is also to indicate which packages are necessary to run your code.
  • Use setwd()to determine the files (script, project or packages) location, unless there are standards in your organisation, that make this obvious.
  • Use comments to mark off sections of code.
  • Comment your code with care. Comments should explain the why, not the what. Add comments to your function with the added description of all input arguments and result set.

Syntax practices

  • Place spaces around all operators (=, +, -, <-, boolean, etc).
  • Use <-, not =, for the assignment.
  • When using packages with similar function names, add a package name to the function: dplyr::filter() and a Filter() function from base R.
  • To improve readability, indent the code inside the curly braces. You can also use the formatR package to help you refactor and indent your code.
  • Factor out common operations rather than repeating them. And keep your code in smaller chunks. If a single function or loop gets too long, consider looking for ways to break it into smaller pieces.
  • There is a 80 characters line, that will help you comfortably fit code on a printed page at a reasonable size. If you find yourself running out of room, consider encapsulating some of the work in a separate function.

Naming convention

There are many naming conventions to choose from and all are ok, as long as you are using the selected one consistently. I will just list a few:

  • alllowercase: e.g. irisdataset
  • period.separated: e.g. iris.dataset
  • underscore_separated: e.g. iris_dataset
  • lowerCamelCase: e.g. addIrisDataset
  • UpperCamelCase: e.g. AddIrisDataset

Keep names concise and meaningful, nouns and verbs should be used in functions and variables. Give the function a verb, eg.: add, calculate, reduce, and give a variable a noun, eg.: calculatedNumbers, vectorOfValues.

In general, you can separate helper functions with a prefix of “.”. And also distinguish between local and global variables, data objects and functions.

Also, store your files with meaningful names and always store them in *.R

Posit — RStudio tips

  • Choose your IDE. Consider using Posit — RStudio. My second favourite for writing R code is Visual Studio code.
  • There is no need to save the current workspace if you are writing reproducible code. You should be able to reproduce the workspace by re-running your script.
  • Keep track of data, variables, and functions versions, and use also integrated facilities to access SVN or Github.
  • R projects are a great way to organize your script files, and your outputs consider using Markdown to prepare finalised reports of your analysis
  • Check the memory used, use a garbage collector (gc()) and it always helps to keep session information in your project.

The complete code is available on Github in this repository.

This Article was originally published on Medium : https://medium.com/@tomazkastrun/tips-for-organising-your-r-code-ebbda2309b8

Tagged with: ,
Posted in Useless R functions

Performance comparison of converting list to data.frame with R language

When you are working with large datasets performance comes to everyone’s mind. Especially when converting datasets from one data type to another. And choosing the right method can make a huge difference.

So in this case, I will be creating a dummy list, and I will convert the values in the list into data.frame.

Simple function to create a large list (approx. 46MB with 250.000 elements and each element consists of 10 measurements).

cre_l <- function(len,start,end){ return(round(runif(len,start,end),8)) }
myl2 <- list()
# 250.000 elements is approx 46Mb in Size
# 2.500 elements for demo
for (i in 1:2500){ myl2[[i]] <- (cre_l(10,0,50))  } 

The list will be transformed into data.frame in such a way that data.frame will have 10 (ten) variables with a number of observations corresponding with the length of the list. To give you the perspective, this code does exactly this:

for (i in 1:250){ myl2[[i]] <- (cre_l(10,0,50))  } 
df <- data.frame(do.call(rbind, myl2))

And you end up from list to data.frame:

Fig 1: from List to Data.frame

There are many ways to convert a list to data.frame. But where it becomes important is, when your list is a larger object. I have written 8 ways to do the conversion (and I know there are at least 20 more).

By far the fastest method were do.call and sapply ways. Both outperforming all other methods with the following snippets:

#do.call
data.frame(do.call(rbind, myl2))
#sapply
data.frame(t(sapply(myl2,c)))

Both methods were consistent with larger list conversions.

And the worst was for loop solution, reduce and as.data.frame. There are no surprises here, just a pause here, that for loop was performing so poorly due to constant row binding to an existing data.frame.

Complete comparison and graph code:

library(data.table)
library(plyr)
library(ggplot2) 

res <- summary(microbenchmark::microbenchmark(
    do_call_solution = {
          sol1 <- NULL
          sol1 <- data.frame(do.call(rbind, myl2))
    }, 
    for_loop_solution = { 
        sol2 <- NULL
        for (i in 1:length(myl2)){ sol2 <- rbind(sol2, data.frame(t(unlist(myl2[i])))) }
    },
     ldply_to_df = { 
          sol3 <- NULL 
          sol3 <- ldply(myl2, sol3)
     },
    ldply_to_c = {
      sol4 <- NULL
      sol4 <- ldply(myl2, c())
    },
    sapply = {
        sol5 <- NULL
        sol5 <- data.frame(t(sapply(myl2,c)))
        },
    recude = {
      sol6 <- NULL
      sol6 <- data.frame(Reduce(rbind, myl2))
    },
    data_table_rbindlist = {
      sol7 <- NULL
      sol7 <- data.frame(t(rbindlist(list(myl2))))
    },
    as_data_frame = {
      sol8 <- NULL
      sol8 <- data.frame(t(as.data.frame(myl2)))
    },
        times = 10L))

# producing graph
ggplot(res, aes(x=expr, y=(mean/1000/60))) + geom_bar(stat="identity", fill = "lightblue") +
     coord_flip() +
     labs(title = "Perfomance comparison", subtitle = "Converting list with 2.500 element to data.frame ") +
     xlab("Methods") + ylab("Conversion time (s)") +
     theme_light() +
     geom_text(aes(label=(round(mean/1000/60*1.000,3))))

Fig 2 : Comparison results with different converting methods on list with 2500 elements

I have also removed the slowest performing conversions and created a 250.000 elements list and compared only the fastest methods with 10 consecutive runs (using the microbenchmark library) and mean values.

Fig 3 : Comparing the fastest methods on a list with 250000 elements

So, using for loops is super slow, and do.call with rbind or sapply will for sure deliver the best performances.

As always, the code is available at Github in repository Useless_R_function and the file is here.

Happy R-coding!

Tagged with: , , ,
Posted in Uncategorized, Useless R functions

Little useless-useful R functions – Mandelbrot set

The Mandelbrot set is a set of complex numbers  c for which the function {\displaystyle f_{c}(z)=z^{2}+c} does not diverge to infinity when iterate from z=0, and therefore remains bounded in absolute value.

For little stretching, we can create a Mandlebrot set and draw it with image function.


MandelBrotImage <- function(){
    
    cols <- colorRampPalette(c("white","black","white","grey","black"))(11)
    n <- 400
    x <- seq(-2, 1, length.out=250)
    y <- seq(-1.5, 1.5, length.out=250)
    c <- outer(x,y*1i,"+")
    z <- matrix(0.0, nrow=length(x), ncol=length(y))
    k <- matrix(0.0, nrow=length(x), ncol=length(y))
    
    
    for (rep in 1:n) { 
      for (i in 1:250) { 
        for (j in 1:250) { 
          if(Mod(z[i,j]) < 2 && k[i,j] < n) {
            z[i,j] <- z[i,j]^2 + c[i,j]
            k[i,j] <- k[i,j] + 1
          }
        }
      }
    }
    image(x,y,k, col=cols, axes = FALSE, xlab = "" , ylab = "" )
}


# run function
MandelBrotImage()

As always, code is available on Github in  Useless_R_function repository. The sample file in this repository is here (filename: MandelbrotSet.R) Check the repository for future updates.

Happy R-coding and stay healthy!

Tagged with: , , ,
Posted in Uncategorized, Useless R functions

Advent of 2022, Day 24 – Further reading and code material for Azure Machine Learning

In the series of Azure Machine Learning posts:

If you want to immerse in further reading and additional knowledge, here are some links. Here are just couple.

Microsoft Learn website:

Github:

There are great GitHub repositories with great end-to-end solutions.

Books:

Blogposts:

Courses:

Certifications:

  • DP 100
  • AI 102, AI 900

Compete set of code, documents, notebooks, and all of the materials will be available at the Github repository: https://github.com/tomaztk/Azure-Machine-Learning

The series ends with this blogpost 🙂 Wish you all a merry Christmas and a happy new year 2023.

Posted in Azure Machine Learning, Uncategorized

Advent of 2022, Day 23 – Working with R

In the series of Azure Machine Learning posts:

This post is super short 😦

R language and Azure Machine Learning SDK for R was deprecated a year ago (end of 2021). But R can be still used for training and deployment by using Azure Machine learning CLI 2.0!

Furthermore, R language can be used in Machine Learning Designer, for data preparation, data wrangling and statistical analysis.

Fig 1.: Using R in Azure Machine Learning Designer

AzureML SKD for R can be found on this page: https://azure.github.io/azureml-sdk-for-r/index.html.

Another way to user R is to use Posit (RStudio) on your compute instance. When you will be creating a new compute instance and you will get to the application usages

Fig 2.: Applications available for compute instance

You can also install RStudio (Posit). But you will need a R Workbench license in order to install it.

On the Posit website, you can purchase the product

Fig 3.: Posit + Azure

And from there on, you need to add an additional application when configuring a compute instance.

Fig 4.: Installing R Workbench by Posit

You enter the license key and the Posit (Rstudio) will be available as an application and development environment for your disposal.

So, there are ways to use R in Azure ML, and you can always choose your preferred tool.

Compete set of code, documents, notebooks, and all of the materials will be available at the Github repository: https://github.com/tomaztk/Azure-Machine-Learning

Happy Advent of 2022!

Tagged with: , , , , ,
Posted in Azure Machine Learning, Uncategorized
Follow TomazTsql on WordPress.com
Programs I Use: SQL Search
Programs I Use: R Studio
Programs I Use: Plan Explorer
Rdeči Noski – Charity

Rdeči noski

100% of donations made here go to charity, no deductions, no fees. For CLOWNDOCTORS - encouraging more joy and happiness to children staying in hospitals (http://www.rednoses.eu/red-noses-organisations/slovenia/)

€2.00

Top SQL Server Bloggers 2018
TomazTsql

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

Discover WordPress

A daily selection of the best content published on WordPress, collected for you by humans who love to read.

Revolutions

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

tenbulls.co.uk

tenbulls.co.uk - attaining enlightenment with the Microsoft Data and Cloud Platforms with a sprinkling of Open Source and supporting technologies!

SQL DBA with A Beard

He's a SQL DBA and he has a beard

Reeves Smith's SQL & BI Blog

A blog about SQL Server and the Microsoft Business Intelligence stack with some random Non-Microsoft tools thrown in for good measure.

SQL Server

for Application Developers

Business Analytics 3.0

Data Driven Business Models

SQL Database Engine Blog

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

Search Msdn

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

R-bloggers

Tomaz doing BI and DEV with SQL Server and R, Python, Power BI, Azure and beyond

MsSQLGirl

Bringing value to data & insights through experiences users love

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Data Until I Die!

Data for Life :)

Paul Turley's SQL Server BI Blog

sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Power BI, Dashboards & Visualization since 2009

Grant Fritchey

Intimidating Databases and Code

Madhivanan's SQL blog

A modern business theme

Alessandro Alpi's Blog

DevOps could be the disease you die with, but don’t die of.

Paul te Braak

Business Intelligence Blog

Sql Insane Asylum (A Blog by Pat Wright)

Information about SQL (PostgreSQL & SQL Server) from the Asylum.

Gareth's Blog

A blog about Life, SQL & Everything ...