When testing functions or predictions in R, it usually helps to have some sample or random data on hand. Many packages, as well as base R itself, ship with good sample datasets, but let me show you a simple way of generating a data frame of random data.
We will generate random data using the rnorm function, which draws random values from a normal distribution with a given mean and standard deviation. We then use the sapply function (which applies a function to each element of a list or vector) to draw, for every x, a y value from a normal distribution centered on the linear function 2x+6. Similar functions are lapply and vapply.
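As a quick illustration of how sapply, lapply and vapply differ, here is a minimal sketch on a three-element vector. The names vals and lin are made up for this example and are not part of the original code.

```r
# Hypothetical toy input and a deterministic linear function (no noise)
vals <- c(1, 2, 3)
lin <- function(v) 2 * v + 6

s <- sapply(vals, lin)              # simplifies the result to a numeric vector
l <- lapply(vals, lin)              # always returns a list, one element per input
v <- vapply(vals, lin, numeric(1))  # like sapply, but the result type is declared up front

print(s)
print(l)
print(v)
```

vapply is the stricter choice: it stops with an error if the function ever returns something other than a single number, while sapply silently changes the shape of its result.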
x <- rnorm(1000, 10, 5)
y <- sapply(x, function(x) rnorm(1, 2 * x + 6, 10))
dat_set <- data.frame(x, y)
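Since rnorm accepts a vector of means, the same kind of data can also be generated without sapply. The sketch below is an alternative, not the approach used above; set.seed and the names x_demo, y_vec and dat_vec are additions made only for this example.

```r
set.seed(42)  # assumption: fixed seed so the example is reproducible

x_demo <- rnorm(1000, 10, 5)

# rnorm is vectorized over its mean argument: one draw per element of x_demo,
# each centered on 2 * x + 6 with standard deviation 10
y_vec <- rnorm(length(x_demo), 2 * x_demo + 6, 10)

dat_vec <- data.frame(x = x_demo, y = y_vec)
head(dat_vec)
```

For 1000 rows the difference in speed is negligible, but the vectorized form tends to scale better for large samples.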
After this, we can visualize the dataset dat_set to see the dispersion.
library(ggplot2)
ggplot() + geom_point(data = dat_set, aes(x = x, y = y), size = 1, color = 'brown')
Visualization looks like:
One can tell that the data follows the linear function y=2x+6, with the y values scattered around that line by the normally distributed noise (standard deviation 10) added inside the sapply call.
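One way to check this numerically is to fit a linear model and look at the estimated coefficients. This is a minimal sketch that regenerates the data with a fixed seed (set.seed is an assumption added here for reproducibility); lm and coef come with base R.

```r
set.seed(42)  # assumption: fixed seed, not part of the original code

x <- rnorm(1000, 10, 5)
y <- sapply(x, function(x) rnorm(1, 2 * x + 6, 10))
dat_set <- data.frame(x, y)

# Fit y as a linear function of x and inspect intercept and slope
fit <- lm(y ~ x, data = dat_set)
print(coef(fit))
```

The slope should come out close to 2 and the intercept in the neighborhood of 6; the intercept is the less precise of the two because the noise is large relative to the sample.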