Introduction to R Cloud Computing

Introduction

Almost every domain and business is being transformed by SMAC, a collective term for the changes taking place in Social, Mobile, Analytics and Cloud. The impact of this change has been widespread, affecting organizations, people and products. In this article, we will help you take your analytics capabilities to the next level by using cloud computing.

We explain cloud computing with R programming and RStudio using a step-by-step approach. You will also learn about the benefits of using R in the cloud compared to a traditional desktop, client or local server architecture.

Cloud: an enabling platform for data science

Cloud computing has seen unprecedented growth and adoption in recent years. It has allowed organizations to scale quickly and easily. Using cloud services, companies collect, store and analyze volumes of data that were almost unthinkable before. And with services from companies like Amazon, Google and Microsoft, the cloud is now accessible to any analyst.

Gone are the days when you would buy a server of a particular capacity and then need to buy a new one once that capacity was exhausted. For instance, most of the analysis I normally do is based on a few GB of data, small enough to run directly on my laptop. Recently, however, Microsoft posted ~400 GB of malware and virus data on Kaggle. If I had tried to solve this problem on my laptop, I would have exhausted my internet data plan just downloading the dataset. Analyzing it is a separate challenge in itself.

Even if I had downloaded the dataset, the only way to do any meaningful computation without the cloud would have been to buy a new machine, which is not a very practical solution. This is where cloud computing comes in!

Must read: step-by-step guide to learn to program in R

Why do you need the 'cloud'?

As discussed in the example above, the cloud is a cheaper way to handle big data than desktops, laptops or local servers. Wait. Big Data? Yes! Big Data is a general term for data whose volume, variety and velocity are greater than those of conventional data sources, and which requires distributed computing such as Hadoop and non-RDBMS storage such as NoSQL databases.

Must read: a beginner's guide to using big data using MongoDB

What is cloud computing?

According to the NIST Definition of Cloud Computing:

“Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics, three service models, and four deployment models.”

Cloud computing has three service models:

  • Infrastructure as a service (IaaS)
  • Platform as a service (PaaS)
  • Software as a service (SaaS)

IaaS – To deploy their applications, cloud users install operating system images and their application software on the cloud infrastructure. In this model, the cloud user patches and maintains the operating systems and application software.

PaaS – Cloud providers offer a computing platform, which usually includes the operating system, the runtime environment of the programming language, the database and the web server. Application developers can develop and run their software solutions on a cloud platform without the cost and complexity of purchasing and managing the underlying hardware and software layers.

SaaS – In software as a service (SaaS), users have access to databases and application software. Cloud providers manage the infrastructure and platforms that run the applications. SaaS is sometimes called “software on demand”.

What are the advantages and disadvantages of using cloud computing with R versus other applications?

Python is free just like R, but the main reason R scores over it is that R's library of statistical packages is much more extensive. SAS remains the leading language for corporate analytics on the desktop, but it is still expensive for small businesses, and its annual license structure, as opposed to a one-time license fee, is a significant disadvantage in terms of expenditure commitment.

Must read: a quick guide on SAS vs R vs Python

What are the advantages of using R in the cloud compared to the desktop?

  1. R is limited to handling data no larger than the available RAM, so the cloud offers a quick way to do data science on big data with R: simply increase the RAM on the virtual machine instance. The cloud offers a range of RAM options that are simply not affordable on a local machine.
  2. For large datasets, it is better to work in the cloud than to download the dataset, process it and then score it locally. For instance, if a competition uses 30 GB of data, you are better off working in the cloud. The cloud is therefore a great way to learn about big data without having to worry about internet speed.
  3. The cloud has much better bandwidth, so installing software and transferring data is much faster in the cloud.
  4. You can use additional services like AzureML with R in the cloud instead of building your own machine learning service from scratch. See this tutorial for more information.
  5. The cloud scales much better with changes in data volume or velocity.
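
To make the first point concrete, here is a minimal shell sketch that checks whether a dataset is likely to fit in a machine's RAM. The function name fits_in_ram, the 8 GB and 64 GB machine sizes, and the headroom factor of 2 are all assumptions chosen for illustration; R does not define any such rule, and the real memory needed depends on the analysis.

```shell
#!/bin/sh
# fits_in_ram DATA_GB RAM_GB [HEADROOM]
# Succeeds when DATA_GB * HEADROOM <= RAM_GB.
# HEADROOM (default 2) is an assumed safety factor, not a rule from R itself.
fits_in_ram() {
    data_gb=$1
    ram_gb=$2
    headroom=${3:-2}
    [ $((data_gb * headroom)) -le "$ram_gb" ]
}

# Example: the 30 GB competition dataset on a hypothetical 8 GB laptop
# versus a hypothetical 64 GB cloud instance.
for ram in 8 64; do
    if fits_in_ram 30 "$ram"; then
        echo "30 GB dataset: ${ram} GB of RAM looks sufficient"
    else
        echo "30 GB dataset: ${ram} GB of RAM is too small"
    fi
done
```

A check like this only tells you which instance size to start with; if the work outgrows it, you can move to an instance with more RAM.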

How to use R Programming in the cloud?

You can create an instance (a virtual machine that you access remotely) in the Amazon cloud, Microsoft Azure or Google Cloud. You can then install R just as you would on your local desktop, connecting to the remote machine via SSH or Remote Desktop.
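
As a rough illustration of the SSH route, the sketch below only prints the commands you would type, so it is safe to run anywhere. The key file my-key.pem and the address 203.0.113.10 are hypothetical placeholders, and the helper ssh_command is a name made up for this example; ec2-user is the default login on Amazon Linux, while other images use other user names.

```shell
#!/bin/sh
# Print the ssh command for a given key file, user, and host.
# All example values below are hypothetical placeholders.
ssh_command() {
    printf 'ssh -i %s %s@%s\n' "$1" "$2" "$3"
}

# ssh refuses private keys that other users can read, so restrict the file first.
echo "chmod 400 my-key.pem"

# ec2-user is the default login on Amazon Linux; other images use other names.
ssh_command my-key.pem ec2-user 203.0.113.10
```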

Below is a step-by-step process to create a cloud instance on Amazon Web Services.

Note: Amazon has a free tier that allows you to try the Amazon cloud for free for one year. However, this applies only to micro instances, which have very little RAM and very little disk space. For more RAM and more storage, you must pay more. To see the different instances and their hourly prices, you can visit the page here. Rates are charged in compute units, but this website makes it easy to calculate costs.
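
Since pricing is hourly, a back-of-the-envelope estimate is simply rate × hours per day × days. The sketch below assumes a hypothetical rate of $0.10 per hour, which is not a real AWS price, and the function name estimate_cost is made up for this example; substitute the rate of the instance type you pick.

```shell
#!/bin/sh
# estimate_cost HOURLY_RATE HOURS_PER_DAY DAYS
# Prints the estimated cost with two decimals. The rates used below are
# hypothetical and only illustrate the arithmetic.
estimate_cost() {
    LC_ALL=C awk -v rate="$1" -v hours="$2" -v days="$3" \
        'BEGIN { printf "%.2f\n", rate * hours * days }'
}

estimate_cost 0.10 24 30   # instance left running all month -> 72.00
estimate_cost 0.10 4 20    # instance run 4 hours on 20 working days -> 8.00
```

The gap between the two figures is why it pays to stop an instance whenever you are not using it.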

You must first create your Amazon ID. Once that is done, follow the steps below to create a cloud instance on Amazon Web Services:

    1. Log in to the Amazon Web Services (AWS) console.
    2. Click Launch Instance.
    3. Choose the operating system for the virtual machine that you will access remotely. Here I have chosen Amazon Linux.
    4. Choose the instance type (based on the RAM and compute capacity you need). See here to compare prices.
    5. Create a security key. This is necessary for a secure login to the remote machine. Note that you can use Remote Desktop for Windows operating systems, but you will need to use SSH for Linux instances.
    6. Click Launch Instance.
    7. Connect to the instance using your security key, following the instructions given.
    8. Now work on your remote machine as you would on a local machine.
    9. Install R on the instance.
    10. Once you have finished your work, remember to stop the instance, or you may incur a high monthly bill.
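
The console steps above can also be approximated from the command line with the AWS CLI, which this article does not otherwise use. The sketch below is a dry run: by default it only prints the commands instead of calling AWS. The key name my-key and the IDs ami-xxxxxxxx and i-xxxxxxxx are placeholders, t2.micro is just an example instance type, and the helper names run, launch_instance and stop_instance are made up for this example. It also assumes the key pair already exists.

```shell
#!/bin/sh
# Dry run by default: commands are printed with a leading "+", not executed.
# Set DRY_RUN=0 only after replacing the placeholders and configuring the AWS CLI.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

KEY_NAME="my-key"            # hypothetical key pair name (step 5)
AMI_ID="ami-xxxxxxxx"        # placeholder image ID for the chosen OS (step 3)
INSTANCE_TYPE="t2.micro"     # example instance type (step 4)
INSTANCE_ID="i-xxxxxxxx"     # placeholder; the real ID is reported at launch

# Steps 2 to 6: launch one instance with the chosen image, type and key.
launch_instance() {
    run aws ec2 run-instances --image-id "$AMI_ID" \
        --instance-type "$INSTANCE_TYPE" --key-name "$KEY_NAME" --count 1
}

# Step 10: stop the instance when the work is done.
stop_instance() {
    run aws ec2 stop-instances --instance-ids "$INSTANCE_ID"
}

launch_instance
stop_instance
```

Keeping the dry run as the default means nothing is launched, and nothing is billed, until you deliberately switch it off.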

You can choose on-demand instances, or reserved instances (you reserve a virtual machine for a fixed period of time and, in return, get a considerable discount).

How to use R in the cloud using RStudio?

RStudio Server runs only on Linux. We therefore choose a Linux instance in the cloud and configure RStudio Server on it. We can then connect to the remote RStudio Server through a browser and use it just as we would the desktop version.

Here is a step-by-step way to run RStudio in the cloud.

  • Note: we have already installed R using sudo yum install R
  • Download RStudio Server to your virtual machine and then install it
$ wget http://download2.rstudio.org/rstudio-server-rhel5-0.99.442-i686.rpm
$ sudo yum install --nogpgcheck rstudio-server-rhel5-0.99.442-i686.rpm
  • Verify the installation
$ sudo rstudio-server verify-installation
  • Open port 8787 through the security group in the AWS console (Security Groups in the left pane) by creating a custom TCP rule (click Edit in the tab below)

  • Create a new user with a new password from the SSH terminal of your cloud instance
$ sudo useradd newuser1
$ sudo passwd newuser1
  • The public IP address of the cloud instance can be found in the Instances tab on the left side.

  • Open your browser at IPAddress:8787 and log in with the user ID and password created earlier

  • You are now ready to use R using the cloud through a browser
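
If you prefer the command line to the console for the port rule, the AWS CLI offers an equivalent. The sketch below is again a dry run that prints the command rather than executing it. The security group ID sg-xxxxxxxx and both IP addresses are placeholders, and the helper names run, open_rstudio_port and rstudio_url are made up for this example. Limiting the rule to your own address with /32 is a suggestion on my part, not something the console steps above require.

```shell
#!/bin/sh
# Dry run by default: commands are printed with a leading "+", not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

SECURITY_GROUP_ID="sg-xxxxxxxx"   # placeholder security group of the instance
MY_IP="198.51.100.7"              # hypothetical: your own public IP address
INSTANCE_IP="203.0.113.10"        # hypothetical: public IP of the instance

# Add a custom TCP rule for port 8787, the port RStudio Server listens on.
open_rstudio_port() {
    run aws ec2 authorize-security-group-ingress \
        --group-id "$SECURITY_GROUP_ID" \
        --protocol tcp --port 8787 \
        --cidr "${MY_IP}/32"
}

# Build the address to open in the browser.
rstudio_url() {
    printf 'http://%s:8787\n' "$1"
}

open_rstudio_port
rstudio_url "$INSTANCE_IP"
```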

Using R through the Bioconductor cloud

The Bioconductor cloud is an amazing way to launch R in the cloud. You can see the instructions here.

What are the other cloud computing options?

You can also use Google Cloud and Microsoft Azure. However, most of the space is dominated by Amazon Web Services.

Any examples of using R with platforms and other software as a service?

Yes, we can use Azure Machine Learning with R in the cloud, and also use Google BigQuery with R.

Any example of Big Data using R in the cloud?

Yes, there are many examples: Resource 1 and Resource 2.

Final notes

By now, you should have an overview of how to do cloud computing with R and RStudio. I really enjoyed writing this article and selecting the helpful resources in it. It also covers the questions that people often ask while learning cloud computing in R; I have tried to cover them all here. In my personal experience, demonstrating the cloud with R is relatively easier than with other software.

I hope this article helped you get familiar with cloud computing. We would love to hear from you. Did you find it useful? Feel free to post your thoughts in the comments below.

If you like what you just read and want to continue your analytics learning, subscribe to our emails, follow us on Twitter or like our Facebook page.
