SQL in R

Install sqldf package as below :

install.packages("sqldf")

We load sqldf package as :

library(sqldf)

We are working on UCBAdmissions dataset. We load UCBAdmissions dataset .

data("UCBAdmissions")

The data contains students admission status as Admit , Gender , Dept as Department , Freq as Frequency of students.

We convert dataset to data frame .

ucb <- as.data.frame(UCBAdmissions)

We are using sqldf() to run SQL commands. We select all columns of ucb .

sqldf("select * from ucb")

We can use conditional statement by using where statement . We want to see all Female observation. We select all columns from ucb data frame where Gender is equal to "Female".

sqldf("select * from ucb where Gender = 'Female'")

We want to see all Admitted students.

sqldf("select * from ucb where Admit = 'Admitted'")

It shows admitted students department-wise in decreasing number of students admitted.

sqldf("select * from ucb where Admit = 'Admitted' order by Freq DESC")

We want to see list of departments .

sqldf("select distinct Dept from ucb")

Output:

It shows total number of students admitted .

sqldf("select sum(Freq) from ucb where Admit = 'Admitted'")

Output:

It shows total number of students rejected in admission.

sqldf("select sum(Freq) from ucb where Admit = 'Rejected'")

Output:

It shows total number of males got admitted .

sqldf("select sum(Freq) as total_dudes from ucb where Admit = 'Admitted' AND Gender = 'Male'")

Output:

It shows total frequency of females who are rejected.

sqldf("select sum(Freq) as total_ladies from ucb where Admit = 'Rejected' AND Gender = 'Female'")

Output:

It shows average number of admitted student per department .

sqldf("select Dept, avg(Freq) as average_admitted from ucb where Admit = 'Admitted' group by Dept")

Output:

It shows minimum number of observations of students rejected .

sqldf("select min(Freq) from ucb where Admit = 'Rejected'")

Output:

We create a majors data frame as:

majors <- data.frame(major = c("math", "biology", "engineering", "computer science", "history", "architecture"), Dept = c(LETTERS[1:5], "Other"), Faculty = round(runif(6, min = 10, max = 30)))

It shows majors dataset as:

sqldf("select * from majors")

Output:


 

It shows frequency count in between 25 and 120.

sqldf("select * from ucb where Freq between 25 AND 120")

Output:

It shows all gender observations. We used "%" to match one or more characters .

sqldf("select * from ucb where Gender Like '%male%'")

Output:

We used "Ma%" which means Gender starts with Ma should be selected .

sqldf("select * from ucb where Gender Like 'Ma%'")

Output:

It shows observations where Gender is Female and Freq is greater than 125.

sqldf("select * from ucb where Gender = 'Female' AND Freq >= 125 ")

Output:

We are using nested statements . First , the inner query inside "()" should executed first. Then , external query executed. The internal query is " select max(Freq) from ucb where Admit = 'Admitted')" . It shows maximum frequency of admitted students .

In external query , it shows department where frequency is maximum. The output shows department name which have maximum number of frequency of admitted students.

sqldf("select Dept from ucb where Freq = (select max(Freq) from ucb where Admit = 'Admitted')")

Output:

We have internal query as " select max(Freq) from ucb where Gender = 'Female')" . We want to see maximum number of frequency of females . So , in external query we want to see department where frequency is maximum. The output shows department name where females have large frequency .

sqldf("select Dept from ucb where Freq = (select max(Freq) from ucb where Gender = 'Female')")

Output:

We want to check department name having maximum number of frequency in majors dataset.

sqldf("select Dept from majors where Faculty = (select max(Faculty) from majors)")

Output:

We want to join ucb and majors dataset where both have common departments.

sqldf("select * from ucb inner join majors on ucb.Dept = majors.Dept")

Output:

We are applying left joint to returns all rows from the left dataset , and rows have common department from right dataset. It shows all rows from ucb dataset and rows with common department from majors dataset.

sqldf("select * from ucb left join majors on ucb.Dept = majors.Dept")

Output:


 

Comments

There are no entries yet.
Please enter the code
* Required fields

Subscribe to our mailing list

* indicates required

Looking for Corporate Training ? Reach out to us at Akriti.Lal@instrovate.com

Reach out to us if you are looking for Corporate Training to Build The Next Generation Analytical Workforce with an in-depth understanding of  Exploratory Data Analysis , Data Visualisation, Data Analytics , AI First , Machine Learning & Deep Learning Training & Consulting helping them to take Data Informed Decision at each stage of the business. 

We understand that At the present times , the Entire Industry is in a Tranformation stage with the Softwares  being rebuilt with Artificial Intelligence Capabilities .

We need SMART WORKFORCE for the SMART SOFTWARES to reap the maximum return . 

 

Whatsapp at +91-9953805788 or email at - akriti.lal@instrovate.com if you would like to know more . 

 

  • Corporate Tableau Training in Gurgaon
  • Corporate Data Analytics Training in Gurgaon
  • Corporate Microsoft Power BI Training in Gurgaon
  • Corporate Microstrategy Training in Gurgaon
  • Corporate Google Data Studio Training in Gurgaon
  • Corporate Python Training in Gurgaon
  • Corporate Advance Analytics in R Programming Training in Gurgaon
  • Corporate Machine Learning Training in Gurgaon
  • Corporate Deep Learning Training in Gurgaon
  • Corporate Data Visualization Training in Gurgaon

 

Address

Instrovate Technologies

Noida, Gurgaon

 

akriti.lal@instrovate.com 

Hit Your Refresh Button To Rise Higher
Print Print | Sitemap
All rights reserved @ Instrovate Technologies
GSTIN : 09AAECI6862K1Z1

Terms Of Services    Privacy Policy    Disclaimer     Refund Policy    


Call

E-mail