12 Handy Solutions
12.1 Tea solution
First ensure you have “tea_df” loaded (remember that your working directory must be set to the correct location first). It also needs to be preprocessed with the gsub() function.
tea_df <- read.csv("Chapter_10-11/tea_consumption.csv", check.names=FALSE)
tea_df$lb <- tea_df$KG_LB_annual_per_capita
tea_df$lb <- gsub(pattern = ".*_", replacement = "", tea_df$lb)
colnames(tea_df)[3] <- "kg"
tea_df$kg <- gsub(pattern = "_.*", replacement = "", tea_df$kg)
tea_df$kg <- as.numeric(tea_df$kg)
tea_df$lb <- as.numeric(tea_df$lb)
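To see what the two gsub() patterns do, here is a minimal sketch on a single made-up value. The string "0.50_1.10" is hypothetical; it only assumes that the combined column stores kilograms and pounds joined by an underscore, as the patterns above imply.

```r
# Hypothetical combined value: kilograms, underscore, pounds
x <- "0.50_1.10"

# ".*_" matches everything up to and including the underscore,
# so replacing it with "" leaves the pounds part
lb <- gsub(pattern = ".*_", replacement = "", x)

# "_.*" matches the underscore and everything after it,
# so replacing it with "" leaves the kilograms part
kg <- gsub(pattern = "_.*", replacement = "", x)

# gsub() returns character strings, hence the as.numeric() step
kg_num <- as.numeric(kg)
lb_num <- as.numeric(lb)
```

This is why the preprocessing ends with as.numeric(): without it, round() and other arithmetic on the columns would fail.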
Remember that there are many ways to carry this out, but here is one.
First create a vector with the names of the countries we want:
countries <- c("Ireland", "United Kingdom", "France", "Australia")
Set the row names to the countries for easy indexing.
Note: row names must be unique, which is the case here.
row.names(tea_df) <- tea_df$Country
Create a data frame that only contains our countries of interest. We use the vector as an index for the rows.
tea_df_subset <- tea_df[countries,]
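Indexing rows by name can be tried out on a small example first. The sketch below uses a made-up data frame (toy_df and its values are hypothetical, not the real tea data) to show that the rows come back in the order given by the character vector.

```r
# Toy data frame with made-up values (not the real tea data)
toy_df <- data.frame(Country = c("A", "B", "C"),
                     kg = c(1.234, 2.345, 3.456),
                     stringsAsFactors = FALSE)

# Use the country names as row names so rows can be looked up by name
row.names(toy_df) <- toy_df$Country

# Index rows with a character vector; the result follows the vector's order
wanted <- c("C", "A")
toy_subset <- toy_df[wanted, ]
```

Because the result follows the order of the vector, the printed statements will appear in the order in which the countries were listed.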
Because we are working with a temporary variable here, we will overwrite the kg column so that the values contain only one decimal place.
tea_df_subset$kg <- round(x = tea_df_subset$kg, digits = 1)
The last step is to print out the statement. We will use paste0(), which is exactly like paste() but with the sep = option set to "".
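As a quick check of that equivalence, the sketch below builds the same strings both ways. The vectors country and rank hold made-up values for illustration only.

```r
# Made-up values for illustration
country <- c("A", "B")
rank <- c(1, 2)

# paste0() concatenates its arguments with no separator
s1 <- paste0(country, " is number ", rank)

# paste() with sep = "" produces the same strings
s2 <- paste(country, " is number ", rank, sep = "")

# Both functions are vectorised: one string per element of the inputs
same <- identical(s1, s2)
```

Since both functions are vectorised, a single call produces one statement per country.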
paste0(tea_df_subset$Country, " is the number ", tea_df_subset$Rank,
" consumer of tea. It consumes ", tea_df_subset$kg, "kg of tea annually per capita.")
12.2 English speakers across the world solution
First make sure the data frame is created. Remember to set your working directory to where the file is.
english_df <- read.csv("Chapter_10-11/english_speaking_population_of_countries.tsv",
                       sep = "\t",
                       row.names = 1,
                       check.names = FALSE)
# Replace missing values with 0
english_df[is.na(english_df)] <- 0
# Keep only the rows where the two categories add up to the stated total
english_complete_datasets_df <-
  english_df[
    (english_df$`As first language` + english_df$`As an additional language`) ==
      english_df$`Total English speakers`,
  ]
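The NA replacement and the row filter can be tried on a small example. The sketch below uses a made-up data frame (toy_df, its column names, and its values are hypothetical) in which one row has a missing value and one row does not add up.

```r
# Toy data with made-up values: row B has a missing value,
# row C does not add up to its stated total
toy_df <- data.frame(first = c(10, NA, 5),
                     additional = c(5, 20, 5),
                     total = c(15, 20, 12),
                     row.names = c("A", "B", "C"))

# is.na() returns a logical matrix; using it as an index
# replaces every NA cell with 0
toy_df[is.na(toy_df)] <- 0

# Keep only rows where the two parts sum to the total
complete_df <- toy_df[(toy_df$first + toy_df$additional) == toy_df$total, ]
```

Note that a row with a missing value can still pass the filter once the NA has become 0, as row B does here. The exact comparison with == is fine for whole-number counts like these.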
Create a new data frame containing only countries with an eligible population of more than 100 million.
english_100mil_df <- english_complete_datasets_df[
  english_complete_datasets_df$`Eligible population` > 100000000,
]
Create a column with the fraction of total English speakers relative to the eligible population.
english_100mil_df$`Fraction of population that are English speakers` <-
  english_100mil_df$`Total English speakers` /
  english_100mil_df$`Eligible population`
Create a row with the mean values.
english_100mil_df["Mean",] <- colMeans(english_100mil_df)
Create a row with the totals. Only rows 1 to 7 are summed so that the "Mean" row is not included.
english_100mil_df["Total",1:4] <- colSums(english_100mil_df[1:7,1:4])
Create the total fraction of English speakers.
english_100mil_df["Total","Fraction of population that are English speakers"] <-
  english_100mil_df["Total","Total English speakers"] /
  english_100mil_df["Total","Eligible population"]
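The summary rows can be sketched on a small example to show why the total fraction is calculated separately. The data frame toy_df, its column names, and its values below are made up for illustration.

```r
# Toy data with made-up values
toy_df <- data.frame(population = c(100, 300),
                     speakers = c(50, 30),
                     row.names = c("A", "B"))
toy_df$fraction <- toy_df$speakers / toy_df$population

# Assigning to a row name that does not exist yet appends a new row
toy_df["Mean", ] <- colMeans(toy_df)

# Sum only the original rows so the "Mean" row is not counted;
# the fraction column of the new "Total" row is left as NA for now
toy_df["Total", 1:2] <- colSums(toy_df[1:2, 1:2])

# Overall fraction: total speakers divided by total population
toy_df["Total", "fraction"] <-
  toy_df["Total", "speakers"] / toy_df["Total", "population"]
```

In this toy example the mean of the per-row fractions is 0.3, while the overall fraction is 80 / 400 = 0.2. The two differ because the overall fraction weights each row by its population.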
Write the data as a file
write.table(x = english_100mil_df,
file = "Chapter_10-11/English_top_7_populated_countries.csv",
col.names=NA,
quote = TRUE,
sep = ",")
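To check that a file written this way can be read back with its row names intact, here is a minimal round-trip sketch. It uses a made-up data frame (toy_df) and a temporary file (out_file) rather than the chapter's real data and path.

```r
# Toy data with made-up values
toy_df <- data.frame(population = c(100, 300),
                     row.names = c("A", "B"))

# Temporary file so nothing is written into the project folder
out_file <- tempfile(fileext = ".csv")

# col.names = NA writes an empty header cell above the row names,
# so the header lines up with the data columns
write.table(x = toy_df,
            file = out_file,
            col.names = NA,
            quote = TRUE,
            sep = ",")

# Read the file back, using the first column as row names
check_df <- read.csv(out_file, row.names = 1, check.names = FALSE)
```

Without col.names = NA, the header row would have one field fewer than the data rows, which spreadsheet programs tend to display misaligned.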