
12 Handy Solutions
12.1 Tea solution

First make sure the “tea_df” data frame is loaded (your working directory must be set to the correct location first). Its combined kg/lb column also needs to be split into two numeric columns with the gsub() function.
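To see what the two gsub() patterns in this solution do, here is a minimal sketch on made-up strings. It assumes the raw column stores both values in one string of the form kg_lb; the numbers are invented for illustration and are not taken from the file.

```r
# Made-up values in the assumed "kg_lb" layout (not from the real file).
raw <- c("4.83_10.65", "1.94_4.28")

# ".*_" matches everything up to and including the underscore,
# so deleting it leaves the pounds value.
lb <- as.numeric(gsub(pattern = ".*_", replacement = "", raw))

# "_.*" matches the underscore and everything after it,
# so deleting it leaves the kilograms value.
kg <- as.numeric(gsub(pattern = "_.*", replacement = "", raw))

print(kg)
print(lb)
```

gsub() always returns character strings, which is why as.numeric() is still needed before doing any arithmetic or rounding.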
tea_df <- read.csv("Chapter_10-11/tea_consumption.csv", check.names=FALSE)
tea_df$lb <- tea_df$KG_LB_annual_per_capita
tea_df$lb <- gsub(pattern = ".*_", replacement = "", tea_df$lb)
colnames(tea_df)[3] <- "kg"
tea_df$kg <- gsub(pattern = "_.*", replacement = "", tea_df$kg)
tea_df$kg <- as.numeric(tea_df$kg)
tea_df$lb <- as.numeric(tea_df$lb)
Remember there are many ways to carry this out, but here is one.
First create a vector with the names of the countries we want:
countries <- c("Ireland", "United Kingdom", "France", "Australia")
Set the row names to the countries for easy indexing:
Note: Row names must be unique, which is the case here.
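Row-name indexing and the uniqueness requirement can be tried out on a toy data frame first. This is only a sketch: toy_df and its values are hypothetical and unrelated to the tea data.

```r
# Hypothetical toy data; names and numbers are invented for illustration.
toy_df <- data.frame(Country = c("A", "B", "C"),
                     kg = c(3.2, 1.5, 0.7),
                     stringsAsFactors = FALSE)

# anyDuplicated() returns 0 when every value is unique,
# a quick check to run before assigning row names.
n_dup <- anyDuplicated(toy_df$Country)

# Unique values can be used as row names.
row.names(toy_df) <- toy_df$Country

# Index rows by name; rows come back in the order requested.
picked <- toy_df[c("C", "A"), ]
print(picked)
```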
row.names(tea_df) <- tea_df$Country
Create a data frame that contains only our countries of interest. We use the vector as an index for the rows.
tea_df_subset <- tea_df[countries,]
Because tea_df_subset is only a temporary variable, we can overwrite its kg column so that the values are rounded to one decimal place.
tea_df_subset$kg <- round(x = tea_df_subset$kg, digits = 1)
The last step is to print out the statement. We will use paste0(), which behaves exactly like paste() with the sep = option set to "".
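A quick way to check this, sketched with made-up values (country and position are hypothetical names, not columns from the tea data):

```r
# Made-up values, used only to compare the two functions.
country <- c("A", "B")
position <- c(1, 2)

with_paste  <- paste(country, " is number ", position, sep = "")
with_paste0 <- paste0(country, " is number ", position)

# Both calls are vectorised and give one string per element.
same_result <- identical(with_paste, with_paste0)
print(same_result)
print(with_paste0)
```

Because both functions work element by element, a single call produces one sentence per row of the data frame.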
paste0(tea_df_subset$Country, " is the number ", tea_df_subset$Rank,
       " consumer of tea. It consumes ", tea_df_subset$kg, "kg of tea annually per capita.")

12.2 English speakers across the world solution

First make sure the data frame is created. Remember to set your working directory to where the file is.
english_df <- read.csv("Chapter_10-11/english_speaking_population_of_countries.tsv",
                       sep = "\t",
                       row.names = 1,
                       check.names = FALSE)
english_df[is.na(english_df)] <- 0
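Replacing NA with 0 matters for the filter that follows, because any sum involving NA is itself NA and the comparison could then not be evaluated for that row. A minimal sketch on a hypothetical toy data frame (column names and numbers are invented):

```r
# Hypothetical toy data with missing counts.
toy_df <- data.frame(first      = c(10, NA, 5),
                     additional = c(5, 20, NA),
                     total      = c(15, 20, 9))

# Logical-matrix indexing: every NA cell is replaced by 0.
toy_df[is.na(toy_df)] <- 0

# Keep only rows where the two parts add up to the reported total.
complete_rows <- toy_df[(toy_df$first + toy_df$additional) == toy_df$total, ]
print(complete_rows)
```

In this toy example the third row is dropped because 5 + 0 does not equal 9.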
Keep only the countries whose first-language and additional-language counts add up to the reported total:
english_complete_datasets_df <-
  english_df[
    (english_df$`As first language` + english_df$`As an additional language`) ==
      english_df$`Total English speakers`,
  ]
Create a new data frame containing only the countries with an eligible population of more than 100 million.
english_100mil_df <- english_complete_datasets_df[
  english_complete_datasets_df$`Eligible population` > 100000000,
]
Create a column with the fraction of the eligible population that speaks English:
english_100mil_df$`Fraction of population that are English speakers` <-
  english_100mil_df$`Total English speakers` /
  english_100mil_df$`Eligible population`
Create a row with the mean of each column:
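english_100mil_df["Mean",] <- colMeans(english_100mil_df)
Create a row with totals. The sum is taken over rows 1:7 only (the countries, assuming seven passed the filter), so the Mean row just added is not counted.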
english_100mil_df["Total",1:4] <- colSums(english_100mil_df[1:7,1:4])
Create the overall fraction of English speakers from the two totals:
english_100mil_df["Total","Fraction of population that are English speakers"] <-
  english_100mil_df["Total","Total English speakers"] /
  english_100mil_df["Total","Eligible population"]
Write the data to a file:
write.table(x = english_100mil_df,
            file = "Chapter_10-11/English_top_7_populated_countries.csv",
            col.names=NA,
            quote = TRUE,
            sep = ",")
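The col.names = NA argument is what keeps the header aligned with the data when row names are written: it adds an empty header field above the row-name column. A small round-trip sketch, using a hypothetical toy data frame and a temporary file rather than the chapter's data:

```r
# Hypothetical toy data with row names, written to a temporary file.
toy_df <- data.frame(speakers   = c(10, 20),
                     population = c(100, 50),
                     row.names  = c("A", "B"))
tmp_file <- tempfile(fileext = ".csv")

write.table(x = toy_df,
            file = tmp_file,
            col.names = NA,
            quote = TRUE,
            sep = ",")

# The header line begins with an empty quoted field for the row names.
header <- readLines(tmp_file)[1]
print(header)

# Reading the file back with row.names = 1 restores the row names.
read_back <- read.csv(tmp_file, row.names = 1, check.names = FALSE)
print(read_back)
```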