library(tidyverse)
library(packageRank)
4 Daily Download Counts for All Packages
The following code does not run here due to time consumption. The generated data is timestamped with a data collection time and produced from local R scripts that are pushed up to Github.
The below walks through a script designed to generate daily download count data for identified packages in the previously generated all_packages file to utilize in creating plots.
First import the necessary packages.
Read in the packages stored in the ‘all_packages.csv’ file scraped from the tidyverse gallery and CRAN webpages in previous appendices.
<- read_csv("generated_data/all_packages.csv", skip = 2)
sorted_packages
#top 30 packages by download count, excludes ggplot2
<- c(sorted_packages$package[2:31])
top_packages
#all packages that can be found on CRAN (cran download count is not null)
<- c(sorted_packages$package[!is.na(sorted_packages$downloads)]) packages_in_cran
Defines a ‘get_cd_data’ function that retrieves historical cran download count data using the cranDownloads function of packageRank.
<- function(pkg) {
get_cd_data cranDownloads(packages = pkg, to = 2025)$cranlogs.data
}
Retrieves cran download data for top 30 packages and all cran packages.
<- map_dfr(top_packages, get_cd_data)
dc_top_packages <- map_dfr(packages_in_cran, get_cd_data) dc_cran_packages
write_csv(dc_top_packages, "generated_data/dc_top_packages.csv")
write_csv(dc_cran_packages, "generated_data/dc_cran_packages.csv")