Statistics

The R community embraces the future framework

Since the first CRAN release of future in June 2015, its uptake among end-users and package developers has grown steadily. During March 2021, future was among the top-1.0% most downloaded packages on CRAN (Figure 1) and there are 170 packages on CRAN and Bioconductor that directly depend on it (Figure 2). For the map-reduce parallelization packages future.apply (top-2.1% most downloaded) and furrr (top-1.4%), the corresponding numbers of packages are 75 and 44, respectively. If we consider recursive dependencies too, that is, packages that use the future package either directly or indirectly via another package, then there are more than 18,500 CRAN and Bioconductor packages (87%) out of 21,000 that rely on the future framework for their processing.

A line graph with 'Date' on the horizontal axis and 'Download rates on CRAN (four-week averages)' on the vertical axis. The dates go from mid-2015 to mid-2021 and the ranks from 0 to 20%. Lines for the packages 'foreach', 'future', 'future.apply', and 'furrr' are displayed in different colors. The foreach curve is the highest but decreases slowly, whereas the other three are rapidly increasing toward the level of foreach.

Figure 1: The download percentile ranks for future, future.apply, furrr, and foreach, averaged every four weeks. future is among the top-1.0% most downloaded packages on CRAN. The data are based on the RStudio CRAN mirror logs. There are approximately 150 million package downloads per month from the RStudio CRAN mirror alone. Since none of the other CRAN mirrors provide statistics, it is impossible to know the total number of package installations.

As a reference, the popular foreach package, released in 2009, was among the top-0.9% most downloaded packages during the same period and it has almost 800 reverse package dependencies on CRAN and Bioconductor. The number of users that download future has grown rapidly, whereas the same number has slowly decreased for the foreach package (Figure 1). Similarly, the number of reverse package dependencies on future appears to grow faster than for foreach (Figure 2).

A line graph with 'Date' (2015-2021) on the horizontal axis and 'Number of reverse dependencies on CRAN' on the vertical axis. Rapidly growing curves for three packages, 'future', 'future.apply', and 'furrr', are shown with 'future' increasing the fastest.

A line graph with 'Date' (2015-2021) on the horizontal axis and 'Number of reverse dependencies on CRAN' on the vertical axis, which is on the logarithmic scale. Curves for four packages, 'foreach', 'future', 'future.apply', and 'furrr', are shown, where foreach has more dependencies but with a lower slope than the others during recent years.

Figure 2: Number of CRAN packages that depend on future, future.apply, furrr, and foreach over time since the first release of future in June 2015. Left: The package counts on the linear scale without foreach. Right: The same data on the logarithmic scale to also fit foreach. (Because historical data for reverse dependencies on Bioconductor are hard to track down, they are currently not reported in these graphs.)

Importantly, the comparison with foreach is made only as a reference for the current demand for parallelization frameworks in R and to show the rapid uptake of the future framework since its release. It is not a competition, because foreach can by design be used together with the future framework via doFuture. The choice between foreach with doFuture, future.apply, and furrr is a matter of preferred coding style; they all rely on futures for parallelization.
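
To illustrate that the three options differ mainly in coding style, the following minimal sketch runs the same computation with future.apply, furrr, and foreach with doFuture. The function slow_sqrt(), the input xs, and the choice of a multisession backend with two workers are hypothetical and used only for illustration; any backend set via plan() should work the same way.

```r
library(future)
library(future.apply)
library(furrr)
library(foreach)
library(doFuture)

# Assumed backend for this sketch: two background R sessions
plan(multisession, workers = 2)

# Hypothetical toy function standing in for a slow computation
slow_sqrt <- function(x) {
  Sys.sleep(0.1)
  sqrt(x)
}
xs <- 1:4

# Style 1: future.apply, mirroring base R's sapply()
y1 <- future_sapply(xs, slow_sqrt)

# Style 2: furrr, mirroring purrr's map_dbl()
y2 <- future_map_dbl(xs, slow_sqrt)

# Style 3: foreach, with doFuture registered as the %dopar% adapter
registerDoFuture()
y3 <- foreach(x = xs, .combine = c) %dopar% slow_sqrt(x)

# Return to sequential processing and shut down the workers
plan(sequential)
```

In all three cases the parallelization is carried out by futures, so changing the plan() call changes how and where every variant is evaluated without touching the rest of the code.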