Overview of All Packages
The core package
- future - Unified Parallel and Distributed Processing in R for Everyone
This is the core package of the future framework. It implements the Future API, which comprises three basic functions - `future()`, `resolved()`, and `value()` - and is designed to unify parallel processing in R at the lowest possible level. It provides a standard for building richer, higher-level parallel frontends without having to worry about and re-implement common, critical tasks such as identifying global variables and packages, parallel RNG, and relaying of output and conditions - cumbersome tasks that are often essential to parallel processing. Nearly all parallel processing tasks can be implemented using this low-level API, but most users and package developers use futures via higher-level map-reduce functions provided by future.apply, furrr, and foreach with doFuture.
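A minimal sketch of these three functions (assuming a `multisession` plan; the expression evaluated is arbitrary):

```r
library(future)
plan(multisession)  # evaluate futures in parallel background R sessions

f <- future({ Sys.sleep(1); 21 * 2 })  # create a future; evaluation starts now
resolved(f)   # poll, without blocking, whether it is done yet
value(f)      # block until resolved, then collect the result (42)
```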
Parallel map-reduce functions for popular programming styles
- future.apply - Apply Function to Elements in Parallel using Futures
Implementations of `apply()`, `by()`, `eapply()`, `lapply()`, `Map()`, `.mapply()`, `mapply()`, `replicate()`, `sapply()`, `tapply()`, and `vapply()` that can be resolved using any future-supported backend, e.g. parallel on the local machine or distributed on a compute cluster. These `future_*apply()` functions come with the same pros and cons as the corresponding base R `*apply()` functions but with the additional feature of being able to be processed via the future framework.
- furrr - Apply Mapping Functions in Parallel using Futures
Implementations of the family of `map()` functions from purrr that can be resolved using any future-supported backend, e.g. parallel on the local machine or distributed on a compute cluster. (Developed by Davis Vaughan)
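A minimal sketch of the furrr equivalents (assuming a `multisession` plan):

```r
library(furrr)
library(future)
plan(multisession)

# Parallel counterparts to purrr::map() and purrr::map_dbl()
y <- future_map(1:10, ~ .x^2)
z <- future_map_dbl(1:10, ~ sqrt(.x))
```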
- foreach with doFuture - Use Foreach to Parallelize via the Future Framework
The doFuture package combines the best of foreach and future and offers two alternative solutions: (i) the traditional `foreach(...) %dopar% { ... }` style, enabled via `registerDoFuture()`, and (ii) the modern `foreach(...) %dofuture% { ... }` style, which works by itself.
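A minimal sketch of both styles (assuming a `multisession` plan; the loop body is arbitrary):

```r
library(doFuture)
plan(multisession)

# (ii) Modern style: %dofuture% parallelizes via futures without any adapter
y <- foreach(x = 1:10) %dofuture% { sqrt(x) }

# (i) Traditional style: register the future framework as the %dopar% adapter
registerDoFuture()
y <- foreach(x = 1:10) %dopar% { sqrt(x) }
```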
- BiocParallel with BiocParallel.FutureParam - Use Futures with BiocParallel
A ‘BiocParallelParam’ class for using futures with the BiocParallel framework, e.g. `BiocParallel::register(FutureParam())`. An alternative is to use `BiocParallel::register(DoParam())` in combination with doFuture.
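A hedged sketch of the `FutureParam()` registration quoted above (assuming BiocParallel.FutureParam is installed):

```r
library(BiocParallel.FutureParam)
BiocParallel::register(FutureParam())  # route BiocParallel through futures
future::plan(future::multisession)

res <- BiocParallel::bplapply(1:10, sqrt)
```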
Choosing how and where to parallelize
- future - the future package has a set of built-in parallel backends that build upon the parallel package, e.g. `multicore`, `multisession`, and `cluster`.
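A minimal sketch of selecting among these built-in backends (the worker hostnames below are placeholders, not real machines):

```r
library(future)

plan(multisession, workers = 4)           # background R sessions on this machine
# plan(multicore, workers = 4)            # forked R processes (not on MS Windows)
# plan(cluster, workers = c("n1", "n2"))  # external R sessions, e.g. on other
                                          # machines ("n1", "n2" are placeholders)
```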
- future.batchtools - A Future API for Parallel and Distributed Processing using batchtools
Implementation of the Future API on top of the batchtools package. This allows you to process futures, as defined by the future package, in parallel out of the box, not only on your local machine or ad-hoc cluster of machines, but also via high-performance compute (HPC) job schedulers such as LSF, OpenLava, Slurm, SGE, and TORQUE/PBS, e.g. `y <- future.apply::future_lapply(files, FUN = process)`.
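A hedged sketch of running that example on a Slurm scheduler (`files` and `process` are placeholders carried over from the sentence above):

```r
library(future.batchtools)
plan(batchtools_slurm)  # each future is submitted as a Slurm job

y <- future.apply::future_lapply(files, FUN = process)
```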
- future.callr - A Future API for Parallel Processing using callr
Implementation of the Future API on top of the callr package. This allows you to process futures, as defined by the future package, in parallel out of the box, on your local (Linux, macOS, Windows, …) machine. Contrary to backends relying on the parallel package (e.g. `future::multisession`), the `callr` backend provided here can run more than 125 parallel R processes.
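A minimal sketch of the callr backend:

```r
library(future.callr)
plan(callr, workers = 4)  # each future runs in a fresh callr-managed R process

f <- future(Sys.getpid())
value(f)                  # a different process ID than the main R session
```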
- future.mirai - A Future API for Parallel Processing using mirai
Implementation of the Future API on top of the mirai package. This allows you to process futures, as defined by the future package, in parallel out of the box, on local and remote machines. The mirai package implements distributed computing via local or network resources and supports Transport Layer Security over TCP/IP for remote connections.
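A hedged sketch of the mirai-based backend (assuming `mirai_multisession` as the name of its local-machine backend):

```r
library(future.mirai)
library(future)
plan(mirai_multisession)  # mirai-managed parallel workers on the local machine

f <- future(Sys.getpid())
value(f)
```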
Reporting on progress
- progressr - An Inclusive, Unifying API for Progress Updates
A minimal, unifying API for scripts and packages to report progress updates from anywhere, including when using parallel processing. The package is designed such that the developer can focus on what progress to report on without having to worry about how to present it. The end user has full control of how, where, and when to render these progress updates, including via progress bars of the popular progress package. The future ecosystem supports progressr at its core, and many of the backends can report progress in a near-live fashion.
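A minimal sketch of reporting progress from a parallel map (assuming future.apply and a `multisession` plan; `slow_sqrt` is an illustrative helper):

```r
library(progressr)
handlers(global = TRUE)  # let the end user decide how progress is rendered

slow_sqrt <- function(xs) {
  p <- progressor(along = xs)  # one progress step per element
  future.apply::future_lapply(xs, function(x) {
    Sys.sleep(0.1)
    p()                        # signal progress from the parallel worker
    sqrt(x)
  })
}

future::plan(future::multisession)
y <- slow_sqrt(1:20)
```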
Validation
- future.tests - Test Suite for Future API Backends
Backends implementing the Future API, as defined by the future package, should use the tests provided by this package to validate that they meet the minimal requirements of the Future API. The tests can be performed easily from within R or from the command line, making it easy to include them in package tests and in Continuous Integration (CI) pipelines.
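A hedged sketch of validating a backend from within R (`check()` is the package's main entry point; the exact call signature may differ, so consult the future.tests documentation):

```r
library(future.tests)

# Validate that the built-in 'multisession' backend conforms to the Future API
check("multisession")
```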
Supporting packages
- parallelly - Enhancing the parallel Package
Utility functions that enhance the parallel package and support the built-in parallel backends of the future package.
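A minimal sketch of some of these utilities:

```r
library(parallelly)

availableCores()  # like parallel::detectCores(), but respects cgroups,
                  # job-scheduler allocations, and R options

cl <- makeClusterPSOCK(2)  # a more robust alternative to makePSOCKcluster()
parallel::stopCluster(cl)
```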
- globals - Identify Global Objects in R Expressions
Identifies global (“unknown” or “free”) objects in R expressions by code inspection using various strategies (ordered, liberal, or conservative). The objective of this package is to make it as simple as possible to identify global objects for the purpose of exporting them in parallel, distributed compute environments.
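A minimal sketch of identifying the globals of an expression:

```r
library(globals)

expr <- quote(a + b * x)
findGlobals(expr)  # names of objects not defined by the expression itself,
                   # here including "a", "b", and "x"
```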
- listenv - Environments Behaving (Almost) as Lists
List environments are environments that have list-like properties. For instance, the elements of a list environment are ordered and can be accessed and iterated over using index subsetting, e.g. `x <- listenv(a = 1, b = 2); for (i in seq_along(x)) x[[i]] <- x[[i]] ^ 2; y <- as.list(x)`.
- marshal - Framework to Marshal Objects to be Used in Another R Process
PROTOTYPE: Some types of R objects can be used only in the R session in which they were created. If used as-is in another R process, such objects often result in an immediate error or in obscure and hard-to-troubleshoot outcomes. Because of this, they cannot be saved to file and re-used at a later time, nor can they be exported to a worker in parallel processing. These objects are sometimes referred to as non-exportable or non-serializable objects. One solution to this problem is to use “marshaling” to encode the R object into an exportable representation that then can be used to re-create a copy of that object in another R process.
PROTOTYPE: Some types of R objects can be used only in the R session they were created. If used as-is in another R process, such objects often result in an immediate error or in obscure and hard-to-troubleshoot outcomes. Because of this, they cannot be saved to file and re-used at a later time. They can also not be exported to a worker in parallel processing. These objects are sometimes referred to as non-exportable or non-serializable objects. One solution to this problem is to use “marshaling” to encode the R object into an exportable representation that then can be used to re-create a copy of that object in another R process.