Lots of backward-compatible improvements and extensions ahead
Below are some1 of the bigger milestones on the roadmap. The issues linked that are examples of feature requests from end-users are prefixed ‘FR’, which will be resolved by the other non-FR issues.
Support for controlling R options and environment variables via
future()
[future#480]
Termination of futures, e.g. suspend(f)
. Since not
all backends can support termination, the design should be such that
failed termination attempts must not affect the workflow and its results
[future#93,
future#FR213,
future.batchtools#FR27,
parallelly#33]
Restarting of failed or terminated futures,
e.g. reset(f)
and v <- value(f)
. Also
automatic restarting of futures when parallel worker dies [future#FR188,
future#205,
future.callr#FR11,
parallelly#32]
Troubleshooting: Even more informative error message when futures
fails, e.g. include information on globals and packages involved if a
parallel worker dies [future#477,
future#479;
implemented in future 1.22.0]
Troubleshooting: Improve debugging of future that produce errors,
e.g. capture call frames in parallel workers and let user inspect them
via tools such as utils::browser()
and
utils::recover()
[future#253]
Performance: Add built-in speed and memory profiling tools to
help developers and users identify bottle necks and improve overall
performance [future#59,
future#142,
future#FR437]
MS Windows: Support relaying of UTF-8 output on MS Windows by
working around limitations in base R [future#FR473;
implemented in future 1.23.0]
Marshaling: Make it possible to parallelize also some of the object types that are currently “non-exportable”, e.g. objects by packages such as DBI, keras, ncdf4, rstan, sparklyr, xgboost, and xml2 [Project: marshal]
Resources: Make it possible to declare “resources” that a future
requires in order to be resolved, e.g. minimal memory requirements,
access to the local file system, a specific operating system, internet
access, sandboxed, etc. This will also make it possible to work with
semi-persistent “sticky” globals [doFuture#FR63,
future#FR181,
future#FR301,
future#FR346,
future#FR430,
future#FR450,
parallelly#18]
Multiple backends: Send different futures to different backends, based on which parallel backend can provide the requested resources
Scheduling: Add generic support for queueing such that futures can be queued on the local computer, on a remote scheduler, or in peer-to-peer scheduler among collaborators [future#FR256, future.apply#FR63, future.batchtools#FR23]
Flattening nested parallelism: Process nested futures via a common future queue instead of via nested future plans [future#FR361]
Develop a future.mapreduce package to provide an essential, infrastructure Map-Reduce Future API, e.g. load balancing, chunking, parallel random number generation (RNG), and early stopping. Packages such as future.apply, furrr, and doFuture are currently implementing their own version of these features. Consolidating these essentials into a shared utility package will guarantee a consistent behavior for all together with faster deployment of bug fixes and improvements [future.apply#20, future.apply#FR32, future.apply#FR44, future.apply#59, future.apply#FR60]
Develop more efficient chunking of large objects in map-reduce calls. This can be achieved by introducing a mechanism or a syntax to declare that part of the code should be processed prior to being parallelized, e.g. further subsetting of and validation of input data before distributing them to parallel workers
Early stopping, e.g. have future_lapply(X, FUN)
terminate active futures as soon as one of the FUN(X[[i]])
calls produces an error [future#FR213,
future.apply#FR75]
Automatic map-reduce: Support for merging of many futures into
chunks of futures to be processed more efficiently with less total
overhead. This could make
fs <- lapply(X, function(x) future(FUN(x))
and
vs <- value(fs)
automatically as efficient as
future_lapply(X, FUN)
removing the need for custom
implementations
Robustness of foreach: Implement bug fixes and outstanding
foreach feature requests in the
doFuture adapter until resolved upstream. For example,
add options to make foreach()
use proper parallel RNG even
when the develop has forgotten to use %doRNG%
[doFuture#61]
Generic support for progress updates for common map-reduce
functions, e.g. y <- lapply(X, slow_fcn) %progress% TRUE
and y <- X %>% future_map(slow_fcn) %progress% TRUE
[progressr#FR85,
progressr#113]
Below is a list of future backends that are planned.
future.aws.lambda - a backend to resolve futures via the Amazon AWS Lambda service. A group of several people is actively working on this since the end of 2020 [future#FR423]
future.aws.batch - a backend to resolve futures via the Amazon AWS Batch service. We have working group of several people actively working on this [future#FR423]
future.clustermq - a backend to resolve futures on a HPC cluster via the clustermq package [future#FR204, future#FR267]
future.google.cloud.functions - a backend to resolve futures via the Google Cloud Functions service. Work on this will start when a stable future.aws.lambda prototype has been established
future.p2p - a backend to resolve futures via a peer-to-peer network of trusted collaborators
future.rrq - a backend to resolve futures via a Redis queue using the rrq package The future and rrq maintainers are working on this since April 2021 [future#FR151]
future.sandbox - a backend to resolve untrusted futures via locked-down, sandboxed Linux containers (Docker, Singularity, …) without access to the hosts file system or network. This can already be partially achieved by low-level container setups via parallelly
future.sparklyr - a backend to resolve futures in Spark via the sparklyr package [future#FR286, sparklyr#FR1935]
Internationalization (i18n): Make all error and warning messages
translatable via R’s gettext()
framework. Invite community
to provide message translations.
Consider relicensing code to a permissive license, where possible
This list is manual curated and may not be comprehensive. Please see the issue trackers and milestones of the different package repositories for a more up-to-date view, e.g. https://github.com/HenrikBengtsson/future/milestones.↩︎