# Parallel Backends
By default, future-based code runs sequentially, but with a single line of code we can switch to running the exact same code in parallel. The most common approach is to parallelize on the local machine, but we can also harness the CPUs of other local or remote machines. For example, to parallelize on the local machine, the end-user calls:
```r
plan(multisession)
```
After this, all of Futureverse, including future.apply, furrr, and doFuture, as well as any package that uses them, will run the code in parallel.
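As a minimal sketch of this, consider the following future.apply snippet; the `Sys.sleep()` call is just a stand-in for a slow computation:

```r
library(future.apply)

plan(multisession)  # evaluate futures in parallel on the local machine

# This call is identical under every backend; only plan() decides where it runs
y <- future_lapply(1:4, function(x) {
  Sys.sleep(1)  # stand-in for a slow computation
  sqrt(x)
})
```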
To switch back to sequential processing, we can call:
```r
plan(sequential)
```
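One way to see the difference between the two plans is to compare process IDs; here is a small illustration using the future assignment operator `%<-%`:

```r
library(future)

plan(multisession)
pid %<-% Sys.getpid()  # evaluated in a background R session
pid == Sys.getpid()    # FALSE: a worker process did the work

plan(sequential)
pid %<-% Sys.getpid()  # evaluated in the current R process
pid == Sys.getpid()    # TRUE
```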
If you have Secure Shell (SSH) access to other machines on your local network, or to remote machines, you can instead call:
```r
plan(cluster, workers = c("n1", "n1", "n2", "remote.server.org"))
```
This sets up four parallel workers: two running on the local ‘n1’ machine, a third on the local ‘n2’ machine, and a fourth on the remote ‘remote.server.org’ machine.
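To verify where the workers run, one can ask each worker for its host name. A quick sketch, where the host names are placeholders for machines you actually have SSH access to:

```r
library(future)

plan(cluster, workers = c("n1", "n1", "n2", "remote.server.org"))

# One future per worker; each reports the host name it runs on
fs <- lapply(1:4, function(i) future(Sys.info()[["nodename"]]))
vapply(fs, value, character(1L))
```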
The future package comes with built-in backends that leverage the parallel package, which is part of R itself. In addition to these, other backends are available from package extensions, e.g. future.callr, future.mirai, and future.batchtools. Below is an overview of the most common backends that you as an end-user can choose from.
| Package / Backend | Features | How futures are evaluated |
|---|---|---|
| **future**<br>`sequential` | 📶 ♻️ | sequentially, in the current R process; default |
| **future**<br>`multisession` | 📶 ♻️ | in parallel, via background R sessions on the current machine |
| **future**<br>`cluster` | 📶 ♻️* | in parallel, in external R sessions on the current, local, and/or remote machines |
| **future**<br>`multicore` | 📶 ♻️ | (not recommended) in parallel, via forked R processes on the current machine; not with GUIs like RStudio; not on Windows |
| **future.callr**<br>`callr` | 📶 (next) ♻️ (next) | in parallel, via transient callr background R sessions on the current machine; all memory is returned as each future is resolved |
| **future.mirai**<br>`mirai_multisession` | 📶 (next) ♻️ (next) | in parallel, via mirai background R sessions on the current machine; low latency |
| **future.mirai**<br>`mirai_cluster` | ♻️ (next) | in parallel, via mirai daemons running locally or remotely |
| **future.batchtools**<br>`batchtools_lsf`, `batchtools_openlava`, `batchtools_sge`, `batchtools_slurm`, `batchtools_torque` | 📶 (soon) ♻️ (soon) | in parallel, on HPC job schedulers (Load Sharing Facility [LSF], OpenLava, TORQUE/PBS, Son/Sun/Oracle/Univa Grid Engine [SGE], Slurm) via batchtools; for long-running tasks; high latency |
📶: futures relay progress updates in real time, e.g. via the progressr package
♻️: futures are interruptible and restartable; * disabled by default
(next): in the next release; (soon): in a near-future release
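Switching to one of these extension backends is again a single `plan()` call; the sketch below assumes the respective packages are installed:

```r
library(future)

# Transient callr background R sessions; memory is freed after each future
plan(future.callr::callr)

# Low-latency mirai background R sessions
plan(future.mirai::mirai_multisession)
```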
It is straightforward to implement new backends that leverage other ways of harnessing available compute resources. As soon as a new backend has been validated to comply with the Future API specification, which can be done with the future.tests package, it can be used anywhere future-based code is used.
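For instance, a backend developer might validate their backend like this, assuming the `check()` interface of future.tests and using the built-in `multisession` backend as the example:

```r
# Validate that a backend complies with the Future API
# (assumes the future.tests package is installed)
future.tests::check("multisession")
```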