Skip to content

Parallel execution

API for parallel execution functions that work independently of the app framework.

LokyBackend

Bases: Parallel

parallel backend using the loky library

loky provides reusable process pools that are more robust than the stdlib ProcessPoolExecutor, particularly in Jupyter notebooks. Requires pip install "scinexus[loky]".

MPIBackend

Bases: Parallel

parallel backend using MPI via mpi4py

Requires pip install "scinexus[mpi]" and an MPI implementation (e.g. OpenMPI).

MultiprocessBackend

Bases: Parallel

parallel backend using the stdlib concurrent.futures.ProcessPoolExecutor

Parallel

Bases: ABC

abstract base class for parallel execution backends

Subclass this to integrate a custom parallel engine (e.g. ray, dask).

as_completed(f, s, max_workers=None, **kwargs) abstractmethod

yield results of f applied to each element of s, in completion order

get_rank() abstractmethod

return the rank of the current process

get_size() abstractmethod

return the number of available workers

imap(f, s, max_workers=None, **kwargs) abstractmethod

yield results of f applied to each element of s, in order

is_master_process() abstractmethod

return True if the current process is the master

PicklableAndCallable

Bases: Generic[P, R]

wraps a callable so it is picklable for use with MPI executors

get_default_chunksize(s, max_workers)

compute a stable chunksize for distributing items across workers

Parameters:

Name Type Description Default
s Sized

a sized collection of work items

required
max_workers int

number of worker processes

required

get_parallel_backend(backend=None)

return the current parallel execution backend

Parameters:

Name Type Description Default
backend BackendType | None

if provided, return an instance of this backend type without changing the global default. This lets a package obtain the backend it needs without disrupting the settings of other packages.

None

Returns:

Type Description
`MultiprocessBackend`` when no backend has been set and
``backend is None``.

get_rank()

Returns the rank of the current process

get_size()

Returns the num cpus

imap(f, s, max_workers=None, use_mpi=False, if_serial='raise', chunksize=None)

Parameters:

Name Type Description Default
f Callable[[T], R]

function that operates on values in s

required
s Iterable[T]

series of inputs to f

required
max_workers int | None

maximum number of workers. Defaults to 1-maximum available.

None
use_mpi bool

use MPI for parallel execution. Temporarily switches to MPIBackend for the duration of the call.

False
if_serial Literal['raise', 'ignore', 'warn']

action to take if conditions will result in serial execution. Valid values are 'raise', 'ignore', 'warn'. Defaults to 'raise'.

'raise'
chunksize int | None

Size of data chunks executed by worker processes. Defaults to None where stable chunksize is determined by get_default_chunksize()

None

Returns:

Type Description
imap and as_completed are generators yielding result of f(s[i]), map returns the result
series. imap and map return results in the same order as s, as_completed returns results
in the order completed (which can differ from the order in s).
Notes

To use MPI, you must have openmpi (use conda or your preferred package manager) and mpi4py (use pip or conda) installed. In addition, your initial script must have a if __name__ == '__main__': block. You then invoke your program using

$ mpiexec -n <number CPUs> python3 -m mpi4py.futures <initial script>

is_master_process()

Evaluates if current process is master

In case of MPI checks whether current process is being run on file generated by mpi4py.futures

In case of Multiprocessing checks if generated process name included "ForkProcess" for Windows or "SpawnProcess" for POSIX

set_parallel_backend(backend=None)

set the default parallel execution backend

Parameters:

Name Type Description Default
backend BackendType | Parallel | None

a Parallel instance, a string literal ("multiprocess", "loky", or "mpi"), or None to reset to the default (MultiprocessBackend).

None