krotov.parallelization module¶
Support routines for running the optimization in parallel across the objectives.
The time-propagation that is the main numerical effort in an optimization with
Krotov’s method can naturally be performed in parallel for the different
objectives. There are three time-propagations that happen inside
optimize_pulses():

1. A forward propagation of the initial_state of each objective under the
   initial guess pulse.
2. A backward propagation of the states \(\ket{\chi_k}\) constructed by the
   chi_constructor routine that is passed to optimize_pulses(), where the
   number of states is the same as the number of objectives.
3. A forward propagation of the initial_state of each objective under the
   optimized pulse in each iteration. This can only be parallelized per time
   step, as the propagated states from each time step collectively determine
   the pulse update for the next time step, which is then used for the next
   propagation step. (In this sense, Krotov’s method is “sequential”.)
The optimize_pulses() routine has a parameter parallel_map that can receive a
tuple of three “map” functions to enable parallelization, corresponding to the
three propagations listed above. If not given, qutip.parallel.serial_map() is
used for all three propagations, running in serial. Any alternative “map” must
have the same interface as qutip.parallel.serial_map().
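The required “map” interface can be illustrated with a minimal stand-in (a sketch only; qutip’s actual serial_map also accepts further keyword arguments, such as a progress bar, which a drop-in replacement should tolerate via **kwargs):

```python
def my_serial_map(task, values, task_args=tuple(), task_kwargs=None, **kwargs):
    """Minimal stand-in with the same call convention as
    qutip.parallel.serial_map: apply `task` to each element of `values`,
    forwarding `task_args` and `task_kwargs`, and return the results as a
    list, in order."""
    if task_kwargs is None:
        task_kwargs = {}
    return [task(value, *task_args, **task_kwargs) for value in values]

# Example: "propagate" three independent values by applying a scaling factor
results = my_serial_map(lambda x, factor: x * factor, [1, 2, 3], task_args=(10,))
```

Any function with this signature and ordering behavior can be passed in the parallel_map tuple.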
It would be natural to assume that qutip.parallel.parallel_map() would be a
good choice for parallel execution, using multiple CPUs on the same machine.
However, this function is only a good choice for propagations (1) and (2):
these run in parallel over the entire time grid without any communication, and
thus with minimal overhead. This is not true for propagation (3), which must
synchronize after each time step. In that case, the “naive” use of
qutip.parallel.parallel_map() results in a communication overhead that
completely dominates the propagation, and actually makes the optimization
slower (potentially by more than an order of magnitude).
The function parallel_map_fw_prop_step() provided in this module is an
appropriate alternative implementation that uses long-running processes,
internal caching, and minimal inter-process communication to eliminate the
communication overhead as much as possible. However, the internal caching is
valid only under the assumption that the propagate function does not have side
effects.
In general,

    parallel_map=(
        qutip.parallel_map,
        qutip.parallel_map,
        krotov.parallelization.parallel_map_fw_prop_step,
    )

is a decent choice for enabling parallelization for a typical multi-objective
optimization.
You may implement your own “map” functions to exploit parallelization
paradigms other than Python’s built-in multiprocessing, which is used here.
This includes distributed propagation, e.g. through ipyparallel clusters. To
write your own parallel_map functions, review the source code of
optimize_pulses() in detail.
In most cases, it will be difficult to obtain a linear speedup from parallelization: even with carefully tuned manual interprocess communication, the communication overhead can be substantial. For best results, it would be necessary to use parallel_map functions implemented in Cython, where the GIL can be released and the entire propagation (and storage of propagated states) can be done in shared-memory with no overhead.
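As a concrete starting point, a custom “map” with the required interface might look like the following sketch. A thread pool from the standard library is used here purely for illustration: threads avoid the pickling requirements of worker processes, but, as noted above, do not sidestep the GIL, so a production implementation for CPU-bound propagation would use processes (as qutip.parallel.parallel_map does) or distributed workers. Like parallel_map, such a pool-based map is only appropriate for propagations (1) and (2), which need no per-time-step synchronization:

```python
from concurrent.futures import ThreadPoolExecutor

def pool_map(task, values, task_args=tuple(), task_kwargs=None, **kwargs):
    """A pool-based "map" with the same interface as
    qutip.parallel.serial_map: submit one task per value, then collect
    results in submission order."""
    if task_kwargs is None:
        task_kwargs = {}
    with ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(task, value, *task_args, **task_kwargs)
            for value in values
        ]
        return [future.result() for future in futures]

# e.g., "propagate" three independent initial values concurrently
outcomes = pool_map(lambda x, factor: x * factor, [1, 2, 3], task_args=(10,))
```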
Summary¶

Classes:

Consumer – A process-based task consumer
FwPropStepTask – A task that performs a single forward-propagation step

Functions:

parallel_map_fw_prop_step – parallel_map function for the forward-propagation by one time step

__all__: Consumer, FwPropStepTask, parallel_map_fw_prop_step
Reference¶
class krotov.parallelization.Consumer(task_queue, result_queue, data)[source]¶

Bases: multiprocessing.context.Process

A process-based task consumer

Parameters
    task_queue (multiprocessing.JoinableQueue) – A queue from which to read tasks.
    result_queue (multiprocessing.Queue) – A queue where to put the results of a task.
    data – Cached (in-process) data that will be passed to each task.
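The consumer pattern can be sketched with the standard library. For brevity, this illustration uses threads and queue.Queue; the actual Consumer subclasses multiprocessing.Process and reads from a multiprocessing.JoinableQueue, but the worker loop is analogous:

```python
import queue
import threading

class ConsumerSketch(threading.Thread):
    """Illustrative long-running worker: reads callable tasks from
    task_queue, calls each task with the cached `data`, and puts the result
    on result_queue. A `None` task is the poison pill that shuts the worker
    down."""

    def __init__(self, task_queue, result_queue, data):
        super().__init__()
        self.task_queue = task_queue
        self.result_queue = result_queue
        self.data = data  # cached in-process; never re-sent per task

    def run(self):
        while True:
            task = self.task_queue.get()
            if task is None:  # poison pill: shut down
                self.task_queue.task_done()
                break
            self.result_queue.put(task(self.data))
            self.task_queue.task_done()

tasks = queue.Queue()
results = queue.Queue()
consumer = ConsumerSketch(tasks, results, data=10)
consumer.start()
for offset in [1, 2, 3]:
    tasks.put(lambda data, o=offset: data + o)  # tasks are callables
tasks.put(None)  # poison pill
tasks.join()  # block until all tasks (including the pill) are processed
collected = sorted(results.get() for _ in range(3))
```

The key point is that `data` crosses to the worker only once, at construction; each task then carries only its own small arguments.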
class krotov.parallelization.FwPropStepTask(i_state, pulse_vals, time_index)[source]¶

Bases: object

A task that performs a single forward-propagation step

The task object is a callable, receiving the single tuple of the same form as
task_args in parallel_map_fw_prop_step() as input. This data is internally
cached by the Consumer that will execute the task.

Parameters
    i_state (int) – The index of the state to propagate. That is, the index
    of the objective from whose initial_state the propagation started.
    pulse_vals (list[float]) – The values of the pulses at time_index to use.
    time_index (int) – The index of the interval on the time grid covered by
    the propagation step.

The passed arguments update the internal state (data) of the Consumer
executing the task; they are the minimal information that must be passed via
inter-process communication to enable the forward propagation (assuming
propagate in optimize_pulses() has no side effects).
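The division of labor between the small task object and the consumer’s cached data can be sketched as follows. This is a toy stand-in, not the real FwPropStepTask: the data layout and the Euler-like “propagation” are invented for illustration:

```python
class StepTaskSketch:
    """Illustrative stand-in for FwPropStepTask: the constructor stores only
    the minimal per-step information (state index, pulse values, time
    index), while the heavy data (states, time step) lives in the consumer's
    cached `data` and is received via __call__."""

    def __init__(self, i_state, pulse_vals, time_index):
        self.i_state = i_state
        self.pulse_vals = pulse_vals
        self.time_index = time_index

    def __call__(self, data):
        states, dt = data  # cached in-process data (hypothetical layout)
        # Stand-in "propagation step": an Euler-like update driven by the
        # pulse value; the real task calls the user-supplied propagator.
        states[self.i_state] += dt * self.pulse_vals[0]
        return states[self.i_state]

data = ([0.0, 0.0], 0.1)  # (mutable cached states, time step)
task = StepTaskSketch(i_state=0, pulse_vals=[2.0], time_index=0)
result = task(data)
```

Note that the task mutates the cached states in place, mirroring how the passed arguments “update the internal state (data) of the Consumer executing the task.”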
krotov.parallelization.parallel_map_fw_prop_step(shared, values, task_args)[source]¶

parallel_map function for the forward-propagation by one time step

Parameters
    shared – A global object to which we can attach attributes for sharing
    data between different calls to parallel_map_fw_prop_step(), allowing us
    to have long-running Consumer processes, avoiding process-management
    overhead. This happens to be a callable (the original internal routine
    for performing a forward-propagation), but here, it is (ab-)used as a
    storage object only.
    values (list) – A list 0..(N-1) where N is the number of objectives.
    task_args (tuple) – A tuple of 7 components:

    1. A list of states to propagate, one for each objective.
    2. The list of objectives.
    3. The list of optimized pulses (updated up to time_index).
    4. The “pulses mapping”, cf. extract_controls_mapping().
    5. The list of time grid points.
    6. The index of the interval on the time grid over which to propagate.
    7. A list of propagate callables, as passed to optimize_pulses(). The
       propagators must not have side effects in order for
       parallel_map_fw_prop_step() to work correctly.
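The way shared is (ab-)used as a storage object can be sketched in isolation. This toy stand-in caches a flag instead of real Consumer processes and queues, and its task_args layout is invented; only the caching pattern is the point:

```python
def cached_map_sketch(shared, values, task_args):
    """Toy stand-in for the caching pattern in parallel_map_fw_prop_step:
    expensive setup is attached as an attribute on `shared` during the first
    call and reused afterwards, so repeated calls (one per time step) avoid
    re-creating workers."""
    if not hasattr(shared, "workers"):
        # First call only: stands in for starting long-running Consumer
        # processes and sending them the full (heavy) data once.
        shared.workers = "started"
        shared.setup_calls = getattr(shared, "setup_calls", 0) + 1
    offset = task_args[0]  # hypothetical: only small per-step data is used
    return [v + offset for v in values]

class Storage:
    """Any object that accepts attribute assignment can serve as `shared`."""

shared = Storage()
first = cached_map_sketch(shared, [0, 1, 2], (10,))
second = cached_map_sketch(shared, [0, 1, 2], (10,))
```

Across both calls the setup runs exactly once, which is what makes the per-time-step communication overhead small.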