tasks

task-oriented execution for Python

Author: Forest Bond
Copyright: © 2005-2007 Forest Bond
License: GNU Free Documentation License (http://www.gnu.org/licenses/fdl.html)

Overview

The tasks module is intended for use in programs that perform actions in sequence. A common example of this is an installer or setup utility. The tasks module makes it very easy to assemble a set of tasks to run and execute each in turn while tracking state changes for each task.

Tasks are represented by task objects. The tasks module defines several classes whose instances represent various types of tasks with differing behavior.

Task State

Task objects possess a number of read-only attributes that reflect their current state.

These attributes are important execution details:

progress
An integer or float that indicates the percentage of the task that has been completed.
status
A textual description of the current state of the task.

The following boolean flags indicate specific conditions:

finished
A boolean; indicates whether or not the task has run and reached a terminal state.
completed
A boolean; indicates if the task has run and finished successfully.
failed
A boolean; indicates if the task has run and finished unsuccessfully.
paused
A boolean; indicates if the task is currently paused.
cancelled
A boolean; indicates if the task has terminated without finishing.
running
A boolean; indicates whether or not the task is currently being executed.

These boolean flags are not mutually exclusive, and all are required to adequately represent the current state of the task. The current_state attribute represents the same state as a tuple naming the flags that are currently True. So, if a particular task is running, current_state will be a tuple containing the string "running". If the task completes successfully, it will be a tuple containing the strings "finished" and "completed".
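As a rough illustration of how such a tuple relates to the flags, the sketch below derives one from six boolean attributes. FlagHolder and FLAG_NAMES are hypothetical stand-ins written for this example, not classes from the tasks module, and the order of names within the tuple is an assumption:

```python
# Hypothetical stand-in, not the tasks module's own class.
FLAG_NAMES = (
    'finished', 'completed', 'failed', 'paused', 'cancelled', 'running')


class FlagHolder(object):
    """Holds the six boolean flags described above."""

    def __init__(self, **flags):
        for name in FLAG_NAMES:
            setattr(self, name, bool(flags.get(name, False)))

    @property
    def current_state(self):
        # Tuple of the names of every flag that is currently True.
        return tuple(name for name in FLAG_NAMES if getattr(self, name))


running_task = FlagHolder(running=True)
done_task = FlagHolder(finished=True, completed=True)
```

Here running_task.current_state contains only "running", while done_task.current_state contains "finished" and "completed".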

As we will see shortly, tasks can be interacted with during execution. Some of these interactions are directives whose resulting state changes can be anticipated. Thus, assuming no external force acts to change it, the state in which a particular task will end up is determined at any given point in time. The final_state attribute holds this anticipated state as a tuple, much like the current_state attribute. Additionally, the terminal attribute is a boolean flag indicating whether or not the task's final state would result in the task becoming finished.

Finally, the following attribute stores additional information in the event of task failure:

failure
If failed is True, this may contain an object indicating the nature of the failure, but is otherwise undefined.

Running Tasks and Processing State Changes

Task instances are run by calling the run method, which is actually a generator function that yields data structures representing task state changes. Thus, task instances are properly executed by calling the run method and iterating over the result. Each iteration yields a list of new task changes, which should be assumed to have occurred simultaneously.

Instances of the TaskChange class are used to represent individual task state changes. TaskChange objects have the following properties:

parameter
The name of the task property that is changing.
value
The new value for that property.
oldvalue
The value of that property prior to the change.
task
The task for which the change is occurring.

While executing a task, each list of TaskChange objects can be viewed as a packet in a stream of state information, or as a series of events that can be handled with an event loop:

for task_changes in mytask.run():
    for task_change in task_changes:
        if task_change.parameter == 'progress':
            print 'Task %s is now %.1f%% complete.' % (
              task_change.task, task_change.value)
        elif task_change.parameter == 'running':
            if task_change.value:
                print 'Task %s is now running.' % task_change.task
            else:
                print 'Task %s is no longer running.' % task_change.task
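To experiment with this kind of loop without a real task, the stream can be imitated with plain Python. The sketch below is only an approximation: FakeChange, fake_run and latest_values are hypothetical names invented for this example, and the real TaskChange constructor may take its arguments differently. It folds a stream of change lists into the most recent value of each property:

```python
from collections import namedtuple

# Stand-in with the four documented properties; the real TaskChange
# constructor may take its arguments differently.
FakeChange = namedtuple('FakeChange', 'parameter value oldvalue task')


def fake_run(task_name):
    """Yield lists of changes, the way a task's run method is described."""
    yield [FakeChange('running', True, False, task_name)]
    progress = 0.0
    for _ in range(4):
        new_progress = progress + 25.0
        yield [FakeChange('progress', new_progress, progress, task_name)]
        progress = new_progress
    # Changes in one list are treated as having occurred simultaneously.
    yield [
        FakeChange('completed', True, False, task_name),
        FakeChange('finished', True, False, task_name),
        FakeChange('running', False, True, task_name),
    ]


def latest_values(change_lists):
    """Fold a stream of change lists into {(task, parameter): value}."""
    latest = {}
    for changes in change_lists:
        for change in changes:
            latest[(change.task, change.parameter)] = change.value
    return latest


snapshot = latest_values(fake_run('demo'))
```

Keeping only the latest value per property is one simple way for a user interface to redraw itself from the stream; a caller that needs history would keep the lists themselves.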

Interacting With Running Tasks

During execution, directives ("instructions") can be issued that will help to determine the future state of the task. Internally, these instructions are appended to a queue and handled in a normal execution context (inside the run generator). Deferring instruction processing to be handled in the task function ensures that events are handled in the correct order, despite inevitable lag in the task event stream.

Instructions are issued to the running task using the following methods:

fail
Causes the task to terminate with failure status. Will resume the task first if it is paused.
cancel
Causes the task to terminate without finishing. Will resume the task first if it is paused.
complete
Causes the task to terminate successfully. Will resume the task first if it is paused.
pause
Causes the task to suspend execution indefinitely.
resume
Causes the task to resume execution, terminating the paused state.

If an instruction is issued that is impossible to fulfill due to the current state of the task, an InstructionError is raised immediately. However, note that, while a task may initially accept an instruction, it may not necessarily be possible for the instruction to be carried out.
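The queueing behavior described above can be pictured with a toy task. This is a sketch under assumptions, not the module's implementation: QueuedTask is hypothetical, the InstructionError class here is a local stand-in for the one named above, and plain tuples take the place of TaskChange objects:

```python
from collections import deque


class InstructionError(Exception):
    """Local stand-in for the exception named above."""


class QueuedTask(object):
    """Toy task: instructions are queued, then handled inside run()."""

    def __init__(self, steps):
        self.steps = steps
        self.finished = False
        self.cancelled = False
        self._instructions = deque()

    def cancel(self):
        # Reject instructions that can no longer be fulfilled.
        if self.finished or self.cancelled:
            raise InstructionError('task has already terminated')
        self._instructions.append('cancel')

    def run(self):
        for step in range(self.steps):
            # Queued instructions are handled here, inside the generator,
            # so their effects enter the change stream in order.
            while self._instructions:
                if self._instructions.popleft() == 'cancel':
                    self.cancelled = True
                    yield [('cancelled', True)]
                    return
            yield [('step', step)]
        self.finished = True
        yield [('finished', True)]


task = QueuedTask(steps=5)
seen = []
for changes in task.run():
    seen.extend(changes)
    if changes == [('step', 1)]:
        task.cancel()  # queued now, handled on the next iteration
```

The call to cancel returns immediately, and the cancellation only appears in the stream once the generator is resumed. A second call to cancel after that point raises InstructionError.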

Defining Tasks

There are two main task classes, SimpleTask and ComplexTask. SimpleTask instances represent single atomic tasks, while ComplexTask instances represent a task made up of one or more sub-tasks (each of which can be a SimpleTask, ComplexTask, or derivative).

The SimpleTask and ComplexTask initializers accept some common keyword parameters:

description
Usually a short sentence explaining the purpose of the task.
critical
A boolean indicating whether or not failure of this task represents a critical failure. In general, a critical failure can be handled differently by callers. For instance, complex tasks halt execution immediately when a critical subtask fails.
progress_weight
A float indicating how progress values for different tasks should be compared. Essentially, this serves as a rough indication of how much longer it takes to complete one task versus another. For instance, if task A takes twice as long as task B to complete, task A could reasonably have a progress weight of 2.0, while the progress weight for task B would be 1.0. This makes it possible to produce more accurate estimates of total progress toward completion of multiple tasks.
data
Arbitrary object that can be used to store task-specific information in a somewhat standardized place.
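To make the effect of progress_weight concrete, the sketch below combines per-task progress into one overall percentage using a weighted average. The formula is an assumption made for illustration (the module may combine progress differently), and weighted_progress is a hypothetical helper, not part of the tasks API:

```python
def weighted_progress(subtasks):
    """Combine (progress_percent, progress_weight) pairs into one percent."""
    total_weight = sum(weight for _, weight in subtasks)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(progress * weight for progress, weight in subtasks)
    return weighted_sum / float(total_weight)


# Task A (weight 2.0) is half done; task B (weight 1.0) has not started.
overall = weighted_progress([(50.0, 2.0), (0.0, 1.0)])
```

With these weights, task A being half done puts the pair roughly one third of the way to completion, rather than the one quarter an unweighted average would report.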

SimpleTask Objects

The SimpleTask initializer requires two positional arguments: a name for the task, and a function that performs the task, which I'll often refer to as a "task function":

>>> from tasks import SimpleTask
>>> def my_task_fn(task):
...     yield []
>>> my_task = SimpleTask('my_task', my_task_fn)

The task function should perform any work necessary to fulfill the task, and should be a generator function that yields lists of TaskChange objects at various (task-specific) points during execution.

Task Functions

Most of the time, task changes will be yielded when the task metadata has changed in some way, like a status change or progress update. However, the yield statement also serves as the primary mechanism by which control is temporarily returned to the caller. As a result, if the task function engages in some activity that would cause a long period of time to go by without any yield statements being executed, additional yield statements should be inserted. These additional statements should yield an empty list.
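As an illustration of yielding empty lists during long-running work, here is a generator shaped like a task function. It is a sketch only: StandInTask is a hypothetical object exposing nothing but a data attribute, the limit and yield_every parameters are invented for the example, and storing the result in task.data assumes the data object is a dict:

```python
class StandInTask(object):
    """Minimal stand-in exposing only a data attribute."""

    def __init__(self):
        self.data = {}


def sum_squares_task_fn(task, limit=100000, yield_every=10000):
    """Do a long computation, handing control back at regular intervals."""
    total = 0
    for n in range(limit):
        total += n * n
        if (n + 1) % yield_every == 0:
            # Nothing about the task has changed; yield an empty list
            # purely to return control to the caller.
            yield []
    task.data['sum_of_squares'] = total


task = StandInTask()
yields = list(sum_squares_task_fn(task))
```

How often to yield is a trade-off: yielding too rarely makes the caller unresponsive, while yielding on every iteration adds overhead for no benefit.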

A simple task function follows:

>>> import time
>>> from tasks import TaskChange
>>> def my_task_fn(task):
...     task.status = 'Sleeping...'
...     yield [TaskChange('status', task.status, task)]
...
...     for x in range(10):
...         time.sleep(0.1)
...
...         old_progress = task.progress
...         task.progress = old_progress + 10.0
...         yield [TaskChange('progress', task.progress, task)]
...
...     yield task.finish()

Note that all of the instruction-issuing methods discussed previously can be called from within task functions like the finish method was in this example.

Actually, the example task function above is not as concise as it could be. The change method, available as an attribute of Task objects, allows changes to be made and TaskChange instances to be generated at the same time. Its use is simple; for instance, the above example can be rewritten as:

>>> import time
>>> def my_task_fn(task):
...     yield [task.change('status', 'Sleeping...')]
...
...     for x in range(10):
...         time.sleep(0.1)
...         yield [task.change('progress', task.progress + 10.0)]
...
...     yield task.finish()

Further, this particular task function can be made even simpler by relying on the built-in clean-up mechanisms that run automatically after the task function returns. Any task that hasn't explicitly terminated is considered finished and, to preserve the integrity of the stream of changes, the necessary task changes are introduced automatically. As a result, the task function needn't explicitly call finish:

>>> import time
>>> def my_task_fn(task):
...     yield [task.change('status', 'Sleeping...')]
...
...     for x in range(10):
...         time.sleep(0.1)
...         yield [task.change('progress', task.progress + 10.0)]

Indicating Task Failures

Failures are indicated much as one would expect:

>>> def my_task_fn(task):
...     yield [task.change('status', 'Printing "foo"')]
...     print 'foo'
...     yield [
...       task.change('status', 'Finished printing.'),
...       task.change('progress', 50),
...     ]
...
...     yield [task.change('status', 'Reading file "bar"')]
...     try:
...         f = open('bar', 'r')
...         f.read()
...         f.close()
...     except (IOError, OSError):
...         yield task.fail('Failed to read file "bar"')
...     yield [task.change('status', 'Finished reading file.')]

ComplexTask Objects

ComplexTask objects are similar to SimpleTask objects in many ways. The most significant difference to callers is that, rather than specifying a task function that should be run, the task is initialized with a list of other tasks that will be executed sequentially. For instance:

>>> import time
>>> from tasks import ComplexTask, SimpleTask
>>> def taskfn_sleep10(task):
...     yield [task.change('status', 'Sleeping')]
...     for x in range(10):
...         yield [task.change('progress', task.progress + 10.0)]
...         time.sleep(0.1)
>>> sleep10_1 = SimpleTask('sleep10_1', taskfn_sleep10)
>>> sleep10_2 = SimpleTask('sleep10_2', taskfn_sleep10)
>>> sleep10_3 = SimpleTask('sleep10_3', taskfn_sleep10)
>>> sleep30 = ComplexTask('sleep30', [sleep10_1, sleep10_2, sleep10_3])
>>> all_task_changes = []
>>> for task_changes in sleep30.run():
...     all_task_changes = all_task_changes + task_changes
>>> for task_change in all_task_changes[:9]:
...     print task_change
sleep30 running: False => True
sleep30 running_subtask: None => <SimpleTask sleep10_1>
sleep30 status: '' => 'running sleep10_1'
sleep10_1 running: False => True
sleep10_1 status: '' => 'Sleeping'
sleep10_1 progress: 0.0 => 10.0
sleep30 progress: 0.0 => 3.3333333333333335
sleep10_1 progress: 10.0 => 20.0
sleep30 progress: 3.3333333333333335 => 6.666666666666667
>>> for task_change in all_task_changes[-9:]:
...     print task_change
sleep30 progress: 96.666666666666629 => 99.999999999999957
sleep10_3 completed: False => True
sleep10_3 finished: False => True
sleep10_3 running: True => False
sleep30 running_subtask: <SimpleTask sleep10_3> => None
sleep30 progress: 99.999999999999957 => 100.0
sleep30 completed: False => True
sleep30 finished: False => True
sleep30 running: True => False

Helper Classes

Several helper classes are provided that address common needs:

SimpleFunctionTask
Like a SimpleTask, except that the function passed in should be a plain function, rather than a generator. The function will be called when the task is executed, and the task will finish with a status that depends on the return value from the function: success if the function returns None or any value that evaluates to True in a boolean context, and failure otherwise.
SimpleCommandTask
This task class is like SimpleFunctionTask, except a command to be executed with os.system is specified instead of a function. Success or failure is determined by the exit status of the command—success if zero, failure otherwise.
ComplexTaskWithSharedData
A complex task that overwrites each of its subtasks' data with its own data before executing the subtask. Note that this is performed as a simple assignment, so only mutable objects should be used as the data object if it is expected that both the parent task and its subtasks will continue to hold the same data despite changes. A dict is often a good choice.
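Two of the rules above are easy to get wrong, so they are restated here in plain Python. This is a sketch of the rules as described, not code from the module: function_result_is_success, exit_status_is_success and Node are hypothetical names used only for illustration:

```python
def function_result_is_success(result):
    """Success rule described for SimpleFunctionTask return values."""
    return result is None or bool(result)


def exit_status_is_success(status):
    """Success rule described for SimpleCommandTask exit statuses."""
    return status == 0


class Node(object):
    """Stand-in for a task that has a data attribute."""

    def __init__(self, data=None):
        self.data = data


parent = Node(data={})
child = Node()
child.data = parent.data   # simple assignment: both refer to one dict
parent.data['step'] = 1    # a mutation is visible through the child
shared_after_mutation = child.data.get('step')
parent.data = {'step': 2}  # rebinding the parent's data breaks the link
```

Note that a function returning 0, False or an empty string counts as a failure under the first rule, even though returning None counts as success. In the shared-data case, the child keeps seeing changes only as long as the parent mutates the object in place instead of replacing it.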

More Information

See the tests included with the source distribution for more complete examples of usage.