[provenance-challenge] temporally overlapping processes on collections

Ben Clifford benc at hawaga.org.uk
Tue Apr 28 16:34:21 BST 2009


In the OPM representation that I have for Swift at the moment, collections 
of artifacts are represented as artifacts themselves.

These collections are immutable in the sense that each ends up with a 
well-defined value, but that final value only becomes known over time - 
for example, a large number of processes may execute, each contributing 
one element (an artifact) to an array (also an artifact). So knowledge of 
the artifact's immutable value monotonically increases over time.
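
As a rough illustration of that monotonic behaviour (a sketch only, not 
how Swift implements arrays), the Python below models a collection whose 
elements can each be assigned exactly once. The class name WriteOnceArray, 
its methods, and the fixed size known in advance are all assumptions made 
for the example.

```python
class WriteOnceArray:
    """Hypothetical model of a collection whose elements are each set once."""

    def __init__(self, size):
        # Assumption: the final size is known up front (a simplification).
        self._size = size
        self._values = {}

    def set(self, index, value):
        """Assign one element; a second assignment to the same index fails."""
        if not 0 <= index < self._size:
            raise IndexError(index)
        if index in self._values:
            raise ValueError(f"element {index} is already assigned")
        self._values[index] = value

    def known(self):
        """Snapshot of the elements known so far."""
        return dict(self._values)

    def is_closed(self):
        """True once every element has been assigned."""
        return len(self._values) == self._size


c = WriteOnceArray(2)
c.set(0, "A")
early = c.known()   # only element 0 is known at this point
c.set(1, "B")
late = c.known()    # both elements are known; earlier knowledge is unchanged
```

Every earlier snapshot is a subset of every later one, which is the sense 
in which knowledge of the value only grows.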

Certain constructs in Swift (such as 'foreach') can safely run processes 
on the members of a collection before the entire value of that collection 
is known.

These constructs can be contained in compound procedures, which also map 
fairly well to OPM processes (the individual activities of a compound 
procedure are finer-grained processes, which should appear in a different 
account).

For example, say we have a collection of two elements, C = [A, B], produced 
by Swift compound procedure P:

(data[] C) P() {
  data A = Pa();
  data B = Pb();
  C=[A,B];
}

So the contents of the collection C are produced by Pa and Pb.

Now we can have a Swift compound procedure K, which runs some procedure k 
on each element x of C (call the invocations on A and B Ka and Kb 
respectively). In Swift syntax:

K(data[] C) {
  foreach x in C { k(x); }
}

The executions of P and K can overlap temporally - for example, the 
following is a valid execution order:

Pa Ka Pb Kb

as is:

Pa Pb Kb Ka

This fits a little awkwardly with some of the material in OPM 1.01 
section 8, at least in my head - if the compound procedures P and K are 
represented as OPM processes, they have a clearly defined data dependency 
between them (the collection C) but their executions can be concurrent (in 
that the start time of K is before the end time of P).
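
To make the overlap concrete, here is a small Python sketch (not Swift or 
OPM tooling) that checks an execution order against the per-element 
dependencies and then compares when P and K are active. It writes the 
invocations of k on A and B as Ka and Kb, treats each step as atomic, and 
approximates a compound procedure's start and end by its first and last 
sub-invocation; the table and function names are all hypothetical.

```python
# Hypothetical model: a trace is a list of atomic invocations in time order.
# Each consumer maps to the invocation that produces its input element.
DEPENDS_ON = {"Ka": "Pa", "Kb": "Pb"}
# Each compound procedure maps to its finer-grained invocations.
MEMBERS = {"P": {"Pa", "Pb"}, "K": {"Ka", "Kb"}}


def is_valid(trace):
    """True if every invocation runs after the one that produced its input."""
    seen = set()
    for step in trace:
        producer = DEPENDS_ON.get(step)
        if producer is not None and producer not in seen:
            return False
        seen.add(step)
    return True


def span(trace, compound):
    """(first, last) trace positions at which the compound is active."""
    positions = [i for i, step in enumerate(trace) if step in MEMBERS[compound]]
    return positions[0], positions[-1]


def overlaps(trace):
    """True if K starts before P has finished."""
    k_start, _ = span(trace, "K")
    _, p_end = span(trace, "P")
    return k_start < p_end


interleaved = ["Pa", "Ka", "Pb", "Kb"]
sequential = ["Pa", "Pb", "Kb", "Ka"]
```

Both orders pass is_valid, but only the interleaved one has K starting 
before P ends, even though K depends on P's output C in both cases.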
