[provenance-challenge] Re: PC3 workflow code available and schedule
Paul Groth
pgroth at gmail.com
Thu Feb 26 19:55:45 GMT 2009
Hi Simon,
These look great. Can you post them to the wiki?
Thanks,
Paul
On Feb 26, 2009, at 11:35 AM, Miles, Simon wrote:
> Hello Paul, Yogesh,
>
> I've got a few suggestions of queries for the provenance challenge. I
> will not be offended if you ignore those you don't find interesting
> enough to use :-). I include the queries below, but am happy to put
> them on the Wiki myself if you tell me to.
>
> =========
>
> 1. For a given detection, which CSV files contributed to it?
>
> Basic sample answer: The CSV file containing the Detection table.
>
> Advanced sample answer: The CSV file containing the Detection table,
> CSV file containing the Image table (as the image is an attribute of
> the detection), and CSV file containing the FrameMetadata table (as
> the frame metadata is an attribute of the image).
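One way to read the "advanced" answer to query 1 is as a transitive walk over derivation edges, starting from the detection. Below is a minimal sketch under that reading. The edge map, the artifact names (detection_42, image_7, frame_meta_3) and the CSV file names are hypothetical illustrations, not output of the PC3 workflow or of any team's provenance system.

```python
from collections import deque

# Hypothetical "was derived from" edges: artifact -> artifacts it depends on.
# All names here are made up for illustration.
DERIVED_FROM = {
    "detection_42": ["DetectionTable", "image_7"],
    "DetectionTable": ["Detection.csv"],
    "image_7": ["ImageTable", "frame_meta_3"],
    "ImageTable": ["Image.csv"],
    "frame_meta_3": ["FrameMetadataTable"],
    "FrameMetadataTable": ["FrameMetadata.csv"],
}


def contributing_csv_files(artifact, edges):
    """Return every CSV file reachable from `artifact` via derivation edges."""
    seen = {artifact}
    queue = deque([artifact])
    found = set()
    while queue:
        node = queue.popleft()
        if node.endswith(".csv"):
            found.add(node)
        # Follow each dependency once, breadth-first.
        for parent in edges.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return found


print(sorted(contributing_csv_files("detection_42", DERIVED_FROM)))
```

With these assumed edges the walk reaches all three CSV files, which corresponds to the advanced sample answer. The basic answer would follow only the detection's own table.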
>
> =========
>
> 2. A CSV or header file is deleted during the workflow's execution.
> How much time expired between a successful IsMatchCSVFileTables test
> (when the file existed) and an unsuccessful IsExistsCSVFile test (when
> the file had been deleted)?
>
> Sample answer: 3ms
>
> For testing the above query, it may be simplest to edit the
> workflow to include deletion of the CSV file as a step.
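Query 2 amounts to subtracting two timestamps taken from the recorded process executions. The sketch below assumes a flat event log of (timestamp, operation, file, succeeded) tuples. That log layout, the timestamps and the file name are hypothetical; only the two test names come from the query itself.

```python
from datetime import datetime, timedelta

# Hypothetical event log: (timestamp, operation, file, succeeded).
EVENTS = [
    (datetime(2009, 2, 26, 11, 0, 0, 1000), "IsMatchCSVFileTables", "Image.csv", True),
    (datetime(2009, 2, 26, 11, 0, 0, 4000), "IsExistsCSVFile", "Image.csv", False),
]


def elapsed_ms(events, path):
    """Milliseconds between the last successful IsMatchCSVFileTables test
    and the first failed IsExistsCSVFile test that follows it for `path`.
    Returns None if no such pair of events is recorded."""
    last_ok = None
    for ts, op, name, ok in sorted(events, key=lambda e: e[0]):
        if name != path:
            continue
        if op == "IsMatchCSVFileTables" and ok:
            last_ok = ts
        elif op == "IsExistsCSVFile" and not ok and last_ok is not None:
            # Dividing two timedeltas yields a plain float.
            return (ts - last_ok) / timedelta(milliseconds=1)
    return None


print(elapsed_ms(EVENTS, "Image.csv"))
```

The answer is only as precise as the clock resolution of whatever system recorded the two events.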
>
> =========
>
> 3. The user considers a table to contain values they do not expect.
> Was the range check (IsMatchTableColumnRanges) performed for this
> table?
>
> Sample answer: Yes
>
> =========
>
> 4. The workflow halts due to failing an IsMatchTableColumnRanges
> check. How many tables successfully loaded before the workflow halted
> due to a failed check?
>
> Sample answer: 2
>
> =========
>
> Finally, a couple of questions inspired by dynamic program slicing:
>
> 5. Which operation executions were strictly necessary for the Image
> table to contain a particular (non-computed) value?
>
> Sample answer: call of ReadCSVReadyFile, call of CreateEmptyLoadDB,
> 2nd call of ReadCSVFileColumnNames, 2nd call of LoadCSVFileIntoTable
> (2nd calls because Image is loaded in the 2nd iteration of the for
> loop, excluded checks because they do not change anything, excluded
> UpdatedComputedColumns because it is non-computed, excluded
> CompactDatabase because it does not affect the value).
>
> =========
>
> 6. Which pairs of procedures in the workflow could be swapped and the
> same result still be obtained (given the particular data input)?
>
> Sample answer: (I won't enumerate them all, but I think some can be
> swapped as the checks in particular are not causally dependent, but we
> cannot swap those inside the loop with those outside).
>
> Thanks,
> Simon
>
> 2009/2/6 Paul Groth <pgroth at isi.edu>:
>> Hi Everyone,
>>
>> Yogesh Simmhan has now made the code, in both C# and Java, available
>> for the PanStarrs workflow. Everything is available at
>> http://twiki.ipaw.info/bin/view/Challenge/ThirdProvenanceChallenge
>> We are looking for your help in reviewing the code and proposing
>> provenance queries for the challenge. We are aiming to complete this
>> by the end of the month so that we can start the challenge.
>>
>> The proposed PC3 schedule is as follows:
>>
>> 1. Review of code and provenance query proposals (to Feb 27)
>> - March 2 - PC3 Starts
>> 2. Make the workflow work with individual teams' systems [Mar 2 - Mar 30]
>> 3. Generate provenance for the challenge workflow & run queries on it
>> [Mar 30 - Apr 13]
>> 4. Export OPM Graphs and import from others [Apr 13 - May 4]
>> 5. Run queries on imported OPM graph [Apr 27 - Jun 1]
>> 6. Prepare slides for challenge [Jun 1 - Jun 8]
>> - PC3 Workshop June 10 - 11 held in Amsterdam
>>
>> Thanks for your participation and I look forward to seeing your
>> provenance queries and comments on the code.
>>
>> Paul
>>
>>
>>
>> --------------------------------------------------------------
>> Paul Groth, Ph.D.
>> Postdoctoral Research Associate
>> Information Sciences Institute
>> University of Southern California
>> pgroth at isi.edu
>> Tel: 310 448 8482 Fax: 310 822 0751
>> http://www.isi.edu/~pgroth/
>> http://thinklinks.wordpress.org
-------------------------------------------------------
pgroth at gmail.com http://thinklinks.wordpress.org