[GOAL] Re: Open data

Jan Velterop velterop at gmail.com
Wed May 9 09:36:12 BST 2012


On 9 May 2012, at 00:53, Andrew A. Adams wrote:

> Jan Velterop wrote:
>> The trouble with focussing on 'green', rather than on full
>> BOAI-compliant OA for research literature, is that it has become an a
>> priori concession and an end in itself. That only confuses matters (as
>> do ill-defined labels such as 'gratis' and 'libre'). 
> 
>> We should insist on BOAI-compliant OA (CC-BY or CC-0) for all research
>> articles, including for self-archived articles. And if anything, we
>> should insist on institutional repositories to actually be searchable
>> and accessible also for text mining. Human-readable OA is a conditio
>> sine qua non, but it is not sufficient for modern science.
> 
> The trouble with focussing on this high level

So you're saying that the OA definition in the BOAI is too high level?

> is that there isn't agreement 
> amongst scientists that this is what is needed

Really? What do scientists agree on then?

> , and on exactly what the 
> limits of this are. Indeed, opening up one's data can involve significantly 
> more work for the scientist/scholar.

Who is talking about opening up one's data? That may be desirable in its own right, but it is not part of BOAI-compliant OA. Any data in this context is just data "as published in articles" (and therefore potentially mine-able).

> Green OA requires a few keystrokes per 
> paper. Perhaps five minutes work (so long as one's repository is set up well 
> and one keeps focussed on providing the paper's text and does not get too 
> hung up about more than citing meta-data).

What I'm after is that green OA should not weaken the BOAI (in which the definition of OA equates to libre OA) to mean just 'gratis' (humans can read it, but it can't be used for text-mining).
> 
> I have a PhD student, for example, who has just finished her thesis. The 
> thesis and papers from it contain various statistics and quotes drawn from 
> her field work. We are still working on further papers from the thesis with 
> an expectation of two more to come. The data has been appropriately developed 
> for the publications written at present, but the full interviews from which 
> quotes are drawn have not had their sources pseudonymised; the numeric data has 
> only been systematised for the precise analyses used in the thesis and 
> the papers. Some of it is in incompatible file formats, with chunks of the raw 
> data put into different tools in different subsets (overlapping, but with no 
> single complete set in any one tool).

As said, not relevant in the context of BOAI-compliant OA to published articles.

> 
> What rights to first publication of specific analysis on this data do my 
> student and her supervisors have? Which elements of the data are required to 
> be made available?
> 
> If we wait until we can answer these questions before providing the 
> additional access to the existing outputs

Additional access? Is 'libre' as opposed to 'gratis' OA "additional access"? Or are you against re-use of data published in peer-reviewed articles? If so, that means you're against the BOAI.

> then we are likely to wait another 
> twenty years or more before achieving full access to the papers. Yes, in a 
> few fields perhaps, the data must be in a publishable form before a paper can 
> be published, but there are currently no social mechanisms, and indeed few 
> technological mechanisms, that can cope with providing this data at present. 
> There is an easy, simple solution to providing access to the text of papers: 
> put a pdf, word, html, rtf, odt or even plain text of the author's final 
> submitted text in an institutional repository or the opendepot.

Indeed, and make it 'libre' rather than just 'gratis'. In fact, do away with both 'libre' and 'gratis', since OA as defined by the BOAI suffices.

> 
> Human-readable OA is within our grasp but we're not grasping it!

Let us not assume that putting, say, pdfs and Word documents in repositories prevents them in principle from being machine-readable and text-mine-able. To do so is to grossly underestimate what computer scientists can already do, and I fully expect generally available tools to appear soon. If there are any impediments to text-mining them, they are more likely due to the repository than to the file format. Certainly there is no reason to deny any article BOAI-compliance. Where does the 'gratis' versus 'libre' distinction come from anyway? 'Gratis' is an unnecessary weakening of what is defined as OA in the BOAI.


> Let us grasp 
> this first and THEN go on to sort out the more difficult issues.

What's more difficult about self-archiving under 'libre' (BOAI-compliant OA) conditions than under just 'gratis' conditions?

> Otherwise 
> we're just fiddling while Rome burns (struggling to reform the whole of 
> scholarly and scientific communications in one go rather than doing what is 
> simple and achievable now with little in the way of controversy about its 
> beneficial effects on science and scholarship and then and only then dealing 
> with the more difficult issues).
> 
> 
> -- 
> Professor Andrew A Adams                      aaa at meiji.ac.jp
> Professor at Graduate School of Business Administration,  and
> Deputy Director of the Centre for Business Information Ethics
> Meiji University, Tokyo, Japan       http://www.a-cubed.info/
> 
> 
> 