[GOAL] Re: CC-BY: the wrong goal for open access, and neither necessary nor sufficient for data and text mining

Ross Mounce ross.mounce at gmail.com
Tue Oct 9 21:10:25 BST 2012


Hi Heather,

I'm aware from previous encounters that we disagree on the licensing of
Open Access, and I don't want this to devolve into a personal point-scoring
affair, but I do have to take issue with your assertion that:

"just minutes ago you were proudly asserting that you and other researchers
are knowingly using illegal methods for gaining access to research
literature such as asking for PDFs over twitter. Which is it, Ross - do
academics need to be accountable and transparent, or can they do what they
like?"

I re-read what I posted previously just to be sure, and nowhere did I say
*I* used these alternative methods for accessing research (although I can
forgive you for inferring that I have); I merely observed that they are
commonplace in research. Anyway, no offence taken.

Furthermore, I might add that this kind of copyright infringement is only
illegal in some jurisdictions (not worldwide AFAIK). I know this is perhaps
a dubious source, but this Wikibooks article suggests that file sharing
(without profit) is legal in Canada:
http://en.wikibooks.org/wiki/Intellectual_Property_and_the_Internet/Copyright_infringement#Countries_where_sharing_files_without_profit_is_legal
and it appears to be legal in Russia too under certain circumstances, e.g.
"home use": http://civil-code.narod.ru/ch69-art1271-1273.html

Your position on the alternative to CC-BY by default for OA:

"articulating the commons (what it can or should mean) is a long-term
project, and advocating for specific licenses shuts down the conversation
prematurely."

is an interesting one, and I applaud the spirit. But I do not think it is
practical. Researchers IMO have a very poor understanding of licensing and,
if given a choice, will often make poor or inappropriate choices.

You "think CC-BY-NC-SA is the strongest license for open access, as it
protects OA downstream".

I see that argument and I think I understand the logic behind it.
But again, in practice the SA clause means that content that doesn't
carry that *exact* licence (CC-BY-NC-SA) cannot be remixed with content
under this licence (Ball, 2011 & many other sources). SA clauses are well
known to cause such awkward incompatibility issues. The NC clause prevents
wealth generation from research (which is arguably exactly why RCUK is
asking for CC-BY: to NOT prevent wealth-generating usage of research). So,
in short, I do not think CC-BY-NC-SA is best for Open Access.

WRT your point 2, "CC-BY is not sufficient for data and text-mining": nor
is *any* applicable licence AFAIK. I know of no licence that requires
digital material to be made available in a readily machine-interpretable
form.

You wrote: "Similarly, if the aim is to encourage publication of reusable
tables, then demanding CC-BY is not helpful. You can publish images with
the CC-BY license."

So, we agree here I think. I said licensing was irrelevant to this problem,
and you have pointed out that CC-BY does not prevent this problem. These
opinions agree, though perhaps as two sides of the same coin :)

A good note to finish on?

Best,

Ross

Ball, A. 2011. How to License Research Data. DCC How-to Guides. Edinburgh:
Digital Curation Centre.
http://www.dcc.ac.uk/resources/how-guides/license-research-data

On 9 October 2012 19:52, Heather Morrison <hgmorris at sfu.ca> wrote:

> On 2012-10-09, at 10:47 AM, Ross Mounce wrote:
>
> >
> > 1.      CC-BY is not necessary for data and text-mining.
> >
> > In some sense true, it is not *strictly* necessary
>
> Glad we agree on that!
>
>
> > - but it sure does alleviate concerns over being sued!
> > Google can 'get away with it' because they don't need to document the
> in-between steps - transparency. Researchers and academics *do* need to be
> able to display reproducible literature mining techniques and thus will
> need to reproduce some published content (in my understanding) in order to
> show that their methods work as described. Thus there is an easily
> explainable difference between Google's needs (no need for transparency,
> just present the results of the mining analyses without republishing the
> analysed content), and the needs of academic research
> (reproducibility/transparency demonstrated by reproducing some
> annotated/analysed content AND results). I'm sure there are other reasons
> too but AFAIK CC-BY is 'best' for mining (well, CC0 would be better, but
> that's not realistic for OA)
>
> Ross, just minutes ago you were proudly asserting that you and other
> researchers are knowingly using illegal methods for gaining access to
> research literature such as asking for PDFs over twitter. Which is it, Ross
> - do academics need to be accountable and transparent, or can they do what
> they like?
>
> >
> > As you well know other licences like CC-BY-NC leave one uncomfortably
> open to legal action if one posts such material on say, an ad-supported
> blog.
>
> Forcing CC-BY could well leave one open to legal action. Picture, for
> example, a research subject whose picture is used for advertising purposes
> without their permission, or a scholar whose work is used in this manner
> who actively disagrees with the ad (e.g. a researcher whose conclusions
> suggest that one should avoid a drug, and a pharma company that
> cherry-picks a bit of the article that appears to support use of the drug).
>
> Speaking of open and transparent methods, are researchers telling human
> research subjects that their contributions may be given away on a blanket
> basis for third parties to sell? Would a research ethics committee even
> approve such an approach? Without this permission, I would argue that
> CC-BY, where human subjects are involved, will frequently be in violation
> of research ethics.
>
> As part of my dissertation, I did some interviews with senior people in
> academic publishing. The results were very interesting, and in some cases I
> have quoted the respondent at some length. I can assure you that I did not ask
> permission from these people to give away rights to sell their words to
> others, and if I had wanted to do so, I would have needed to clear this
> with research ethics first.
>
>
> > I do not believe Open Access should prevent the sharing of materials on
> blogs and other popular places/uses and thus CC-BY is the 'safest' licence
> from the re-user POV.
> >
> > Digital content placed publicly on the internet needs *a* licence, and
> for OA research works; CC-BY looks like the best of those available to me.
> You are free to suggest an alternate licence and I think it would help your
> argument if you actually did, rather than just criticizing one option and
> seemingly providing no alternative.
>
> I do not agree that licensing is necessarily needed, or always helpful. My
> own position is that articulating the commons (what it can or should mean)
> is a long-term project, and advocating for specific licenses shuts down the
> conversation prematurely. Of the CC licenses, I think CC-BY-NC-SA is the
> strongest license for open access, as it protects OA downstream. However,
> there may be good reasons for not allowing derivatives, and so I do not
> recommend insisting that everyone use any one particular license.
> >
> >
> > 2.  CC-BY is not sufficient for data and text-mining. The Creative
> Commons licenses are designed as a means for creators to waive rights that
> they would otherwise have under copyright; they do not place any
> obligations on the Licensor. There is nothing to stop a creator from using
> a CC-BY license with a locked-down PDF with extra DRM designed to prevent
> data and text-mining.
> >
> >
> > I also see the problem described here.
>
> Thanks - interjecting for emphasis, I think we might be getting
> somewhere...
>
>
> > But licensing and CC-BY have nothing to do with this problem!
> >
> > The problem described here, in my words is: obfuscation. This kind of
> thing is commonly encountered when publishers publish non-machine
> interpretable tables of data as *images* in academic works rather than
> copy-pasteable numbers or data as they should do.  It doesn't matter what
> the licence is, CC-BY or even All Rights Reserved(!) - it's very difficult
> to mine usable correct information out of such tables/content. As a further
> example, they could provide all the text as a 'screenshot' style image to
> further hamper mining efforts. Thus I'm afraid point 2 bears no relevance
> to Open Access & CC-BY.
>
> Similarly, if the aim is to encourage publication of reusable tables, then
> demanding CC-BY is not helpful. You can publish images with the CC-BY
> license.
>
> best,
>
> Heather Morrison
>
> >
> >
> > Ross
> >
> > --
> > -/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
> > Ross Mounce
> > PhD Student & Open Knowledge Foundation Panton Fellow
> > Fossils, Phylogeny and Macroevolution Research Group
> > University of Bath, 4 South Building, Lab 1.07
> > http://about.me/rossmounce
> > -/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
> > _______________________________________________
> > GOAL mailing list
> > GOAL at eprints.org
> > http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
>
>



-- 
-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-
Ross Mounce
PhD Student & Open Knowledge Foundation Panton Fellow
Fossils, Phylogeny and Macroevolution Research Group
University of Bath, 4 South Building, Lab 1.07
http://about.me/rossmounce
-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-/-