[GOAL] Re: OASPA's ironic demonstration of the inadequacy of CC-BY for data mining

Hans Pfeiffenberger hans.pfeiffenberger at awi.de
Tue Mar 12 23:28:37 GMT 2013


On 12.03.13 18:06, David Prosser wrote:
> (An additional complication is, of course, that we are really talking about data here rather than papers, and so perhaps a database license would be even more appropriate.)

Indeed, facts are not copyrightable - at least in Germany ;-)) - and 
thus a CC-License (except perhaps CC0) or any other license _based on 
copyright_ (or German Urheberrecht) would be mostly pointless. (For a 
comparison of the situation in some jurisdictions, see 
http://www.knowledge-exchange.info/Default.aspx?ID=461; there seems to 
be a "risk" that /some/ data might be copyrightable under UK or Danish 
law.)

A database right in EU jurisdictions (not in the US, as far as I 
understand from unreliable sources) comes into play only if some entity 
has made a "significant" investment in the database as such, not 
counting the cost of acquiring the data in the database. This is clearly 
not the case here.

As far as I understand, a simple table of numbers is not protectable 
(in most cases) as soon as it is out "in the wild". So there is no need 
for a license if you intend to make it freely (libre) available. (In 
contrast, text and photos are protected even if there is no (C) mark 
on them.)

As to Heather's argument in her blog, that "On the Internet, the way 
to note that a web page is /not/ available for text and data mining is 
to use the norobots.txt in the web page's metadata.": That is true, 
but not necessarily accepted by lawyers. Proof: German newspaper 
publishers are lobbying - quite successfully, so far - for Google to 
pay for displaying snippets, while still implicitly expecting to be 
listed in Google search. "Ironically", they do use robots.txt, but not 
to keep Google or any other big search engine away. (AFAIK there is no 
"norobots.txt". Also, robots.txt is a file at the root "/" of a 
_site_, not "in the metadata" of a _page_.)
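
To make the site-versus-page point concrete, here is a minimal sketch 
using Python's standard urllib.robotparser. The host example.org, the 
crawler names "ExampleMiner" and "OtherBot", and the rules themselves 
are made up for illustration; they are not taken from any real 
publisher's site.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL of the site that hosts page_url.

    The file always sits at the root of the site, whatever the page path.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def may_fetch(robots_txt: str, user_agent: str, page_url: str) -> bool:
    """Check a page URL against robots.txt rules supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


# Hypothetical rules: shut out one named crawler completely and keep
# every other crawler out of /private/ only.
EXAMPLE_RULES = """\
User-agent: ExampleMiner
Disallow: /

User-agent: *
Disallow: /private/
"""

page = "http://example.org/data/table.csv"
print(robots_url_for(page))
print(may_fetch(EXAMPLE_RULES, "ExampleMiner", page))
print(may_fetch(EXAMPLE_RULES, "OtherBot", page))
```

Note that this only expresses a request to well-behaved crawlers; 
whether it has any legal weight is exactly the open question.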

My overall point is that one cannot assume that what seems appropriate 
or sensible will be seen as legal or unproblematic by lawyers. And 
nobody can justify building an infrastructure or even a common 
practice on shaky ground. So a simple, unambiguous and (hopefully) 
internationally identical legal environment is indispensable for 
research and information infrastructures. One of the outstanding 
features of CC is that it is providing such an environment for text - 
except for the NC clause, which is wide open for doubt about its meaning.

best,

Hans