<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jan 24, 2017 at 2:10 PM, Heather Morrison <span dir="ltr"><<a href="mailto:Heather.Morrison@uottawa.ca" target="_blank">Heather.Morrison@uottawa.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div>Another critique that may be more relevant to this argument: I challenge PMR's contention that it is necessary to limit this kind of research to works that are licensed CC-BY. If you gather data from a great many different tables and analyze it, what you
will be publishing is your own work. </div>
<div><br>
</div>
<div>This is no different from doing a great deal of reading and thinking and writing a new work that draws on this knowledge, with appropriate citations to the works that you have read.</div>
<div><br>
</div>
<div>Copyright is only invoked if you want to actually copy an original table for inclusion in a publication. If you are drawing on data from thousands of tables it is not clear how often this will happen. If what you want to copy is an insubstantial amount
this would be covered under fair dealing. If the work is free-to-read, whether All Rights Reserved or under an open license, you can point readers to the original. At worst, this is a minor inconvenience.</div></div></blockquote><div><br></div><div>This is completely wrong. The problem is that this is a legal issue, and copyright law, by default, covers all aspects of copying. Copying material into a machine for the purpose of mining involves copyright. Whether it seems reasonable or fair is irrelevant. If you carry out mining then you should be prepared to answer in court.<br><br></div><div>The problem is compounded by the following:<br></div><div>* It is jurisdiction-dependent. Fair use exists only in certain jurisdictions. It is not the same as fair dealing, which is generally weaker. What is permissible in the US may not be in the UK, and vice versa.<br></div><div>* It is extremely complex. Guessing the law will not be useful.<br></div><div>* Much of the law has not been tested in court. "Non-commercial" is not what you or I would like it to mean. It is what a court finds when I or others are summoned before it.<br><br></div><div>I have been involved in this for over 4 years in the UK and in Europe (Parliament and Commission). There is no consensus on what should be allowed, or on what will ultimately be decided by the Commission and Member States. I have taken legal opinion on some of this and consulted with other experts, and the answers are often unclear.<br></div><div><br></div><div>The legality of Text and Data Mining is formally unrelated to whether the miner publishes the results or not.<br><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">
<div><br>
</div>
<div>If you prefer to limit your research to works that are CC-BY licensed, it is your right to make this choice. Many other researchers, myself included, work with a wide range of data and do not choose to limit what we gather to works that are licensed CC-BY.
One example from my own research: if a publisher has a table listing APCs, I screen scrape the table, pop the data into a spreadsheet, and work with it. </div></div></blockquote><div><br></div><div>The primary issue for Text and Data Mining is automated analysis of many tables. This is an inconsistency in the law that we are trying to get legislators to change.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>Even publishers that use CC-BY for articles usually have All Rights Reserved for pages that contain this
type of information. </div></div></blockquote><div><br></div><div>Do you have metrics for this? It is incompatible with the licence and should be challenged, as I frequently do.<br> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div>If I limited myself to data sources that are CC-BY I could not do this kind of research.</div></div></blockquote><div><br></div><div>I agree that this is limiting, and that is why it would be useful for scientific material to be licensed CC BY. <br><br></div><div>In summary, this is a complex legal question and the answers have to be based on law, not guesswork.<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">
<div><br>
</div><br clear="all"></div></blockquote></div><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>Peter Murray-Rust<br>Reader Emeritus in Molecular Informatics<br>Unilever Centre, Dept. of Chemistry<br>University of Cambridge<br>CB2 1EW, UK<br>+44-1223-763069</div></div></div>
</div></div>