<br><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Eugene Garfield</b> <span dir="ltr"><<a href="mailto:eugene.garfield@thomsonreuters.com">eugene.garfield@thomsonreuters.com</a>></span><br>
Date: Tue, Jul 31, 2012 at 5:44 PM<br>Subject: Re: [SIGMETRICS] Online Academic Abuses and the Power of Openness: Naming & Shaming<br>To: <a href="mailto:SIGMETRICS@listserv.utk.edu">SIGMETRICS@listserv.utk.edu</a><br>
<br><br><div>
<div dir="ltr" align="left"><span><font color="#0000ff"><strong>Dear Stevan: Some of your readers may not be able to access my 1997 letter to BMJ because you have posted a link that requires a subscription to the BMJ files. The proper
link to use is
<div><strong><a href="http://garfield.library.upenn.edu/papers/bmj14june1997.pdf" target="_blank">http://garfield.library.upenn.edu/papers/bmj14june1997.pdf</a></strong></div>
<div> </div>
<div><span>where I have posted the full text. </span></div>
<div><span></span> </div>
<div><span>Unfortunately, my comments have in some cases been distorted to justify deplorable excesses in the use of references to the same journal, even though I emphasized that such references should be relevant and not mere window dressing
(a blatant attempt to increase the impact factor of the journal in question). Best wishes. Gene Garfield</span></div>
</strong></font></span></div>
<br>
<div dir="ltr" lang="en-us" align="left">
<hr>
<font face="Tahoma"><b>From:</b> ASIS&T Special Interest Group on Metrics [mailto:<a href="mailto:SIGMETRICS@LISTSERV.UTK.EDU" target="_blank">SIGMETRICS@LISTSERV.UTK.EDU</a>]
<b>On Behalf Of </b>Stevan Harnad<br>
<b>Sent:</b> Tuesday, July 31, 2012 10:03 AM<br>
<b>To:</b> <a href="mailto:SIGMETRICS@LISTSERV.UTK.EDU" target="_blank">SIGMETRICS@LISTSERV.UTK.EDU</a><br>
<b>Subject:</b> Re: [SIGMETRICS] Online Academic Abuses and the Power of Openness: Naming & Shaming<br>
</font><br>
</div>
<div></div>
<div><div class="h5">
<div style="FONT-FAMILY:Helvetica;FONT-SIZE:medium">
<div style="MARGIN:0cm 0cm 0pt"><font face="Arial"><br></font></div><div style="MARGIN:0cm 0cm 0pt"><font face="Arial">Sorry for the long delay in replying to this. I missed it, and it has just been drawn to my attention:</font></div>
<div style="MARGIN:0cm 0cm 0pt"><font face="Arial"><br>
</font></div>
<div style="MARGIN:0cm 0cm 0pt"><font face="Arial">On 10 April 2012 Gustaf Nelhans wrote:</font></div>
</div>
<div style="MARGIN:0cm 0cm 0pt;FONT-SIZE:medium"><font size="+0">
<div>
<div>
<div style="WORD-WRAP:break-word" lang="EN-US" vlink="blue" link="blue">
<div>
<div style="FONT-FAMILY:Helvetica">
<blockquote type="cite">
<div style="WORD-WRAP:break-word" lang="EN-US" vlink="blue" link="blue">
<div><font face="Arial">
<blockquote style="BORDER-LEFT:rgb(204,204,204) 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<span lang="EN-GB">Dear Professor Harnad, <br>
</span><span lang="EN-GB"> I believe that it is not always easy to identify the motives behind specific instances of self-reference (although in the case at hand, the number of mutual citations identified seems to speak for itself…). The practice of
self-citation is (as you acknowledge) not in itself a bad thing, but the problem is how to distinguish its legitimate use from its abuse.</span></blockquote>
<p class="MsoNormal"></p>
</font></div>
</div>
</blockquote>
<font face="Arial">Agreed. And in fact the outcomes of tests comparing rankings and correlation patterns based on total citation counts and on citation counts minus self-citations tend to be very similar. Nevertheless, looking at individuals' counts
with and without self-citations, and comparing them to the population norms, can raise a red flag, which can then be examined manually.<br>
</font>
<blockquote type="cite">
<div style="WORD-WRAP:break-word" lang="EN-US" vlink="blue" link="blue">
<div><font face="Arial">
<blockquote style="BORDER-LEFT:rgb(204,204,204) 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<span lang="EN-GB">This applies equally at the individual level and to editor-suggested references. I would like to draw attention to an exchange about these matters from 1997, in which Eugene Garfield stated:<br>
</span><span lang="EN-GB">“Recognising the reality of the Matthew effect, I believe that an editor is justified in reminding authors to cite equivalent references from the same journal, if only because readers of that journal presumably have ready access to
it. To call this “manipulation” seems excessive unless the references chosen are irrelevant or mere window dressing.” (</span>Garfield, Eugene. 1997. Editors are justified in asking authors to cite equivalent references from same journal. <i><span style="FONT-STYLE:italic">BMJ</span></i> 314
(7096):1765. <a href="http://www.bmj.com/content/314/7096/1765.2.short" target="_blank">http://www.bmj.com/content/314/7096/1765.2.short</a><span lang="EN-GB"> )</span></blockquote>
<p class="MsoNormal"></p>
</font></div>
</div>
</blockquote>
</div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial">Gene Garfield made this suggestion in 1997, before OA became a distinct possibility. In a world where the only way to access articles was through a subscription that your institution could afford,
"preferentially cite this journal" might have had an ounce of validity -- alongside the obvious pound of self-interest.</font></div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial"><br>
</font></div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial">But no longer today.</font></div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial"><br>
</font></div>
<div style="FONT-FAMILY:Helvetica">
<div><font face="Arial">An editor telling the author of an article to cite more articles in his journal because readers have "more access" to it is outrageous. Rather, he should tell authors to self-archive their articles (Green OA) if they really
want to make them more accessible.</font></div>
</div>
<div>
<blockquote style="FONT-FAMILY:Helvetica" type="cite">
<div style="WORD-WRAP:break-word" lang="EN-US" vlink="blue" link="blue">
<div><font face="Arial">
<p class="MsoNormal"></p>
<blockquote style="BORDER-LEFT:rgb(204,204,204) 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<span lang="EN-GB">My question is whether any method of identifying “bad apples” could exist that does not take into account the specific context in the article in which the reference is placed.</span></blockquote>
</font></div>
</div>
</blockquote>
<div style="FONT-FAMILY:Helvetica"><font face="Arial">Only in a population statistical sense. Individual anomalies flagged by the population metrics would still need to be examined manually.</font></div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial"><br>
</font></div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial">But automated text-analytic tools may eventually also become sensitive enough to make a contribution, sorting out some of the nature of the citation from the accompanying text,
not just from the author/article/journal counts.</font></div>
<blockquote type="cite">
<div style="WORD-WRAP:break-word" lang="EN-US" vlink="blue" link="blue">
<div>
<blockquote style="BORDER-LEFT:rgb(204,204,204) 1px solid;MARGIN:0px 0px 0px 0.8ex;PADDING-LEFT:1ex" class="gmail_quote">
<font face="Arial"><span lang="EN-GB">In my understanding of the problem, </span>the proposed way of using statistical methods for identifying baselines for self citations in various fields could be one important step, but I wonder if it would suffice to make
the identification process complete?</font></blockquote>
</div>
</div>
</blockquote>
<font style="FONT-FAMILY:Helvetica" face="Arial">It is a necessary but not a sufficient condition for answering all the kinds of questions one might have about uses and misuses of citations.</font></div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial"><br>
</font></div>
<div style="FONT-FAMILY:Helvetica">
<div><font face="Arial">In statistics there is always, and necessarily, a difference between population data and individual cases. Medical conditions are the best illustration: I have an illness. I want to be treated for my illness,
and not for what, on average, works most often with people that have symptoms most like mine. (See Kahneman & Tversky on the <a href="http://en.wikipedia.org/wiki/Base_rate_fallacy" target="_blank">base rate fallacy</a>.)</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">"Bad" citation patterns can be identified on a statistical basis by comparing two populations of citations, and perhaps even by comparing one individual's total citations with the population norms, to see whether
there is something anomalous (such as excess self-citation swelling the citation count).</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">But it won't tell you whether an individual citation is good or bad. It is possible to develop and apply automated text-analytic algorithms to the text surrounding a citation, to try to predict whether it is
positive or negative, and such algorithms can even be "trained up" with corrective feedback based on human evaluation of whether each individual citation was positive or negative.</font></div>
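<div><font face="Arial">To make the idea of "training up" with corrective feedback concrete, here is a minimal sketch using a toy perceptron over the words surrounding a citation. It is an illustration under stated assumptions, not a method anyone in this thread has proposed or validated: the function names, the bag-of-words features and the four labelled sentences are all invented for the example.</font></div>

```python
import re
from collections import defaultdict


def tokens(text):
    """Lower-case the citation context and split it into words."""
    return re.findall(r"[a-z]+", text.lower())


def score(weights, text):
    """Sum the learned weight of every word in the context."""
    return sum(weights[word] for word in tokens(text))


def predict(weights, text):
    """Return 1 for a positive citation context, -1 otherwise (ties count as -1)."""
    return 1 if score(weights, text) > 0 else -1


def train(labelled_contexts, epochs=10):
    """Perceptron training: weights change only when a human label contradicts the prediction."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, human_label in labelled_contexts:
            if predict(weights, text) != human_label:
                # Corrective feedback: push every word toward the human judgement.
                for word in tokens(text):
                    weights[word] += human_label
    return weights


# Hypothetical hand-labelled citation contexts (1 = positive, -1 = negative).
labelled = [
    ("this elegant method confirms and extends the earlier result", 1),
    ("we build on the robust findings reported there", 1),
    ("the analysis is flawed and fails to replicate", -1),
    ("these claims are contradicted by later data", -1),
]

model = train(labelled)
print(predict(model, "an elegant and robust method"))
print(predict(model, "a flawed analysis that fails"))
```

<div><font face="Arial">A real classifier would need far richer features and, as noted, a large amount of hand-validated data; the sketch only shows where the human judgements enter the loop.</font></div>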
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">But it's early days for both of these, and validating statistical predictors will first take an awful lot of individual hand-validation in order to test and improve the algorithms.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">But for journals or individuals it is definitely possible to check computationally whether they deviate from population norms/baselines, and then look at the cases that the population anomaly detectors single
out, and check them manually to see whether they are indeed cases of bad faith, legitimate practice, or just statistical anomalies.</font></div>
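<div><font face="Arial">As an illustration of such a population anomaly detector, the sketch below flags journals (or individuals) whose self-citation rate lies far above the population median. The robust z-score based on the median absolute deviation, the 3.5 cutoff, the function name and the toy counts are all assumptions chosen for the example; anything it flags is only a candidate for the manual check described above.</font></div>

```python
from statistics import median


def flag_self_citation_outliers(counts, threshold=3.5):
    """Return the names whose self-citation rate is unusually high for the population.

    counts maps a name to a pair (self_citations, total_citations).
    """
    rates = {name: s / t for name, (s, t) in counts.items() if t > 0}
    centre = median(rates.values())
    # Median absolute deviation: a spread estimate that outliers barely distort.
    mad = median(abs(rate - centre) for rate in rates.values())
    if mad == 0:
        return []  # no measurable spread, so nothing can be called anomalous
    flagged = []
    for name, rate in rates.items():
        # The factor 0.6745 makes the score comparable to an ordinary z-score.
        robust_z = 0.6745 * (rate - centre) / mad
        if robust_z > threshold:
            flagged.append(name)
    return sorted(flagged)


# Invented toy data: (self-citations, total citations) per journal.
toy_counts = {
    "Journal A": (12, 100),
    "Journal B": (10, 100),
    "Journal C": (15, 100),
    "Journal D": (9, 100),
    "Journal E": (11, 100),
    "Journal F": (70, 100),
}

candidates = flag_self_citation_outliers(toy_counts)
print(candidates)
```

<div><font face="Arial">In practice the baseline would have to be computed per field, since self-citation norms differ between disciplines.</font></div>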
</div>
<div style="FONT-FAMILY:Helvetica"><font face="Arial"><br>
</font></div>
<div style="FONT-FAMILY:Helvetica">
<div><font face="Arial">Citation cartels (and many other systematic abuses) are more detectable if the entire corpus is accessible precisely because everybody can detect them: no need to wait to see whether proprietary database owners
with other interests get around to or see fit to provide the data needed to monitor and detect abuses.</font></div>
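<div><font face="Arial">With an open corpus, anyone could run a check along the following lines. This is a rough sketch, assuming the corpus has already been reduced to journal-to-journal citation counts: it lists pairs of journals that each send a large share of their outgoing citations to the other. The 30% share, the function name and the toy counts are hypothetical, and a listed pair is a prompt for manual inspection, not evidence of a cartel.</font></div>

```python
from collections import defaultdict


def reciprocal_pairs(citations, min_share=0.3):
    """Return journal pairs that cite each other unusually heavily.

    citations maps (citing_journal, cited_journal) to a citation count.
    """
    outgoing = defaultdict(int)
    for (citing, _cited), count in citations.items():
        outgoing[citing] += count

    flagged = []
    for (a, b), a_to_b in citations.items():
        # Handle each unordered pair once and ignore a journal citing itself.
        if a >= b:
            continue
        b_to_a = citations.get((b, a), 0)
        share_a = a_to_b / outgoing[a] if outgoing[a] else 0.0
        share_b = b_to_a / outgoing[b] if outgoing[b] else 0.0
        # Both directions must be heavy; one-way dependence is not reciprocal.
        if share_a >= min_share and share_b >= min_share:
            flagged.append((a, b))
    return sorted(flagged)


# Invented toy citation counts between four journals.
toy_citations = {
    ("J1", "J2"): 40, ("J2", "J1"): 35,
    ("J1", "J3"): 10, ("J2", "J3"): 15,
    ("J3", "J1"): 5, ("J3", "J2"): 5,
    ("J3", "J4"): 90, ("J4", "J3"): 8,
    ("J4", "J1"): 92,
}

pairs = reciprocal_pairs(toy_citations)
print(pairs)
```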
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">Global OA not only provides the open database, but it provides the (continuous) open means of flagging anomalies in the population pattern, checking them, and naming and shaming the cases where there really has
been willful misuse or abuse.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">It's yet another potential application for crowd-sourcing.</font></div>
<div><font face="Arial"><br>
</font></div>
<div><font face="Arial">Stevan Harnad</font></div>
<div><font face="Arial"><br>
</font></div>
</div>
<div style="FONT-FAMILY:Helvetica">
<div>
<div style="MARGIN:0px 0px 0px 40px;FONT:12px 'Times New Roman'"><font face="Arial"><b>Harnad, S. (2008) </b><a href="http://www.int-res.com/abstracts/esep/v8/n1/p103-107/" target="_blank"><span style="COLOR:rgb(24,65,170)"><b>Validating Research
Performance Metrics Against Peer Rankings</b></span></a><b>.</b> <i>Ethics in Science and Environmental Politics</i> 8 (1): 103-107. doi:10.3354/esep00088. In: The Use And Misuse Of Bibliometric Indices In Evaluating Scholarly Performance. <a href="http://eprints.ecs.soton.ac.uk/15619/" target="_blank"><span style="COLOR:rgb(24,65,170)">http://eprints.ecs.soton.ac.uk/15619/</span></a></font></div>
<div style="MARGIN:0px 0px 0px 40px;MIN-HEIGHT:15px;FONT:12px 'Times New Roman'">
<font face="Arial"><b></b><br>
</font></div>
<div style="MARGIN:0px 0px 0px 40px;FONT:12px 'Times New Roman'"><font face="Arial"><b>Harnad, S. (2009) </b><a href="http://eprints.ecs.soton.ac.uk/17142/" target="_blank"><span style="COLOR:rgb(24,65,170)"><b>Open Access Scientometrics and the
UK Research Assessment Exercise</b></span></a><b>. <i>Scientometrics</i> 79 (1) </b>Also in <i>Proceedings of 11th Annual Meeting of the International Society for Scientometrics and Informetrics</i> 11(1), pp. 27-33, Madrid, Spain. Torres-Salinas, D. and Moed,
H. F., Eds. (2007) </font></div>
<div style="MARGIN:0px 0px 0px 40px;MIN-HEIGHT:15px;FONT:12px 'Times New Roman'">
<font face="Arial"><br>
</font></div>
<div style="MARGIN:0px 0px 0px 40px;FONT:12px 'Times New Roman';COLOR:rgb(22,54,238)">
<font face="Arial"><span style="COLOR:rgb(0,0,0)"><b>Harnad, S. (2009) </b><a href="http://openaccess.eprints.org/index.php?/archives/508-guid.html" target="_blank"><b>Multiple metrics required to measure research performance</b></a><b>. </b><a href="http://www.nature.com/nature/journal/v457/n7231/full/457785a.html" target="_blank">Nature
(Correspondence) 457 (785) (12 February 2009<span style="FONT:14px 'Times New Roman'">)</span></a></span><span style="FONT:14px 'Times New Roman';COLOR:rgb(0,0,0)"> </span></font></div>
<div><span style="font:14px 'Times New Roman'"><font size="3" face="Arial"><br>
</font></span></div>
</div>
</div>
</div>
</div>
</div>
</div>
</font></div>
</div></div></div>
</div><br>