[GOAL] Re: Open Access for Science in UK - request for sources
Jenny Molloy
jenny.molloy at okfn.org
Mon Jul 16 15:25:45 BST 2012
Hi Peter,
Assuming Yassine used the same methodology as in the paper below, the
details are published:
Gargouri Y, Hajjem C, Larivière V, Gingras Y, Carr L, et al. (2010)
Self-Selected or Mandated, Open Access Increases Citation Impact for
Higher Quality Research. PLoS ONE 5(10): e13636.
doi:10.1371/journal.pone.0013636
Available from:
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636
Quote:
The full-text OA status of the articles in our sample was verified using an
automated webwide search-robot [8]
<http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636#pone.0013636-Hajjem1>
as well as an automated Google Scholar search. (Note that any OA articles that
our robot missed would reduce any OA Advantage. Hence our estimate of the
OA Advantage is conservative.) Figure 1
<http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636#pone-0013636-g001>
shows each of our four mandated institutions' verified annual OA article
deposits as a percentage of the institution's total published article output
for each year based (only) on those articles published in the journals indexed
by the Thomson-Reuters citation database; the resulting estimate of the
overall OA mandate compliance rate is about 60% (for publishing years
2002–2006, with the deposits up to 2009, when the analysis was conducted).
Note also the robot data's confirmation of the approximately 15% baseline
for spontaneous, self-selected (i.e., non-mandated) OA self-archiving among
the control articles in the same journal/years [19]
<http://www.plosone.org/article/info:doi/10.1371/journal.pone.0013636#pone.0013636-Bjrk1>.
Ref 8 is to Hajjem, Chawki, Harnad, Stevan and Gingras, Yves (2005) Ten-Year
Cross-Disciplinary Comparison of the Growth of Open Access and How it
Increases Research Citation Impact. *IEEE Data Engineering Bulletin*, 28,
(4), 39-47.
Available from: http://eprints.soton.ac.uk/262906/
Quote:
The robot’s search algorithm was the following:
(1) Send request to ISI database for metadata of article (first author
    name and article title).
(2) Send request (name, title) to: Yahoo, Metacrawler, Vivissimo, Eo,
    AlltheWeb and Altavista.
(3) Extract external (irrelevant) links.
(4) Remove duplicate URLs.
(5) Sort URLs to process PDF and PS files first (probable full-texts).
(5) Convert files (PDF, PS, Latex, HTML, XML, RTF, and Word) to text.
(6) Parse files to test for full-text of reference article (name/title in
    first 20% of text, references in last 20%).
(7) If, in parsing HTML file, title found but not full text, extract and
    follow links in file further as references possibly leading to the
    full text (to depth of 3 levels).
(8) Sort articles by discipline/journal/issue/year; calculate percent OA
    articles within each; then by discipline/journal; and finally for each
    discipline.
(9) Sort articles by discipline/journal/issue/year, calculate citation
    ratio as (OA - NOA/NOA) within each, then by discipline/journal and
    finally for each discipline.
(10) Exclude data for all journals that are 100% OA (OA journals) from both
    the article counts and the citation counts (as we are only doing
    within-journal comparisons for NOA journals); exclude data from all
    single issues that are 100% OA (to eliminate denominators).
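
To make steps (6) and (8)-(10) of the quoted algorithm concrete, here is a
rough Python sketch. It is not the robot's actual code, which I have not
seen. The function names are hypothetical, spotting the reference list by
the word "references" or "bibliography" is my assumption, and I read the
citation ratio as (OA - NOA)/NOA over mean citation counts, which the quote
does not spell out.

```python
import re


def looks_like_full_text(text, author, title, head=0.2, tail=0.2):
    """Step (6), roughly: author name and title in the first 20% of the
    text, and a reference list in the last 20%.

    Assumption: the reference list is detected by the word "references"
    or "bibliography"; the quoted description does not say how.
    """
    norm = re.sub(r"\s+", " ", text).lower()
    n = len(norm)
    if n == 0:
        return False
    head_part = norm[: int(n * head)]
    tail_part = norm[int(n * (1 - tail)):]
    want_author = re.sub(r"\s+", " ", author).lower()
    want_title = re.sub(r"\s+", " ", title).lower()
    has_header = want_author in head_part and want_title in head_part
    has_refs = re.search(r"\b(references|bibliography)\b", tail_part) is not None
    return has_header and has_refs


def oa_advantage(articles):
    """Steps (8)-(10), roughly, for one journal/issue/year group.

    `articles` is a list of (is_oa, citation_count) pairs.
    Returns (percent_oa, citation_ratio), or None when the group cannot
    be compared: 100% OA groups are excluded as in step (10); skipping
    groups with no OA articles or zero mean NOA citations is my assumption.
    """
    oa = [cites for is_oa, cites in articles if is_oa]
    noa = [cites for is_oa, cites in articles if not is_oa]
    if not oa or not noa:
        return None
    mean_oa = sum(oa) / len(oa)
    mean_noa = sum(noa) / len(noa)
    if mean_noa == 0:
        return None
    percent_oa = 100 * len(oa) / (len(oa) + len(noa))
    return percent_oa, (mean_oa - mean_noa) / mean_noa
```

The same grouping function could then be applied per discipline/journal and
per discipline, as the quote describes, by pooling the article lists before
calling it.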
On Mon, Jul 16, 2012 at 2:20 PM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
> Thanks very much Alma,
> This is very useful - I have some more questions, and would be grateful
> for answers if you can...
>
>
>
>> The data are from Yassine Gargouri (who has used the methodology he
>> previously used, which consists of trawling the web for openly accessible
>> full-texts and comparing the number of those with the papers in Web of
>> Science, which is not a perfect, but a reasonable measure of the ‘universe’
>> for UK researchers).
>>
>
> Is this published anywhere (formally or informally) such that we can
> understand the details?
> * How does he or Google know that the full-text is "openly accessible"? Is
> this by trying to read it or is there a Google flag for openly accessible?
>
>>
>> Previously, Yassine has done this only on a global basis, but this time
>> he has looked for papers with at least one UK author.
>>
> * How is this done? Does *he* analyze the author affiliations or does he
> get them from WoS?
>
> * Is there an open electronic list of the publications (and their
> funders) so that I can access them?
>
>>
>> He used Google to search for the papers.
>>
>
> More questions:
> * Google or GoogleScholar? [Apparently they can give very different
> answers]
>
> Assuming it was GoogleScholar.
> * How was the subject classification done?
>
> I can see one way the "Gold" access papers could have been retrieved: by
> mapping the journal onto known Gold journals (sic). (I cannot see how
> hybrid Gold papers could easily be measured, but the numbers are probably
> too small to matter statistically.)
>
> I cannot see the next phase but I can conjecture. More questions:
> * did he use his/Google results to compare with WoS?
>
> * how did he determine that the paper was Green? Almost by definition this
> has to be somewhere other than the publisher's site. [So the paper needs
> another search for a copy mounted somewhere OTHER than the publisher's site.]
>
> * does he then have a system to determine whether the paper is readable
> (not all papers in repositories are readable, as we have seen).
>
> If he has such a system then it would seem to answer the key question:
> * if I find a paper on a publisher's site can I find a free-as-in-beer
> copy somewhere else on the web?
>
> If he can really answer that question then is his system openly available?
>
> P.
>
> _______________________________________________
>> GOAL mailing list
>> GOAL at eprints.org
>> http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
>>
>>
>
>
> --
> Peter Murray-Rust
> Reader in Molecular Informatics
> Unilever Centre, Dep. Of Chemistry
> University of Cambridge
> CB2 1EW, UK
> +44-1223-763069
>