[EP-tech] Making a static copy of an EPrints repo
Yuri
yurj at alfa.it
Tue Jul 18 11:33:29 BST 2017
I would use:
wget --no-parent \
--no-check-certificate \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--recursive \
--level=inf \
-N \
--page-requisites \
-e robots=off \
--wait=0 \
--quota=inf \
http://my.repo/
I think --convert-links will handle rewriting the links for local viewing, so the sed step should not be needed.
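
One way to check whether the links really were converted is to search the mirrored tree for files that still mention the absolute base URL. The sketch below is only an illustration: the function name list_unconverted is hypothetical (it is not part of wget or EPrints), and it assumes the mirror directory is my.repo and the base URL is http://my.repo/, as in the quoted message.

```shell
#!/bin/sh
# Hypothetical helper (not part of wget or EPrints): list mirrored files that
# still contain the absolute base URL, i.e. links that were not rewritten.
#   $1 = mirror directory (assumed: my.repo)
#   $2 = base URL to look for (assumed: http://my.repo/)
list_unconverted() {
    # -r: recurse into the directory, -l: print only file names,
    # -F: treat the URL as a fixed string rather than a regular expression
    grep -rlF -- "$2" "$1" | sort
}

# Example call (commented out so that sourcing this file has no side effects):
# list_unconverted my.repo http://my.repo/
```

If this prints nothing, no file in the mirror refers to the live host any more. Note that wget deliberately leaves links to pages it did not download as absolute URLs, so any hits may also point at pages missing from the mirror.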
On 18/07/2017 11:04, Ian Stuart wrote:
> I need to make a read-only, static, copy of an old repo (the hardware is
> dying, the installation was heavily tailored for the environment, and I
> don't have the time to re-create in a new environment.)
>
> I can grab all the active pages:
>
> wget --local-encoding=UTF-8 --remote-encoding=UTF-8 --no-cache
> --mirror -nc -k http://my.repo/
>
> This is good, however it doesn't edit all the absolute URLs in the view
> pages, so we need to modify them:
>
> find my.repo -type f -exec sed -i 's_http://my.repo/_/_g' {} +
>
> However this leaves me with the problem that the http://my.repo/nnn/
> pages haven't been pulled down!
>
> Any suggestions on how to do this?
>
> Cheers
>