[EP-tech] .docx Export

Dennis Müller dennis.mueller at bib.uni-mannheim.de
Fri Jul 31 13:25:13 BST 2020


Hi John,

thanks for your input. You put me on the right track, I think.

Below is the very first working version. (I haven't cleaned it up yet,
but I'm sharing it anyway since I'm so glad it finally worked. :D)

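# needed at the top of the plugin:
use File::Temp qw(tempfile);    # tempfile()
use OpenOffice::OODoc;          # odfContainer(), odfDocument()
use Time::Piece;                # localtime->dmy()

sub output_list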
{
    my( $self, %opts ) = @_;

    my ($FH, $filename) = tempfile();
    my $container = odfContainer(
        $filename,
        create => "text",
        template_path => $self->{repository}->config("config_path")
            . "/static/office-templates",
        work_dir => "/tmp"
    );
    die("Unable to create container: $!") unless defined $container;

    # get content
    my $content = odfDocument(
        container => $container,
        part => 'content'
    );

    # change content
    $content->selectElementsByContent("Titel", "Zitationsliste");
    my $date = localtime->dmy('.');
    $content->selectElementsByContent("Datum", $date);
    foreach my $eprint ($opts{list}->get_records) {
        my $newparagraph = $content->appendParagraph(
            # TODO: render nice citation here
            text => $eprint->id() . " - "
                . $eprint->get_value("title_title")
        );
    }

    # save container
    $container->save();

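    # output_list has to return the document's bytes, not a path or a handle:
    # the export script prints whatever string comes back, so read the
    # saved file in binary mode and return its contents.
    open(my $in, '<:raw', $filename) or die "Can't open $filename: $!";
    my $output = do { local $/; <$in> };    # slurp the whole file
    close($in);
    return $output;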
}
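
For the archive, here is a minimal standalone sketch of that last step,
separated from EPrints and OODoc. It reads a saved file back as raw bytes
and then either returns them or prints them to a supplied filehandle and
returns undef, which is the convention John pointed to in the
MultilineExcel plugin. The helper names slurp_bytes and deliver are made
up for illustration, and the byte string only stands in for a real .odt;
I have not run this inside an export plugin.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of EPrints or OODoc):
# read a file back as raw bytes, without any newline or encoding translation.
sub slurp_bytes {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Can't open $path: $!";
    local $/;               # no record separator: read everything at once
    my $bytes = <$fh>;
    close($fh);
    return $bytes;
}

# Hypothetical helper following the convention described in this thread:
# if a filehandle was supplied, print to it and return undef;
# otherwise return the bytes so the caller can print them.
sub deliver {
    my ($bytes, %opts) = @_;
    if (defined $opts{fh}) {
        binmode($opts{fh});
        print { $opts{fh} } $bytes;
        return;
    }
    return $bytes;
}

# Stand-in for the saved document: a few bytes including NUL and a newline.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "PK\x03\x04\x00\n\xff binary payload";
close($tmp_fh);

my $bytes = slurp_bytes($tmp_name);
```

In output_list, the last line would then be something like
"return deliver( slurp_bytes($filename), %opts );", assuming %opts
carries the optional fh key.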


Thank you very much for your help!

Cheers and have a nice weekend
Dennis

On 30.07.20 at 23:50, John Salter wrote:
> Hi Dennis,
> I had a bit of an experiment with it, but didn't get it working - I'm sure there's a way - but I haven't found it yet!
> 
> The possible work-around is - as you suggested:
> "How do I return an existing file on the server via the "output_list" sub in an export plugin"
> 
> Normally you can set a filename in the Apache request object:
> $r->filename( "/path/to/file" );
> But I'm not sure this will work, as the export script is ready to print what is returned.
> It might be that setting the filename and returning undef would work?
> 
> Apologies I haven't had time to experiment more with this.
> 
> Cheers,
> John
> 
> -----Original Message-----
> From: Dennis Müller [mailto:dennis.mueller at bib.uni-mannheim.de] 
> Sent: 29 July 2020 07:59
> To: John Salter <J.Salter at leeds.ac.uk>
> Subject: Re: [EP-tech] .docx Export
> 
> Hi John,
> 
> have you had a chance to take a look at this again yet? I'd really
> appreciate your input.
> 
> Cheers,
> Dennis
> 
> On 17.07.20 at 16:08, Dennis Müller wrote:
>> Hi John,
>>
>>> I think what's getting returned currently isn't any form of OODoc -
>>> it's a text file with an odt extension - which when opened defaults to
>>> OpenOffice and looks like an ODF document.
>>
>> You're right, I saw this when removing the plugin's suffix and mimetype for
>> testing earlier on. What the user receives is basically just plain text
>> of whatever is returned in output_list - but being interpreted as an
>> OODoc by the browser/OS because of the mimetype. This exported document
>> however has nothing to do with the one I create on the server.
>>
>>> Possibly something like:
>>> my $tmp;
>>> open(my $TMP,'>',\$tmp);
>>> $container->save( $TMP );
>>> return $tmp;
>>>
>>> would work?
>>
>> Variations on your proposal (which is similar to what I tried from
>> reading the multiline csv export you linked to) unfortunately didn't
>> work either. The container could be created from neither $TMP nor
>> $tmp ("bad file descriptor" or "no such file or directory"). Creating
>> the container from $output (a copy of template.odt), saving to $TMP and
>> returning $tmp resulted in an empty output (wrapped in an ODT-file, of
>> course :D).
>>
>> Now, to isolate the return value problem from any OODoc
>> peculiarities, I first create and save the doc like this:
>>
>>
>> # persistent path for debugging
>> my $output = "/data/madoc/test.odt";
>>
>> # create new container from template
>> # (eliminates the need to copy template.odt)
>> my $container = odfContainer(
>>   $output, # create in specified path
>>   create => "text", # create text doc
>>   template_path => "/data/madoc/", # template.odt lives here
>>   work_dir => "/data/madoc/" # specify work dir for writing
>> );
>> die("Unable to create container: $!") unless defined $container;
>>
>> # get content and tamper with it
>> my $content = odfDocument(
>>   container => $container,
>>   part => 'content'
>> );
>> $content->selectElementsByContent("Title", "Citation List");
>>
>> # save container to $output
>> $container->save();
>>
>>
>> Everything works fine until here: test.odt lies in the specified path
>> and looks exactly how we want.
>>
>> So, if I'm not mistaken, the remaining question generally speaking is:
>> How do I return an existing file on the server via the "output_list" sub
>> in an export plugin?
>>
>> From what I understand, the return value should be a (read?) file
>> handle, but that does not seem to work. I suspect I have a basic
>> misunderstanding of how it is supposed to work from here on. I'd be very
>> glad of further help with this.
>>
>> Cheers (and don't waste your weekend on this!)
>> Dennis
>>
>> On 16.07.20 at 18:25, John Salter wrote:
>>> Hi Dennis,
>>> I think what's getting returned currently isn't any form of OODoc - it's a text file with an odt extension - which when opened defaults to OpenOffice and looks like an ODF document.
>>> Changing the filename at the end of the exportview URL supports this e.g.:
>>> https://madoc-dev.bib.uni-mannheim.de/cgi/exportview/year/2019/Office/test.txt
>>>
>>>
>>> Looking at the OODoc->save method:
>>> 	https://metacpan.org/source/JMGDOC/OpenOffice-OODoc-2.125/OODoc/File.pm#L679-704
>>> there is some logic to handle a few different situations.
>>>
>>> Possibly something like:
>>> my $tmp;
>>> open(my $TMP,'>',\$tmp);
>>> $container->save( $TMP );
>>> return $tmp;
>>>
>>> would work?
>>> I don't have a way to test this right now - if that doesn't help, I'll take a look over the weekend!
>>>
>>> Cheers,
>>> John
>>>
>>>
>>> -----Original Message-----
>>> From: Dennis Müller [mailto:dennis.mueller at bib.uni-mannheim.de] 
>>> Sent: 16 July 2020 10:38
>>> To: John Salter <J.Salter at leeds.ac.uk>; eprints-tech at ecs.soton.ac.uk
>>> Subject: Re: [EP-tech] .docx Export
>>>
>>> Dear John,
>>>
>>> thanks again for assisting. I played around with ideas from the links
>>> you provided and the ooffice documentation, but it seems I misunderstood
>>> something fundamental. Please see my simplified testing code attached.
>>>
>>> As previously mentioned, the file saved on the server under $output
>>> looks correct; I just can't get it handed over to the browser.
>>>
>>> The export is called via the cgi/exportview url:
>>> <form method="get" accept-charset="utf-8"
>>> action="https://madoc-dev.bib.uni-mannheim.de/cgi/exportview">
>>>
>>> These are the called urls I see in the Firefox console:
>>> https://madoc-dev.bib.uni-mannheim.de/cgi/exportview?format=Office&_action_export_redir=Exportieren&view=people&values=Mueller%3D3ADennis%3D3A%3D3A
>>> https://madoc-dev.bib.uni-mannheim.de/cgi/exportview/people/Mueller=3ADennis=3A=3A/Office/Mueller=3ADennis=3A=3A.odt
>>>
>>> Many thanks and best regards
>>> Dennis
>>>
>>> On 14.07.20 at 18:50, John Salter wrote:
>>>> Hi Dennis,
>>>> Glad you made this work - hopefully the last part isn't too difficult to sort out!
>>>>
>>>> How are you calling the Export? Is it via the normal /cgi/export url, or another means?
>>>>
>>>> Have you got a 'mimetype' and a 'suffix' parameter set in your export plugin?
>>>>
>>>> Does the 'MultilineExcel' export plugin provide a useful example?
>>>> - Setting up either the supplied filehandle, or a filehandle to a variable:
>>>> https://github.com/eprintsug/multiline_excel/blob/master/lib/plugins/EPrints/Plugin/Export/MultilineExcel.pm#L44-L58
>>>> - returning - either undef (if a filehandle was supplied) - or the variable:
>>>> https://github.com/eprintsug/multiline_excel/blob/master/lib/plugins/EPrints/Plugin/Export/MultilineExcel.pm#L93-L98
>>>>
>>>> If that doesn't help, let me know and I'll do some more thinking!
>>>>
>>>> Cheers,
>>>> John
>>>>
>>>> -----Original Message-----
>>>> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Dennis Müller via Eprints-tech
>>>> Sent: 14 July 2020 16:49
>>>> To: John Salter <J.Salter at leeds.ac.uk>; eprints-tech at ecs.soton.ac.uk
>>>> Subject: Re: [EP-tech] .docx Export
>>>>
>>>> Hi John,
>>>>
>>>> thanks for your quick answer. With your help, I can now more or less do
>>>> what I want with the document and save it to some path on the server.
>>>>
>>>> However, I'm struggling to find out how to return the actual document in
>>>> my export plugin's "output_list" subroutine. Returning $oodoc or the
>>>> file path doesn't work, obviously. Can you help me out once more, please?
>>>>
>>>> Cheers,
>>>> Dennis
>>>>
>>>> On 10.07.20 at 15:33, John Salter wrote:
>>>>> Hi Dennis,
>>>>> I think you should be able to achieve this.
>>>>> It's similar to the way the OpenOffice / Coversheets works.
>>>>> That's normally configured to export as a PDF, but it first takes an OpenOffice document (your branded template), and replaces tags (like '##TITLE##') with rendered content.
>>>>>
>>>>> These are the tags:
>>>>> https://github.com/eprintsug/coversheets/blob/master/cfg/cfg.d/z_coversheet_tags.pl
>>>>> and this adds them to the OpenOffice document:
>>>>> https://github.com/eprintsug/coversheets/blob/master/lib/plugins/EPrints/Plugin/Convert/AddCoversheet.pm#L114-L127
>>>>>
>>>>> Does that help?
>>>>>
>>>>> There are probably other ways - possibly other perl modules that would allow a more direct approach - but the above stuff seems to work OK.
>>>>> I'm using it with LibreOffice rather than OpenOffice if that's useful to know too.
>>>>>
>>>>> Cheers,
>>>>> John
>>>>>
>>>>> -----Original Message-----
>>>>> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Dennis Müller via Eprints-tech
>>>>> Sent: 10 July 2020 14:15
>>>>> To: eprints-tech at ecs.soton.ac.uk
>>>>> Subject: [EP-tech] .docx Export
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> we've had a user request for exporting views/searches as a .docx file
>>>>> styled in our corporate design. Has anyone ever done something similar?
>>>>>
>>>>> Just for the file format, it might work to "wrap" a simple text export
>>>>> in an exporter that has a
>>>>> "application/vnd.openxmlformats-officedocument.wordprocessingml.document" mimetype,
>>>>> but I don't see where I could slip in the design template along the way.
>>>>>
>>>>> I'd be glad to hear about your experiences. :)
>>>>>
>>>>> Best regards
>>>>> Dennis
>>>>>
>>>>
>>>
>>
> 

-- 
Dennis Müller, B.A.


Universität Mannheim
Universitätsbibliothek
Digitale Bibliotheksdienste | Schloss Schneckenhof West | 68131 Mannheim

Tel: +49 621 181-3023
E-Mail: dennis.mueller at bib.uni-mannheim.de

Web: http://www.bib.uni-mannheim.de/


