Re: swish-e and bibtex

From: Peter Karman <peter(at)>
Date: Thu Dec 01 2005 - 15:23:19 GMT
I would approach this a different way.

I would combine the text of each PDF with the contents of its related .bib
file into a virtual XML (or HTML) document and hand that to swish-e -S
prog.


  [output of pdftotext here]

(use HTML tags if indexing as HTML).

Then you could configure each of your bibtex fields as metanames and
properties and search/retrieve bibtex info by specific field.
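
A minimal sketch of such a program in Python, under several assumptions: the element names (<doc>, <fulltext>, and one element per BibTeX field) are hypothetical, the regex only handles simple one-line `key = "value"` or `key = {value}` entries, and the header names (Path-Name, Content-Length, Document-Type) are the ones I understand the -S prog protocol to expect. Check the swish-e docs before relying on any of it.

```python
import re
from xml.sax.saxutils import escape

# Deliberately naive: matches one-line fields such as
#   editor = "Einstein",
#   year = {2005}
# A real BibTeX parser is needed for multi-line or concatenated values.
FIELD_RE = re.compile(r'^\s*(\w+)\s*=\s*["{](.*?)["}]\s*,?\s*$', re.MULTILINE)


def parse_bib_fields(bib_text):
    """Return a dict mapping lower-cased BibTeX field names to their values."""
    return {m.group(1).lower(): m.group(2) for m in FIELD_RE.finditer(bib_text)}


def build_document(fields, fulltext):
    """Wrap each BibTeX field and the PDF text in its own XML element.

    The element names are illustrative; whatever names are used here must
    also be declared as metanames/properties in the swish-e config.
    """
    parts = ["<doc>"]
    for name, value in sorted(fields.items()):
        parts.append(f"  <{name}>{escape(value)}</{name}>")
    parts.append(f"  <fulltext>{escape(fulltext)}</fulltext>")
    parts.append("</doc>")
    return "\n".join(parts) + "\n"


def prog_record(path, xml_doc):
    """Prefix one document with the headers swish-e reads from a -S prog source.

    Content-Length is counted in bytes, so the body is encoded first.
    """
    body = xml_doc.encode("utf-8")
    header = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(body)}\n"
        "Document-Type: XML*\n"
        "\n"
    ).encode("utf-8")
    return header + body
```

A driver script would loop over the .bib/.pdf pairs, run pdftotext on each PDF, and write prog_record(...) for each pair to standard output as bytes. With the fields listed under MetaNames and PropertyNames in the config, a query along the lines of swish-e -w 'editor=einstein' should then be limited to the editor field.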

> Hi all,
> I recently learned about swish-e and started using it today. The problem
> I am faced with is searching scientific articles:
> - their full text, usually in .pdf format
> - their bibliographic data in BibTeX format
> The way I have organized things is to give each article a separate .bib
> file and a .pdf file and create a single index. That way I can just:
>   swish-e -w "Einstein 2005" | grep -i bib
> to find all .bib files with the words 'Einstein 2005' in them. Fine. After
> browsing the documentation I learned that wildcards are only supported at
> the end of a word, which is rather annoying. Suppose, for example, I want
> to search for those .bib files where this Einstein figure is an editor.
> That is, match lines like:
>   editor = "Einstein",
> in my .bib files. I had hoped that
>   swish-e -w "editor.*einstein"
> would work, but obviously that's not the case. I've browsed the web a bit
> but haven't found a satisfying solution yet. Is anyone here already using
> swish-e to index BibTeX data? Thoughts on how to deal with this kind of
> search? Any help is greatly appreciated.
> Cheers,
>   Bas
> --
> GPG Key ID: 2768A493
> Radboud University Nijmegen
> Institute for Computing and Information Sciences

Peter Karman . . peter(at)
Received on Thu Dec 1 07:23:25 2005