Re: swish-e and bibtex

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Thu Dec 01 2005 - 15:23:19 GMT
I would approach this a different way.

I would take the content of the PDF and the content of the related .bib
file and create a virtual XML (or HTML) document to hand to swish-e
via -S prog.

e.g.:

<bibdoc>
 <bibinfo>
  <tag>content</tag>
  <tagN>contentN</tagN>
 </bibinfo>
 <doc>
  [output of pdftotext here]
 </doc>
</bibdoc>

(use HTML tags if indexing as HTML).
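
A minimal sketch of such a program, in Python. The function names
(build_bibdoc, prog_record) and the sample path are made up for
illustration. The header lines (Path-Name, Content-Length, Last-Mtime,
Document-Type) are what I understand the -S prog input format to expect;
check them against the docs for your version.

```python
import sys
from xml.sax.saxutils import escape


def build_bibdoc(fields, body_text):
    """Build the virtual XML document: one tag per BibTeX field, then the text.

    `fields` maps a BibTeX field name to its value; `body_text` is the
    plain text extracted from the PDF (e.g. by pdftotext).
    """
    lines = ["<bibdoc>", " <bibinfo>"]
    for name, value in fields.items():
        # Escape &, < and > so the document stays well-formed XML.
        lines.append("  <%s>%s</%s>" % (name, escape(value), name))
    lines.append(" </bibinfo>")
    lines.append(" <doc>")
    lines.append(escape(body_text))
    lines.append(" </doc>")
    lines.append("</bibdoc>")
    return "\n".join(lines) + "\n"


def prog_record(path, xml_text, mtime=None):
    """Wrap one document in headers, a blank line, then the content.

    Content-Length is the byte length of the encoded content.
    """
    body = xml_text.encode("utf-8")
    headers = ["Path-Name: %s" % path, "Content-Length: %d" % len(body)]
    if mtime is not None:
        headers.append("Last-Mtime: %d" % int(mtime))
    # Assumption: "XML*" asks swish-e to pick its XML parser.
    headers.append("Document-Type: XML*")
    return ("\n".join(headers) + "\n\n").encode("utf-8") + body


if __name__ == "__main__":
    # Hypothetical sample data; a real script would loop over .bib/.pdf pairs.
    doc = build_bibdoc({"editor": "Einstein", "year": "2005"}, "full text here")
    sys.stdout.buffer.write(prog_record("articles/einstein2005.pdf", doc))
```

A real script would print one such record per article and be named as the
input to swish-e -S prog.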

Then you could configure each of your bibtex fields as metanames and
properties and search/retrieve bibtex info by specific field.
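
As a rough sketch of that step, again in Python: pull the field names out
of a .bib entry and print the matching config lines. MetaNames and
PropertyNames are the swish-e config directives; parse_bib_fields,
config_lines and the regex are my own simplification, which assumes one
field per line and will not cope with multi-line values.

```python
import re

# Assumption: each field sits on one line, as in: editor = "Einstein",
FIELD_RE = re.compile(
    r'^[ \t]*(\w+)[ \t]*=[ \t]*["{](.*?)["}][ \t]*,?[ \t]*$',
    re.MULTILINE,
)


def parse_bib_fields(bib_text):
    """Return a dict of lower-cased field name -> value for one entry."""
    return {name.lower(): value for name, value in FIELD_RE.findall(bib_text)}


def config_lines(field_names):
    """Return config lines declaring each field as metaname and property."""
    names = " ".join(sorted(field_names))
    return ["MetaNames " + names, "PropertyNames " + names]


if __name__ == "__main__":
    sample = '@book{e05,\n  editor = "Einstein",\n  year = {2005}\n}\n'
    for line in config_lines(parse_bib_fields(sample)):
        print(line)
```

With the fields declared that way, a search limited to one field would look
something like swish-e -w 'editor=einstein', and -p can be used to print a
property with each result.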


>
> Hi all,
>
> I recently learned about swish-e and have started using it today. The
> problem I am faced with is to search in scientific articles:
>
> - their full text, usually in .pdf format
> - their bibliographic data in BiBTeX format
>
> The way I have organized things is to give each article a separate .bib
> file and a .pdf file and create a single index. That way I can just:
>
>   swish-e -w "Einstein 2005" | grep -i bib
>
> to find all .bib files with the words 'Einstein 2005' in them. Fine. After
> browsing the documentation I learned that wildcards are only supported at
> the end of a word, and that is rather annoying. Suppose, for example, I
> want to search for those .bib files where this Einstein figure is an
> editor. That is, match lines like:
>
>   editor = "Einstein",
>
> in my .bib files. I had hoped that
>
>   swish-e -w "editor.*einstein"
>
> would work, but that's obviously not the case. I've browsed the web a bit
> but haven't found a satisfying solution yet. Is anyone here using swish-e
> to index BiBTeX data already? Thoughts on how to deal with this kind of
> search? Any help is greatly appreciated.
>
> Cheers,
>
>   Bas
>
>
> --
> <Bas.vanGils@cs.ru.nl> - GPG Key ID: 2768A493 - http://www.cs.ru.nl/~basvg
> Radboud University Nijmegen Institute for Computing and Information Sciences
>


-- 
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Thu Dec 1 07:23:25 2005