Re: [swish-e] Change the indexed 'title'

From: <josh(at)>
Date: Wed Oct 24 2007 - 19:46:52 GMT
>On 10/24/2007 01:30 PM, wrote:
>> I am not concerned with filtered queries or anything fancy like that; I
>> want the title on the output to be pulled from the HTML tags mentioned
>> previously. I looked up the ExtractPath metaname; not really 100% on how
>> that could help me use these fields as the 'title' on the output as opposed
>> to the <title></title> field. I also looked up PropertyNames; and I set 
>> in my cfg, but not sure how to populate them with the fields that I want
>> from the HTML that it indexes?
>If you want the title on the output to be pulled from tags other than <title>
>tags, then you either need to (a) filter the content before it reaches 
>to put your title content inside the <title> tagset, or (b) use existing
>swish-e features to mimic that approach.
>In my example below, I created 3 dummy docs, one in each directory, using the
>tags you specified. I configured swish-e to extract the first part of the 
>path (the directory name) to a MetaName called 'flavor' (which I can then 
>search on), and I added a PropertyName for each of the tags you want to save
>the content for display in results. I also set 'flavor' as a PropertyName, 
>so you can easily see what value is being set for the title.
>NOTE: if you're planning to use the swish.cgi example script in the distrib,
>then you'd have to hack it to return different PropertyNames instead of
>swishtitle for the title of each result, and then test the flavor property
>value as well to know which one to display. If you're writing your own search
>app, you'd have to put the same logic in there.
>[pek@dewpoint:~/tmp/josh]$ ls -1
>[pek@dewpoint:~/tmp/josh]$ cat docs*/*
> <head><title>real title is the title I want</title></head>
> <body><a href="bar">link text</a> <strong>strong text</strong> blah</body>
> <head><title>real title</title></head>
> <body><a href="bar">title I want</a></body>
> <head><title>real title</title></head>
> <body><strong>title I want</strong></body>
>[pek@dewpoint:~/tmp/josh]$ cat conf
>ExtractPath flavor regex !^([^/]+)/.*$!$1!
>PropertyNames strong a flavor
>[pek@dewpoint:~/tmp/josh]$ swish-e -w title AND flavor=strong -x '"<strong>" "<swishtitle>" "<flavor>"\n'
># SWISH format: 2.5.6
># Search words: title AND flavor=strong
># Removed stopwords:
># Number of hits: 1
># Search time: 0.001 seconds
># Run time: 0.008 seconds
>"title I want" "real title" "docswith-strong"
>Peter Karman  .  peter(at)  .


I see what you are saying, but I am not sure what I am doing wrong. I tried to reproduce exactly what you had there, and I do not get the results you say I should. The only 'result' I get back is the swishtitle.

Do I have to index it in a special way for it to populate the 'strong' PropertyName? I don't understand how swish-e maps that property name to the HTML tag.
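
For reference, the display logic described in the quoted note (test the flavor property, then show the matching property instead of swishtitle) could be sketched roughly as below. This is only an illustration in Python, not part of swish-e or swish.cgi. It assumes a four-field output format such as -x '"<strong>" "<a>" "<swishtitle>" "<flavor>"\n' (the quoted example prints only three fields), and it assumes the flavor value ends with the tag name. Only "docswith-strong" appears in the sample output, so the "-a" suffix and the function names are hypothetical.

```python
import shlex


def pick_title(strong, a, swishtitle, flavor):
    """Choose which property to display as the title of one result."""
    # Assumption: flavor is the top-level directory name and ends with the
    # tag name, as in "docswith-strong" from the sample output.
    if flavor.endswith("-strong") and strong:
        return strong
    # Hypothetical directory suffix for the documents titled by <a> text.
    if flavor.endswith("-a") and a:
        return a
    # Otherwise fall back to the regular <title> content.
    return swishtitle


def title_from_result_line(line):
    """Parse one line of four double-quoted fields and pick the title."""
    # shlex.split keeps empty quoted fields ("") as empty strings.
    # It will misparse values that themselves contain double quotes.
    strong, a, swishtitle, flavor = shlex.split(line)
    return pick_title(strong, a, swishtitle, flavor)


if __name__ == "__main__":
    sample = '"title I want" "" "real title" "docswith-strong"'
    print(title_from_result_line(sample))
```

If a property comes back empty for a document, the sketch falls back to swishtitle, which matches the symptom above where swishtitle is the only value returned.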

