Re: Hyphens

From: Bill Moseley <moseley(at)>
Date: Mon Mar 25 2002 - 15:24:31 GMT

At 06:44 AM 03/25/02 -0800, David Ayres wrote:
>When inserting a hyphen character into WORDCHARS in config.h, does the 
>character need to be preceded with a backslash? I ask because after 
>inserting the character, recompiling and reindexing, searching for 
>something like "4d01-4108" brings up the correct hit, but also a document 
>with 4d and 4108 in separate places (i.e., not joined by the hyphen).  It 
>seems that just adding the character to WORDCHARS is ineffective.

Seems effective when I try:

> cat d
WordCharacters -0123456789abcdefghijklmnopqrstuvwzyz

> cat d.txt
First 4d01-4108 last

> ./swish-e -c d -i d.txt -T indexed_words -v0
Indexing Data Source: "File-System"
    Adding:[1:swishdefault(1)]   'first'   Pos:1  Stuct:0x1 ( FILE )
    Adding:[1:swishdefault(1)]   '4d01-4108'   Pos:2  Stuct:0x1 ( FILE )
    Adding:[1:swishdefault(1)]   'last'   Pos:3  Stuct:0x1 ( FILE )
Indexing done!

Ok, so it was indexed as a single word.
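
To illustrate why the word-character set matters, here is a rough Python sketch of the idea. It is a simplified model, not swish-e's actual parser: the tokenize() helper and the two character-set constants are hypothetical names made up for this example. A word is taken to be any maximal run of characters from the configured set, so the hyphen joins "4d01" and "4108" only when it is part of that set.

```python
import re

def tokenize(text, word_chars):
    """Split text into lowercase words built only from characters in word_chars.

    Simplified model for illustration; swish-e's real parser has more rules.
    """
    # Build a character class from the allowed characters; re.escape keeps
    # the hyphen literal instead of letting it form a range.
    pattern = "[" + re.escape(word_chars) + "]+"
    return re.findall(pattern, text.lower())

# Hypothetical character sets, with and without the hyphen.
WITH_HYPHEN = "-0123456789abcdefghijklmnopqrstuvwxyz"
WITHOUT_HYPHEN = "0123456789abcdefghijklmnopqrstuvwxyz"

text = "First 4d01-4108 last"
print(tokenize(text, WITH_HYPHEN))     # ['first', '4d01-4108', 'last']
print(tokenize(text, WITHOUT_HYPHEN))  # ['first', '4d01', '4108', 'last']
```

If the hyphen were not a word character, the index would hold "4d01" and "4108" as separate words, which would be consistent with the extra hit described in the question.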

Let's try searching:

> ./swish-e -w 4d01-4108 -H 9
# SWISH format: 2.1-dev-25
# Search words: 4d01-4108
# WordCharacters: -0123456789abcdefghijklmnopqrstuvwyz
# Search Words: 4d01-4108
# Parsed Words: 4d01-4108 
# Number of hits: 1
# Search time: 0.003 seconds
# Run time: 0.043 seconds
1000 d.txt "d.txt" 22

Yep, and it's searching correctly:

  # Parsed Words: 4d01-4108 

shows the word that swish is actually searching for in the index.

Bill Moseley
Received on Mon Mar 25 15:25:04 2002