Split file using <a name=... during index

From: Reino Va'inaste <reino(at)not-real.postimees.ee>
Date: Sat Oct 25 1997 - 17:05:33 GMT
Hi everyone!

I have some problems with indexing.

First of all:
3) Is it possible to convert all entities (numeric and named) to the 8-bit
ISO Latin character set during indexing (possibly via a switch in the .conf
file), to get a uniform presentation?
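
To make the entity question concrete, here is a minimal sketch of the kind of conversion meant, written as a standalone C function. It is not SWISH code: the name decode_entities and the small entity table are hypothetical, and only a handful of named entities are listed for illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table: a few named entities and their ISO-8859-1 codes. */
static const struct { const char *name; unsigned char code; } named[] = {
    { "amp", 38 }, { "lt", 60 }, { "gt", 62 }, { "quot", 34 },
    { "auml", 228 }, { "ouml", 246 }, { "uuml", 252 }, { "otilde", 245 }
};

/* Decode entities in place; unknown or out-of-range ones are left as-is. */
void decode_entities(char *s)
{
    char *out = s;

    while (*s != '\0') {
        char *semi = (*s == '&') ? strchr(s, ';') : NULL;
        int code = -1;

        if (semi != NULL && semi - s <= 9) {
            if (s[1] == '#' && isdigit((unsigned char)s[2])) {
                /* numeric entity such as &#228; */
                char *end;
                long v = strtol(s + 2, &end, 10);
                if (end == semi && v > 0 && v < 256)
                    code = (int)v;
            } else {
                /* named entity such as &auml; */
                size_t len = (size_t)(semi - s - 1), i;
                for (i = 0; i < sizeof named / sizeof named[0]; i++)
                    if (strlen(named[i].name) == len &&
                        strncmp(named[i].name, s + 1, len) == 0)
                        code = named[i].code;
            }
        }
        if (code >= 0) {
            *out++ = (char)code;   /* emit the single 8-bit character */
            s = semi + 1;
        } else {
            *out++ = *s++;         /* copy the byte unchanged */
        }
    }
    *out = '\0';
}
```

Run over each line before the words are extracted, something like this would make "V&auml;inaste" and "V&#228;inaste" index as the same word.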



The problem: I need the index to give access to different parts of a single
.htm file (the anchors <a name=   > will be the separators).

Therefore I intended to cheat SWISH: split the .htm file into a subdirectory,
with each part in its own file, index the subdirectory, and use ReplaceRules
in the conf file to put everything in one index and correct the paths.

file   --->   sw. index
aaaa.htm_xx/xx_anchor  ---> aaaa.htm#anchor
or
aaaa.htm_xx/xx_anchor.htmx  ---> aaaa.htm#anchor
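
The intended path rewrite can be sketched as plain substring substitution. This only illustrates the desired effect and is not how SWISH implements ReplaceRules; the helper names replace_first and split_path_to_anchor are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: copy `in` to `out`, replacing the first occurrence
 * of `from` with `to`. Returns 0 on success, -1 if `out` is too small. */
int replace_first(const char *in, const char *from, const char *to,
                  char *out, size_t outsz)
{
    const char *hit = (*from != '\0') ? strstr(in, from) : NULL;
    size_t head, need;

    if (hit == NULL) {                  /* nothing to replace: plain copy */
        if (strlen(in) + 1 > outsz)
            return -1;
        strcpy(out, in);
        return 0;
    }
    head = (size_t)(hit - in);
    need = head + strlen(to) + strlen(hit + strlen(from)) + 1;
    if (need > outsz)
        return -1;
    memcpy(out, in, head);              /* text before the match */
    strcpy(out + head, to);             /* the replacement */
    strcat(out, hit + strlen(from));    /* text after the match */
    return 0;
}

/* Map a split-file path back to the original document plus anchor. */
int split_path_to_anchor(const char *path, char *out, size_t outsz)
{
    char tmp[1024];

    if (replace_first(path, "_xx/xx_", "#", tmp, sizeof tmp) != 0)
        return -1;
    return replace_first(tmp, ".htmx", "", out, outsz);
}
```

With both substitutions applied, either file name form above ends up as aaaa.htm#anchor.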

And I am stuck on two problems:


1) In the first example, where the directory is aaaa.htm_xx and the file is
xx_anchor, file.c:ishtml() finds the "." for the suffix in the directory name
(and the suffix array is only 10 characters long).
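
One way to avoid this, sketched below, is to look for the dot only in the last path component and to refuse suffixes that do not fit the buffer. I have not rewritten ishtml() itself; get_suffix and SUFFIX_MAX are hypothetical names, and the size 10 is taken from the suffix array mentioned above.

```c
#include <string.h>

#define SUFFIX_MAX 10   /* same size as the suffix array mentioned above */

/* Hypothetical helper: copy the extension of the last path component
 * (without the dot) into `suffix`. Returns 1 if one was found and fits,
 * 0 otherwise; `suffix` is always NUL-terminated. */
int get_suffix(const char *path, char suffix[SUFFIX_MAX])
{
    const char *base = strrchr(path, '/');
    const char *dot;

    suffix[0] = '\0';
    base = (base != NULL) ? base + 1 : path;   /* skip the directory part */
    dot = strrchr(base, '.');                  /* dot in the file name only */
    if (dot == NULL || dot[1] == '\0')
        return 0;                              /* no suffix at all */
    if (strlen(dot + 1) >= SUFFIX_MAX)
        return 0;                              /* would overflow: reject */
    strcpy(suffix, dot + 1);
    return 1;
}
```

For aaaa.htm_xx/xx_anchor this reports no suffix instead of picking up the dot in the directory name, and for aaaa.htm_xx/xx_anchor.htmx it yields "htmx".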


2) In the second example, I will need these lines in the conf file:
ReplaceRules "_xx/xx_" "#"
ReplaceRules ".htmx" ""

The problem lies in file.c:getdefaults():
when getword() on line 4 returns "\0", the if on line 5
breaks out of the while(1) loop, and there will be only 2 values in replacelist.

replacerules() afterwards uses 3 values from replacelist
without checking consistency.

>1                else if (c = (char *) lstrstr(line, "ReplaceRules")) {
>2                        c += strlen("ReplaceRules");
>3                        while (1) {
>4                                strcpy(value, (char *) getword(c, &skiplen));
>5                                if (!skiplen | value[0] == '\0' || 
>6                                   value[0] == '\n')
>7                                        break;
>8                                else {
>9                                        c += skiplen;
>10                                        replacelist = (struct swline *)
>11                                        addswline(replacelist, value);
>12                               }
>13                        }
>14                }
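
A possible direction for a fix, sketched under my own assumptions: let the tokenizer distinguish "end of line" from "an empty quoted value", and count the values before they are used. The functions next_token and split_args below are hypothetical and are not the existing getword(); overlong tokens are silently truncated here.

```c
#include <ctype.h>
#include <string.h>

#define MAX_TOKENS 8
#define TOKEN_LEN  256

/* Hypothetical tokenizer: reads the next word from *p into `out`.
 * Returns 1 if a token was read (a quoted "" counts as an empty token),
 * 0 at end of line. Advances *p past the token. */
int next_token(const char **p, char *out, size_t outsz)
{
    const char *s = *p;
    size_t n = 0;

    while (*s != '\0' && isspace((unsigned char)*s))
        s++;
    if (*s == '\0') {                      /* nothing left on the line */
        *p = s;
        return 0;
    }
    if (*s == '"') {                       /* quoted value: may be empty */
        s++;
        while (*s != '\0' && *s != '"' && *s != '\n') {
            if (n + 1 < outsz)
                out[n++] = *s;
            s++;
        }
        if (*s == '"')
            s++;
    } else {                               /* bare word */
        while (*s != '\0' && !isspace((unsigned char)*s)) {
            if (n + 1 < outsz)
                out[n++] = *s;
            s++;
        }
    }
    out[n] = '\0';
    *p = s;
    return 1;
}

/* Split the arguments of a config line; returns the number of tokens. */
int split_args(const char *args, char tokens[MAX_TOKENS][TOKEN_LEN])
{
    int count = 0;

    while (count < MAX_TOKENS && next_token(&args, tokens[count], TOKEN_LEN))
        count++;
    return count;
}
```

With the count in hand, the caller could reject a ReplaceRules line that does not carry the number of values replacerules() expects, instead of reading past the end of the list.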







Happy Swishing

   Reino




-----------------------------------------------| Poor man wanna be rich
Reino Va'inaste     e-mail  reino@postimees.ee | Rich man wanna be king
phone (372) 7 390379   Postimees  Gildi 1      | And a king ain't satisfied
fax   (372) 7 390345   Tartu Estonia           | Till he rules everything
                                               |      B. Springsteen
Received on Sat Oct 25 10:07:45 1997