DJGPP DOS/Win95 plus 8bit support

From: A.S.Igumnov <igumnov(at)not-real.perl.imm.intec.ru>
Date: Mon Apr 13 1998 - 06:55:31 GMT
Yesterday I ported swish-e to DJGPP, with minor support for 8-bit character
encodings.
Some comments:
1) To use Win95 long file names you must set the environment variable LFN=y
2) CFLAGS= -O2 -funsigned-char
    This compilation flag solves the problem with comparing 8-bit
characters.
3)   #include <fcntl.h>
      _fmode=O_BINARY;
      These lines force the DJGPP I/O library to open files in binary mode, as on UNIX
     (the default is DOS text mode, with <LF> <-> <CR><LF> translation).
     This solves many problems, including (IMHO) the one described in
http://sunsite.berkeley.edu/SWISH-E/Ports/Windows/message2 (integers stored
with fputc).
4) setlocale(LC_ALL,"");
    This gives correct UPPER -> LOWER translation for national alphabets.
 5)   #define VOWELCHARS

     char indexchars[257]=WORDCHARS;
Support for national languages
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
     This part of swish-e needs more changes:
My version of VOWELCHARS implements the Russian vowels in CP866 encoding; many
languages, I think,
need such an additional configurable table.
indexchars? I don't understand the difference between indexchars and WORDCHARS
if they are logically identical.
Please answer me.

Because the definition of indexchars is not in config.h, I spent about 30 min
trying to understand why swish-e doesn't work with the Russian language.
In any case it would be more elegant to create a table of character properties like
this:
#define SW_WRD 0x01
#define SW_BEG 0x02
#define SW_END 0x04
#define SW_VWL 0x08
.....................
char sw_CHARS[]=
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.......   /* 0x00 - 0x1F */
.............
SW_WRD | SW_BEG | SW_END | SW_VWL,    /* 0x61 a */
.....................
SW_WRD | SW_BEG | SW_END | SW_VWL,    /* 0xFF russian ya */

};

#define isvowel(c) ( sw_CHARS[(c)] & SW_VWL )
...........................................

Additionally, it would be interesting to move language-dependent information such as
character classes and stopwords
into an external config file.

Here are my minor changes (I removed the sw_CHARS table described above, with its
support for Russian, and all changes connected with it):

----------------------------------------------------swish-e.dif--------------------------------------------------
diff srcdos8/Makefile src/Makefile
13c13
< CFLAGS= -O2 -funsigned-char
---
> CFLAGS= -O2
diff srcdos8/check.c src/check.c
175,178c175,176
< int i;
<  for (i = 0; VOWELCHARS[i] != '\0'; i++)
<   if (c == VOWELCHARS[i])
<    return 1;
---
>  if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u')
>   return 1;
diff srcdos8/config.h src/config.h
125,126d124
< #define VOWELCHARS "aeiou ?R"
<
diff srcdos8/file.c src/file.c
47d46
< #ifndef DJGPP
53,55d51
< #else
<  return 0;
< #endif
diff srcdos8/swish.c src/swish.c
36,39d35
<  setlocale(LC_ALL,"");
< #ifdef DJGPP
<  _fmode=O_BINARY;
< #endif
diff srcdos8/swish.h src/swish.h
21,24d20
< #ifdef DJGPP
< #include <fcntl.h>
< #endif
< #include <locale.h>
186c182
< char indexchars[257]=WORDCHARS;
---
> char *indexchars =
"abcdefghijklmnopqrstuvwxyz&#;0123456789_\\|/-+=?!@$%^'\"`~,.<>[]{}";

----------------------------------------------------end swish-e.dif--------------------------------------------------
Received on Sun Apr 12 23:05:08 1998