Documentation structure

From: Bill Moseley <moseley(at)>
Date: Tue Dec 12 2000 - 19:58:17 GMT
I don't really want to go into this much more, but I will make a few
comments (most of which I've made before).  If you are bored with this thread
please scroll down to "Documentation structure".

POD really is the best base format.  perldoc (and Perl) will very likely be
on any machine where swish is compiled and run.  On ported binary
distributions, such as David's build for Windows, the distribution can
include HTML or POD or Windows help.  David can work that out.

POD can be easily converted to HTML and PDF where desired (and the build
process can even do this automatically).  PDF and HTML cannot easily be
converted to POD, as far as I know.  And if all else fails, POD can be read
as-is in text format, too.  I don't think "less somefile.pdf" works that
well.  I'm not sure what good PDF does since the POD format is such a
simple format and doesn't include much layout.  The idea isn't to have a
consistent look to the documentation, but rather just up-to-date content.
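
To make the conversion step a bit more concrete, here is a minimal sketch of
how a build script might assemble the commands.  It assumes the pod2html and
pod2text tools that ship with Perl; the file names, the directory layout and
the conversion_commands helper are hypothetical, not anything that exists in
the swish distribution.

```python
from pathlib import PurePosixPath


def conversion_commands(pod_file, out_dir):
    """Return pod2html and pod2text command lines for one POD file.

    The helper name and the output layout are illustrative assumptions.
    """
    pod = PurePosixPath(pod_file)
    out = PurePosixPath(out_dir)
    html_target = out / (pod.stem + ".html")
    text_target = out / (pod.stem + ".txt")
    return [
        # pod2html reads --infile and writes --outfile
        ["pod2html", f"--infile={pod}", f"--outfile={html_target}"],
        # pod2text takes the input and output files as positional arguments
        ["pod2text", str(pod), str(text_target)],
    ]


if __name__ == "__main__":
    # Hypothetical source file; print the commands instead of running them.
    for cmd in conversion_commands("pod/INSTALL.pod", "html"):
        print(" ".join(cmd))
```

The commands are only built, not executed, so a Makefile target or a small
wrapper could decide when to run them.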

Currently, there's documentation that comes with the swish distribution,
there's the swish website at, and Jose has a website
about 2.0.x and 2.1.x.  There are other sites, too.  All of that can be
confusing to new users of swish.

So the goal is to put all the documentation into the source distribution
and from that create man pages and/or HTML documentation.  POD is a very
simple file format, so it's really easy for multiple developers to work
with and keep up to date -- which is the real goal.  And since it's POD you
can create PDF or text or whatever you like.  Although not the most
beautiful web pages in the world, the on-line documentation can be kept up
to date from the PODs in whatever the current release of swish is.  It's
just not a great situation to download the software and then try to use
on-line docs that don't match the software.

Documentation structure

Another goal is to try to make the documentation in the source distribution
similar to other software packages one might download and install.  The
documentation should be split up into logical sections, but not so many
that it's a pain to look through.

Now, I haven't had much time to think about this, but I was thinking about
this type of layout:  (most of this is going to be with regard to building
on unix and not binary distributions such as for Windows).

README

  It's common to have a README file in source distributions,
  so we should have one too!  It would include:

  1) An overview of what swish does, and its cool features.  Explain
  about Spider vs. File indexing methods.

  2) An overview of the documentation and how to read it.  Something like
  this overview.

  3) Where to get help -- the web site and this list.

INSTALL

  This is another common file in source distributions.  This should
  explain how to compile and install swish, and perhaps how to run "make html"
  to convert the documentation.

  I think the INSTALL page should point to the config.h file that explains
  compile-time settings.  (I'd like to see as many of those settings moved
  to the configuration file as possible.)

  This would be a good place for help with common build problems and
  platform-specific build issues.

  I think INSTALL might also have the Quick Start information from the 
  web site -- that way one document, INSTALL, can get someone using

Now, I was thinking that there could be an option to install the
documentation as man pages.  So, it might be good to prefix these pages
with "SWISH-" so one could say man SWISH-FAQ or man SWISH-SEARCH.
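
A rough sketch of that naming scheme, assuming the pages are kept as POD files
and installed in man section 1 (both are my assumptions; man_page_name and
pod2man_command are hypothetical helpers).  pod2man and its --name and
--section options are real, but nothing here is part of swish today.

```python
from pathlib import PurePosixPath

PREFIX = "SWISH-"


def man_page_name(pod_file):
    """Derive the man page name, e.g. pod/faq.pod -> SWISH-FAQ."""
    stem = PurePosixPath(pod_file).stem.upper()
    # Do not add the prefix twice if the file is already named SWISH-*.
    return stem if stem.startswith(PREFIX) else PREFIX + stem


def pod2man_command(pod_file, man_dir, section="1"):
    """Build the pod2man command line that would install one page."""
    name = man_page_name(pod_file)
    target = PurePosixPath(man_dir) / f"{name}.{section}"
    return [
        "pod2man",
        f"--name={name}",
        f"--section={section}",
        str(pod_file),
        str(target),
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the commands instead of running them.
    for pod in ["pod/faq.pod", "pod/search.pod"]:
        print(" ".join(pod2man_command(pod, "man/man1")))
```

With the prefix applied at install time, the source files could keep short
names while "man SWISH-FAQ" still works.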

I was thinking at first about having two pages, one explaining the
configuration file options, and another for command line switches. But, in
some way it might make sense to have one page that just deals with
everything you need to know about creating an index.

I'm not sure whether things like filters should be integrated or get a
separate page.  The sunsite web pages are a bit fragmented -- things like
stemming and doc properties are listed separately, and that may only be
because they were late additions to the program.

At this point I don't really need to know *everything* that should be
included, but just if this layout makes sense.  I will need help with this
as I certainly do not know about all the features of swish.

So, just to throw out some ideas:

  1) Overview -- explain the indexing process for both file and spider
  methods, and note some advanced features like filters.

  2) Configuration -- detail every configuration directive in the
  configuration file, and their defaults.  I wonder if it might be useful
  to list them by some type of category instead of alphabetical.

  3) Command line options for indexing

  4) Examples?

  1) Overview of searching, and comments/suggestions on building
  front-end scripts in a secure way.

  2) Command line options

  3) Examples

  4) Reference to information on the C Library for embedding
  swish in other applications.

  Common questions from the list -- anything that might help a new user.

  This might be involved enough to require its own page.

  Describes how to embed swish into other applications and the API.

  Describes how to build and install a perl module to embed swish
  into perl programs.  Also point to the SWISH modules on CPAN (shameless

  List bug fixes and feature additions.

So, is that a reasonable layout of documentation files?

Bill Moseley
Received on Tue Dec 12 20:01:10 2000