I'm working on a project that requires indexing a directory to which
files are added frequently. I would also like to be able to attach
metadata, in the form of a user comment, to any file. So the basic idea
is to add a file, plus a comment about it, to the index at will. When
searching, I need to be able to search both the text in the file and the
comment, which may or may not have been added. This must work for every
type of file so that, ideally, even if a file is an executable or other
binary, I still get a result back if the search text is in the comment.
I would also like to be able to search based on access times and
creation dates, and I need to do all of this programmatically. A final
requirement, though not needed immediately, is the ability to delete
files from the index at will.

After looking closely at swish-e, here is what I think I know (please
correct me if I'm wrong here):
- Adding an individual file to an existing index is currently not
  possible programmatically. From the command line I can index the file
  into a new, empty index and then merge that index with the "main"
  index.
- Adding metadata to a file is currently only possible for HTML and XML
  documents. I think it is possible, though, to add metadata to any file
  by using the -S command line option with a separate program that
  supplies both the metadata and the file itself.
- Searching for text in a file or in the metadata is possible both
  programmatically and on the command line, but I'm not sure how to
  search based on creation/access dates.
- Removing a file from the index is currently not possible.
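
To make the -S idea above concrete, here is a rough sketch of the kind of
separate program I have in mind. It wraps the user comment and whatever
text can be pulled from the file in a small XML document and writes it to
stdout with a header block. The header names (Path-Name, Content-Length,
Last-Mtime, Document-Type) are my understanding of what swish-e expects
from an external program and should be checked against the docs. The
<comment> and <body> element names, and the helper names, are made up for
this sketch; the config would presumably need a matching MetaNames entry.

```python
import os
import sys
from xml.sax.saxutils import escape


def extract_text(path):
    """Best-effort text extraction; binaries contribute no body text."""
    with open(path, "rb") as fh:
        data = fh.read()
    if b"\0" in data:
        # Treat anything containing NUL bytes as binary: only the
        # comment will be searchable for such files.
        return ""
    text = data.decode("utf-8", errors="replace")
    # Drop control characters that an XML parser would reject.
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")


def build_record(path, comment, body_text, mtime):
    """Return one document (headers, blank line, XML body) as bytes."""
    xml = (
        "<doc>"
        f"<comment>{escape(comment)}</comment>"
        f"<body>{escape(body_text)}</body>"
        "</doc>\n"
    ).encode("utf-8")
    headers = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(xml)}\n"  # length of the body in bytes
        f"Last-Mtime: {int(mtime)}\n"
        "Document-Type: XML*\n"  # assumed name of the XML parser type
        "\n"
    ).encode("utf-8")
    return headers + xml


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: feeder.py PATH "user comment"
    file_path, user_comment = sys.argv[1], sys.argv[2]
    record = build_record(
        file_path,
        user_comment,
        extract_text(file_path),
        os.path.getmtime(file_path),
    )
    sys.stdout.buffer.write(record)
```

This only covers the modification time. How access and creation dates
would be carried along is still an open question for me.
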

Here is my initial solution (again, please correct me if you know a
better way):
- I will need to add a library function that lets me add a file directly
  to the index. I believe this should be done using the
  indexafile(SWISH *sw, char *path) function in fs.c.
- It looks like storage for metadata is already reserved for every file
  via the StoreDescription structure. I will need to add some
  functionality to my proposed library function to put data into this
  structure.
- The only function I found that removes a file is
  remove_last_file_from_list(SWISH *, IndexFILE *) in index.c. It only
  removes the last file from the index and, judging by the comments, is
  only intended to clean up an aborted index operation.
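
Since I found nothing that removes an arbitrary file, one stopgap I am
considering is to leave the index alone and keep a separate list of
"deleted" paths, filtering them out of every result set. This is not real
deletion, just a tombstone list outside the index. The sketch below
assumes results arrive as (path, rank) pairs, which is a made-up shape
and not what the library actually returns.

```python
def filter_deleted(results, deleted_paths):
    """Drop hits whose path has been marked as deleted.

    `results` is an iterable of (path, rank) pairs (hypothetical shape).
    `deleted_paths` is any iterable of paths marked as deleted.
    """
    deleted = set(deleted_paths)
    return [(path, rank) for path, rank in results if path not in deleted]


# Example with made-up hits: the executable has been "deleted".
hits = [("docs/a.txt", 1000), ("bin/tool.exe", 640), ("docs/b.txt", 310)]
visible = filter_deleted(hits, {"bin/tool.exe"})
```

The deleted entries would still take up space in the index until the
next full reindex, so this only hides them from the user.
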
Any thoughts, ideas, or comments are appreciated,
Brian Mila
Received on Tue Feb 4 04:59:47 2003