SWISH-CONFIG - Configuration File Directives
Swish-e version 2.4.7
Table of Contents
- OVERVIEW
- CONFIGURATION FILE
- Alphabetical Listing of Directives
- Directives that Control Swish
- Administrative Headers Directives
- Document Source Directives
- Document Contents Directives
- Directives for the File Access method only
- Directives for the HTTP Access Method Only
- Directives for the prog Access Method Only
- Document Filter Directives
- Document Info
CONFIGURATION FILE
A configuration file controls which files Swish-e indexes, how they are indexed, and where the index is written.
The configuration file is a text file composed of comments, blank lines, and configuration directives. The order of the directives is not important. Some directives may be used more than once in the configuration file, while others can only be used once (i.e. a later occurrence overwrites the earlier one). Directive names are not case-sensitive: you may use upper, lower, or mixed case.
A comment is any line that begins with a "#".
# This is a comment
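As a small illustration of the rules above, the following fragment mixes comments, a blank line, and directive names written in different cases. The paths and suffixes are placeholders chosen for this sketch, not recommended values.

```
# Comment lines and blank lines may appear anywhere.

IndexFile ./example.index
indexdir ./docs
INDEXONLY .html .txt
```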
As of 2.4.3, lines may be continued by placing a backslash as the last character on the line:
IgnoreWords \
am \
the \
foo
Directives may take more than one parameter. Enclose single parameters that include whitespace in quotes (single or double). Inside of quotes the backslash escapes the next character.
ReplaceRules append "foo bar" <- define "foo bar" as a single parameter
If you need to include a quote character in the value, either use a backslash to escape it or enclose the value in quotes of the other type.
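Both approaches can be sketched by reusing the ReplaceRules example from above. The strings are illustrative only; each line is meant to pass the same single parameter, foo "bar", to the directive.

```
# Escape the inner double quotes with backslashes ...
ReplaceRules append "foo \"bar\""
# ... or enclose the value in single quotes instead.
ReplaceRules append 'foo "bar"'
```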
Backslashes also have special meaning in regular expressions.
FileFilterMatch pdftotext "'%p' -" /\.pdf$/
Here the backslash says that the dot is a literal dot (instead of matching any character). If you place the regular expression in quotes, you must double the backslash.
FileFilterMatch pdftotext "'%p' -" "/\\.pdf$/"
Swish-e will convert the double backslash into a single backslash before passing the parameter to the regular expression compiler.
Commented example configuration files are included in the conf directory of the Swish-e distribution.
Some command line arguments can override directives specified in the configuration file. See also the SWISH-RUN page for instructions on running Swish-e, and the SWISH-SEARCH page for information and examples on how to search your index.
The configuration file is specified to Swish-e by the -c switch. For
example,
swish-e -c myconfig.conf
You may also split your directives up into different configuration files. This
allows you to have a master configuration file used for many different indexes,
and smaller configuration files for each separate index. You can specify the
different configuration files when running from the command line with the -c
switch (see SWISH-RUN), or you may include other configuration
files with the IncludeConfigFile directive described below.
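A minimal sketch of such a split, assuming a hypothetical shared file common.conf and a hypothetical per-index file docs.conf. The file names, paths, and word list are placeholders for illustration.

```
# common.conf: settings shared by every index
IgnoreWords am the foo

# docs.conf: settings for one particular index
IncludeConfigFile /path/to/common.conf
IndexDir ./docs
IndexFile ./docs.index
```

Under these assumptions, running swish-e -c docs.conf would read docs.conf and, through IncludeConfigFile, the shared settings as well.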
Typically, the directives in a configuration file are grouped in some logical order: directives that control the source of the documents come first, followed by directives that control how each document is filtered or how its words are indexed. (The directives listed below are grouped in this order.)
The configuration file directives are listed below in these groups:
- Administrative Headers Directives -- You may add administrative information to the header of the index file.
- Document Source Directives -- Directives for selecting the source documents and the location of the index file.
- Document Contents Directives -- Directives that control how a document's content is indexed.
- Directives for the File Access method only -- These directives are only applicable to the File Access indexing method.
- Directives for the HTTP Access Method Only -- Likewise, these only apply to the HTTP Access method.
- Directives for the prog Access Method Only -- These only apply to the prog Access method.
- Document Filter Directives -- This is a special section that describes using document filters with Swish-e.
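The skeleton below sketches how a configuration file for the File Access method might follow this grouping. All values are placeholders, and the assignment of directives to groups is an assumption made for illustration; consult the individual directive descriptions for the exact syntax.

```
# Administrative headers
IndexName "Example documentation index"
IndexAdmin "webmaster@example.com"

# Document source
IndexDir ./docs
IndexFile ./docs.index

# Document contents
IgnoreWords am the foo

# File Access method only
IndexOnly .html .txt .pdf

# Document filters
FileFilterMatch pdftotext "'%p' -" /\.pdf$/
```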
Alphabetical Listing of Directives
- AbsoluteLinks [yes|NO]
- BeginCharacters *string of characters*
- BumpPositionCounterCharacters *string*
- Buzzwords [*list of buzzwords*|File: path]
- CompressPositions [yes|NO]
- ConvertHTMLEntities [YES|no]
- DefaultContents [TXT|HTML|XML|TXT2|HTML2|XML2|TXT*|HTML*|XML*]
- Delay *seconds*
- DontBumpPositionOnEndTags *list of names*
- DontBumpPositionOnStartTags *list of names*
- EnableAltSearchSyntax [yes|NO]
- EndCharacters *string of characters*
- EquivalentServer *server alias*
- ExtractPath *metaname* [replace|remove|prepend|append|regex]
- FileFilter *suffix* *program* [options]
- FileFilterMatch *program* *options* *regex* [*regex* ...]
- FileInfoCompression [yes|NO]
- FileMatch [contains|is|regex] *regular expression*
- FileRules [contains|is|regex] *regular expression*
- FuzzyIndexingMode [NONE|Stemming|Soundex|Metaphone|DoubleMetaphone]