SWISH::API - Perl interface to the Swish-e C Library
Swish-e version 2.4.7
SYNOPSIS
use SWISH::API;
my $swish = SWISH::API->new( 'index.swish-e' );
$swish->abort_last_error
    if $swish->error;
# A short-cut way to search
my $results = $swish->query( "foo OR bar" );
# Or more typically
my $search = $swish->new_search_object;
# then in a loop
my $results = $search->execute( $query );
# always check for errors (but aborting is not always necessary)
$swish->abort_last_error
    if $swish->error;
# Display a list of results
my $hits = $results->hits;
if ( !$hits ) {
    print "No Results\n";
    return;  # for example
}
print "Found ", $results->hits, " hits\n";
# Seek to a given page - should check for errors
$results->seek_result( ($page-1) * $page_size );
while ( my $result = $results->next_result ) {
    printf("Path: %s\n Rank: %lu\n Size: %lu\n Title: %s\n Index: %s\n Modified: %s\n Record #: %lu\n File #: %lu\n\n",
        $result->property( "swishdocpath" ),
        $result->property( "swishrank" ),
        $result->property( "swishdocsize" ),
        $result->property( "swishtitle" ),
        $result->property( "swishdbfile" ),
        $result->result_property_str( "swishlastmodified" ),
        $result->property( "swishreccount" ),
        $result->property( "swishfilenum" )
    );
}
# display properties and metanames
for my $index_name ( $swish->index_names ) {
    my @metas = $swish->meta_list( $index_name );
    my @props = $swish->property_list( $index_name );
    for my $m ( @metas ) {
        my $name = $m->name;
        my $id = $m->id;
        my $type = $m->type;
    }
    # (repeat above for @props)
}
DESCRIPTION
This module provides a Perl interface to the Swish-e search engine. It embeds the Swish-e search code in your application, which avoids forking to run the swish-e binary and lets you keep an index file open across multiple queries. Both effects improve search performance.
DEPENDENCIES
You must have installed Swish-e version 2.4 before building this module. Download from:
http://swish-e.org
OVERVIEW
This module includes a number of classes.
Searching consists of connecting to a swish-e index (or indexes), and then running queries against the open index. Connecting to the index creates a swish object blessed into the SWISH::API class.
A SWISH::API::Search object is created from the SWISH::API object. The SWISH::API::Search object can have associated parameters (e.g. result sort order).
The SWISH::API::Search object is used to query the associated index file or files. A query on a search object returns a results object of the class SWISH::API::Results. Then individual results of the SWISH::API::Result class can be fetched by calling a method of the results object.
Finally, a result's properties can be accessed by calling methods on the result object.
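The chain of objects described above (handle, search, results, result) can be sketched as a small helper. This is only an illustration: the helper name top_paths and its $max argument are made up for this example, and it assumes the caller passes an already opened handle and a $max of at least 1.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SWISH::API): walk
# handle -> search -> results -> result and return the
# swishdocpath property of up to $max results.
sub top_paths {
    my ( $swish, $query, $max ) = @_;

    my $search  = $swish->new_search_object;    # SWISH::API::Search
    my $results = $search->execute($query);     # SWISH::API::Results

    # Errors are always reported on the handle, not on the results
    return if $swish->error;

    my @paths;
    while ( my $result = $results->next_result ) {    # SWISH::API::Result
        push @paths, $result->property('swishdocpath');
        last if @paths >= $max;
    }
    return @paths;
}
```

A caller might use it as my @paths = top_paths( $swish, "foo OR bar", 10 ); and treat an empty list as either "no hits" or "check the handle for an error".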
SWISH::API - Swish Handle Object
To begin using Swish you must first create a Swish Handle object. This object makes the connection to one or more index files and is used to create objects used for searching the associated index files.
- $swish = SWISH::API->new( $index_files );
This method returns a swish handle object blessed into the SWISH::API class. $index_files is a space separated list of index files to open. This always returns an object, even on errors. Caller must check for errors (see below).
- @indexes = $swish->index_names;
Returns a list of index names associated with the swish handle. These were the indexes specified as a parameter on the SWISH::API->new call. This can be used in calls below that require specifying the index file name.
- @header_names = $swish->header_names;
Returns a list of possible header names. These can be used to look up header values. See the header_value method below.
- @values = $swish->header_value( $index_file, $header_name );
A swish-e index has data associated with it stored in the index header. This method provides access to that data.
Returns the header value for the header and index file specified. Most headers are a single item, but some headers (e.g. "Stopwords") return a list.
The list of possible header names can be obtained from the header_names method.
- $swish->rank_scheme( 0|1 );
Similar to the -R option with the swish-e command line tool. The default ranking scheme is 0. Set it to 1 to experiment with other ranking features. See the SWISH-CONFIG documentation for more on ranking schemes.
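As a rough sketch of how index_names, header_names and header_value fit together, the following helper gathers every header of every open index into one data structure. The name collect_headers is invented for this example, and the helper assumes it is given a handle that opened without errors.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SWISH::API): build
#   { index_name => { header_name => [ values ] } }
# from an already opened swish handle.
sub collect_headers {
    my ($swish) = @_;
    my %headers;

    for my $index ( $swish->index_names ) {
        for my $name ( $swish->header_names ) {
            # Call in list context: some headers hold several values
            my @values = $swish->header_value( $index, $name );
            $headers{$index}{$name} = \@values;
        }
    }
    return \%headers;
}
```

Storing every value as an array reference keeps single-item and multi-item headers uniform, at the cost of an extra dereference for the common single-item case.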
Error Handling
All errors are stored in and accessed via the SWISH::API object (the Swish Handle). That is, even an error that occurs when calling a method on a result (SWISH::API::Result) object will store the error in the parent SWISH::API object.
Check for errors after every method call. Some errors are critical errors and will require destruction of the SWISH::API object. Critical errors will typically only happen when attaching to the database and are errors such as an invalid index file name, permissions errors, or passing invalid objects to calls.
Typically, if you receive an error when attaching to an index file or files, you should assume that the error is critical and let the swish object fall out of scope (and be destroyed). Otherwise, if an error is detected, you should check whether it is a critical error. If the error is not critical you may continue using the objects that have been created (for example, an invalid meta name will generate a non-critical error, so you may continue searching using the same search object).
Error state is cleared upon a new query.
Again, all error methods need to be called on the parent swish object:
- $swish->error
Returns true if an error occurred on the last operation. On errors the value returned is the internal Swish-e error number (which is less than zero).
- $swish->critical_error
Returns true if the last error was a critical error.
- $swish->abort_last_error
Aborts the running program and prints an error message to STDERR.
- $str = $swish->error_string
Returns the string description of the current error (based on the value returned by $swish->error). This is a generic error string.
- $msg = $swish->last_error_msg
Returns a string with specific information about the last error, if any. For example, given a query of:
badmeta=foo
where "badmeta" is an invalid metaname, $swish->error_string might return "Unknown metaname", but $swish->last_error_msg might return "badmeta".
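One possible way to combine the error methods above is a helper that classifies the state of the handle after any operation. This is a sketch only: error_status and its three return values are invented for this example and are not part of SWISH::API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SWISH::API): inspect the handle after
# an operation and return 'ok', 'recoverable' or 'critical'.
sub error_status {
    my ($swish) = @_;

    return 'ok' unless $swish->error;

    # error_string is the generic description; last_error_msg adds the
    # specifics (it may be empty, hence the default)
    warn sprintf "Swish-e error %d: %s (%s)\n",
        $swish->error, $swish->error_string, $swish->last_error_msg // '';

    # After a critical error the handle should be discarded; after a
    # non-critical one the existing objects can still be used
    return $swish->critical_error ? 'critical' : 'recoverable';
}
```

In a query loop this lets the caller skip a bad query on 'recoverable' and drop the handle on 'critical', which is a gentler policy than calling abort_last_error on every error.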
Generating Search and Result Objects
- $search = $swish->new_search_object( $query );
This creates a new search object blessed into the SWISH::API::Search class. The optional $query parameter is a query string to store in the search object.
See the section on SWISH::API::Search for methods available on the returned object.
The advantage of this method is that a search object can be used for multiple queries:
$search = $swish->new_search_object;
while ( $query = next_query() ) {
    $results = $search->execute( $query );
    ...
}
- $results = $swish->query( $query );
This is a short-cut which avoids the step of creating a separate search object. It returns a results object blessed into the SWISH::API::Results class described below.
This method is essentially equivalent to:
$results = $swish->new_search_object->execute( $query );
SWISH::API::Search - Search Objects
A search object holds the parameters used to generate a list of results. These methods are used to adjust these parameters and to create the list of results for the current set of search parameters.
- $search->set_query( $query );
This will set (or replace) the query string associated with a search object. This method is typically not used as the query can be set when executing the actual query or when creating a search object.
- $search->set_structure( $structure_bits );
This method may change in the future.
A "structure" is a bit-mapped flag used to limit search results to specific parts of an HTML document, such as the title or in H tags. The possible bits are:
IN_FILE     = 1    This is the default
IN_TITLE    = 2    In <title> tag
IN_HEAD     = 4    In <head> tag
IN_BODY     = 8    In <body>
IN_COMMENTS = 16   In HTML comments
IN_HEADER
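Because the structure is a bit-mapped flag, several parts of a document can be selected by combining bits with bitwise OR. The sketch below defines its own local constants that mirror the values listed above; it does not assume that SWISH::API exports constants under these names, and it leaves out any bit whose value is not given in the list.

```perl
use strict;
use warnings;

# Local copies of the listed bit values (an assumption of this sketch,
# not names known to be exported by SWISH::API)
use constant {
    IN_FILE     => 1,
    IN_TITLE    => 2,
    IN_HEAD     => 4,
    IN_BODY     => 8,
    IN_COMMENTS => 16,
};

# Combine flags with bitwise OR: 2 | 8 = 10
my $structure_bits = IN_TITLE | IN_BODY;

# Test individual flags with bitwise AND
printf "title selected:    %s\n", ( $structure_bits & IN_TITLE )    ? 'yes' : 'no';
printf "comments selected: %s\n", ( $structure_bits & IN_COMMENTS ) ? 'yes' : 'no';

# With a real search object the value would then be passed on:
#   $search->set_structure( $structure_bits );
```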