SWISH::API - Perl interface to the Swish-e C Library

Swish-e version 2.4.7

SYNOPSIS

    use SWISH::API;

    my $swish = SWISH::API->new( 'index.swish-e' );

    $swish->abort_last_error
        if $swish->error;

    # A short-cut way to search

    my $results = $swish->query( "foo OR bar" );

    # Or more typically
    my $search = $swish->new_search_object;

    # then in a loop
    my $results = $search->execute( $query );

    # always check for errors (but aborting is not always necessary)

    $swish->abort_last_error
        if $swish->error;

    # Display a list of results

    my $hits = $results->hits;
    if ( !$hits ) {
        print "No Results\n";
        return;  # for example
    }

    print "Found ", $results->hits, " hits\n";

    # Seek to a given page - should check for errors
    $results->seek_result( ($page-1) * $page_size );

    while ( my $result = $results->next_result ) {
        printf("Path: %s\n  Rank: %lu\n  Size: %lu\n  Title: %s\n  Index: %s\n  Modified: %s\n  Record #: %lu\n  File   #: %lu\n\n",
            $result->property( "swishdocpath" ),
            $result->property( "swishrank" ),
            $result->property( "swishdocsize" ),
            $result->property( "swishtitle" ),
            $result->property( "swishdbfile" ),
            $result->result_property_str( "swishlastmodified" ),
            $result->property( "swishreccount" ),
            $result->property( "swishfilenum" )
        );
    }

    # display properties and metanames

    for my $index_name ( $swish->index_names ) {
        my @metas = $swish->meta_list( $index_name );
        my @props = $swish->property_list( $index_name );

        for my $m ( @metas ) {
            my $name = $m->name;
            my $id = $m->id;
            my $type = $m->type;
        }
        # (repeat above for @props)
    }
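
The seek_result call in the synopsis turns a page number into a result offset. A minimal sketch of that arithmetic follows, assuming pages are numbered from 1 as the synopsis implies. The helpers page_offset and last_page are hypothetical names used for illustration and are not part of SWISH::API.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a 1-based page number into the
# zero-based offset that the synopsis passes to seek_result.
sub page_offset {
    my ( $page, $page_size ) = @_;
    $page = 1 if !defined $page || $page < 1;    # clamp bad input to page 1
    return ( $page - 1 ) * $page_size;
}

# Hypothetical helper: number of the last page for a given hit count.
sub last_page {
    my ( $hits, $page_size ) = @_;
    return 0 if !$hits;
    return int( ( $hits + $page_size - 1 ) / $page_size );
}

# 45 hits at 10 per page: page 3 of 5 starts at offset 20.
printf "page 3 of %d starts at offset %d\n",
    last_page( 45, 10 ), page_offset( 3, 10 );
```

With a results object in hand, the offset would be passed straight through, for example $results->seek_result( page_offset( $page, $page_size ) ), followed by the usual error check on the swish handle.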

DESCRIPTION

This module provides a Perl interface to the Swish-e search engine. Embedding the swish-e search code in your application avoids the need to fork and run the swish-e binary, and lets you keep an index file open while running multiple queries. Both improve search performance.

DEPENDENCIES

You must have installed Swish-e version 2.4 before building this module. Download from:

    http://swish-e.org

OVERVIEW

This module includes a number of classes.

Searching consists of connecting to a swish-e index (or indexes), and then running queries against the open index. Connecting to the index creates a swish object blessed into the SWISH::API class.

A SWISH::API::Search object is created from the SWISH::API object. The SWISH::API::Search object can have associated parameters (e.g. result sort order).

The SWISH::API::Search object is used to query the associated index file or files. A query on a search object returns a results object of the class SWISH::API::Results. Then individual results of the SWISH::API::Result class can be fetched by calling a method of the results object.

Finally, a result's properties can be accessed by calling methods on the result object.

METHODS

SWISH::API - Swish Handle Object

To begin using Swish you must first create a Swish Handle object. This object makes the connection to one or more index files and is used to create objects used for searching the associated index files.

  • $swish = SWISH::API->new( $index_files );

    This method returns a swish handle object blessed into the SWISH::API class. $index_files is a space-separated list of index files to open. This always returns an object, even on errors. The caller must check for errors (see below).

  • @indexes = $swish->index_names;

    Returns a list of index names associated with the swish handle. These were the indexes specified as a parameter on the SWISH::API->new call. This can be used in calls below that require specifying the index file name.

  • @header_names = $swish->header_names;

    Returns a list of possible header names. These can be used to look up header values. See the $swish->header_value method below.

  • @values = $swish->header_value( $index_file, $header_name );

    A swish-e index has data associated with it stored in the index header. This method provides access to that data.

    Returns the header value for the header and index file specified. Most headers are a single item, but some headers (e.g. "Stopwords") return a list.

    The list of possible header names can be obtained from the $swish->header_names method.

  • $swish->rank_scheme( 0|1 );

    Similar to the -R option with the swish-e command line tool. The default ranking scheme is 0. Set it to 1 to experiment with other ranking features. See the SWISH-CONFIG documentation for more on ranking schemes.
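
The handle methods above can be combined to dump the header data of every attached index. The following is a minimal sketch that uses only index_names, header_names and header_value as described above; collect_headers is a hypothetical helper name, and the sub accepts any object that provides those three methods.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every header value for every index
# attached to a swish handle. Returns a hash reference of the form
# { index name => { header name => [ values ] } }.
sub collect_headers {
    my ($swish) = @_;
    my %headers;
    for my $index ( $swish->index_names ) {
        for my $name ( $swish->header_names ) {
            # header_value may return a list (e.g. "Stopwords"),
            # so always store an array reference.
            my @values = $swish->header_value( $index, $name );
            $headers{$index}{$name} = \@values;
        }
    }
    return \%headers;
}
```

A caller would create the handle with SWISH::API->new, check $swish->error, and then pass the handle to collect_headers.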

Error Handling

All errors are stored in and accessed via the SWISH::API object (the Swish Handle). That is, even an error that occurs when calling a method on a result (SWISH::API::Result) object will store the error in the parent SWISH::API object.

Check for errors after every method call. Some errors are critical errors and will require destruction of the SWISH::API object. Critical errors will typically only happen when attaching to the database and are errors such as an invalid index file name, permissions errors, or passing invalid objects to calls.

Typically, if you receive an error when attaching to an index file or files, you should assume that the error is critical and let the swish object fall out of scope (and be destroyed). For any other error, check whether it is critical. If the error is not critical you may continue using the objects that have been created (for example, an invalid meta name will generate a non-critical error, so you may continue searching using the same search object).

Error state is cleared upon a new query.

Again, all error methods need to be called on the parent swish object.

  • $swish->error

    Returns true if an error occurred on the last operation. On errors the value returned is the internal Swish-e error number (which is less than zero).

  • $swish->critical_error

    Returns true if the last error was a critical error.

  • $swish->abort_last_error

    Aborts the running program and prints an error message to STDERR.

  • $str = $swish->error_string

    Returns the string description of the current error (based on the value returned by $swish->error). This is a generic error string.

  • $msg = $swish->last_error_msg

    Returns a string with specific information about the last error, if any. For example, given the query:

        badmeta=foo

    where "badmeta" is an invalid metaname, $swish->error_string might return "Unknown metaname", while $swish->last_error_msg might return "badmeta".
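
The error methods above are usually called together. The following is a minimal sketch of one way to bundle them, assuming only the methods documented in this section; last_error_info is a hypothetical helper name and is not part of SWISH::API.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise the error state of a swish handle.
# Returns undef when the last operation succeeded, otherwise a hash
# reference with the error number, a critical flag and a message.
sub last_error_info {
    my ($swish) = @_;
    my $number = $swish->error or return;
    return {
        number   => $number,
        critical => $swish->critical_error ? 1 : 0,
        # Generic description first, then the specific detail if any.
        message  => join( ': ',
            grep { defined $_ && length $_ }
                $swish->error_string, $swish->last_error_msg ),
    };
}
```

A caller can then decide whether to discard the handle (critical) or to report the message and carry on with the same search object.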

Generating Search and Result Objects

  • $search = $swish->new_search_object( $query );

    This creates a new search object blessed into the SWISH::API::Search class. The optional $query parameter is a query string to store in the search object.

    See the section on SWISH::API::Search for methods available on the returned object.

    The advantage of this method is that a search object can be used for multiple queries:

        $search = $swish->new_search_object;
        while ( $query = next_query() ) {
            $results = $search->execute( $query );
            ...
        }

  • $results = $swish->query( $query );

    This is a short-cut which avoids the step of creating a separate search object. It returns a results object blessed into the SWISH::API::Results class described below.

    This method is essentially equivalent to:

        $results = $swish->new_search_object->execute( $query );
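
Reusing one search object across several queries, with the error checks recommended earlier, might look like the following minimal sketch. It uses only new_search_object, execute, hits and the error methods described above; hits_per_query is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: run several queries through one search object
# and return a hash reference mapping each query to its hit count.
sub hits_per_query {
    my ( $swish, @queries ) = @_;
    my $search = $swish->new_search_object;
    my %hits;
    for my $query (@queries) {
        my $results = $search->execute($query);
        if ( $swish->error ) {
            # A critical error means the handle should not be reused.
            die $swish->error_string, "\n" if $swish->critical_error;
            # Non-critical (e.g. an unknown metaname): skip this query.
            warn "Skipping '$query': ", $swish->error_string, "\n";
            next;
        }
        $hits{$query} = $results->hits;
    }
    return \%hits;
}
```

Errors are read from the swish handle rather than from the search or results object, since all error state lives on the handle.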

SWISH::API::Search - Search Objects

A search object holds the parameters used to generate a list of results. These methods are used to adjust these parameters and to create the list of results for the current set of search parameters.

  • $search->set_query( $query );

    This will set (or replace) the query string associated with a search object. This method is typically not used as the query can be set when executing the actual query or when creating a search object.

  • $search->set_structure( $structure_bits );

    This method may change in the future.

    A "structure" is a bit-mapped flag used to limit search results to specific parts of an HTML document, such as the title or the heading (H) tags. The possible bits are:

        IN_FILE         = 1      This is the default
        IN_TITLE        = 2      In <title> tag
        IN_HEAD         = 4      In <head> tag
        IN_BODY         = 8      In <body>
        IN_COMMENTS     = 16     In html comments
        IN_HEADER
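
Because the structure value is a bit map, several of the bits listed above can be combined with bitwise OR before being passed to set_structure. The following is a minimal sketch under stated assumptions: the constants are defined locally from the values shown in the table (this sketch does not assume that SWISH::API exports these names), and structure_mask is a hypothetical helper.

```perl
use strict;
use warnings;

# Values copied from the table above; defined locally because this
# sketch does not assume SWISH::API exports these names.
use constant {
    IN_FILE     => 1,
    IN_TITLE    => 2,
    IN_HEAD     => 4,
    IN_BODY     => 8,
    IN_COMMENTS => 16,
};

# Hypothetical helper: OR any number of structure bits into one mask.
sub structure_mask {
    my $mask = 0;
    $mask |= $_ for @_;
    return $mask;
}

my $mask = structure_mask( IN_TITLE, IN_HEAD );
print "mask = $mask\n";    # 2 | 4 = 6
```

The resulting value would then be passed as $search->set_structure( $mask ).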