Documentation: the xmlproc APIs

Using the API

Ordinary XML parsing

An application that uses the xmlproc API has to import the xmlproc module (non-validating parsing) or the xmlval module (validating parsing). A parser object is created by instantiating an object of the XMLProcessor class (non-validating) or XMLValidator (validating). Both classes have the same interface.

If you want to receive information about the document being parsed you must implement an object conforming to the Application interface, and tell the parser about it with the set_application method.

If you want to receive error events and react to them you must implement an object conforming to the ErrorHandler interface, and tell the parser to use your error handler with the set_error_handler method.

It is also possible to control the way the parser interprets system identifiers, by implementing an object conforming to the InputSourceFactory interface and giving it to the parser with the set_inputsource_factory method.
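
The setup steps above can be collected into one small helper. The following is only a sketch: configure_parser is a hypothetical helper name, and RecordingParser is a stand-in that is not part of xmlproc; it exists so the example runs without the library installed. Only the three setter names are taken from this documentation.

```python
def configure_parser(parser, app=None, err=None, isf=None):
    """Attach the optional handlers using the documented setter methods.

    Any argument left as None is skipped, so the parser keeps its default.
    """
    if app is not None:
        parser.set_application(app)
    if err is not None:
        parser.set_error_handler(err)
    if isf is not None:
        parser.set_inputsource_factory(isf)
    return parser


class RecordingParser:
    """Hypothetical stand-in, NOT part of xmlproc: records which setters ran."""

    def __init__(self):
        self.calls = []

    def set_application(self, app):
        self.calls.append(("set_application", app))

    def set_error_handler(self, err):
        self.calls.append(("set_error_handler", err))

    def set_inputsource_factory(self, isf):
        self.calls.append(("set_inputsource_factory", isf))


# Placeholder strings stand in for real Application / ErrorHandler objects.
demo = configure_parser(RecordingParser(), app="my-app", err="my-errors")
print([name for name, _ in demo.calls])
```

With the real library, the first argument would presumably be an XMLProcessor or XMLValidator instance, followed by a call to parse_resource with the system identifier of the document.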

Working with DTDs and catalog files

See the DTD API documentation and the catalog file documentation.

List of interfaces

These are the classes of interest to xmlproc application writers:

The Parser interface

This is the interface implemented by the two XML parser objects and is used to control parsing.

def __init__(self):
Instantiates a parser.
def set_application(self,app):
Tells the parser where to send data events.
def set_error_handler(self,err):
Tells the parser where to send error events.
def set_inputsource_factory(self,isf):
Tells the parser which object to use to map system identifiers to file-like objects.
def set_pubid_resolver(self,pubres):
Tells the parser which object to use to map public identifiers to system identifiers.
def set_dtd_listener(self, dtd_listener):
Tells the parser where to send DTD parse events. The dtd_listener object must implement the DTDConsumer interface.
def parse_resource(self,sysID,bufsize=16384):
Makes the parser parse the XML document with the given system identifier.
def reset(self):
Resets the parser to process another file, losing all unparsed data.
def feed(self,new_data):
Makes the parser parse a chunk of data.
def close(self):
Closes the parser, making it process all remaining data. The effects of calling feed after close and before the first reset are undefined.
def get_current_sysid(self):
Returns the system identifier of the current entity being parsed.
def get_offset(self):
Returns the current offset (in characters) from the start of the entity.
def get_line(self):
Returns the current line number.
def get_column(self):
Returns the current column position.
def get_dtd(self):
Returns the object holding information about the DTD of the document. This object conforms to the DTD interface. (Note that the DTD object returned by XMLProcessor will have much less information, since the XMLProcessor does not keep as much DTD information.)
def set_error_language(self,language):
Tells the parser which language to report errors in. 'language' must be an ISO 639 language code (case does not matter). A KeyError will be raised if the language is not supported.
def set_data_after_wf_error(self,stop_on_error):
Tells the parser what to do with data events after a well-formedness error: pass 0 to keep reporting data events to the application, or 1 (the default) to stop reporting data.
def set_read_external_subset(self, read):
Tells the parser whether to read the external DTD subset of documents (including external parameter entities). Note that XMLValidator will ignore this method and always read the external subset.
def deref(self):
The parser creates circular data structures during parsing. When the parser object is no longer to be used and you wish to free the memory it has allocated, call this method. The parser object will be non-functional afterwards.
def get_elem_stack(self):
This method returns the list that holds the stack of open elements. Note that this list is live and must not be modified by the application.
def get_raw_construct(self):
Returns the raw XML string that triggered the current callback event.
def get_current_ent_stack(self):
Returns a snapshot of the current stack of open entities as a list of (entity name, entity sysid) tuples.
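
The interplay of feed, close, reset and deref described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the library's own code: parse_in_chunks is a hypothetical helper, and StubParser is a stand-in that is not part of xmlproc and only logs the calls it receives. The default chunk size simply mirrors the bufsize default of parse_resource.

```python
import io


def parse_in_chunks(parser, stream, chunk_size=16384):
    """Feed a file-like object to the parser piece by piece, then close it."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty read means end of input
            break
        parser.feed(chunk)
    parser.close()             # lets the parser process any remaining data


class StubParser:
    """Hypothetical stand-in, NOT xmlproc: logs calls instead of parsing."""

    def __init__(self):
        self.log = []

    def feed(self, new_data):
        self.log.append("feed(%d)" % len(new_data))

    def close(self):
        self.log.append("close")

    def reset(self):
        self.log.append("reset")

    def deref(self):
        self.log.append("deref")


parser = StubParser()
try:
    parse_in_chunks(parser, io.StringIO("<doc>hello</doc>"), chunk_size=8)
    parser.reset()             # required before feeding a second document
    parse_in_chunks(parser, io.StringIO("<a/>"), chunk_size=8)
finally:
    parser.deref()             # break circular references; parser is unusable afterwards
print(parser.log)
```

Calling reset between documents follows the note on close that feeding after close and before a reset is undefined. Placing deref in a finally clause is a design choice that releases the parser's memory even if parsing raises an exception.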

Application

This is the interface of the objects that receive data events from the parser.
