Class AdvancedContentHandler

java.lang.Object
org.tquadrat.foundation.xml.parse.AdvancedContentHandler
All Implemented Interfaces:
ContentHandler

@ClassVersion(sourceVersion="$Id: AdvancedContentHandler.java 1101 2024-02-18 00:18:48Z tquadrat $") @API(status=STABLE, since="0.0.5") public abstract class AdvancedContentHandler extends Object implements ContentHandler

This class implements the interface ContentHandler as a base class for more advanced versions of the DefaultHandler class or for stand-alone use.

Instead of implementing the three methods characters(), endElement(), and startElement(), only handlers for the elements have to be implemented; after these handlers are registered using registerElementHandler(String, HandlerMethod), they are called automatically by the default implementations of processElement() and openElement().

These methods can still be overridden if different processing is desired. processElement() is called after the element is terminated and receives the attributes together with the character data of the element. openElement() is called each time an element is opened and receives the attributes only.
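
The registration mechanism described above can be pictured with plain JDK SAX classes. The following is a minimal sketch only, not the implementation of AdvancedContentHandler: the class name DispatchSketch, its register() method, and the use of Consumer<String> in place of HandlerMethod (whose signature is not shown on this page) are assumptions made for illustration. It buffers the character data of each open element and, when the element is closed, calls the handler registered for its qualified name.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Illustration only: dispatches closed elements to handlers registered by qualified name. */
final class DispatchSketch extends DefaultHandler {
    // Hypothetical stand-in for the map of element handlers.
    private final Map<String, Consumer<String>> handlers = new HashMap<>();
    // One text buffer per currently open element.
    private final Deque<StringBuilder> text = new ArrayDeque<>();

    void register(String qName, Consumer<String> handler) {
        handlers.put(qName, handler);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.push(new StringBuilder());
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!text.isEmpty()) {
            text.peek().append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        String data = text.pop().toString().trim();
        Consumer<String> handler = handlers.get(qName);
        if (handler == null) {
            // Mirrors the documented default: an unregistered element is an error.
            throw new SAXException("No handler registered for element: " + qName);
        }
        handler.accept(data);
    }

    /** Parses the given XML and returns the text of all title elements in document order. */
    static List<String> titles(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        DispatchSketch sketch = new DispatchSketch();
        sketch.register("library", data -> { });
        sketch.register("book", data -> { });
        sketch.register("title", result::add);
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), sketch);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<library><book><title>Dune</title></book>"
            + "<book><title>Emma</title></book></library>";
        System.out.println(titles(xml)); // prints [Dune, Emma]
    }
}
```

Keeping the dispatch in a map means that a new element type needs only one more registration, not a further branch in endElement().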

Some convenience methods have been implemented that will give access to the parent element and to the path down to the current element.

Note: Unfortunately, this class does not work for XML streams that have elements embedded in text, as is usual for HTML. The snippet

<p>First Text <b>Bold Text</b> Second Text</p>

will be parsed as "First Text Second Text" for the p element and "Bold Text" for the b element; the information that the b element was embedded in between is lost.
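
The effect can be reproduced with a plain SAX handler that keeps one text buffer per open element. This is a sketch under assumptions, not the code of AdvancedContentHandler: the class MixedContentSketch and its parse() method are hypothetical, and runs of whitespace are collapsed to a single space so that the result matches the strings quoted above.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Illustration only: collects character data per element, losing the position of child elements. */
final class MixedContentSketch extends DefaultHandler {
    private final Deque<StringBuilder> open = new ArrayDeque<>();
    private final Map<String, String> textByElement = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        open.push(new StringBuilder());
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            // Text always goes to the innermost open element.
            open.peek().append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Assumption of this sketch: whitespace is normalised before the text is stored.
        String text = open.pop().toString().replaceAll("\\s+", " ").trim();
        textByElement.put(qName, text);
    }

    /** Returns the collected text per element name, in the order the elements were closed. */
    static Map<String, String> parse(String xml) throws Exception {
        MixedContentSketch sketch = new MixedContentSketch();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), sketch);
        return sketch.textByElement;
    }

    public static void main(String[] args) throws Exception {
        // prints {b=Bold Text, p=First Text Second Text}
        System.out.println(parse("<p>First Text <b>Bold Text</b> Second Text</p>"));
    }
}
```

Because the text before and after the b element lands in the same buffer, nothing records where the b element was located inside the p element.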

Author:
Thomas Thrien (thomas.thrien@tquadrat.org)
Version:
$Id: AdvancedContentHandler.java 1101 2024-02-18 00:18:48Z tquadrat $
Since:
0.0.5
UML Diagram
UML Diagram for "org.tquadrat.foundation.xml.parse.AdvancedContentHandler"

  • Field Details

  • Constructor Details

  • Method Details

    • characters

      public final void characters(char[] ch, int start, int length) throws SAXException
      Receives notification of character data inside an element.
      Specified by:
      characters in interface ContentHandler
      Parameters:
      ch - The characters.
      start - The start position inside the characters array.
      length - The length of the subset to process.
      Throws:
      SAXException - Something has gone wrong.
    • composeAttribute

      private static final Attribute composeAttribute(Attributes attributes, int index) throws IllegalArgumentException, URISyntaxException
      Composes an Attribute instance from the data of the given Attributes instance at the given index.
      Parameters:
      attributes - The attributes.
      index - The index.
      Returns:
      The attribute.
      Throws:
      URISyntaxException - The URI for the attribute's namespace cannot be parsed correctly.
      IllegalArgumentException - The attribute type is invalid.
    • endDocument

      @MountPoint public void endDocument() throws SAXException
      Receives the notification about the end of the document.

      This implementation does nothing by default. Application writers may override this method in a subclass to take specific actions at the end of a document (such as finalising a tree or closing an output file).
      Specified by:
      endDocument in interface ContentHandler
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
    • endElement

      public final void endElement(String uri, String localName, String qName) throws SAXException
      Receives the notification about the end of an element. This method will call the processElement() method and afterwards it will remove the element from the stack - in exactly that order, otherwise the getPath() method would return wrong results.
      Specified by:
      endElement in interface ContentHandler
      Parameters:
      uri - The URI for the namespace of this element; can be empty.
      localName - The local name of the element.
      qName - The element's qualified name.
      Throws:
      SAXException - The element was not correct according to the DTD.
    • endPrefixMapping

      public final void endPrefixMapping(String prefix) throws SAXException
      Receives the notification of the end of a namespace mapping.
      Specified by:
      endPrefixMapping in interface ContentHandler
      Parameters:
      prefix - The Namespace prefix whose mapping ends.
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
    • getDocumentType

      public final String getDocumentType()
      Returns the name of the document type.
      Returns:
      The document type.
    • getLocator

      protected final Locator getLocator()
      Returns a copy of the locator.
      Returns:
      A copy of the locator object or null if there was none provided by the parser.
    • getPath

      protected final String[] getPath()
      Returns the path for the element as an array, with the qualified element names as the entries in the array. The array is ordered so that the current element is at position [0] and the root element (the document element) is at [length - 1].
      Returns:
      The list of element names that build the path to the current element.
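
      The ordering of the path array, and the reason why endElement() processes an element before removing it from the stack, can be sketched with a plain SAX handler. The class PathSketch and its pathAt() method are hypothetical names chosen for this illustration; the sketch does not reproduce the actual implementation.

      ```java
      import java.io.StringReader;
      import java.util.ArrayDeque;
      import java.util.Arrays;
      import java.util.Deque;
      import javax.xml.parsers.SAXParserFactory;
      import org.xml.sax.Attributes;
      import org.xml.sax.InputSource;
      import org.xml.sax.helpers.DefaultHandler;

      /** Illustration only: keeps a stack of open elements and snapshots it as a path array. */
      final class PathSketch extends DefaultHandler {
          private final Deque<String> stack = new ArrayDeque<>();
          private final String target;
          private String[] pathAtTarget = new String[0];

          PathSketch(String target) {
              this.target = target;
          }

          @Override
          public void startElement(String uri, String localName, String qName, Attributes attributes) {
              // push() adds at the head, so the current element is always first.
              stack.push(qName);
          }

          @Override
          public void endElement(String uri, String localName, String qName) {
              // Take the snapshot first and pop afterwards; otherwise the
              // current element would already be missing from the path.
              if (qName.equals(target)) {
                  pathAtTarget = stack.toArray(new String[0]);
              }
              stack.pop();
          }

          /** Returns the path seen when the element with the given qualified name is closed. */
          static String[] pathAt(String xml, String target) throws Exception {
              PathSketch sketch = new PathSketch(target);
              SAXParserFactory.newInstance().newSAXParser()
                  .parse(new InputSource(new StringReader(xml)), sketch);
              return sketch.pathAtTarget;
          }

          public static void main(String[] args) throws Exception {
              // prints [c, b, a]
              System.out.println(Arrays.toString(pathAt("<a><b><c/></b></a>", "c")));
          }
      }
      ```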
    • getPathDepth

      protected final int getPathDepth()
      Returns the path depth for the element.
      Returns:
      The number of nodes on the path to the current element. 0 means that the current element is the document.
    • handleElement

      @MountPoint @API(status=MAINTAINED, since="0.1.0") protected void handleElement(AdvancedContentHandler.Element element, boolean terminateElement) throws SAXException
      The default element handling; it does nothing.
      Parameters:
      element - The element.
      terminateElement - true if called by processElement(Element), indicating that the element processing will be terminated, false when called by openElement(Element).
      Throws:
      SAXException - The element cannot be handled properly.
      Since:
      0.1.0
    • ignorableWhitespace

      @MountPoint public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
      Receives the notification of ignorable whitespace in element content.

      This implementation does nothing by default. Application writers may override this method to take specific actions for each chunk of ignorable whitespace (such as adding data to a node or buffer, or printing it to a file).
      Specified by:
      ignorableWhitespace in interface ContentHandler
      Parameters:
      ch - The whitespace characters.
      start - The start position in the character array.
      length - The number of characters to use from the character array.
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
    • openElement

      This method is called every time the parser encounters a new element. It should be overridden if it is necessary to perform any activities for a specific element.

      The default implementation looks up a method handler in the map of element handlers and calls that, or throws an exception if no handler was registered for that element.
      Parameters:
      element - The element.
      Throws:
      SAXException - Something has gone wrong.
      Since:
      0.1.0
    • processElement

      Processes an element of the XML file. This method is called by endElement() each time an element is closed.

      The default implementation looks up a method handler in the map of element handlers and calls that, or throws an exception if no handler was registered for that element.
      Parameters:
      element - The element.
      Throws:
      SAXException - Something has gone wrong.
      Since:
      0.1.0
    • processingInstruction

      @MountPoint public void processingInstruction(String target, String data) throws SAXException
      Receives notification of a processing instruction.

      This implementation does nothing by default. Application writers may override this method in a subclass to take specific actions for each processing instruction, such as setting status variables or invoking other methods.
      Specified by:
      processingInstruction in interface ContentHandler
      Parameters:
      target - The processing instruction target.
      data - The processing instruction data, or null if none is supplied.
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
    • registerElementHandler

      Adds an element handler to the map of handler methods.
      Parameters:
      qName - The qualified name of the elements that should be processed by the handler.
      method - The method reference for the handler.
    • retrieveCurrentColumn

      protected final int retrieveCurrentColumn()
      Returns the current column number in the XML file. A negative value indicates that the column is unknown.
      Returns:
      The current column number.
    • retrieveCurrentLine

      protected final int retrieveCurrentLine()
      Returns the current line number in the XML file. A negative value indicates that the line is unknown.
      Returns:
      The current line number.
    • retrieveCurrentNamespace

      Returns the namespace for the current element (the one on top of the element stack).
      Returns:
      An instance of Optional that holds the namespace for the current element. Will be empty if there is no namespace for the current element.
      Throws:
      SAXException - An error occurred while retrieving the namespace information.
      Since:
      0.1.0
    • retrieveNamespace

      @API(status=MAINTAINED, since="0.1.0") protected final Optional<URI> retrieveNamespace(String prefix)
      Returns the URI of the namespace for the given prefix.
      Parameters:
      prefix - The prefix.
      Returns:
      An instance of Optional that holds the namespace for the prefix. Will be empty if there is no namespace for the given prefix.
      Since:
      0.1.0
    • retrievePrefix

      @API(status=MAINTAINED, since="0.1.0") protected final Optional<String> retrievePrefix(URI namespace)
      Returns the registered prefix for the given namespace. If more than one prefix is registered for the same namespace, only the alphabetically first one is returned.
      Parameters:
      namespace - The URI for the namespace.
      Returns:
      An instance of Optional that holds the registered prefix.
      Since:
      0.1.0
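
      The lookup in both directions, including the "alphabetically first" rule, can be sketched as follows. This is an assumption about how such a registry could look, not the implementation behind retrieveNamespace() and retrievePrefix(): the class PrefixLookupSketch and its methods are hypothetical, and the natural String ordering of a TreeMap stands in for "alphabetically".

      ```java
      import java.net.URI;
      import java.util.Map;
      import java.util.Optional;
      import java.util.TreeMap;

      /** Illustration only: a prefix registry that resolves a namespace to its first prefix. */
      final class PrefixLookupSketch {
          // A TreeMap iterates its keys in sorted order, so the first match is the smallest prefix.
          private final Map<String, URI> namespaces = new TreeMap<>();

          void map(String prefix, URI namespace) {
              namespaces.put(prefix, namespace);
          }

          Optional<URI> namespaceFor(String prefix) {
              return Optional.ofNullable(namespaces.get(prefix));
          }

          Optional<String> prefixFor(URI namespace) {
              return namespaces.entrySet().stream()
                  .filter(entry -> entry.getValue().equals(namespace))
                  .map(Map.Entry::getKey)
                  .findFirst();
          }

          public static void main(String[] args) {
              PrefixLookupSketch registry = new PrefixLookupSketch();
              URI schema = URI.create("http://www.w3.org/2001/XMLSchema");
              registry.map("xsd", schema);
              registry.map("xs", schema);
              System.out.println(registry.prefixFor(schema));       // prints Optional[xs]
              System.out.println(registry.namespaceFor("unknown")); // prints Optional.empty
          }
      }
      ```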
    • setDocumentLocator

      public final void setDocumentLocator(Locator locator)
      Receives an object for locating the origin of SAX document events.

      SAX parsers are strongly encouraged (though not absolutely required) to supply a locator: if a parser does so, it must supply the locator to the application by invoking this method before invoking any of the other methods in the ContentHandler interface.

      The locator allows the application to determine the end position of any document-related event, even if the parser is not reporting an error. Typically, the application will use this information for reporting its own errors (such as character content that does not match an application's business rules). The information returned by the locator is probably not sufficient for use with a search engine.

      Note that the locator will return correct information only during the invocation of SAX event callbacks after startDocument returns and before endDocument is called. The application should not attempt to use it at any other time.
      Specified by:
      setDocumentLocator in interface ContentHandler
      Parameters:
      locator - An object that can return the location of any SAX document event.
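
      How a handler can keep the locator and query it during event callbacks is sketched below with the standard SAX API only. The class LocatorSketch and its startLines() method are hypothetical names for this illustration; the fallback value -1 follows the convention that a negative value means "unknown".

      ```java
      import java.io.StringReader;
      import java.util.LinkedHashMap;
      import java.util.Map;
      import javax.xml.parsers.SAXParserFactory;
      import org.xml.sax.Attributes;
      import org.xml.sax.InputSource;
      import org.xml.sax.Locator;
      import org.xml.sax.helpers.DefaultHandler;

      /** Illustration only: records the line reported by the locator for each start tag. */
      final class LocatorSketch extends DefaultHandler {
          private Locator locator;
          private final Map<String, Integer> lines = new LinkedHashMap<>();

          @Override
          public void setDocumentLocator(Locator locator) {
              // Called by the parser before any other event, if it supplies a locator at all.
              this.locator = locator;
          }

          @Override
          public void startElement(String uri, String localName, String qName, Attributes attributes) {
              // The locator is only valid inside event callbacks such as this one.
              int line = (locator == null) ? -1 : locator.getLineNumber();
              lines.put(qName, line);
          }

          /** Returns the line number reported at the start of each element, keyed by qualified name. */
          static Map<String, Integer> startLines(String xml) throws Exception {
              LocatorSketch sketch = new LocatorSketch();
              SAXParserFactory.newInstance().newSAXParser()
                  .parse(new InputSource(new StringReader(xml)), sketch);
              return sketch.lines;
          }

          public static void main(String[] args) throws Exception {
              System.out.println(startLines("<root>\n  <child/>\n</root>"));
          }
      }
      ```

      Column numbers are available in the same way through getColumnNumber(); the exact values reported can differ between parsers.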
    • skippedEntity

      @MountPoint public void skippedEntity(String name) throws SAXException

      Receives notification of a skipped entity.

      This implementation does nothing by default. Application writers may override this method in a subclass to take specific actions for each skipped entity, such as setting status variables or invoking other methods.

      Specified by:
      skippedEntity in interface ContentHandler
      Parameters:
      name - The name of the skipped entity.
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
    • startElement

      public final void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
      Receives the notification about the start of an element.
      Specified by:
      startElement in interface ContentHandler
      Parameters:
      uri - The URI for the namespace of this element; can be empty.
      localName - The local name of the element.
      qName - The element's qualified name.
      attributes - The element's attributes.
      Throws:
      SAXException - The element was not correct according to the DTD.
    • startDocument

      @MountPoint public void startDocument() throws SAXException
      Receives the notification of the beginning of the document.

      This implementation does nothing by default. Application writers may override this method in a subclass to take specific actions at the beginning of a document (such as allocating the root node of a tree or creating an output file).
      Specified by:
      startDocument in interface ContentHandler
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
    • startPrefixMapping

      public final void startPrefixMapping(String prefix, String uri) throws SAXException
      Receives the notification of the start of a Namespace mapping.
      Specified by:
      startPrefixMapping in interface ContentHandler
      Parameters:
      prefix - The Namespace prefix being declared.
      uri - The Namespace URI mapped to the prefix.
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.