java.lang.Object
  |
  +--gnu.xml.pipeline.DomConsumer
This consumer builds a DOM Document from its input, acting either as a
pipeline terminus or as an intermediate buffer. When a document's worth
of events has been delivered to this consumer, that document is read with
a DomParser
and sent to the next consumer. It is also available
as a read-once property.
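The idea of collecting a document's worth of SAX events into a DOM tree can be sketched with JDK classes alone. This is only an illustration of the technique, not of DomConsumer itself: it assumes the GNU pipeline classes are not on the classpath and substitutes the JDK's identity TransformerHandler writing into a DOMResult. The class name SaxToDomSketch and the method buildDom are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SaxToDomSketch {
    /** Parses the XML text with SAX and collects the events into a DOM Document. */
    static Document buildDom(String xml) {
        try {
            // Empty document that will receive the nodes built from the events.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();

            // An identity TransformerHandler is a SAX ContentHandler that
            // forwards whatever it receives to the configured Result.
            SAXTransformerFactory stf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = stf.newTransformerHandler();
            handler.setResult(new DOMResult(doc));

            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildDom("<order id='7'><item>pen</item></order>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

With DomConsumer the same role is played by the handler returned from getContentHandler(), and the finished tree is fetched with getDocument().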
The DOM tree is constructed as faithfully as possible. There are some complications, since a DOM should expose behaviors that can't be implemented without API backdoors into that DOM, and because some SAX parsers don't report all the information that DOM permits to be exposed. The general problem areas involve information from the Document Type Declaration (DTD). DOM only represents a limited subset of it, but has some behaviors that depend on much deeper knowledge of a document's DTD. You shouldn't have much to worry about unless you change the handling of "extra" nodes from its default setting (which ignores them all); note that if you use JAXP to populate your DOM trees, it wants to save "extra" nodes by default. Otherwise, your main worry will be a SAX parser that doesn't flag ignorable whitespace unless it's validating (few parsers behave that way).
The SAX2 events used as input must contain XML Names for elements
and attributes, with original prefixes. In SAX2,
this is optional unless the "namespace-prefixes" parser feature is set.
Moreover, many application components won't provide completely correct
structures anyway. Before you convert a DOM to an output document,
you should plan to postprocess it to create or repair such namespace
information. The NSFilter
pipeline stage does such work.
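The effect of the "namespace-prefixes" feature on the input events can be observed with a plain JAXP SAX parser. The sketch below is an assumption-laden illustration: the class NamespacePrefixesSketch and its method attributeQNames are hypothetical, and only the standard SAX2 feature URI is taken as given. It records the qualified attribute names delivered to startElement, with the feature on or off.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class NamespacePrefixesSketch {
    static final String NS_PREFIXES =
        "http://xml.org/sax/features/namespace-prefixes";

    /** Collects the qualified names of all attributes reported by the parser. */
    static List<String> attributeQNames(String xml, boolean reportPrefixes) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            // When true, xmlns declarations are reported as attributes and
            // prefixed names are guaranteed to be available.
            reader.setFeature(NS_PREFIXES, reportPrefixes);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    for (int i = 0; i < atts.getLength(); i++) {
                        names.add(atts.getQName(i));
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<p:a xmlns:p='urn:example' p:k='v'/>";
        System.out.println(attributeQNames(xml, true));
        System.out.println(attributeQNames(xml, false));
    }
}
```

A parser feeding DomConsumer would typically have this feature enabled, so that the prefixes and declarations needed to rebuild the document are present in the event stream.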
Note: changes late in the DOM L2 process made it impractical to
attempt to create the DocumentType node in any implementation-neutral way,
much less to populate it (L1 didn't support even creating such nodes).
To create and populate such a node, subclass the inner
DomConsumer.Handler
class and teach it about the backdoors into
whatever DOM implementation you want. It's possible that some revised
DOM API will finally resolve this problem.
See Also:
DomParser
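The note about DocumentType nodes can be seen in miniature with the portable DOM Level 2 calls. The following sketch (class and method names are hypothetical) creates a DocumentType through DOMImplementation and attaches it to a new document. What it cannot do through portable APIs is add entity or notation declarations, which is the gap the note describes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

public class DoctypeSketch {
    /** Creates an empty document whose DocumentType carries only name and IDs. */
    static Document newDocumentWithDoctype(String rootName, String systemId) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation();
            // DOM L2 can create the node, given a name and external IDs.
            DocumentType doctype =
                impl.createDocumentType(rootName, null, systemId);
            // The doctype must be supplied when the document is created.
            return impl.createDocument(null, rootName, doctype);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocumentWithDoctype("book", "book.dtd");
        DocumentType dt = doc.getDoctype();
        // The entity map is read-only from the application's point of view;
        // DOM L2 defines no portable call that would add a declaration to it.
        System.out.println(dt.getName() + " entities=" + dt.getEntities().getLength());
    }
}
```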
Inner Class Summary | |
static class |
DomConsumer.Handler
Class used to intercept various parsing events and use them to populate a DOM document. |
Constructor Summary | |
DomConsumer(java.lang.Class impl)
Configures this consumer to use the specified implementation of DOM when constructing its result value. |
|
DomConsumer(java.lang.Class impl,
EventConsumer n)
Configures this consumer as a buffer/filter, using the specified DOM implementation when constructing its result value. |
Method Summary | |
ContentHandler |
getContentHandler()
Returns the document handler being used. |
Document |
getDocument()
Returns the document constructed from the preceding sequence of events. |
DTDHandler |
getDTDHandler()
Returns the DTD handler being used. |
java.lang.Object |
getProperty(java.lang.String id)
Returns the lexical handler being used. |
boolean |
isExpandingReferences()
Returns true if the consumer is expanding entity references in place (the default), and false if childless EntityReference nodes should instead be created. |
boolean |
isHidingComments()
Returns true if the consumer is hiding comments (the default), and false if they should be placed into the output document. |
boolean |
isHidingWhitespace()
Returns true if the consumer is hiding ignorable whitespace (the default), and false if such whitespace should be placed into the output document as children of element nodes. |
boolean |
isSavingExtraNodes()
Returns true if the consumer is saving "extra" nodes, and false (the default) otherwise. |
boolean |
isUsingNamespaces()
Returns true (the default for L2 DOM implementations) if the consumer is using an "XML + Namespaces" style DOM construction, which will cause fatal errors on some legal XML 1.0 documents. |
void |
setErrorHandler(ErrorHandler handler)
This method provides a filter stage with a handler that abstracts presentation of warnings and both recoverable and fatal errors. |
void |
setExpandingReferences(boolean flag)
Controls whether the consumer will expand entity references in place, or will instead replace them with childless entity reference nodes. |
protected void |
setHandler(DomConsumer.Handler h)
This is the hook through which a subclass provides a handler which knows how to access DOM extensions, specific to some implementation, to record additional data in a DOM. |
void |
setHidingComments(boolean flag)
Controls whether the consumer is hiding comments. |
void |
setHidingWhitespace(boolean flag)
Controls whether the consumer hides ignorable whitespace. |
void |
setSavingExtraNodes(boolean flag)
Controls whether the consumer will save "extra" nodes. |
void |
setUsingNamespaces(boolean flag)
Controls whether the consumer uses an "XML + Namespaces" style DOM construction. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
public DomConsumer(java.lang.Class impl) throws SAXException
Parameters:
impl - class implementing Document which publicly exposes a default constructor
Throws:
SAXException - when there is a problem creating an empty DOM document using the specified implementation
public DomConsumer(java.lang.Class impl, EventConsumer n) throws SAXException
This event consumer acts as a buffer and filter, in that it
builds a DOM tree and then writes it out when endDocument
is invoked. Because of the limitations of DOM, much information
will as a rule not be seen in that replay. To get a full fidelity
copy of the input event stream, use a TeeConsumer.
Parameters:
impl - class implementing Document which publicly exposes a default constructor
n - receives a "replayed" sequence of parse events when the endDocument method is invoked.
Throws:
SAXException - when there is a problem creating an empty DOM document using the specified DOM implementation
Method Detail |
protected void setHandler(DomConsumer.Handler h)
public final Document getDocument()
public void setErrorHandler(ErrorHandler handler)
Specified by:
setErrorHandler in interface EventConsumer
Parameters:
handler - encapsulates error handling policy for this stage
public final boolean isExpandingReferences()
See Also:
setExpandingReferences(boolean)
public final void setExpandingReferences(boolean flag)
Parameters:
flag - True iff entity references should be expanded in place; false if childless EntityReference nodes should be created instead.
See Also:
isExpandingReferences()
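The difference between expanding an entity reference in place and keeping an EntityReference node can be illustrated with the JDK's own DOM builder. This is a sketch of the analogous JAXP switch, DocumentBuilderFactory.setExpandEntityReferences, and not of DomConsumer's flag. The class EntityReferenceSketch and the sample document are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EntityReferenceSketch {
    static final String XML =
        "<!DOCTYPE greeting [<!ENTITY who \"world\">]>"
        + "<greeting>&who;</greeting>";

    /** Returns the DOM node type of the first child of the root element. */
    static short firstChildType(boolean expandReferences) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // true: replacement text appears directly in the tree;
            // false: an EntityReference node stands in for "&who;".
            dbf.setExpandEntityReferences(expandReferences);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getFirstChild().getNodeType();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstChildType(true) == Node.TEXT_NODE);
        System.out.println(firstChildType(false) == Node.ENTITY_REFERENCE_NODE);
    }
}
```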
public final boolean isHidingComments()
See Also:
setHidingComments(boolean)
public final void setHidingComments(boolean flag)
See Also:
isHidingComments()
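What hiding comments means for the resulting tree can be shown with the comparable JAXP setting, DocumentBuilderFactory.setIgnoringComments. The sketch below uses that JDK switch as a stand-in for setHidingComments. The class name CommentHidingSketch and the sample input are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CommentHidingSketch {
    /** Counts the direct children of the root element after parsing. */
    static int rootChildCount(String xml, boolean hideComments) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // When true, comment nodes never reach the DOM tree.
            dbf.setIgnoringComments(hideComments);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<r><!--note--><a/></r>";
        System.out.println(rootChildCount(xml, true));
        System.out.println(rootChildCount(xml, false));
    }
}
```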
public final boolean isHidingWhitespace()
See Also:
setHidingWhitespace(boolean)
public final void setHidingWhitespace(boolean flag)
See Also:
isHidingWhitespace()
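Ignorable whitespace is whitespace inside an element whose DTD content model allows only child elements, so a parser can identify it only when it knows the DTD. The sketch below shows the comparable JAXP setting, DocumentBuilderFactory.setIgnoringElementContentWhitespace, with validation turned on so the content model is available. It is an illustration under those assumptions, and the class name WhitespaceHidingSketch is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WhitespaceHidingSketch {
    static final String XML =
        "<!DOCTYPE list [<!ELEMENT list (item)*><!ELEMENT item (#PCDATA)>]>\n"
        + "<list>\n  <item>a</item>\n  <item>b</item>\n</list>";

    /** Counts the direct children of the root element after parsing. */
    static int rootChildCount(boolean hideWhitespace) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // The content model is needed to tell ignorable whitespace
            // from significant text, so validate against the internal DTD.
            dbf.setValidating(true);
            dbf.setIgnoringElementContentWhitespace(hideWhitespace);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootChildCount(true));
        System.out.println(rootChildCount(false));
    }
}
```

When the whitespace is kept, it shows up as Text children of the list element, between and around the item elements.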
public final boolean isSavingExtraNodes()
You may not consistently see all these node types even if you set this flag to true. Only Level 2 DOM implementations can create DocumentType nodes portably, but they can't be populated with any portable APIs. No DOM implementation can populate EntityReference nodes with any portable APIs. Not all parsers expose comment and CDATA nodes, but if they do, then most DOM implementations are able to expose those nodes. Any SAX parser may expose ignorable whitespace, and most do so, so stripping out such whitespace is the most reliable of this set of inconsistently supportable DOM features.
See Also:
setSavingExtraNodes(boolean)
public final void setSavingExtraNodes(boolean flag)
Parameters:
flag - True iff extra nodes should be saved; false otherwise.
See Also:
isSavingExtraNodes()
public boolean isUsingNamespaces()
See Also:
setUsingNamespaces(boolean)
public void setUsingNamespaces(boolean flag)
Parameters:
flag - True iff namespaces should be enforced; else false.
See Also:
isUsingNamespaces()
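An example of a legal XML 1.0 document that fails under "XML + Namespaces" rules is one that uses a colon in an element name without declaring the prefix. The sketch below demonstrates this with the JDK's DocumentBuilderFactory.setNamespaceAware rather than with DomConsumer. The class NamespaceEnforcementSketch and its parses method are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class NamespaceEnforcementSketch {
    /** Returns true if the document parses without a fatal error. */
    static boolean parses(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(namespaceAware);
            DocumentBuilder builder = dbf.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps stderr quiet.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "a:b" is a valid XML 1.0 name, but the prefix "a" is never declared.
        String xml = "<a:b/>";
        System.out.println(parses(xml, false));
        System.out.println(parses(xml, true));
    }
}
```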
public final ContentHandler getContentHandler()
Specified by:
getContentHandler in interface EventConsumer
public final DTDHandler getDTDHandler()
Specified by:
getDTDHandler in interface EventConsumer
public final java.lang.Object getProperty(java.lang.String id) throws SAXNotRecognizedException
Specified by:
getProperty in interface EventConsumer
Parameters:
id - This is a URI identifying the type of property desired.
Throws:
SAXNotRecognizedException - Thrown if the particular pipeline stage does not understand the specified identifier.
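Looking up a handler by property URI follows the SAX2 convention, which can be tried out on a standard XMLReader. The sketch below is not DomConsumer code: it uses the JDK's SAX parser to show the lexical-handler property URI and the SAXNotRecognizedException raised for an unknown identifier. The class PropertyLookupSketch and the "urn:example:no-such-property" identifier are invented for the example.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;
import org.xml.sax.ext.LexicalHandler;

public class PropertyLookupSketch {
    static final String LEXICAL_HANDLER =
        "http://xml.org/sax/properties/lexical-handler";

    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Sets a lexical handler and checks that the same object is read back. */
    static boolean roundTripsLexicalHandler() {
        try {
            XMLReader reader = newReader();
            LexicalHandler handler = new DefaultHandler2();
            reader.setProperty(LEXICAL_HANDLER, handler);
            return reader.getProperty(LEXICAL_HANDLER) == handler;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns false when the reader does not understand the identifier. */
    static boolean isRecognized(String id) {
        try {
            newReader().getProperty(id);
            return true;
        } catch (SAXNotRecognizedException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripsLexicalHandler());
        System.out.println(isRecognized("urn:example:no-such-property"));
    }
}
```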
Source code is GPL'd in the JAXP subproject at http://savannah.gnu.org/projects/classpathx