Design and Implementation of a validating XML parser in Haskell: Master's thesis; University of Applied Sciences Wedel
Chapter 3. Package hparser
The module HdomParser provides functions for parsing XML files and building the generic tree data structure XmlTree from these documents. The whole parsing process takes place in the State-I/O monad of the module XmlState, so that well-formedness errors can be reported and different computations can be traced by outputting their results.
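To make the idea of a combined state and I/O monad concrete, the following is a minimal sketch using only the base library. It is not the definition from the module XmlState: the names SysState, StateIO, traceMsg and reportError are hypothetical and only illustrate how an error level and a trace level can be threaded through I/O computations.

```haskell
import Control.Monad (when)
import System.IO (hPutStrLn, stderr)

-- Hypothetical system state; the real XmlState record is not shown in this section.
data SysState = SysState
  { traceLevel :: Int   -- messages up to this level are printed
  , errorLevel :: Int   -- 0 means that no error has been reported
  } deriving (Show, Eq)

-- A state monad layered over IO: each computation receives the current
-- state and returns a result together with the updated state.
newtype StateIO a = StateIO { runStateIO :: SysState -> IO (a, SysState) }

instance Functor StateIO where
  fmap f (StateIO g) = StateIO $ \s -> do
    (a, s') <- g s
    return (f a, s')

instance Applicative StateIO where
  pure a = StateIO $ \s -> return (a, s)
  StateIO f <*> StateIO g = StateIO $ \s -> do
    (h, s1) <- f s
    (a, s2) <- g s1
    return (h a, s2)

instance Monad StateIO where
  StateIO g >>= k = StateIO $ \s -> do
    (a, s') <- g s
    runStateIO (k a) s'

-- Run an ordinary IO action inside StateIO without touching the state.
io :: IO a -> StateIO a
io act = StateIO $ \s -> do
  a <- act
  return (a, s)

getState :: StateIO SysState
getState = StateIO $ \s -> return (s, s)

modifyState :: (SysState -> SysState) -> StateIO ()
modifyState f = StateIO $ \s -> return ((), f s)

-- Print a trace message if the configured trace level is high enough.
traceMsg :: Int -> String -> StateIO ()
traceMsg lvl msg = do
  s <- getState
  when (traceLevel s >= lvl) $ io (putStrLn ("-- " ++ msg))

-- Report an error on stderr and remember the highest error level seen so far.
reportError :: Int -> String -> StateIO ()
reportError lvl msg = do
  io (hPutStrLn stderr ("error: " ++ msg))
  modifyState (\s -> s { errorLevel = max lvl (errorLevel s) })
```

A computation is started with an initial state, for example `runStateIO (traceMsg 1 "start" >> reportError 2 "not well-formed") (SysState 1 0)`; the error level in the final state then tells the caller whether anything went wrong.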
Parse functions
Parses the file specified by the first parameter.
Parses an XML file specified by a command line argument:
--source "source file" - XML file to parse
--encoding "encoding" - Encoding scheme used in the file (optional)
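How HdomParser itself reads these options is not shown in this section. As a sketch under that caveat, the two options could be turned into a list of name/value pairs with System.Console.GetOpt from the base library; the type Attr and the function parseArgs are hypothetical names chosen for this illustration.

```haskell
import System.Console.GetOpt

-- Hypothetical stand-in for an attribute of the initial node: (name, value).
type Attr = (String, String)

-- Descriptions of the two options listed above.
options :: [OptDescr Attr]
options =
  [ Option [] ["source"]
      (ReqArg (\v -> ("source", v)) "FILE")
      "XML file to parse"
  , Option [] ["encoding"]
      (ReqArg (\v -> ("encoding", v)) "ENCODING")
      "encoding scheme used in the file (optional)"
  ]

-- Parse the argument vector; --source is required, --encoding is optional.
parseArgs :: [String] -> Either String [Attr]
parseArgs argv =
  case getOpt Permute options argv of
    (attrs, [], [])
      | any ((== "source") . fst) attrs -> Right attrs
      | otherwise                       -> Left "missing required option --source"
    (_, extra, []) -> Left ("unexpected arguments: " ++ unwords extra)
    (_, _, errs)   -> Left (concat errs)
```

In a program, `getArgs` from System.Environment would supply the argument vector, and the resulting pairs could be stored as attributes of the initial node.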
The following example shows how an XML parser is constructed in the module HdomParser. All computations run in the State-I/O monad defined in the module XmlState and are of type XmlStateFilter.
    processXmlN :: Int -> XmlTree -> IO [XmlTree]
    processXmlN n t0
        = run' $ do
          setSysState (selXTagAttrl . getNode $ t0)
          setTraceLevel n
          t1 <- getXmlContents $ t0
          t2 <- parseXmlDoc $$< t1
          t3 <- liftM transfAllCharRef $$< t2
          t4 <- processDTD $$< t3
          t5 <- processGeneralEntities $$< t4
          el <- getErrorLevel
          return ( if el == 0 then t5 else [] )
Actions during parsing
getXmlContents: Returns a filter for reading the XML file. The filename is retrieved from the attribute named "source", which must be part of the initial node t0.
parseXmlDoc: Parses the XML file and builds the XmlTree.
liftM transfAllCharRef: The XmlFilter transfAllCharRef has to be lifted to an XmlStateFilter. The filter replaces character references with the characters they denote.
processDTD: Substitutes parameter entities, adds include sections and removes exclude sections of DTDs, and merges the internal and external DTD subsets.
processGeneralEntities: Substitutes general entities.
return: If an error occurred while applying the XmlStateFilters, processXmlN returns an empty list; otherwise it returns the constructed XmlTree.
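The three mechanisms used in the listing (lifting a pure filter into the monad, applying a monadic filter to every tree of a list, and discarding the result when an error was reported) can be sketched in isolation. The sketch below makes several assumptions: Node is a drastically simplified stand-in for XmlTree, an IORef in plain IO replaces the State-I/O monad, and liftFilter, applyToAll and checkNames are hypothetical helpers. In particular, applyToAll only models the behaviour that `$$<` is assumed to have here.

```haskell
import Control.Monad (liftM)
import Data.Char (chr)
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')

-- Simplified tree type; the real XmlTree is richer.
data Node = Text String | CharRef Int | Elem String [Node]
  deriving (Eq, Show)

type Filter   = Node -> [Node]                  -- pure filter
type IOFilter = IORef Int -> Node -> IO [Node]  -- filter with access to the error level

-- Pure filter: replace character references by the characters they denote.
substCharRefs :: Filter
substCharRefs (CharRef n)    = [Text [chr n]]
substCharRefs (Elem name cs) = [Elem name (concatMap substCharRefs cs)]
substCharRefs t              = [t]

-- Lift a pure filter so that it can be used where a monadic filter is expected.
liftFilter :: Filter -> IOFilter
liftFilter f _ = return . f

-- Apply a monadic filter to every tree and concatenate the results.
applyToAll :: (Node -> IO [Node]) -> [Node] -> IO [Node]
applyToAll f ts = liftM concat (mapM f ts)

-- Example check: an element with an empty name raises the error level.
checkNames :: IOFilter
checkNames errRef n@(Elem "" _) = do
  modifyIORef' errRef (max 1)
  return [n]
checkNames _ n = return [n]

-- Chain the stages and return the result only if no error was reported.
process :: [Node] -> IO [Node]
process t0 = do
  errRef <- newIORef 0
  t1 <- applyToAll (liftFilter substCharRefs errRef) t0
  t2 <- applyToAll (checkNames errRef) t1
  el <- readIORef errRef
  return (if el == 0 then t2 else [])
```

As in processXmlN, every stage still runs after an error has been recorded; the error level is inspected only once, at the end, to decide between the final tree list and the empty list.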