xsxp man page on Ubuntu

Man page or keyword search:  
man Server   6591 pages
apropos Keyword Search (all sections)
Output format
Ubuntu logo
[printable version]

xsxp(3tcl)		  eXtremely Simple Xml Parser		    xsxp(3tcl)

______________________________________________________________________________

NAME
       xsxp - eXtremely Simple Xml Parser

SYNOPSIS
       package require Tcl  8.4

       package require xml

       xsxp::parse xml

       xsxp::fetch pxml path ?part?

       xsxp::fetchall pxml_list path ?part?

       xsxp::only pxml tagname

       xsxp::prettyprint pxml ?chan?

_________________________________________________________________

DESCRIPTION
       This package provides a simple interface to parse XML into a pure-value
       list.  It also provides accessor routines to pull out specific subtags,
       not  unlike  DOM	 access.   This package was written for and is used by
       Darren New's Amazon S3 access package.

       This is pretty lame, but I needed something like this for  S3,  and  at
       the  time,  TclDOM  would  not work with the new 8.5 Tcl due to version
       number problems.

       In addition, this is a pure-value implementation. There is  no  garbage
       to  clean up in the event of a thrown error, for example.  This simpli‐
       fies the code for sufficiently small XML documents, which is what  Ama‐
       zon's S3 guarantees.

       Copyright  2006	Darren New. All Rights Reserved.  NO WARRANTIES OF ANY
       TYPE ARE PROVIDED.  COPYING OR USE INDEMNIFIES THE AUTHOR IN ALL	 WAYS.
       This  software is licensed under essentially the same terms as Tcl. See
       LICENSE.txt for the terms.

COMMANDS
       The package implements five rather simple procedures.  One parses,  one
       is  for	debugging, and the rest pull various parts of the parsed docu‐
       ment out for processing.

       xsxp::parse xml
	      This parses an XML document (using the standard xml tcllib  mod‐
	      ule  in  a SAX sort of way) and builds a data structure which it
	      returns if the parsing succeeded. The return value  is  referred
	      to herein as a "pxml", or "parsed xml". The list consists of two
	      or more elements:

	      ·	     The first element is the name of the tag.

	      ·	     The second element is  an	array-get  formatted  list  of
		     key/value	pairs.	The  keys  are attribute names and the
		     values are attribute values. This is  an  empty  list  if
		     there are no attributes on the tag.

	      ·	     The  third	 through  end elements are the children of the
		     node, if any. Each child is, recursively, a pxml.

	      ·	     Note that if the zero'th element, i.e. the tag  name,  is
		     "%PCDATA",	 then  the  attributes	will  be empty and the
		     third element will be the text of the element.  In	 addi‐
		     tion,  if	an element's contents consists only of PCDATA,
		     it will have only one child, and all the PCDATA  will  be
		     concatenated.  In	other  words, this parser works poorly
		     for XML with elements that contain both  child  tags  and
		     PCDATA.   Since  Amazon S3 does not do this (and for that
		     matter most uses of XML where XML is a poor choice	 don't
		     do this), this is probably not a serious limitation.

       xsxp::fetch pxml path ?part?
	      pxml  is	a parsed XML, as returned from xsxp::parse.  path is a
	      list of element tag names. Each element is the name of  a	 child
	      to  look up, optionally followed by a hash ("#") and a string of
	      digits. An empty list or an initial empty element selects	 pxml.
	      If  no hash sign is present, the behavior is as if "#0" had been
	      appended to that element. (In addition to a  list,  slashes  can
	      separate subparts where convenient.)

	      An element of path scans the children at the indicated level for
	      the n'th instance of a child whose tag matches the part  of  the
	      element  before the hash sign. If an element is simply "#"  fol‐
	      lowed by digits, that indexed child is selected,	regardless  of
	      the  tags in the children. Hence, an element of "#3" will always
	      select the fourth child of the node under consideration.

	      part defaults to "%ALL". It can be one of	 the  following	 case-
	      sensitive terms:

	      %ALL   returns the entire selected element.

	      %TAGNAME
		     returns lindex 0 of the selected element.

	      %ATTRIBUTES
		     returns index 1 of the selected element.

	      %CHILDREN
		     returns  lrange  2	 through  end of the selected element,
		     resulting in a list of elements being returned.

	      %PCDATA
		     returns a concatenation of all the bodies of direct chil‐
		     dren  of  this  node  whose tag is %PCDATA.  It throws an
		     error  if	no  such  children   are   found.   That   is,
		     part=%PCDATA  means  return  the textual content found in
		     that node but not its children nodes.

	      %PCDATA?
		     is like %PCDATA, but returns an empty string if no PCDATA
		     is found.

       For  example,  to fetch the first bold text from the fifth paragraph of
       the body of your HTML file,
       xsxp::fetch $pxml {html body p#4 b} %PCDATA

       xsxp::fetchall pxml_list path ?part?
	      This iterates over each PXML in pxml_list (which must be a  list
	      of  pxmls)  selecting the indicated path from it, building a new
	      list with the selected data, and returning that new list.

	      For example, pxml_list might be the %CHILDREN  of	 a  particular
	      element,	and  the  path and part might select from each child a
	      sub-element in which we're interested.

       xsxp::only pxml tagname
	      This iterates over the direct children of pxml and selects  only
	      those with tagname as their tag. Returns a list of matching ele‐
	      ments.

       xsxp::prettyprint pxml ?chan?
	      This outputs to chan (default stdout) a  pretty-printed  version
	      of pxml.

BUGS, IDEAS, FEEDBACK
       This  document,	and the package it describes, will undoubtedly contain
       bugs and other problems.	 Please report such in the category  amazon-s3
       of	the	  Tcllib       SF	Trackers       [http://source‐
       forge.net/tracker/?group_id=12883].  Please also report any  ideas  for
       enhancements you may have for either package and/or documentation.

KEYWORDS
       dom, parser, xml

CATEGORY
       Text processing

COPYRIGHT
       Copyright (c) Copyright 2006 Darren New. All Rights Reserved.

amazon-s3			      1.0			    xsxp(3tcl)
[top]

List of man pages available for Ubuntu

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Tweet
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome
Free Shell Accounts :: the biggest list on the net