xml - The Go Programming Language

Package xml

import "encoding/xml"

Overview

Package xml implements a simple XML 1.0 parser that understands XML name spaces.

Index

Constants
Variables
func Escape(w io.Writer, s []byte)
func Marshal(v interface{}) ([]byte, error)
func MarshalIndent(v interface{}, prefix, indent string) ([]byte, error)
func Unmarshal(data []byte, v interface{}) error
type Attr
type CharData
    func (c CharData) Copy() CharData
type Comment
    func (c Comment) Copy() Comment
type Decoder
    func NewDecoder(r io.Reader) *Decoder
    func (d *Decoder) Decode(v interface{}) error
    func (d *Decoder) DecodeElement(v interface{}, start *StartElement) error
    func (d *Decoder) RawToken() (Token, error)
    func (d *Decoder) Skip() error
    func (d *Decoder) Token() (t Token, err error)
type Directive
    func (d Directive) Copy() Directive
type Encoder
    func NewEncoder(w io.Writer) *Encoder
    func (enc *Encoder) Encode(v interface{}) error
type EndElement
type Name
type ProcInst
    func (p ProcInst) Copy() ProcInst
type StartElement
    func (e StartElement) Copy() StartElement
type SyntaxError
    func (e *SyntaxError) Error() string
type TagPathError
    func (e *TagPathError) Error() string
type Token
    func CopyToken(t Token) Token
type UnmarshalError
    func (e UnmarshalError) Error() string
type UnsupportedTypeError
    func (e *UnsupportedTypeError) Error() string
Bugs

Examples

MarshalIndent
Unmarshal

Package files

marshal.go read.go typeinfo.go xml.go

Constants

const (
    // A generic XML header suitable for use with the output of Marshal.
    // This is not automatically added to any output of this package,
    // it is provided as a convenience.
    Header = `<?xml version="1.0" encoding="UTF-8"?>` + "\n"
)
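
Because Header is not added automatically, a caller that wants an XML declaration has to prepend it. A minimal sketch, assuming a hypothetical Note type and a hypothetical helper named withHeader (neither is part of this package):

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Note is a hypothetical type used only for this illustration.
type Note struct {
	XMLName xml.Name `xml:"note"`
	Body    string   `xml:"body"`
}

// withHeader marshals v and prepends xml.Header, since Marshal
// does not emit an XML declaration on its own.
func withHeader(v interface{}) (string, error) {
	out, err := xml.Marshal(v)
	if err != nil {
		return "", err
	}
	return xml.Header + string(out), nil
}

func main() {
	s, err := withHeader(Note{Body: "hi"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```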

Variables

var HTMLAutoClose = htmlAutoClose

HTMLAutoClose is the set of HTML elements that should be considered to close automatically.

var HTMLEntity = htmlEntity

HTMLEntity is an entity map containing translations for the standard HTML entity characters.

func Escape

func Escape(w io.Writer, s []byte)

Escape writes to w the properly escaped XML equivalent of the plain text data s.
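
Escape takes an io.Writer, so getting a string back requires a buffer. The sketch below wraps it in a hypothetical helper, escapeString, which is an illustration and not part of the package:

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escapeString is a hypothetical convenience wrapper that collects
// the output of xml.Escape in a buffer and returns it as a string.
func escapeString(s string) string {
	var buf bytes.Buffer
	xml.Escape(&buf, []byte(s))
	return buf.String()
}

func main() {
	// The markup characters <, > and & are replaced by entity references.
	fmt.Println(escapeString("a < b && c > d"))
}
```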

func Marshal

func Marshal(v interface{}) ([]byte, error)

Marshal returns the XML encoding of v.

Marshal handles an array or slice by marshalling each of the elements. Marshal handles a pointer by marshalling the value it points at or, if the pointer is nil, by writing nothing. Marshal handles an interface value by marshalling the value it contains or, if the interface value is nil, by writing nothing. Marshal handles all other data by writing one or more XML elements containing the data.

The name for the XML elements is taken from, in order of preference:

- the tag on the XMLName field, if the data is a struct
- the value of the XMLName field of type xml.Name
- the tag of the struct field used to obtain the data
- the name of the struct field used to obtain the data
- the name of the marshalled type

The XML element for a struct contains marshalled elements for each of the exported fields of the struct, with these exceptions:

- the XMLName field, described above, is omitted.
- a field with tag "-" is omitted.
- a field with tag "name,attr" becomes an attribute with
  the given name in the XML element.
- a field with tag ",attr" becomes an attribute with the
  field name in the XML element.
- a field with tag ",chardata" is written as character data,
  not as an XML element.
- a field with tag ",innerxml" is written verbatim, not subject
  to the usual marshalling procedure.
- a field with tag ",comment" is written as an XML comment, not
  subject to the usual marshalling procedure. It must not contain
  the "--" string within it.
- a field with a tag including the "omitempty" option is omitted
  if the field value is empty. The empty values are false, 0, any
  nil pointer or interface value, and any array, slice, map, or
  string of length zero.
- a non-pointer anonymous struct field is handled as if the
  fields of its value were part of the outer struct.

If a field uses a tag "a>b>c", then the element c will be nested inside parent elements a and b. Fields that appear next to each other that name the same parent will be enclosed in one XML element.

See MarshalIndent for an example.

Marshal will return an error if asked to marshal a channel, function, or map.
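
The error case can be observed directly. The following sketch assumes a hypothetical named type Point and a hypothetical helper tryMarshal; it only checks whether Marshal reports an error and does not depend on the error's concrete type:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Point is a hypothetical struct type that Marshal can encode.
type Point struct {
	X, Y int
}

// tryMarshal reports the error, if any, from marshalling v.
func tryMarshal(v interface{}) error {
	_, err := xml.Marshal(v)
	return err
}

func main() {
	fmt.Println("struct: ", tryMarshal(Point{1, 2}))
	fmt.Println("map:    ", tryMarshal(map[string]int{"a": 1}))
	fmt.Println("channel:", tryMarshal(make(chan int)))
}
```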

func MarshalIndent

func MarshalIndent(v interface{}, prefix, indent string) ([]byte, error)

MarshalIndent works like Marshal, but each XML element begins on a new indented line that starts with prefix and is followed by one or more copies of indent according to the nesting depth.

Example

Code:

type Address struct {
    City, State string
}
type Person struct {
    XMLName   xml.Name `xml:"person"`
    Id        int      `xml:"id,attr"`
    FirstName string   `xml:"name>first"`
    LastName  string   `xml:"name>last"`
    Age       int      `xml:"age"`
    Height    float32  `xml:"height,omitempty"`
    Married   bool
    Address
    Comment string `xml:",comment"`
}

v := &Person{Id: 13, FirstName: "John", LastName: "Doe", Age: 42}
v.Comment = " Need more details. "
v.Address = Address{"Hanga Roa", "Easter Island"}

output, err := xml.MarshalIndent(v, "  ", "    ")
if err != nil {
    fmt.Printf("error: %v\n", err)
}

os.Stdout.Write(output)

Output:

  <person id="13">
      <name>
          <first>John</first>
          <last>Doe</last>
      </name>
      <age>42</age>
      <Married>false</Married>
      <City>Hanga Roa</City>
      <State>Easter Island</State>
      <!-- Need more details. -->
  </person>

func Unmarshal

func Unmarshal(data []byte, v interface{}) error

Unmarshal parses the XML-encoded data and stores the result in the value pointed to by v, which must be an arbitrary struct, slice, or string. Well-formed data that does not fit into v is discarded.

Because Unmarshal uses the reflect package, it can only assign to exported (upper case) fields. Unmarshal uses a case-sensitive comparison to match XML element names to tag values and struct field names.

Unmarshal maps an XML element to a struct using the following rules. In the rules, the tag of a field refers to the value associated with the key 'xml' in the struct field's tag (see the example above).

* If the struct has a field of type []byte or string with tag
   ",innerxml", Unmarshal accumulates the raw XML nested inside the
   element in that field.  The rest of the rules still apply.

* If the struct has a field named XMLName of type xml.Name,
   Unmarshal records the element name in that field.

* If the XMLName field has an associated tag of the form
   "name" or "namespace-URL name", the XML element must have
   the given name (and, optionally, name space) or else Unmarshal
   returns an error.

* If the XML element has an attribute whose name matches a
   struct field name with an associated tag containing ",attr" or
   the explicit name in a struct field tag of the form "name,attr",
   Unmarshal records the attribute value in that field.

* If the XML element contains character data, that data is
   accumulated in the first struct field that has tag ",chardata".
   The struct field may have type []byte or string.
   If there is no such field, the character data is discarded.

* If the XML element contains comments, they are accumulated in
   the first struct field that has tag ",comment".  The struct
   field may have type []byte or string.  If there is no such
   field, the comments are discarded.

* If the XML element contains a sub-element whose name matches
   the prefix of a tag formatted as "a" or "a>b>c", Unmarshal
   will descend into the XML structure looking for elements with the
   given names, and will map the innermost elements to that struct
   field. A tag starting with ">" is equivalent to one starting
   with the field name followed by ">".

* If the XML element contains a sub-element whose name matches
   a struct field's XMLName tag and the struct field has no
   explicit name tag as per the previous rule, Unmarshal maps
   the sub-element to that struct field.

* If the XML element contains a sub-element whose name matches a
   field without any mode flags (",attr", ",chardata", etc), Unmarshal
   maps the sub-element to that struct field.

* If the XML element contains a sub-element that hasn't matched any
   of the above rules and the struct has a field with tag ",any",
   Unmarshal maps the sub-element to that struct field.

* A non-pointer anonymous struct field is handled as if the
   fields of its value were part of the outer struct.

* A struct field with tag "-" is never unmarshalled into.

Unmarshal maps an XML element to a string or []byte by saving the concatenation of that element's character data in the string or []byte. The saved []byte is never nil.

Unmarshal maps an attribute value to a string or []byte by saving the value in the string or slice.

Unmarshal maps an XML element to a slice by extending the length of the slice and mapping the element to the newly created value.

Unmarshal maps an XML element or attribute value to a bool by setting it to the boolean value represented by the string.

Unmarshal maps an XML element or attribute value to an integer or floating-point field by setting the field to the result of interpreting the string value in decimal. There is no check for overflow.

Unmarshal maps an XML element to an xml.Name by recording the element name.

Unmarshal maps an XML element to a pointer by setting the pointer to a freshly allocated value and then mapping the element to that value.
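
The slice, pointer, bool, and integer mappings described above can be seen together in one small sketch. The types Item, Note, and List, the helper parseList, and the sample document are all hypothetical and exist only for this illustration:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Item, Note and List are hypothetical types for this sketch.
type Item struct {
	Name  string `xml:"name"`
	Count int    `xml:"count"` // parsed as a decimal integer
	Done  bool   `xml:"done"`  // parsed as a boolean
}

type Note struct {
	Text string `xml:",chardata"`
}

type List struct {
	XMLName xml.Name `xml:"list"`
	Items   []Item   `xml:"item"` // grows by one for each <item>
	Note    *Note    `xml:"note"` // allocated only if <note> appears
}

const sample = `
<list>
    <item><name>nuts</name><count>3</count><done>true</done></item>
    <item><name>bolts</name><count>12</count><done>false</done></item>
    <note>check stock</note>
</list>`

// parseList unmarshals data into a fresh List.
func parseList(data string) (List, error) {
	var l List
	err := xml.Unmarshal([]byte(data), &l)
	return l, err
}

func main() {
	l, err := parseList(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, it := range l.Items {
		fmt.Printf("%s count=%d done=%t\n", it.Name, it.Count, it.Done)
	}
	if l.Note != nil {
		fmt.Println("note:", l.Note.Text)
	}
}
```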

Example

This example demonstrates unmarshaling an XML excerpt into a value with some preset fields. Note that the Phone field isn't modified and that the XML <Company> element is ignored. Also, the Groups field is assigned considering the element path provided in its tag.

Code:

type Email struct {
    Where string `xml:"where,attr"`
    Addr  string
}
type Address struct {
    City, State string
}
type Result struct {
    XMLName xml.Name `xml:"Person"`
    Name    string   `xml:"FullName"`
    Phone   string
    Email   []Email
    Groups  []string `xml:"Group>Value"`
    Address
}
v := Result{Name: "none", Phone: "none"}

data := `
    <Person>
        <FullName>Grace R. Emlin</FullName>
        <Company>Example Inc.</Company>
        <Email where="home">
            <Addr>gre@example.com</Addr>
        </Email>
        <Email where='work'>
            <Addr>gre@work.com</Addr>
        </Email>
        <Group>
            <Value>Friends</Value>
            <Value>Squash</Value>
        </Group>
        <City>Hanga Roa</City>
        <State>Easter Island</State>
    </Person>
`
err := xml.Unmarshal([]byte(data), &v)
if err != nil {
    fmt.Printf("error: %v", err)
    return
}
fmt.Printf("XMLName: %#v\n", v.XMLName)
fmt.Printf("Name: %q\n", v.Name)
fmt.Printf("Phone: %q\n", v.Phone)
fmt.Printf("Email: %v\n", v.Email)
fmt.Printf("Groups: %v\n", v.Groups)
fmt.Printf("Address: %v\n", v.Address)

Output:

XMLName: xml.Name{Space:"", Local:"Person"}
Name: "Grace R. Emlin"
Phone: "none"
Email: [{home gre@example.com} {work gre@work.com}]
Groups: [Friends Squash]
Address: {Hanga Roa Easter Island}

type Attr

type Attr struct {
    Name  Name
    Value string
}

An Attr represents an attribute in an XML element (Name=Value).

type CharData

type CharData []byte

A CharData represents XML character data (raw text), in which XML escape sequences have been replaced by the characters they represent.

func (CharData) Copy

func (c CharData) Copy() CharData

type Comment

type Comment []byte

A Comment represents an XML comment of the form <!--comment-->. The bytes do not include the <!-- and --> comment markers.

func (Comment) Copy

func (c Comment) Copy() Comment

type Decoder

type Decoder struct {
    // Strict defaults to true, enforcing the requirements
    // of the XML specification.
    // If set to false, the parser allows input containing common
    // mistakes:
    //	* If an element is missing an end tag, the parser invents
    //	  end tags as necessary to keep the return values from Token
    //	  properly balanced.
    //	* In attribute values and character data, unknown or malformed
    //	  character entities (sequences beginning with &) are left alone.
    //
    // Setting:
    //
    //	d.Strict = false;
    //	d.AutoClose = HTMLAutoClose;
    //	d.Entity = HTMLEntity
    //
    // creates a parser that can handle typical HTML.
    Strict bool

    // When Strict == false, AutoClose indicates a set of elements to
    // consider closed immediately after they are opened, regardless
    // of whether an end element is present.
    AutoClose []string

    // Entity can be used to map non-standard entity names to string replacements.
    // The parser behaves as if these standard mappings are present in the map,
    // regardless of the actual map content:
    //
    //	"lt": "<",
    //	"gt": ">",
    //	"amp": "&",
    //	"apos": "'",
    //	"quot": `"`,
    Entity map[string]string

    // CharsetReader, if non-nil, defines a function to generate
    // charset-conversion readers, converting from the provided
    // non-UTF-8 charset into UTF-8. If CharsetReader is nil or
    // returns an error, parsing stops with an error. One of
    // the CharsetReader's result values must be non-nil.
    CharsetReader func(charset string, input io.Reader) (io.Reader, error)
    // contains filtered or unexported fields
}

A Decoder represents an XML parser reading a particular input stream. The parser assumes that its input is encoded in UTF-8.
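
The non-strict settings listed in the Strict comment can be tried on a small HTML-like fragment. This is a sketch only: elementTrace is a hypothetical helper, and the input string is made up for the illustration. It records start and end elements so the end element invented for an unclosed br becomes visible:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// elementTrace returns the start and end elements reported by a
// non-strict Decoder configured for typical HTML.
func elementTrace(input string) ([]string, error) {
	d := xml.NewDecoder(strings.NewReader(input))
	d.Strict = false
	d.AutoClose = xml.HTMLAutoClose
	d.Entity = xml.HTMLEntity

	var trace []string
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return trace, nil
		}
		if err != nil {
			return trace, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			trace = append(trace, "<"+t.Name.Local+">")
		case xml.EndElement:
			trace = append(trace, "</"+t.Name.Local+">")
		}
	}
}

func main() {
	// The <br> has no end tag; the parser supplies one.
	trace, err := elementTrace("<p>one<br>two</p>")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(trace)
}
```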

func NewDecoder

func NewDecoder(r io.Reader) *Decoder

NewDecoder creates a new XML parser reading from r.

func (*Decoder) Decode

func (d *Decoder) Decode(v interface{}) error

Decode works like xml.Unmarshal, except it reads the decoder stream to find the start element.
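
A short sketch of Decode reading from a stream rather than a byte slice. The Config type, the readConfig helper, and the sample document are hypothetical:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Config is a hypothetical type for this sketch.
type Config struct {
	XMLName xml.Name `xml:"config"`
	Host    string   `xml:"host"`
	Port    int      `xml:"port"`
}

const sampleConfig = xml.Header +
	`<config><host>localhost</host><port>8080</port></config>`

// readConfig decodes the first element found in input into a Config.
// A strings.Reader stands in for any io.Reader, such as a file.
func readConfig(input string) (Config, error) {
	var c Config
	err := xml.NewDecoder(strings.NewReader(input)).Decode(&c)
	return c, err
}

func main() {
	c, err := readConfig(sampleConfig)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s:%d\n", c.Host, c.Port)
}
```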

func (*Decoder) DecodeElement

func (d *Decoder) DecodeElement(v interface{}, start *StartElement) error

DecodeElement works like xml.Unmarshal except that it takes a pointer to the start XML element to decode into v. It is useful when a client reads some raw XML tokens itself but also wants to defer to Unmarshal for some elements.
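
The mixed style described above, reading tokens by hand and handing selected elements to DecodeElement, might look like the following sketch. The Entry type, the readEntries helper, and the feed document are hypothetical:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Entry is a hypothetical type for this sketch.
type Entry struct {
	ID    string `xml:"id,attr"`
	Title string `xml:"title"`
}

const feed = `<feed><title>demo</title>` +
	`<entry id="1"><title>first</title></entry>` +
	`<entry id="2"><title>second</title></entry></feed>`

// readEntries walks the token stream and decodes each <entry>
// element with DecodeElement, ignoring everything else.
func readEntries(input string) ([]Entry, error) {
	d := xml.NewDecoder(strings.NewReader(input))
	var entries []Entry
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return entries, nil
		}
		if err != nil {
			return nil, err
		}
		se, ok := tok.(xml.StartElement)
		if !ok || se.Name.Local != "entry" {
			continue
		}
		var e Entry
		if err := d.DecodeElement(&e, &se); err != nil {
			return nil, err
		}
		entries = append(entries, e)
	}
}

func main() {
	entries, err := readEntries(feed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, e := range entries {
		fmt.Println(e.ID, e.Title)
	}
}
```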

func (*Decoder) RawToken

func (d *Decoder) RawToken() (Token, error)

RawToken is like Token but does not verify that start and end elements match and does not translate name space prefixes to their corresponding URLs.
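
The difference in name space handling can be shown by reading the same start element both ways. In this sketch the helper firstStartName and the name space URL are hypothetical:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

const nsDoc = `<a:item xmlns:a="http://example.com/ns"></a:item>`

// firstStartName returns the name of the first start element,
// read with RawToken if raw is true and with Token otherwise.
func firstStartName(input string, raw bool) (xml.Name, error) {
	d := xml.NewDecoder(strings.NewReader(input))
	for {
		var tok xml.Token
		var err error
		if raw {
			tok, err = d.RawToken()
		} else {
			tok, err = d.Token()
		}
		if err != nil {
			return xml.Name{}, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			return se.Name, nil
		}
	}
}

func main() {
	rawName, _ := firstStartName(nsDoc, true)
	name, _ := firstStartName(nsDoc, false)
	fmt.Printf("RawToken: Space=%q Local=%q\n", rawName.Space, rawName.Local)
	fmt.Printf("Token:    Space=%q Local=%q\n", name.Space, name.Local)
}
```

With RawToken the Space field holds the prefix as written in the document; with Token it holds the URL the prefix is bound to.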

func (*Decoder) Skip

func (d *Decoder) Skip() error

Skip reads tokens until it has consumed the end element matching the most recent start element already consumed. It recurs if it encounters a start element, so it can be used to skip nested structures. It returns nil if it finds an end element matching the start element; otherwise it returns an error describing the problem.
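
As an illustration, Skip can be used to drop an entire subtree while collecting text from the rest of a document. The helper textWithout and the sample input are hypothetical:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// textWithout concatenates all character data in input except
// the data nested inside elements whose local name is skipName.
func textWithout(input, skipName string) (string, error) {
	d := xml.NewDecoder(strings.NewReader(input))
	var sb strings.Builder
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return sb.String(), nil
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == skipName {
				// Consume everything up to the matching end element,
				// including nested elements.
				if err := d.Skip(); err != nil {
					return "", err
				}
			}
		case xml.CharData:
			sb.Write(t)
		}
	}
}

func main() {
	s, err := textWithout(`<doc><meta><k>hidden</k></meta><p>shown</p></doc>`, "meta")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```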

func (*Decoder) Token

func (d *Decoder) Token() (t Token, err error)

Token returns the next XML token in the input stream. At the end of the input stream, Token returns nil, io.EOF.

Slices of bytes in the returned token data refer to the parser's internal buffer and remain valid only until the next call to Token. To acquire a copy of the bytes, call CopyToken or the token's Copy method.

Token expands self-closing elements such as <br/> into separate start and end elements returned by successive calls.

Token guarantees that the StartElement and EndElement tokens it returns are properly nested and matched: if Token encounters an unexpected end element, it will return an error.

Token implements XML name spaces as described by http://www.w3.org/TR/REC-xml-names/. Each of the Name structures contained in the Token has the Space set to the URL identifying its name space when known. If Token encounters an unrecognized name space prefix, it uses the prefix as the Space rather than report an error.

type Directive

type Directive []byte

A Directive represents an XML directive of the form <!text>. The bytes do not include the <! and > markers.

func (Directive) Copy

func (d Directive) Copy() Directive

type Encoder

type Encoder struct {
    // contains filtered or unexported fields
}

An Encoder writes XML data to an output stream.

func NewEncoder

func NewEncoder(w io.Writer) *Encoder

NewEncoder returns a new encoder that writes to w.

func (*Encoder) Encode

func (enc *Encoder) Encode(v interface{}) error

Encode writes the XML encoding of v to the stream.

See the documentation for Marshal for details about the conversion of Go values to XML.
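
A minimal sketch of encoding several values to one stream. The Event type and the encodeAll helper are hypothetical, and a bytes.Buffer stands in for any io.Writer:

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// Event is a hypothetical type for this sketch.
type Event struct {
	XMLName xml.Name `xml:"event"`
	Kind    string   `xml:"kind,attr"`
	Msg     string   `xml:"msg"`
}

// encodeAll writes each event to the same Encoder in turn and
// returns everything that was written.
func encodeAll(events []Event) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)
	for _, e := range events {
		if err := enc.Encode(e); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}

func main() {
	out, err := encodeAll([]Event{
		{Kind: "start", Msg: "boot"},
		{Kind: "stop", Msg: "halt"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```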

type EndElement

type EndElement struct {
    Name Name
}

An EndElement represents an XML end element.

type Name

type Name struct {
    Space, Local string
}

A Name represents an XML name (Local) annotated with a name space identifier (Space). In tokens returned by Decoder.Token, the Space identifier is given as a canonical URL, not the short prefix used in the document being parsed.

type ProcInst

type ProcInst struct {
    Target string
    Inst   []byte
}

A ProcInst represents an XML processing instruction of the form <?target inst?>.

func (ProcInst) Copy

func (p ProcInst) Copy() ProcInst

type StartElement

type StartElement struct {
    Name Name
    Attr []Attr
}

A StartElement represents an XML start element.

func (StartElement) Copy

func (e StartElement) Copy() StartElement

type SyntaxError

type SyntaxError struct {
    Msg  string
    Line int
}

A SyntaxError represents a syntax error in the XML input stream.
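
A caller that wants the line number can type-assert the error returned by Token. The helper syntaxErrorLine below is hypothetical, and the sketch assumes the decoder returns the *SyntaxError value directly rather than wrapped:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// syntaxErrorLine reads tokens until the first error. It returns the
// line recorded in the error and true if that error is a *SyntaxError;
// any other error, including io.EOF, yields false.
func syntaxErrorLine(input string) (int, bool) {
	d := xml.NewDecoder(strings.NewReader(input))
	for {
		_, err := d.Token()
		if err == nil {
			continue
		}
		if se, ok := err.(*xml.SyntaxError); ok {
			return se.Line, true
		}
		return 0, false
	}
}

func main() {
	// <b> is never closed, so the mismatched </a> is a syntax error.
	line, ok := syntaxErrorLine("<a>\n<b>\n</a>")
	fmt.Println(line, ok)
}
```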

func (*SyntaxError) Error

func (e *SyntaxError) Error() string

type TagPathError

type TagPathError struct {
    Struct       reflect.Type
    Field1, Tag1 string
    Field2, Tag2 string
}

A TagPathError represents an error in the unmarshalling process caused by the use of field tags with conflicting paths.
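
One way to provoke a conflict is to tag one field with a path through an element and another field with that element itself. In this sketch the Bad type and the conflict helper are hypothetical:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Bad is a hypothetical type whose tags conflict: First needs
// <items> to be a parent element, Second maps <items> itself.
type Bad struct {
	First  string `xml:"items>item1"`
	Second string `xml:"items"`
}

// conflict unmarshals into a Bad value and reports whether the
// resulting error is a *TagPathError.
func conflict() (*xml.TagPathError, bool) {
	var v Bad
	err := xml.Unmarshal([]byte(`<Bad><items>x</items></Bad>`), &v)
	tpe, ok := err.(*xml.TagPathError)
	return tpe, ok
}

func main() {
	if tpe, ok := conflict(); ok {
		fmt.Printf("%s (%q) conflicts with %s (%q)\n",
			tpe.Field1, tpe.Tag1, tpe.Field2, tpe.Tag2)
	}
}
```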

func (*TagPathError) Error

func (e *TagPathError) Error() string

type Token

type Token interface{}

A Token is an interface holding one of the token types: StartElement, EndElement, CharData, Comment, ProcInst, or Directive.

func CopyToken

func CopyToken(t Token) Token

CopyToken returns a copy of a Token.
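
Because byte slices in a token are only valid until the next call to Token, tokens that are kept for later should be copied first. A sketch, with a hypothetical helper collectText and made-up input:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// collectText keeps a copy of every CharData token and converts
// the copies to strings only after parsing has finished.
func collectText(input string) ([]string, error) {
	d := xml.NewDecoder(strings.NewReader(input))
	var kept []xml.Token
	for {
		tok, err := d.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		if _, ok := tok.(xml.CharData); ok {
			// Copy before the next call to Token reuses the buffer.
			kept = append(kept, xml.CopyToken(tok))
		}
	}
	var out []string
	for _, t := range kept {
		out = append(out, string(t.(xml.CharData)))
	}
	return out, nil
}

func main() {
	texts, err := collectText(`<r><a>alpha</a><b>beta</b></r>`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(texts)
}
```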

type UnmarshalError

type UnmarshalError string

An UnmarshalError represents an error in the unmarshalling process.

func (UnmarshalError) Error

func (e UnmarshalError) Error() string

type UnsupportedTypeError

type UnsupportedTypeError struct {
    Type reflect.Type
}

An UnsupportedTypeError is returned when Marshal encounters a type that cannot be converted into XML.

func (*UnsupportedTypeError) Error

func (e *UnsupportedTypeError) Error() string

Bugs

Mapping between XML elements and data structures is inherently flawed: an XML element is an order-dependent collection of anonymous values, while a data structure is an order-independent collection of named values. See package json for a textual representation more suitable to data structures.