XML parser
----------

A small XML parser.

For the original version see:
https://git.codemadness.org/xmlparser/


Dependencies
------------

- C compiler (ANSI).


Features
--------

- Relatively small parser.
- Pretty simple API.
- Pretty fast.
- Portable.
- No dynamic memory allocation.


Supports
--------

- Tags in short-form (<img src="lolcat.jpg" title="Meow" />).
- Tag attributes.
- Short attributes without an explicitly set value (<input type="checkbox" checked />).
- Comments.
- CDATA sections.
- Helper function (xml_entitytostr) to convert XML 1.0 / HTML 2.0 named entities
  and numeric entities to UTF-8.
- Reading XML from a file descriptor or a string buffer, or through a custom
  reader: see XMLParser.getnext or the GETNEXT() macro.
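
A custom reader only has to hand out one byte per call and signal the end of
the input. The following is a minimal sketch of a string-buffer reader,
assuming the hook has the same shape as getchar(): no arguments, an int return
value, and EOF at the end. The names reader_set and reader_getnext are
hypothetical and not part of xml.h; how the function is wired to
XMLParser.getnext or the GETNEXT() macro depends on the version of xml.h in
use.

```c
#include <assert.h>
#include <stdio.h>  /* EOF */

/* Hypothetical reader state: illustrative names, not taken from xml.h. */
static const char *reader_pos = "";

/* Point the reader at a NUL-terminated string. */
static void
reader_set(const char *s)
{
	assert(s != NULL);
	reader_pos = s;
}

/* Return the next byte as an unsigned char value, or EOF at the end.
 * Once the end is reached, every further call keeps returning EOF. */
static int
reader_getnext(void)
{
	if (*reader_pos == '\0')
		return EOF;
	return (unsigned char)*reader_pos++;
}
```

The cast to unsigned char keeps bytes above 0x7f from being mistaken for EOF
on platforms where char is signed.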


Caveats
-------

- It is not a compliant XML parser.
- Performance: data is buffered even if no handler is set. To make parsing
  faster, change the buffering code in xml.c.
- The XML is not checked for errors: the parser keeps going on malformed
  input. This is by design.
- Internally fixed-size buffers are used, callbacks like XMLParser.xmldata are
  called multiple times for the same tag if the data size is bigger than the
  internal buffer size (sizeof(XMLParser.data)). To differentiate between new
  calls for data you can use the xml*start and xml*end handlers.
- It does not apply the XML white-space rules to tag data. The raw values,
  including white-space, are passed through. This is useful in some cases,
  like for HTML <pre> tags.
- The XML specification has no limits on tag and attribute names. For
  simplicity's and sanity's sake this XML parser takes some liberties: tag
  and attribute names are truncated if they are excessively long.
- Entity expansions are not parsed, and neither are DOCTYPE, ATTLIST etc.
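
Because a data callback can fire several times for one value, a handler
usually collects the chunks between the matching start and end calls. The
following is a minimal sketch of such an accumulator. The struct and function
names are hypothetical and not part of xml.h, and the 64-byte capacity is an
arbitrary choice for illustration. Each function is meant to be called from
the corresponding xml*start, data and xml*end handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulator: illustrative names, not taken from xml.h. */
struct textbuf {
	char   data[64];   /* fixed size, so no dynamic allocation is needed */
	size_t len;        /* bytes stored so far, excluding the NUL */
	int    truncated;  /* set when a chunk did not fit completely */
};

/* Call from the start handler: begin a new, empty value. */
static void
textbuf_start(struct textbuf *b)
{
	assert(b != NULL);
	b->len = 0;
	b->truncated = 0;
	b->data[0] = '\0';
}

/* Call from the data handler: may run several times per value. */
static void
textbuf_append(struct textbuf *b, const char *chunk, size_t chunklen)
{
	size_t avail = sizeof(b->data) - 1 - b->len;

	if (chunklen > avail) {
		chunklen = avail;  /* keep what fits, remember the loss */
		b->truncated = 1;
	}
	memcpy(b->data + b->len, chunk, chunklen);
	b->len += chunklen;
	b->data[b->len] = '\0';
}

/* Call from the end handler: the value is complete and NUL-terminated. */
static const char *
textbuf_end(const struct textbuf *b)
{
	return b->data;
}
```

Truncating oversized values instead of growing the buffer mirrors the
parser's own fixed-size approach. A caller that needs the full value would
have to check the truncated flag and handle that case itself.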


Files used
----------

xml.c and xml.h


Interface / API
---------------

The API should be trivial to use: see xml.c, xml.h and the examples below.


Examples
--------

See sfeed_opml_import.c, sfeed_web.c or sfeed_xmlenc.c.

See skeleton.c in the original xmlparser repository for a base program to start
quickly.


License
-------

ISC, see LICENSE file.