author Hiltjo Posthuma <hiltjo@codemadness.org> 2016-04-10 19:51:18 +0200
committer Hiltjo Posthuma <hiltjo@codemadness.org> 2016-04-10 19:51:18 +0200
commit eb6fe6f11a14afc82cd0039d88759d6c1c524d2f
tree 641ae3fc5f338e359addb33619f5ee98686e1575 /sfeed.5
parent 969ec64ef3195e00ae597e49a39e804bb6ce6464
improve documentation, add sfeed(5) for the file format
separate sfeed(5) page for just the feed file format.
Diffstat (limited to 'sfeed.5')
-rw-r--r-- sfeed.5 | 50
1 files changed, 50 insertions, 0 deletions
diff --git a/sfeed.5 b/sfeed.5
new file mode 100644
index 0000000..7e978b1
--- /dev/null
+++ b/sfeed.5
@@ -0,0 +1,50 @@
+.Dd April 10, 2016
+.Dt SFEED 5
+.Os
+.Sh NAME
+.Nm sfeed
+.Nd feed format
+.Sh SYNOPSIS
+.Nm
+.Sh DESCRIPTION
+.Xr sfeed 1
+reads RSS or Atom feed data (XML) from stdin. It writes the feed data in a
+TAB-separated format to stdout.
+.Sh TAB-SEPARATED FORMAT FIELDS
+The fields title, id and author cannot contain newlines or TABs: all
+whitespace characters are replaced by a single space character. Control
+characters are removed.
+.Pp
+The content field can contain newlines and is escaped. TABs, newlines and '\\'
+are escaped with '\\', so it becomes: '\\t', '\\n' and '\\\\'. Other whitespace
+characters except space are removed. Control characters are removed.
+.Pp
+The order and format of the fields are:
+.Bl -tag -width 17n
+.It item timestamp
+UNIX timestamp in UTC+0, empty on parse failure.
+.It item title
+Title text, HTML in titles is treated as plain-text.
+.It item link
+Absolute url, unsafe characters are encoded.
+.It item content
+Newlines and TABs are escaped. Control characters are removed. See the
+.Sx TAB-SEPARATED FORMAT FIELDS
+text.
+.It item content\-type
+"html" or "plain".
+.It item id
+RSS item GUID or Atom id.
+.It item author
+Item author.
+.It feed type
+"rss" or "atom".
+.El
+.Sh SEE ALSO
+.Xr sfeed 1
+.Sh AUTHORS
+.An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
+.Sh CAVEATS
+If a timezone is not supported (non-RFC-822), the UNIX timestamp is interpreted
+as UTC+0.
+HTML in titles is treated as plain-text.
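
The field order and escaping rules documented in the man page above can be illustrated with a small reader. This is only a sketch in Python, not part of sfeed itself: the names `FIELDS`, `unescape` and `parse_line` are hypothetical, and it assumes each line carries the eight fields in the documented order, with only the content field escaped.

```python
# Hypothetical field names, in the order documented in sfeed(5).
FIELDS = ("timestamp", "title", "link", "content",
          "content_type", "id", "author", "feed_type")

# Escape sequences used in the content field: \t, \n and \\.
ESCAPES = {"t": "\t", "n": "\n", "\\": "\\"}


def unescape(s):
    """Reverse the content-field escaping in a single left-to-right pass."""
    out = []
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s) and s[i + 1] in ESCAPES:
            out.append(ESCAPES[s[i + 1]])
            i += 2
        else:
            # Unknown or trailing backslashes are kept as-is (an assumption).
            out.append(s[i])
            i += 1
    return "".join(out)


def parse_line(line):
    """Split one TAB-separated line into a dict keyed by FIELDS."""
    parts = line.rstrip("\n").split("\t")
    # Pad short lines so every field name is present.
    parts += [""] * (len(FIELDS) - len(parts))
    item = dict(zip(FIELDS, parts))
    item["content"] = unescape(item["content"])
    # The timestamp is empty when the feed date could not be parsed.
    try:
        item["timestamp"] = int(item["timestamp"])
    except ValueError:
        item["timestamp"] = None
    return item
```

A single pass is used for unescaping because chained string replacements would misread an escaped backslash followed by the letter n as an escaped newline.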