-rw-r--r-- README         | 35
-rw-r--r-- sfeed.1        | 30
-rw-r--r-- sfeed.5        | 32
-rw-r--r-- sfeed_frames.1 |  2
4 files changed, 37 insertions, 62 deletions
diff --git a/README b/README
index 8a1b900..da8cddc 100644
--- a/README
+++ b/README
@@ -87,7 +87,7 @@ Platforms tested
 - Linux (glibc+gcc, musl-gcc, clang).
 - NetBSD
-- OpenBSD
+- OpenBSD: (gcc, pcc).
 - Windows (cygwin gcc, mingw).
@@ -132,36 +132,11 @@ feedname - TAB-separated format containing all items per feed. The
feedname.new - Temporary file used by sfeed_update(1) to merge items.
-TAB-separated format fields
----------------------------
+File format
+-----------
-The items are saved in a TSV-like format.
-
-The fields: title, id, author are not allowed to have newlines and TABs, all
-whitespace characters are replaced by a space character. Control characters are
-removed.
-
-The content field can contain newlines and TABS and are escaped. TABs, newlines
-and '\' are escaped with '\', so it becomes: '\t', '\n' and '\\'. Other
-whitespace characters except space are removed. Control characters are removed.
-
-The order and format of the fields are:
-
-item UNIX timestamp - UNIX timestamp (UTC+0), empty on parse failure.
-item title - Title text, HTML in titles is treated as
- plain-text.
-item link - Absolute url, unsafe characters are encoded.
-item content - Newlines and TABs are escaped. Control characters
- are removed. See the "TAB-separated format fields"
- text.
-item contenttype - "html" or "plain".
-item id - RSS item GUID or Atom id.
-item author - Item author.
-
-CAVEATS:
-- if a timezone is not supported (non-RFC-822) the UNIX timestamp is
- interpreted as UTC+0.
-- HTML in titles is not supported on purpose.
+man 5 sfeed
+man 1 sfeed
Usage and examples
diff --git a/sfeed.1 b/sfeed.1
index dc0d336..3784694 100644
--- a/sfeed.1
+++ b/sfeed.1
@@ -25,32 +25,30 @@ The content field can contain newlines and is escaped. TABs, newlines and '\\'
are escaped with '\\', so it becomes: '\\t', '\\n' and '\\\\'. Other whitespace
characters except space are removed. Control characters are removed.
.Pp
-The order and format of the fields are:
+The order and content of the fields are:
.Bl -tag -width 17n
-.It item timestamp
+.It timestamp
UNIX timestamp in UTC+0, empty on parse failure.
-.It item title
-Title text, HTML in titles is treated as plain-text.
-.It item link
+.It title
+Title text, HTML code in titles is ignored and is treated as plain-text.
+.It link
Absolute url, unsafe characters are encoded.
-.It item content
-Newlines and TABs are escaped. Control characters are removed. See the
-.Sx TAB-SEPARATED FORMAT FIELDS
-text.
-.It item content\-type
+.It content
+Content, can have plain-text or HTML code depending on the content\-type field.
+.It content\-type
"html" or "plain".
-.It item id
+.It id
RSS item GUID or Atom id.
-.It item author
+.It author
Item author.
.El
.Sh SEE ALSO
.Xr sfeed_plain 1 ,
-.Xr sfeed_update 1 ,
-.Xr sh 1
+.Xr sfeed 5
.Sh AUTHORS
.An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
.Sh CAVEATS
-if a timezone is not supported (non-RFC-822) the UNIX timestamp is interpreted
-as UTC+0.
+If a timezone is not in the RFC-822 or RFC-3332 format it is not supported and
+the UNIX timestamp is interpreted as UTC+0.
+.Pp
HTML in titles is treated as plain-text.
diff --git a/sfeed.5 b/sfeed.5
index 16b4bb1..17dc58a 100644
--- a/sfeed.5
+++ b/sfeed.5
@@ -11,6 +11,8 @@
reads RSS or Atom feed data (XML) from stdin. It writes the feed data in a
TAB-separated format to stdout.
.Sh TAB-SEPARATED FORMAT FIELDS
+The items are saved in a TSV-like format.
+.Pp
The fields: title, id, author are not allowed to have newlines and TABs, all
whitespace characters are replaced by a single space character. Control
characters are removed.
@@ -19,30 +21,30 @@ The content field can contain newlines and is escaped. TABs, newlines and '\\'
are escaped with '\\', so it becomes: '\\t', '\\n' and '\\\\'. Other whitespace
characters except space are removed. Control characters are removed.
.Pp
-The order and format of the fields are:
+The order and content of the fields are:
.Bl -tag -width 17n
-.It item timestamp
+.It timestamp
UNIX timestamp in UTC+0, empty on parse failure.
-.It item title
-Title text, HTML in titles is treated as plain-text.
-.It item link
+.It title
+Title text, HTML code in titles is ignored and is treated as plain-text.
+.It link
Absolute url, unsafe characters are encoded.
-.It item content
-Newlines and TABs are escaped. Control characters are removed. See the
-.Sx TAB-SEPARATED FORMAT FIELDS
-text.
-.It item content\-type
+.It content
+Content, can have plain-text or HTML code depending on the content\-type field.
+.It content\-type
"html" or "plain".
-.It item id
+.It id
RSS item GUID or Atom id.
-.It item author
+.It author
Item author.
.El
.Sh SEE ALSO
-.Xr sfeed 1
+.Xr sfeed 1 ,
+.Xr sfeed_plain 1
.Sh AUTHORS
.An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
.Sh CAVEATS
-if a timezone is not supported (non-RFC-822) the UNIX timestamp is interpreted
-as UTC+0.
+If a timezone is not in the RFC-822 or RFC-3332 format it is not supported and
+the UNIX timestamp is interpreted as UTC+0.
+.Pp
HTML in titles is treated as plain-text.
diff --git a/sfeed_frames.1 b/sfeed_frames.1
index 96118ad..b37866b 100644
--- a/sfeed_frames.1
+++ b/sfeed_frames.1
@@ -42,7 +42,7 @@ The maximum length of the path is PATH_MAX or filesystem-specific (truncated).
.Sh AUTHORS
.An Hiltjo Posthuma Aq Mt hiltjo@codemadness.org
.Sh SECURITY CONSIDERATIONS
-Each item file contain the item content formatted as HTML, if the feed data
+Each item content file contains the content formatted as HTML, if the feed data
contains HTML like Javascripts, tracking cookies, custom styles and such
these will also be displayed. Due to the crazy nature of "the web" these things
are complex to filter. Some security and privacy can be gained by using an