Age | Commit message | Author |
|
Tested using 1000 feeds: it took about 20s before, now it takes 0.02s.
The overhead of sed and printf was too high.
Now pipe an intermediate TAB-separated format to awk and print it in one stage.
|
see RFC2822 4.3 page 32:
"
[...]
However, because of the error in
[RFC822], they SHOULD all be considered equivalent to "-0000" unless
there is out-of-band information confirming their meaning.
"
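A minimal sketch of how the quoted rule could be applied when parsing a date.
The function name obs_zone_offset() is hypothetical and this is not sfeed's
actual code. It assumes the zone name has already been isolated and uppercased,
and maps the named zones listed in RFC2822 4.3 to their offsets. Everything
else, including the single-letter military zones, is treated as "-0000".

```c
#include <string.h>

/* Hypothetical helper (not sfeed's actual code): map an obsolete alphabetic
 * zone name from RFC2822 section 4.3 to a UTC offset in seconds. */
long
obs_zone_offset(const char *zone)
{
	static const struct {
		const char *name;
		long hours;
	} known[] = {
		{ "UT",   0 }, { "GMT",  0 },
		{ "EDT", -4 }, { "EST", -5 },
		{ "CDT", -5 }, { "CST", -6 },
		{ "MDT", -6 }, { "MST", -7 },
		{ "PDT", -7 }, { "PST", -8 },
	};
	size_t i;

	for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		if (!strcmp(zone, known[i].name))
			return known[i].hours * 3600L;
	}
	/* Single-letter military zones and unknown names: consider them
	 * equivalent to "-0000", so no offset is applied. */
	return 0;
}
```

With this sketch obs_zone_offset("EST") gives -18000 and obs_zone_offset("A")
gives 0.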
|
... both are out-of-scope for sfeed.
- sfeed_tail can be written as a simple customized awk script reading from a
FIFO. The C version did not work well on FIFOs.
- Security considerations are mentioned in the official HTML spec and apply to
all HTML and protocol handlers, so they are out-of-scope.
|
- handle type attribute for MRSS media:description,
media:description type="plain" is now parsed properly.
- handle default content-types per tag now.
- when multiple content-like fields are specified use the proper content-type.
- be flexible about type attribute handling.
- minor code tweaks.
|
this program does not store anything, it just writes to stdout.
|
- add preface text.
- use "\t" pattern for awk (easier to read and copy-paste).
- add a small example to get the most recent enclosure.
|
The message-id has not been working as intended for a while. It only hashed the
timestamp field because parseline() modifies the buffer in-place.
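A minimal sketch of why an in-place parser causes this. split_tabs() is a
hypothetical stand-in for parseline() and hash_bytes() is only a placeholder
for the real hash; neither is sfeed's actual code. After splitting, the line
buffer is NUL-terminated at the end of the first field, so hashing it as a C
string covers only the timestamp.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for parseline(): split a TAB-separated line in-place
 * by overwriting each TAB with a NUL byte. Returns the number of fields. */
size_t
split_tabs(char *line, char **fields, size_t maxfields)
{
	size_t n = 0;
	char *p = line;

	while (n < maxfields) {
		fields[n++] = p;
		if (!(p = strchr(p, '\t')))
			break;
		*p++ = '\0'; /* the original line now ends here as a C string */
	}
	return n;
}

/* Placeholder hash (djb2-style), used only to illustrate the effect. */
unsigned long long
hash_bytes(const char *s, size_t len)
{
	unsigned long long h = 5381;
	size_t i;

	for (i = 0; i < len; i++)
		h = h * 33 + (unsigned char)s[i];
	return h;
}
```

Hashing the line before calling the parser, or remembering its original length,
avoids the problem in this sketch.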
|
- Show how to filter protocol schemes more strictly. For example to allow only
http://, https:// and gopher:// (not file://, javascript:, etc).
- Filter links and now also enclosures.
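The same allow-list idea, sketched in C for illustration. allowed_scheme() is a
hypothetical name and is not taken from the changed example. The comparison is
case-sensitive, which is an assumption of this sketch.

```c
#include <string.h>

/* Hypothetical helper: return 1 if url starts with one of the allowed
 * schemes (http://, https://, gopher://), otherwise 0. */
int
allowed_scheme(const char *url)
{
	static const char *allowed[] = { "http://", "https://", "gopher://" };
	size_t i;

	for (i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++) {
		/* prefix match: compare only the length of the scheme */
		if (!strncmp(url, allowed[i], strlen(allowed[i])))
			return 1;
	}
	return 0;
}
```

An allow-list rejects anything unknown by default, so schemes such as file://
or javascript: do not need to be enumerated.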
|
Do not send the referer header if the browser supports this tag. This makes
sure the header is still hidden in browsers where referer hiding is not set up.
The proper way however is to set up your browser environment to strip/change
the referer header and trim your browser footprint.
|
... and simplify example in README.
|
char *line is a global variable (reused pointer to line buffer).
|
This is useful for example for podcasts (audio attachment), newsposts (usually
some image) or comic strips (link to page, image as enclosure).
thanks leot for the feedback!
|
- Better checking and verbose logging (on failure) of each stage:
fetchfeed, filter, merge, order, convertencoding. This makes sure the output is
not corrupted when running out of memory, disk space or other resources.
- This also has the added advantage that it runs fewer processes (piped) at the
same time.
- Clear previous unneeded file to preserve space in /tmp
(/tmp is often mounted as mfs/tmpfs).
- Add logging function (able to override), use more logical logging format (pun
intended).
- Code-style: order overridable functions in execution order.
|
make the procmail example safer due to account process limits.
|
This reduces much of the function call overhead. getnext is defined in xml.h
for inline optimization. sfeed only uses one XML parser context per program,
which also allows further compiler optimizations.
On OpenBSD it was noticeable because of function call overhead from retpoline
etc. Using clang and a 500MB test XML file, processing time is reduced from
about 12s to 5s.
Tested using some crazy optimization flags:
SFEED_CFLAGS = -O3 -std=c99 -DGETNEXT=getchar_unlocked -fno-ret-protector \
-mno-retpoline -static
A GETNEXT macro is also nice for programs which mmap(2) some big XML file. Then
you can simply define:
#define GETNEXT() (off >= len ? EOF : reg[off++])
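A minimal sketch of how such a macro could be used over an in-memory region.
The globals reg, off and len follow the names in the macro above. count_lt() is
a hypothetical toy consumer, not the XML parser, and the region is passed in as
a plain buffer rather than obtained from mmap(2). reg is declared as unsigned
char here (an assumption) so that bytes >= 0x80 are never mistaken for EOF.

```c
#include <stddef.h>
#include <stdio.h> /* EOF */

/* Region to read from, e.g. as returned by mmap(2) (assumed, not shown). */
static const unsigned char *reg;
static size_t off, len;

#define GETNEXT() (off >= len ? EOF : reg[off++])

/* Hypothetical toy consumer: count the '<' bytes in a buffer, reading it
 * one byte at a time through GETNEXT() like a parser would. */
size_t
count_lt(const void *buf, size_t buflen)
{
	size_t n = 0;
	int c;

	reg = buf;
	off = 0;
	len = buflen;
	while ((c = GETNEXT()) != EOF) {
		if (c == '<')
			n++;
	}
	return n;
}
```

Because GETNEXT() expands to a plain expression there is no function call per
byte in this sketch.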
|
declare UTF-8 before <title>
|
on OpenBSD: make COMPATOBJ=
|
this allows overriding x->getnext so it expands to global context parsing and
allows the compiler to optimize it inline.
also remove the check whether the x->getnext function exists (just crash hard).
|
POSIX says about snprintf:
"If an output error was encountered, these functions shall return a
negative value".
So check for < 0 instead of -1. Afaik all implementations return -1 though.
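A minimal sketch of such a check. join_path() is a hypothetical helper, not
code from this commit. It treats both a negative return value (output error)
and a return value >= the buffer size (truncation) as failure.

```c
#include <stdio.h>

/* Hypothetical helper: format "dir/name" into buf.
 * Returns 0 on success, -1 on output error or truncation. */
int
join_path(char *buf, size_t bufsiz, const char *dir, const char *name)
{
	int r;

	r = snprintf(buf, bufsiz, "%s/%s", dir, name);
	/* r < 0: output error, r >= bufsiz: the result did not fit */
	if (r < 0 || (size_t)r >= bufsiz)
		return -1;
	return 0;
}
```

Checking r < 0 before the cast matters: a negative value converted to size_t
would become a very large number.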
|