From cc9f0d5549b21bb6254aede2ff479698183ea5e3 Mon Sep 17 00:00:00 2001
From: Hiltjo Posthuma
Date: Fri, 28 Sep 2018 17:11:56 +0200
Subject: sfeed_update: add filter(), order() support per feed + improvements

Pass the name parameter to the functions and add these to the pipeline. They
can be overridden in the config.

- add the ability to change the merge logic per feed.
- add the ability to filter lines and fields per feed.
- add the ability to order lines differently per feed.
- add filter example to README.
- code-style:
  - fetchfeed consistency in parameter order.
  - change [ x"" = x"" ] to [ "" = "" ]. Simplify some if statements.
  - wrap long line in fetchfeed().
  - use signal names for trap.
---
 README | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 54 insertions(+), 6 deletions(-)

(limited to 'README')

diff --git a/README b/README
index a98eb0b..5c7dab1 100644
--- a/README
+++ b/README
@@ -127,12 +127,18 @@ Files read at runtime by sfeed_update(1)
 ----------------------------------------
 
 sfeedrc - Config file. This file is evaluated as a shellscript in
-	  sfeed_update(1). You can for example override the fetchfeed()
-	  function to use wget(1), OpenBSD ftp(1) an other download program or
-	  you can override the merge() function to change the merge logic. The
-	  function feeds() is called to fetch the feeds. The function feed()
-	  can safely be executed concurrently as a background job in your
-	  sfeedrc(5) config file to make updating faster.
+	  sfeed_update(1).
+
+Atleast the following functions can be overridden per feed:
+
+- fetchfeed: to use wget(1), OpenBSD ftp(1) or an other download program.
+- merge: to change the merge logic.
+- filter: to filter on fields.
+- order: to change the sort order.
+
+The function feeds() is called to fetch the feeds. The function feed() can
+safely be executed concurrently as a background job in your sfeedrc(5) config
+file to make updating faster.
 
 
 Files written at runtime by sfeed_update(1)
@@ -212,6 +218,48 @@ argument is optional):
 
 - - -
 
+# filter fields.
+# filter(name)
+filter() {
+	case "$1" in
+	"tweakers")
+		LC_LOCALE=C awk -F '	' 'BEGIN {
+			OFS = "	";
+		}
+		# skip ads.
+		$2 ~ /^ADV:/ {
+			next;
+		}
+		# shorten link.
+		{
+			if (match($3, /^https:\/\/tweakers\.net\/(nieuws|downloads|reviews|geek)\/[0-9]+\//)) {
+				$3 = substr($3, RSTART, RLENGTH);
+			}
+			print $0;
+		}';;
+	"yt BSDNow")
+		# filter only BSD Now from channel.
+		LC_LOCALE=C awk -F '	' '$2 ~ / \| BSD Now/';;
+	*)
+		cat;;
+	esac | \
+		# replace youtube links with embed links.
+		sed 's@www.youtube.com/watch?v=@www.youtube.com/embed/@g' | \
+		# try to strip utm_ tracking parameters.
+		LC_LOCALE=C awk -F '	' 'BEGIN {
+			OFS = "	";
+		}
+		{
+			gsub(/\?utm_([^&]+)/, "?", $3);
+			gsub(/&utm_([^&]+)/, "", $3);
+			gsub(/\?&/, "?", $3);
+			gsub(/[\?&]+$/, "", $3);
+			print $0;
+		}'
+}
+
+- - -
+
 Over time your feeds file might become quite big. You can archive items from a
 specific date by doing for example:
 
-- 
cgit v1.2.3
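
The README hunk in this patch gives an example for filter() only. For
comparison, a per-feed order() override in sfeedrc might look roughly like the
sketch below. It is an illustration under stated assumptions, not code from
this commit: it assumes order() receives the feed name as its first argument
(as the commit message describes), reads TAB-separated items on stdin with the
UNIX timestamp in field 1, and writes the sorted items to stdout. The feed name
"oldest first" is hypothetical, and the newest-first fallback is an assumed
default.

```sh
# order(name): sketch of a per-feed sort override for sfeedrc.
# Assumption: items are TAB-separated lines, field 1 is a UNIX timestamp.

# literal TAB, built with printf so it stays visible in the source.
tab="$(printf '\t')"

order() {
	case "$1" in
	"oldest first")
		# hypothetical feed name: sort ascending by timestamp.
		sort -t "$tab" -k1,1n;;
	*)
		# assumed default: newest items first.
		sort -t "$tab" -k1,1rn;;
	esac
}
```

Matching on the feed name with a case statement mirrors the filter() example in
the patch, so both overrides can be read and extended in the same way.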