writing an RSS reader in 80 lines of bash

21 points by y0b1byte a day ago on lobsters | 12 comments

Foxboron | a day ago

Uh, this could actually just be done a lot simpler if you use xslt for the XML parsing with xsltproc. Then you can just pre-format the entire feed properly and use bash for what it's good at.

I had a script for this (from an IRC user), but it seems I've lost the feed.xslt that did the actual parsing. I'll see if I can find it.

https://github.com/Foxboron/home/blob/master/.local/bin/rss

EDIT: A similar source for parsing RSS XML structures into plain text you can parse with bash. https://paste.xinu.at/c38U5a95qgWw/xslt

All of this is derivative work from Earnestly, who is doing some RSS reader stuff here: https://github.com/Earnestly/webfeeds
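
For illustration, a minimal sketch of the xsltproc approach described above, assuming a plain RSS 2.0 feed (items under /rss/channel/item). The stylesheet is a guess at what such a feed.xslt could look like, not the lost original, and the function name feed_to_tsv is made up for this example. It flattens each item into one "title<TAB>link" line, which bash can then read without touching any XML.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read RSS 2.0 XML on stdin, print "title<TAB>link" per item.
feed_to_tsv() {
  local xslt status
  xslt=$(mktemp) || return 1
  # Assumed stylesheet, written to a temp file because xsltproc wants a path.
  cat > "$xslt" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="/rss/channel/item">
      <xsl:value-of select="normalize-space(title)"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="normalize-space(link)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF
  # "-" tells xsltproc to read the document from stdin.
  xsltproc "$xslt" -
  status=$?
  rm -f "$xslt"
  return "$status"
}

# Example use (needs network access, so left as a comment):
#   curl -fsS "https://lobste.rs/top/rss" | feed_to_tsv |
#     while IFS=$'\t' read -r title link; do
#       printf '%s\n  %s\n' "$title" "$link"
#     done
```

Atom feeds use namespaced elements and a different layout, so they would need their own templates; this sketch does not attempt that.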

[OP] y0b1byte | a day ago

That's definitely cleaner and more robust! However, I will still stick to bash, as it means fewer dependencies, and XSLT would require me to learn a new thing.

chrismorgan | a day ago

You need to fix the embedded code, escaping < as &lt;. Without it, the code presented doesn’t work in a few ways, leading me to think the code was more horrifying than it actually is. It’s still somewhat horrifying, but at least now it should mostly work (though it’s still pretty fragile), rather than seldom work.
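
One way to do the escaping mechanically, as a rough sketch: pipe each snippet through sed before pasting it into the HTML. The function name html_escape and the sample snippet are made up for this example. The & substitution has to come first, otherwise the entities produced by the later substitutions would get escaped a second time.

```bash
#!/usr/bin/env bash
# Hypothetical helper: escape the characters that are special in HTML text.
# "&" is handled first so the "&" in "&lt;" and "&gt;" is not re-escaped.
html_escape() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Sample snippet containing a redirect, the kind of "<" that breaks embedded code.
printf '%s\n' 'while read -r line; do echo "$line"; done < feeds.txt' | html_escape
```

Note the trailing semicolon in each entity: the replacement text is &lt; and not &lt.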

[OP] y0b1byte | a day ago

Thanks for that! I did replace it in a couple of places, but not everywhere.

chrismorgan | 19 hours ago

You need to spell it &lt; rather than &lt.

Really, https://validator.w3.org/nu/?doc=https://yobibyte.github.io/yr.html is probably worthwhile going through.

[OP] y0b1byte | 12 hours ago

Oops, thanks again!

refaktor | 11 hours ago

For us impatient ones, this blog post would be much better if it also showed the output of the script. A "screenshot" or "textshot" is worth 1000 LOC :)

ploum | a day ago

I had a similar setup where the links I wanted to read in newsboat were stored in a text file. My "do_the_internet.sh" would then send those links to forlater.email so I would receive the content.

Then forlater.email went down for a week. Not having anything to read during that week, I managed to add http/html/rss support to Offpunk.

Never used newsboat again…

Anecdote: the forlater.email founder is also the one who invited me to this site. So double thanks to him!

timetoplatypus | a day ago

Given the recent FOSDEM talk, now it's time to port it to Amber and just transpile to bash, lol

veqq | a day ago

Comment removed by author

veqq | a day ago

I tried to duplicate it using Janet's sh-dsl, but it's 50% longer and has some issues.

jlarocco | 16 hours ago

It doesn't reproduce everything from the article, but here's an attempt in Common Lisp:

(ql:quickload '(:alexandria :cxml :xpath :dexador :cxml-stp :cl-ppcre) :silent t)

(defun show-feed (url &optional (filter-regex nil))
  (let* ((filter (alexandria:curry #'cl-ppcre:scan filter-regex))
         (stream (dex:get url :want-stream t :force-binary t))
         (document (cxml:parse stream (stp:make-builder)))
         (nodes (xpath:evaluate "/rss/channel/item" document))
         (node-strings (xpath:map-node-set->list #'xpath:string-value nodes))
         (matches (if filter-regex
                      (remove-if filter node-strings)
                      node-strings)))
    (format t "~{~a~%~}~%" matches)))

;; Lobste.rs feed, minus vibecoding and design articles
(show-feed "https://lobste.rs/top/rss" "vibecoding|design")

Nice!

I was trying out a bigger example of this shell scripting idea: https://acha.ninja/blog/dsl_for_shell_scripting/ and I thought all do-nothing scripts etc. would work best in Janet... (and thus that you'd pipe different Janet processes into each other to compose bigger systems?) That's not really the case, as your example shows: handling all the logic natively makes far more sense. Failing that, there's still no reason to print things through small utilities piped together when you can just use everything within the same process.