
It looks like the project (pup) became inactive for a while, and there are alternatives such as htmlq. https://github.com/ericchiang/pup/issues/150


From the looks of it, htmlq doesn’t have anything comparable to pup’s JSON output. That JSON is cumbersome to work with, but combined with jq it allows one to extend the shell hackery just a little bit beyond what CSS can do.
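To make the pup + jq idea concrete, here is a rough sketch. It assumes jq is installed, and it runs on hard-coded sample data shaped like what `pup 'a json{}'` prints (objects with `tag`, `text` and attribute keys); the links and titles are made up, and `filter_links` is just a hypothetical helper name.

```shell
# Sample data shaped like the output of `pup 'a json{}'` (values are made up).
pup_json='[
  {"tag": "a", "href": "https://example.com/a", "text": "How We Built It"},
  {"tag": "a", "href": "https://example.org/b", "text": "Release notes"},
  {"tag": "a", "href": "/local", "text": "How to contribute"}
]'

# Hypothetical helper: keep absolute links whose text matches a regex,
# a filter that plain CSS selectors cannot express.
filter_links() {
  jq -r '.[] | select(.href | startswith("http")) | select(.text | test("^How")) | .href'
}

result=$(printf '%s\n' "$pup_json" | filter_links)
echo "$result"
```

With real input, the sample variable would be replaced by something like `curl -s <url> | pup 'a json{}' | filter_links`.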


Hey, I'm the author of fq. It can convert to/from HTML and JSON (in two different modes). Use -d html, or the fromhtml, fromxml and toxml functions. For example:

  $ curl -s https://news.ycombinator.com/ | fq -r -d html 'grep_by(."@class"=="titleline").a."#text"'
  Inkbase: Programmable Ink
  New details on commercial spyware vendor Variston
  How We Built Fly Postgres
  ...
  $ curl -s https://news.ycombinator.com/ | fq -r -d html '{hosts: {host: [grep_by(."@class"=="titleline").a."@href" | fromurl.host]}} | toxml({indent:2})'
  <hosts>
    <host>www.inkandswitch.com</host>
    <host>blog.google</host>
    <host>fly.io</host>
    ...
  </hosts>
See https://github.com/wader/fq/blob/master/doc/formats.md#xml and https://github.com/wader/fq/blob/master/doc/formats.md#html for examples and documentation.



