* Debug mode improvements
- Improve debug warning message
- Restore error reporting in debug mode
- Fix 'notice' messages for unset fields
* Add parsing utility functions
html.php
- extractFromDelimiters
- stripWithDelimiters
- stripRecursiveHTMLSection
- markdownToHtml (partial)
bridges
- remove now-duplicate functions
- call functions from html.php instead
* [Anidex] New bridge
Anime torrent tracker
* [Anime-Ultime] Restore thumbnail
* [CNET] Recreate bridge
Full rewrite as the previous one was broken
* [Dilbert] Minor URI fix
Use new self::URI property
* [EstCeQuonMetEnProd] Fix content extraction
Bridge was broken
* [Facebook] Fix "SpSonsSoriSsés" label
... which was taking space in item title
* [Futura-Sciences] Use HTTPS, More cleanup
Use HTTPS as FS now offers HTTPS
Clean additional useless HTML elements
* [GBATemp] Multiple fixes
- Fix categories: missing "break" statements
- Restore thumbnail as enclosure
- Fix date extraction
- Fix user blog post extraction
- Use getSimpleHTMLDOMCached
* [JapanExpo] Fix bridge, HTTPS, thumbnails
- Fix getSimpleHTMLDOMCached call
- Upgrade to HTTPS as JE now offers HTTPS
- Restore thumbnails as enclosures
* [LeMondeInformatique] Fix bridge, HTTPS
- Upgrade to HTTPS as LMI now offers HTTPS
- Restore thumbnails using small images
- Fix content extraction
- Fix text encoding issue
* [Nextgov] Fix content extraction
- Restore thumbnail and use small image
- Field extraction fixes
* [NextInpact] Add categories and filtering by type
- Offer all RSS feeds
- Allow filtering by article type
- Implement extraction for brief articles
- Remove article limit, as many brief articles are published all at once
* [NyaaTorrents] New bridge
Anime torrent tracker
* [Releases3DS] Cache content, restore thumbnail
- Use getSimpleHTMLDOMCached
- Restore thumbnail as enclosure
* [TheHackerNews] Fix bridge
- Fix content extraction including article body
- Restore thumbnail as enclosure
* [WeLiveSecurity] HTTPS, Fix content extraction
- Upgrade to HTTPS as WLS now offers HTTPS
- Fix content extraction including article body
* [WordPress] Reduce timeout, more content selectors
- Reduce timeout to use default one (1h)
- Add new content selector (articleBody)
- Find thumbnail and set as enclosure
- Fix <script> cleanup
* [YGGTorrent] Increase limit, use cache
- Increase item limit as uploads are very frequent
- Use getSimpleHTMLDOMCached
* [ZDNet] Rewrite with FeedExpander
- Upgrade to HTTPS as ZD now offers HTTPS
- Use FeedExpander for secondary fields
- Fix content extraction for article body
* [Main] Handle MIME type for enclosures
Many feed readers will ignore enclosures (e.g. thumbnails) with no MIME type. This commit adds automatic MIME type detection based on file extension (which may be inaccurate but is the only way without fetching the content).
One can force enclosure type using #.ext anchor (hacky, needs improving)
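
As a rough illustration of the extension-based detection described above, here is a minimal sketch in Python (RSS-Bridge itself is written in PHP, so this is not the project's code). The function name `guess_enclosure_type` and the `application/octet-stream` fallback are assumptions made for the example; only the two ideas, guessing from the file extension and letting a `#.ext` anchor override it, come from the commit message.

```python
import mimetypes
from urllib.parse import urlsplit


def guess_enclosure_type(url, default="application/octet-stream"):
    """Guess a MIME type from the URL alone, without fetching the content."""
    parts = urlsplit(url)
    if parts.fragment.startswith("."):
        # A "#.ext" anchor forces the type, e.g. ".../image?id=5#.png".
        name = "file" + parts.fragment
    else:
        # Otherwise rely on the extension of the path (may be inaccurate).
        name = parts.path
    mime, _encoding = mimetypes.guess_type(name)
    # Assumption: fall back to a generic type when nothing can be guessed.
    return mime or default


print(guess_enclosure_type("https://example.com/thumb.jpg"))
print(guess_enclosure_type("https://example.com/image?id=5#.png"))
```

The anchor is handled first so that URLs without a usable extension in their path can still carry a type hint.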
* [FeedExpander] Improve field extraction
- Add support for passing enclosures
- Improve author and uri extraction
- Fix 'notice' PHP error messages
* [Pull] Coding style fixes for #802
* [Pull] Implementing changes for #802
- Fix coding style issues with str append
- Remove useless CACHE_TIMEOUT
- Use count() instead of $limit
- Use defaultLinkTo() + handle strings
- Use http_build_query()
- Fix missing </em>
- Remove error_reporting(0)
- warning CSS (@LogMANOriginal)
- Fix typo in FeedExpander comment
* [Main] More documentation for markdownToHtml
See #802 for more details
The bridge generated empty titles if the content was longer than
50 characters but contained no further spaces. With this commit
the title is generated correctly from the content, taking missing
spaces into account.
References #786
https://cad-comic.com/ now provides feeds at
- https://cad-comic.com/feed (rss)
- https://cad-comic.com/feed/atom (atom)
Thus multiple alternatives are available to choose from, making this
bridge obsolete:
- FilterBridge (using one of the feeds above)
- WordPressBridge (on the main site)
- One of the two available feeds
References #752
This commit fixes an issue caused by self-closing tags that are not
supported by simplehtmldom (<source>).
Adds a monkey patch to extend simplehtmldom with the ability to detect
that particular tag. Most of the code added is copied directly from
simplehtmldom (see vendor/simplehtmldom) with adjustments to account
for RSS-Bridge formatting.
Related to: https://sourceforge.net/p/simplehtmldom/bugs/83/
Notice: The tag itself is valid according to Mozilla:
The HTML <picture> element serves as a container for zero or more
<source> elements and one <img> element to provide versions of an
image for different display device scenarios. The browser will
consider each of the child <source> elements and select one
corresponding to the best match found; if no matches are found
among the <source> elements, the file specified by the <img>
element's src attribute is selected. The selected image is then
presented in the space occupied by the <img> element.
-- https://developer.mozilla.org/en-US/docs/Web/HTML/Element/picture
References #753
* use defaultLinkTo
* remove duplicate video links
* remove line ending before "Reposted" label
* return newline before reposted string
* remove comments
* use video links that won't require login
* set title if video has no title
Adds a new option '&title_from_content=on' to build the title for feed
items from the feeds content. The title is generated from the first
whitespace after 50 characters of the content or the entire content if
the total size is lower than 50 characters.
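
The title rule described above can be sketched as follows. This is an illustrative Python version, not the bridge's PHP implementation; the function name `title_from_content` is hypothetical, and returning the whole content when no whitespace follows the 50th character is an assumption about how that edge case might be handled.

```python
import re

WHITESPACE = re.compile(r"\s")


def title_from_content(content, limit=50):
    """Build an item title from the start of its content."""
    text = content.strip()
    if len(text) <= limit:
        # Short content is used as the title unchanged.
        return text
    # Look for the first whitespace at or after the limit.
    match = WHITESPACE.search(text, limit)
    if match is None:
        # Assumption: with no further whitespace, keep the whole content
        # instead of producing an empty title.
        return text
    return text[:match.start()]


print(title_from_content("A short post"))
print(title_from_content("word " * 20))
```

Cutting at whitespace rather than at exactly 50 characters avoids splitting a word in the middle.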
References #587
This commit fixes a few things related to static::URI
1) Remove trailing slash from the URI to simplify using 'defaultLinkTo'
2) Use static::URI instead of self::URI for consistency
3) Remove custom implementation of 'defaultLinkTo'
Images are collected for each post and added to enclosures. Images or
animations from lh3.googleusercontent.com are specifically handled in
order to return the animated version of the gif and the original sized
image (this is normally taken care of by JS in the browser).
Adds a new bridge for https://gist.github.com
The bridge generates feeds for comments on a particular gist based on
the gist ID or full URI. For better readability the general behavior
of code sections is manually restored with the original CSS styles
from GitHub.
New bridge for Skimfeed: https://skimfeed.com
Generates feeds for all features of Skimfeed:
- News (the ones displayed on the front page)
- Hot topics ("What's Hot" section on the front page)
- Tech news (preconfigured feeds in the menu bar)
- Custom feeds (using the configuration system of Skimfeed), see
https://skimfeed.com/custom.php
The number of items returned by the bridge can be limited for all
categories ('&limit=...'). This parameter is optional; all categories
are unlimited by default!
Authors are added with HTML anchors in order to allow quick navigation
to source channels.
The bridge ships with developer tools to auto-generate lists in the
future (especially useful for 'Tech news'!)
References #748
Follows changes in the JSON data and selects images for the
content (320x240 or bigger) and the enclosure (largest version). All
data is now extracted from the JSON data instead of parsing the
DOM.
References #754
Removing this bridge for two reasons:
1) The service moved from www.cpasbien.cm to www.torrents9.blue,
changing the layout in the process (incompatible).
2) The new site is permanently protected by Cloudflare IUAM, making
it inaccessible by RSS-Bridge.
While it would certainly be possible to rewrite the bridge to work
with the new layout, the site is still inaccessible.
References #605
This commit adds a new bridge for http://www.instructables.com. The bridge
currently supports fetching content by category (all 200+ categories are available),
using the available filters (featured, recent, popular, views, contest winners).
Adds duration limits (minimum duration, maximum duration) for all
modes (user/id/playlist/search). Duration limits are optional, so
existing subscriptions don't break.
The limits are specified by two separate parameters, each of which
is optional:
- `&duration_min=` (minimum duration in minutes, default: -1)
- `&duration_max=` (maximum duration in minutes, default: INF)
If duration limits are specified in either user, id or playlist mode,
the bridge defaults to fetching data from HTML instead of XML feeds,
which requires more bandwidth and takes longer, because each video is
loaded individually!
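
To make the two parameters concrete, here is a small Python sketch of how such limits could be applied (the bridge itself is PHP; this is not its code). The helper names `parse_limits` and `within_limits`, the inclusive bounds, and durations being given in seconds are assumptions for the example. The parameter names and their defaults (-1 and INF) are taken from the description above.

```python
import math


def parse_limits(params):
    """Read the optional limits (in minutes) from request parameters."""
    # Missing or empty values fall back to the documented defaults.
    duration_min = float(params.get("duration_min") or -1)
    duration_max = float(params.get("duration_max") or math.inf)
    return duration_min, duration_max


def within_limits(duration_seconds, duration_min, duration_max):
    """Return True if a video of the given length passes both limits."""
    minutes = duration_seconds / 60
    # Assumption: both bounds are inclusive.
    return duration_min <= minutes <= duration_max


# Hypothetical sample data: (title, duration in seconds).
videos = [("intro", 120), ("talk", 600), ("stream", 3600)]
low, high = parse_limits({"duration_min": "5", "duration_max": "30"})
print([title for title, secs in videos if within_limits(secs, low, high)])
```

With the defaults every video passes, which is why existing subscriptions that omit both parameters keep working.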
References #670
The previous context is now labeled 'User', while the new context is
labeled 'Group'. The existing code was not changed, instead new group*
functions were implemented to handle groups.
The general principle of capturing groups is the same as done for users
with adjustments to account for different HTML structures.
Captcha responses are currently not supported for groups! There doesn't
seem to be a way to trigger them consistently, which makes it hard to
handle them properly.
Features of the group context:
- The feed title is based on the group name
- The group URI used for capturing is returned for the feed URI
- Author names and timestamps are reproduced from the source
- Post titles are reproduced from the source if they exist, otherwise
the title is built manually from the author name and the content
- Original contents are included with the feed
- All images are attached as enclosures as well
Closes #