* refactor: move function to class
* fix: use the computed bridge name as cache key
* refactor: extract method
* fix: set a feed item uid on errors
* docs
* fix: remove year from uid
* [core] Add html/convertLazyLoading($dom)
Looks for lazy-loading attributes such as 'data-src' and converts
them back to regular ones such as 'src', easier for RSS readers.
It also converts <picture> elements to plain <img> elements.
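The attribute conversion described above can be sketched roughly as follows. This is not the RSS-Bridge implementation: the function name `convertLazyLoadingSketch` is hypothetical, it uses PHP's DOM extension rather than simplehtmldom, and it only handles `data-src` on `<img>` elements (no `<picture>` unwrapping).

```php
<?php
// Hypothetical sketch, not the actual html/convertLazyLoading() code.
// Assumes the DOM extension is available and only covers 'data-src' on <img>.
function convertLazyLoadingSketch(string $html): string
{
    $dom = new DOMDocument();
    // Wrap in a <div> so there is a single root element; the flags stop
    // libxml from adding <html>/<body> wrappers or a doctype.
    @$dom->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    foreach ($dom->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('data-src')) {
            // Copy the lazy-loading source into the regular attribute.
            $img->setAttribute('src', $img->getAttribute('data-src'));
            $img->removeAttribute('data-src');
        }
    }

    // Serialize the children of the wrapper <div> back to a string.
    $out = '';
    foreach ($dom->documentElement->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

Passing `<img data-src="a.jpg" alt="x">` through this sketch should yield an `<img>` that carries `src="a.jpg"` and no `data-src` attribute.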
* [core] Document html/stripRecursiveHTMLSection()
Add documentation for that function (no code changes).
* [WordPressBridge] Use convertLazyLoading()
* [WordPressBridge] Unwrap image figures
<img> inside <figure> may not display on RSS readers.
This converts them back to <img>, without losing caption if present.
* [ZDNet] Convert lazy loading images
* [core] html/stripRecursiveHTMLSection: Fix typo
* fix: Call to a member function find() on bool
Happens when defaultLinkTo() is passed the empty string.
* fix: prevent exception in defaultLinkTo() when passed the empty string
* refactor
* fix: notice
* fix: Trying to get property content of non-object at bridges/PcGamerBridge.php line 36
* fix: better exception message
* fix: strpos(): Non-string needles will be interpreted as strings in the future. Use an explicit chr() call to preserve the current behavior
* fix: improve FeedExpander
Include the first libxml error in exception.
Give better error message if trying to parse the empty string.
Log all libxml errors if debug mode is enabled.
* error handling and logging tweak
* feat: improve logging and error handling
* trim absolute path from file name
* fix: suppress php errors from xml parsing
* fix: respect the error reporting level in the custom error handler
* feat: don't log errors produced by bots
* ignore error about invalid bridge name
* upgrade bridge exception from warning to error
* remove remnants of using PHP's built-in error handler
* move responsibility of printing php error from logger to error handler
* feat: include url in log record context
* fix: always include url in log record context
Also ignore more non-interesting exceptions.
* more verbose httpexception
* fix
* fix
* refactor: search.js
* feat: use bridge description and short name in search
* fix bug in previous merge commit
Also reformat string from tabs to spaces
* refactor
* fix: bug in previous refactor
* chore: exclude phpcompat sniff due to bug in phpcompat
* fix: do not leak absolute paths
* refactor/fix: batch extensions checking, fix DOS issue
This fixes a future problem when code is placed under a namespace because `get_class($bridge)` will then return e.g. `RssBridge\Bridge\TwitterBridge` instead of the current value `TwitterBridge`.
Also a bit of refactoring of `Configuration.php`.
* docs: Do not use constant names when referring to config options
The options are customizable using a config file and no longer hardcoded in index.php since 8ac8e08abf
* Do not use constants for configuration
Since <8ac8e08abf>, they are just set to the configuration object values.
This moves the responsibility for getting a valid class name
to the users of BridgeFactory, avoiding the repeated sanitization.
Improper use can also be checked statically.
It was just getting out of sync:
- Minimum PHP version was bumped in 8365a7a34d
- Cache directory permission check was removed in 8e2b65556f
- Whitelist permission check was removed in d4e867f240
* refactor: fix exception handling
The removed catch is never used in PHP 7 and above.
Multiple catch statements like this were only needed to support both PHP 5 and PHP 7.
* remove traces of old exception handling
* add typehints
* dont treat exception code 0 specially
This bug was introduced by me when refactoring the http client.
Fixes:
Fatal error: Uncaught TypeError: Argument 2 passed to getContents() must be of the type array, null given
* Fixup deprecations on PHP 8
Fix #2448
* Configure a default fallback for getInput function
* Appease phpcs
* Avoid changing getInput function
Revert "Configure a default fallback for getInput function"
This reverts commit 94004c5104.
* [BridgeAbstract] Add loadCacheValue() and saveCacheValue()
Bridges currently need to implement value caching manually, which
results in duplicate code and more complex bridges.
This commit adds two protected functions to BridgeAbstract that make
it possible for bridges to store and retrieve values from a temporary
cache by key.
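A minimal sketch of how such a pair of helpers could look is shown here. The method names come from the commit message; the signatures, the `$duration` parameter, the in-memory storage, and the class names `CachingBridgeSketch` and `TokenBridgeSketch` are assumptions for illustration only.

```php
<?php
// Hypothetical sketch: an in-memory stand-in for a bridge-level value cache.
// The real BridgeAbstract uses the configured cache backend instead.
abstract class CachingBridgeSketch
{
    private $store = array();

    // Return the cached value, or null if it is missing or older than
    // $duration seconds (the parameter is an assumption).
    protected function loadCacheValue($key, $duration = 86400)
    {
        $id = sha1(static::class . '|' . serialize($key));
        if (!isset($this->store[$id])) {
            return null;
        }
        if (time() - $this->store[$id]['time'] > $duration) {
            return null;
        }
        return $this->store[$id]['value'];
    }

    // Store a value under the given key together with the current time.
    protected function saveCacheValue($key, $value)
    {
        $id = sha1(static::class . '|' . serialize($key));
        $this->store[$id] = array('time' => time(), 'value' => $value);
    }
}

class TokenBridgeSketch extends CachingBridgeSketch
{
    public function getToken()
    {
        $token = $this->loadCacheValue('token');
        if ($token === null) {
            $token = 'fetched-' . mt_rand(); // stand-in for an expensive request
            $this->saveCacheValue('token', $token);
        }
        return $token;
    }
}
```

With this sketch, calling `getToken()` twice on the same object returns the same value, because the second call is served from the cache.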
Co-Authored-By: Roliga <roliga.here@gmail.com>
This fixes a bug where it didn't use curl from the CLI
even though it's installed.
I believe this preserves the original intention to
not require the curl module to be installed.
https://github.com/RSS-Bridge/rss-bridge/pull/979
Updates the data-ref tag of each bridge card to use the bridge's full name (e.g. Apple Music) instead of its filename (e.g. AppleMusic). This fixes issues with the search not returning some bridges.
This commit fixes the following issues:
1. 'Unexpected response' error message was returned, even if upstream did not return anything
2. Inability to handle non-20x messages while checking the response body
Updates displayBridgeCard() in BridgeCard to allow configuration options noproxy and cache_timeout to be displayed, if enabled, when a bridge has no parameters in its PARAMETERS array
* [DarkReading] Hide dummy articles
* [FuturaSciences] Strip inline scripts from content
* [FeedExpander] Fix PHP notice on missing uri field
(guid is valid uri AND item uri is not valid)
=> (guid is valid uri AND item uri is empty or not valid)
* [NextInpact] Fix subtitle extraction
* [Markdown] Fix images with empty replacement text
* [TheHackerNews] Fix Author name cleanup
* [LeMondeInformatique] Remove encoding conversion
Was previously needed due to actual encoding on the page
being inconsistent with encoding specified in <meta> tag
* [AnimeUltime] Remove encoding conversion
Was previously needed due to encoding on the page being incorrect
* [FuturaSciences] Fix content extraction
* [FuturaSciences] Fix unneeded unset()
* [GBAtemp] Fix tutorial mode URL extraction
* [GBAtemp] Fix tutorial mode Title extraction
Most of the code in RSS-Bridge uses the long array syntax.
This commit adds a check to enforce using this syntax over
the short array syntax.
All failures have been fixed.
setInputs() currently looks if the global array defines a 'value'
for a given parameter, but that isn't supported by the API. It
needs to be 'defaultValue'.
* action: Add action to check bridge connectivity
It is currently not simply possible to check if the remote
server for a bridge is reachable or not, which means some
of the bridges might no longer work because the server is
no longer on the internet.
In order to find those bridges we can either check each
bridge individually (which takes a lot of effort), or use
an automated script to do this for us.
If a server is no longer reachable it could mean that it is
temporarily unavailable, or shutdown permanently. The results
of this script will at least help identifying such servers.
* [Connectivity] Use Bootstrap container to properly display contents
* [Connectivity] Limit connectivity checks to debug mode
Connectivity checks take a long time to execute and can require a lot
of bandwidth. Therefore, administrators should be able to determine
when and who is able to utilize this action. The best way to prevent
regular users from accessing this action is by making it available in
debug mode only (public servers should never run in debug mode anyway).
* [Connectivity] Split implementation into multiple files
* [Connectivity] Make web page responsive to user input
* [Connectivity] Make status message sticky
* [Connectivity] Add icon to the status message
* [contents] Add the ability for getContents to return header information
* [Connectivity] Add header information to the reply Json data
* [Connectivity] Add new status (blue) for redirected sites
Also adds titles to status icons (Successful, Redirected, Inactive, Failed)
* [Connectivity] Fix show doesn't work for inactive bridges
* [Connectivity] Fix typo
* [Connectivity] Catch errors in promise chains
* [Connectivity] Allow search by status and update dynamically
* [Connectivity] Add a progress bar
* [Connectivity] Use bridge factory
* [Connectivity] Import Bootstrap v4.3.1 CSS
The current solution for titles on input boxes is not obvious to the
user as support varies between bridges. This commit adds a button to
all input boxes with titles in order to make it clear to the user that
additional information is available.
Allows getting the expected MIME type of the format's output. A
corresponding MIME_TYPE constant is also defined in FormatAbstract for
the format implementations to overwrite.
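The constant-plus-getter pattern described above could look roughly like this. The class names `FormatSketch` and `AtomFormatSketch`, the method name `getMimeType`, and the default value are assumptions; only the `MIME_TYPE` constant is named in the commit message.

```php
<?php
// Hypothetical sketch of a format base class with an overridable MIME type.
abstract class FormatSketch
{
    // Default value is an assumption; implementations override it.
    const MIME_TYPE = 'text/plain';

    public function getMimeType()
    {
        // Late static binding picks up the constant of the concrete class.
        return static::MIME_TYPE;
    }
}

class AtomFormatSketch extends FormatSketch
{
    const MIME_TYPE = 'application/atom+xml';
}
```

Using `static::` rather than `self::` is what lets each format implementation overwrite the value without redefining the getter.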
Error reporting currently takes place for each error. This can result
in many error messages if a server has connectivity issues (i.e. when
it re-connects to the internet every 24 hours).
This commit adds a new option to the configuration file to define the
number of error reports to suppress before returning an error message
to the user.
Error reports are cached and therefore automatically purged after 24
hours. A successful bridge request does **not** clear the error count
as sporadic issues can be the result of actual problems on the server.
The implementation currently makes no assumption on the type of error,
which means it also suppresses bridge errors in debug mode. The default
value is, however, set to 1 which means all errors are reported.
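The counting behavior described above might be sketched as follows. The class `ErrorCounterSketch` and its method are hypothetical, counts are held in memory rather than in the cache, and the 24-hour purge is not modeled.

```php
<?php
// Hypothetical sketch: report an error only after a bridge has failed
// a configured number of times. Storage and names are assumptions.
class ErrorCounterSketch
{
    private $threshold;
    private $counts = array();

    public function __construct($threshold = 1)
    {
        $this->threshold = $threshold;
    }

    // Record one failure and return true once the threshold is reached.
    // A successful request does not reset the counter.
    public function shouldReport($bridgeName)
    {
        if (!isset($this->counts[$bridgeName])) {
            $this->counts[$bridgeName] = 0;
        }
        $this->counts[$bridgeName]++;
        return $this->counts[$bridgeName] >= $this->threshold;
    }
}
```

With a threshold of 1 every failure is reported, which matches the default described in the commit message; with a threshold of 3 the first two failures are suppressed.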
References #994
Bridge errors are currently included as part of the feed to
notify users about erroneous bridges (before that, bridges
silently failed).
This solution, however, can produce a high load of error
messages if servers are down (see #994 for more details).
Admins may also not want to include error messages in feeds
in order to keep those kinds of problems away from users or
simply to silently fail by choice.
This commit adds a new configuration section "error" with
one option "output" which can be set to the following values:
"feed": To include error messages in the feed (default)
"http": To return an HTTP header for each error
"none": To disable error reporting
Note that errors are always logged to 'error.log' independent
of the settings above.
Closes #1066
* [ParameterValidator] Ensure context has all fields
Previously if a bridge had a set of parameters like:
    const PARAMETERS = array(
        'ContextA' => array(
            'Param1' => array(
                'name' => 'Param1',
                'required' => true
            )
        ),
        'ContextB' => array(
            'Param1' => array(
                'name' => 'Param1',
                'required' => true
            ),
            'Param2' => array(
                'name' => 'Param2',
                'required' => true
            )
        )
    );
and a query specifying both Param1 and Param2 was provided a 'Mixed
context parameters' error would be returned. This change ensures
ContextA in the above example would not be considered a relevant context.
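The rule can be illustrated with a small sketch: a context counts as relevant only if it defines every field present in the query. The function `relevantContextsSketch` is hypothetical and ignores global parameters, required flags, and type checks that the real validator has to deal with.

```php
<?php
// Hypothetical sketch: keep only contexts that define all queried fields.
function relevantContextsSketch(array $parameters, array $query)
{
    $relevant = array();
    foreach ($parameters as $context => $fields) {
        // Fields the query supplies but this context does not know about.
        $unknown = array_diff(array_keys($query), array_keys($fields));
        if (count($unknown) === 0) {
            $relevant[] = $context;
        }
    }
    return $relevant;
}
```

For the parameter set above, a query with both Param1 and Param2 leaves only ContextB, while a query with just Param1 still matches both contexts.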
RSS-Bridge currently has to guess the queried context from the data
provided by the user. This, however, can cause issues for bridges
that have multiple contexts with conflicting parameters (i.e. none).
This commit adds context hinting to queries via '&context=<context>'
which can be omitted in which case the context is determined as before.
The format factory could be based on the abstract factory class if it
weren't static. This allows for higher abstraction and makes future
extensions possible. Also, not all parts of RSS-Bridge need to work
on the same instance of the factory.
References #1001
The cache factory could be based on the abstract factory class if it
weren't static. This allows for higher abstraction and makes future
extensions possible. Also, not all parts of RSS-Bridge need to work
on the same instance of the factory.
References #1001
The bridge factory could be based on the abstract factory class if it
weren't static. This allows for higher abstraction and makes future
extensions possible. Also, not all parts of RSS-Bridge need to work
on the same instance of the bridge factory.
References #1001
RSS-Bridge currently sanitizes the format name only for the display
action, which can cause problems if other actions depend on formats
as well.
It is therefore better to do sanitization in the factory class for
formats. Additionally, formats should not require a perfect match,
so 'Atom' and 'aToM' make no difference. This will also allow users
to define formats in their own style (i.e. only lowercase via CLI).
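A case-insensitive lookup of this kind could be sketched as below. The function name `sanitizeFormatNameSketch` and the list of known formats are assumptions, not the factory's actual code.

```php
<?php
// Hypothetical sketch: map a user-supplied format name to its canonical
// spelling, ignoring case and surrounding whitespace.
function sanitizeFormatNameSketch($name, array $knownFormats)
{
    $needle = strtolower(trim($name));
    foreach ($knownFormats as $format) {
        if (strtolower($format) === $needle) {
            return $format;
        }
    }
    return null; // unknown format
}
```

With known formats `Atom`, `Html` and `Json`, both `aToM` and `atom` resolve to `Atom`, and an unknown name yields null.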
References #1001
Response headers may contain fields with no values.
Example:
"Referrer-Policy: "
In this case the current implementation of explode() results in an
error because there is no content after ": ". Changing the delimiter
to ":" and trimming the value manually fixes that issue.
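The fix can be sketched like this. The function `parseHeaderLineSketch` is hypothetical, and the limit of 2 passed to `explode()` is an addition of this sketch (it keeps colons inside values such as URLs intact); the commit message only mentions the delimiter change and the trimming.

```php
<?php
// Hypothetical sketch: split a raw header line into name and value,
// tolerating headers that have no value at all.
function parseHeaderLineSketch($line)
{
    // Split on the first ':' only, so values containing ':' stay whole.
    $parts = explode(':', $line, 2);
    if (count($parts) < 2) {
        return null; // not a "Name: value" line (e.g. the status line)
    }
    return array(trim($parts[0]), trim($parts[1]));
}
```

For the line `Referrer-Policy: ` this returns the name with an empty string as value instead of failing.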
Users currently only get one option: to open a new issue on GitHub.
This can, however, result in duplicate issues, which is not desired.
This commit adds a second button to the error message, which links
to the GitHub issues tracker with the search query set to find
errors for the current bridge. That way, users can collaborate
on the same issue.
Incorrect configuration values are currently handled individually
for each condition, resulting in a lot of repetitive operations.
This commit adds two new private functions to report errors to the
user and end execution of the script.
The configuration files are currently hard-coded in the configuration
classes and error messages. However, the implementation should not
rely on specific details like the file name. Instead, the files should
be part of the global definition.
This commit introduces two global constants for the configuration files
- FILE_CONFIG => 'config.ini.php'
- FILE_CONFIG_DEFAULT => 'config.default.ini.php'
RSS-Bridge currently statically sets the timezone to UTC which can
result in incorrect timestamps if the server is hosted in another
region.
This commit adds a new configuration parameter to allow admins to
specify their own timezone for their servers. Invalid values will
result in an error message.
Example:
[system]
timezone = "UTC"
For compatibility reasons the default value is set to UTC.
This parameter accepts any of the supported timezones listed at
https://www.php.net/manual/en/timezones.php
Closes #956
References #1001
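Validating and applying a configured timezone could be sketched as follows. The function `applyTimezoneSketch` and the exception type are assumptions; the commit message only states that invalid values result in an error message.

```php
<?php
// Hypothetical sketch: accept only identifiers PHP knows, then apply them.
function applyTimezoneSketch($timezone)
{
    if (!in_array($timezone, timezone_identifiers_list(), true)) {
        throw new InvalidArgumentException('Invalid timezone: ' . $timezone);
    }
    date_default_timezone_set($timezone);
}
```

Calling it with `UTC` sets the default timezone, while an unknown identifier raises the exception before any setting is changed.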
If the bridge name matches exactly, it is not necessary to perform
a strtolower compare of bridges. In some situations this can lead
to much faster response times (depending on the amount of bridges
in whitelist.txt).
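The fast path could look roughly like this. The function `findBridgeNameSketch` is hypothetical and works on a plain array instead of the contents of whitelist.txt.

```php
<?php
// Hypothetical sketch: try an exact match first and fall back to a
// case-insensitive scan only when that fails.
function findBridgeNameSketch($name, array $whitelist)
{
    if (in_array($name, $whitelist, true)) {
        return $name; // exact match, no strtolower() calls needed
    }
    $lower = strtolower($name);
    foreach ($whitelist as $candidate) {
        if (strtolower($candidate) === $lower) {
            return $candidate;
        }
    }
    return null;
}
```

The exact lookup skips the per-entry `strtolower()` calls, which is where the savings for long whitelists would come from.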
Default bridges are currently statically defined in index.php, which
is not the right place if we want to keep responsibilities separated.
This commit introduces a new file whitelist.default.txt that holds
the default bridges and which is loaded automatically, if whitelist.txt
doesn't exist.
Due to this it is also no longer necessary to have write permission
for the root directory.
References #1001
This reverts commit 052844f5e1.
There is a bug in ->remove() that causes the parser to incorrectly
identify elements in the DOM tree that shouldn't exist anymore.
References #1151
simplehtmldom 1.9 introduced new functions to recursively remove
nodes from the DOM. This allows removing elements without the need
to re-load the document by using $html->load($html->save()), which
is very inefficient.
Find more information about remove() at
https://simplehtmldom.sourceforge.io/docs/1.9/api/simple_html_dom_node/remove/
find('*') wasn't supported in older versions of simplehtmldom but it
is supported now. Thus, all custom implementations can be replaced
by the correct solution.
This fixes the following issue:
1. bridge sets unique ids for the items (ids get hashed)
2. items go to the cache
3. on next run items get loaded from cache
4. these items have different ids because they were hashed again
5. they show up twice in feed reader
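One way to avoid the double hashing in step 4 is to leave values alone that already look like a SHA-1 digest. This is a sketch under that assumption, with a hypothetical function name; the commit message does not say how the fix is implemented.

```php
<?php
// Hypothetical sketch: hash a uid unless it already looks like a
// 40-character lowercase hexadecimal SHA-1 digest.
function normalizeUidSketch($uid)
{
    if (preg_match('/^[a-f0-9]{40}$/', $uid) === 1) {
        return $uid; // already hashed, e.g. loaded from cache
    }
    return sha1($uid);
}
```

Applying the function to its own output returns the same value, so items loaded from the cache keep their ids.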
- For consistency, functions should always return null on non-existing data.
- WordPressPluginUpdateBridge appears to have used its own cache instance in the past. Obviously not used anymore.
- Since $key can be anything, the cache implementation must make sure the related data is assigned reliably; most commonly by serializing and hashing the key in an appropriate way.
- Even though the default path for storage is perfectly fine, some people may want to use a different location. This is an example of how a cache implementation is responsible for its requirements.
This commit adds support for a new parameter which specifies the type
of cache to use for caching. It is specified in config.ini.php:
[cache]
type = "..."
Currently only one type of cache is supported (see /caches). All uses
of 'FileCache' were replaced by this configuration option.
Note: Caching currently depends on files and folders (due to FileCache).
Experience may vary depending on the selected cache type. For now always
check if FileCache is working before testing alternative types.
References #1000
'uid' represents the unique id for a feed item. This item is null by
default and can be set to any string value. The provided string value
is always hashed to sha1 to make it the same length in all cases.
References #977, #1005
Add transformation from legacy items to FeedItems, before transforming
items to the desired format. This allows using legacy bridges alongside
bridges that return FeedItems.
As discussed in #940, instead of throwing exceptions on invalid
parameters, add messages to the debug log instead
Add support for strings to setTimestamp(). If the provided timestamp
is a string, automatically try to parse it using strtotime().
This allows bridges to simply use `$item['timestamp'] = $timestamp;`
instead of `$item['timestamp'] = strtotime($timestamp);`
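The string handling could be sketched as below. The function `normalizeTimestampSketch` is hypothetical, and returning null for unparsable input is an assumption of this sketch rather than documented behavior of setTimestamp().

```php
<?php
// Hypothetical sketch: accept integers and numeric strings as-is and
// run any other string through strtotime().
function normalizeTimestampSketch($timestamp)
{
    if (is_int($timestamp)) {
        return $timestamp;
    }
    if (is_string($timestamp)) {
        if (is_numeric($timestamp)) {
            return (int)$timestamp;
        }
        $parsed = strtotime($timestamp);
        return $parsed === false ? null : $parsed;
    }
    return null;
}
```

A date string with an explicit timezone, such as `2019-01-01 00:00:00 UTC`, is converted to a Unix timestamp, while an empty string yields null.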
Support simple_html_dom_node as input parameter for setURI
Support simple_html_dom_node as input parameter for setContent