63 commits

Author SHA1 Message Date
Pavel Korytov
b505667168
[SubstackBridge] Add Substack bridge (#4174)
* [SubstackBridge] Add Substack

* [SubstackBridge] Add docs

* [SubstackBridge] Fix lint

* [SubstackBridge] Update description

* [SubstackBridge] Update description (x2)
2024-07-31 21:57:20 +02:00
Dag
001dd47439
fix: small tweaks (#4057) 2024-04-04 19:12:04 +02:00
Dag
d01c462ad5
fix(FeedExpander): if parse fails, include offending url in exception message (#3938)
Also some refactors
2024-01-29 21:51:34 +01:00
Dag
c4fceab7b3
refactor(FeedParser): (#3928) 2024-01-29 21:51:06 +01:00
Dag
d08d13f2c8
refactor: introduce http Request object (#3926) 2024-01-25 16:06:24 +01:00
Dag
191e5b0493
feat: add etag support to getContents (#3893) 2024-01-12 01:31:01 +01:00
Dag
5f37c72be0
fix(binance): plus some other tweaks (#3753) 2023-10-13 20:48:08 +02:00
Dag
49d9dafaec
refactor: more feed parsing tweaks (#3748) 2023-10-13 02:31:09 +02:00
Dag
2880524dfc
refactor: remove parent calls to parseItem (#3747) 2023-10-13 01:59:05 +02:00
Dag
9bda9e246a
refactor: FeedExpander (#3740)
* refactor: FeedExpander
2023-10-12 22:14:04 +02:00
Dag
41df17bc46
refactor (#3712)
* test: refactor test suite

* docs

* refactor

* yup

* docs
2023-10-01 19:23:30 +02:00
Dag
7329b83cc0
refactor: logger (#3678) 2023-09-21 22:05:55 +02:00
Dag
e6aef73a02
refactor (#3668) 2023-09-20 02:45:48 +02:00
Dag
4b9f6f7e53
fix: rewrite and improve caching (#3594) 2023-09-10 21:50:15 +02:00
User123698745
4976cd227e
[FeedExpander] support xhtml content / content with child elements (#3598)
* [core] support xhtml content type in FeedExpander

* [FilterBridge] change defaultValue to exampleValue

* [core] support content with child elements in FeedExpander
2023-08-04 22:14:08 +02:00
Dag
ee498eadf9
fix: move debug mode to config (#3324)
* fix: move debug mode to config

* fix: also move debug_whitelist to .ini config

* fix: move logic back to Debug class

* docs

* docs

* fix: disable debug mode by default

* fix: restore previous behavior for alerts

* fix: center-align alert text
2023-06-02 20:22:09 +02:00
Dag
4c3ebb312d
feat: improve error handling ux (#3298)
* feat: improve error handling ux

* feat: add error messages for failed xml parsing
2023-03-20 19:11:51 +01:00
Dag
95c199c2eb
fix: various php notices (#3145)
* fix: notice

* fix: Trying to get property content of non-object at bridges/PcGamerBridge.php line 36

* fix: better exception message

* fix: strpos(): Non-string needles will be interpreted as strings in the future. Use an explicit chr() call to preserve the current behavior
2022-11-15 00:30:51 +01:00
Dag
e027bd9274
fix: improve FeedExpander (#3103)
* fix: improve FeedExpander

Include the first libxml error in exception.

Give better error message if trying to parse the empty string.

Log all libxml errors if debug mode is enabled.

* error handling and logging tweak
2022-10-29 10:27:02 +02:00
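The error handling described in #3103 (first libxml error in the exception, a clearer message for empty input, all errors logged in debug mode) might look roughly like the sketch below. This is an illustration, not the actual FeedExpander code: the function name `parseFeedXml`, the `$debug` parameter, and the exception messages are hypothetical; only the libxml and SimpleXML calls are standard PHP.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: parse feed XML and throw a descriptive exception on failure.
 */
function parseFeedXml(string $xml, bool $debug = false): SimpleXMLElement
{
    if (trim($xml) === '') {
        // Give a clearer message than a generic parse failure
        throw new RuntimeException('Unable to parse xml: the input is empty');
    }

    // Collect libxml errors instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $feed = simplexml_load_string($xml);
    $errors = libxml_get_errors();

    // Restore the previous libxml error-handling state
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false) {
        if ($debug) {
            // In debug mode, log every libxml error, not just the first
            foreach ($errors as $error) {
                error_log(trim($error->message));
            }
        }
        $first = $errors[0] ?? null;
        $reason = $first ? trim($first->message) : 'unknown error';
        throw new RuntimeException('Unable to parse xml: ' . $reason);
    }

    return $feed;
}
```

Calling `parseFeedXml('')` throws immediately with the "input is empty" message, while malformed input throws with the first libxml error appended.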
ORelio
05f2fb5ec7
[FeedExpander] Decode HTML entities in title (#3110)
Feed item title may contain HTML entities that we need to decode,
else they are encoded twice when generating the expanded feed.
2022-10-20 18:26:43 +02:00
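The double-encoding problem that #3110 addresses can be illustrated with a short sketch. The sample title and variable names are made up for illustration, and this is not the code from the commit itself; it only shows why decoding before output matters.

```php
<?php

declare(strict_types=1);

// Title as it might look after extraction from a feed: entities still encoded
$rawTitle = 'Tom &amp; Jerry';

// Without decoding, escaping for output encodes the ampersand a second time
$doubleEncoded = htmlspecialchars($rawTitle, ENT_QUOTES, 'UTF-8');

// Decoding first yields plain text, which is then escaped exactly once
$plainTitle = html_entity_decode($rawTitle, ENT_QUOTES | ENT_HTML5, 'UTF-8');
$encodedOnce = htmlspecialchars($plainTitle, ENT_QUOTES, 'UTF-8');

echo $doubleEncoded, PHP_EOL; // Tom &amp;amp; Jerry
echo $encodedOnce, PHP_EOL;   // Tom &amp; Jerry
```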
Dag
27b3d7c34e
feat: improve logging and error handling (#2994)
* feat: improve logging and error handling

* trim absolute path from file name

* fix: suppress php errors from xml parsing

* fix: respect the error reporting level in the custom error handler

* feat: don't log errors produced by bots

* ignore error about invalid bridge name

* upgrade bridge exception from warning to error

* remove remnants of using PHP's builtin error handler

* move responsibility of printing php error from logger to error handler

* feat: include url in log record context

* fix: always include url in log record context

Also ignore more non-interesting exceptions.

* more verbose httpexception

* fix

* fix
2022-09-08 19:07:57 +02:00
Dag
8ea9472300
feat: improve exception message when xml parsing fails (#3009) 2022-09-05 14:26:11 +02:00
Dag
2bbce8ebef
refactor: general code base refactor (#2950)
* refactor

* fix: bug in previous refactor

* chore: exclude phpcompat sniff due to bug in phpcompat

* fix: do not leak absolute paths

* refactor/fix: batch extensions checking, fix DOS issue
2022-08-06 22:46:28 +02:00
Jan Tojnar
951092eef3
Fix coding style missed by phpcbf (#2901)
$ composer require --dev friendsofphp/php-cs-fixer

$ cat >.php-cs-fixer.dist.php <<'EOF'
<?php

$finder = PhpCsFixer\Finder::create()
    ->in(__DIR__);

$rules = [
    '@PSR12' => true,
    // '@PSR12:risky' => true,
    '@PHP74Migration' => true,
    // '@PHP74Migration:risky' => true,
    // buggy, duplicates existing comment sometimes
    'no_break_comment' => false,
    'array_syntax' => true,
    'lowercase_static_reference' => true,
    'visibility_required' => false,
    // Too much noise
    'binary_operator_spaces' => false,
    'heredoc_indentation' => false,
    'trailing_comma_in_multiline' => false,
];

$config = new PhpCsFixer\Config();

return $config
    ->setRules($rules)
    // ->setRiskyAllowed(true)
    ->setFinder($finder);
EOF

$ vendor/bin/php-cs-fixer --version
PHP CS Fixer 3.8.0 BerSzcz against war! by Fabien Potencier and Dariusz Ruminski.
PHP runtime: 8.1.7

$ vendor/bin/php-cs-fixer fix
$ rm .php-cs-fixer.cache
$ vendor/bin/php-cs-fixer fix
2022-07-08 13:00:52 +02:00
Dag
4f75591060
Reformat codebase v4 (#2872)
Reformat code base to PSR12

Co-authored-by: rssbridge <noreply@github.com>
2022-07-01 15:10:30 +02:00
Dag
5076d09de6
refactor: prepare for PSR2 (#2859) 2022-06-24 18:29:35 +02:00
Dag
1d0a0b927b
fix: use accept header when fetching feed (#2737)
* fix: use accept header when fetching feed

* fix: include atom too, and reuse constants from format classes

* add a catch all accept header
2022-05-18 00:18:33 +02:00
dag
dbee47f1d6
fix: give better error message when feed can't be parsed (#2618) 2022-04-10 18:54:32 +02:00
Stelfux
91b8e4196e
[FeedExpander.php] Preserve original icon (#2145) 2022-03-26 19:09:27 +01:00
ORelio
b754d14698
[FeedExpander] Handle Atom enclosures (#2039) 2021-04-04 15:21:15 +05:00
ORelio
8144488a9e [FeedExpander] Fix PHP notice on missing uri field
(guid is valid uri AND item uri is not valid)
 => (guid is valid uri AND item uri is empty or not valid)
2020-08-11 14:01:44 +02:00
ORelio
ca9c2abb60 [FeedExpander] Fix item href being used as feed uri (#1033) 2019-02-11 19:07:03 +01:00
logmanoriginal
1c17ffb5c4 [FeedExpander] Add constants for feed types 2018-11-18 16:18:40 +01:00
logmanoriginal
326cfb21cf [FeedExpander] Rename $name to $title 2018-11-18 16:11:38 +01:00
logmanoriginal
8ab1fb86a9 [FeedExpander] Let collectExpandableDatas() return self 2018-11-18 16:03:32 +01:00
logmanoriginal
c4550be812 lib: Add API documentation 2018-11-18 09:41:14 +01:00
logmanoriginal
c63af2e7ad core: Add separate Debug class
Replaces 'debugMessage' by specialized debug function 'Debug::log'.
This function takes the same arguments as the previous 'debugMessage'.

A separate Debug class allows for further optimization and separation
of concerns.
2018-11-10 20:03:05 +01:00
logmanoriginal
4b7fea5ebc [RssBridge] Include interfaces once 2018-11-06 19:23:32 +01:00
ORelio
de8cee6a1c Catching up | [Main] Debug mode, parse utils, MIME | [Bridges] Add/Improve 20 bridges (#802)
* Debug mode improvements

 - Improve debug warning message
 - Restore error reporting in debug mode
 - Fix 'notice' messages for unset fields

* Add parsing utility functions

html.php
 - extractFromDelimiters
 - stripWithDelimiters
 - stripRecursiveHTMLSection
 - markdownToHtml (partial)

bridges
 - remove now-duplicate functions
 - call functions from html.php instead

* [Anidex] New bridge

Anime torrent tracker

* [Anime-Ultime] Restore thumbnail

* [CNET] Recreate bridge

Full rewrite as the previous one was broken

* [Dilbert] Minor URI fix

Use new self::URI property

* [EstCeQuonMetEnProd] Fix content extraction

Bridge was broken

* [Facebook] Fix "SpSonsSoriSsés" label

... which was taking space in item title

* [Futura-Sciences] Use HTTPS, More cleanup

Use HTTPS as FS now offers HTTPS
Clean additional useless HTML elements

* [GBATemp] Multiple fixes

- Fix categories: missing "break" statements
- Restore thumbnail as enclosure
- Fix date extraction
- Fix user blog post extraction
- Use getSimpleHTMLDOMCached

* [JapanExpo] Fix bridge, HTTPS, thumbnails

- Fix getSimpleHTMLDOMCached call
- Upgrade to HTTPS as JE now offers HTTPS
- Restore thumbnails as enclosures

* [LeMondeInformatique] Fix bridge, HTTPS

- Upgrade to HTTPS as LMI now offers HTTPS
- Restore thumbnails using small images
- Fix content extraction
- Fix text encoding issue

* [Nextgov] Fix content extraction

- Restore thumbnail and use small image
- Field extraction fixes

* [NextInpact] Add categories and filtering by type

- Offer all RSS feeds
- Allow filtering by article type
- Implement extraction for brief articles
- Remove article limit, many brief articles are published all at once

* [NyaaTorrents] New bridge

Anime torrent tracker

* [Releases3DS] Cache content, restore thumbnail

- Use getSimpleHTMLDOMCached
- Restore thumbnail as enclosure

* [TheHackerNews] Fix bridge

 - Fix content extraction including article body
 - Restore thumbnail as enclosure

* [WeLiveSecurity] HTTPS, Fix content extraction

- Upgrade to HTTPS as WLS now offers HTTPS
- Fix content extraction including article body

* [WordPress] Reduce timeout, more content selectors

- Reduce timeout to use default one (1h)
- Add new content selector (articleBody)
- Find thumbnail and set as enclosure
- Fix <script> cleanup

* [YGGTorrent] Increase limit, use cache

- Increase item limit as uploads are very frequent
- Use getSimpleHTMLDOMCached

* [ZDNet] Rewrite with FeedExpander

- Upgrade to HTTPS as ZD now offers HTTPS
- Use FeedExpander for secondary fields
- Fix content extraction for article body

* [Main] Handle MIME type for enclosures

Many feed readers will ignore enclosures (e.g. thumbnails) with no MIME type. This commit adds automatic MIME type detection based on file extension (which may be inaccurate but is the only way without fetching the content).

One can force enclosure type using #.ext anchor (hacky, needs improving)

* [FeedExpander] Improve field extraction

- Add support for passing enclosures
- Improve author and uri extraction
- Fix 'notice' PHP error messages

* [Pull] Coding style fixes for #802

* [Pull] Implementing changes for #802

 - Fix coding style issues with str append
 - Remove useless CACHE_TIMEOUT
 - Use count() instead of $limit
 - Use defaultLinkTo() + handle strings
 - Use http_build_query()
 - Fix missing </em>
 - Remove error_reporting(0)
 - warning CSS (@LogMANOriginal)
 - Fix typo in FeedExpander comment

* [Main] More documentation for markdownToHtml

See #802 for more details
2018-09-09 20:20:13 +01:00
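The extension-based MIME detection for enclosures described in #802, including the `#.ext` override, could be sketched along these lines. The function name `guessMimeType`, the small lookup table, and the `application/octet-stream` fallback are assumptions made for illustration and are not taken from the project.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: guess a MIME type from the extension of an enclosure URL.
 * A "#.ext" fragment, if present, overrides the extension found in the path.
 */
function guessMimeType(string $url): string
{
    // Small illustrative table; a real one would cover many more types
    $types = [
        'jpg' => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png' => 'image/png',
        'gif' => 'image/gif',
        'mp3' => 'audio/mpeg',
        'mp4' => 'video/mp4',
    ];

    $fragment = parse_url($url, PHP_URL_FRAGMENT);
    if (is_string($fragment) && $fragment !== '' && $fragment[0] === '.') {
        // Forced type via "#.ext" anchor
        $extension = substr($fragment, 1);
    } else {
        $path = parse_url($url, PHP_URL_PATH);
        $extension = is_string($path) ? pathinfo($path, PATHINFO_EXTENSION) : '';
    }

    // Fall back to a generic type when the extension is unknown or missing
    return $types[strtolower($extension)] ?? 'application/octet-stream';
}
```

As the commit message notes, this kind of guess can be inaccurate, since the extension says nothing certain about the content; the anchor override exists for URLs whose path carries no usable extension.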
Walter Barrett
704a87ad97 Icons: Allow Bridge-specified icons (#788) 2018-08-21 17:46:47 +02:00
LogMANOriginal
193ca87afa [phpcs] enforce single quotes (#732)
* [phpcs] Add rule to enforce single quoted strings
2018-06-29 22:55:33 +01:00
logmanoriginal
4fb1366aaf [FeedExpander] Fix Serialization of 'SimpleXMLElement' is not allowed 2017-08-10 13:35:19 +02:00
logmanoriginal
8166e33e7f [FeedExpander] Remove whitespace from source content
Whitespace at the beginning of feeds causes parsing errors. This is
an example using an ill-formatted RSS feed:

   "XML or text declaration not at start of entity"
-- https://validator.w3.org

This commit automatically removes all leading and trailing whitespace
from the source content before parsing.
2017-08-10 13:20:35 +02:00
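The effect of the whitespace fix above can be reproduced with a short sketch. The sample feed content is made up, and this is not the code from the commit; it only shows that trimming lets an otherwise rejected document parse.

```php
<?php

declare(strict_types=1);

// A feed with stray whitespace before the XML declaration
$content = "\n  <?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    . "<rss version=\"2.0\"><channel><title>Example</title></channel></rss>\n";

// Collect libxml errors quietly instead of emitting warnings
libxml_use_internal_errors(true);

// Parsing the raw content fails because the declaration is not at the start
$raw = simplexml_load_string($content);
libxml_clear_errors();

// Trimming leading and trailing whitespace lets the same content parse
$trimmed = simplexml_load_string(trim($content));

var_dump($raw === false);
echo (string) $trimmed->channel->title, PHP_EOL;
```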
logmanoriginal
a4b9611e66 [phpcs] Add missing rules
- Do not add spaces after opening or before closing parenthesis

  // Wrong
  if( !is_null($var) ) {
    ...
  }

  // Right
  if(!is_null($var)) {
    ...
  }

- Add space after closing parenthesis

  // Wrong
  if(true){
    ...
  }

  // Right
  if(true) {
    ...
  }

- Put the body on a new line
- Close the body on a new line

  // Wrong
  if(true) { ... }

  // Right
  if(true) {
    ...
  }

Notice: Spaces after keywords are not detected:

  // Wrong (not detected)
  // -> space after 'if' and missing space after 'else'
  if (true) {
    ...
  } else{
    ...
  }

  // Right
  if(true) {
    ...
  } else {
    ...
  }
2017-07-29 19:55:12 +02:00
Frans de Jonge
781e4f1908 [FeedExpander] Deal with empty item 2017-06-24 15:09:15 +02:00
logmanoriginal
a2108c784f [FeedExpander] Properly cast simplexml elements
This fixes a possible cause of
"Serialization of 'SimpleXMLElement' is not allowed"
reported via #487
2017-03-13 22:12:11 +01:00
logmanoriginal
512a4f292b bridges: Return parent::getURI by default 2017-02-15 19:38:32 +01:00
logmanoriginal
c4169f1579 bridges: Return parent::getName by default 2017-02-15 19:38:32 +01:00
logmanoriginal
ff83410534 style: Fix coding styles 2017-02-14 17:28:07 +01:00
logmanoriginal
49281a2ed3 [FeedExpander] Remove orphan getDescription function 2016-10-16 12:47:37 +02:00