[ZeitBridge] Remove doubled text

The first two paragraphs were repeated at the end of articles. The first
new CSS selector, div[data-paywall], filters those repeats out (example 1).
The second new selector, .js-embed-consent, removes the "Zum Anschauen
benötigen wir Ihre Zustimmung" ("We need your consent to view this") line
from a poll widget. We can't load the widget successfully, so we remove
all embeds that appear to rely on JavaScript (example 2).

1: https://www.zeit.de/campus/2024-03/bundesregierung-wissenschaft-arbeitsvertrag-regeln
2: https://www.zeit.de/campus/2024-03/ausbildung-abgebrochen-gruende-azubi-aufruf
Mynacol 2024-03-10 22:21:10 +01:00
parent 84b93e0f8f
commit 254efc2812

@@ -87,7 +87,7 @@ class ZeitBridge extends FeedExpander
         // remove known bad elements
         foreach (
             $article->find(
-                'aside, .visually-hidden, .carousel-container, #tickaroo-liveblog, .zplus-badge, .article-heading__container--podcast'
+                'aside, .visually-hidden, .carousel-container, #tickaroo-liveblog, .zplus-badge, .article-heading__container--podcast, div[data-paywall], .js-embed-consent'
             ) as $bad
         ) {
             $bad->remove();
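
The effect of the two added selectors, div[data-paywall] and
.js-embed-consent, can be tried outside the bridge with a minimal sketch.
It is not the bridge's code: it swaps the bridge's find()/remove() calls
for PHP's built-in DOMDocument and DOMXPath, so the CSS selectors are
rewritten as XPath. The removeBadElements() helper and the sample markup
are hypothetical, and the sketch assumes the repeated paragraphs sit
inside a div[data-paywall] container, which the real zeit.de markup may
not match exactly.

```php
<?php

/**
 * Remove elements matching div[data-paywall] and .js-embed-consent.
 * Hypothetical helper for illustration; DOMXPath has no CSS selector
 * support, so both selectors are expressed as XPath.
 */
function removeBadElements(string $html): string
{
    // Silence libxml warnings about HTML5 tags such as <article>.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = '//div[@data-paywall]'
        . ' | //*[contains(concat(" ", normalize-space(@class), " "), " js-embed-consent ")]';

    // Copy the matches first so removal does not disturb the iteration.
    foreach (iterator_to_array($xpath->query($query)) as $bad) {
        if ($bad->parentNode !== null) {
            $bad->parentNode->removeChild($bad);
        }
    }

    // Return only the <article> element, without the implied <html>/<body>.
    return $doc->saveHTML($doc->getElementsByTagName('article')->item(0));
}

// Assumed sample markup: a consent placeholder and a repeated teaser.
$sample = <<<HTML
<article>
<p>First paragraph.</p>
<p>Second paragraph.</p>
<div class="embed js-embed-consent"><p>Consent placeholder</p></div>
<div data-paywall="true"><p>First paragraph.</p><p>Second paragraph.</p></div>
</article>
HTML;

$cleaned = removeBadElements($sample);
echo $cleaned, PHP_EOL;
```

With this sample, the cleaned output keeps a single copy of each paragraph
and drops both the consent placeholder and the data-paywall container.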