snabelen.no is one of many independent Mastodon servers you can use to take part in the decentralized social web.
A Norwegian home for the decentralized microblogging platform.


#scraping

6 posts · 5 participants · 0 posts today

If you use an AI browser like OpenAI's, which reads along with you and remembers everything, does that count as scraping? And if this isn't scraping, then enforcing a scraping ban against AI seems hard to me. In the near future, users will simply ask their browser to do some "browsing work" for them in the background.
tweakers.net/nieuws/241004/bro

Tweakers · Browsers from OpenAI and Perplexity bypass paywalls on behalf of users · By Arnoud Wokke

Open source, the foundation of modern software development, is cracking.
The reason: AI companies are scraping entire registries, and enterprise CI/CD systems hammer servers with wasteful, uncached requests.

An open letter to the industry, written by stewards of public infrastructure openssf.org/blog/2025/09/23/op

openssf.org · Open Infrastructure is Not Free: A Joint Statement on Sustainable Stewardship – Open Source Security Foundation
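The "wasteful, uncached requests" complaint above has a standard mitigation: HTTP conditional requests, where a client sends the ETag it already has and the registry answers 304 Not Modified instead of re-sending the artifact. A minimal sketch of that flow, assuming an in-memory cache and a stand-in registry function (both names are illustrative, not any real registry's API):

```python
# Sketch: conditional fetching so repeated CI runs don't re-download
# unchanged registry artifacts. `server` is a stand-in callable here.

def conditional_fetch(url, cache, server):
    """Return (status, body), consulting the cache via If-None-Match."""
    headers = {}
    cached = cache.get(url)
    if cached:
        headers["If-None-Match"] = cached["etag"]
    status, etag, body = server(url, headers)
    if status == 304:                    # Not Modified: reuse cached body
        return 304, cached["body"]
    cache[url] = {"etag": etag, "body": body}
    return status, body

def fake_registry(url, headers):
    """Stand-in for a package registry that honors ETags."""
    etag, body = '"v1"', b"package-tarball-bytes"
    if headers.get("If-None-Match") == etag:
        return 304, etag, None           # body not re-sent over the wire
    return 200, etag, body
```

On the second fetch of the same URL, only headers cross the wire; this is exactly the caching that the letter says enterprise CI/CD pipelines skip.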

1/2 You know it: they are scraping every text and book to ingest them into AI models. Large companies don't care about intellectual property rights or remuneration.

But did you know that they do not simply lend the books out, or afterwards give them away to children, people in need, or libraries? They are torn apart, destroyed, thrown away. arstechnica.com/ai/2025/06/ant

To show Anthropic's degree of contempt:

[Image: hundreds of books in chaotic order]
Ars Technica · Anthropic destroyed millions of print books to build its AI models · By Benj Edwards

Since people are dunking on Meta again, I'll share one tidbit: when @jonah and I were investigating some performance issues, I noticed that Meta-ExternalAgent was scraping /auth/sign_up and one specific invite link with different `accept` parameters (which indicate acceptance of the rules). However, because Mastodon returns 200 (and shows the rules again) on invalid `accept` parameters, the crawler just keeps going...
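The behavior described above, a 200 on an invalid parameter, gives a retrying crawler no signal to stop. A minimal sketch of the alternative the post implies: reject bad `accept` values with a 4xx so the client gets an unambiguous error. This is not Mastodon's actual code; the handler and the accepted-value set are assumptions for illustration.

```python
# Sketch (not Mastodon's real implementation): validate the sign-up
# `accept` query parameter and return 422 instead of 200 on junk values,
# so a crawler cycling through parameter variants stops getting "success".

VALID_ACCEPT = {"1", "true"}        # assumed set of accepted values

def handle_sign_up(params):
    """Return an HTTP status code for /auth/sign_up given query params."""
    accept = params.get("accept")
    if accept is not None and accept not in VALID_ACCEPT:
        return 422                  # Unprocessable Content: don't retry
    return 200                      # show the rules / proceed as usual
```

Whether 422 actually deters a given bot is another question, but at least it stops rewarding invalid requests with a 200.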

Replied in thread

@FediPact

601 instances of us.archive.org

also 858 URLs containing .gov, of which 614 are gov.xx, i.e. government sites of other countries. That leaves 244 US government sites, including consumer.ftc.gov, houstontx.gov, webharvest.gov (a NARA site), emergency.cdc.gov, ...
I get that US government works are not copyrighted, but still, talk about freeloading.

The only positive is that this may be the most comprehensive list of websites existing today.

#Meta #AI #scraping

When it comes to AI scraping, could you tf not with all that? Pay-per-packet could be the future if some people can't control themselves.

The more you scrape, the more operators have to pay, which should fund better infrastructure and drive costs down, but instead it could turn the internet into a truly "transactional" network.

Suddenly the internet is run on pay-per-packet... hell hath arrived. Granted, this fringe scenario is a bit hyperbolic, but still.