• rockSlayer@lemmy.world
        38 upvotes · 7 months ago

        Well, that’s part of the thing. Web scraping doesn’t get covered by policies. Like, they could ban your IP or any accounts you have, but web scraping itself will always be acceptable. It’s why projects like NewPipe and Invidious don’t care about YouTube cease-and-desist letters.

      • freebread@lemm.ee
        English · 9 upvotes · 7 months ago

        Still waiting for the news that they took down old.reddit. Without the third party apps, that was the only way it could still be usable.

    • IphtashuFitz@lemmy.world
      English · 2 upvotes · 7 months ago

      We use Akamai where I work for security, CDN, etc. Their services make it largely trivial to identify traffic from bots. They can classify requests in real time as coming from anything from known bots like Googlebot, to programming frameworks like Python and Java, to bots that impersonate Googlebot, to virtually any other automated traffic from unknown bots.

      If Reddit were smart, they’d leverage something like that to allow Google, Bing, etc. to crawl their data and block all others, or poison others with bogus data. But we’re talking about Reddit here…
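
For anyone curious what "telling real Googlebot from an impersonator" can look like, here is a rough Python sketch. It is not how Akamai does it (their classification is proprietary and far more involved). It only illustrates the publicly documented reverse-DNS plus forward-confirm check, combined with a naive User-Agent match for scripted clients. The function name `classify_request`, the category labels, the User-Agent substrings, and the hostname suffix list are all assumptions made for illustration; check the suffixes against Google's current documentation before relying on them.

```python
import socket
from typing import Callable, List

# Assumed suffixes for Google crawler reverse-DNS names; verify against Google's docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

# Default User-Agent fragments sent by some common HTTP clients (illustrative, not exhaustive).
SCRIPTED_CLIENT_TAGS = ("python-requests", "python-urllib", "java/", "curl/")


def _reverse_dns(ip: str) -> str:
    """Return the primary hostname for an IP address (PTR lookup)."""
    return socket.gethostbyaddr(ip)[0]


def _forward_dns(host: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def classify_request(
    user_agent: str,
    ip: str,
    reverse_lookup: Callable[[str], str] = _reverse_dns,
    forward_lookup: Callable[[str], List[str]] = _forward_dns,
) -> str:
    """Hypothetical classifier: label a request from its User-Agent and source IP."""
    ua = user_agent.lower()
    if "googlebot" in ua:
        try:
            # Step 1: the IP must reverse-resolve to a Google-owned hostname.
            host = reverse_lookup(ip)
            # Step 2: that hostname must resolve back to the same IP.
            if host.endswith(GOOGLE_SUFFIXES) and ip in forward_lookup(host):
                return "verified-googlebot"
        except OSError:
            # DNS failures (socket.herror / socket.gaierror) count as unverified.
            pass
        return "impersonator"
    if any(tag in ua for tag in SCRIPTED_CLIENT_TAGS):
        return "scripted-client"
    return "unclassified"
```

The lookups are passed in as parameters so the logic can be tried without touching the network, e.g. `classify_request("Googlebot/2.1", "203.0.113.5", lambda ip: "evil.example.com", lambda h: [])` comes back as `"impersonator"`. A site could then allow the verified crawlers and block or throttle everything else. Real bot detection leans on much more than this (TLS fingerprints, behavior, IP reputation), since a User-Agent header is trivial to fake.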