• who@feddit.org · 14 days ago

    Unfortunately, robots.txt cannot express rate limits, so it would be an overly blunt instrument for things like GP describes. HTTP 429 would be a better fit.
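    To make the 429 idea concrete, here is a minimal sketch of a fixed-window rate limiter deciding when a server should answer 429 (with a Retry-After header) instead of 200. All names and the window/limit numbers are my own illustration, not anything from this thread:

    ```python
    # Illustrative sketch only: a tiny fixed-window rate limiter that decides
    # when a server should answer HTTP 429 (with Retry-After) instead of 200.
    # WINDOW/LIMIT values and all names here are made up for the example.
    import time

    WINDOW = 60   # seconds per window
    LIMIT = 30    # requests allowed per client per window

    _hits = {}    # client id -> timestamps of requests inside the window

    def check_request(client, now=None):
        """Return (status, retry_after): (200, None) if allowed, (429, seconds) if not."""
        now = time.time() if now is None else now
        recent = [t for t in _hits.get(client, []) if now - t < WINDOW]
        if len(recent) >= LIMIT:
            return 429, WINDOW   # tell the crawler when it may come back
        recent.append(now)
        _hits[client] = recent
        return 200, None
    ```

    Unlike a robots.txt rule, this lets the server enforce the limit per client and signal exactly how long a well-behaved crawler should back off.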

    • redjard@lemmy.dbzer0.com · 14 days ago

      Crawl-delay is just that: a simple directive you add to robots.txt to cap how often a crawler may fetch pages. It used to be widely followed by all but the worst crawlers …
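      For reference, the directive looks like this in robots.txt (the 10-second value is just an example; note that Crawl-delay was never part of the original robots.txt standard or of RFC 9309, crawlers that honor it interpret the number differently, and Google ignores it entirely):

      ```
      User-agent: *
      Crawl-delay: 10
      ```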