From: Mark Wielaard
Date: Mon, 21 Apr 2025 17:59:40 +0200
To: binutils@sourceware.org, elfutils-devel@sourceware.org, gcc@gcc.gnu.org, gdb@sourceware.org, libc-alpha@sourceware.org, libabigail@sourceware.org, newlib@sourceware.org
Cc: overseers@sourceware.org
Subject: scraperbot protection - Patchwork and Bunsen behind Anubis
Message-ID: <20250421155940.GE2323@gnu.wildebeest.org>

Hi hackers,

TL;DR: When using Patchwork https://patchwork.sourceware.org or Bunsen
https://builder.sourceware.org/testruns/ you might now have to enable
JavaScript. This should not impact any scripts, just browsers (or bots
pretending to be browsers). If it does cause trouble, please let us
know. If this works out we might also "protect" bugzilla, gitweb, cgit,
and the wikis this way.

We don't like to have to do this, but as some of you might have noticed,
Sourceware has been fighting the new AI scraperbots since the start of
the year. We are not alone in this.
https://lwn.net/Articles/1008897/
https://arstechnica.com/ai/2025/03/devs-say-ai-crawlers-dominate-traffic-forcing-blocks-on-entire-countries/

We have tried to isolate services more and to block various IP blocks
that were abusing the servers, but that has helped only so much.
Unfortunately the scraper bots are using lots of IP addresses (probably
by installing "free" VPN services that use normal user connections as
exit points) and are pretending to be common browsers/agents. It seems
we have to make access to some services depend on solving a JavaScript
challenge.

So we have installed Anubis https://anubis.techaro.lol/ in front of
Patchwork and Bunsen. This means that if you are using a browser that
identifies as Mozilla or Opera in its User-Agent, you will get a brief
page showing the happy anime girl; that page requires JavaScript to
solve a challenge and get a cookie to get through. Scripts and search
engines should get through without it. Removing Mozilla and/or Opera
from your User-Agent will also get you through without JavaScript.

We want to thank Xe Iaso, who helped us set this up and worked with us
over the Easter weekend solving some of our problems/typos. Please check
whether you want to become one of their patrons as a thank you.

https://xeiaso.net/notes/2025/anubis-works/
https://xeiaso.net/patrons/

Cheers,

Mark
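
For script authors, the User-Agent rule described above can be sketched roughly as follows. This is only an illustration of the behaviour stated in this message, not Anubis's actual matching logic: the case-sensitive substring check, the helper names `likely_challenged` and `build_request`, and the agent string `my-patch-script/0.1` are all assumptions made for the example. The sketch builds a request but does not send it.

```python
import urllib.request

# Tokens that, according to the message above, cause the JavaScript
# challenge page to be shown. Assumption: a plain, case-sensitive
# substring match; the real Anubis policy may match differently.
BROWSER_TOKENS = ("Mozilla", "Opera")


def likely_challenged(user_agent: str) -> bool:
    """Return True if the User-Agent contains a browser-like token."""
    return any(token in user_agent for token in BROWSER_TOKENS)


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a request with an explicit User-Agent.

    Raises ValueError if the chosen User-Agent looks like a browser and
    would therefore probably be asked to solve the challenge.
    """
    if likely_challenged(user_agent):
        raise ValueError(
            f"User-Agent {user_agent!r} looks like a browser; "
            "a script would probably get the challenge page"
        )
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    # Hypothetical script identifier, chosen only for this example.
    req = build_request("https://patchwork.sourceware.org", "my-patch-script/0.1")
    print(req.full_url, req.get_header("User-agent"))
```

A script that identifies itself honestly, without borrowing a browser's User-Agent string, should per the message above not see the challenge at all.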