From: Christophe Lyon via Gdb
Reply-To: Christophe Lyon
Date: Wed, 23 Apr 2025 18:56:31 +0200
Subject: Re: scraperbot protection - Patchwork and Bunsen behind Anubis
To: Mark Wielaard
Cc: binutils@sourceware.org, elfutils-devel@sourceware.org, gcc@gcc.gnu.org,
 gdb@sourceware.org, libc-alpha@sourceware.org, libabigail@sourceware.org,
 newlib@sourceware.org, overseers@sourceware.org
In-Reply-To: <20250421155940.GE2323@gnu.wildebeest.org>
References: <20250421155940.GE2323@gnu.wildebeest.org>
List-Id: Gdb mailing list

Hi!

Thanks for all the hard work maintaining all this fundamental infrastructure.

On Mon, 21 Apr 2025 at 18:00, Mark Wielaard wrote:
>
> Hi hackers,
>
> TLDR; When using https://patchwork.sourceware.org or Bunsen
> https://builder.sourceware.org/testruns/ you might now have to enable
> javascript. This should not impact any scripts, just browsers (or bots
> pretending to be browsers). If it does cause trouble, please let us
> know. If this works out we might also "protect" bugzilla, gitweb,
> cgit, and the wikis this way.
>
> We don't like to have to do this, but as some of you might have noticed
> Sourceware has been fighting the new AI scraperbots since the start of
> the year. We are not alone in this.
>
> https://lwn.net/Articles/1008897/
> https://arstechnica.com/ai/2025/03/devs-say-ai-crawlers-dominate-traffic-forcing-blocks-on-entire-countries/
>
> We have tried to isolate services more and block various ip-blocks
> that were abusing the servers. But that has helped only so much.

In terms of isolation, since I have no idea how or which services are
currently isolated, I may be asking obvious questions.

We were wondering if it would be possible and suitable to have https
requests served by one container, and ssh ones by another? Maybe
that's already the case though...

Speaking with CI in mind, the Linaro CI is currently severely impacted
by these scraperbots too:
- directly, because our git servers are also overloaded, so our build
  process often fails to check out build scripts & infra scripts
- indirectly, because when the above succeeds we may fail to connect to
  sourceware

So maybe it would help if we switched to ssh access for our CI user
when cloning GCC / binutils / etc. sources? If only the containers
serving https requests are impacted, ssh access could still work well?
(That would mean creating a Linaro-CI user on sourceware; I don't know
what the policy is.)

Thanks,

Christophe

> Unfortunately the scraper bots are using lots of ip addresses
> (probably by installing "free" VPN services that use normal user
> connections as exit point) and pretending to be common
> browsers/agents. We seem to have to make access to some services
> depend on solving a javascript challenge.
>
> So we have installed Anubis https://anubis.techaro.lol/ in front of
> patchwork and bunsen.
> This means that if you are using a browser that
> identifies as Mozilla or Opera in its User-Agent you will get a
> brief page showing the happy anime girl that requires javascript to
> solve a challenge and get a cookie to get through. Scripts and search
> engines should get through without. Also removing Mozilla and/or Opera
> from your User-Agent will get you through without javascript.
>
> We want to thank Xe Iaso who has helped us set this up and worked
> with us over the Easter weekend solving some of our problems/typos.
> Please check out if you want to be one of their patrons as thank you.
> https://xeiaso.net/notes/2025/anubis-works/
> https://xeiaso.net/patrons/
>
> Cheers,
>
> Mark
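
For anyone checking whether a script would hit the challenge: the
User-Agent rule described in the quoted announcement can be sketched
roughly as below. This only restates the rule as worded in the message
(a User-Agent containing "Mozilla" or "Opera" gets the JavaScript
challenge); it is not Anubis's actual policy code. The case-sensitive
substring match, the function names and the "example-ci-fetcher/0.1"
string are all assumptions made for illustration. The request is built
but never sent.

```python
import urllib.request

# Assumption: tokens taken from the wording of the announcement, matched as
# plain case-sensitive substrings. The real Anubis policy may differ.
CHALLENGED_TOKENS = ("Mozilla", "Opera")


def would_be_challenged(user_agent: str) -> bool:
    """Return True if the User-Agent contains a token that, per the
    announcement, leads to the JavaScript challenge page."""
    return any(token in user_agent for token in CHALLENGED_TOKENS)


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a request with an explicit User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical script identity; any string without the tokens above would do.
SCRIPT_UA = "example-ci-fetcher/0.1"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"

req = build_request("https://patchwork.sourceware.org", SCRIPT_UA)
print(would_be_challenged(BROWSER_UA), would_be_challenged(SCRIPT_UA))
```

If the rule is as described, a script that identifies itself with a
User-Agent like the hypothetical one above should get through without
JavaScript, while anything that copies a browser's User-Agent string
would see the challenge page.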