From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gdb-return-45701-listarch-gdb=sources.redhat.com@sourceware.org>
Received: (qmail 13981 invoked by alias); 3 Apr 2017 10:01:33 -0000
Mailing-List: contact gdb-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb.sourceware.org>
List-Subscribe: <mailto:gdb-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb/>
List-Post: <mailto:gdb@sourceware.org>
List-Help: <mailto:gdb-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: gdb-owner@sourceware.org
Received: (qmail 13943 invoked by uid 89); 3 Apr 2017 10:01:31 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-6.4 required=5.0 tests=BAYES_00,GIT_PATCH_2,RCVD_IN_DNSWL_NONE,RCVD_IN_SORBS_SPAM,SPF_PASS autolearn=ham version=3.3.2 spammy=price
X-HELO: mail-io0-f173.google.com
Received: from mail-io0-f173.google.com (HELO mail-io0-f173.google.com) (209.85.223.173) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 03 Apr 2017 10:01:28 +0000
Received: by mail-io0-f173.google.com with SMTP id z13so71893437iof.2        for <gdb@sourceware.org>; Mon, 03 Apr 2017 03:01:30 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;        d=1e100.net; s=20161025;        h=x-gm-message-state:mime-version:in-reply-to:references:from:date         :message-id:subject:to:cc;        bh=vQO3XmVQNakJQeJ2+2Q8YDbMiX6uTBiiiWA7m+WH7Og=;        b=mOMCPkD0rSxUCXfohPGzF/17aqhWfH9nmgmStrATnGUjEp9BvK1BDGOnynUkDqj+cG         5F77X+bIvqKZE3gEJV5A343DBuLaktrDpkvbw2dYW3q+onAi5f4O9PBke1Prc6C/1e1F         m5W89JU20UZNbLwdpBzGJUFGhFP6JpJkye7D0aq18i/aBTSp63pJjDPEH+S2SV4UxrjU         +oJ8KGmFinJVN6RaIa89RMBKfpFuXdWgwdqGn7M8NUU34cY55VXTzESOuane9rViUE1+         Jc1kt0QubMNrE9xyOWyg81W7bAm1/JFXD78kqracpat7cahrPwzqwE1LpSuEjpiM7aqS         Hd8Q==
X-Gm-Message-State: AFeK/H3+zSkKYqMQuYjA2UT7kZg/njphE35VXawNJL+CD1Yy/GANHcr7YyMiXsHlLnP6drTKbBU75OkYWgKg5IG+
X-Received: by 10.107.132.155 with SMTP id o27mr14688275ioi.138.1491213688562; Mon, 03 Apr 2017 03:01:28 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.107.10.27 with HTTP; Mon, 3 Apr 2017 03:01:28 -0700 (PDT)
In-Reply-To: <20170403094501.GN3750@e103592.cambridge.arm.com>
References: <1490194274-30569-1-git-send-email-Dave.Martin@arm.com> <CAKv+Gu_JUqQu7nUB3MfJrpHL19NTSzGBDmZqLejBx8pNkwSq+A@mail.gmail.com> <20170403094501.GN3750@e103592.cambridge.arm.com>
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Date: Mon, 03 Apr 2017 10:01:00 -0000
Message-ID: <CAKv+Gu8vUq0A8BEuLHT8e3w9xJ4fbQ3j7K+mVwJafbQP3Eku5A@mail.gmail.com>
Subject: Re: [RFC PATCH v2 00/41] Scalable Vector Extension (SVE) core support
To: Dave Martin <Dave.Martin@arm.com>
Cc: Florian Weimer <fweimer@redhat.com>, Christoffer Dall <christoffer.dall@linaro.org>, gdb@sourceware.org, 	Marc Zyngier <Marc.Zyngier@arm.com>, Catalin Marinas <catalin.marinas@arm.com>, Yao Qi <Yao.Qi@arm.com>, 	Will Deacon <will.deacon@arm.com>, 	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>, Szabolcs Nagy <szabolcs.nagy@arm.com>, 	Richard Sandiford <richard.sandiford@arm.com>, 	"linux-arm-kernel@lists.infradead.org" <linux-arm-kernel@lists.infradead.org>, Alan Hayward <alan.hayward@arm.com>, 	libc-alpha@sourceware.org, Andrew Morton <akpm@linux-foundation.org>, 	Torvald Riegel <triegel@redhat.com>, Joseph Myers <joseph@codesourcery.com>
Content-Type: text/plain; charset=UTF-8
X-SW-Source: 2017-04/txt/msg00004.txt.bz2

On 3 April 2017 at 10:45, Dave Martin <Dave.Martin@arm.com> wrote:
> On Fri, Mar 31, 2017 at 04:28:16PM +0100, Ard Biesheuvel wrote:
>> On 22 March 2017 at 14:50, Dave Martin <Dave.Martin@arm.com> wrote:
>>
>> Hi Dave,
>>
>> > The Scalable Vector Extension (SVE) [1] is an extension to AArch64 which
>> > adds extra SIMD functionality and supports much larger vectors.
>> >
>> > This series implements core Linux support for SVE.
>> >
>> [...]
>> > KERNEL_MODE_NEON (non-)support
>> > ------------------------------
>> >
>> > "arm64/sve: [BROKEN] Basic support for KERNEL_MODE_NEON" is broken.
>> > There are significant design issues here that need discussion -- see the
>> > commit message for details.
>> >
>> > Options:
>> >
>> >  * Make KERNEL_MODE_NEON a runtime choice, and disable it if SVE is
>> >    present.
>> >
>> >  * Fully SVE-ise the KERNEL_MODE_NEON code: this will involve complexity
>> >    and effort, and may involve unfavourable (and VL-dependent) tradeoffs
>> >    compared with the no-SVE case.
>> >
>> >    We will nonetheless need something like this if there is a desire to
>> >    support "kernel mode SVE" in the future.  The fact that with SVE,
>> >    KERNEL_MODE_NEON brings the cost of kernel-mode SVE but only the
>> >    benefits of kernel-mode NEON argues in favour of this.
>> >
>> >  * Make KERNEL_MODE_NEON a dynamic choice, and have clients run fallback
>> >    C code instead if at runtime on a case-by-case basis, if SVE regs
>> >    would otherwise need saving.
>> >
>> >    This is an interface break, but all NEON-optimised kernel code
>> >    necessarily requires a fallback C implementation to exist anyway, and
>> >    the number of clients is not huge.
>> >
>> > We could go for a stopgap solution that at least works but is suboptimal
>> > for SVE systems (such as the first choice above), and then improve it
>> > later.
>> >
>>
>> Without having looked at the patches in detail yet, let me reiterate
>> my position after we discussed this when this series was sent out the
>> first time around.
>>
>> - The primary use case for kernel mode NEON is special purpose
>> instructions, i.e., AES is 20x faster when using the NEON, simply
>> because that is how one accesses the logic gates that implement the
>> AES algorithm. There is nothing SIMD or FP in nature about AES.
>> Compare the CRC extensions, which use scalar registers and
>> instructions. Of course, there are a couple of exceptions in the form
>> of bit-slicing algorithms, but in general, like general SIMD, I don't
>> think it is highly likely that SVE in kernel mode is something we will
>> have a need for in the foreseeable future.
>
> Certainly there is no immediate need for this, and if we decide we never
> need it then that helps us avoid some complexity.
>
> My main concern is around the extra save/restore cost, given that if
> the SVE registers are live then save/restore of the full SVE vectors
> is needed even if only FPSIMD is used in the meantime.
>
>> - The current way of repeatedly stacking/unstacking NEON register
>> contents in interrupt context is highly inefficient, given that we are
>> usually interrupting user mode, not kernel mode, and so stacking once
>> and unstacking when returning from the exception (which is how we
>> usually deal with it) would be much better. So changing the current
>> implementation to perform the eager stack/unstack only when a kernel
>> mode NEON call is already in progress is likely to improve our current
>> situation already, regardless of whether such a change is needed to
>> accommodate SVE. Note that to my knowledge, the only in-tree users of
>> kernel mode NEON operate in process context or softirq context, never
>> in hardirq context.
>
> Reassuring.
>
>> Given the above, I think it is perfectly reasonable to conditionally
>> disallow kernel mode NEON in hardirq context. The crypto routines that
>> rely on it can easily be fixed up (I already wrote the patches for
>> that). This would only be necessary on SVE systems, and the reason for
>> doing so is that - given how preserving and restoring the NEON
>> register file blows away the upper SVE part of the registers - whoever
>> arrives at the SVE-aware preserve routine first should be allowed to
>> run to completion. This does require disabling softirqs during the
>> time the preserved NEON state is being manipulated but that does not
>> strike me as a huge price to pay. Note that both restrictions
>> (disallowing kernel mode NEON in hardirq context, and the need to
>> disable softirqs while manipulating the state) could be made runtime
>> dependent on whether we are actually running on an SVE system.
>
> Given that we already bail out of kernel_neon_begin() with a WARN() if
> the hardware lacks FPSIMD, I'm not sure it would be worse to do the same
> if SVE is present.
>
> However, we should probably abstract that check: currently, drivers
> seemt to be using a cocktail of Kconfig dependencies,
> MODULE_DEVICE_TABLE() and runtime hwcap checks in order to deduce
> whether kernel_neon_begin() is safe to use.
>

Not quite. The Kconfig dependencies are about
CONFIG_KERNEL_MODE_NEON=y being set, without which it is pretty
pointless to build the modules that depend on it.

The hwcap dependencies are mostly about the actual instructions
themselves, i.e., AES, PMULL, SHA1/2 etc

> Would you be happy with a single API for checking whether
> kernel_neon_begin() works?  Maintaining this check in every driver
> doesn't feel very scalable.
>

Yes, that makes sense, but it is unrelated to hwcap. I'd like to see a
kernel_neon_allowed() that would return '!IS_ENABLED(CONFIG_SVE) ||
!(elf_hwcap & HWCAP_SVE) || !in_irq()' at some point, although I
understand you want to remove the last part for now.
To short-circuit some decisions in the driver when
kernel_neon_allowed() is always true (as it would be currently),
something like kernel_neon_need_fallback() comes to mind.

>
> This would allow SVE to disable KERNEL_MODE_NEON without having to make
> them conflict in Kconfig.  This wouldn't be our end goal, but it allows
> the dependency to be decoupled while we figure out a better solution.
>

I still don't think there is a need to disable kernel mode NEON
altogether. But we can defer that decision until later, but prepare
the existing code to deal with that in the mean time.

As I mentioned, I already looked into this a while ago, so I can
quickly respin the changes to the crypto code to take this into
account.