arm64: percpu: Implement this_cpu operations

The generic this_cpu operations disable interrupts to ensure that the
requested operation is protected from pre-emption. For arm64, this is
overkill and can hurt throughput and latency.

This patch provides arm64 specific implementations for the this_cpu
operations. Rather than disable interrupts, we use the exclusive
monitor or atomic operations as appropriate.

The following operations are implemented: add, add_return, and, or,
read, write, xchg. We also wire up a cmpxchg implementation from
cmpxchg.h.

Testing was performed using the percpu_test module and hackbench on a
Juno board running 3.18-rc4.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2 files changed