3e69410e29b0cc4e8a6e9712d7c980d702597d62 - platform/external/libopus

commit	3e69410e29b0cc4e8a6e9712d7c980d702597d62
author	Timothy B. Terriberry <tterribe@xiph.org>	Fri Feb 09 16:42:15 2024 -0500
committer	Jean-Marc Valin <jmvalin@jmvalin.ca>	Fri Feb 09 20:39:58 2024 -0500
tree	734aa456ff9eafa4ce0e88f83417f8e7ba2b8d45
parent	d503125101116d2b399287824d7902b6351b691d

Fix OOB read in fixed-point NEON intrinsics.

xcorr_kernel_neon_fixed() read one more sample from y[] in the
 main loop than it needed, to allow use of vector loads, but unlike
 the native asm in celt_pitch_xcorr_arm.s, the loop condition did
 not exit early enough to prevent this from overrunning the end of
 the array.
Additionally, the tail loop _always_ read one value beyond what it
 needed.
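
For orientation, here is a simplified scalar sketch of what a four-lag
cross-correlation kernel computes. It is not the code from
celt_neon_intr.c: the name xcorr_kernel_scalar() and the plain integer
arithmetic are assumptions made for illustration. It shows that the
highest index of y[] a correct kernel needs is len + 2, so any read of
y[len + 3] is past the data the kernel requires.

```c
#include <stdint.h>

/* Hypothetical scalar reference, not the libopus implementation.
 * Accumulates four correlation lags at once:
 *   sum[k] += x[j] * y[j + k]   for j in [0, len), k in [0, 4).
 * The largest y index touched is (len - 1) + 3 = len + 2, so the
 * caller must provide at least len + 3 valid samples in y[]. */
static void xcorr_kernel_scalar(const int16_t *x, const int16_t *y,
                                int32_t sum[4], int len)
{
    int j;
    int k;
    for (j = 0; j < len; j++) {
        for (k = 0; k < 4; k++) {
            sum[k] += (int32_t)x[j] * (int32_t)y[j + k];
        }
    }
}
```

A vectorized version that loads y[] in fixed-size groups has to make
sure that no group extends beyond index len + 2.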

This patch fixes the loop condition on the main loop.
Since this makes the tail section run even for lengths that are a
 multiple of 8 (e.g., on fully half the multiplies for usages like
 celt_fir() or celt_iir() with an order of 16, which is common),
 rather than try to fix the tail loop, we replace it with a
 non-looping adaptation of the native asm, which continues to use
 vector loads as much as possible for the remaining elements (and
 also does not read ahead past the end of the y[] array).
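
The effect of the loop condition can be sketched with a small index
model. This is not the patched code. It assumes, for illustration only,
that each main-loop iteration handles 8 samples of x[] and loads 12
consecutive samples of y[] (one more than the 11 it needs), and that
y[] holds len + 3 valid samples. The function names and the `strict`
flag are hypothetical.

```c
/* Hypothetical model of the main loop's reads from y[].
 * Assumption: the iteration starting at offset j loads y[j .. j+11].
 * strict == 0 models a condition of the form j + 8 <= len;
 * strict != 0 models a condition of the form j + 8 <  len.
 * Returns the highest y index loaded, or -1 if the loop never runs. */
static int max_y_index_read(int len, int strict)
{
    int max_index = -1;
    int j;
    for (j = 0; strict ? (j + 8 < len) : (j + 8 <= len); j += 8) {
        max_index = j + 11;
    }
    return max_index;
}

/* Last valid index of y[] under the len + 3 assumption. */
static int last_valid_y_index(int len)
{
    return len + 2;
}
```

Under these assumptions, with len = 16 the non-strict condition loads
up to y[19] while the last valid index is 18, and the strict condition
stops at y[11] and leaves the final 8 samples to the tail section. This
is why the tail now also runs for lengths that are a multiple of 8.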

Overall slowdown of test_opus_encode on a Raspberry Pi 5 Model B
 Rev 1.0 is 0.12% vs. 0.13% for fixing the existing tail loop.

Signed-off-by: Jean-Marc Valin <jmvalin@jmvalin.ca>

celt/arm/celt_neon_intr.c

1 file changed