Commit 65aa002d authored by Martin Storsjö

aarch64: vp9itxfm: Avoid reloading the idct32 coefficients


The idct32x32 function actually pushed d8-d15 onto the stack even
though it didn't clobber them; there are plenty of registers that
can be used to allow keeping all the idct coefficients in registers
without having to reload different subsets of them at different
stages in the transform.

After this change, we can still skip pushing d12-d15.

Before:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8128.3
After:
vp9_inv_dct_dct_32x32_sub32_add_neon: 8053.3

Signed-off-by: Martin Storsjö <martin@martin.st>
parent 402546a1