path: root/libbb/hash_md5_sha_x86-64.S.sh
Age  Commit message  Author
2022-08-30  libbb: mark stack in assembly files read-only  Ludwig Nussel
Signed-off-by: Ludwig Nussel <ludwig.nussel@suse.de>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-02-11  whitespace fixes  Denys Vlasenko
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-02-11  libbb/sha1: revert last commit: pshufb is a SSSE3 insn, can't use it  Denys Vlasenko
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-02-11  libbb/sha1: shrink unrolled x86-64 code  Denys Vlasenko
function                           old     new   delta
sha1_process_block64              3481    3384     -97
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-02-08  libbb/sha1: shrink unrolled x86-64 code  Denys Vlasenko
function                           old     new   delta
sha1_process_block64              3482    3481      -1
.rodata                         108460  108412     -48
------------------------------------------------------------------------------
(add/remove: 1/4 grow/shrink: 0/2 up/down: 0/-49)             Total: -49 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-02-07  libbb/sha1: shrink and speed up unrolled x86-64 code  Denys Vlasenko
function                           old     new   delta
sha1_process_block64              3514    3482     -32
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-02-03  libbb/sha256: optional x86 hardware accelerated hashing  Denys Vlasenko
64 bit:
function                           old     new   delta
sha256_process_block64_shaNI         -     730    +730
.rodata                         108314  108586    +272
sha256_begin                        31      83     +52
------------------------------------------------------------------------------
(add/remove: 5/1 grow/shrink: 2/0 up/down: 1055/-1)         Total: 1054 bytes
32 bit:
function                           old     new   delta
sha256_process_block64_shaNI         -     747    +747
.rodata                         104318  104590    +272
sha256_begin                        29      84     +55
------------------------------------------------------------------------------
(add/remove: 5/1 grow/shrink: 2/0 up/down: 1075/-1)         Total: 1074 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-25  libbb/sha1: in unrolled x86-64 code, pass initial W[] in registers, not on stack  Denys Vlasenko
This can be faster on some CPUs. On Skylake, evidently load latency from L1 (or store-to-load forwarding in LSU) is fast enough to completely hide memory reference latencies here.
function                           old     new   delta
sha1_process_block64              3495    3514     +19
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-23  libbb/sha1: use SSE2 in unrolled x86-64 code. ~10% faster  Denys Vlasenko
function                           old     new   delta
.rodata                         108241  108305     +64
sha1_process_block64              3502    3495      -7
------------------------------------------------------------------------------
(add/remove: 5/0 grow/shrink: 1/1 up/down: 64/-7)              Total: 57 bytes
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-08  libbb/sha1: add a comment  Denys Vlasenko
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-08  whitespace fix  Denys Vlasenko
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-04  libbb/sha1: x86_64 version: reorder prologue/epilogue insns  Denys Vlasenko
Not clear exactly why, but this increases hashing speed on Skylake from 454 MB/s to 464 MB/s.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-03  libbb/sha1: x86_64 version: tidying up, no code changes  Denys Vlasenko
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-03  typo fix  Denys Vlasenko
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
2022-01-03  libbb/sha1: x86_64 version: generate from a script, optimize a bit  Denys Vlasenko
function                           old     new   delta
sha1_process_block64              3569    3502     -67
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>