Hi
MBEDTLS_SHA256_USE_A64_CRYPTO_IF_PRESENT
If you're building the software to run on a system that you know has the crypto extensions, then use MBEDTLS_SHA256_USE_A64_CRYPTO_ONLY - it will be (marginally) faster. There are few aarch64 systems without the crypto extensions, but one of them is the Raspberry Pi, which is used widely.
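For reference, a minimal sketch of how the two options would appear in mbedtls_config.h. Only the macro names come from this thread; the comments are a summary, and the two options are alternatives, so only one should be enabled.

```cpp
/* mbedtls_config.h (fragment): enable ONE of the two SHA-256 options. */

/* Detect the Armv8-A crypto extensions at runtime and use them when present,
 * otherwise fall back to the portable C implementation. */
#define MBEDTLS_SHA256_USE_A64_CRYPTO_IF_PRESENT

/* Alternative for targets known to have the extensions: no runtime check,
 * no C fallback. Leave commented out unless the option above is removed. */
/* #define MBEDTLS_SHA256_USE_A64_CRYPTO_ONLY */
```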
Is it possible to slice a big file into chunks and compute hash separately and merge?
No, the hash algorithms are sequential.
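To illustrate what "sequential" means here, a toy sketch. It uses 64-bit FNV-1a purely as a stand-in (it is not SHA-256 and not cryptographically secure), and ToyHash, hash_whole and hash_in_chunks are made-up names. The state left by one chunk is the starting point for the next, so chunks have to be fed in order, but they do not all have to be in memory at once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy stand-in for a sequential hash: 64-bit FNV-1a (NOT SHA-256).
struct ToyHash {
    std::uint64_t state = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis

    // Mix one chunk into the running state; the result depends on all earlier chunks.
    void update(const std::string& chunk) {
        for (unsigned char byte : chunk) {
            state ^= byte;
            state *= 1099511628211ULL;  // FNV-1a 64-bit prime
        }
    }
};

// Hash the whole input in a single call.
std::uint64_t hash_whole(const std::string& data) {
    ToyHash h;
    h.update(data);
    return h.state;
}

// Hash the same input as consecutive chunks, carrying the state between them.
std::uint64_t hash_in_chunks(const std::string& data, std::size_t chunk_size) {
    assert(chunk_size > 0);
    ToyHash h;
    for (std::size_t pos = 0; pos < data.size(); pos += chunk_size) {
        h.update(data.substr(pos, chunk_size));
    }
    return h.state;
}
```

Feeding the chunks in order gives the same value as hashing everything at once, which is what the init/update/finish loop in the original code relies on. Hashing chunks independently would start each one from the initial state, and there is no general way to merge those results into the digest of the whole file.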
I've seen up to around 2 GB/s raw hashing speed (i.e. on data in memory) on Apple Silicon.
int BUFFER_SIZE = 4096
That seems very short. Even though fread() is buffered, a quick google suggests a typical buffer size of 8 KB, which means lots of calling into the kernel and context switches. I'd be inclined to read 512 MB at a time.
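A rough sketch of reading with a configurable, heap-allocated buffer, so the chunk size can be tuned without putting a huge array on the stack. read_in_chunks and its sink callback are hypothetical names for illustration and are not part of Mbed TLS; in the code from the original post the sink would be a call to mbedtls_sha256_update(&ctx, data, len).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Reads `path` in chunks of at most `chunk_size` bytes and hands each chunk to `sink`.
// Returns the total number of bytes read, or -1 if the file cannot be opened.
long long read_in_chunks(
    const std::string& path, std::size_t chunk_size,
    const std::function<void(const std::uint8_t*, std::size_t)>& sink) {
    assert(chunk_size > 0);
    std::FILE* fp = std::fopen(path.c_str(), "rb");  // binary mode
    if (fp == nullptr) {
        return -1;
    }
    std::vector<std::uint8_t> buffer(chunk_size);  // heap, so large sizes are fine
    long long total = 0;
    std::size_t n;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), fp)) > 0) {
        sink(buffer.data(), n);
        total += static_cast<long long>(n);
    }
    std::fclose(fp);
    return total;
}
```

The best chunk size depends on the system, so it is worth timing a few values rather than assuming one.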
But if you want the fastest processing, the thing to do is benchmark the libraries you have access to (Mbed TLS, OpenSSL, WolfSSL come to mind) on the different systems you have access to (aarch64, x86_64) and use the winner.
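One possible shape for such a benchmark, as a sketch only: time a hash callable over data already in memory, so disk speed is excluded. measure_throughput_mb_per_s is a hypothetical helper, not a library function; each library's one-shot SHA-256 call would be wrapped in the hash_fn callable and the results compared per machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Calls `hash_fn` on an in-memory buffer `repetitions` times and returns the
// measured throughput in MB/s (0.0 if the elapsed time is too small to measure).
double measure_throughput_mb_per_s(
    const std::function<void(const std::uint8_t*, std::size_t)>& hash_fn,
    std::size_t buffer_bytes, int repetitions) {
    assert(buffer_bytes > 0 && repetitions > 0);
    std::vector<std::uint8_t> data(buffer_bytes, 0x5A);  // arbitrary fill byte

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i) {
        hash_fn(data.data(), data.size());
    }
    const auto stop = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    const double megabytes = static_cast<double>(buffer_bytes) * repetitions / 1e6;
    return seconds > 0.0 ? megabytes / seconds : 0.0;
}
```

Running it with the same buffer size and repetition count for every library keeps the comparison fair.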
Thanks
Tom
________________________________
From: James Liu <icefrog1950@gmail.com>
Sent: 24 October 2022 12:35
To: Tom Cosgrove <Tom.Cosgrove@arm.com>
Cc: mbed-tls@lists.trustedfirmware.org
Subject: Re: [mbed-tls] Performance tuning of SHA256 on big files
Hi,
Thanks for the tip. I tested mbedtls-3.2.1 on an M1 by adding two options to mbedtls_config.h:
MBEDTLS_SHA256_USE_A64_CRYPTO_IF_PRESENT, MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT.
There are substantial improvements on two big files using sha256:
CentOS-8.5.2111-x86_64-boot.iso (827.3 MB): (before) 5.9 sec, (after) 3.2 sec
CentOS-8.5.2111-x86_64-boot.iso (10.79 GB): (before) 78 sec, (after) 41 sec
But the problem I'm trying to solve is still there: 1) sha256 still incurs high overhead on big files (a few seconds at most is desired), given that there are many big files to process in real time; 2) I'm not sure whether this tuning works on x86.
Is it possible to slice a big file into chunks and compute hash separately and merge? I guess other crypto libraries and utilities have the same overhead on big files.
Regards
On Mon, 24 Oct 2022 at 16:24, Tom Cosgrove <Tom.Cosgrove@arm.com> wrote:
Hi
I use same code with mbedtls-3.1.0 to run tests in x86, and performance is still downgraded
Mbed TLS has no acceleration for SHA-256 on x86 or x86_64 - optional or otherwise - it just uses C code. So this is as expected.
Thanks
Tom
________________________________
From: Liu James via mbed-tls <mbed-tls@lists.trustedfirmware.org>
Sent: 22 October 2022 10:28
To: mbed-tls@lists.trustedfirmware.org <mbed-tls@lists.trustedfirmware.org>
Subject: [mbed-tls] Performance tuning of SHA256 on big files
Hi,
This is an updated post from https://github.com/Mbed-TLS/mbedtls/issues/6464, which should have been posted to the Mbed TLS mailing list.
My question is how to significantly improve SHA256 performance on big files (regardless of architectures).
=== Updates
I use same code with mbedtls-3.1.0 to run tests in x86, and performance is still downgraded.
Mbed TLS version (number or commit id): 3.1.0
Operating system and version: Centos-8.5, CPU 11900K
Configuration (if not default, please attach mbedtls_config.h):
Compiler and options (if you used a pre-built binary, please indicate how you obtained it): gcc/g++ 8.5
Additional environment information:
Test files and performance
CentOS-8.5.2111-x86_64-boot.iso (827.3 MB): sha256 5 sec
CentOS-8.5.2111-x86_64-boot.iso (10.79 GB): sha256 66 sec
Also, as advised, I tried to turn on "MBEDTLS_SHA256_USE_A64_CRYPTO_IF_PRESENT" and "MBEDTLS_SHA512_USE_A64_CRYPTO_IF_PRESENT" using mbedtls-3.2.0 on M1, but the build reported the following error:
CMake Error at library/CMakeLists.txt:257 (add_library):
  Cannot find source file:
    psa_crypto_driver_wrappers.c
  Tried extensions .c .C .c++ .cc .cpp .cxx .cu .mpp .m .M .mm .ixx .cppm .h .hh .h++ .hm .hpp .hxx .in .txx .f .F .for .f77 .f90 .f95 .f03 .hip .ispc
CMake Error at library/CMakeLists.txt:257 (add_library):
  No SOURCES given to target: mbedcrypto
Thanks for your help.
=== Original message at github
Summary
sha256() and sha1() incur significant overhead on big files (roughly 1 GB and above). This might not be a bug; I'm looking for an efficient way to calculate hashes of big files.
System information
Mbed TLS version (number or commit id): 3.1.0
Operating system and version: M1 OSX
Configuration (if not default, please attach mbedtls_config.h):
Compiler and options (if you used a pre-built binary, please indicate how you obtained it): Clang++
Additional environment information:
Expected behavior
Fast hash calculation on big files, in less than 1 second
Actual behavior
Test files:
CentOS-8.5.2111-x86_64-boot.iso (827.3 MB): sha1 3.3 sec, sha256 5.9 sec
CentOS-8.5.2111-x86_64-boot.iso (10.79 GB): sha1 40 sec, sha256 78 sec
Steps to reproduce
ISO files can be downloaded at: http://ftp.iij.ad.jp/pub/linux/centos-vault/8.5.2111/isos/x86_64/
Make sure to store the ISO files on a fast disk, say NVMe, or else loading big files could take a lot of time. Also, use the "user" time reported by the time command to measure performance.
Workable code of sha256:
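#include <cstdint>
#include <cstdio>
#include <string>
#include "mbedtls/sha256.h"
using std::string;

string test_sha256(string file_path) {
    mbedtls_sha256_context ctx;
    FILE *fp;
    string output;
    int BUFFER_SIZE = 4096;
    uint8_t buffer[BUFFER_SIZE];  // variable-length array: a GCC/Clang extension in C++
    size_t read;
    uint8_t hash[32];

    mbedtls_sha256_init(&ctx);
    mbedtls_sha256_starts(&ctx, 0);  // 0 selects SHA-256 rather than SHA-224

    fp = fopen(file_path.c_str(), "rb");  // binary mode, so bytes are hashed as stored
    if (fp == NULL) {
        mbedtls_sha256_free(&ctx);
        return output;
    }

    // Feed the file to the hash one buffer at a time.
    while ((read = fread(buffer, 1, BUFFER_SIZE, fp)) > 0) {
        mbedtls_sha256_update(&ctx, buffer, read);
    }

    mbedtls_sha256_finish(&ctx, hash);

    mbedtls_sha256_free(&ctx);
    fclose(fp);

    // update hash string, omit here
    return output;
}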