Bugzilla – Bug 345510
CryptoStream ignores the ICryptoTransform.CanTransformMultipleBlocks property when reading
Last modified: 2011-05-16 16:32:05 UTC
As the summary specifies, the CryptoStream.Read method ignores the CanTransformMultipleBlocks property of its associated ICryptoTransform, so it always transforms blocks one by one, that is, ICryptoTransform.InputBlockSize bytes at a time. This behaviour, though not optimal, is usually acceptable for encryption algorithms, whose block sizes are typically in the 16-64 byte range. However, hash algorithms can also be used in a CryptoStream (they don't affect the data passing through them, just hash it and make the result available in their Hash property), but their InputBlockSize is fixed at 1 byte regardless of the true implementation block size, because they buffer data internally.

This brings about the doomsday scenario where trying to hash a 3 MB file (which takes less than 0.1 seconds with sha1sum, I/O time included) results in _at least_ three million calls* to ICryptoTransform.TransformBlock and a similar number to Buffer.BlockCopy. In other words: both CryptoStream and the hash algorithm are buffering the data, having fun creating, copying and moving around millions of 1-byte arrays. In fact, 63 out of 64 calls to SHA1Managed.TransformBlock did nothing other than buffer one more byte and return. The speed penalty is massive, as hashing the file took nearly 2 seconds. This makes the CryptoStream + HashAlgorithm combination _unusable_ for its usual purpose, i.e. checking large streams of data.

THUS: given that HashAlgorithm.InputBlockSize is fixed at 1 for MS compatibility, CryptoStream.Read needs to take full advantage of transforms that can process multiple "blocks" in a single call to TransformBlock (as hash algorithms can). This is not as easy as it seems, since one always needs to know whether the underlying stream has reached its end in order to call TransformFinalBlock, but it is performance-critical. Both the full MS implementation and Mono's CryptoStream.Write method already do it fine.

* All figures about calls were obtained using mono --profile. Elapsed times were measured _without_ the --profile option.
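To make the effect of the property concrete, here is a minimal, self-contained sketch (not Mono's actual CryptoStream code) contrasting the two feeding strategies. Because HashAlgorithm reports CanTransformMultipleBlocks == true, an entire buffer can go through a single TransformBlock call instead of one call per InputBlockSize byte; both strategies produce the same digest.

```csharp
using System;
using System.Security.Cryptography;

class MultiBlockDemo {

	static void Main ()
	{
		byte[] data = new byte[1 << 16];
		new Random (42).NextBytes (data);

		// Byte-by-byte, mimicking what CryptoStream.Read does today for
		// hashes (InputBlockSize == 1): one TransformBlock call per byte.
		HashAlgorithm slow = SHA1.Create ();
		for (int i = 0; i < data.Length; i++)
			slow.TransformBlock (data, i, 1, null, 0);
		slow.TransformFinalBlock (data, 0, 0);

		// Batched: CanTransformMultipleBlocks is true for hash
		// algorithms, so the whole buffer can be pushed through a
		// single TransformBlock call.
		HashAlgorithm fast = SHA1.Create ();
		fast.TransformBlock (data, 0, data.Length, null, 0);
		fast.TransformFinalBlock (data, 0, 0);

		// Same digest either way; only the call count differs.
		Console.WriteLine (BitConverter.ToString (slow.Hash) ==
		                   BitConverter.ToString (fast.Hash));
	}
}
```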
Downgrading priority to Normal (actually it should probably be moved to Enhancement, since it's an optimization request).

Specific note: it's better to include a self-contained, compilable test case with benchmark numbers than to be too verbose (anyone fixing this already knows how CryptoStream works ;-) Seriously, CryptoStream can be used in many different ways, so my own code may be different from yours.

General note: CryptoStream should be avoided for hash operations (it's way more complex than required for this case) unless you are chaining transforms. HashAlgorithm provides a ComputeHash(Stream) method that will be a lot faster (even with an optimized CryptoStream).
Here are my results for a 90 MB file using sha1sum, HashAlgorithm.ComputeHash and CryptoStream:

poupou@pollux:~/src/bugzilla> l ~/Desktop/RX-34_2007SE_4.2007.26-8_PR_COMBINED_MR0_ARM.bin
-rw-r--r-- 1 poupou users 90298052 2007-09-07 17:43 /home/poupou/Desktop/RX-34_2007SE_4.2007.26-8_PR_COMBINED_MR0_ARM.bin

poupou@pollux:~/src/bugzilla> time sha1sum ~/Desktop/RX-34_2007SE_4.2007.26-8_PR_COMBINED_MR0_ARM.bin
8f932c938a677f9e5a67070aa3dee0c170a32f69  /home/poupou/Desktop/RX-34_2007SE_4.2007.26-8_PR_COMBINED_MR0_ARM.bin
real    0m2.314s
user    0m1.440s
sys     0m0.104s

poupou@pollux:~/src/bugzilla> time mono 345510.exe SHA1 hash ~/Desktop/RX-34_2007SE_4.2007.26-8_PR_COMBINED_MR0_ARM.bin
8F-93-2C-93-8A-67-7F-9E-5A-67-07-0A-A3-DE-E0-C1-70-A3-2F-69
real    0m3.556s
user    0m3.416s
sys     0m0.072s

poupou@pollux:~/src/bugzilla> time mono 345510.exe SHA1 cryptostream ~/Desktop/RX-34_2007SE_4.2007.26-8_PR_COMBINED_MR0_ARM.bin
8F-93-2C-93-8A-67-7F-9E-5A-67-07-0A-A3-DE-E0-C1-70-A3-2F-69
real    0m3.550s
user    0m3.444s
sys     0m0.048s

I'm actually impressed that the latter is as fast as ComputeHash, since it involves extra steps.
This also means that my CryptoStream code is different than yours ;-) Here's mine:

using System;
using System.IO;
using System.Security.Cryptography;

class Digest {

	private const int Size = 4096;

	static byte[] hash (HashAlgorithm algo, Stream s)
	{
		return algo.ComputeHash (s);
	}

	static byte[] cryptostream (HashAlgorithm algo, Stream s)
	{
		// note: it could make sense to use a BufferedStream
		byte[] data = new byte[Size];
		using (CryptoStream cs = new CryptoStream (Stream.Null, algo, CryptoStreamMode.Write)) {
			int size = s.Read (data, 0, Size);
			while (size > 0) {
				cs.Write (data, 0, size);
				size = s.Read (data, 0, Size);
			}
		}
		return algo.Hash;
	}

	// args: algorithm method filename
	static void Main (string [] args)
	{
		using (FileStream fs = File.OpenRead (args [2])) {
			using (HashAlgorithm algo = HashAlgorithm.Create (args [0])) {
				byte[] digest = null;
				switch (args [1].ToLower ()) {
				case "hash":
					digest = hash (algo, fs);
					break;
				case "cryptostream":
					digest = cryptostream (algo, fs);
					break;
				default:
					throw new NotSupportedException (args [1]);
				}
				Console.WriteLine (BitConverter.ToString (digest));
			}
		}
	}
}

Downgrading priority to Enhancement since the original test case _probably_ occurs only when reading byte-by-byte (which is bad for performance, crypto or not). Please attach your own test case so I can see where the bottleneck lies in this particular case.
As I made pretty clear, the bug is in CryptoStream.Read, not in CryptoStream.Write, which is what you use in your code example. I know there are other ways I can go - ComputeHash is a good one for a single algorithm, but I'm chaining them. Also, I'd rather not use CryptoStream.Write in order to read data from a file: that seems too much of a hack to me. This is a sample of the code I'm using, modified to remove some Trace output and instances of classes from my project:

static Stack<byte[]> HashFile (string fileName, HashAlgorithm[] algorithms)
{
	if (fileName != null && algorithms != null && File.Exists (fileName)) {
		Stack<byte[]> output = new Stack<byte[]> ();

		foreach (HashAlgorithm alg in algorithms)
			alg.Initialize ();

		using (FileStream source = File.OpenRead (fileName)) {
			// Chain all hash CryptoStreams: the output of a hash
			// ICryptoTransform is the same as its input, while the
			// result is saved in the HashAlgorithm instance.
			Stack<Stream> pipes = new Stack<Stream> (algorithms.Length + 1);
			pipes.Push (source);
			foreach (HashAlgorithm alg in algorithms)
				pipes.Push (new CryptoStream (pipes.Peek (), alg, CryptoStreamMode.Read));
			Stream lastPipe = pipes.Peek ();

			// Set up a temporary buffer to read and hash just a chunk
			// at a time instead of reading a possibly very large file
			// into memory and then hashing it.
			byte[] readBuffer = new byte[READBUFFER_SIZE];

			// The real hashing is done here - on reading, data passes
			// through all the chained CryptoStreams. For all that
			// matters, the loop body could well be empty.
			while (0 < lastPipe.Read (readBuffer, 0, READBUFFER_SIZE)) {
			}

			lastPipe.Close (); // closes all underlying Streams
		}

		// Get the hash values and return them
		foreach (HashAlgorithm alg in algorithms)
			output.Push (alg.Hash);
		return output;
	} else
		return null;
}

As you can see, I'm using hash chaining, so resorting to CryptoStream.Write would mean jumping through so many hoops that it'd probably not be useful.
Besides, this code works correctly and efficiently on .NET 2.0, while on Mono it takes the time I stated in the first post (about 2 seconds) to hash a 3 MB file (that's 1/30 the size of your test file).
By the way, I've experimented with many values for the READBUFFER_SIZE constant, from 1K through 4K (the virtual memory page size) up to 512K, among others.
"As I made pretty clear" myself, being less verbose and "includ[ing] a self-contained, compilable test case" ensures less confusion about the issues ;-) Now this is clearly a corner case, with several workarounds. I'll take a look at this when I have time assigned to optimizing code, or if bugs (like bad results) are found around that code. In the meantime you're welcome to attach a patch implementing the missing CanTransformMultipleBlocks optimization on reads.
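For reference, a rough sketch of the shape such an optimization would take - this is not a patch against Mono's CryptoStream internals, just a standalone helper showing the read loop an optimized CryptoStream.Read could use for a transform with CanTransformMultipleBlocks: pull a large chunk from the source, push it through a single TransformBlock call, and call TransformFinalBlock only once the stream is exhausted.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

class BatchedRead {

	// Read large chunks and feed each one through a single
	// TransformBlock call, relying on CanTransformMultipleBlocks;
	// TransformFinalBlock is called exactly once, at end of stream.
	static byte[] HashStream (HashAlgorithm algo, Stream source)
	{
		byte[] chunk = new byte[4096];
		int read;
		while ((read = source.Read (chunk, 0, chunk.Length)) > 0)
			algo.TransformBlock (chunk, 0, read, null, 0);
		algo.TransformFinalBlock (chunk, 0, 0);
		return algo.Hash;
	}

	static void Main ()
	{
		// Sanity check: the batched loop matches ComputeHash.
		byte[] data = new byte[100000];
		byte[] a = HashStream (SHA1.Create (), new MemoryStream (data));
		byte[] b = SHA1.Create ().ComputeHash (data);
		Console.WriteLine (BitConverter.ToString (a) ==
		                   BitConverter.ToString (b));
	}
}
```

The tricky part the bug report mentions remains visible even here: the loop must detect end-of-stream (Read returning 0) before it knows it is safe to call TransformFinalBlock.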