Array<UInt8> is significantly slower than Data
While trying to debug some performance issues I was having today, I realised the root of my problems was an insanely slow parse in Multipart. Not that the work I was doing was too slow - the parser itself was slow.
I opened up the project in Xcode and added a simple test:
measure {
let parser = try! Parser(boundary: "MimeBoundary_8923gr836bidfb3i")
var parts: [Part] = []
parser.onPart = { part in
parts.append(part)
}
parts = []
try! parser.parse(message)
XCTAssertEqual(parts.count, 2)
}
For a 332kB message, with two parts (some text and a JPEG image), the implementation using Bytes (i.e. [UInt8]) took 0.422s. Replacing Bytes with Data resulted in the same test executing in 0.0475s. That's nearly a 10x speedup.
Now admittedly Multipart is a small library, and it wasn't zero work to change it over to use Data. Vapor as a whole is enormous, and such a changeover would require care and attention and a lot of work. But I think Bits really can't continue defining its own type for data when that type is so much slower than the native type. IMO, Bits should be a library for operating on Data objects, and Collections/Sequences of UInt8, and should ditch the Bytes type entirely.
Two questions:
- Was this compiled in release mode?
- What operating system did you run these tests on (i'm assuming macOS)?
Release mode makes a huge difference when working with Bytes. Additionally, we've historically had huge performance issues using Data on Linux (however those may be fixed now).
I definitely want Vapor to move to Data eventually, we've just been running into a lot of issues with it. Hopefully some of those are starting to be solved. :)
Marking this as 2.0 milestone since moving from Bytes ([UINT8]) -> Data would be a breaking change.