ARM and other architecture support
Hello there,
I write a lot of code for embedded systems, and that's where optimizations are really needed. So I've been thinking about adding support for ARM or more exotic architectures. I know that with AWS you're stuck with x86, but hear me out. ;)
Perhaps the main backend could keep a list of slave backends that do the architecture-specific compiling and benchmark running. Suppose you want to benchmark your code on a raspi (armhf):
- user clicks "benchmark" button
- backend waits for free armhf-compatible slave
- backend sends it the code to compile and measure
- slave does compilation and measurement and returns disassembly/numbers to backend
- backend presents results to user
Of course this makes the project way more complex and introduces the problem of slave availability/reliability, but I think the opportunities are worth it.
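To make the flow above a bit more concrete, here is a minimal sketch of the dispatching part in Python. Everything in it is hypothetical: the names `Dispatcher`, `BenchResult` and `fake_armhf_slave` are made up for illustration and are not part of the existing backend, and a slave is modelled as a plain callable instead of a real network connection.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class BenchResult:
    arch: str
    disassembly: str
    nanoseconds: float


# Assumption: a slave is just a callable taking source code and returning a result.
Slave = Callable[[str], BenchResult]


class Dispatcher:
    """Keeps one queue of idle slaves per architecture."""

    def __init__(self) -> None:
        self._idle: Dict[str, "queue.Queue[Slave]"] = {}

    def register(self, arch: str, slave: Slave) -> None:
        self._idle.setdefault(arch, queue.Queue()).put(slave)

    def benchmark(self, arch: str, source: str, timeout: float = 30.0) -> BenchResult:
        if arch not in self._idle:
            raise LookupError(f"no slave registered for {arch}")
        try:
            # Block until a compatible slave is free, or give up after `timeout` seconds.
            slave = self._idle[arch].get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"no free {arch} slave within {timeout}s") from None
        try:
            # The slave compiles, measures and returns disassembly/numbers.
            return slave(source)
        finally:
            # Hand the slave back to the pool, even if it failed.
            self._idle[arch].put(slave)


def fake_armhf_slave(source: str) -> BenchResult:
    # Placeholder: a real slave would compile and run `source` on the device.
    return BenchResult(arch="armhf", disassembly="; placeholder", nanoseconds=0.0)


dispatcher = Dispatcher()
dispatcher.register("armhf", fake_armhf_slave)
result = dispatcher.benchmark("armhf", "int main() { return 0; }")
```

The timeout is one simple way to handle the availability problem: if no slave frees up in time, the user gets an error instead of a request that hangs forever.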
Suppose I want to run benchmarks on some AVR/MIPS/Xtensa/whatever chip. I can hook it up to my raspi via UART or something and launch a slave daemon on the pi that compiles the code for the given chip, transmits the binary to it and waits for the device to finish in order to measure performance. One could even write an Amiga slave for some retro optimizing action.
And if a public pool of slaves isn't an option, there could be a text box for typing in the IP address of a private slave.
What about QEMU?
My guess is that QEMU isn't cycle-accurate, and its timing may also vary between versions, so ARM code emulated in one version will show different performance than in another. The wiki seems to confirm this, since it states that QEMU uses binary translation to offer near-native speed.
Fair enough. Still, I presume that running an emulator may be way easier to implement than your original suggestion and it will be better than nothing at all.
Quick googling also suggests this.
I think it doesn't have to be perfect on the first try, so using an emulator is a good starting point. I also think that the implementation of this feature should expose some kind of interface for connecting other sources of benchmark results, not just one specific emulator. That way one could write a piece of software that interfaces the quickbench backend with real target hardware, for use with a local copy of QB.
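Such an interface could look roughly like the sketch below, again in Python and again purely hypothetical: `BenchmarkSource`, `Measurement`, `QemuUserSource` and `SOURCES` are names I made up, not anything QB has today. The only real tool assumed is QEMU's user-mode `qemu-arm` binary, invoked with the path of an already cross-compiled executable. Wall-clock time under emulation is not cycle-accurate, hence the `exact` flag.

```python
import subprocess
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Measurement:
    source_name: str
    arch: str
    seconds: float
    exact: bool  # False for emulators, True for real target hardware


class BenchmarkSource(ABC):
    """Anything that can run a compiled benchmark and report a timing."""

    name: str
    arch: str

    @abstractmethod
    def measure(self, binary_path: str) -> Measurement:
        ...


class QemuUserSource(BenchmarkSource):
    name = "qemu-user"
    arch = "armhf"

    def command(self, binary_path: str) -> List[str]:
        # Assumption: user-mode emulation of an already cross-compiled binary.
        return ["qemu-arm", binary_path]

    def measure(self, binary_path: str) -> Measurement:
        start = time.perf_counter()
        subprocess.run(self.command(binary_path), check=True)
        elapsed = time.perf_counter() - start
        return Measurement(self.name, self.arch, elapsed, exact=False)


# Hypothetical registry: the backend would pick a source by architecture.
SOURCES: Dict[str, List[BenchmarkSource]] = {}


def register_source(source: BenchmarkSource) -> None:
    SOURCES.setdefault(source.arch, []).append(source)


register_source(QemuUserSource())
```

A source talking to real hardware would subclass `BenchmarkSource` the same way and return `exact=True`, so the rest of the backend wouldn't need to know where the numbers come from.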