occa
Deployment of only kernel binary files with commercial application code
Until recently, all of my OCCA kernel code was used internally by my company. But now, together with my applications, I will need to deploy / distribute externally only the required OCCA kernel binary files, but not any human-readable kernel (meta-)source files. I am not sure if that is possible. If it is not at this time, please make it so.
Even if I embed the kernel OKL source code as an encrypted string in my application code, to decrypt and feed to the OCCA kernel builder API at runtime, I am worried that the OCCA kernel builder API might still sometimes generate a human-readable kernel (meta-)source file somewhere in the cache area (???). That would be a problem for secure deployment of my IP-related work in a commercial context.
If the kernel (meta-)source files need to be there in the deployment, then another possibility might be for the OCCA API to directly support encryption of those files, based on a user-supplied key. It would be ideal if such support were two-phase, based on public-private key pairs (e.g., using RSA). Namely, a user-supplied private key would be used by a "compilation" step, implemented as a user application, to build and encrypt the resulting OCCA kernel file(s). Then the corresponding public key would be used by the OCCA API at runtime to decrypt the required kernel files as deployed with the application. The user would define and manage the keys, and pass them to the OCCA API as required. So the OCCA API would just need to be extended to encrypt / decrypt the kernel files it now generates or reads.
Of course, runtime decryption will slow down runtime, but that is not too bad in my case, because at least with OCCA v0.2, I internally cache the OCCA kernel objects in application memory after first load anyway.
Hi @pdhahn !
Commercial deployment of OKL code seems like an interesting feature to add. To be honest, it might not be a high priority at the moment given other work, but here is a possible solution that could be added.
Rough Potential Solution
The cached directory looks like:
> tree ~/.occa/cache/7228b9560e498511/
~/.occa/cache/7228b9560e498511/
├── binary
├── build.json
├── raw_source.cpp
└── source.cpp
In reality, the bare minimum would require only the binary file. The build.json file is a plus for type safety but not required.
Currently, building a kernel requires the source file along with the function name:
occa::buildKernel("addVectors.okl", "addVectors", props);
This generates a hash from:
- A: Device
- B: Kernel source
- C: Props
We currently store A ^ B ^ C (7228b9560e498511...) inside build.json. If we could use the individual hashes A and C plus an additional unique identifier, it would be possible to only ship the binaries.
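To make the hash composition concrete, here is a minimal sketch of how the two keys differ. It does not use OCCA's real hash: `fnv1a`, `cacheKey`, and `libraryKey` are hypothetical names, and a 64-bit FNV-1a over strings stands in for `occa::hash_t`, whose implementation may differ.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for occa::hash_t: 64-bit FNV-1a over a string.
// This only illustrates the XOR composition, not OCCA's actual hashing.
std::uint64_t fnv1a(const std::string &text) {
  std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a offset basis
  for (unsigned char c : text) {
    hash ^= c;
    hash *= 1099511628211ULL;  // FNV-1a prime
  }
  return hash;
}

// Current scheme: A ^ B ^ C, so any change to the kernel source changes the key.
std::uint64_t cacheKey(const std::string &device,
                       const std::string &source,
                       const std::string &props) {
  return fnv1a(device) ^ fnv1a(source) ^ fnv1a(props);
}

// Proposed scheme: A ^ C only. The kernel is identified separately by a
// user-chosen unique identifier, so the source is not needed at lookup time.
std::uint64_t libraryKey(const std::string &device, const std::string &props) {
  return fnv1a(device) ^ fnv1a(props);
}
```

Because XOR is its own inverse, removing B from A ^ B ^ C leaves exactly A ^ C, which is why the source-independent key can be derived from hashes that already exist.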
We could add an additional API method, something like:
occa::library myLibrary("pdhahn");
...
myLibrary.buildKernel(
  "add_vectors_id", // Unique identifier
  "addVectors.okl", // Same arguments as occa::buildKernel
  "addVectors",
  props
);
myLibrary.buildKernel(
  device,           // Optional device, defaulting to occa::getDevice()
  "add_vectors_id", // Unique identifier
  "addVectors.okl", // Same arguments as occa::buildKernel
  "addVectors",
  props
);
This method would prioritize looking at
- Library pdhahn
- Kernel identifier add_vectors_id
- Which has an A ^ C hash matching the given device and props
and fall back to the default occa::buildKernel functionality if the binary doesn't exist.
The difference would be:
- Prioritization which only looks for a binary
- Avoids using the source as a hash
- Writes out the hashed output in ~/.occa/libraries/pdhahn/add_vectors_id/<A^C hash>/
Additional Missing Functionalities
- A way to set ~/.occa/libraries/pdhahn without setting OCCA_DIR
- A tool to recursively copy over the library directory but only the binary files (optionally the build.json files)
From my perspective, the library idea sounds like a good one except for the fact that it could take too long to come into existence (per your statement about priority).
If the existing buildKernel() method could be modified to return the actual binary file name from the hash-triggered JIT compile step, then the user code could store that in some kind of persistent mapping (of user design) for later. That name could then be passed within user code to a buildKernel() method that has also been modified to understand it, and which would just load the binary without applying any other logic. Indeed, this puts the onus of the persistent mapping of device+props (A+C) <=> actual binary file name on the user's programming shoulders, but that is not such a big deal IMHO.
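For illustration, the user-side persistent mapping could be as simple as a tab-separated text file. Everything here is hypothetical and independent of OCCA: `BinaryMap`, `saveBinaryMap`, and `loadBinaryMap` are made-up names, and the key format (some string derived from device and props) is left to the application.

```cpp
#include <cstddef>
#include <fstream>
#include <map>
#include <string>

// Maps an application-defined key (for example "<device>:<props>") to the
// binary file name reported by the build step.
using BinaryMap = std::map<std::string, std::string>;

// Writes one "key<TAB>filename" pair per line.
// Keys must not contain tabs or newlines; file names must not contain newlines.
bool saveBinaryMap(const std::string &path, const BinaryMap &entries) {
  std::ofstream out(path);
  if (!out) {
    return false;
  }
  for (const auto &entry : entries) {
    out << entry.first << '\t' << entry.second << '\n';
  }
  out.flush();
  return static_cast<bool>(out);
}

// Reads the mapping back; a missing file yields an empty map.
BinaryMap loadBinaryMap(const std::string &path) {
  BinaryMap entries;
  std::ifstream in(path);
  std::string line;
  while (std::getline(in, line)) {
    const std::size_t tab = line.find('\t');
    if (tab == std::string::npos) {
      continue;  // Skip malformed lines
    }
    entries[line.substr(0, tab)] = line.substr(tab + 1);
  }
  return entries;
}
```

The application would save the map once after the build step and load it at startup in the deployed product, before asking OCCA to load any binaries.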
If this can work and could be implemented faster than the library concept, then that would be my strong preference.
Real-world exigencies for commercialization motivate me to ask.
Just pushed [f4fea623f09f12249ab2c783a5be6730f5497681] which adds a hash() method to the occa::kernel object.
The way to get the hash that identifies the binary would be
occa::kernel myKernel = ...;
std::string hash = myKernel.hash().getString();
and to build the kernel from the hash, kernel name, and props
// Returns the kernel loaded from the cached binary identified by `hash`
occa::kernel getKernelFromHash(const std::string &hash,
                               occa::device device,
                               const std::string &kernelName,
                               const occa::properties &props = occa::properties()) {
  const std::string binaryFilename = (
    occa::env::OCCA_CACHE_DIR + ".cache/" + hash + "/binary"
  );
  return device.buildKernelFromBinary(binaryFilename, kernelName, props);
}
Note that there are a few things that are not guaranteed to stay the same until we standardize them:
- The .cache/<hash>/binary location
- The occa::hash_t implementation. However, we can mark the getString() and getFullString() as public API methods.
Wow that was quick -- thanks so much! I will try this out ASAP.