[FeatureRequest] Input binding composed by multiple blobs

Open goiri opened this issue 4 years ago • 3 comments

Overview

Currently, if we want to pass multiple blobs as inputs to a function, we need to specify each blob explicitly. It would be useful to be able to specify a set of blobs (e.g., a folder) and expose them to the function as a collection (e.g., a list).

Current status

We need to specify each input blob individually in the JSON:

{
  "scriptFile": "__init__.py",
  "bindings": [{
      "name": "req",
      "type": "httpTrigger",
      "direction": "in"
    },{
      "name": "inputblob1",
      "type": "blob",
      "dataType": "binary",
      "path": "{folder}/{file1}",
      "direction": "in"
    },{
      "name": "inputblobN",
      "type": "blob",
      "dataType": "binary",
      "path": "{folder}/{file2}",
      "direction": "in"
    },{
      "name": "$return",
      "type": "http",
      "direction": "out"
  }]
}

The Python code then has to declare and read each blob separately:

import azure.functions as func

def main(req, inputblob1, inputblobN):
    content_len = 0
    content1 = inputblob1.read()
    contentN = inputblobN.read()
    ...
    content_len = len(content1) + len(contentN)
    return func.HttpResponse(f"Total content length: {content_len}")

Each distinct number of input blobs requires its own function with a matching signature. (Note that Python versions before 3.7 also limit a function to 255 arguments.)

Proposal

One could request all the files within a "folder":

{
  "scriptFile": "__init__.py",
  "bindings": [{
      "name": "req",
      "type": "httpTrigger",
      "direction": "in"
    },{
      "name": "inputblobs",
      "type": "blob",
      "dataType": "binary",
      "path": "{folder}/*",
      "direction": "in"
    },{
      "name": "$return",
      "type": "http",
      "direction": "out"
  }]
}

The Python code could then iterate over the collection:

import azure.functions as func

def main(req, inputblobs):
    content_len = 0
    # Each element is one blob from the requested "folder".
    for inputblob in inputblobs:
        content = inputblob.read()
        content_len += len(content)
    return func.HttpResponse(f"Total content length: {content_len}")
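
The loop in the proposal only needs each element of inputblobs to expose read(). Here is a minimal sketch of that contract that runs without the Functions runtime. It assumes io.BytesIO objects as stand-ins for the bound blobs, and the helper name total_content_length is hypothetical, not part of any SDK:

```python
import io
from typing import BinaryIO, Iterable


def total_content_length(inputblobs: Iterable[BinaryIO]) -> int:
    """Sum the sizes of all blobs in the collection, as the proposed main() would."""
    content_len = 0
    for inputblob in inputblobs:
        content_len += len(inputblob.read())
    return content_len


# Stand-ins for the blobs the proposed binding would deliver (assumption).
fake_blobs = [io.BytesIO(b"abc"), io.BytesIO(b"defgh"), io.BytesIO(b"")]
print(total_content_length(fake_blobs))
```

Because the function accepts any iterable, the same code works whether the binding delivers two blobs or two hundred, which is the point of the request.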

Extensions

In addition to supporting "folders", one could define sets of files using wildcard patterns (or, more generally, regular expressions):

{
  "scriptFile": "__init__.py",
  "bindings": [{
      "name": "req",
      "type": "httpTrigger",
      "direction": "in"
    },{
      "name": "inputblobs",
      "type": "blob",
      "dataType": "binary",
      "path": "{folder}/input_*_1",
      "direction": "in"
    },{
      "name": "$return",
      "type": "http",
      "direction": "out"
  }]
}
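
To illustrate how a pattern such as input_*_1 might select blobs, here is a small sketch using shell-style matching from the Python standard library. It shows one possible semantics, not how the WebJobs SDK resolves paths. The function name select_blobs and the sample blob names are hypothetical:

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def select_blobs(blob_names: Iterable[str], pattern: str) -> List[str]:
    """Return the blob names matching a shell-style wildcard pattern, sorted."""
    return sorted(name for name in blob_names if fnmatchcase(name, pattern))


# Hypothetical container listing, with the {folder} placeholder resolved to "data".
names = [
    "data/input_a_1",
    "data/input_b_1",
    "data/input_a_2",
    "data/other",
    "data/sub/input_c_1",
]
print(select_blobs(names, "data/input_*_1"))
```

Note that with fnmatch-style matching, * also matches the / separator, so a pattern like data/* would include blobs in nested "folders" such as data/sub/input_c_1. An actual implementation would have to decide whether that is the desired behavior.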

Implementation

Most of this implementation would actually be in the Azure WebJobs SDK. The Python worker would need to be extended to receive a variable number of inputs and expose them as a collection to the function. This could also be implemented for other languages.
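
As a rough sketch of the worker-side change, the code below shows how a dispatcher could hand a variable number of blobs to the function as a single collection argument. It is purely illustrative and assumes the host has already resolved each binding to raw bytes. The names invoke and BindingData are hypothetical and do not reflect the actual Python worker or its protocol:

```python
import io
from typing import Callable, Dict, List, Union

# A binding resolves either to one blob or to a list of blobs (assumption).
BindingData = Union[bytes, List[bytes]]


def invoke(handler: Callable[..., object], bindings: Dict[str, BindingData]) -> object:
    """Call handler with one keyword argument per binding.

    A binding that resolved to several blobs is exposed as a list of
    file-like objects; a single blob is exposed as one file-like object.
    """
    kwargs = {}
    for name, data in bindings.items():
        if isinstance(data, list):
            kwargs[name] = [io.BytesIO(item) for item in data]
        else:
            kwargs[name] = io.BytesIO(data)
    return handler(**kwargs)


def main(inputblobs):
    # User function: works for any number of blobs.
    return sum(len(blob.read()) for blob in inputblobs)


print(invoke(main, {"inputblobs": [b"abc", b"de"]}))
```

The function signature stays the same regardless of how many blobs match, so the per-count function definitions described under "Current status" are no longer needed.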

goiri • Sep 24 '20 23:09

Note that some of the required infrastructure might already be implemented for EventHub triggers as cardinality: https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-event-hubs-trigger?tabs=csharp#configuration

goiri • Sep 24 '20 23:09

cc @pragnagopa @paulbatum

gohar94 • Sep 24 '20 23:09

@goiri is right. We already support cardinality: many, which allows a function to receive an array of inputs. Transferring this issue to the WebJobs SDK repo.

pragnagopa • Sep 24 '20 23:09