Consider adding support for custom parsers of utterances
Today, we expect the utterances JSON file to always be an array of utterances with entities in one of two formats. The first is the NLU.DevOps generic format:
[
  {
    "text": "order pizza",
    "intent": "OrderFood",
    "entities": [
      {
        "matchText": "pizza",
        "entityType": "FoodItem"
      }
    ]
  }
]
The second is the LUIS batch format:
[
  {
    "text": "order pizza",
    "intent": "OrderFood",
    "entities": [
      {
        "entity": "FoodItem",
        "startPos": 6,
        "endPos": 10
      }
    ]
  }
]
I suspect we can make this a bit simpler, and open up an opportunity to leverage other tooling (that is less likely to get out of sync), if we allow dependency injection of the parser for utterances. One scenario I'd like to unblock: writing a simple script that takes a test utterance JSON file, sends the utterances off for prediction against LUIS / Lex / etc., and stores the unmodified results from LUIS / Lex back in a JSON array (a rough sketch of such a script follows the example below).
I.e., could we easily enable something like this:
[
  {
    "query": "order pizza",
    "topScoringIntent": {
      "intent": "OrderFood",
      "score": 0.99999994
    },
    "entities": [
      {
        "entity": "pizza",
        "type": "FoodItem",
        "startIndex": 6,
        "endIndex": 10,
        "score": 0.973820746
      }
    ]
  }
]
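As a rough sketch of the kind of script I have in mind (assuming the LUIS v2 prediction endpoint; the requests dependency and the LUIS_APP_ID, LUIS_KEY, and LUIS_REGION environment variables are placeholders, not part of NLU.DevOps):

import json
import os

import requests

app_id = os.environ["LUIS_APP_ID"]
key = os.environ["LUIS_KEY"]
region = os.environ.get("LUIS_REGION", "westus")
endpoint = f"https://{region}.api.cognitive.microsoft.com/luis/v2.0/apps/{app_id}"

with open("tests.json", encoding="utf-8") as f:
    utterances = json.load(f)

results = []
for utterance in utterances:
    response = requests.get(
        endpoint,
        params={"q": utterance["text"], "verbose": "true"},
        headers={"Ocp-Apim-Subscription-Key": key},
    )
    response.raise_for_status()
    # Append the raw LUIS response without lifting it to the generic format.
    results.append(response.json())

with open("results.json", "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)

A results.json file produced this way contains exactly the unmodified provider output that the compare step would need to parse.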
We could achieve this in a couple of different ways.
Option 1: add flags to the compare command that specify which parser to inject for each input:
dotnet nlu compare \
  --expected tests.json \
  --actual results.json \
  --expectedFormat luis-batch \
  --actualFormat luis-response
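To illustrate what an injected luis-response parser might do, here is a rough sketch in Python with hypothetical names (the real implementation would live in the tool itself), lifting a raw LUIS response into the generic utterance shape using the field names from the JSON examples above:

def parse_luis_response(response):
    # Assumes the LUIS v2 response shape shown above, with inclusive character
    # offsets for entities.
    return {
        "text": response["query"],
        "intent": response["topScoringIntent"]["intent"],
        "entities": [
            {
                # Slice the query to recover the matched text for matchText.
                "matchText": response["query"][e["startIndex"]:e["endIndex"] + 1],
                "entityType": e["type"],
            }
            for e in response.get("entities", [])
        ],
    }

With a registry mapping each --expectedFormat / --actualFormat value to a function like this, the compare command itself would only ever see generic utterances.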
Option 2: add an optional envelope to the utterances JSON file:
{
  "format": "luis-response",
  "utterances": [
    {
      "query": "order pizza",
      "topScoringIntent": {
        "intent": "OrderFood",
        "score": 0.99999994
      },
      "entities": [
        {
          "entity": "pizza",
          "type": "FoodItem",
          "startIndex": 6,
          "endIndex": 10,
          "score": 0.973820746
        }
      ]
    }
  ]
}
There are a few benefits to this approach:
- You do not need to depend on NLU.DevOps to run your tests. If an NLU provider exposes its own batch API, you could use that batch API directly and only use NLU.DevOps for comparing results.
- Whatever the test results are, we do not lose result data by "lifting" the results to a generic format.
Acceptance Criteria
- The dotnet nlu test command returns verbatim results from the NLU provider.
- Default parsing (when not otherwise specified in the compare command CLI options or in a data envelope) should support the generic utterances and LUIS batch formats. If a parser is specified, we should always support falling back on the default parsing behavior (see the sketch after this list).
- Should be an optional feature of NLU providers (e.g., luis, luisV3, lex, etc.); the option to return the generic utterance format from the test command should still exist.
- For now, let's only support one format per NLU provider, so we can continue to use the NLU provider moniker (e.g., luis, luisV3, lex, etc.) to represent the format in the compare command.
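To make the fallback behavior concrete, here is a rough sketch of how the compare command could load utterances, dispatching on a declared envelope format and falling back to today's default parsing for a bare array. This is illustrative Python with hypothetical names (load_utterances, parsers, default_parser), not actual NLU.DevOps code:

import json

def load_utterances(path, parsers, default_parser):
    # parsers maps format names (e.g., "luis-response") to parser functions;
    # default_parser is today's generic / LUIS batch parsing behavior.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, dict) and "format" in data:
        # Envelope: dispatch on the declared format.
        parser = parsers[data["format"]]
        return [parser(u) for u in data["utterances"]]
    # Bare array: fall back to the default parsing behavior.
    return [default_parser(u) for u in data]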
Examples
If we use a CLI option:
dotnet nlu test -s luis -u tests.json -o results.json
dotnet nlu compare -s luis -e tests.json -a results.json
The tests.json file may contain the generic utterances format, whereas the results.json file will contain raw LUIS responses.
If we use the envelope method:
dotnet nlu test -s luis -u tests.json -o results.json
dotnet nlu compare -e tests.json -a results.json
The tests.json file may contain the generic utterances format, whereas the results.json file will contain raw LUIS responses embedded in an envelope, e.g.:
{
  "format": "luis",
  "utterances": [
    {
      /* raw LUIS response */
    }
  ]
}