Introducing LUIS Version Tools

Microsoft’s LUIS is a super cool NLU (Natural Language Understanding) platform that our teams at Insight have been using on projects for over two years, since it was in preview and supported a maximum of 10 intents per application. Since then, LUIS has come a long way in terms of features, performance, automation and governance. Last May, at Build, Microsoft announced the Bot Builder Tools, a set of mostly Node-based scripts that make it easy to script the Azure Bot Service, QnA Maker and LUIS. The set also includes tools that help author LUIS models (ludown) or build a dispatch LUIS model fronting a set of child LUIS models (dispatch). Ludown quickly became one of my favorite tools, and I was fortunate enough to help the team by reporting bugs and requesting feature enhancements. This work inspired much of what I present below.

The Problems with Authoring LUIS

LUIS allows users to create Applications. An Application can have one or more Versions. An Application has two deployment slots: staging and production. A Version is a collection of intents and entities. It can be trained and published into either deployment slot. Version A can be published to the production slot for production apps to use, and Version B can be published to the staging slot for development/test applications to use. Any Version can be exported into JSON.

There are a number of issues that accumulate:

  • The LUIS JSON is very verbose.
  • There is no audit log; one cannot tell who changed what and when.
  • There is no easy way to tell the difference between two versions.
  • There is no clear direction on when to create a new version or any version naming conventions.
  • An Application having two slots is limiting. For many apps, there are more than two environments. In these cases, the two-slot model fails.

Microsoft provides the Authoring API (also accessible via the luis-apis package). The functionality provided by the API is the beginning of how we solve these problems.

Introducing luis-version

I created two tools to help fill the gap and provide easy automation for model and version management right within source control. The tools are based on ludown and luis-apis. We assume that the entire contents of a LUIS app are managed within a ludown file, which we call model.lu by default. The first tool is luis-version.

The goal of luis-version is to generate a new LUIS app version name if the contents of model.lu have changed since the last version. This is tracked via a file called .luis-app-version that sits within source control and stores the last version name together with a hash of the model content. The version name is the current UTC date formatted as YYMMDDHHmm. A new version generated right now would be named 1810271732. This might seem cryptic, but LUIS limits version names to 10 characters, so that’s what we ended up with for now. If a second developer runs luis-version with the same model.lu file and the same .luis-app-version in their working directory, luis-version sees that the hash is unchanged and reuses the existing version name.
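The naming and reuse rule can be sketched in a few lines of Python. This is an illustration only, not the tool's source: truncating SHA-256 to eight hex characters is an assumption (the post does not say which hash luis-version uses), and version_tag, content_hash and resolve_version are hypothetical helper names.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def version_tag(now: datetime) -> str:
    """Format a timezone-aware timestamp as UTC YYMMDDHHmm (10 characters)."""
    return now.astimezone(timezone.utc).strftime("%y%m%d%H%M")


def content_hash(text: str) -> str:
    """Short content hash. SHA-256 cut to 8 hex chars is an assumption."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


def resolve_version(model_text: str, state_file: Path, now: datetime) -> dict:
    """Reuse the stored tag if the hash is unchanged, otherwise mint a new one."""
    new_hash = content_hash(model_text)
    if state_file.exists():
        state = json.loads(state_file.read_text())
        if state.get("hash") == new_hash:
            return state  # same content: keep the old version name
    state = {"tag": version_tag(now), "hash": new_hash}
    state_file.write_text(json.dumps(state, indent=2))
    return state
```

Because the state file lives in source control, two developers with the same model.lu and the same .luis-app-version resolve to the same version name regardless of when each of them runs the tool.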

After determining the version name, luis-version ensures that the version exists. If it does not, it creates it. It runs model.lu through ludown to generate the LUIS JSON and calls luis import version. You can pass an optional --publish flag to ensure that the new version is immediately published.
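The order of operations described above can be written down as a small decision function. This is a hedged sketch rather than the tool's implementation: plan_steps is a hypothetical name, the step descriptions are plain text rather than real commands, and training on every run is an assumption inferred from the sample output later in this post.

```python
from typing import List


def plan_steps(version_exists: bool, publish: bool) -> List[str]:
    """Return the ordered steps for one luis-version run (illustrative only)."""
    steps = ["check whether the version exists"]
    if not version_exists:
        # Only a missing version needs the model converted and imported.
        steps += ["run ludown to produce LUIS JSON", "import the version"]
    # Assumption: the version is trained whether or not it was just imported.
    steps.append("train the version")
    if publish:
        steps.append("publish the version")
    return steps


if __name__ == "__main__":
    for step in plan_steps(version_exists=False, publish=True):
        print(step)
```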

luis-version obeys the rules of .luisrc as documented here. In addition, you can pass a different luisrc file by using the --luisrc parameter. For example: luis-version --model model.lu --luisrc .luisrc.prod --publish.

You can also add the --verbose flag to see exactly what the utility is doing underneath.

Sample Walkthrough

Let’s say we have a LUIS model we want to manage in source control. We are familiar with the ludown format and use this sample to get started. We will call this file model.lu. Next, we create three .luisrc files. The contents of each one are identical except for the appId. These are the three LUIS apps representing the dev, test and production environments. In my case, the authoringKey is the same for all three environments. I left the real appIds in here, though I’ve gotten rid of the applications since.


.luisrc
{
  "authoringKey": "",
  "region": "westus",
  "appId": "c83b0094-8c19-4d73-bd91-689d91ccfd8c",
  "endpointBasePath": "https://westus.api.cognitive.microsoft.com/luis/api/v2.0"
}

.luisrc.test
{
  "authoringKey": "",
  "region": "westus",
  "appId": "8650ff9f-6be5-4b28-87ad-a61053d8dbdc",
  "endpointBasePath": "https://westus.api.cognitive.microsoft.com/luis/api/v2.0"
}

.luisrc.prod
{
  "authoringKey": "",
  "region": "westus",
  "appId": "d920f225-6a80-4857-bb11-baf73416ec96",
  "endpointBasePath": "https://westus.api.cognitive.microsoft.com/luis/api/v2.0"
}
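With several near-identical config files, it is easy for one of them to drift, for example by pointing at a different region. A quick check like the following can confirm that the environments differ only in appId. This is a sketch under stated assumptions: differing_keys is a hypothetical helper, not part of luis-version-tools, and the configs are inlined here rather than read from disk.

```python
import json
from typing import Dict, Set


def differing_keys(configs: Dict[str, dict]) -> Set[str]:
    """Return the keys whose values are not identical across all configs."""
    all_keys = set().union(*(c.keys() for c in configs.values()))
    return {
        key for key in all_keys
        if len({json.dumps(c.get(key), sort_keys=True) for c in configs.values()}) > 1
    }


# Shared settings copied from the three files above.
base = {
    "authoringKey": "",
    "region": "westus",
    "endpointBasePath": "https://westus.api.cognitive.microsoft.com/luis/api/v2.0",
}
configs = {
    ".luisrc": {**base, "appId": "c83b0094-8c19-4d73-bd91-689d91ccfd8c"},
    ".luisrc.test": {**base, "appId": "8650ff9f-6be5-4b28-87ad-a61053d8dbdc"},
    ".luisrc.prod": {**base, "appId": "d920f225-6a80-4857-bb11-baf73416ec96"},
}

if __name__ == "__main__":
    print(differing_keys(configs))
```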

I can now run the following command to deploy the latest model data into the development LUIS app.

luis-version --model model.lu --luisrc .luisrc --publish

Getting app id c83b0094-8c19-4d73-bd91-689d91ccfd8c...
Calculating hash...
Hash 59c0db24 and tag 1810271806 generated
Checking if version 1810271806 exists...
Version 1810271806 doesn't exist. Continuing...
Running ludown.
Importing version 1810271806...
Version 1810271806 imported.
Training 1810271806...
Done training ...
Publishing...
All done.

Note that the script created a file called .luis-app-version that contains the latest hash/version name based on the model.lu content. In this case, the file contents match the output hash/version.


{
  "tag": "1810271806",
  "hash": "59c0db24"
}

If we look at our dev LUIS application, we will see a new version that is published into the production slot. We can easily deploy the same model to test and prod using:

luis-version --model model.lu --luisrc .luisrc.test --publish
luis-version --model model.lu --luisrc .luisrc.prod --publish

If the version already exists, it is simply retrained, and published if the --publish flag is passed. If the .luis-app-version file exists with the same hash, the old version name is used, as shown by this output.

Getting app id c83b0094-8c19-4d73-bd91-689d91ccfd8c...
Calculating hash...
Hash 59c0db24 and tag 1810271809 generated
Found old version with hash 59c0db24. Using version 1810271806
Checking if version 1810271806 exists...
Version 1810271806 exists...
Version exists. Not updating...
Training 1810271806...
Done training ...
Publishing...
All done.

If we now modify the model.lu file and run the same command, the script creates a new version and publishes it.

Getting app id c83b0094-8c19-4d73-bd91-689d91ccfd8c...
Calculating hash...
Hash 5bc3e294 and tag 1810271809 generated
Checking if version 1810271809 exists...
Version 1810271809 doesn't exist. Continuing...
Running ludown.
Importing version 1810271809...
Version 1810271809 imported.
Training 1810271809...
Done training ...
Publishing...
All done.

Running luis list versions should now return three versions: the initial 1.0 version plus the two we just created.


[
  {
    "version": "1.0",
    "createdDateTime": "2018-10-27T14:22:51.000Z",
    "lastModifiedDateTime": "2018-10-27T14:22:51.000Z",
    "lastTrainedDateTime": null,
    "lastPublishedDateTime": null,
    "endpointUrl": null,
    "assignedEndpointKey": null,
    "externalApiKeys": null,
    "intentsCount": 1,
    "entitiesCount": 0,
    "endpointHitsCount": 0,
    "trainingStatus": "NeedsTraining"
  },
  {
    "version": "1810271806",
    "createdDateTime": "2018-10-27T18:06:18.000Z",
    "lastModifiedDateTime": "2018-10-27T18:06:30.000Z",
    "lastTrainedDateTime": "2018-10-27T18:08:59.000Z",
    "lastPublishedDateTime": "2018-10-27T18:09:09.000Z",
    "endpointUrl": null,
    "assignedEndpointKey": null,
    "externalApiKeys": null,
    "intentsCount": 7,
    "entitiesCount": 3,
    "endpointHitsCount": 0,
    "trainingStatus": "Trained"
  },
  {
    "version": "1810271809",
    "createdDateTime": "2018-10-27T18:09:51.000Z",
    "lastModifiedDateTime": "2018-10-27T18:10:13.000Z",
    "lastTrainedDateTime": "2018-10-27T18:10:08.000Z",
    "lastPublishedDateTime": "2018-10-27T18:10:19.000Z",
    "endpointUrl": null,
    "assignedEndpointKey": null,
    "externalApiKeys": null,
    "intentsCount": 7,
    "entitiesCount": 3,
    "endpointHitsCount": 0,
    "trainingStatus": "Trained"
  }
]
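When scripting around this output, it can be handy to pick out the most recently published version programmatically. The following is a minimal sketch, assuming the JSON above has been captured into a string. latest_published is a hypothetical helper, and the sample is trimmed to the two fields it reads.

```python
import json
from typing import Optional


def latest_published(versions_json: str) -> Optional[str]:
    """Return the version name with the newest lastPublishedDateTime, if any."""
    versions = json.loads(versions_json)
    published = [v for v in versions if v.get("lastPublishedDateTime")]
    if not published:
        return None
    # ISO 8601 UTC timestamps in one consistent format sort correctly as strings.
    return max(published, key=lambda v: v["lastPublishedDateTime"])["version"]


sample = """[
  {"version": "1.0", "lastPublishedDateTime": null},
  {"version": "1810271806", "lastPublishedDateTime": "2018-10-27T18:09:09.000Z"},
  {"version": "1810271809", "lastPublishedDateTime": "2018-10-27T18:10:19.000Z"}
]"""

if __name__ == "__main__":
    print(latest_published(sample))  # prints 1810271809
```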

Supporting Manual or Web App Editing

Not all users will be happy editing the model through the ludown file. Some team members might still want to use the web app UI to iterate, test and make sure the model is working correctly. That is fine. The second tool in the luis-version-tools NPM package is luis-lu-export. This script downloads the given version of the model and writes it to the destination file of choice. For example, we can run the following command to get the latest online version.

luis-lu-export --luisrc .luisrc --model model.lu --version 1810271809

Any edits made online will be applied to model.lu. Before checking into source control, we can run luis-version --luisrc .luisrc --model model.lu --publish to ensure the .luis-app-version file is regenerated based on the latest model content and a new version is created. We can then check all changes into source control.

In my experience, this manual editing of the model using the web app should only be allowed in the development version of the model. Test, QA, Prod, Integration, and all other environments should be generated directly from a ludown file.

What’s Next?

The scripts are in a state where they can be integrated into DevOps pipelines. Go ahead and submit feature requests/bug reports on GitHub. I’m very interested in how developers end up using the tools and in feedback on the approach. NPM package details here.