Decision Models and ChatGPT

last updated 2023-11-05 21:11:13 by Simon Vandevelde

Update: we recently investigated the synergy between DMN and GPT-3 and published a paper on it. You can read all about it here.

ChatGPT has been all the rage, lately. Many people have been experimenting with it, from producing code and passing a computer science exam to succesfully writing emails disputing a parking fine.

So, as a researcher in symbolic AI, this begs the question: can it craft decision models for us? This blog posts describes my attempt to have ChatGPT generate a simple Decision Model and Notation (DMN) model. Let's have ChatGPT explain what DMN is, and what a DMN decision table looks like.

So far so good. ChatGPT generated a decision table using three inputs (Condition A-C) and one output (Decision), and explains it well enough for this simple example. One remark is that the table is not complete, e.g. an input combination of True, True, False would not have an applicable row.

Let's ask for a more complex example, in which numerical variables are used.

While the table example is impressive, we can already see some errors pop up in the explanation. ChatGPT incorrectly claims that Age is categorical, as there is only >25 and <25. Similarly, it says that Credit Score is a continuous variable, which does not make any sense.

More interestingly, the table is again incomplete. Maybe ChatGPT can rectify this?

Alright, the AI is on the right track here -- it should be possible to write a complete table with 8 rows. But is it really complete?

Uh oh, we gave ChatGPT inputs that have no applicable rule, and it just decided that my Credit Score was good after all, so that Row 3 could fire. Let's see if ChatGPT can correct this.

Words cannot describe how funny this answer is. ChatGPT here tries to go for the "technically correct is the best kind of correct" answer. But, obviously, the table is still incomplete.

Let's try something else. Could ChatGPT generate a decision model based on a decision description?

Alright, let's try to have it model the well-known Vacation Days example.

Impressively, ChatGPT was able to derive the correct inputs and the output to the problem. However, while at first glance it might seem like a reasonable table, it is riddled with mistakes: there are a lot of conflicting rules possible, and the table is yet again not complete.

Maybe the AI would do better if it split the large table into three more manageable tables instead?

The AI split the table, but for some reason it "forgot" two rows of the second table. At the very least, this split version of the decision model makes more sense than the previously generated one. Let's look at the explanation ChatGPT gives:

... a 15 year old employee with 5 years of service seems a bit illegal, but sure, let's go with it. Interestingly, ChatGPT does not derive a correct decision: our example employee should be eligible for the 5 additional days (Row 1).

In the second example given, again the AI derives an incorrect decision: according to its model, the employee should also be eligible for the extra 2 days. But maybe it just forgot to express that "the 2 additional days cannot be combined with the 5 extra days", but it did remember it from the decision description. Let's ask ChatGPT to make that a bit more explicit.

Again, a very funny answer. ChatGPT mentions that it adds an additional condition, but it does not modify the table besides copying the second row and adding four empty ones. It then goes on to explain that now it should be impossible to combine the two.

Both of these examples are, sadly, incorrect. At the very least the second example is on the right track, if only it had correctly encoded the third table.

At this point, I figure we sufficiently explored ChatGPT's DMN capabilities, so let's stop the experiment. I think this blog post is a nice indication of ChatGPT's (lack of) DMN capabilities.

Some take-aways:

While the tables generated by ChatGPT have the correct form, their contents are truly subpar.
ChatGPT has no proper understanding of incomplete tables, or of conflicting rules.
ChatGPT can not be used as a decision engine (but why would you?).
It is capable of deriving the variables used in small examples.

In the future, I'll definitely play around with ChatGPT and DMN some more. More specifically, I'm interested in seeing if ChatGPT can also derive the variables for more complex examples.