I wrote an AI chatbot that can reason about your ArchiMate model

Started by maksim aniskov, December 28, 2024, 17:08:57 PM


maksim aniskov

Hi all,

Recently I was presenting "How I cope with a model of 400–600 elements and 1200–1400 relations that mutates to stay correct" to yet another group of engineers. During the Q&A, I was asked whether I had ever considered implementing a chatbot one could ask questions about the model.

The model describes a portion of a complex real-life set of applications and systems. In addition to relationships between the components involved, the model contains properties linking those components to their configuration and source code.

The data comes with a guarantee: it is high-quality and routinely checked for validity and correctness using archi-powertools-inspector and archi-powertools-verifier.

So, what if we want to talk to the data to ask it some questions?
Here it is. 👉 archi-powertools-chatbot 👈

Happy Holidays and New Year!


This is a fully functional implementation of a chatbot that can reason about and answer questions about the information in your ArchiMate® model.

What's under the hood
  • The chatbot reads enterprise/system architecture models in the Archi® modelling toolkit's format. (You don't need to install the toolkit to use the chatbot.)
  • The chatbot uses archi-powertools-verifier's machinery to read the model and import it into a Neo4j graph database, which it runs in a container.
  • The chatbot is configured to understand archi-powertools-inspector's markup. The markup enriches the model with information which describes the system's deployment configuration on "real" infrastructure.
  • The chatbot uses the OpenAI API and its AI models.
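To make the second bullet a bit more concrete, here is a rough, simplified sketch of what turning an .archimate file into graph nodes and edges involves. This is not the actual archi-powertools-verifier code, just an illustration; the element and relationship attribute names follow Archi's XML file format.

```python
# Illustrative sketch: parse an .archimate file into node and edge lists,
# roughly what the import step prepares for the Neo4j graph database.
# (The real pipeline is archi-powertools-verifier's; this is a simplification.)
import xml.etree.ElementTree as ET

XSI = "{http://www.w3.org/2001/XMLSchema-instance}"

def model_to_graph(xml_text):
    root = ET.fromstring(xml_text)
    nodes, edges = {}, []
    for el in root.iter("element"):
        kind = el.get(f"{XSI}type", "").removeprefix("archimate:")
        if kind.endswith("Relationship"):
            # relationships reference their endpoints via source/target attributes
            edges.append((el.get("source"), kind.removesuffix("Relationship"), el.get("target")))
        else:
            nodes[el.get("id")] = {"label": kind, "name": el.get("name", "")}
    return nodes, edges

demo = """<archimate:model xmlns:archimate="http://www.archimatetool.com/archimate"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <folder name="Application">
    <element xsi:type="archimate:ApplicationService" id="id-1" name="Service Alpha"/>
    <element xsi:type="archimate:ApplicationService" id="id-2" name="Service Beta"/>
  </folder>
  <folder name="Relations">
    <element xsi:type="archimate:FlowRelationship" id="id-3" source="id-1" target="id-2"/>
  </folder>
</archimate:model>"""

nodes, edges = model_to_graph(demo)
print(nodes["id-1"])  # {'label': 'ApplicationService', 'name': 'Service Alpha'}
print(edges)          # [('id-1', 'Flow', 'id-2')]
```

Each node's `label` then becomes a Neo4j label (e.g. `ApplicationService`), and each edge a relationship type (e.g. `Flow`), which is why the Cypher queries later in this post look the way they do.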

For the demonstration, we are going to use the demo model file archi-powertools-walkthrough.archimate which is in the repository. This is a screenshot of what is in the model.


Prerequisites
  • Make sure you have installed Docker Compose or a compatible container engine.
  • Register on the OpenAI developer platform. Create your OpenAI API key. Top up your credit balance; $5 of credits will be more than enough for running the demonstration scenario.

Setting it up
wget https://raw.githubusercontent.com/MaksimAniskov/archi-powertools-chatbot/refs/heads/1.x/archi-powertools-walkthrough.archimate
wget https://raw.githubusercontent.com/MaksimAniskov/archi-powertools-chatbot/refs/heads/1.x/compose.chatbot.yaml
wget https://raw.githubusercontent.com/MaksimAniskov/archi-powertools-verifier/refs/heads/1.x/compose.yaml
wget https://raw.githubusercontent.com/MaksimAniskov/archi-powertools-verifier/refs/heads/1.x/compose.ui.yaml

Demonstration
Run the chatbot container and its dependencies.
OPENAI_API_KEY=... \
ARCHI_FOLDER=. \
ARCHI_FILE=archi-powertools-walkthrough.archimate \
docker compose -f compose.yaml -f compose.ui.yaml -f compose.chatbot.yaml run --rm chatbot

Expect to see Docker containers building and starting up.
[+] Building 5.9s (17/17) FINISHED
  ...
[+] Creating 3/3
 ✔ Volume "demo_intermediate_files"  Created
 ✔ Container archi-powertools-chatboot-archi2csv-1        Created
 ✔ Container archi-powertools-chatboot-neo4j-1            Created
[+] Running 2/2
 ✔ Container archi-powertools-chatboot-archi2csv-1  Exited
 ✔ Container archi-powertools-chatboot-neo4j-1      Started
[+] Building 2.4s (10/10) FINISHED
  ...

Then the following prompt for choosing an AI model appears.
Choose model from the list.
[0] o1-preview
[1] o1
[2] o1-mini
[3] gpt-4o
For the sake of the following example conversation, I'm going to use the o1-preview model. So I type 0 and press Enter. My output looks like this.
> 0
Using model o1-preview.
Ask your questions.

Here I begin my conversation with the AI. My first question is about the microservices it finds in the ArchiMate model.
Your question > Show me all microservices

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (a:ApplicationService) RETURN a
Full Context:
[{'a': {'archi_id': 'id-8e5aedd18d9a42749428a51b52cf8912', 'documentation': '', 'name': 'Service Alpha', 'specialization': ''}}, {'a': {'archi_id': 'id-6c4b1a8d76c54fc4882f8e0d7e8910f2', 'documentation': '', 'name': 'Service Beta', 'specialization': ''}}, {'a': {'archi_id': 'id-8992258f1f054aa7b91996376b69236b', 'documentation': '', 'name': 'Service Gamma', 'specialization': ''}}]

> Finished chain.
Intermediate steps: [{'query': 'MATCH (a:ApplicationService) RETURN a'}, {'context': [{'a': {'archi_id': 'id-8e5aedd18d9a42749428a51b52cf8912', 'documentation': '', 'name': 'Service Alpha', 'specialization': ''}}, {'a': {'archi_id': 'id-6c4b1a8d76c54fc4882f8e0d7e8910f2', 'documentation': '', 'name': 'Service Beta', 'specialization': ''}}, {'a': {'archi_id': 'id-8992258f1f054aa7b91996376b69236b', 'documentation': '', 'name': 'Service Gamma', 'specialization': ''}}]}]

Final answer: Service Alpha, Service Beta, Service Gamma are all microservices.

NOTE: The Neo4j Browser UI is part of this setup. It is available at http://localhost:7474/
Open the link in your browser; you can then use it to run the queries the chatbot generates and check their correctness.


Let's continue.
Your question > Describe Alpha

> Entering new GraphCypherQAChain chain...
Generated Cypher:
MATCH (n{name:'Alpha'})-[r]-(m)
RETURN n, r, m
Full Context:
[]

> Finished chain.
Intermediate steps: [{'query': "MATCH (n{name:'Alpha'})-[r]-(m)\nRETURN n, r, m"}, {'context': []}]
Final answer: I don't know the answer.

It could not find the element by exact match. Let's try an approximate search. (From here on, I omit the verbose output from the responses, leaving only the Cypher queries and the AI's final answer.)
Your question > Describe Alpha. Use approximate search.

Generated Cypher:
MATCH (e) WHERE apoc.text.levenshteinSimilarity(e.name, 'Alpha') > 0.3
OPTIONAL MATCH (e)-[r]-()
RETURN e, r

Final answer: Service Alpha is a service that involves Access, Flow, and Serving relationships, and is presented in Archi_PresentedIn.
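To get a feel for the 0.3 threshold in that query: if I read APOC's definition right, apoc.text.levenshteinSimilarity normalizes the Levenshtein edit distance by the longer string's length. Here is a small pure-Python equivalent (an illustration, not APOC's actual code).

```python
# Pure-Python equivalent of apoc.text.levenshteinSimilarity (assuming the
# usual normalization: 1 - edit_distance / max(len(a), len(b))).
def levenshtein(a, b):
    # classic dynamic-programming edit distance, one row at a time
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def similarity(a, b):
    if not a and not b:
        return 1.0
    return 1 - levenshtein(a, b) / max(len(a), len(b))

# "Service Alpha" vs "Alpha": distance 8, length 13
print(round(similarity("Service Alpha", "Alpha"), 2))  # 0.38 -> clears the 0.3 cutoff
```

So "Alpha" matches "Service Alpha" at roughly 0.38 similarity, which is why the generated query's `> 0.3` filter finds the element that the exact-match query missed.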

Your question > Show all Kafka producers

Generated Cypher:
MATCH (e)-[:Flow]->(t{specialization:'Kafka topic'})
RETURN DISTINCT e.name

Final answer: Service Alpha is a Kafka producer.

Your question > Show all Kafka consumers

Generated Cypher:
MATCH (app:ApplicationService)<-[:Flow]-({specialization:'Kafka topic'})
RETURN DISTINCT app.name

Final answer: Service Beta, Service Gamma are the Kafka consumers.
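To see what those two generated queries actually compute, here they are replayed in plain Python over a tiny in-memory copy of the demo model's Flow relations. The (source, target) pairs below are inferred from the transcript's answers, not copied from the model file.

```python
# What the two Cypher queries compute, replayed over a tiny in-memory copy
# of the demo model's Flow relations (element names stand in for nodes).
specialization = {
    "Service Alpha": "", "Service Beta": "", "Service Gamma": "",
    "topic-xyz": "Kafka topic",
}
flows = [  # (source, target) pairs of Flow relationships, per the transcript
    ("Service Alpha", "topic-xyz"),
    ("topic-xyz", "Service Beta"),
    ("topic-xyz", "Service Gamma"),
]

# MATCH (e)-[:Flow]->(t{specialization:'Kafka topic'}) RETURN DISTINCT e.name
producers = sorted({src for src, dst in flows if specialization[dst] == "Kafka topic"})
# MATCH (app)<-[:Flow]-({specialization:'Kafka topic'}) RETURN DISTINCT app.name
consumers = sorted({dst for src, dst in flows if specialization[src] == "Kafka topic"})

print(producers)  # ['Service Alpha']
print(consumers)  # ['Service Beta', 'Service Gamma']
```

Note how the only difference between the two queries is the direction of the Flow relationship: a producer's Flow points at the topic, a consumer's Flow points away from it.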

Your question > Which Kafka topic Service Alpha does produce to?

Generated Cypher:
MATCH (e{name:'Service Alpha'})-[:Flow]->(topic{specialization:'Kafka topic'})
RETURN DISTINCT topic.name

Final answer: topic-xyz is the Kafka topic Service Alpha produces to.

Your question > Which services do produce to topic topic-xyz?

Generated Cypher:
MATCH (e)-[:Flow]->({name:'topic-xyz', specialization:'Kafka topic'})
RETURN DISTINCT e.name

Final answer: Service Alpha produces to topic topic-xyz.

Your question > Which services do write to or read from AWS S3?

Generated Cypher:
MATCH (s:ApplicationService)-[:Flow]-({specialization:'AWS S3 object'})
RETURN DISTINCT s.name
Full Context:
[]

> Finished chain.
Intermediate steps: [{'query': "MATCH (s:ApplicationService)-[:Flow]-({specialization:'AWS S3 object'})\nRETURN DISTINCT s.name"}, {'context': []}]
Final answer: I don't know the answer.

Your question > Which services do write to or read from AWS S3? Relationship type is Access.

Generated Cypher:
MATCH (s:ApplicationService)-[:Access]->(e{specialization:'AWS S3 object'})
RETURN DISTINCT s.name

Final answer: Service Gamma, Service Beta, Service Alpha write to or read from AWS S3.

Your question > Which external APIs are in use?

Generated Cypher:
MATCH (app:ApplicationService)-[:Flow]->(api{specialization:'HTTP REST'})
RETURN DISTINCT api.name

Final answer: I don't know which external APIs are in use.

Your question > Which external APIs are in use? Look for relations in all directions.

Generated Cypher:
MATCH (api {specialization: 'HTTP REST'})--()
RETURN DISTINCT api.name

Final answer: my-rest-endpoint.acme.com, external-api.another.com are external APIs in use.

The ArchiMate model I use for this demonstration has archi-powertools-inspector's markup on some of the relationships.

In a nutshell, archi-powertools-inspector is a tool that understands properties one attaches to the model's relationships or elements. The properties build links between the model and "real" infrastructure or source code. For example, the property named pwrt:inspector:value-ref is a pointer: it can be an href to a specific line of source code in a git repository, a pointer to a Kubernetes resource's configuration value, a pointer to an AWS resource, and so on. The tool automates keeping those links current and inspecting them for correctness. Hence, "inspector".
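As a purely hypothetical illustration of what "a pointer to a specific line of source code" could look like structurally, here is a parser for a GitLab-style blob reference. The actual pwrt:inspector:value-ref syntax is defined by archi-powertools-inspector; the URL shape and the "#L<line>" fragment convention below are assumptions made for this sketch only.

```python
# Hypothetical sketch: split a GitLab-style source-code pointer into parts.
# The real pwrt:inspector:value-ref syntax is the tool's own; the URL shape
# and "#L<line>" line-number fragment here are assumptions for illustration.
import re

def parse_code_ref(ref):
    m = re.fullmatch(
        r"(?P<host>[^/]+)/(?P<project>.+?)/-/blob/"
        r"(?P<branch>[^/]+)/(?P<path>[^#]+)(?:#L(?P<line>\d+))?",
        ref,
    )
    if not m:
        return None
    d = m.groupdict()
    d["line"] = int(d["line"]) if d["line"] else None
    return d

ref = "gitlab.acme.com/systemA/alpha/-/blob/main/src/config.txt#L27"
print(parse_code_ref(ref))
```

A structured pointer like this is what lets the chatbot answer in terms of "file X on line 27" rather than just echoing an opaque URL.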

Let's make the AI use knowledge it can derive from those pointers.

Your question > Show how Service Alpha gets its configuration

Generated Cypher:
MATCH ({name:'Service Alpha'})-[r]-()
WHERE r.`pwrt:inspector:value-ref` IS NOT NULL
RETURN r.`pwrt:inspector:value-ref`

Final answer: Service Alpha gets its configuration from:
- AWS SSM Parameter Store at the path `/alpha/Prod/config/S3bucket` (Parameters/Value)
- The configuration file located at `gitlab.acme.com/systemA/alpha/-/blob/main/src/config.txt` on line 27
- Kubernetes Ingress resource in the `alpha` namespace at `https://kubernetes.acme.com`, combining host and HTTP paths from `spec.rules[0]`

To stop the conversation, type quit.

Your question > quit

Alberto

Quote: "This a fully-functional implementation of chatbot which can reason and answer question about information in your ArchiMate® model."

Maksim, I think you have built the first half of Archie the EA ChatBot

I think the natural language processing (NLP) ability of LLMs to generate graph queries that can answer questions is an impressive technical feat we didn't have available just a few years ago, but this solution is not quite able to reason just yet, which is the second half of our collective Enterprise AI dream. The ability of an LLM to infer new facts from a knowledge graph (or, in our case, a given ArchiMate model, metamodel, and derivation rules) would be the true game-changing use case.
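For anyone unfamiliar with what "infer new facts from a knowledge graph" means in practice, here is a deliberately tiny toy example. The (simplified) rule is "Serving is transitive": if A serves B and B serves C, derive that A serves C. ArchiMate's real derivation rules are considerably richer than this, and the service names are just placeholders.

```python
# Toy illustration of rule-based inference over a model: naive forward
# chaining to a fixpoint with one (simplified) rule, "Serving is transitive".
# ArchiMate's actual derivation rules are far richer than this sketch.
def derive_serving(serving):
    facts = set(serving)
    changed = True
    while changed:                      # repeat until no new fact is derived
        changed = False
        for a, b in list(facts):
            for c, d in list(facts):
                if b == c and (a, d) not in facts:
                    facts.add((a, d))   # derived: a serves d via b
                    changed = True
    return facts

base = {("Service Alpha", "Service Beta"), ("Service Beta", "Service Gamma")}
print(derive_serving(base) - base)  # {('Service Alpha', 'Service Gamma')}
```

The query-generation approach answers questions about facts already stated in the graph; a reasoning step like this one would let the chatbot answer questions about facts the model only implies.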

There is a paper from Juan Sequeda and Dean Allemang from Data.World on the role of knowledge graphs in large language models' accuracy for question answering on enterprise SQL databases (https://arxiv.org/abs/2311.07509) which is worth a read. They also have a podcast where they talk about the paper. As far as I know, even when providing the LLM with a database, a DB schema, and an ontology, there's still a gap, as LLMs are not yet able to provide 100% accurate answers to complex questions on production-sized datasets.

I don't know exactly what capabilities we can exploit in the existing LLM models, but just as The 3 Amigos proposed, I think the answer lies in using Semantic Web technologies rather than property graphs; a combination of OWL/RDF/SHACL would probably be a key component of the reasoning-LLM equation. I have seen a lot of academic work around instantiating ArchiMate in RDF, but not much interest from The Open Group in tackling this problem. If they are, I think we need to put our collective heads together and figure out how to incorporate this into Archi immediately.

maksim aniskov

Quote from: Alberto on December 30, 2024, 16:11:59: "I think you have built the first half of Archie the EA ChatBot"

It's amazing how right you were back in 2017!
EA ChatBot
Quote: "Go forward in time, say five years, and imagine an EA Chatbot – an AI bot that you can talk to, interact with, and that could, for example, assist the Architect to quickly and easily find information about an in-house application, how it connects to other applications."

And it is amazing that in 2024 it takes just a few days of effort to implement it!

The way it finds the information and explains it in plain English looks to me like quite a bit of reasoning.
Your question > Show how Service Alpha gets its configuration

Final answer: Service Alpha gets its configuration from:
- AWS SSM Parameter Store at the path `/alpha/Prod/config/S3bucket` (Parameters/Value)
- The configuration file located at `gitlab.acme.com/systemA/alpha/-/blob/main/src/config.txt` on line 27
- Kubernetes Ingress resource in the `alpha` namespace at `https://kubernetes.acme.com`, combining host and HTTP paths from `spec.rules[0]`

@Alberto Can you try the chatbot on your models? The prompt it uses will probably require some tuning.