Archi Forum

Archi => Archi Development => Topic started by: Phil Beauvoir on August 26, 2014, 06:51:51 AM

Title: Neo4j
Post by: Phil Beauvoir on August 26, 2014, 06:51:51 AM
I'm interested in looking at Neo4j (http://www.neo4j.org/) again for ArchiMate persistence.

The idea originated from Olivier Rey, in the old forums (https://groups.google.com/forum/#!msg/archi-users/2cLcaiFLidY/WP9ASswOoskJ):

Quote
The ultimate storage for Archimate is Neo4j the graph database.

In one shot, you solve many problems:
-Archi becomes scalable with a performance not based upon model size;
-Archi becomes team enabled;
-We can query the models in Cypher and perform advanced analysis.

With this persistence layer, both Archi and Neo4j can make an announcement and provide the most straightforward and powerful product ever.

Have a look at it guys, it is worth the shot.

Bye,
Olivier

And in this LinkedIn thread (http://lnkd.in/dbJA6Eb):

Quote
Take a look at the Neo4j database. Using this storage to persist Archimate models would easily enable:
-scalability (a view is a one depth request inside the DB) and requests are not dependent on the size of the model;
-team support (due to ACID support, team support is built in);
-easy impact analysis and reporting (using Cypher language for instance);

Using a graph database would enable the following hard-to-implement functionality:
-model annotations and comments,
-workflows on models, where a lead architect can validate updates to the main repository,
-easy model comparison and merging,
-management of a huge repository of models,
-sharing of common objects that were validated by a specific workflow but whose "outgoing" structure cannot be modified without the lead architect's validation,
-project spaces in which teams can create temporary models using shared, validated objects and relation instances,
-alternate architecture scenario management,
-etc.

Graph databases seem to open new ways of seeing things, especially for Archimate model persistence and management. Any opinions?



Title: Re: Neo4j
Post by: Phil Beauvoir on August 26, 2014, 07:05:40 AM
Here's a video of Ian Robinson talking about Neo4j's History, Data Structure and Use Cases:

http://www.infoq.com/interviews/ian-robinson-neo4j?utm_source=infoq&utm_medium=videos_homepage&utm_campaign=videos_row1

Title: Re: Neo4j
Post by: fajar on August 29, 2014, 20:55:55 PM
How's that compare with CDO?
Is there any IStore implementation in CDO for Neo4j?
Title: Re: Neo4j
Post by: Phil Beauvoir on August 29, 2014, 20:59:30 PM
Quote from: fajar on August 29, 2014, 20:55:55 PM
How's that compare with CDO?
Is there any IStore implementation in CDO for Neo4j?

I don't know enough about it at this stage. You may be interested in Neo4EMF:

http://neo4emf.com/
Title: Re: Neo4j
Post by: kdwykleingeld on September 04, 2014, 10:28:20 AM
Hi, this would be really great! I was already preparing a sample data set (based on an Archi CSV export) for Neo4j import (through its jexp batch importer) so that I can do some Cypher querying. Storing the data natively in a graph database would avoid that step.
Regards, Koen
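
The CSV-to-Neo4j preparation step mentioned above could look roughly like the following Python sketch. It is only an illustration, not the batch importer's own input format: the column names (ID, Type, Name, Source, Target), the sample rows, the `Element` label and the `RELATES` relationship type are all assumptions and should be checked against a real export. It simply turns CSV rows into Cypher `MERGE` statements as text.

```python
import csv
import io

# Hypothetical sample data; column names are assumed for illustration only.
ELEMENTS_CSV = """ID,Type,Name
e1,BusinessProcess,Handle Claim
e2,ApplicationFunction,Register Claim
"""

RELATIONS_CSV = """ID,Type,Source,Target
r1,UsedByRelationship,e2,e1
"""


def cypher_string(value):
    """Quote a Python string as a single-quoted Cypher string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def element_statements(csv_text):
    """Yield one MERGE statement per element row."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield "MERGE (:Element {id: %s, type: %s, name: %s});" % (
            cypher_string(row["ID"]),
            cypher_string(row["Type"]),
            cypher_string(row["Name"]),
        )


def relation_statements(csv_text):
    """Yield one statement per relation row, linking two existing elements."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield (
            "MATCH (s:Element {id: %s}), (t:Element {id: %s}) "
            "MERGE (s)-[:RELATES {id: %s, type: %s}]->(t);"
        ) % (
            cypher_string(row["Source"]),
            cypher_string(row["Target"]),
            cypher_string(row["ID"]),
            cypher_string(row["Type"]),
        )


# Print the generated script: elements first, then relations.
for statement in element_statements(ELEMENTS_CSV):
    print(statement)
for statement in relation_statements(RELATIONS_CSV):
    print(statement)
```

Generating plain statements keeps the sketch independent of any driver; for large models a bulk import tool would be the more sensible route.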
Title: Re: Neo4j
Post by: Bart Ratgers on September 09, 2014, 06:11:30 AM
I think the adoption of Neo4j is a step forward for market adoption and also opens up other possibilities. In the long term it makes it possible to adopt other parts of the TOGAF repository concept, for example a discussion log (for instance by adopting standard Eclipse plugins), or Reference Libraries and a Standards Information Base. I'm not sure if this matches Phil's ideas, but I think with these enhancements it must be possible to create a tool that supports the architecture process of a company.
Title: Re: Neo4j
Post by: Jean-Baptiste Sarrodie on September 09, 2014, 08:37:49 AM
@All,

Can someone explain to me the real (not marketing) advantage of using Neo4j?

I must admit that for the moment I only see it as another DB to persist the model, and I see absolutely no advantage over a "classic" SQL DB. I also fear that very few people would learn its specific query language.

Is it only a geek thing, or am I really missing something?

JB
Title: Re: Neo4j
Post by: Phil Beauvoir on September 09, 2014, 09:02:52 AM
For me it sounds interesting for its own sake. The main thing is ACID.  :)

And much faster for querying on relationships rather than usual Join type queries.

It's not something I'm necessarily endorsing for Archi, it just looks quite cool.  8)
Title: Re: Neo4j
Post by: Jean-Baptiste Sarrodie on September 09, 2014, 11:24:34 AM
Quote from: Phil Beauvoir on September 09, 2014, 09:02:52 AM
For me it sounds interesting for its own sake. The main thing is ACID.  :)

SQLite is ACID too ;-)

Quote
And much faster for querying on relationships rather than usual Join type queries.

I'm not against speed, but I'm not sure this is a key point in our usage.

Quote
It's not something I'm necessarily endorsing for Archi, it just looks quite cool.  8)

Thus the "geeky" part of my remark ;-)

To be clear, I'm not against Neo4j, it's just I don't understand its real power (yet?).
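
To make the "SQLite is ACID too" remark concrete, here is a minimal Python sketch using the standard `sqlite3` module. The two-table schema and the helper names (`open_store`, `save_atomically`) are made up for illustration and are not Archi's storage format. It shows atomicity only: either all rows of a save are written, or none are.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE elements (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE relations ("
        " id TEXT PRIMARY KEY, source TEXT NOT NULL, target TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_atomically(conn, elements, relations):
    """Insert all rows or none of them; return True on success."""
    try:
        # As a context manager the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.executemany("INSERT INTO elements VALUES (?, ?)", elements)
            conn.executemany("INSERT INTO relations VALUES (?, ?, ?)", relations)
        return True
    except sqlite3.IntegrityError:
        return False


def count(conn, table):
    """Count rows; the table name comes from this sketch only, never user input."""
    return conn.execute("SELECT COUNT(*) FROM " + table).fetchone()[0]


store = open_store()
# The duplicate relation id makes the second insert fail, so the
# element rows written just before it are rolled back as well.
ok = save_atomically(
    store,
    [("e1", "Customer"), ("e2", "Order")],
    [("r1", "e1", "e2"), ("r1", "e2", "e1")],
)
print(ok, count(store, "elements"), count(store, "relations"))
```

This says nothing about concurrent multi-user editing, which is a separate question from transactional safety.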
Title: Re: Neo4j
Post by: Phil Beauvoir on September 09, 2014, 11:27:16 AM
Quote from: Jean-Baptiste Sarrodie on September 09, 2014, 11:24:34 AM
To be clear, I'm not against Neo4j, it's just I don't understand its real power (yet?).

That's how I feel. I have an interest in it from the point of view of the "Archi R&D department".
Title: Re: Neo4j
Post by: Bart Ratgers on September 09, 2014, 15:19:41 PM
I think the biggest advantage of using Neo4j (or another document or graph database) as the storage engine is flexibility and scalability in a larger environment. You can easily create a clustered environment with increased availability and performance. If your existing data doesn't require a relational database, then a document or graph-oriented database (NoSQL) is a good alternative.

Another advantage I see is the re-use of an existing library and solution. I think that besides the storage engine, a lot of work needs to be done to manage the integrity of the complete model and to signal when several users are working together on one project.

I have some experience as an architect (not as a software developer) with NoSQL storage engines, and the ease of management and scalability are big pros.
Title: Re: Neo4j
Post by: techlogix on September 23, 2014, 14:04:35 PM
In enterprise environments it would be possible (or at least easier) to write a more performant Visualizer outside the IDEs, e.g. a single-page application on a website that navigates the model like the Visualizer, in read-only mode and with very high performance.
Title: Re: Neo4j
Post by: dknol on September 26, 2014, 16:46:29 PM
I've started to investigate the usage of EMFStore from an Archi plugin. I am able to store the model, including changes.

EMFStore should be one of the easiest ways to accomplish shared working, versioning etc.

What I've done is create an EMFStore "project" and attach the Archi model. This basically works, but there are issues with IDs, and at some point (no clue why or how) the Archi model becomes corrupt when I try to save it.

I can share my code if someone is interested.
Title: Re: Neo4j
Post by: Jean-Baptiste Sarrodie on September 26, 2014, 18:28:52 PM
Quote from: dknol on September 26, 2014, 16:46:29 PM
I can share my code if someone is interested.

Hi, that would be great. I can create a git repository under https://github.com/archi-contribs and give you rights on it. If you're OK with it, I would suggest "archi-emfstore-plugin" as the repository name and "org.archicontribs.emfstore" as the base package for the code (not mandatory at all, just a proposed convention if you don't have one already).

JB
Title: Re: Neo4j
Post by: adeze on November 26, 2014, 02:59:43 AM
By leveraging the ArchiMate exchange format (https://www2.opengroup.org/ogsys/catalog/S142) and the code from the plug-in, a lot of the work has conceptually been done; all that's needed is a mechanism to CRUD the nodes in Neo4j.
I've been contemplating the analysis of models for quite a while, and whilst the queries could be very clever, you could also consider syncing with a CMS like structr.org as the analysis tool/viewer, rather than a static report.

Title: Re: Neo4j
Post by: Phil Beauvoir on November 26, 2014, 20:36:20 PM
Quote from: adeze on November 26, 2014, 02:59:43 AM
By leveraging the ArchiMate exchange format (https://www2.opengroup.org/ogsys/catalog/S142) and the code from the plug-in, a lot of the work has conceptually been done; all that's needed is a mechanism to CRUD the nodes in Neo4j.
I've been contemplating the analysis of models for quite a while, and whilst the queries could be very clever, you could also consider syncing with a CMS like structr.org as the analysis tool/viewer, rather than a static report.

I look forward to the day when the ArchiMate exchange format is validated and widely used.
Title: Re: Neo4j
Post by: Morat on February 12, 2015, 09:18:16 AM
Quote from: Jean-Baptiste Sarrodie on September 09, 2014, 08:37:49 AM

Can someone explain to me the real (not marketing) advantage of using Neo4j?


I registered to do just that. I've been using Archi extensively to document an existing, complex legacy applications architecture involving tens of systems and their associated interfaces, business processes and data objects. I pulled the Archi model into Neo4j and also translated it into GEXF, an open XML-based format for graph storage.

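The advantages of having the model available in Neo4j were huge:

  • Once you have it in Neo4j you can use cypher (Neo's query language) to analyse it, which helps with impact analysis, consistency checking and examining subsets of the architecture.
  • You can visualise it using Gephi and other similar tools, which can really help illustrate complexity of the model without the overhead of having to manually produce the diagrams. Graphviz is also very handy for this.
  • If you use properties, for example to show which parts of the architecture exist at which points in time, you can examine slices of the model through time. Especially useful for relationships, which aren't first-class citizens in Archimate and can't be contained in plateaus.
  • You can link other external resources to the elements of the model.
  • You can easily access the model programmatically through the many adapter libraries available (Java, Ruby, Python, R, etc...).
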
It also opens up the model to being stored as a centralised, shareable resource in a way that isn't currently safe with the file-based storage we have.

Native neo4j storage would be incredible.
Title: Re: Neo4j
Post by: Phil Beauvoir on February 12, 2015, 18:09:59 PM
Hi Morat,

thanks for sharing this information, it sounds very interesting. I guess what we really need is an architecture where models are stored internally in EMF (as they are now), with persistence connectors that allow saving to various formats: XML, Neo4j, SQL DB, NoSQL DB, etc.

Phil
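
The "persistence connectors" idea can be sketched in a few lines of Python. Everything here is hypothetical: the `Model` class merely stands in for the in-memory EMF model, the XML and JSON layouts are invented and are not Archi's file format, and a Neo4j or SQL connector would be a further subclass of the same interface.

```python
import json
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Model:
    """Hypothetical stand-in for the in-memory model (EMF objects in Archi)."""
    name: str
    elements: dict = field(default_factory=dict)    # id -> (type, name)
    relations: list = field(default_factory=list)   # (type, source id, target id)


class PersistenceConnector(ABC):
    """One connector per storage format; the model itself never changes."""

    @abstractmethod
    def dumps(self, model):
        """Serialise the model to text in this connector's format."""


class XmlConnector(PersistenceConnector):
    def dumps(self, model):
        root = ET.Element("model", name=model.name)
        for eid, (etype, ename) in model.elements.items():
            ET.SubElement(root, "element", id=eid, type=etype, name=ename)
        for rtype, src, tgt in model.relations:
            ET.SubElement(root, "relation", type=rtype, source=src, target=tgt)
        return ET.tostring(root, encoding="unicode")


class JsonConnector(PersistenceConnector):
    def dumps(self, model):
        document = {
            "name": model.name,
            "elements": [
                {"id": eid, "type": etype, "name": ename}
                for eid, (etype, ename) in model.elements.items()
            ],
            "relations": [
                {"type": rtype, "source": src, "target": tgt}
                for rtype, src, tgt in model.relations
            ],
        }
        return json.dumps(document, sort_keys=True)


# Registry of available formats; adding a backend means adding one entry.
CONNECTORS = {"xml": XmlConnector(), "json": JsonConnector()}


def save(model, fmt):
    """Serialise the model with the connector registered under fmt."""
    return CONNECTORS[fmt].dumps(model)


demo = Model("demo", {"e1": ("BusinessActor", "Customer")}, [])
print(save(demo, "xml"))
print(save(demo, "json"))
```

The point of the design is that the editor only ever talks to the in-memory model, while each storage format lives behind the same small interface.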
Title: Re: Neo4j
Post by: Jean-Baptiste Sarrodie on February 12, 2015, 19:35:59 PM
Hi,

First of all, thank you for helping me to understand neo4j ;-)

Quote
I've been using Archi extensively to document an existing, complex legacy applications architecture involving tens of systems and their associated interfaces, business processes and data objects. I pulled the Archi model into Neo4j and also translated it into GEXF, an open XML-based format for graph storage.

I'm still not sure about Neo4j, but I do know Gephi and thus the GEXF format. That's a really powerful graph exploration solution. I'm currently working on a new HTML export for Archi, and the Java library used for that (StringTemplate (http://www.stringtemplate.org/)) could easily be used to build generic exporters, and thus a GEXF one. Would you find that useful?
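
To give an idea of how small a GEXF exporter can be, here is a rough Python sketch using only the standard library (not StringTemplate, and not the planned Archi exporter). The function name `to_gexf` and the input tuples are assumptions; the output follows the basic GEXF 1.2 layout of a `graph` containing `nodes` and `edges`.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def to_gexf(elements, relations):
    """Build a minimal GEXF document.

    elements:  iterable of (id, label) pairs
    relations: iterable of (source id, target id, label) triples
    """
    # Serialise the GEXF namespace as the default namespace (no prefix).
    ET.register_namespace("", GEXF_NS)

    def tag(name):
        return "{%s}%s" % (GEXF_NS, name)

    root = ET.Element(tag("gexf"), version="1.2")
    graph = ET.SubElement(root, tag("graph"), mode="static", defaultedgetype="directed")

    nodes = ET.SubElement(graph, tag("nodes"))
    for element_id, label in elements:
        ET.SubElement(nodes, tag("node"), id=element_id, label=label)

    edges = ET.SubElement(graph, tag("edges"))
    for index, (source, target, label) in enumerate(relations):
        ET.SubElement(
            edges, tag("edge"),
            id=str(index), source=source, target=target, label=label,
        )

    return ET.tostring(root, encoding="unicode")


# Hypothetical two-element model with one relationship.
print(to_gexf(
    [("e1", "Register Claim"), ("e2", "Claim Record")],
    [("e1", "e2", "writes")],
))
```

Element types and properties could be carried as GEXF attributes, which this sketch leaves out.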

Quote
The advantages of having the model available in Neo4j were huge:

  • Once you have it in Neo4j you can use cypher (Neo's query language) to analyse it, which helps with impact analysis, consistency checking and examining subsets of the architecture.
  • You can visualise it using Gephi and other similar tools, which can really help illustrate complexity of the model without the overhead of having to manually produce the diagrams. Graphviz is also very handy for this.
  • If you use properties, for example to show which parts of the architecture exist at which points in time, you can examine slices of the model through time. Especially useful for relationships, which aren't first-class citizens in Archimate and can't be contained in plateaus.
  • You can link other external resources to the elements of the model.
  • You can easily access the model programmatically through the many adapter libraries available (Java, Ruby, Python, R, etc...).

From that list I'd say that most of the points are related to Gephi; the one that interests me the most is the use of Cypher (Neo's query language). Could you elaborate and provide some examples of what you can do with it in an EA context?

Quote
It also opens up the model to being stored as a centralised, shareable resource in a way that isn't currently safe with the file-based storage we have.

I don't see flat files as less safe than a central, binary-based solution. I'd add that export/import to a central Neo4j instance doesn't mean that it becomes "automagically" shareable. These are two separate things: several Archi users have developed central DB repositories, but each time without the multi-user/concurrent editing feature.

Quote
Native neo4j storage would be incredible.

I like such enthusiasm ;-)

Regards

JB
Title: Re: Neo4j
Post by: Morat on February 22, 2015, 09:56:32 AM
Quote
the one that interests me the most is the use of cypher (Neo's query language). Could you elaborate and provide some example of what you can do with it in EA context

It's very useful for impact analysis and determining relationships due to the ability to follow chains of relationships in a way that's more challenging in a traditional relational database. I did some analysis to determine which business processes were ultimately reliant on which data objects and through which application functions with this. It's also good for finding interesting features in your model, e.g. 'show me the data objects that are written by more than one application function' or 'which application functions do not yet have certain kinds of relationships modelled'.
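
The two kinds of question described above can be illustrated with a small self-contained Python sketch. The model data, element names and relationship names are invented, and this is plain in-memory traversal rather than the actual Neo4j queries that were used. In Cypher, the first question would look roughly like `MATCH (p:BusinessProcess)-[*]->(d:DataObject) RETURN DISTINCT p.name, d.name`, assuming the elements were loaded with those labels.

```python
from collections import defaultdict, deque

# Hypothetical model: element name -> type.
TYPES = {
    "Handle Claim": "BusinessProcess",
    "Pay Claim": "BusinessProcess",
    "Register Claim": "ApplicationFunction",
    "Execute Payment": "ApplicationFunction",
    "Archive Claim": "ApplicationFunction",
    "Claim Record": "DataObject",
    "Payment Record": "DataObject",
}

# Hypothetical relationships as (source, relationship, target) triples.
RELATIONS = [
    ("Handle Claim", "uses", "Register Claim"),
    ("Pay Claim", "uses", "Execute Payment"),
    ("Register Claim", "writes", "Claim Record"),
    ("Execute Payment", "reads", "Claim Record"),
    ("Execute Payment", "writes", "Payment Record"),
    ("Archive Claim", "writes", "Claim Record"),
]


def reachable(start):
    """Return every element reachable from start along outgoing relationships."""
    outgoing = defaultdict(list)
    for source, _, target in RELATIONS:
        outgoing[source].append(target)
    seen, queue = set(), deque([start])
    while queue:
        for nxt in outgoing[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def data_objects_per_process():
    """Which data objects does each business process ultimately rely on?"""
    return {
        name: sorted(e for e in reachable(name) if TYPES[e] == "DataObject")
        for name, kind in TYPES.items()
        if kind == "BusinessProcess"
    }


def multi_writer_data_objects():
    """Data objects written by more than one application function."""
    writers = defaultdict(set)
    for source, relationship, target in RELATIONS:
        if relationship == "writes" and TYPES[target] == "DataObject":
            writers[target].add(source)
    return sorted(name for name, who in writers.items() if len(who) > 1)


print(data_objects_per_process())
print(multi_writer_data_objects())
```

In a graph database the traversal is expressed declaratively in the query, whereas a relational store would typically need recursive joins to follow chains of arbitrary length.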

I got most value out of it when the model is combined with other more detailed information, for example a mapping of interface data usage and system to logical data models. We were able to determine from querying the model that one of the systems has internal movement of data between tables due to gaps in the expected results when we looked for the linkage between an upstream system mastering data and the downstream system that was using it.

This work was done in a solutions architect context, but the ability to query your model in this way and make inferences based on the relationships is relevant whether you're working in the SA or EA space, especially as the architecture model becomes complex. It's like having the visualiser window on steroids.
Title: Re: Neo4j
Post by: Stef Joosten on November 17, 2016, 16:46:10 PM
This topic should be linked with multi-user Archi.

Collaborative editing of Archi models requires a persistent store with transactional facilities. Developing that on MySQL (or MariaDB) takes more effort than doing it on a graph database (like Neo4j). Even with the learning curve included, Neo4j would be preferable. The better performance is a nice perk, but not essential.

That is why I would applaud building Archi on top of Neo4j, just to get a collaborative Archi.
(By the way, a MySQL-based Archi repo could just as well fulfill the needs for collaborative use of Archi. But why use worse technology over better?)