How large a model is too large?

Started by kkosienski, October 30, 2020, 15:59:54 PM


kkosienski

Hi everyone,

Looking to start a discussion on approaches for breaking up a model to reduce its size and improve the performance of initial imports, commits, and publishes. We have one enterprise model that we currently use across our federated architecture community. The main driver for this initially was that we have a lot of common shared elements that we import from our EAMS. The EAMS serves as the system of record for the elements. Our model has now grown to almost 7000 elements, 9000 relationships and about 700 views. This happened over about 7 months since we originally implemented it. We are starting to see performance issues with GitHub interactions. For example, an initial import is taking over 30 minutes. The time to refresh the model, commit, and publish is increasing materially, to the point that it is impacting efficiency. Looking for input from anyone in the forum who may have already experienced this, and how you approached solving it. Thanks

Phil Beauvoir

#1
Hi,

I can't advise on how best to manage or split large models, but I'd like to say that we're aware that coArchi will suffer performance-wise with many elements and relationships, due to the time taken to create, aggregate and disaggregate thousands of XML files and manage those files using git. We have started work on coArchi version 2, where this will no longer be an issue.

Phil
If you value and use Archi, please consider making a donation!
Ask your ArchiMate related questions to the ArchiMate Community's Discussion Board.

kkosienski

Thanks Phil for the heads up on coArchi version 2. In the interim, we will continue to look at the options that work best for our architecture practice. Any ballpark estimates on when the new version may be available?

Phil Beauvoir

> Any ballpark estimates on when the new version may be available?

2021 Q1 for beta versions.

Jean-Baptiste Sarrodie

Hi,

Quote from: kkosienski on October 30, 2020, 15:59:54 PM
We are starting to see performance issues with GitHub interactions. For example, an initial import is taking over 30 minutes. The time to refresh the model, commit, and publish is increasing materially, to the point that it is impacting efficiency.

I'll start with this "technical" question first. The way coArchi works means that each element, relationship or view is stored in its own small XML file. Each time you run an action (refreshing, committing, publishing), all those files are re-created. In your case, this means that you generate 16700 XML files (7000 + 9000 + 700) for each action. This can of course take some time, but usually the real bottleneck is the anti-virus software, which checks each of those files and leads to poor performance. If you can, try to import and work on the model using a workstation on which the anti-virus has been disabled (or even better, one without any). If you see a big improvement, then you'll know where to look. FWIW, one of my models needs 3 min for a commit with anti-virus and 5 sec without. This also impacts the initial import, as it downloads all versions of all XML files.
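If you can't easily compare two workstations, a rough way to check the per-file overhead is to time the creation of the same number of tiny files outside Archi. The script below is only a sketch: the function name and the file content are made up for the test and are not coArchi's real file format. Run it once in a scanned folder and once in an excluded one (or with the anti-virus paused) and compare the timings.

```python
import tempfile
import time
from pathlib import Path


def time_small_file_writes(count: int, directory: Path) -> float:
    """Write `count` tiny XML files into `directory` and return the elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        # Placeholder content only; this is NOT the format coArchi writes.
        (directory / f"item_{i}.xml").write_text(
            f'<item id="{i}" name="placeholder"/>\n', encoding="utf-8"
        )
    return time.perf_counter() - start


if __name__ == "__main__":
    # Elements + relationships + views from the model discussed above.
    count = 7000 + 9000 + 700
    with tempfile.TemporaryDirectory() as tmp:
        elapsed = time_small_file_writes(count, Path(tmp))
    print(f"Wrote {count} files in {elapsed:.2f} s")
```

This only measures raw file creation, so it will not match coArchi's timings exactly, but a large difference between the two runs points at the scanner.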

Quote from: kkosienski on October 30, 2020, 15:59:54 PM
We have one enterprise model we are currently using across our federated architecture community.  The main driver initially for this was that we have a lot of common shared elements that we import from our EAMS. The EAMS serves as a system of record for the elements.  Our model has now grown to almost 7000 elements, 9000 relationships and about 700 views.  This happened in the course of about 7 months since we originally implemented it.

That's a bit too big IMHO (I usually consider a model to be "big" when it reaches 5000 concepts), but the real question is: do you need all those elements coming from your EAMS? Very often, the majority of those elements are not used and are only there because of some automated synchronization... "just in case". In your case, you should try to make sure you don't import unnecessary things.
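To get a feel for how many imported elements are never shown on a diagram, you could scan a saved copy of the model file. The sketch below rests on assumptions about the .archimate XML layout that you should check against your own file: that concepts and views are `<element>` nodes with an `xsi:type` attribute, that view types end in "Model" and relationship types end in "Relationship", and that diagram objects point at their element through an `archimateElement` attribute. The function name is made up for this example.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xsi:type attribute as ElementTree exposes it.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def find_unused_elements(root: ET.Element) -> list[str]:
    """Return the sorted names of elements that no diagram object refers to."""
    defined = {}        # element id -> element name
    referenced = set()  # element ids that appear on at least one view

    for node in root.iter():
        # Assumption: diagram objects reference elements via this attribute.
        ref = node.get("archimateElement")
        if ref:
            referenced.add(ref)

        if node.tag == "element" and node.get("id"):
            xsi_type = node.get(XSI_TYPE, "")
            # Assumption: views end in "Model", relationships in "Relationship".
            if xsi_type.endswith(("Model", "Relationship")):
                continue
            defined[node.get("id")] = node.get("name", "")

    return sorted(name for id_, name in defined.items() if id_ not in referenced)
```

Usage would be something like `find_unused_elements(ET.parse("model.archimate").getroot())`, where the file name is a placeholder. Keep in mind that an element absent from every view may still matter through its relationships, so treat the result as a list of candidates to review rather than a delete list.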

You should look at how to partition your model. There are several ways to do so, depending on how you use your model: you can keep stable things in one model and projects in dedicated models (for this you can keep "stable things" in the master branch and create a new project branch off of master when needed for a project). You can also split your model by business domain.

Of course, there will most certainly be some subset which is common to most of your models. In this case you can manage it as a sub-model (keeping it to the bare minimum) and import it into all your models. You'll then be able to update the source model and import it again later.

Regards,

JB

kkosienski

Thanks JB. Great suggestions on how we may want to think about breaking up our model. We will look into the potential anti-virus issue, but there is no hope of not running it since these are company-supplied devices. Maybe we can skip scanning of XML files if there are no associated risks.

kkosienski

Hi All,

I have been giving some thought to the suggestions Jean-Baptiste provided around partitioning models. I wanted to throw out an idea for a possible enhancement that could make it easier for large architecture teams to share content between EA and SA functions. At the risk of oversimplifying the solution, what I am suggesting is that the current "view reference" functionality be extended so that it can be leveraged across models.

I am less interested in sharing the elements, as we can manage that by building processes to import shared elements into the different models we're using. What I feel is more important for us is to be able to share common / standard views of architecture components that are being leveraged in business systems solutions, with those reference views being built and managed in their own models. For example, a solution architecture being developed within a project could refer to a view that represents the reference architecture for a business systems platform.

My feeling is that it would be adequate to render the reference in a more passive way, like displaying an image. What would be important, though, is to still be able to add relationships to the view reference in the view where it is being used. I am not quite sure how important it would be to capture the relationships (Model:View) in the model being referred to, where the referenced view lives. I guess it would be nice to have, to align with the "analysis" functionality that exists in Archi today.

I see the following items as prerequisites for making this work:

1) All models would have to be stored locally in a location set as a preference. Models outside that location would not be able to use view references across models.
2) When creating a view reference, the context would have to be known to drive behavior. If the reference was made within the same model, the functionality would work as it does today. When the reference spanned models, the application would need to verify that the model and view existed physically in the local model repository. There may also need to be other user inputs, additional properties, or preferences to manage cross-model view references.
3) Lastly, I think some thought would have to go into how best to manage the historical aspect of an architecture view, if it was important to maintain a point-in-time reference or version of an architecture view. Maybe a cross-model reference actually creates an image file for the view being referenced, which gets stored as an artifact managed in GitHub as part of the repository it is being used in.

There are probably several different ways to design this, each with its own set of pros and cons.

I am probably overlooking a lot of the inner workings of the application that make this a lot more complex than it sounds. Interested in hearing your thoughts on this concept of allowing views to be shared across models.

-Kevin