atomicapp
nulecule component graph rendering is non-Hamiltonian
From an examination of the code in nulecule/base.py, it would appear that what is referred to as a graph is actually rendered as nested lists of lists.
The predicament this raises is the potential for the same "node" to appear multiple times in the data structure if it is a dependency of multiple other "nodes", which invariably results in an error and terminates execution when the "node" is deployed to a provider a second time.
To avoid such a predicament, an acyclic Hamiltonian graph is needed in place of the nested list of lists. networkx is one library that could help with topological sorting.
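To illustrate the duplication problem, here is a minimal sketch. It does not use the actual data structure from nulecule/base.py; the `[name, [dependencies]]` layout, the component names, and both helper functions are hypothetical stand-ins for "nested lists of lists".

```python
# Hypothetical nested-list rendering: each entry is [name, [dependency entries]].
# "mongo" is a dependency of two different components.
nested = [
    ["api", [["mongo", []]]],
    ["worker", [["mongo", []]]],
]


def walk(nodes):
    """Naive recursive walk, dependencies first; shared nodes repeat."""
    visited = []
    for name, deps in nodes:
        visited.extend(walk(deps))
        visited.append(name)
    return visited


def walk_once(nodes, seen=None):
    """Same walk, but each component name is emitted only the first time."""
    seen = set() if seen is None else seen
    ordered = []
    for name, deps in nodes:
        ordered.extend(walk_once(deps, seen))
        if name not in seen:
            seen.add(name)
            ordered.append(name)
    return ordered


naive = walk(nested)        # "mongo" appears once per dependent
unique = walk_once(nested)  # "mongo" appears exactly once
```

Tracking a `seen` set only papers over the symptom; a proper graph with unique nodes would make the uniqueness structural rather than incidental.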
@rtnpro ^^ do you mind taking a look at this?
I should probably explain this a little differently, after starting to work on a PR of my own.
Possibly a better explanation is that due to the nature of how the components are processed in a recursive manner, as opposed to composing a directed graph of components during the unpack method from which the other methods in atomicapp/nulecule/main.py operate, what ought to be a unique set of components that are processed just once is in fact indeterminate with no guarantee of uniqueness. For example I have a couple services depending on mongo, and when attempting to run mongo a second time, BOOM. The intention of that safeguard appears to be avoidance of infinite recursion.
The potential for infinite recursion can be eliminated by composing a directed graph with a determinate set of components and edges to define the topological ordering during the unpack method in main.py. The run method should operate over a reverse topological sort, and the stop method over a forward sort. The ordering itself will not be guaranteed as there exists the possibility for multiple valid traversals. For the other methods that depend on the unpack method, the ordering seems irrelevant.
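As a rough sketch of the ordering idea, the snippet below uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) rather than networkx, purely to stay dependency-free. The component names and the `run_order`, `stop_order`, and `has_cycle` helpers are hypothetical and not part of atomicapp. Here each component maps to the set of components it depends on, so `static_order()` yields dependencies first.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical components: each key maps to the components it depends on.
dependencies = {
    "api": {"mongo"},
    "worker": {"mongo"},
    "mongo": set(),
}


def run_order(deps):
    """Dependencies before dependents, every component exactly once."""
    return list(TopologicalSorter(deps).static_order())


def stop_order(deps):
    """Dependents before dependencies: the run order reversed."""
    return list(reversed(run_order(deps)))


def has_cycle(deps):
    """Return True if the dependency mapping contains a cycle."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError:
        return True
    return False


start = run_order(dependencies)   # "mongo" comes first
stop = stop_order(dependencies)   # "mongo" comes last
```

The relative order of `api` and `worker` is not fixed, which matches the point that several valid traversals can exist. A cycle is reported up front by `has_cycle` instead of surfacing as unbounded recursion.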
Composing a graph during the unpack phase adds the bonus that the Nulecule objects are only rendered once from the filesystem. I think it also eliminates the need to nest external directories during unpacking. Plus it lends itself to the possibility of defining an answers.conf in which all recursive dependencies are addressable.
My efforts are a little stalled at the moment, but I should be able to pick them up shortly if @rtnpro doesn't beat me to it.
So far, atomicapp has been focused on container orchestrators like kubernetes and openshift, where ordering of execution does not matter much, as those systems are designed to eventually bring things up in a stable way.
> I should probably explain this a little differently, after starting to work on a PR of my own.
> Possibly a better explanation is that due to the nature of how the components are processed in a recursive manner, as opposed to composing a directed graph of components during the unpack method from which the other methods in atomicapp/nulecule/main.py operate, what ought to be a unique set of components that are processed just once is in fact indeterminate with no guarantee of uniqueness. For example I have a couple services depending on mongo, and when attempting to run mongo a second time, BOOM. The intention of that safeguard appears to be avoidance of infinite recursion.
That said, the orchestrators will not be able to handle the above scenario, and it needs to be taken care of in atomicapp. :+1: for pointing it out.
> The potential for infinite recursion can be eliminated by composing a directed graph with a determinate set of components and edges to define the topological ordering during the unpack method in main.py. The run method should operate over a reverse topological sort, and the stop method over a forward sort. The ordering itself will not be guaranteed as there exists the possibility for multiple valid traversals. For the other methods that depend on the unpack method, the ordering seems irrelevant.
It makes sense to implement the above in atomicapp, so that atomicapp can work across providers where ordering of events matters, e.g., docker-swarm.
> Composing a graph during the unpack phase adds the bonus that the Nulecule objects are only rendered once from the filesystem. I think it also eliminates the need to nest external directories during unpacking. Plus it lends itself to the possibility of defining an answers.conf in which all recursive dependencies are addressable.
:+1:
Please let me know if you have any questions or anything to discuss.
Also, I have been working on another PR, #720, which aims to improve the way we handle config data in a (nested) atomicapp.
Thanks for the feedback, @rtnpro
I acknowledge that "relying" on birth order runs contrary to one of the key tenets of orchestration platforms, but I also believe there's no harm in exploiting that information to make for a smoother deployment, one that is less dependent on retry mechanisms.
I have since made more progress on a PR, and intend to have something by end of week.
WIP is here in case anyone has any comments prior to a PR.
On Mon, Jun 13, 2016 at 5:34 AM, Khris Richardson wrote:
> WIP is here in case anyone has any comments prior to a PR.
Seems radical :+1: I am looking into it and I will get back to you. Thanks!
On Mon, Jun 13, 2016 at 3:32 PM, Ratnadeep Debnath wrote:
> Seems radical :+1: I am looking into it and I will get back to you. Thanks!
LGTM :+1:. Please create a PR for it :)
@khrisrichardson
I was not able to run the wordpress atomicapp example. It seems to have failed to render artifacts for the docker provider for the mother Nulecule app with id wordpress-atomicapp. Here are the logs:
[vagrant@centos7-adb atomicapp]$ atomicapp -v run --provider docker ../nulecule-library/wordpress-centos7-atomicapp/
INFO :: - cli/main.py :: Atomic App: 0.5.2 - Mode: Run
DEBUG :: - cli/main.py :: Final parsed cmdline: run -v --provider docker ../nulecule-library/wordpress-centos7-atomicapp/
DEBUG :: - nulecule/main.py :: NuleculeManager init app_path: ../nulecule-library/wordpress-centos7-atomicapp/
DEBUG :: - nulecule/main.py :: NuleculeManager init image: None
DEBUG :: - nulecule/main.py :: Request to unpack to projectatomic/mariadb-centos7-atomicapp to ../nulecule-library/wordpress-centos7-atomicapp//external/mariadb-centos7-atomicapp
INFO :: - nulecule/base.py :: Unpacking image projectatomic/mariadb-centos7-atomicapp to ../nulecule-library/wordpress-centos7-atomicapp//external/mariadb-centos7-atomicapp
INFO :: - nulecule/base.py :: Found existing destination ../nulecule-library/wordpress-centos7-atomicapp//external/mariadb-centos7-atomicapp
ANSWER >> db_user (Database User): sdf
ANSWER >> db_pass (Database Password): sdf
ANSWER >> db_name (Database Name): sdf
INFO :: - nulecule/base.py :: Provider not specified, using default provider - kubernetes
DEBUG :: - plugin.py :: Configuration option 'provider-config' not provided
DEBUG :: - plugin.py :: Configuration option 'provider-config' not provided
DEBUG :: - providers/docker.py :: Given config: {u'db_pass': 'sdf', u'db_name': 'sdf', u'db_user': 'sdf', u'root_pass': u'MySQLPass', 'provider': u'kubernetes', 'namespace': 'default'}
DEBUG :: - providers/docker.py :: Namespace: default
WARN :: - providers/docker.py :: The artifact name has not been provided within Nulecule, using a UUID instead
DEBUG :: - providers/docker.py :: No image name found for artifact, using UUID 2cb4ecdb3a45 in container name
INFO :: - providers/docker.py :: Deploying to provider: Docker
INFO :: - providers/docker.py :: WARNING: Using --name provided within artifact file.
ERROR :: - cli/main.py :: Data for provider "docker" are not part of this app
But, it's not a blocker to open up a PR. Your work looks great. Once you open the PR, we can collaborate on it better.
Thanks :)
Hi @rtnpro Thanks so much for the review, but I think I may have asked for it a little prematurely.
I see the problem and it stems from a little hubris on my part for thinking that the NuleculeBase abstract base class and NuleculeComponent could "easily" be squashed into the Nulecule class with no side effects.
Whereas the iteration in master cleanly assigns the "params" and "artifacts" attributes, I made a terrible assumption that Nulecule.id was necessarily equivalent to the name of the item in self.graph.
That bad assumption lies both here and here.
That indicates to me that the changes I've made are in violation of the spec, because it assumes a one-to-one relationship between Nulecule manifest and "internal" component and that all other components are "external". That assumption matches my scenario, but is non-compliant.
I will try to address this before submitting a PR, and make sure the examples in the nulecule-library repo work first.
While I have your ear, it's unclear to me from the spec which has precedence in the event of a conflict, global params or graph object params. I would assume the latter. Cheers
On Tue, Jun 14, 2016 at 8:34 PM, Khris Richardson wrote:
> Hi @rtnpro Thanks so much for the review, but I think I may have asked for it a little prematurely.
> I see the problem and it stems from a little hubris on my part for thinking that the NuleculeBase abstract base class and NuleculeComponent could "easily" be squashed into the Nulecule class with no side effects.
I wish I could do away with these abstractions, but this was the best thing that I could think of.
> Whereas the iteration in master cleanly assigns the "params" and "artifacts" attributes, I made a terrible assumption that Nulecule.id was necessarily equivalent to the name of the item in self.graph.
> That bad assumption lies both here and here.
> That indicates to me that the changes I've made are in violation of the spec, because it assumes a one-to-one relationship between Nulecule manifest and "internal" component and that all other components are "external". That assumption matches my scenario, but is non-compliant.
> I will try to address this before submitting a PR, and make sure the examples in the nulecule-library repo work first.
Awesome, looking forward to it :)
> While I have your ear, it's unclear to me from the spec which has precedence in the event of a conflict, global params or graph object params. I would assume the latter. Cheers
Graph object params take precedence over global params.
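A minimal sketch of that precedence rule, assuming params can be modelled as flat name/value dictionaries. The `resolve_params` helper and the sample values are hypothetical and do not reflect how atomicapp actually stores or merges config data.

```python
# Hypothetical flat representation of params; real Nulecule params may differ.
global_params = {"db_user": "admin", "db_name": "app"}
graph_object_params = {"db_name": "wordpress"}


def resolve_params(global_params, graph_object_params):
    """Start from the global params, then let graph object params override."""
    merged = dict(global_params)
    merged.update(graph_object_params)
    return merged


resolved = resolve_params(global_params, graph_object_params)
# db_name comes from the graph object, db_user falls back to the global value
```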
We're in the process of documenting how we handle config data and will add it to the SPEC accordingly.
Let me know if you have any questions.