Better interoperability with deep learning frameworks
This issue is to document and discuss the changes necessary to use deep learning frameworks with OpenCog.
Our use cases:
- We would like to be able to pass Python objects, particularly PyTorch tensors, between ExecutionOutputLinks. PyTorch saves in its tensor instances the information necessary for performing backward propagation.
- A more convenient API than GroundedSchemaNode for calling object methods.
Example:
Our motivating example is implementing transparent-by-design networks (https://arxiv.org/pdf/1803.05268) with OpenCog. The idea of this example is answering a question about a picture by applying a series of filters, implemented as PyTorch neural networks. Each NN accepts the original picture plus the mask from the previous filter (initially the mask is all zeroes) and generates a new mask.
First, I describe the current implementation: the ExecutionOutputLink for the question "What is the large purple object made of?":
https://github.com/singnet/semantic-vision/blob/9ca40eedd78eb6aec7af469defd436eace2c4be5/experiments/opencog/pattern_matcher_vqa/tbd_cog/tbd_helpers.py#L140-L166
(ExecutionOutputLink
  (GroundedSchemaNode "py:filter")
  (ListLink
    (ConceptNode "material")
    (VariableNode "$X")
    (ExecutionOutputLink
      (GroundedSchemaNode "py:filter")
      (ListLink
        (ConceptNode "color")
        (ConceptNode "purple")
        (ExecutionOutputLink
          (GroundedSchemaNode "py:filter")
          (ListLink
            (ConceptNode "size")
            (ConceptNode "large")
            (ExecutionOutputLink
              (GroundedSchemaNode "py:init_scene")
              (ListLink
                (VariableNode "$Scene")))))))))
Here the pattern matcher grounds VariableNode "$X" to different ConceptNodes representing materials, due to the constraint:

(InheritanceLink
  (VariableNode "$X")
  (ConceptNode "material"))
where filter is a wrapper that calls a PyTorch module object: https://github.com/singnet/semantic-vision/blob/9ca40eedd78eb6aec7af469defd436eace2c4be5/experiments/opencog/pattern_matcher_vqa/tbd_cog/tbd_helpers.py#L389-L404
def filter(filter_type, filter_type_instance, data_atom):
    module_type = 'filter_' + filter_type.name + '[' + filter_type_instance.name + ']'
    module = tbd.function_modules[module_type]
    atomspace = data_atom.atomspace
    key_attention, key_scene, key_shape_attention, key_shape_scene = generate_keys(atomspace)
    feat_input = extract_tensor(data_atom, key_scene, key_shape_scene)
    feat_attention = extract_tensor(data_atom, key_attention, key_shape_attention)
    out = module(feat_input.float(), feat_attention.float())
    set_attention_map(data_atom, key_attention, key_shape_attention, out)
    return data_atom
and init_scene accepts the scene atom and generates a new atom which holds a dummy attention map and the features extracted from the scene. This atom is then reused to pass values between filters. A rough sketch of what it might look like follows below.
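For concreteness, here is a rough sketch of what init_scene could look like under the current value-based scheme. This is not the repo's actual code; extract_features is a hypothetical helper, and generate_keys is the same helper used by filter above:

import torch
from opencog.atomspace import FloatValue
from opencog.type_constructors import ConceptNode

def init_scene(scene):
    # Sketch only: build a fresh atom that carries the attention map and
    # scene features between successive filter applications.
    atomspace = scene.atomspace
    key_attention, key_scene, key_shape_attention, key_shape_scene = generate_keys(atomspace)

    features = extract_features(scene)            # hypothetical: scene atom -> tensor
    attention = torch.zeros(features.shape[-2:])  # dummy mask, all zeroes

    data_atom = ConceptNode("Data-" + scene.name)
    # Tensors are flattened into FloatValues; shapes are stored under
    # separate keys so that extract_tensor() can rebuild them later.
    data_atom.set_value(key_scene, FloatValue(features.flatten().tolist()))
    data_atom.set_value(key_shape_scene, FloatValue(list(features.shape)))
    data_atom.set_value(key_attention, FloatValue(attention.flatten().tolist()))
    data_atom.set_value(key_shape_attention, FloatValue(list(attention.shape)))
    return data_atom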
There are issues with the current implementation:
a. It requires converting back and forth between PyTorch tensor objects and FloatValue for each ExecutionOutputLink application.
See https://github.com/singnet/semantic-vision/blob/9ca40eedd78eb6aec7af469defd436eace2c4be5/experiments/opencog/pattern_matcher_vqa/tbd_cog/tbd_helpers.py#L281
b. This implementation doesn't allow backpropagating the error to the neural network weights, since information is lost in the conversion. A PyTorch Tensor object keeps both the current numeric values and a link to the computation graph, which allows the error to be backpropagated automatically.
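To make issue b. concrete, here is a tiny self-contained PyTorch check (not from the repo) showing the computation graph being lost in a float round-trip:

import torch

x = torch.ones(3, requires_grad=True)
y = x * 2.0
print(y.grad_fn)         # <MulBackward0 ...>: graph attached, backprop works

# Round-trip through plain Python floats, as the FloatValue conversion does:
y2 = torch.tensor(y.tolist())
print(y2.grad_fn)        # None: the graph is gone
print(y2.requires_grad)  # False: error can no longer reach x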
PtrValue
Both issues may be addressed by introducing a new value type: PtrValue. To store values for a particular language binding, one would then inherit from the base PtrValue type: for Python there would be a C++ class PythonValue(PtrValue), for Haskell a HaskellValue(PtrValue), etc.
Extracting the tensor object in "py:filter" would then look like:
atom.get_value(PredicateNode("PythonValue"))
and returning a value would be done by creating a new atom to hold it:
layer_result = ConceptNode(gen_random_uuid())
layer_result.set_value(PredicateNode("PythonValue"), PythonValue(tensor_mask))
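Put together, a hypothetical rewrite of the filter wrapper above using the proposed value type. PythonValue, and the tuple stored under the key, are assumptions of this proposal, not an existing API:

import uuid
from opencog.type_constructors import ConceptNode, PredicateNode

def filter(filter_type, filter_type_instance, data_atom):
    module_type = 'filter_' + filter_type.name + '[' + filter_type_instance.name + ']'
    module = tbd.function_modules[module_type]

    # The tensors come back exactly as stored: no flattening, and the
    # autograd graph attached to them survives, so backprop still works.
    feat_input, feat_attention = data_atom.get_value(PredicateNode("PythonValue"))
    out = module(feat_input.float(), feat_attention.float())

    layer_result = ConceptNode(str(uuid.uuid4()))
    layer_result.set_value(PredicateNode("PythonValue"),
                           PythonValue((feat_input, out)))
    return layer_result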
ExecutionValueLink
In addition to PtrValue we may introduce a new link type: ExecutionValueLink, which would return a PtrValue. This would allow "py:filter" to return PythonValue(tensor_mask) directly.
That addresses use case 1.
To address use case 2, a more convenient API than GroundedSchemaNode for calling object methods:
One way is to use the proposed PtrValue together with a wrapper function:
def callPythonMethod(atom_obj, atom_method_name, *args):
    obj = atom_obj.get_value(PredicateNode("py:callPythonMethod"))
    return getattr(obj, atom_method_name.name)(*args)
Calling a method would then be a bit verbose but quite straightforward:
ExecutionOutputLink(
    GroundedSchemaNode("py:callPythonMethod"),
    ListLink(ConceptNode("FilterRed"),
             ConceptNode("forward"),
             ExecutionOutputLink...))
GroundedObjectNode
Another way to address the same issue is to use the LibraryManager to store Python objects. A GroundedObjectNode atom type would register the Python object in the LibraryManager, like:

import torch
GroundedObjectNode("dot", torch.dot)
then calling with ExecutionOutputLink or any other executable link:

ExecutionOutputLink
    GroundedSchemaNode("obj: dot.__call__")
    ListLink
        VariableNode("$OtherWeights")
        VariableNode("$SomeWeights")
There was a detailed discussion on how to integrate deep learning with the atomspace that happened in the spring, with @Necr0x0Der, and @misgeatgit is implementing the prototype for this, with the space-time server. I gave a sketch of this in pull req #1971 -- I'll respond to the above shortly, when I get a chance to read it.
After experimenting with the first prototype we found that representing NN features as FloatValues is not enough. This improvement was suggested by @Necr0x0Der as the next step for integrating NNs with OpenCog.
Is this code already written, or is this a design proposal? I don't think you should do this with nested execution output links -- that's NOT going to work well at all ... for many different reasons
OK, first -- some background lectures: the atomspace is meant to be a declarative data store, and not a procedural programming language. Your example is treating it as a procedural programming language, and that will not work well; it will have terrible performance, it cannot be used with the pattern matcher, the pattern miner, the rule engine, PLN, etc.
An example of a declarative way of declaring what you wrote would be this:
(SequentialAndLink
  (PredicateNode "init scene")
  (EquivalentLink
    (PredicateNode "size")
    (PredicateNode "large"))
  (EquivalentLink
    (PredicateNode "color")
    (PredicateNode "purple"))
  (EquivalentLink
    (PredicateNode "material")
    (VariableNode "$result")))
This is one way to declare it. It's not a good way (there are better ways), but it is a way, and since it's simple, let me use it as an example for the next 24 or 48 hours.
Anyway, the above is a declaration. It doesn't actually do anything. To make it "actually do something", you could (for example) write this:
(ExecutionOutputLink
  (GroundedSchemaNode "py:process_scene")
  (SequentialAndLink .... (the stuff above)))
The py:process_scene code would compile the contents of the sequential-and, to create an executable network of filters. Then, every time the ExecutionOutputLink is called, you would execute that network once.
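A hypothetical sketch of py:process_scene along these lines; lookup_module and initial_attention are assumed helpers, the point is just compile-once-then-run:

def process_scene(seq_and):
    # Walk the SequentialAndLink once and compile each declarative clause
    # into the pytorch module that implements it.
    stages = [lookup_module(clause) for clause in seq_and.out]  # assumed helper

    def run(scene_features):
        # Execute the compiled network of filters once.
        attention = initial_attention(scene_features)  # assumed: all-zero mask
        for module in stages:
            attention = module(scene_features, attention)
        return attention

    return run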
There are several problems with my proposal, above. The next post will talk about how to solve them. Issues include:
1) There is nowhere for the compiled code to be cached. (You want to compile it only once and cache the result; you want to execute it many times.)
2) It probably does not make sense to return the result as an Atom (it should be a Value).
3) The SequentialAndLink format is ugly-ish.
Let's solve issues 1) and 2), and part of 3). The solution that seems natural, at the moment, is to invent a new link type, call it VisualLink. In the atomspace, it would be used like so (figure three from the Arxiv article):
(VisualLink
  (QueryNode "color")
  (SequentialAnd
    (AttendNode "large")
    (IntersectLink
      (SequentialAnd
        (RelateNode "left")
        (AttendNode "sphere")
        (AttendNode "metal")
        (AttendNode "large"))
      (SequentialAnd
        (RelateNode "right")
        (AttendNode "metal")
        (AttendNode "green")))))
The above is just the diagram from figure 3, written upside-down. (because Atomese is upside-down ...)
The VisualLink is a brand new C++ class. It might look something like this pseudocode: (next post)
The VisualLink is a brand new C++ class. Actually, you need TWO C++ classes. The other one is VisualValue. But first, VisualLink might look something like this pseudocode:
class VisualLink : public Link
{
private:
    PyExec* compiled_query;

public:
    VisualLink() {  // ctor
        // Convert the outgoing set of AttendNodes and RelateNodes to
        // a PyTorch program, and save that program.
        compiled_query = PyMangle("blargle", getOutgoingSet());
        ValuePtr torchy = createVisualValue(compiled_query);
        this->set_value(PredicateNode("visual answer"), torchy);
    }
};
That's it. Not much there, except a big blob of code to convert the list of AttendNodes and RelateNodes into a pytorch program. Next, we have this:
class VisualValue : public FloatValue
{
private:
    PyExec* compiled_query;

    virtual void update() const {  // overload update() from FloatValue
        // Execute the previously generated PyTorch program
        std::vector<double> result = PyExecute(compiled_query);
        this->_value = result;  // store the result right here.
    }
};
That's it. Nothing more to do here. It's almost trivial. How does the user use this? Like so:
(define myprocessor (VisualLink ...etc)) ; stuff from above
;; Now get the value; the Predicate must be the same as in the C++ code!
(define myview (cog-get-value myprocessor (PredicateNode "visual answer")))
;; Now run the processing pipeline exactly once
(display "hello world\n")
(display (cog-value->list myview))
(newline)
; Now run it again...
(display (cog-value->list myview))
Every time you call the scheme function cog-value->list, it will call the C++ FloatValue::value() method, which calls the C++ VisualValue::update() method (because update is virtual), and VisualValue::update() calls the previously compiled pytorch code to get an answer.
That's it. Although this is pseudocode, there is not a lot here. VisualValue is really pretty simple, almost exactly like the pseudocode; the hard part is the compiler in VisualLink.
Why implement it like this?
A) When a late-night talk-show host asks Sophia, "hey, is there a large green metal sphere on your left?", the ghost subsystem can convert this English-language sentence into the VisualLink given above, then run it, then get the answer.
B) Some suitable form of PLN can make inferences about the contents of the VisualLink, viz, "if round things are balls and shiny things are metal, therefore (VisualLink round metal)".
C) The pattern miner can look for frequent patterns: "there were 23 VisualLinks for round things and 48 VisualLinks for shiny things and 5555 VisualLinks for small shiny colored squares" -> Sophia tells the talk show host "Gee, I think I really like glitter!"
D) Almost no compute cycles happen in the atomspace, or on the CPU. After compiling the visual network, it gets loaded onto the GPUs, where it sits and does its thing in real time. The ONLY time you waste CPU cycles is when the user does the cog-value->list, which causes pytorch to ask the GPU for an answer. Otherwise, the system is idle.
The declarative, stateless form of the VisualLink allows all kinds of subsystems to gain access to the scene description. If you hide the state inside of GroundedObjectNodes, it becomes impossible to find it, reason over it, edit it, construct new visual queries on the fly, etc.
Is this code already written, or is this a design proposal? I don't think you should do this with nested execution output links -- that's NOT going to work well at all ... for many different reasons
Code with nested execution output links is already written; it is written that way mainly because we don't have a URE rule for working with tensor (pixelwise) truth values. Once we have a more seamless integration we will implement a declarative solution.
My thought on VisualLink vs PtrValue: PtrValue allows writing more declarative code, since it allows us to interleave inference in the URE with calls to neural networks; the URE determines the actual sequence of calls to the NNs. On the other hand, VisualLink requires one to write some imperative code to execute its content:
The py:process_scene code would compile the contents of the sequential-and, to create an executable network of filters.
Code with nested execution output links is already written,
What's the github repo?
VisualLink requires one to write some imperative code to execute its content
Yes, absolutely. That's the whole point! That's the intent! I feel that maybe there is still some misunderstanding. Code written in python, C++, etc. is necessarily imperative, because that is what these languages are. Code written "inside of atoms", implementing them, is necessarily imperative: it "does something".
URE
Nothing above makes use of the URE in any way. You haven't yet written a single rule, much less something that would require a URE.
Personally, I don't see anything wrong with nesting ExecutionLinks or any kind of links. A sequence is underneath a nesting of links too (see https://wiki.opencog.org/w/ConsLink).
Argh @ngeiswei this hurts. First, ConsLink is a terrible idea, for the same reason that SetLink is a terrible idea. Now that we know, from five-plus years of hard-core here's-dirt-in-your-face experience, about the evilness of SetLink, and the evilness of nesting in general ... this is not some rugby game where you stand up and say "please smash my face into the dirt again, I like how that feels". Nesting doesn't work because it's not editable, it's not pattern matchable, it's not rewritable. Cough cough -- "quote link" -- "Doctor, doctor, it hurts when I do this" -- "well, don't do that!"
ExecutionOutputLinks are just ugly hacks meant to provide emergency escape routes for incompletely-designed systems. I don't want to encourage bad design when there is a simple, easy, obvious alternative. We need to take the accumulated experience, and move forward.
OK, but sometimes you want things to be immutable.
The difference between

Member
  E1
  A
...
Member
  En
  A
Inheritance
  A
  B

and

Inheritance
  Set
    E1 ... En
  B

is that the latter is set in stone, the former isn't.
Which one is better depends on the usage; sure, for iteratively returning pattern matcher results a MemberLink is better than a SetLink, in other situations, not.
@linas The rules will be something like this (for conjunction over tensors):
variables = VariableNode("X1"), VariableNode("X2")
pattern = AndLink(*variables)
rewrite = ExecutionOutputLink(
    GroundedSchemaNode("py: fuzzy_tensor_conjunction_introduction_formula"),
    ListLink(AndLink(*variables), SetLink(*variables)))
BindLink(pattern, rewrite)
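For concreteness, a hypothetical body for the formula named in the rule, taking the elementwise minimum as the fuzzy conjunction over pixelwise truth tensors. PythonValue is the proposed PtrValue subtype from above; none of this exists yet:

import torch
from opencog.type_constructors import PredicateNode

def fuzzy_tensor_conjunction_introduction_formula(conjunction, premises):
    # Fetch the pixelwise truth tensor of each premise and combine them
    # with elementwise min, one common choice of fuzzy AND.
    key = PredicateNode("PythonValue")
    tensors = [atom.get_value(key) for atom in premises.out]
    result = tensors[0]
    for t in tensors[1:]:
        result = torch.min(result, t)
    conjunction.set_value(key, PythonValue(result))  # proposed PtrValue subtype
    return conjunction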
Then it will be possible to rewrite nested execution output links into a more declarative form:
(AndLink
  (ExecutionOutputLink
    (GroundedSchemaNode "py:filter")
    (ListLink
      (ConceptNode "color")
      (ConceptNode "purple")
      (VariableNode "$Scene")))
  (ExecutionOutputLink
    (GroundedSchemaNode "py:filter")
    (ListLink
      (ConceptNode "size")
      (ConceptNode "large")
      (VariableNode "$Scene"))))
By the way, how is the introduction of VisualValue better than PtrValue? It holds a pointer to a Python object just like PtrValue does.
@ngeiswei yes, SetLink does have some valid use cases. It got over-used in the current API, and that is my fault.
@noskill, as stated elsewhere: GroundedSchema and ExecutionOutput are ugly hacks meant to solve simple problems, and you are pushing them far beyond what they can carry.
ExecutionOutput/GroundedSchema are like Atomese "GOTO" statements: they have valid uses, they solve certain simple problems; one should not write complicated code with them. The PtrValue is like a void* pointer: it points at anything. You keep trying to tell me that you want to write complex programs using GOTOs and void* pointers, and I keep saying "don't do that". You keep saying that you can make it look structured, or declarative, or maybe even object-oriented, if you really try hard, and yes, I'm sure that's true.
I'm saying that structured programming, object-oriented programming was invented for a reason. There's a history for why people do it that way. This history is older than me, and I'm fine with it. It doesn't need to be rediscovered/reinvented.
Create an AttendNode C++ object, using ordinary OO programming style. If it needs to hold state, put that state into the C++ object. Just make sure that it's reconstructable state: nothing that would ever need to be transmitted on the network, nothing that would ever need to be saved to disk. Create 23 different OO method calls on class AttendNode. Many of these method calls might call py:filter under the covers, where the user cannot see the call to python. If you need to pass extra hidden values, put them into the class AttendNode. If you really really need to, use void* pointers in the C++ code, but I'm pretty sure that PyObject* would be better.
Create a RelateNode. If the RelateNode needs to call methods on the class AttendNode to get secret parameters and hidden values, well, then, do that. If VisualLink needs to talk to both RelateNode and AttendNode so that it can cache partial results that py:filter returned -- great! Do that! Whatever it is that py:filter returns, whatever additional stuff it needs as input, cache it inside of VisualLink and AttendNode and RelateNode as C/C++ pointers, maybe PyObject* pointers, whatever, I don't care. Just use good, clean, normal OO programming style. Don't get hacky.
I am afraid that saying more will confuse the issue, but I can't stop. I think that comments https://github.com/opencog/atomspace/issues/1970#issuecomment-450517875 and https://github.com/opencog/atomspace/issues/1970#issuecomment-450519933 really, really describe the correct implementation for this.
Above, I said "object oriented" several times. I mean it, and that is very important. I want good, clean, college-textbook-standard OO style. The thing I did not say was "structured query" and "relational data", which is also very very important. I want you to follow the rules of good clean text-book-style relational design. Atomese should look exactly like what they teach in school as good relational style.
Here is a very nice, simple example: there are five short tutorials here: https://grakn.ai/grakn-core Please please please read them and study them.

- Panel 1 is labeled ER "Entity-Relationship" and these are like our EvaluationLink/PredicateNode idea.
- Panel 2 is "Types", which is like our type system (the "deep types" part).
- Panel 3 is called "Rules" and it's almost identical to our BindLink.
- Panel 4 is called "Inference" and it is almost identical to running our GetLink.
- Panel 5 is called "Analytics" and we don't do that.
The grakn.ai example is a very nice example of good, clean, solid, well-designed relational programming style. It's just like SQL, but 50 years newer and better. That is the same kind of style that we want to have for Atomese.
Notice that grakn.ai does not have GroundedSchema and ExecutionOutputLinks. That is not an accident, a short-coming, or a missing feature. That is because they have a good, clean high-quality design that eschews nasty ugly callback hacks. (Notice that SQL doesn't have these, either. Notice that datalog doesn't have these, either.)
Atomese should look like, be written like, have the same general style as grakn.ai, or SQL, or datalog. It should follow all the rules in a typical textbook on relational data, including concepts like "normalization".
Atomese should NOT look like lambda calculus or Caml or Haskell or Lisp or scheme.
Atomese should NOT look like a random mashup of GOTO's and void* pointers.
Panel 4 is called "Inference" and it is almost identical to running our GetLink
I think panel 4 demonstrates true inference, thus would be closer to what the rule-engine does.
Atomese should look like, be written like, have the same general style as grakn.ai, or SQL, or datalog.
Maybe, or maybe a layer on top of Atomese could look like grakn.ai and such.
It's kinda hard for me to have a feel about all this because I lack the experience of building large and complex knowledge bases with Atomese; I've only worked on toy problems so far.
@linas Ok, I kind of understand your position. Our goals require more functionality than is provided by grakn.ai or datalog. I hope it's possible to make the atomspace extendable enough so that we can implement PtrValue and GroundedObjectNode in a separate repository.
@noskill, instead of

def callPythonMethod(atom_obj, atom_method_name, *args):
    obj = atom_obj.get_value(PredicateNode("py:callPythonMethod"))
    return getattr(obj, atom_method_name.name)(*args)

I think you mean

def callPythonMethod(atom_obj, atom_method_name, *args):
    obj = atom_obj.get_value(PredicateNode("PythonValue"))
    return getattr(obj, atom_method_name.name)(*args)

that is, "py:callPythonMethod" has been replaced by "PythonValue", right?
Hmm... I have read this thread moderately carefully, and unfortunately it is not yet quite clear to me how Linas wants to see Anatoly/Vitaly implement their desired functionality....
@ngeiswei It should be some meaningful string, maybe "PythonValue"
@bgoertzel I wrote the desired pseudocode in these two comments: https://github.com/opencog/atomspace/issues/1970#issuecomment-450517875 and https://github.com/opencog/atomspace/issues/1970#issuecomment-450519933 -- also please note that the pseudocode in those two comments is almost a cut-n-paste of the octomap API -- both of which are the formal result from the long email chain with Alexey Potapov from last spring, where these details got worked out.
@noskill You wrote: "Our goals require more functionality than provided by grakn.ai or datalog."
I am trying to use these two as examples of high-quality relational design. The atomspace is very nearly a textbook-standard relational database. If you don't understand what relations are, or how to write them, you will have a very difficult time abstracting your problem and arriving at a good solution.
@noskill, you wrote "I hope it's possible to make atomspace extendable enough so that we can implement PtrValue and GroundedObjectNode in separate repository."
I gave some very explicit and precise pseudocode that explains exactly how to write a good implementation that follows textbook relational-database style, and also textbook object-oriented style programming, in comments https://github.com/opencog/atomspace/issues/1970#issuecomment-450517875 and https://github.com/opencog/atomspace/issues/1970#issuecomment-450519933
I think that pseudocode does exactly everything that you need to do, and provides for all features and functions for encapsulating your project. It's an API that has already been proven to work; it's used both in the octomap implementation, and in example programs that were explicitly created for your neural-net project.
I don't understand why there is a lot of resistance and friction here. I'm pretty sure that you were told from the very beginning of the project, by many people, that you should NOT use GroundedSchemaNodes and GroundedPredicateNodes -- I don't understand why you decided to go this way, anyway, despite being told not to ... and why you are rejecting what seems to be a very simple, clear and obvious design.
Linas, I don't understand your metaphor
"GroundedSchema and ExecutionOutput are ugly hacks meant to solve simple problems, and you are pushing them far beyond what they can carry,
ExecutionOutput/GroundedScheme are like Atomese "GOTO" statements: "
There may be ugly hacks in the current implementation of these Atoms (I haven't looked lately) but as the person who first introduced these concepts into Atomese, I can tell you they were not intended to be anything like Atomese GOTO statements. They were really intended to be more functional-programming-ish than GOTO-ish. ExOutLink lets a GroundedSchemaNode applied to some argument produce some entity which then gets passed along to a different GroundedSchemaNode. This is not like a GOTO statement that passes a pointer to a certain position, it's just passing the output of one function as input to another function.... I don't consider this an ugly hack but rather a generally useful functionality...
Regarding "The atomspace is very nearly a textbook-standard relational database" ==> honestly I think this statement is only true at a high level of abstraction that is not really useful for this discussion...
Regarding "Atomese should look like, be written like, have the same general style as grakn.ai, or SQL, or datalog. It should follow all the rules in a typical textbook on relational data, including concepts like "normalization".
Atomese should NOT look like lambda calculus or Caml or Haskell or Lisp or scheme." ==> I understand what you're saying here but I just don't agree with you...
What you describe is NOT the vision that I had for the Atomspace when I came up with the idea, and not the vision that I had when thinking about how different cognitive algorithms would interact with Atomspace, etc. It is also not how I have been thinking about Atomspace when discussing it with Nil, Alexey and others...
This view of Atomspace doesn't accommodate how PLN currently works in the Atomspace, with explicit quantifiers and quantifier bindings, etc. And it also doesn't accommodate what I think makes most sense for integrating deep NNs with Atomspace (which does involve a more functional style, including ExOutLinks passing their outputs to other GSNs...)
So as far as I can understand, where we are now is: You (Linas) have one view of how Atomspace/Atomese should be used, both in general and in this particular case; and we (Alexey and his team and myself) have a different view. It's obvious you think your approach is much better, but I prefer to just look at it as two different approaches.
The question then is how do we manage this on a codebase level so that everything is maximally copacetic. While I am immensely thankful to and respectful of you for writing and designing the bulk of the current code in the Atomspace repo, nevertheless I don't agree with your current perspective about how this deep-NN/OpenCog related code should be implemented (nor with your general current philosophy about eschewing lambda-calculus-ish and functional-programming-ish idioms in Atomese in favor of SQL-ish/datalog-ish idioms...).... I don't want to screw with your development work or style, but I also don't want to have to accommodate everyone's work to your particular taste... (which I note changes over time like all of our tastes do...)
It would really be better NOT to have to fork Atomspace into Linas-space versus Singularity-space ...
Clearly MOST of the stuff that Vitaly/Anatoly/Alexey want to do here can be modularized away from the core Atomspace code, so you don't have to deal with it in your own work if you don't want to. However, it seems there are a few things they need to do that would need to go into the core Atomspace code. My own view would be: these are not huge code changes, and they are features that you don't need to use in your own work if you don't want to. So I don't see why they should be blocked.
I do understand your VisualLink suggestion. I just think that design is worse, it's awkward by comparison and will lead to a big proliferation of link types instead of a simpler abstract mechanism. I understand your taste is different than that of myself/Anatoly/Vitaly/Alexey in this matter but these are highly competent AI developers and theorists and should not have to do every little thing according to your particular taste IMO...
Apologies if the above remarks are a bit rambling and not 100% coherent, I don't have a lot of time for this but have read the various comments on this issue moderately carefully and am very eager to see this move forward. There is so much cool stuff happening in the "deep NN for vision and language world" these days and so much value to be created by interfacing this stuff appropriately with OpenCog, it's a shame to be held back from exploiting this potential via codebase-management issues. I understand these are not trivial issues, they are about how cognitive architecture intersects w/ programming paradigm etc. etc., but nevertheless given the task we have collectively taken on we gotta be able to sort through this stuff more efficiently than is currently occurring...
They say that working with engineers is like pushing rope, or herding cats. It feels like I'm doing a whole lot of rope-pushing and cat-herding. I think that this is once-again a very clear and simple problem that is being turned into something exquisitely complexticated, for reasons that I don't understand. Please, let us keep simple things simple, and save our energy for the complicated things that really need thought and effort!
"GroundedSchema and ExecutionOutput are ugly hacks meant to solve simple problems, and you are pushing them far beyond what they can carry,
ExecutionOutput/GroundedScheme are like Atomese "GOTO" statements: "
There may be ugly hacks in the current implementation of these Atoms (I haven't looked lately) but as the person who first introduced these concepts into Atomese, I can tell you they were not intended to be anything like Atomese GOTO statements.
De facto, as currently implemented, they are nothing more than escapes. They are completely lacking in any architecture to ever be anything other than escapes. I don't particularly see how they could ever be anything more than an escape.
Regarding "The atomspace is very nearly a textbook-standard relational database" ==> honestly I think this statement is only true at a high level of abstraction that is not really useful for this discussion...
I honestly think that this statement is a fundamental cause of misunderstanding about what the atomspace is, and how it works. When new-comers look at the atomspace for the first time, they are completely befuddled. We really need to make it clear to newcomers that the atomspace is just like a text-book relational database.
Again -- take a look at the difficulty we had with Corto, and compare it to what is happening at grakn.ai -- the grakn.ai people understand what the problem is, they created a clean, elegant solution, and they understand how to present the solution in such a way that others understand what it is, so that others start using it.
Now, I think that the atomspace is more sophisticated, more advanced than grakn.ai But as long as no one seems to understand what the atomspace is, or how it works, we will continue to have these pushing-rope issues.
What you describe is NOT the vision that I had for the Atomspace when I came up with the idea, and not the vision that I had when thinking about how different cognitive algorithms would interact with Atomspace, etc. It is also not how I have been thinking about Atomspace when discussing it with Nil, Alexey and others...
I don't know what to say. Perhaps you need to read a book on relational algebra. I cannot fix the way you think. I can tell you about reality, and about how things actually are. I would be a lot happier if you actually listened to what I said.
This view of Atomspace doesn't accommodate for how PLN currently works in the Atomspace, with explicit quantifiers and quantifier bindings, etc.
There is a huge amount of confusion in that statement. I'm guessing that you have no clue what a relational algebra is. Have you ever actually used SQL for anything? An outer join? An inner join?
And it also doesn't accommodate for how I think makes most sense to integrate deep NNs with Atomspace (which does involve a more functional style, including ExOutLinks passing their outputs to other GSNs...)
Holy cow. There was a long email chain on NNs with Alexey this spring and summer. You were a part of that conversation. We worked out all the details of exactly how this could be implemented inside the atomspace! I wrote up a design for it! I wrote example code and mailed it around! It's now part of the unit test suite! Misgana has more or less finished an actual working implementation, for the octomap server! The question is: why aren't noskill and vsbogd following that design? Did they not understand it? Did they not like it? Is there something flawed or incomplete about it?
The question then is how do we manage this on a codebase level so that everything is maximally copacetic.
Look I have a straight-forward task in front of me, and that is to create a system where all of the various sub-components can work with one-another in a coherent, uniform way, with a common programming interface that everyone understands and everyone can use. I am pushing back on the proposal here, because it is just weird and different and not very thought-out; it is going to integrate poorly with PLN and with ghost, and with octomap.
While I am immensely thankful to and respectful of you for writing and designing the bulk of the current code in the Atomspace repo,
Thank you!
nevertheless I don't agree with your current perspective about how this deep-NN/OpenCog related code should be implemented
Why is this coming out now, instead of half a year ago? Have you read the example demos, and found them lacking, incomplete, incorrect? Sure, it is possible that I made a mistake, overlooked something, failed to think it through in enough detail. But we discussed this in detail. Why did you agree, for the last half year, and change your mind today?
(nor with your general current philosophy about eschewing lambda-calculus-ish
Over the last two years, Nil has very carefully and in great detail exposed exactly what goes wrong when one takes a naive approach of merging lambda calculus with relational algebra. I call it "the QuoteLink issue". I think he calls it the "alpha conversion issue". We talked and argued about it a lot. The vast amount of code that Nil wrote to handle QuoteLink, and the complexity of it, and all of the unusual corner cases and exceptions have made it profoundly clear that the current merger of relational algebra and lambda calc is fundamentally flawed.
Now, you can either burn some brain cells and try to figure out how to fix this problem, or you can live in denial and claim that there is no problem.
In the 50+ years that relational algebra and lisp have been around, I am sure many people have tried to merge them. There is no shame in saying that we too have tried, and the results did not work out as nicely or wonderfully as we had hoped. But let's not pretend everything is rainbows and unicorns; it's not.
Clearly MOST of the stuff that Vitaly/Anatoly/Alexey want to do here can be modularized from the core Atomspace code so you don't have to deal with in your own work if you don't want to.
Look, the GroundedWhateverLink is a de facto escape (or goto) -- it is architected like an escape (or GOTO). Maybe you once upon a time had a different vision of it, but that is not what it actually is, today. And the PtrValue is like a void* pointer.
I personally think it's crazy to even try to write software with void* and gotos. Structured programming was invented in the 1960's to solve this problem. Relational algebra was invented in the 1970's to solve this problem. Object oriented programming was invented in the 1980's to solve this problem. Why the heck would we time-travel ourselves back to the 1950's and start programming with escapes and void*'s?
I do understand your VisualLink suggestion. I just think that design is worse, it's awkward by comparison and will lead to a big proliferation of link types instead of a simpler abstract mechanism. I understand your taste is different than that of myself/Anatoly/Vitaly/Alexey in this matter but these are highly competent AI developers and theorists and should not have to do every little thing according to your particular taste IMO...
Jesus, it is not a matter of "taste" -- for whatever reason, you are unable to see an escape when you see one. I see one big blaring escape. It's not taste, it's a factual statement.
Look at how the authors of the arxiv article explained their own work: they provided a "figure 3" for the benefit of the reader. Figure 3 seems to me like a perfectly adequate, entirely reasonable, maybe even a very good way of describing what their system does, and how it works.
We should implement that diagram. It seems eminently sensible to do it the way they describe it.
why you are rejecting what seems to be a very simple, clear and obvious design.
@linas Honestly, you just replaced PyObject* with PyExec* and ExecutionOutputLink with VisualLink.
The only argument for why it is better that I noticed is "ExecutionOutputLinks are just ugly hacks", plus the pointers-to-void and goto comparisons.
I don't buy the pointers-to-void argument: C++ compiles to binary code that has jmp instructions in it, and a lot of languages compile to C with pointers to void, e.g. Cython, and it doesn't cause many problems.
Besides, I don't understand the "atomspace is a relational database" argument. Even if so, the interpretation of the Atomese language is determined by the interpreter. Now we have the pattern matcher acting as an interpreter, and also the URE, which allows providing different interpretations. It's nice: we have a knowledge base and different algorithms that use that common knowledge base. We are trying to add a new class of algorithms that one can use with the atomspace. If you want to limit interactions with external programs like in SQL or grakn.ai, then it will make the atomspace no more useful for an AI project than grakn.ai.
Besides, PL/SQL allows calling external code, so there is a place for it even in relational databases.
What's the github repo?
The link is provided at the beginning of the issue; I copy it here:
https://github.com/singnet/semantic-vision/blob/9ca40eedd78eb6aec7af469defd436eace2c4be5/experiments/opencog/pattern_matcher_vqa/tbd_cog/tbd_helpers.py#L140-L166
VisualLink requires one to write some imperative code to execute its content
Yes, absolutely. That's the whole point! That's the intent! I feel that maybe there is still some misunderstanding.
Our intent is different: it is to replace the imperative Python code from the original model with declarative Atomese code. The current implementation is just a little bit more declarative, but I hope we will move further.
Alexey summarized the plan here: https://blog.singularitynet.io/towards-declarative-visual-reasoning-or-not-a8f84ca09b39
This PtrValue thing might not be the final design; we need to experiment with this and other ideas without the need to maintain a fork.