
Coordinated auxes

Open nschneid opened this issue 2 years ago • 16 comments

In investigating UniversalDependencies/UD_English-EWT#298 I found one sentence lacking proper nested structure:

  • rules that establish what can and cannot be done

nschneid avatar Jan 22 '22 19:01 nschneid

Yes, I agree this is not right and should have conj, thanks - but what would you suggest doing with "not" in this case?

amir-zeldes avatar Jan 24 '22 21:01 amir-zeldes

advmod(can, not)

nschneid avatar Jan 24 '22 22:01 nschneid

Yes, I had the same thought BUT:

  • Attaching "not" to "can" treats "can" as the promoted equivalent of the verb (otherwise an AUX would not be allowed to dominate anything)
  • If it is the promoted equivalent of the verb, then it should be dominated by conj from the verb itself
  • But that would lead to a right-to-left conj!!!
  • So we should attach conj from AUX to AUX
  • But then we have to attach "not" to the lexical verb "done", but that makes it look like all versions of "done" are negated, yet there is a positive instance here!
  • So we should attach "not" to the second "can", as if it's promoted. BUT...

And at this point my brain went into a maximum recursion depth error :|

Any thoughts?

amir-zeldes avatar Jan 24 '22 22:01 amir-zeldes

AFAICT the rules about promotion are not really designed to deal with function words having dependents (even coordination). I think the most natural solution is to coordinate the AUXes, and then treat the 2nd AUX as head of the negation. The other option (which some of the EWT sentences had before I changed them), treating the main verb as elided in the first instance, was really awkward.

nschneid avatar Jan 24 '22 22:01 nschneid

From a commonsense perspective I agree completely that that's the most sensible solution. However I would like to see some UD guideline formalize this, for example:

Coordinated auxiliaries should be attached from the first to the subsequent ones via conj. If more than one function word participates in a non-initial auxiliary (for example a negation of just the second auxiliary, or a cluster of coordinated auxiliaries), then the syntactically highest auxiliary in the non-initial cluster is taken to represent that entire cluster, is governed by conj, and governs the remaining function words in its cluster using their usual deprels.
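To make the proposed wording concrete, here is a minimal Python sketch of a partial check for it. It only looks at AUX tokens that are already attached via conj and flags those whose head is not an earlier AUX. The `Token` class, the function name `check_coordinated_aux`, and the tokenization of "cannot" as "can" + "not" are assumptions made for illustration; this is not an existing UD validator rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int      # 1-based position in the sentence
    form: str
    upos: str
    head: int     # 0 means root
    deprel: str


def check_coordinated_aux(tokens):
    """Return messages for conj-attached AUX tokens that break the proposed rule."""
    by_idx = {t.idx: t for t in tokens}
    problems = []
    for t in tokens:
        if t.upos != "AUX" or t.deprel != "conj":
            continue
        head = by_idx.get(t.head)
        if head is None or head.upos != "AUX":
            problems.append(f"{t.form}-{t.idx}: conj head is not an AUX")
        if head is not None and head.idx > t.idx:
            problems.append(f"{t.form}-{t.idx}: conj runs right-to-left")
    return problems


# "what can and can not be done", annotated as the proposal would have it
# (hypothetical annotation, written out by hand for this sketch).
PROPOSED = [
    Token(1, "what", "PRON", 7, "nsubj:pass"),
    Token(2, "can", "AUX", 7, "aux"),
    Token(3, "and", "CCONJ", 4, "cc"),
    Token(4, "can", "AUX", 2, "conj"),    # second AUX represents its cluster
    Token(5, "not", "PART", 4, "advmod"),  # negation stays inside that cluster
    Token(6, "be", "AUX", 7, "aux:pass"),
    Token(7, "done", "VERB", 0, "root"),
]

# Same sentence, but the second "can" hangs off the later lexical verb via conj.
RIGHT_TO_LEFT = [
    t if t.idx != 4 else Token(4, "can", "AUX", 7, "conj") for t in PROPOSED
]

print(check_coordinated_aux(PROPOSED))
print(check_coordinated_aux(RIGHT_TO_LEFT))
```

The check is deliberately narrow: a tree that attaches the second auxiliary as a plain aux of the verb would pass silently, so a fuller version would also need the surface pattern.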

Note that this explanation may not make sense for languages with postposed coordination AND postposed auxiliaries, in which the second AUX can very naturally behave this way just by assuming normal promotion.

Pinging @dan-zeman for your opinion on a UD policy for such cases.

amir-zeldes avatar Jan 25 '22 15:01 amir-zeldes

I think that the more complex cases should be solved as coordinate clauses with ellipsis, i.e., an auxiliary is promoted. And the more I am thinking about it, the more I am wondering whether we should do the same even with the simple coordination of auxiliaries, such as "she could and should come early".

dan-zeman avatar Jan 25 '22 20:01 dan-zeman

should be solved as coordinate clauses with ellipsis

Do you mean with enhanced dependencies and an empty node? But what would you do with the basic graph? Do you mean orphan(can2, not)?

amir-zeldes avatar Jan 25 '22 21:01 amir-zeldes

No, we do not use orphan when we promote an auxiliary to the head of clause. We also do not use empty nodes in such cases.

what can and cannot be done

nsubj(can-2, what) conj(can-2, done) cc(done, and) aux(done, can-4) advmod(done, not) aux(done, be)

dan-zeman avatar Jan 25 '22 21:01 dan-zeman

nsubj(can-2, what)

OK, it looks like you're analyzing this like a question, right? Isn't this a free relative? If it's a question I would expect nsubj:pass (because we are asking what can be done), but if it's a free relative I guess it would be acl:relcl(what, can-2)?

In the free relative reading I think you are suggesting this:

[image: dependency tree, not preserved]

Is that right? Thoughts on this @nschneid ? I understand the motivation to do this, but I find it dissatisfying that for the first "can" there is no trace of what the predicate actually is, not even in edeps, since no empty node is introduced. On the other hand, if we attach "not" to the second "can", we need to do something similar in saying "the predicate is the same as the conj parent", but that seems a little more intuitive to me (as @nschneid suggested). I would like to throw the orphan option into the mix too though, since we could say it's like this:

1	rules	rule	NOUN	_	_	2	nsubj	2:nsubj	_
2	establish	establish	VERB	_	_	0	root	_	_
3	what	what	PRON	_	_	2	obj	2:obj|4.2:nsubj:pass|9:nsubj:pass	_
4	can	can	AUX	_	_	9	aux	4.2:aux	_
4.1	be	be	AUX	_	_	_	_	4.2:aux:pass	_
4.2	done	do	VERB	_	_	_	_	3:acl:relcl	_
5	and	and	CCONJ	_	_	6	cc	9:cc	_
6	can	can	AUX	_	_	4	conj	9:aux	_
7	not	not	PART	_	_	6	orphan	9:advmod	_
8	be	be	AUX	_	_	9	aux:pass	9:aux:pass	_
9	done	do	VERB	_	_	3	acl:relcl	4.2:conj:and	_

In this analysis, the edeps express the entire expanded argument structure with auxiliaries and negation, but in basic dependencies we would say "not" can't modify an auxiliary, so it's really an orphan caused by the ellipsis of the second "done", which is reconstructed in an empty node.
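For readers who want to inspect the two layers separately, the following Python sketch reads the CoNLL-U rows above and pulls out the empty nodes, the basic `orphan` edges, and the enhanced edges from the DEPS column. The helper names `parse` and `enhanced_edges` are made up for this sketch, and the rows are split on whitespace for simplicity (real CoNLL-U is tab-separated; that shortcut only works here because no field contains a space).

```python
CONLLU = """
1 rules rule NOUN _ _ 2 nsubj 2:nsubj _
2 establish establish VERB _ _ 0 root _ _
3 what what PRON _ _ 2 obj 2:obj|4.2:nsubj:pass|9:nsubj:pass _
4 can can AUX _ _ 9 aux 4.2:aux _
4.1 be be AUX _ _ _ _ 4.2:aux:pass _
4.2 done do VERB _ _ _ _ 3:acl:relcl _
5 and and CCONJ _ _ 6 cc 9:cc _
6 can can AUX _ _ 4 conj 9:aux _
7 not not PART _ _ 6 orphan 9:advmod _
8 be be AUX _ _ 9 aux:pass 9:aux:pass _
9 done do VERB _ _ 3 acl:relcl 4.2:conj:and _
"""


def parse(conllu):
    """Turn CoNLL-U rows into dicts holding the columns used below."""
    rows = []
    for line in conllu.strip().splitlines():
        cols = line.split()  # whitespace split: a simplification, see lead-in
        rows.append({
            "id": cols[0], "form": cols[1], "upos": cols[3],
            "head": cols[6], "deprel": cols[7], "deps": cols[8],
        })
    return rows


def enhanced_edges(rows):
    """Return (head_id, relation, dependent_id) triples from the DEPS column."""
    edges = []
    for r in rows:
        if r["deps"] == "_":
            continue
        for item in r["deps"].split("|"):
            head, rel = item.split(":", 1)  # relation itself may contain ':'
            edges.append((head, rel, r["id"]))
    return edges


rows = parse(CONLLU)
empty_nodes = [r["id"] for r in rows if "." in r["id"]]
basic_orphans = [(r["head"], r["id"]) for r in rows if r["deprel"] == "orphan"]
edeps = enhanced_edges(rows)

print("empty nodes:", empty_nodes)
print("basic orphan edges (head, dependent):", basic_orphans)
print("enhanced heads of 'not':", [(h, rel) for h, rel, d in edeps if d == "7"])
```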

amir-zeldes avatar Jan 25 '22 21:01 amir-zeldes

you're analyzing this like a question, right?

Yes. I deliberately cited only part of the example to get rid of the free relative reading :-) And yes, it should have been nsubj:pass rather than just nsubj.

I am not a priori against using empty nodes for content predicates when auxiliaries are promoted, especially if we end up adding other possibilities for empty nodes. I am just saying it is not what UD does now. (And if that enhancement is introduced, then I think it should include all places where we promote auxiliaries now.)

dan-zeman avatar Jan 25 '22 21:01 dan-zeman

We are having the free relative vs. interrogative content clause discussion elsewhere. For this thread let's sidestep that and use a simpler sentence:

  • I can not-3 and should not-6 eat this whole pizza.

Option A is to treat coordinated material (the non-initial conjuncts) as self-contained so that the analysis is just

  • I can not eat this whole pizza.

with a few edges/words added: conj(can, should) and advmod(should, not-6). This seems most natural to me. It basically promotes the second aux to head with respect to the second not. Since auxiliaries are also verbs it feels like not too much of a stretch.

Option B is to say that, really, not-6 is modifying the main verb rather than should, so instead of advmod it should be orphan(should, not-6), but otherwise the same as Option A.

Option C (which I think is what @dan-zeman meant) is to promote the first aux as head of the whole clause, so that both nots are proper advmods. But this has the effect that removing "and should not" changes the whole structure, because auxes are not normally heads.

Option D is to assume the verb is elided in the first instance: I can not <eat-3.1> and should not eat-7 this whole pizza. That is fine insofar as the Enhanced layer goes, but it results in Option C for the Basic dependencies, which is clunky.

Are those the 4 options or am I missing something?
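As a rough illustration of how close Options A and B are, the sketch below writes both out as sets of (relation, head, dependent) triples and compares them. The analysis of the plain sentence "I can not eat this whole pizza" and the `cc(should, and)` edge are assumptions (a conventional UD analysis, not spelled out in the thread); token indices follow the full sentence: I-1 can-2 not-3 and-4 should-5 not-6 eat-7 this-8 whole-9 pizza-10. Options C and D are left out because the thread does not pin down their exact edges.

```python
# (relation, head index, dependent index); head 0 is the root.
BASE = {
    ("root", 0, 7),
    ("nsubj", 7, 1),
    ("aux", 7, 2),
    ("advmod", 7, 3),
    ("obj", 7, 10),
    ("det", 10, 8),
    ("amod", 10, 9),
}

# Edges added for the coordinated material; cc(should, and) is assumed.
OPTION_A = BASE | {("conj", 2, 5), ("cc", 5, 4), ("advmod", 5, 6)}
OPTION_B = BASE | {("conj", 2, 5), ("cc", 5, 4), ("orphan", 5, 6)}


def diff(a, b):
    """Edges only in a, and edges only in b."""
    return sorted(a - b), sorted(b - a)


def single_headed(edges, n_tokens):
    """True if every token 1..n_tokens has exactly one incoming edge."""
    dependents = sorted(dep for _, _, dep in edges)
    return dependents == list(range(1, n_tokens + 1))


def without_tokens(edges, removed):
    """Drop every edge that touches one of the removed token indices."""
    return {e for e in edges if e[1] not in removed and e[2] not in removed}


print(diff(OPTION_A, OPTION_B))
print(single_headed(OPTION_A, 10), single_headed(OPTION_B, 10))
# Removing "and should not" (tokens 4-6) leaves the plain-sentence analysis,
# which is the "self-contained" property claimed for Option A.
print(without_tokens(OPTION_A, {4, 5, 6}) == BASE)
```

Indices are not renumbered after removal, so the last comparison is against the same `BASE` set rather than a re-indexed seven-token sentence.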

nschneid avatar Jan 25 '22 22:01 nschneid

I think that's right, thank you both! What you say was exactly my problem: I agree @nschneid 's suggestion is sensible, but I also felt it is not what UD currently officially does, as @dan-zeman pointed out.

This is complicated enough that I don't see a clear winner - 'the blanket is too short to cover everything', as we say where I'm from... So we need to choose which part to cover and which one to mistreat. Shall we tackle this in some upcoming meeting?

amir-zeldes avatar Jan 25 '22 22:01 amir-zeldes

I would love to see examples from other languages, too. But they may be difficult to search for because we don't know how people currently analyze them (and they are probably quite rare).
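One way around the unknown annotation is to search on the surface only, ignoring heads and deprels. The sketch below is a hypothetical helper (`find_aux_coordination` is not part of any UD tool) that looks for an AUX, an optional short run of PART/ADV tokens, a CCONJ, another optional short run, and a second AUX. It assumes the treebank tags the relevant words as AUX and that any intervening negation is tagged PART or ADV, which will not hold for every language.

```python
FILLER = {"PART", "ADV"}  # assumed tags for negation and similar material


def find_aux_coordination(tokens, max_gap=1):
    """tokens: list of (form, upos). Return (i, k) index pairs of coordinated AUXes."""
    hits = []
    n = len(tokens)
    for i, (_, upos) in enumerate(tokens):
        if upos != "AUX":
            continue
        # Skip up to max_gap filler tokens after the first AUX.
        j, gap = i + 1, 0
        while j < n and tokens[j][1] in FILLER and gap < max_gap:
            j, gap = j + 1, gap + 1
        if j >= n or tokens[j][1] != "CCONJ":
            continue
        # Skip up to max_gap filler tokens after the conjunction.
        k, gap = j + 1, 0
        while k < n and tokens[k][1] in FILLER and gap < max_gap:
            k, gap = k + 1, gap + 1
        if k < n and tokens[k][1] == "AUX":
            hits.append((i, k))
    return hits


S1 = [("what", "PRON"), ("can", "AUX"), ("and", "CCONJ"), ("can", "AUX"),
      ("not", "PART"), ("be", "AUX"), ("done", "VERB")]
S2 = [("I", "PRON"), ("can", "AUX"), ("not", "PART"), ("and", "CCONJ"),
      ("should", "AUX"), ("not", "PART"), ("eat", "VERB")]

print(find_aux_coordination(S1))
print(find_aux_coordination(S2))
```

With `max_gap=1` the pattern misses conjuncts separated by longer adverbial material, so a larger gap trades recall against false positives.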

dan-zeman avatar Jan 25 '22 22:01 dan-zeman

Sure, same here. I could produce translations of this into a couple of languages where things work sufficiently similarly... But either way I would ideally like to have a live discussion; GH issues are a bit too cumbersome for this level of deliberation IMO.

amir-zeldes avatar Jan 25 '22 22:01 amir-zeldes

Yeah we should have a live discussion. One more point is that other kinds of function words can be coordinated: "Are you flying right into or right out of Chicago?" "Is this or that book the correct one?" So maybe we need a general policy.

nschneid avatar Jan 25 '22 23:01 nschneid

Kulturelle Ansichten müssen und sollen diskutiert werden. "Cultural views must and should be discussed."

de as evoluções de «downsizing» e «rightsizing» a que muitos sistemas de informação estiveram ou virão a estar sujeitos. "from the evolutions of «downsizing» and «rightsizing» to which many information systems have been or will be subject."

maar het is en blijft een race tegen het horloge. "but it is and remains a race against the watch."

K něčemu takovému bychom nuceni byli a nebyli. "We would be and would not be forced to do such a thing."

členy Evropské unie nejsme a patrně ani dlouho ještě nebudeme. "we are not members of the European Union and probably will not be for a long time."

a byla by to bývala (nebo snad i bude) návštěva prvního významného státníka v oblasti "and it would have been (or perhaps will be) the visit of the first important statesman in the area"

dan-zeman avatar Jan 26 '22 10:01 dan-zeman