rdf4j icon indicating copy to clipboard operation
rdf4j copied to clipboard

Unbound variable logic in BindingSets is very confusing.

Open JervenBolleman opened this issue 2 years ago • 3 comments

Current Behavior

The handling of unbound in the bindingsets is rather difficult. And causes differences between iterator and getValue and getBinding access. With lots of space for subtle bugs.

@Test
	public void testUnbound() {
		QueryBindingSet bs1 = new QueryBindingSet();
		bs1.addBinding(String.valueOf(1), null);
		assertEquals(1, bs1.size());
		assertNull(bs1.getBinding("1"));
		assertTrue(bs1.hasBinding("1"));
		//This part fails
		Iterator<Binding> iterator = bs1.iterator();
		assertTrue(iterator.hasNext());
		Binding next = iterator.next();
		assertNull (next);
	}

I think it would be better to use a special marker value for Unbound and not use null. Current solution for #2741 requires a 3rd array to keep track of band settings.

Expected Behavior

I propose we introduce a singleton variable/type to mark unbound values. To allow for consistent and easier to track logic.

The following should then pass.

@Test
	public void testUnbound() {
		QueryBindingSet bs1 = new QueryBindingSet();
		bs1.addBinding(String.valueOf(1), UNBOUND);
		assertEquals(1, bs1.size());
		assertTrue(bs1.getBinding("1") instanceof Unbound);
		assertTrue(bs1.getBinding("1") == UNBOUND);
		assertTrue(bs1.hasBinding("1"));
		//This part passes
		Iterator<Binding> iterator = bs1.iterator();
		assertTrue(iterator.hasNext());
		Binding next = iterator.next();
		assertTrue(next.getValue() instanceof Unbound);
		assertTrue(next.getValue() == UNBOUND);
	}

Steps To Reproduce

No response

Version

4.0.0-develop

Are you interested in contributing a solution yourself?

Yes

Anything else?

No response

JervenBolleman avatar Dec 19 '21 08:12 JervenBolleman

"Unbound" literally means there is no binding value for a particular variable, therefore doing something like bs1.addBinding(String.valueOf(1), null) shouldn't even be possible IMO. This is also reflected in the javadoc of Binding itself, e.g. on the getValue method:

	/**
	 * Gets the value of the binding. The returned value is never equal to <var>null</var>, such a "binding" is
	 * considered to be unbound.
	 *
	 * @return The value of the binding, never <var>null</var>.
	 */
	Value getValue();

I agree there is a discrepancy here but if anything I'd argue that, rather than fixing the behavior of the iterator, we should be fixing the size method instead: it shouldn't count variables where the value is null, as those are not bindings. And/or possibly we should strength the input validation on BindingSet#addBinding to protest on getting a null value.

Fwiw I vaguely remember that we might be (ab)using the fact that BindingSet doesn't enforce a requireNonNull on the value argument in some places in the SPARQL engine. If that is indeed the case, let's go over those situations and see if we can solve them in a better way.

abrokenjester avatar Dec 28 '21 00:12 abrokenjester

Yes, the sparql engine does push in null values in a binding in a number of places. Which is very confusing and fragile. So I would like to fix it in the engine.

On Tue, Dec 28, 2021, 01:17 Jeen Broekstra @.***> wrote:

"Unbound" literally means there is no binding value for a particular variable, therefore doing something like bs1.addBinding(String.valueOf(1), null) shouldn't even be possible IMO. This is also reflected in the javadoc of Binding itself, e.g. on the getValue method:

/** * Gets the value of the binding. The returned value is never equal to null, such a "binding" is * considered to be unbound. * * @return The value of the binding, never null. */ Value getValue();

I agree there is a discrepancy here but if anything I'd argue that, rather than fixing the behavior of the iterator, we should be fixing the size method instead: it shouldn't count variables where the value is null, as those are not bindings. And/or possibly we should strength the input validation on BindingSet#addBinding to protest on getting a null value.

Fwiw I vaguely remember that we might be (ab)using the fact that BindingSet doesn't enforce a requireNonNull on the value argument in some places in the SPARQL engine. If that is indeed the case, let's go over those situations and see if we can solve them in a better way.

— Reply to this email directly, view it on GitHub https://github.com/eclipse/rdf4j/issues/3530#issuecomment-1001808129, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHQYFMW5LTRGAIVHAWX2VLUTD62PANCNFSM5KLUHOCA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>

JervenBolleman avatar Dec 28 '21 07:12 JervenBolleman

Yes, the sparql engine does push in null values in a binding in a number of places. Which is very confusing and fragile. So I would like to fix it in the engine.

I think the reason we did had to do with it sometimes being necessary to explicitly register a specific variable as unbound in a particular branch of the evaluation. I'll try and refresh my memory on this by digging through some code. Let's try and hash out a better approach. The unbound marker you suggested might not be a bad solution but I also want to look other options.

abrokenjester avatar Dec 28 '21 21:12 abrokenjester