tree-sitter-lua

bug: Highlighting fails after ERROR

Open gregnis opened this issue 1 year ago • 1 comments

Did you check existing issues?

  • [x] I have read all the tree-sitter docs if it relates to using the parser
  • [x] I have searched the existing issues

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.22.6 (b40f342067a89cd6331bf4c27407588320f3c263)

Describe the bug

Here's the file I parsed:

function expand_dgttsp(token, result_sentence)

	if trigger ~= "WEB" then
		function nocase (s)	
		  s = string.gsub(s, "%a", function (c)
				return string.format("[%s%s]", string.lower(c),
											   string.upper(c))
			  end)
		  return s
		end

		token, count = string.gsub( token, "\/", " !BREAK! " )	

		
		--token, count = string.gsub( token, "Xi%s([A-Z])", "/+'si/+ %1" )

		token = " "..token.." "

		token = lang_spec_abbr(token, result_sentence)

		for word, sign in string.gfind(token, "(%a+)([%p ])") do
			word2 = nil
			if sign ~= " " then
				word2 = word..sign
			end

			
			if abbr_cs[word] ~= nil then
				token, count = string.gsub( token, " "..word.." ", " "..abbr_cs[word].." " )
				if word2 ~= nil then
					token, count = string.gsub( token, " "..word2.." ", " "..abbr_cs[word].." " )
				end
			end

			expr1 = nocase(word)
			if word2 ~= nil then
				expr2 = expr1.."[\\"..sign.."]"
			end
			word = string.lower(word)

			
			if abbr_begin[word] ~= nil then
				token, count = string.gsub( token, "^ "..expr1.." ", abbr_begin[word].." " )
				if word2 ~= nil then
					token, count = string.gsub( token, "^ "..expr2.." ", abbr_begin[word].." " )
				end
			end

			
			if abbr_end[word] ~= nil then
				token, count = string.gsub( token, " "..expr1.." $", " "..abbr_end[word] )
				if word2 ~= nil then
					token, count = string.gsub( token, " "..expr2.." $", " "..abbr_end[word] )
				end
			end

			
			if abbr_misc[word] ~= nil then
				token, count = string.gsub( token, " "..expr1.." ", " "..abbr_misc[word].." " )
				if word2 ~= nil then
					token, count = string.gsub( token, " "..expr2.." ", " "..abbr_misc[word].." " )
				end
			end
		end
	end
	return token
end

function get_maneuver_group()

	guidance_mode = "publicTransport"

	result_id = {"p","t","0","0","0","0","0","0"}	

	group_types = {
		["drive"] = "a",
		["pedestrian"] = "b",
		["pedestrian_walk_to"] = "c",
		["pedestrian_get_off_and_arrive"] = "d",
		["pedestrian_get_off_and_walk_to"] = "e",
		["public_transport_take_and_get_out"] = "f",
		["public_transport_take_and_change"] = "g",
	}

	group_instruction_type = {
		["summary"] = "a",
		["arrival"] = "b",
		["wait"] = "c",
	}

	set_result_id( 5, group_types[maneuver_group.maneuver_group_type] )
	set_result_id( 6, group_instruction_type[maneuver_group.maneuver_group_instruction_type] )

	if ( maneuver_group.maneuver_group_type == "public_transport_take_and_get_out" or maneuver_group.maneuver_group_type == "public_transport_take_and_change" )
	  and ( maneuver_group.line_destination == nil or maneuver_group.line_destination == "" ) and ( maneuver_group.maneuver_group_instruction_type == "summary" ) then
		set_result_id( 6, "d" )
	end

	if ( maneuver_group.maneuver_group_type == "drive" or maneuver_group.maneuver_group_type == "pedestrian" or maneuver_group.maneuver_group_type == "pedestrian_get_off_and_arrive" )
	  and ( maneuver_group.street == nil or maneuver_group.street == "" ) and ( maneuver_group.maneuver_group_instruction_type == "arrival" ) then
		set_result_id( 6, "d" )
	end

	if maneuver_group.street ~= nil then
		tts_street_1 = maneuver_group.street
	else
		tts_street_1 = ""
	end

	if maneuver_group.station_name ~= nil then
		station_name = maneuver_group.station_name
	else
		station_name = ""
	end

	if maneuver_group.transit_type ~= nil then
		if transit_type_list[maneuver_group.transit_type][1] ~= nil and transit_types ~= nil then
			transit_type = transit_types[transit_type_list[maneuver_group.transit_type][1]]
		else
			transit_type = ""
		end
	else
		transit_type = ""
	end

	if maneuver_group.line_name ~= nil then
		line_name = maneuver_group.line_name
	else
		line_name = ""
	end

	if maneuver_group.line_destination ~= nil then
		line_destination = maneuver_group.line_destination
	else
		line_destination = ""
	end

	if maneuver_group.next_station_name ~= nil then
		next_station_name = maneuver_group.next_station_name
	else
		next_station_name = ""
	end

	if maneuver_group.company_short_name ~= nil then
		station_company_name = maneuver_group.company_short_name
	else
		station_company_name = ""
	end

	if maneuver_group.time_to_wait ~= nil then
		time_to_wait = tostring(maneuver_group.time_to_wait)
	else
		time_to_wait = "0"
	end

	filter_double_street_on_street_signpost_combination( )
	command_id_1 = result_id[1]..result_id[2]..result_id[3]..result_id[4]..result_id[5]..result_id[6]..result_id[7]..result_id[8]
	sentence_1 = set_result( command_id_1 )

end

Highlighting stops in the middle of line 12

token, count = string.gsub( token, "\/", " !BREAK! " )

and comes back at line 120. Commenting out line 12 restores highlighting.

Steps To Reproduce/Bad Parse Tree

Run the playground with the file above and use highlights.scm from the repo.

Expected Behavior/Parse Tree

See above.

Repro

No response

gregnis, Oct 19 '24 18:10

I attached the failing file here (change .lua.txt to .lua): common2.lua.txt. And here's the tree: tree.txt

gregnis, Oct 19 '24 19:10

token, count = string.gsub( token, "\/", " !BREAK! " )

LuaLS complains that the escape sequence \/ is invalid (gsub takes Lua patterns, not regular expressions!). Are you sure this is valid syntax?
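
For illustration, a minimal sketch of how the line could be written without the backslash. It assumes the intent was simply to replace every literal slash; the sample value of `token` is made up for the example. In Lua patterns the escape character is `%` rather than `\`, and `/` is not a magic character, so it needs no escaping at all. The last part feeds the original spelling to the interpreter's own compiler: Lua 5.2 and later are expected to reject the unknown escape, while some older interpreters may accept it, so no particular output is claimed there.

```lua
-- Hypothetical sample input; the real token comes from the caller.
local token = "a/b/c"

-- "/" has no special meaning in Lua strings or Lua patterns,
-- so a plain "/" is enough to match it.
local result, count = string.gsub(token, "/", " !BREAK! ")
print(result, count)

-- Magic pattern characters such as "." are escaped with "%", not "\".
local dotted, n = string.gsub("a.b.c", "%.", "/")
print(dotted, n)

-- Ask the running interpreter whether it accepts the original "\/" escape.
-- loadstring exists in Lua 5.1; load accepts strings in 5.2 and later.
local loader = loadstring or load
local chunk, err = loader([[return "\/"]])
print(chunk ~= nil, err)
```

Here `result` is "a !BREAK! b !BREAK! c" with a count of 2, and `dotted` is "a/b/c".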

clason, May 17 '25 09:05

I don't know, not a Lua programmer. I can see though that VSCode highlights the file in question without any issues.

gregnis, May 17 '25 18:05

VS Code uses a different, less correct, syntax highlighting strategy.

clason, May 17 '25 18:05

So you are OK with the parser not recovering until over 100 lines after a "bad" line?

gregnis, May 17 '25 19:05

If you are OK with incorrect syntax, yes.

clason, May 17 '25 19:05

My point is that one of the main features of tree-sitter is its ability to recover from errors, so that even bad syntax does not break the parser. It's hard to argue that a single bad line should be able to break syntax highlighting for such a large part of a file.

gregnis, May 17 '25 20:05

Yes, so please bring this up with tree-sitter. Or open a PR if you think you can do it better.

clason, May 17 '25 20:05