markdown.sile

(Markdown) link attributes on links, indirect images etc.

Open Omikhleia opened this issue 2 years ago • 6 comments

Lunamark only implements a subset of the link_attributes extension (and friends).

Leveraging Lunamark for full support would allow more than is currently supported (e.g. links to images).

Omikhleia, Nov 09 '22 10:11

See https://github.com/jgm/lunamark/issues/43

Omikhleia, Nov 11 '22 10:11

Needed for proper handling of #36 (see #8)

Omikhleia, Dec 03 '22 19:12

A "minimal" ad-hoc implementation of link attributes on direct links was added in 8a15a6f0593b988edfdd747f55736dc9711f0927 in order to proceed with #36 (so that the latter works with both the markdown and the pandocast packages). There is therefore (slightly) less urgency for a proper implementation. Rescheduling to a later milestone.

Omikhleia, Jan 01 '23 13:01

See jgm/lunamark#43

Having read the above, I've put together a quick-and-dirty workaround in the meantime. I hope something more integrated will be introduced for a seamless workflow.

#!/usr/bin/env lua

--[[
This function parses the source file for linked-image URLs and downloads every image into the subdirectory 'dir'
of the working directory. To use this function, you must have 'aria2' installed on your system, a command-line
download utility that enables fast and efficient image downloads.
--]]

-- Quote a string so that it is passed to the POSIX shell as a single, literal argument.
local function shell_quote(s)
	return "'" .. s:gsub("'", "'\\''") .. "'"
end

local function download_images(source, dir)
	local count = 0
	local f = assert(io.open(source, "r"), "Error opening source file.")
	for line in f:lines() do
		-- Match every [![alt](thumbnail)](url) on the line and capture the outer URL.
		for url in line:gmatch("%[!%b[]%b()%]%((.-)%)") do
			-- aria2 will handle the downloading errors, if any.
			os.execute("aria2c " .. shell_quote(url) .. " -d " .. shell_quote(dir))
			count = count + 1
		end
	end
	f:close()

	print(string.format("\n%d links to the images used in '%s' have been downloaded to '%s'", count, source, dir))
end

-- Once all the images have been downloaded to a subdirectory, this function reformats the image links
-- in your Markdown so that they point to the local copies instead of the remote servers.
local function reformat_img_links(source)
	local count, subs = 0, 0
	local lines = {}
	local file, err = io.open(source, "r")
	if not file then
		io.stderr:write("Failed to read file: " .. err .. "\n")
		return
	end
	for line in file:lines() do
		-- Rewrite each [![alt](thumbnail)](url) as ![alt](images/<file name>){width=10cm}.
		local n
		line, n = line:gsub("%[!%[(.-)%]%(.-%)%]%((.-)%)", function(alt, url)
			-- Keep only the last path component of the URL as the local file name.
			local name = url:match("([^/]+)$") or url
			return "![" .. alt .. "](images/" .. name .. "){width=10cm}"
		end)
		if n > 0 then
			count = count + 1 -- lines containing at least one match
			subs = subs + n -- individual replacements
		end
		lines[#lines + 1] = line
	end
	file:close()

	-- Write to a temporary file so that the original file is not overwritten if something goes wrong.
	local temp = os.tmpname()
	local out, werr = io.open(temp, "w")
	if not out then
		io.stderr:write("Failed to write to temporary file: " .. werr .. "\n")
		return
	end
	for _, line in ipairs(lines) do
		out:write(line, "\n")
	end
	out:close()

	-- Move the temp file from /tmp to the working directory.
	-- os.execute returns true on success in Lua 5.2+ and the exit status 0 in Lua 5.1.
	local moved = os.execute("mv " .. temp .. " .")

	if moved == true or moved == 0 then
		print("\nFile moved successfully!\n")
	else
		print("\nFile moving failed.\n")
	end

	-- Print the number of patterns found and replacements.
	print("Found " .. count .. " patterns and " .. subs .. " replacements\n")

	-- We could rename the temp file to the source's name, but manual renaming is the safer option here.
	print(
		string.format(
			'All the links in "%s" have been replaced. Check the result in the file "%s",\nwhich resides in your working directory, before renaming it manually, in case something went wrong in\nthe replacement process.',
			source,
			(temp:gsub("/tmp/", ""))
		)
	)
end

-- Real action.
-- download_images(arg[1], "images_test")
reformat_img_links(arg[1])

no-vici, May 04 '23 03:05

Ah, @no-vici - reading your code snippet, I think it is something else.

This issue is about supporting extended (Pandoc) attributes on images and links. For instance, a direct image in Markdown is just ![caption](url), but Pandoc also allows:

![caption](url){ #id .class width=xxx }

(with an attribute containing an identifier to allow internal PDF links, an explicit width, etc.).
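To make the attribute syntax concrete, here is a minimal Lua sketch of how such a block could be split into an identifier, classes and key=value pairs. The function name `parse_attributes` and the shape of the returned table are assumptions made for illustration; this is not how lunamark or markdown.sile actually implement it, and quoted values containing spaces are not handled.

```lua
-- Hypothetical helper (not lunamark's API): parse a Pandoc-style attribute
-- block such as "{ #id .class width=10cm }" into a table.
local function parse_attributes(text)
	local attr = { id = nil, classes = {}, keyvals = {} }
	-- Accept the block with or without its surrounding braces.
	local inner = text:match("^%s*{(.-)}%s*$") or text
	for token in inner:gmatch("%S+") do
		local key, value = token:match("^([%w_%-]+)=(.*)$")
		if key then
			-- Strip optional double quotes around the value.
			attr.keyvals[key] = (value:gsub('^"(.*)"$', "%1"))
		elseif token:sub(1, 1) == "#" then
			attr.id = token:sub(2)
		elseif token:sub(1, 1) == "." then
			attr.classes[#attr.classes + 1] = token:sub(2)
		end
	end
	return attr
end

local attr = parse_attributes("{ #fig1 .wide width=10cm }")
print(attr.id, attr.classes[1], attr.keyvals.width)
```

A real implementation would live in the parser's grammar rather than in a standalone pattern-based function, so treat this only as a picture of what the syntax carries.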

As noted above, there's some support already for this in our version of the lunamark parsing library. But Markdown also has an indirect image syntax (and the same concept exists for links too), such as:

![caption][ref]
...
[ref]: url

This syntax is perhaps less well known, but it has the advantage of allowing references in documents, which can then be grouped in one place so that all the actual URLs appear together rather than being scattered.
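As a rough illustration of the indirect form, the Lua sketch below collects the reference definitions in a first pass and then rewrites each indirect image as a direct one. The function name `resolve_indirect_images` is hypothetical, titles and attributes on the definitions are ignored, and a real Markdown parser such as lunamark handles this in its grammar, not with patterns like these.

```lua
-- Hypothetical sketch (not lunamark's parser): collect "[ref]: url" definitions,
-- then rewrite indirect images "![caption][ref]" as direct ones.
local function resolve_indirect_images(text)
	local refs = {}
	-- First pass: record each reference definition, keyed by lowercased label.
	for line in (text .. "\n"):gmatch("(.-)\n") do
		local label, url = line:match("^%s?%s?%s?%[([^%]]+)%]:%s*(%S+)")
		if label then
			refs[label:lower()] = url
		end
	end
	-- Second pass: replace each indirect image whose label is known.
	local resolved = text:gsub("!%[([^%]]*)%]%[([^%]]+)%]", function(caption, label)
		local url = refs[label:lower()]
		if url then
			return "![" .. caption .. "](" .. url .. ")"
		end
		-- Unknown label: returning nothing keeps the original text unchanged.
	end)
	return resolved, refs
end

local out = resolve_indirect_images("![caption][ref]\n...\n[ref]: https://example.org/a.png")
print(out)
```

The definition lines are left in place in the output; dropping them, and deciding where attributes would attach, is exactly the part a proper parser-level implementation has to settle.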

The support for attributes in that case is however still missing (and so it's kind of useless for now). That's the point of this very issue: it needs to be added to the lunamark parsing code, and then markdown.sile has to take advantage of it too.

In other words, it's just a syntax thing, and it is wholly unrelated to the question of remote URLs being downloaded.

Omikhleia, May 04 '23 08:05

Oh, thank you. That was a very clear explanation. I wanted to ask about downloading images automatically, which Pandoc already does but SILE does not yet do. I only skimmed the discussion and thought it was about the feature I was after ;)

no-vici, May 04 '23 09:05