markdown.sile
(Markdown) link attributes on links, indirect images etc.
Lunamark only implements a subset of the link_attributes extension (and friends).
Leveraging Lunamark for full support would allow more than is currently supported (e.g. attributes on links to images).
See https://github.com/jgm/lunamark/issues/43
Needed for proper handling of #36 (see #8)
A "minimal" ad-hoc implementation of link attributes on direct links was added in 8a15a6f0593b988edfdd747f55736dc9711f0927 in order to proceed with #36 (so that the latter works with both the markdown and the pandocast packages). There is therefore (slightly) less urgency for a proper implementation. Rescheduling to a later milestone.
See jgm/lunamark#43
Having read the above, I've put together my own quick-and-dirty solution in the meantime. I hope something more integrated will be introduced for a seamless workflow.
#!/usr/bin/env lua
--[[
Scans the source file for linked images of the form [![alt](image)](url) and downloads each
target URL into a subdirectory of the working directory (aria2 creates it if needed).
To use this function, you must have 'aria2' installed on your system, a download utility
that enables fast and efficient downloads.
--]]
local function download_images(source, dir)
  local count = 0
  local f = assert(io.open(source, "r"), "Error opening source file.")
  for line in f:lines() do
    -- Capture the outer URL of every linked image on the line, not just the first one.
    for url in line:gmatch("%[!%b[]%b()%]%((.-)%)") do
      -- Single-quote the arguments so that the shell does not interpret '&', '?', etc.
      -- aria2 will report download errors, if any.
      os.execute(string.format("aria2c '%s' -d '%s'", url, dir))
      count = count + 1
    end
  end
  f:close()
  print(string.format("\n%d images linked from '%s' have been downloaded to '%s'", count, source, dir))
end
-- Once all the images have been downloaded to a subdirectory, this function rewrites the image links
-- in your Markdown so that they point to the local copies instead of the remote servers.
local function reformat_img_links(source)
  local count, subs = 0, 0
  local lines = {}

  local file, err = io.open(source, "r")
  if not file then
    io.stderr:write("Failed to read file: " .. tostring(err) .. "\n")
    return
  end
  for line in file:lines() do
    if line:match("%[!%[.-%]%(.-%)%]%(.-%)") then
      count = count + 1
      -- Capture the alt text (1) and the outer URL (2) of [![(1)](...)]((2)) and
      -- rewrite them as ![(1)]((2)){width=10cm}
      local n
      line, n = line:gsub("%[!%[(.-)%]%(.-%)%]%((.-)%)", "![%1](%2){width=10cm}")
      subs = subs + n
      -- Replace the remote part of each URL (up to its last "/") with "images/"
      line = line:gsub("https?://[^%s)]*/", "images/")
    end
    lines[#lines + 1] = line
  end
  file:close()

  -- Write to a temporary file to avoid overwriting the original file in case of an error.
  local temp = os.tmpname()
  local out, werr = io.open(temp, "w")
  if not out then
    io.stderr:write("Failed to write to temporary file: " .. tostring(werr) .. "\n")
    return
  end
  for _, line in ipairs(lines) do
    out:write(line, "\n")
  end
  out:close()

  -- os.tmpname() already returns a full path (e.g. under /tmp), so the file can be moved as-is
  -- from there to the working directory.
  local moved = os.execute("mv '" .. temp .. "' \"$(pwd)\"")
  -- os.execute returns true on success in Lua 5.2+, and the exit status 0 in Lua 5.1.
  if moved == true or moved == 0 then
    print("\nFile moved successfully!\n")
  else
    print("\nFile moving failed.\n")
  end

  -- Print the number of matching lines and replacements.
  print("Found " .. count .. " matching lines and made " .. subs .. " replacements\n")
  -- The temp file could be renamed to the source's name automatically, but manual renaming is safer.
  print(
    string.format(
      'All the links in "%s" have been replaced. Check the result in the file "%s",\nwhich resides in your working directory, before renaming it manually, in case something went wrong\nduring the replacement.',
      source,
      (temp:gsub("^.*/", ""))
    )
  )
end
-- Real action.
-- download_images(arg[1], "images_test")
reformat_img_links(arg[1])
Ah, @no-vici - reading your code snippet, I think it is something else.
This issue was about supporting extended (Pandoc) attributes on images and links.
A direct image in Markdown is just ![caption](url), but Pandoc also allows, for instance:
![caption](url){ #id .class width=xxx }
(with an attribute containing an identifier to allow internal PDF links, an explicit width, etc.).
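To make the attribute syntax concrete, here is a minimal sketch of how such a block could be split into an identifier, classes and key/value pairs. It is only an illustration: `parse_attributes` is a hypothetical helper, not part of lunamark or markdown.sile, and it assumes whitespace-separated tokens (quoted values containing spaces are not handled).

```lua
-- Hypothetical helper (not lunamark's API): parse a Pandoc-style attribute
-- block such as "{ #fig1 .wide width=10cm }" into its three components.
local function parse_attributes(text)
  local attr = { id = nil, classes = {}, keyvals = {} }
  -- Strip the surrounding braces, if present.
  text = text:match("^%s*{(.-)}%s*$") or text
  for token in text:gmatch("%S+") do
    local id = token:match("^#(.+)$")
    local class = token:match("^%.(.+)$")
    local key, value = token:match("^([%w_%-]+)=(.+)$")
    if id then
      attr.id = id
    elseif class then
      attr.classes[#attr.classes + 1] = class
    elseif key then
      -- Drop optional surrounding double quotes around the value.
      attr.keyvals[key] = (value:gsub('^"(.*)"$', "%1"))
    end
  end
  return attr
end

-- Usage: the identifier, first class and width are now available separately.
local attr = parse_attributes("{ #fig1 .wide width=10cm }")
print(attr.id, attr.classes[1], attr.keyvals.width)
```

A real parser would have to cope with quoted values and malformed input; the sketch only shows the three kinds of information an attribute block carries.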
As noted above, there's some support already for this in our version of the lunamark parsing library. But Markdown also has an indirect image syntax (and the same concept exists for links too), such as:
![caption][ref]
...
[ref]: url
This syntax is perhaps less well known, but it has the advantage of allowing references in documents, and of grouping them somewhere so that all the actual URLs appear together rather than being scattered.
The support for attributes in that case is however still missing (and so it's kind of useless for now). That's the point of this very issue: it needs to be added to the lunamark parsing code, and then markdown.sile has to take advantage of it too.
In other words, it's just a syntax thing, and it is wholly unrelated to the question of remote URLs being downloaded.
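As an illustration of the indirect syntax described above, the following sketch rewrites reference-style images as direct ones using the definitions found in the same text. `resolve_image_references` is a hypothetical helper written for this comment, not how lunamark resolves references, and it deliberately ignores attributes, which is exactly the missing part this issue is about.

```lua
-- Hypothetical helper: turn ![caption][ref] into ![caption](url), using the
-- "[ref]: url" definitions found in the same Markdown text.
local function resolve_image_references(markdown)
  local refs = {}
  -- First pass: collect the definitions, one per line (labels are case-insensitive).
  for line in (markdown .. "\n"):gmatch("(.-)\n") do
    local label, url = line:match("^%s*%[([^%]]+)%]:%s*(%S+)")
    if label then
      refs[label:lower()] = url
    end
  end
  -- Second pass: replace every reference-style image whose label is known.
  local resolved = markdown:gsub("!%[([^%]]*)%]%[([^%]]*)%]", function(caption, label)
    -- An empty label, as in ![caption][], falls back to the caption itself.
    if label == "" then
      label = caption
    end
    local url = refs[label:lower()]
    if url then
      return "![" .. caption .. "](" .. url .. ")"
    end
    -- Returning nothing keeps the original text for unknown references.
  end)
  return resolved
end

-- Usage: the definition line is kept, only the image itself is rewritten.
print(resolve_image_references("![caption][ref]\n\n[ref]: https://example.com/a.png"))
```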
Oh, thank you. That was a very clear explanation. I wanted to ask about downloading images automatically, as Pandoc already does but SILE does not yet. I only skimmed the discussion and thought it was about the feature I was after ;)