Paste Config Improvement: add support for a filter function in pasteConfig to allow matching elements by attributes
Motivation
- Need a central place to validate/normalize app-generated paste payloads.
- Current gaps:
  - Multiline paste: patterns aren’t applied because the payload is treated as HTML.
  - Can’t apply cross-line rules (e.g., “entire bold line → header”) before blocks are created.
  - Special formatting cases are hard to solve tool-by-tool (see #2909).
Proposal
- Add a top-level pasteInterceptor option that receives PasteData(s) and can transform/validate them before insertion.
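To make the proposal concrete, here is a minimal sketch of what such a hook could look like. Everything in it is an assumption for illustration: `PasteDataLike` is a simplified stand-in for the real PasteData, and `PasteInterceptor`, `boldLineToHeader`, and `applyInterceptor` are hypothetical names, not existing Editor.js API.

```typescript
// Hypothetical, simplified stand-in for Editor.js PasteData (not the real type).
interface PasteDataLike {
  tool: string;    // name of the block tool that will receive the content
  text: string;    // text content of the pasted fragment
  isBold: boolean; // assumed flag: the whole line was bold in the source
}

// Hypothetical signature for the proposed hook: receives all items, returns the items to insert.
type PasteInterceptor = (items: PasteDataLike[]) => PasteDataLike[];

// Example cross-line rule: drop empty lines and turn entirely bold lines into headers.
const boldLineToHeader: PasteInterceptor = (items) =>
  items
    .filter((item) => item.text.trim().length > 0)
    .map((item) => (item.isBold ? { ...item, tool: "header", isBold: false } : item));

// Stand-in for the point where the editor would call the hook before creating blocks.
function applyInterceptor(
  items: PasteDataLike[],
  interceptor?: PasteInterceptor
): PasteDataLike[] {
  return interceptor ? interceptor(items) : items;
}
```

Because the hook sees the whole array at once, rules that depend on neighbouring lines can run before any block exists.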
Prior art / PoC
- Special formatting context: https://github.com/codex-team/editor.js/issues/2909
- Minimal implementation sketch: https://github.com/tylox-team/editor.js/commit/d3d88da262541257c3659ea827a35692445360e4
Next steps
- If this sounds reasonable, I can open a PR with the solution.
Special formatting cases are hard to solve tool-by-tool
Pasting from Google Docs Strips Basic Text Formatting (Bold, Italic, Underline)
First, we need to find the cause of that problem.
@neSpecc The reason is simple: Google Docs only applies formatting through the style attribute (my simple solution):
<meta charset="utf-8" />
<span
style="
font-size: 11pt;
font-family: Arial, sans-serif;
color: #000000;
background-color: transparent;
font-weight: 700;
font-style: normal;
font-variant: normal;
text-decoration: none;
vertical-align: baseline;
white-space: pre;
white-space: pre-wrap;
"
id="docs-internal-guid-a778f21e-7fff-8120-1a71-a31507244384"
>bold</span
><span
style="
font-size: 11pt;
font-family: Arial, sans-serif;
color: #000000;
background-color: transparent;
font-weight: 400;
font-style: normal;
font-variant: normal;
text-decoration: none;
vertical-align: baseline;
white-space: pre;
white-space: pre-wrap;
"
>, </span
><span
style="
font-size: 11pt;
font-family: Arial, sans-serif;
color: #000000;
background-color: transparent;
font-weight: 400;
font-style: italic;
font-variant: normal;
text-decoration: none;
vertical-align: baseline;
white-space: pre;
white-space: pre-wrap;
"
>italic</span
><span
style="
font-size: 11pt;
font-family: Arial, sans-serif;
color: #000000;
background-color: transparent;
font-weight: 400;
font-style: normal;
font-variant: normal;
text-decoration: none;
vertical-align: baseline;
white-space: pre;
white-space: pre-wrap;
"
>, plain</span
>
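The markup above carries bold and italic only in the `style` attribute, so detection has to read the inline style. A minimal sketch of that check, assuming the style string is available as plain text; `parseInlineStyle` and `detectFormatting` are hypothetical helpers, and the splitting is naive (it would mis-handle a `;` inside a `url(...)` value):

```typescript
// Parse an inline style string such as "font-weight: 700; font-style: normal"
// into a property-to-value map. Later declarations overwrite earlier ones, as in CSS.
function parseInlineStyle(style: string): Map<string, string> {
  const result = new Map<string, string>();
  for (const declaration of style.split(";")) {
    const separator = declaration.indexOf(":");
    if (separator === -1) continue; // skip empty or malformed declarations
    const property = declaration.slice(0, separator).trim().toLowerCase();
    const value = declaration.slice(separator + 1).trim().toLowerCase();
    if (property) result.set(property, value);
  }
  return result;
}

// Decide which inline formatting a span represents, based only on its style string.
function detectFormatting(style: string): { bold: boolean; italic: boolean } {
  const props = parseInlineStyle(style);
  const weight = props.get("font-weight") ?? "";
  const numeric = parseInt(weight, 10);
  // Treat the keyword "bold" or any numeric weight above 400 as bold.
  const bold = weight === "bold" || (!Number.isNaN(numeric) && numeric > 400);
  const italic = props.get("font-style") === "italic";
  return { bold, italic };
}
```

With the sample above, the first span (`font-weight: 700`) would be reported as bold and the third (`font-style: italic`) as italic.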
An equally significant problem is pattern processing when multiple formatted blocks are pasted.
Another problem is that there is no way to define custom logic for treating a bold paragraph as a header.
P.S. All of this was tested with v2.30.6.
I think we should extend our API to support matching elements by attributes as well.
Our pasteConfig allows specifying a filter function.
// bold inline tool
static get pasteConfig(): PasteConfig {
  return {
    tags: [{
      bold: true,
      b: true,
      span: (el) => {
        return parseInt(el.style.fontWeight, 10) > 400
      }
    }],
  };
}
But currently we don't check this filter function (it is used only for sanitizing) in paste processing. We need to improve getTagsConfig to support it.
In case of TS problems, this may be related to #2957.
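As a rough illustration of what "checking the filter function during paste processing" could mean, here is a minimal sketch. It does not reproduce the real getTagsConfig; `ElementLike`, `TagRule`, `TagsConfig`, and `findToolForElement` are hypothetical names, and the element type is reduced to the two fields the example needs so it runs without a DOM.

```typescript
// Minimal element shape for the sketch; only the fields used below.
interface ElementLike {
  tagName: string;
  style: { fontWeight: string };
}

// A tag rule is either a plain flag or a predicate over the element.
type TagRule = boolean | ((el: ElementLike) => boolean);
type TagsConfig = Record<string, TagRule>;

// Hypothetical helper: return the first tool whose rule for this tag accepts the element.
function findToolForElement(
  el: ElementLike,
  tools: Record<string, TagsConfig>
): string | undefined {
  const tag = el.tagName.toLowerCase();
  for (const [toolName, config] of Object.entries(tools)) {
    const rule = config[tag];
    if (rule === undefined || rule === false) continue; // tag not handled by this tool
    if (rule === true || rule(el)) return toolName;     // flag, or predicate accepted
  }
  return undefined;
}

// Usage: the bold tool claims <b> always, and <span> only when the weight is above 400.
const toolsConfig: Record<string, TagsConfig> = {
  bold: {
    b: true,
    span: (el) => parseInt(el.style.fontWeight, 10) > 400,
  },
};
```

The design point is that a predicate is evaluated against the actual element, so a `span` is matched only when its attributes qualify rather than for every `span`.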
@neSpecc sounds good
As for handling patterns: currently, when you paste multiple formatted blocks, Editor.js calls processText(data, isHTML = true).
When there is more than one block, processPattern is not called.
Maybe it makes sense to change this part as follows:
dataToInsert.map(
  async (content, i) => {
    const patternContent = (await this.processPattern(content.data.textContent.trim())) as PasteData;
    return this.insertBlock(patternContent || content, i === 0 && needToReplaceCurrentBlock);
  }
);
what do you think?
P.S. Let me describe the situation: when transferring articles from a Google Doc, the source already contains links to related resources, and it is quite tedious to go through each of them to trigger processPattern after copy-pasting.
Sorry, I'm not sure I've got this.
@neSpecc sorry, I just wanted to explain a scenario in which patterns aren't processed:
- You have a Google Doc containing formatted text, plain text, and links (without formatting), as shown in the screenshot.
- You copy all these paragraphs of text and paste them into editor.js.
- The link should have been converted into an embed block, but in fact, it remains plain text.
This happens because processPattern isn't called when you paste multiple paragraphs at once. To turn them into embed blocks, you have to go through each such link after insertion and convert it manually.
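The scenario can be sketched as running the pattern check once per pasted paragraph instead of only for single-line pastes. This is a simplified model, not the Editor.js internals: `BlockLike`, `patterns`, `matchPattern`, and `applyPatternsPerBlock` are hypothetical, and the YouTube regex is only an example pattern.

```typescript
// Simplified stand-in for a block that is about to be inserted.
interface BlockLike {
  tool: string;
  text: string;
}

// Hypothetical pattern table: the key names the tool that should handle a match.
// No global flag, so RegExp.test stays stateless between calls.
const patterns: Record<string, RegExp> = {
  embed: /^https?:\/\/(www\.)?youtube\.com\/watch\?v=[\w-]+$/,
};

// Return the name of the first tool whose pattern matches the whole text.
function matchPattern(text: string): string | undefined {
  for (const [tool, pattern] of Object.entries(patterns)) {
    if (pattern.test(text)) return tool;
  }
  return undefined;
}

// Check every pasted block; blocks without a match are passed through unchanged.
function applyPatternsPerBlock(blocks: BlockLike[]): BlockLike[] {
  return blocks.map((block) => {
    const trimmed = block.text.trim();
    const tool = matchPattern(trimmed);
    return tool ? { tool, text: trimmed } : block;
  });
}
```

In this model a paragraph that consists solely of a matching link becomes an embed block, while formatted and plain paragraphs around it stay as they are.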
Got it. It is a separate feature, we can support it as well.
@neSpecc should I create separate issues for the discussed features (one for processPattern and one for getTagsConfig), or can we keep it here?
+1, we’d really benefit from this feature as well.
Our use case is very similar: we need a single, centralized place to normalize and validate all pasted content (HTML from Google Docs/Word, multi-line text, etc.) before blocks are created and tool-specific onPaste logic runs. Right now this kind of logic is hard to implement reliably from the outside without hacking around Editor.js’ internal paste handling and risking issues with selection/undo/history.
A global pasteInterceptor hook at the Editor level (with access to the full PasteData[] and the API) would solve this cleanly and make advanced paste pipelines much easier to build.
Is this something that’s realistically on the roadmap? And if yes, is there any rough timeline or guidance on the preferred API shape, so that the community could help with a PR that’s likely to be accepted?
@piotrp321 if you need a solution for now, you can use my patch, but it targets v2.30.6
@SunnyCapt for the second feature, it would be better to create a separate issue