How to replace "/"? (How to express a forward slash as a literal character?)

Open KonoVitoDa opened this issue 2 years ago • 3 comments

I need to replace <br /> in Pixiv's 'caption'. I tried {caption:R<br />//} with no success.

EDIT: Also, is there any way to replace a "True" value? On Reddit, I would like "is_original_content" in a given field to be converted to "OC" when its value is True. With is_original_content:?OC:// I got "OC:True", but that's not ideal.

KonoVitoDa avatar Jul 07 '22 05:07 KonoVitoDa

Using / in format specifiers like R<br />// is not possible, because the "parser", if you can even call it that, is very primitive. It simply uses all / characters as argument separators and does not allow for any escapes like \/.
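To see why the / inside <br /> gets in the way, here is a toy model of a separator-based split. It is a simplified sketch under the assumption that every / is treated as a separator, not gallery-dl's actual code; the helper name split_spec is made up for illustration.

```python
# Toy model: treat every "/" as an argument separator, with no escapes.
# split_spec is a hypothetical helper, not part of gallery-dl.
def split_spec(spec, sep="/"):
    """Split a format specifier into at most three parts on `sep`."""
    return spec.split(sep, 2)

# Intended meaning: replace "<br />" with "". The "/" inside the tag is
# indistinguishable from a separator, so the pieces come out wrong.
parts = split_spec("R<br />//")
print(parts)  # ['R<br ', '>', '/']
```

The search text ends up as "<br " instead of "<br />", which is why the replacement never matches the caption.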

If your format string is not particularly complex, or only contains {caption}, you can build the value with a direct Python expression instead. This allows all sorts of transformations and uses a proper parser: \fE caption.replace('<br />', '')

The potential problem here is that this can only be applied to an entire format string and not to a single replacement field.
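As a rough picture of what the expression form does, the snippet below evaluates the same expression against a sample metadata dictionary. This is only a sketch: the sample caption is invented, and the use of eval here is an assumption made for illustration rather than a description of gallery-dl's internals.

```python
# Sample metadata; the caption text is made up for this example.
metadata = {"caption": "First line<br />Second line"}

# The expression that would follow "\fE " in the format string.
expression = "caption.replace('<br />', '')"

# Evaluate the expression with the metadata fields available as names.
result = eval(expression, {}, metadata)
print(result)  # First lineSecond line
```

Because the whole string is one expression, any other fields have to be joined in by hand, for example with + or str.format().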


Also, is there any way to replace a "True" value? On Reddit, I would like "is_original_content" in a given field to be converted to "OC" when its value is True. With is_original_content:?OC:// I got "OC:True", but that's not ideal.

Well, you could use the \fE thing again: \fE 'OC' if is_original_content else ''

Or you can convert the boolean True/False to a string first and replace its values accordingly: {is_original_content!s:RTrue/OC/RFalse//}
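In plain Python terms, the two suggestions correspond roughly to the functions below. The function names are hypothetical and exist only to put the two approaches side by side.

```python
def oc_by_expression(is_original_content):
    # Mirrors the expression: 'OC' if is_original_content else ''
    return "OC" if is_original_content else ""


def oc_by_replace(is_original_content):
    # Mirrors: {is_original_content!s:RTrue/OC/RFalse//}
    # "!s" converts the value to a string, then each R step
    # replaces one substring with another.
    text = str(is_original_content)
    return text.replace("True", "OC").replace("False", "")


for flag in (True, False):
    print(repr(oc_by_expression(flag)), repr(oc_by_replace(flag)))
# 'OC' 'OC'
# '' ''
```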

mikf avatar Jul 08 '22 11:07 mikf

{is_original_content!s:RTrue/OC/RFalse//}

This worked great, thanks!

And I don't even know how to use the \fE thing (where and how to place it), sorry. 😅 I need to study programming properly someday.

KonoVitoDa avatar Jul 09 '22 01:07 KonoVitoDa

\fE (or \fT or \fM etc.; see https://github.com/mikf/gallery-dl/blob/master/docs/formatting.md#special-type-format-strings) goes at the beginning of a format string and tells gallery-dl that this format string should not be interpreted like a regular one ({id}_{title}.{extension}), but as something else.

\fE in this case means Python Expression, where you can directly use functions like str.replace() (or delete all your files, if you really wanted to ...)

In other news, I've added a global option that lets you change the character used as argument separators (https://github.com/mikf/gallery-dl/commit/74865adae56230b611162208f5b496923cd5e361). Set format-separator to "#", for example, and you can do {caption:R<br />##}, but be aware that this affects all format strings in your config.
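The effect of changing the separator can be sketched with a toy replacement function. apply_replace is a hypothetical helper written for this illustration and only models the "R" specifier in a simplified way; it is not gallery-dl's implementation.

```python
# Toy model of an "R<old><sep><new><sep>" replacement with a configurable
# separator. apply_replace is a hypothetical helper, not gallery-dl code.
def apply_replace(value, spec, sep="/"):
    old, new, _rest = spec.split(sep, 2)
    old = old[1:]  # drop the leading "R"
    return value.replace(old, new)


caption = "First line<br />Second line"  # made-up sample caption

# With "#" as the separator, the "/" in "<br />" is just a character.
cleaned = apply_replace(caption, "R<br />##", sep="#")
print(cleaned)  # First lineSecond line
```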

mikf avatar Jul 10 '22 17:07 mikf

but be aware that this affects all format strings in your config.

Still no way to limit it to just some extractors/postprocessors/strings?

Alternatively, how could I write a regex for Notepad++ to find every / used as a format separator in my config?

KonoVitoDa avatar Jun 25 '23 20:06 KonoVitoDa

No, still no way to do that and there most likely never will be.

Just use f-strings. They work very similarly to regular format strings in basic cases and allow for a lot more control when you want to transform values.

\fF {'OC' if is_original_content else ''}
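For comparison, here is how the same idea looks as an ordinary Python f-string, with each field transformed on its own. The sample values are invented, and this sketch only shows the f-string syntax itself, not how gallery-dl processes the \fF prefix.

```python
# Made-up sample values standing in for metadata fields.
title = "sketch"
is_original_content = True
caption = "First line<br />Second line"

# Plain Python equivalent of:
#   \fF {title} {'OC' if is_original_content else ''}
name = f"{title} {'OC' if is_original_content else ''}"
print(name)  # sketch OC

# Per-field transformation, including a literal "/" in the search text.
clean = f"{caption.replace('<br />', '')}"
print(clean)  # First lineSecond line
```

Unlike the \fE form, each replacement field carries its own expression, so the rest of the string stays a normal template.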

Post your format strings and I'll convert them for you if you can't do it yourself.

mikf avatar Jun 26 '23 11:06 mikf

Post your format strings and I'll convert them for you if you can't do it yourself.

Nvm, I was able to do it myself by searching for them with (\{.*?)(\:)(\?|L|J|R|D|O)(.*?)(\/)(.*?)(\}), and then replacing them.
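The same pattern can be tried out in Python before running it over a whole config. This is a small sketch using Python's re module, which is assumed to behave the same as Notepad++'s regex engine for this particular pattern; the sample strings are taken from earlier in the thread.

```python
import re

# The search pattern from the comment above, unchanged.
pattern = re.compile(r"(\{.*?)(\:)(\?|L|J|R|D|O)(.*?)(\/)(.*?)(\})")

samples = [
    "{is_original_content!s:RTrue/OC/RFalse//}",  # uses "/" as separator
    "{id}_{title}.{extension}",                   # no format specifier
]

# Record which samples contain a specifier with a "/" in it.
results = [bool(pattern.search(text)) for text in samples]
for text, found in zip(samples, results):
    print(text, "->", found)
```

Note that the pattern only locates candidate fields; it also matches a field whose specifier contains a literal / (such as one inside <br />), so each hit still needs a manual look.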

KonoVitoDa avatar Jun 28 '23 19:06 KonoVitoDa