FiltaQuilla
Feature Request: Additional Regex Search options
- From (RegEx)
- To/Cc/Bcc (RegEx)
- To (Regex)
- Cc (Regex)
- Bcc (Regex)
- Body (RegEx)
Body is important, & all others, but i know a few in the list already exist.
I think generally there is nothing wrong with any of these requests - It may just come across a little spammy if we have a separate option for each of these. I wonder if there was a way to put them into "one item" and add the header name as additional parameter into the textbox?
Not really spammy, as all i need is to locate what i want in a list.. even if the lists already rich (excluding regex). "I wonder if there was a way to put them into "one item" and add the header name as additional parameter into the textbox?". This complicates things, forcing people to add more parameters after having built the reg. expression.. i'd rather see each clear function to select.. regex being already a hard task in specific situations.
Also, the "match" term is needless, cause "is", "isn't", "contains", "doesn't contain" must be options for regex too.
sounds like you want to make it even harder for me. To me there should be "match" (and maybe "doesn't match") and the rest should be controlled via the regex. If just 1 match is found within the field I would assume this to be true.
This is the original function from Kent James:
my first job will be to make this compatible with Thunderbird 78. Then I will look into whether it is possible (and easy to understand for the end user + the regular expression engine) to expand its behavior.
I think adding support for more search fields first (as you requested in the original post above) makes more sense and shouldn't be too hard. Working with body & attachments is always a problem, as filtering seems to be synchronous and will often get you a nice lock-up, especially when dealing with IMAP mail: all mails have to be downloaded in full before running the filters, because you cannot run any of these commands "Server-side".
Client-side filtering is a bit of a minefield: debugging it requires running Thunderbird with a command line parameter and looking at an external log file, because it runs mostly on the (unscriptable) C++ layer. That is also a reason I don't intend to commercialize FiltaQuilla: I really don't want to debug core code and be forced to have a fully functional Visual Studio build environment depending on multiple gigabytes of source code.
You may as well throw in an option linking to RegEx101 while you're at it. BIG WINK
match/doesn't match sounds fair enough Axel, as containing or not can be controlled through the regex itself, anyways. with the actual filters in thunderbird + your new full regex filtering system, spam will finally DIE. it's been a decades wish lol
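The idea that "contains" and "is" can be left to the pattern, with only "match" and "doesn't match" as operators, can be sketched roughly as below. This is only an illustration: `MATCHES`, `DOESNT_MATCH` and `evaluateTerm` are hypothetical names, not FiltaQuilla code, and the address is a made-up sample value.

```javascript
// Hypothetical operator constants; FiltaQuilla's real values are not shown in this thread.
const MATCHES = "matches";
const DOESNT_MATCH = "doesn't match";

// A single match anywhere in the field counts as "found".
function evaluateTerm(op, pattern, flags, fieldValue) {
  const found = new RegExp(pattern, flags).test(fieldValue);
  return op === MATCHES ? found : !found;
}

const toField = "john.doe@example.org"; // made-up sample value

// "contains": an unanchored pattern
console.log(evaluateTerm(MATCHES, "john", "i", toField));       // true
// "is": anchor the pattern at both ends
console.log(evaluateTerm(MATCHES, "^john$", "i", toField));     // false
// "doesn't contain": the negated operator
console.log(evaluateTerm(DOESNT_MATCH, "spam", "i", toField));  // true
```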
& a last wish:
Which means that the addon will keep in an updated text file & not vm, my tb's email accounts auto updated by the addon itself & if none of them appear, or even nothing appears in To field, i'll take action: delete..
that would be better put into a separate issue, as it has nothing to do with regex things... It's hard enough to keep track of everything, please don't piggyback unrelated issues on the same thread.
OK, so the workaround for most of these is using Header Regex Match (as you know already, but other users may not, hence I am documenting it here), with the following parameters:
- From (RegEx) - "from:regular Expression"
- To (Regex) - "to:regular Expression"
- Cc (Regex) - "cc:regular Expression"
- Bcc (Regex) - "bcc:regular Expression"
... which leaves:
- To/Cc/Bcc (RegEx)
- Body (RegEx)
They should be implemented first because there is no workaround.
Fully support xlovinglyx's suggestion for a "Body (RegEx) match" clause ! Plz note the "URi data (base64 encoded strings)" issue mentioned in https://github.com/wangvisual/expression-search/issues/102 to be considered as well. If there is anything I can do aside from donating to FiltaQuilla, e.g. beta testing a "Body (RegEx) match" enhancement, plz let me know. Cheers - Rob
Raven, you started with:
"I think generally there is nothing wrong with any of these requests" Then you decided to ignore the way i'd like them to be disposed: all regex sections, instead of the stupid header regex field where i should type the regex type myself, instead of including the regex type. & as i had told you, the body was refused in the past & you show me it can't be added now.
i'm giving up on filtaquilla.
I wasn't ignoring it, but I have a very busy Christmas period - also where in this thread was I saying it is impossible with body? It may be tricky because it requires the whole mail to be downloaded first, but so do any filter conditions that work in body.
I want to implement "body" and "to/bcc/cc" first because they cannot be implemented by the user with the workaround (regex with header). Once they are done we can move on to breaking out separate items for From (RegEx) / To (RegEx) / Cc (RegEx) and Bcc (RegEx) - that last one is a bit of a head scratcher as it won't work for incoming mails (the bcc is not transmitted).
By the way I just cross-linked from: https://thunderbird.topicbox.com/groups/addons/T1a9d1732af0b0db2/contribution-request-2 to show that I am interested in implementing that functionality. It's just not fair to expect me to implement these functions over the holiday period without it even being a commercial Add-on. I have spent plenty of hours testing and trying to understand my user's needs, but new features have to be introduced carefully. Yesterday I spent 2 hours debugging Thunderbird because of #79 to find out that there is likely a low level problem in the C++ layer that loads the filters from disk. You probably don't know how time-demanding and complex this work really is.
It's not a matter of knowing how complex this is, it's just that i've never been a fan of downgrades & usually hardly follow them. it doesn't matter to me if it takes time, i just need to make sure it's in your todo list, to simplify user's experience.
"they cannot be implemented by the user with the workaround (regex with header). Once they are done we can move on to breaking out separate items for From (RegEx) / To (RegEx) / Cc (RegEx) and Bcc (RegEx)"
if you're referring to having all these strings: From (RegEx), To/Cc/Bcc (RegEx), To (Regex), Cc (Regex), Bcc (Regex), Body (RegEx) inside the main list, where people usually find: To, From, Subject, Body, Age in days
Then i'm ok with it, cause i can't amuse myself all the time in the future to add strings like: x-spam-status:SUBJ_ALL_CAPS etc etc
I thought you were going to add the body regex inside header too.
Even the headers regex field is complex, cause it requires a manual (a page to go pick the right section), while there should be a way of adding a popup list of all headers regex proposals in the filters window. Directly during selection of that headers regex option, a popup should appear & then people can select the wanted value. This would help a lot, to have all tools in hand from the same interface.
For now, i can't use the addon system at all, cause many updates are awaiting & i can't change my regex filters all the time. so i'll be waiting.
Please consider the popup filter, it kills me to have another page to check for values all the time & adding them before the regexpression.
xlovinglyx: I don't get it robstroess.. @robstroess, You said you were working on the "Body (Regex) Match" for filtaquilla, on wang's comments. Are you still on it or is it done ?
Sorry to have caused confusion. To clarify: I tried to migrate Wang's "expression search" to current TB78 via intermediate step towards TB68, as recommended by devs. I failed on this already on the attempt to port to TB68. Next I tried to use avegaweiss's port to TB68.12 as a starting point - I could not get it to run - the "body (regex)" section just missed popping up a box to input a regex clause. Also, I'm a noob in programming TB extensions. Hence I completely withdraw from porting Wang's work. Fiddling around with Raven2000's Filtaquilla is far beyond my expertise - so I can only hope for Raven2000 to somewhen implement a "body (regex)" feature. cheers - rob
"It's not a matter of knowing how complex this is, it's just that i've never been a fan of downgrades & usually hardly follow them. it doesn't matter to me if it takes time, i just need to make sure it's in your todo list, to simplify user's experience."
Yes, as you may not have noticed: This github is my todo list and I have assigned myself to this issue. And I am replying and clarifying here, always. That basically means it's on my todo list. If I want to prioritize it before others, such as issue #79 then I pin it to the top (there can only be 3 issues with priority, which probably makes sense).
"Not sure about the "popup" you keep mentioning, I don't think I want to implement a fourth dropdown within the condition rows. Might add a little button to help building the advanced expression, but we can discuss UI there."
If you have time after implementing all other regex's individually, that would help a lot & prevent people from adding strings all the time before the regular expression, and i'm not discussing about a 4th dropdown box. Someone selects "Header Regex Match" a popup appears, & the regarded section is finally clicked & appears in that same initial dropdown box. if it's too much work, forget it, we'll play with adding those sections/strings before the regex then.
Or if possible, if that eases the process, all options appear on a pop up line by line & then if i click on one option: the exact string before the regular expression is auto-pasted in the field where i'm supposed to add my regex .. . Like that, your "headers regex match" doesn't change on the first dropdown list.
Also, as a third option, if it's complicated to add a popup from the first dropdown box, selecting options for "headers regex match" that would add the pre-string in the regex field, from your proposed "button" would be great too.
"Or if possible, if that eases the process, all options appear on a pop up line by line & then if i click on one option: the exact string before the regular expression is auto-pasted in the field where i'm supposed to add my regex .. . Like that, your "headers regex match" doesn't change on the first dropdown list."
What makes it easier is just a button to invoke the extra dialog box, instead of relying on the select event, something like this:
Clicking on (+) should then open up a new dialog with a selection, a free textbox for typing a non-standard header and a second, larger edit box for pasting the Regular Expression. Upon pressing OK it would automatically build the syntax "header:regular Expression" and either append or insert (with replacing) this string into the condition value field.
I'm lost..
so for the upcoming individual regex options i've mentioned on top of the list of this suggestion, i'll have to type:
to:andHEREtheREGexpression ? even after breaking out separate items for From (RegEx) / To (RegEx) / Cc (RegEx) and Bcc (RegEx) Body(RegEx) ? or is it just for now with the "header regex match"
i never had this with wangvisual (( if i had a clear To (Regex) in the first dropdown menu, there would be no need to retype to:
Also, it's hard to follow this & what will be selectable from that button.. https://quickfilters.quickfolders.org/filtaquilla.html#header_regex
This for example: is a language i don't expect nor understand nothing about:
"x-spam-status:SUBJ_ALL_CAPS
and then I will search for all x-spam-status headers that contain “SUBJ_ALL_CAPS”. This example is a little contrived, because you could do the same thing with the older custom header search in TB, but the difference is that you could use something more complex than simply a string if you wanted as your regex."
I understand this: x-spam-status: < added option in about:config, for example.. but what comes after is not a regular expression ... . but this is a regex: /SUBJ_ALL_CAPS/gmi & this is a regex that should normally work too: /([>]|\s|^|"|<<)(-?)(i)('ll|'?d|'?ve|'?m|\b)([<]|\s)/gmi after breaking out separate items for From (RegEx) / To (RegEx) / Cc (RegEx) and Bcc (RegEx) Body(RegEx), as well.
so i'd rather type: x-spam-status:/SUBJ_ALL_CAPS/gmi
by the way, the button system for "header regex match" is perfect.
"I'm lost..
so for the upcoming individual regex options i've mentioned on top of the list of this suggestion, i'll have to type:
to:andHEREtheREGexpression ? even after breaking out separate items for From (RegEx) / To (RegEx) / Cc (RegEx) and Bcc (RegEx) Body(RegEx) ?"
No, that makes no sense at all. Just the regular expression, obviously; no need for the "header:" syntax. The additional parameter is only for the generic "header (regex)" function. Initially I will only add 2 things: body and the combined From/To/Cc. Once they work we can think of adding the other ones that are already covered by the header function. I don't like that one very much because the conditions menu is already way too long and cluttered.
"i never had this with wangvisual (( if i had a clear To (Regex) in the first dropdown menu, there would be no need to retype to:
Also, it's hard to follow this & what will be selectable from that button.."
I didn't make a GUI yet but it will be a drop down menu listing the most common headers. To, From, Cc etc. you select that and the top field will then be populated accordingly:
header: from
regular expression:
you simply paste your regex into the separate field "regular expression"
once you click ok on that dialog it will concatenate it simply using the formula
header.value + ":" +regexp.value;
It might even be able to do some sanity checking at this stage. then it will build the expression for you and add (or insert) it into the condition field.
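As a rough sketch of what such a dialog could do on OK: concatenate the two fields with the formula above and run a basic sanity check first. `buildHeaderCondition` is a hypothetical helper written for illustration, not existing FiltaQuilla code, and the check shown (does the pattern compile?) is only one possible form of sanity checking.

```javascript
// Hypothetical helper: builds "header:regular expression" from two dialog fields.
function buildHeaderCondition(headerName, pattern) {
  const header = headerName.trim();
  if (header === "" || header.includes(":")) {
    throw new Error("header name must be non-empty and must not contain a colon");
  }
  // Sanity check: the pattern must compile, either as /source/flags or as a bare source.
  const slashForm = /^\/(.*)\/([a-z]*)$/s.exec(pattern);
  if (slashForm) {
    new RegExp(slashForm[1], slashForm[2]); // throws SyntaxError if invalid
  } else {
    new RegExp(pattern); // throws SyntaxError if invalid
  }
  // Same formula as header.value + ":" + regexp.value
  return header + ":" + pattern;
}

console.log(buildHeaderCondition("from", "/john/i"));   // from:/john/i
console.log(buildHeaderCondition("subject", "findme")); // subject:findme
```

An invalid pattern such as "/(/" makes the helper throw, so the dialog could show the error instead of writing a broken condition.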
https://quickfilters.quickfolders.org/filtaquilla.html#header_regex
This for example: is a language i don't expect nor understand nothing about:
"x-spam-status:SUBJ_ALL_CAPS
...
so i'd rather type: x-spam-status:/SUBJ_ALL_CAPS/gmi
Yes, Kent should have used a placeholder for a regular expression. He does some extra work by adding / whatnot if you do not add it - so the string will be interpreted as regular expression even if you type it "Normally".
Looks like you want to propose specific string names (codes) for conditions, like: SUBJ_ALL_CAPS
I hope my regular expressions will never interact with those codes.. cause a regex can even contain this: _ So maybe you should tell the addon that if my expression ends with; /1TO5 letters only, then it'll never be one of your codes, but a pure javascript regular expression.
"Looks like you want to propose specific string names (codes) for conditions, like: SUBJ_ALL_CAPS"
Nope - this is not a code. It's a string literal and has no specific meaning. It could also be "house" or "red.*dog" or "[quickFilters] - <type subject here>". I think Kent just used a bad example that looks like it means something; it's literally just a string containing a regular expression. SUBJ_ALL_CAPS may however be a word that has a meaning in the context of being an x-spam-status - obviously filtaquilla doesn't verify that and doesn't care.
"I hope my regular expressions will never interact with those codes.. cause a regex can even contain this: _"
Yes. I double checked the filtaquilla code, there is no "SUBJ_ALL_CAPS" keyword. The only thing that is determined would be the header name (the part before the colon, e.g. "subject:", "from:", "x-spam-status:" etc.). As these are Email headers, they can be any string according to email header standards.
If your regular expression has no flags, it can be entered without slashes, for example “findme”. But if you want to add a flag (for example case insensitivity), then surround the term with slashes and append the appropriate flags (for example “/findME/i”).
So again, if you want to add flags, definitely start your reg expression with / after the colon. Example:
subject:/Customer\s-+\s(\w+[\s\w]+)\s/i
My match text variables on my other Add-on SmartTemplates support a group argument, that's why my own examples above may be a little too complex (containing match groups) but I just wanted to give some real live examples that I use there for my customer service templates.
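To make the slash-and-flags rule concrete, here is a minimal sketch of how a "header:regular expression" value could be split and compiled. FiltaQuilla's own parsing (the `_getRegEx` helper) is not shown in this thread, so `parseHeaderCondition` is a hypothetical re-implementation of the rule as described, and the sample subject is made up.

```javascript
// Hypothetical parser for "header:pattern" or "header:/pattern/flags".
function parseHeaderCondition(term) {
  const colon = term.indexOf(":");
  if (colon < 1) {
    throw new Error('expected "header:regular expression"');
  }
  const header = term.slice(0, colon).trim();
  const rest = term.slice(colon + 1);
  // With slashes: /source/flags. Without slashes: the whole text is the source, no flags.
  const slashForm = /^\/(.*)\/([a-z]*)$/s.exec(rest);
  const source = slashForm ? slashForm[1] : rest;
  const flags = slashForm ? slashForm[2] : "";
  return { header, regex: new RegExp(source, flags) };
}

const plain = parseHeaderCondition("x-spam-status:SUBJ_ALL_CAPS");
console.log(plain.header, plain.regex.flags === "");

const flagged = parseHeaderCondition("subject:/Customer\\s-+\\s(\\w+[\\s\\w]+)\\s/i");
console.log(flagged.header, flagged.regex.flags);
console.log(flagged.regex.test("customer - John Smith x"));
```

Only the first colon is treated as the separator, so a pattern that itself contains a colon stays intact.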
If I make an argument dialog (invoked by the (+) button) for entering regular expressions, we could also add a test field, would that be helpful? You could have the separate entry fields for
- header name
- regexp
- test text - so you could paste your subject / from / whatever other field from a real-life Email and click a [match] button to test if the regexp was found
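The [match] test could work along these lines. This is a sketch under assumptions: `testExpression` and the shape of its result are hypothetical, and the sample text is made up.

```javascript
// Hypothetical backing function for a [match] button in the argument dialog.
function testExpression(source, flags, sampleText) {
  let regex;
  try {
    regex = new RegExp(source, flags);
  } catch (e) {
    // Invalid pattern or flags: report it instead of throwing.
    return { valid: false, matched: false, match: null, error: e.message };
  }
  const m = regex.exec(sampleText);
  return { valid: true, matched: m !== null, match: m ? m[0] : null, error: null };
}

const ok = testExpression("john", "i", "To: John Doe"); // made-up sample text
console.log(ok.matched, ok.match); // true John

const broken = testExpression("(", "", "anything");
console.log(broken.valid); // false
```

Returning the matched text as well as a yes/no answer would let the dialog show which part of the pasted header was hit.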
great :)
It's taking so long ((
Would love to see this. I can't get header regex to work at all. Neither to:john nor to:/john/ig match messages with [email protected] in the to: field, so it would be very nice to have a clearer syntax or something that worked.
EDIT: Wait, belay that. recipients:john works, which is weird, because there is no 'recipients' header. Yeah, a dropdown menu with the allowable headers would be much clearer.
I don't know whether this is the right solution, but I've managed to get the body (at least for stored messages; I haven't tested it completely yet). There is still a problem with propagating the result to the main match function.
self.bodyRegex =
{
  id: "[email protected]#bodyRegex",
  name: self.strings.GetStringFromName("filtaquilla.bodyregex.name"),
  getEnabled: function bodyRegEx_getEnabled(scope, op) {
    return _isLocalSearch(scope);
  },
  needsBody: true,
  getAvailable: function bodyRegEx_getAvailable(scope, op) {
    return _isLocalSearch(scope) && BodyRegexEnabled;
  },
  getAvailableOperators: function bodyRegEx_getAvailableOperators(scope) {
    if (!_isLocalSearch(scope))
    {
      return [];
    }
    return [Matches, DoesntMatch];
  },
  match: function bodyRegEx_match(aMsgHdr, aSearchValue, aSearchOp) {
    //(aMsgHdrs, aActionValue, aListener, aType, aMsgWindow) {
    //console.log("aMsgHdrs:", aMsgHdrs);
    //console.log("messageId: ", aMsgHdrs.messageId);
    //console.log("aActionValue: ", aActionValue);
    //console.log("aListener: ", aListener);
    //console.log("aType: ", aType);
    //console.log("aMsgWindow: ", aMsgWindow);
    var mimeConvert = Cc["@mozilla.org/messenger/mimeconverter;1"].getService(Ci.nsIMimeConverter),
        decodedMessageId = mimeConvert.decodeMimeHeader(aMsgHdr.messageId, null, false, true);
    console.log("decoded: ", decodedMessageId);
    // msgAdded: function(aMsgHdr) {
    // if( !aMsgHdr.isRead ){
    //Get folder in case it's not a plaintext
    //@see https://stackoverflow.com/questions/27265271/how-to-intercept-incoming-email-and-retrieve-message-body-in-thunderbird
    let folder = aMsgHdr.folder;
    let msgBody;
    let result = MsgHdrToMimeMessage(aMsgHdr, null, function (_aMsgHdr, aMimeMessage) {
      // do something with aMimeMessage:
      //alert("the message body : " + aMimeMessage.coerceBodyToPlaintext(folder));
      msgBody = aMimeMessage.coerceBodyToPlaintext(folder);
      //alert(aMimeMessage.allUserAttachments.length);
      //alert(aMimeMessage.size);
      let searchValue, searchFlags;
      [searchValue, searchFlags] = _getRegEx(aSearchValue);
      let r = RegExp(searchValue, searchFlags).test(msgBody);
      console.log("body matches: ", r);
      return r;
    }, true);
    // }
    //}
    console.log("result: ", result);
    console.log("body: ", msgBody);
    let searchValue, searchFlags;
    [searchValue, searchFlags] = _getRegEx(aSearchValue);
    console.log("body matches: ", RegExp(searchValue, searchFlags).test(msgBody));
    switch (aSearchOp)
    {
      case Matches:
        //return RegExp(searchValue, searchFlags).test(subject);
      case DoesntMatch:
        //return !RegExp(searchValue, searchFlags).test(subject);
    }
    return false;//not implemented yet
  }
};
I don't have enough knowledge to propagate the "r" result into the parent match function so that it returns the actual result. The body is not available there either. I suppose a synchronous function should be used to wait for the result, so that it becomes available there. Is that right, or am I missing something?
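The propagation problem can be reproduced outside Thunderbird with a small sketch. `fakeLoadBody` is a hypothetical stand-in that only imitates a loader calling back later; it is not the real `MsgHdrToMimeMessage`, and the body text is made up. The sketch shows why the callback's return value never reaches the caller of the outer function. It does not settle how a filter term's match function could wait for the body.

```javascript
// Hypothetical stand-in for an asynchronous body loader: calls back on a later tick.
function fakeLoadBody(callback) {
  setTimeout(() => callback("Hello, please FINDme in this body"), 0);
}

// Callback style, as in the snippet above: the outer function returns first.
function matchWithCallback(pattern, flags) {
  let result = false;
  fakeLoadBody(function (body) {
    result = new RegExp(pattern, flags).test(body);
    return result; // goes back to fakeLoadBody, not to matchWithCallback's caller
  });
  return result; // runs before the callback has fired
}

// Promise style: the caller receives the result, but only asynchronously.
function matchWithPromise(pattern, flags) {
  return new Promise((resolve) => {
    fakeLoadBody((body) => resolve(new RegExp(pattern, flags).test(body)));
  });
}

const syncAnswer = matchWithCallback("findme", "i");
console.log("synchronous return:", syncAnswer); // false

matchWithPromise("findme", "i").then((r) => console.log("promise result:", r)); // true
```

A promise only helps if the caller is able to wait for it; a caller that needs the answer immediately still has to have the body available at the time match runs.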
Did you add the search term to the filterService?
filterService.addCustomTerm(self.bodyRegex);
If you need help with options integration I can do it, but it would help if you could create a PR or, if you can't do that, attach a version of your full source code here (not just the one function).
@RealRaven2000 it's not ready, however it doesn't fail, too, so i have created a PR #108.