Wildcards in .cin files? (Question, not a bug)
I see in the source that wildcards are handled by a simple regex with "?" and "*". There is also "cj-wildcard.cin", which notes wildcard support. Is that support within the .cin file itself, or is it some other mechanism?
Reviewing the paper "OpenVanilla – A Non-Intrusive Plug-In Framework of Text Services" suggests it is much more than just an input method, and it gives "ehq-symbols.cin" as an example. That is a great example, by the way, and it is the use I am looking at. However, is there a way for non-exact matches to be displayed? For example, if I have
... foobar foobar ...
that would only display it as a candidate with an exact match of "foobar". Is there a way to display candidates that partially match? For example "foobar" would potentially be in the candidate list for anything from "f" through "foobar" instead of just "foobar", as opposed to having to use
...
f foobar
fo foobar
foo foobar
foob foobar
fooba foobar
foobar foobar
...
Is that the current solution for this? If so, how well does OpenVanilla perform with very large .cin files?
Thanks much!
Interesting. TestOVWildcard.cpp shows wildcard usage, and the wildcards go in what you type, not in the .cin files. So while typing I can use f*r and it will match foobar. If I can get into Xcode sometime, I'll see if I can hack in an implicit * option so that f would match foobar without having to type f*. That would match what I have in mind, like mobile devices that show full candidates regardless of how much has been typed. Thanks much for this though. Either way this gets me closer.
Thank you, thank you!
Hi,
The paper is old, although the idea was already there (and I'm glad that it's being read!).
The place you are looking for is Modules/TableBased/OVIMTableBasedContext.cpp, in particular OVIMTableBasedContext::compose(). It's definitely doable to change the query from f to f* if there's no other wildcard. Better yet, this could be made into an option.
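To make the suggested change concrete, here is a minimal sketch of the query rewrite on its own. The function name withImplicitWildcard and the enabled flag are hypothetical and do not come from the OpenVanilla sources; the real change would live inside OVIMTableBasedContext::compose() and would read the option from the module's configuration.

```cpp
#include <string>

// Hypothetical helper, not part of OpenVanilla: turn a typed query such as
// "f" into "f*" so that it also matches longer keys such as "foobar".
// The query is left alone when the option is off, when it is empty, or when
// the user already typed a wildcard ('?' or '*').
std::string withImplicitWildcard(const std::string& query, bool enabled) {
    if (!enabled || query.empty()) {
        return query;
    }
    if (query.find_first_of("?*") != std::string::npos) {
        return query;  // respect the wildcard the user typed explicitly
    }
    return query + "*";
}
```

With the option on, f becomes f*, while f*r and fo? pass through unchanged.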
As for performance, there are users whose customized tables easily contain 100k-200k entries. The biggest built-in table in the OpenVanilla repository has 97k entries. Since the table is read entirely into memory, search is fast. Wildcard search starts from the first entry that matches the longest literal prefix, so a search like f* does not degrade into a linear search from the beginning.
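The prefix-start idea can be sketched as follows. This is an illustration only and makes assumptions: the table is modelled as a std::vector of (key, value) pairs sorted by key, and the names Entry, wildcardMatch and search are hypothetical. OpenVanilla's actual data structures may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, std::string>;  // (key, value)

// Glob-style match: '?' matches exactly one character, '*' matches any run.
bool wildcardMatch(const std::string& pattern, const std::string& text) {
    std::size_t p = 0, t = 0;
    std::size_t starPos = std::string::npos, resume = 0;
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starPos = p++;   // remember the star, first try matching nothing
            resume = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (starPos != std::string::npos) {
            p = starPos + 1; // backtrack: let the last star absorb one more char
            t = ++resume;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}

// Assumes sortedTable is sorted by key. Jumps to the first key that is not
// less than the literal prefix, then scans only keys sharing that prefix.
std::vector<std::string> search(const std::vector<Entry>& sortedTable,
                                const std::string& pattern) {
    const std::string prefix = pattern.substr(0, pattern.find_first_of("?*"));
    auto it = std::lower_bound(
        sortedTable.begin(), sortedTable.end(), prefix,
        [](const Entry& e, const std::string& key) { return e.first < key; });
    std::vector<std::string> values;
    for (; it != sortedTable.end() &&
           it->first.compare(0, prefix.size(), prefix) == 0; ++it) {
        if (wildcardMatch(pattern, it->first)) values.push_back(it->second);
    }
    return values;
}
```

The binary search costs O(log n), and the scan touches only the entries under the literal prefix, so the work is proportional to the number of keys sharing that prefix rather than to the table size.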
That said, wildcard search performance depends on selectivity. Suppose your keys are evenly distributed over 20 one-character prefixes (e.g. a-t). For a table of 200k entries, we are then talking about copying 10k entries to the candidate list each time, which is still blazingly fast on a modern PC. The copying is an unfortunate consequence of the simple early design (the candidate panel does not use an iterator-based data source), so for a really large table (say 1m entries) with very low selectivity, the copying cost will be high. Still, I doubt the usability of an input method that presents 100k choices at every keystroke :). Perhaps such an input method should just cut off after the first n candidates. Happy hacking!
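A cutoff of that kind could look roughly like the sketch below. It assumes the keys are held in a sorted std::vector, and collectCapped and limit are hypothetical names rather than anything in OpenVanilla. The point is only that the loop stops as soon as the limit is reached, so the copying cost is bounded by the limit instead of by the number of matches.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collect at most `limit` keys that start with `prefix`.
// Assumes sortedKeys is sorted in ascending order.
std::vector<std::string> collectCapped(const std::vector<std::string>& sortedKeys,
                                       const std::string& prefix,
                                       std::size_t limit) {
    std::vector<std::string> out;
    auto it = std::lower_bound(sortedKeys.begin(), sortedKeys.end(), prefix);
    for (; it != sortedKeys.end() && out.size() < limit; ++it) {
        if (it->compare(0, prefix.size(), prefix) != 0) {
            break;  // left the block of keys that share the prefix
        }
        out.push_back(*it);
    }
    return out;
}
```

With the numbers above (200k entries over 20 prefixes), a limit of, say, 50 would copy 50 strings per keystroke instead of 10k.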
An option would be good. I see how it comes together in compose, but it'll take me a bit to get the environment set up and get the syntax details right. C++ is a different beast than what I normally work with.
Limiting the candidate list would be a good option as well, as the idea is to try some experiments with regular English to bring IDE-like completion (fixed lists) to general typing. These would just be experiments, and the lists would be large. Nice! I just realized that with caps lock on I can type normally. That is really helpful, even towards this goal.
Thanks much for the information! I'll likely do hard coded changes once I get some time to get this going.
Also, just to confirm: OpenVanilla doesn't do prediction or learning, correct? All changes need to go into the .cin file and be reloaded? I have found that I have to remove and re-add the table, though: re-adding it without removing it first attempts to overwrite, but I don't see the changes take effect.
No, the table-based input method doesn't do predictive input, but it is possible to write a new module for OV.
There is a bug when you try to add the same table again via preferences – but feel free to file one so that we have a record.
Here's a trick: if you edit the table in ~/Library/Application Support/OpenVanilla/UserData/TableBased (say foo.cin) and save it, the change will take effect immediately. Underneath, OV uses the file's timestamp to track whether a .cin needs reloading. So a possible workflow is to edit your .cin file there and copy it to your work directory afterwards.
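For the curious, timestamp-based reload tracking in general can be sketched like this. It is not OV's actual code: ReloadTracker and needsReload are hypothetical names, and the modification time is passed in as a plain std::time_t so the logic stays independent of how it is obtained (for example from stat() or std::filesystem::last_write_time).

```cpp
#include <ctime>
#include <map>
#include <string>

// Hypothetical sketch: remember the last modification time seen per file
// and report whether the file has to be loaded again.
class ReloadTracker {
public:
    // Returns true when the path has not been seen before or when its
    // modification time differs from the one recorded on the last call.
    bool needsReload(const std::string& path, std::time_t modifiedAt) {
        auto it = lastSeen_.find(path);
        if (it != lastSeen_.end() && it->second == modifiedAt) {
            return false;  // unchanged since the last check
        }
        lastSeen_[path] = modifiedAt;
        return true;
    }

private:
    std::map<std::string, std::time_t> lastSeen_;
};
```

Saving the file in an editor changes its modification time, so the next check reports that a reload is needed without any explicit reload command.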
I did see those but hadn't dug in enough to see how it would handle them. Thanks much for the explanation. And I saw in the code why caps lock and shift together work the way they do. Very nice. The project is configured to build for 10.7, but I am still on 10.6.8, so I'll have to wait before I can make changes. Making it compile on 10.6 looks like a bit of work that I am not familiar with. I'll just try to work with the .cin files in the app support location, as you noted.
Thanks!