GT7a: Design a description language for ad-hoc search engine filters
I wanted to open an issue for this. I want it to be possible for Marginalia to index only specific documents. For instance, I want an instance that only covers Linux.
Various ideas come to mind. First, you shouldn't invent a new language, IMO. You should use an existing one that people can extend. It should also be fast enough and easy to embed.
I have two ideas. The first is Lua. The language itself is simple, and it was designed to be embedded. A variety of other languages also compile to Lua, so if you want some specific language, you can just compile it to Lua. Lua can also load external modules. Needless to say, Lua is also very fast despite being bytecode-interpreted.
The other option is to use an existing Lisp. It will be harder for people to learn, but for people who are comfortable with it, Lisps have the advantage of very powerful metaprogramming.
Note that there are some Lisps which compile to Lua such as Fennel.
The language basically needs to be declarative; anything interpreted is simply too slow and also a security issue. So we're realistically looking at something like YAML or XML.
Those are not expressive enough. Lua is bytecode-interpreted and, as far as I know, very fast; the VM does the work. As for Lisps, some compile to bytecode, some to native code, and some are interpreted. For Common Lisp, implementations of all three kinds exist, and Common Lisp code can be almost as fast as C/C++ with a native-code compiler like SBCL, at least on x86_64. That said, CL wasn't designed to be embedded.
One option would be to go with Lua and let people load their own modules through it whenever they need something.
The problem with Lua is that it's only as fast as the logic you implement with it. Due to the halting problem, we can't even tell in general whether a given script will ever terminate. That basically rules out non-declarative languages entirely for this sort of application.
Yeah, declarative languages are less capable, but they're also safer.
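To make the termination argument concrete, here is a minimal sketch of what a declarative filter could look like once parsed, assuming a made-up rule format. `DeclarativeFilter`, `Rule` and `Document` are hypothetical names for illustration, not Marginalia's actual classes. The point is only that evaluation is a fixed loop over rules, so its cost is bounded by the number of rules times the document size.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical declarative filter: a fixed list of rules, no user-supplied code.
class DeclarativeFilter {
    // One rule: the named field must contain the keyword (case-insensitive).
    public record Rule(String field, String keyword) {}

    // Minimal stand-in for an indexed document; not Marginalia's real model.
    public record Document(String title, String body) {}

    private final List<Rule> anyOf;

    public DeclarativeFilter(List<Rule> anyOf) {
        this.anyOf = List.copyOf(anyOf);
    }

    // Accepts the document if any rule matches. The loop runs once per rule,
    // so evaluation always terminates, unlike an arbitrary script.
    public boolean accepts(Document doc) {
        for (Rule rule : anyOf) {
            String text = switch (rule.field()) {
                case "title" -> doc.title();
                case "body" -> doc.body();
                default -> ""; // unknown fields never match a non-empty keyword
            };
            String haystack = text.toLowerCase(Locale.ROOT);
            if (haystack.contains(rule.keyword().toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        DeclarativeFilter linuxOnly = new DeclarativeFilter(List.of(
                new Rule("title", "linux"),
                new Rule("body", "kernel")));
        System.out.println(linuxOnly.accepts(new Document("Linux tips", "")));
        System.out.println(linuxOnly.accepts(new Document("Cooking", "pasta")));
    }
}
```

The rule list itself could be read from YAML or XML; the parsing step is left out here because the file format is exactly what is still undecided in this thread.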
Isn't checking how fast it is, or should be, up to the instance admin?
Well possibly, though I had in mind just letting people define their own filters completely ad-hoc.
If someone is operating a search engine, they can just patch the ranking logic. I'm not sure I'm a big fan of having a DSL for that. Maybe a plugin system, but I'm not sure how big the demand is for such a thing.
Oh yes, of course. I'm surprised I never mentioned this. In general, I like to make things as modular as possible. That is, you have a small core and you can add whatever plugins you want to it to make it work best for your use case. For the same reason, I am a fan of Emacs and (Neo)vim rather than modern text editors or IDEs for programming :)
At any rate, you don't really need Lua with Java; embedding it is mostly a workaround for languages that are AOT-compiled.
In Java you can just load code in any JVM language programmatically at runtime with zero overhead; this includes e.g. Jython.
Might see about implementing the declarative ranking configuration system as one of potentially multiple plugins ;-)
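As a rough illustration of the runtime-loading idea, here is a sketch that loads a filter implementation by class name using plain reflection. `PluginLoaderSketch`, `DocumentFilter` and `LinuxOnlyFilter` are hypothetical names made up for this example, not part of Marginalia. A real plugin would more likely ship in its own jar and be discovered through something like `java.util.ServiceLoader`; that part is assumed away here to keep the example self-contained.

```java
import java.util.Locale;
import java.util.function.Predicate;

class PluginLoaderSketch {
    // Hypothetical plugin contract: decide whether a document's text gets indexed.
    public interface DocumentFilter extends Predicate<String> {}

    // Example plugin; in practice this would live in a separately shipped jar.
    public static class LinuxOnlyFilter implements DocumentFilter {
        public LinuxOnlyFilter() {}

        @Override
        public boolean test(String text) {
            return text.toLowerCase(Locale.ROOT).contains("linux");
        }
    }

    // Look the class up by name at runtime and instantiate it reflectively.
    // Throws if the class is missing or does not implement DocumentFilter.
    static DocumentFilter load(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        return cls.asSubclass(DocumentFilter.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }

    public static void main(String[] args) throws Exception {
        // The class name would normally come from configuration.
        DocumentFilter filter = load(LinuxOnlyFilter.class.getName());
        System.out.println(filter.test("Installing Arch Linux"));
        System.out.println(filter.test("Sourdough recipes"));
    }
}
```

Note that code loaded this way runs with the same privileges as the host process, so this fits the case where the instance admin installs plugins, not the case of arbitrary user-submitted filters.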
Oh, I really didn't know that. It might be possible to compile Lua itself to JVM bytecode.