chemcore

Any TODOs?

Open bddap opened this issue 4 years ago • 2 comments

Hey there :wave:

I think this project is neat and I have some free time. Anything you would like added? Substructure search? Fingerprints?

bddap avatar Jun 11 '21 20:06 bddap

Thanks! As you can see it's early days. Fingerprints and substructure search would both be useful.

I was also thinking that it might be useful to get some basic tooling around molfile and SDfile processing in place. I was considering a project along the lines of Purr, but for the ctfile formats. Then bring that in as a dependency for ChemCore with a small package for translations to/from ChemCore Molecules.

Those are bigger projects, but there are also some smaller, useful ones I can think of. Molecular formula, average molecular mass and exact molecular mass come to mind.
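
To give a rough idea of the scope of those smaller pieces, here is a minimal sketch in plain Rust. It does not use ChemCore's API: the function names (`atomic_masses`, `hill_formula`, `total_masses`) are hypothetical, the input is assumed to be a map of element counts already extracted from a molecule, and the element table is deliberately limited to H, C, N, and O with rounded masses.

```rust
use std::collections::BTreeMap;

/// Returns (average atomic mass, monoisotopic mass) for a few elements.
/// Hypothetical helper: a real table would cover the whole periodic table.
fn atomic_masses(symbol: &str) -> Option<(f64, f64)> {
    match symbol {
        "H" => Some((1.008, 1.007825)),
        "C" => Some((12.011, 12.0)),
        "N" => Some((14.007, 14.003074)),
        "O" => Some((15.999, 15.994915)),
        _ => None,
    }
}

/// Writes a molecular formula in Hill order: C first, then H, then the
/// remaining elements alphabetically. Without carbon, every element
/// (including H) is listed alphabetically.
fn hill_formula(counts: &BTreeMap<&str, u32>) -> String {
    let mut out = String::new();
    let mut push = |symbol: &str, n: u32| {
        if n == 0 {
            return;
        }
        out.push_str(symbol);
        if n > 1 {
            out.push_str(&n.to_string());
        }
    };
    let count = |symbol: &str| counts.get(symbol).copied().unwrap_or(0);
    let has_carbon = count("C") > 0;

    if has_carbon {
        push("C", count("C"));
        push("H", count("H"));
    }
    // BTreeMap iterates in sorted key order, which gives the alphabetical part.
    for (&symbol, &n) in counts {
        if has_carbon && (symbol == "C" || symbol == "H") {
            continue;
        }
        push(symbol, n);
    }
    out
}

/// Sums (average mass, exact mass) over all atoms.
/// Returns None if an element is missing from the table.
fn total_masses(counts: &BTreeMap<&str, u32>) -> Option<(f64, f64)> {
    let mut average = 0.0;
    let mut exact = 0.0;
    for (&symbol, &n) in counts {
        let (avg, mono) = atomic_masses(symbol)?;
        average += avg * f64::from(n);
        exact += mono * f64::from(n);
    }
    Some((average, exact))
}
```

Passing the counts for ethanol (2 C, 6 H, 1 O) to `hill_formula` yields `C2H6O`. In ChemCore itself the counts would presumably come from walking a Molecule's atoms, including implicit hydrogens, which this sketch leaves out.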

I'm also pretty interested at the moment in trying to put a Postgres cartridge together. It seems like that should be possible given projects such as Rust-Postgres. It could start small with, say, just getting a formula from a SMILES. Should fingerprints and substructure search become available later, then it could be integrated.

I don't use Python much, but it might be useful to have some good Python bindings. There's some early-stage work here:

Zooming out, I'd like to leverage Cargo as much as possible. I think ChemCore should focus on molecular representation and broadly applicable, fundamental algorithms (e.g., substructure search, maximum common substructure, fragmentation, etc.). I/O seems like something that ought to be integrated, given that it's always used and requires tight coupling to the representation. Other things like fingerprints are very useful, but maybe not universally so. I'm thinking that fingerprints and similar functionality might be better as their own independent crates that users can pull into their own projects as needed.

You can see this approach at work with both Purr and Gamma, both of which focus pretty narrowly on a specific area and which ChemCore uses as dependencies. I think the same approach could be used with ChemCore.

So I'm thinking less monolith (e.g., CDK and RDKit) and more core cheminformatics functionality for ChemCore. But I'm open to other ideas.

rapodaca avatar Jun 11 '21 23:06 rapodaca