Discussion: Borrow structure from the mendeleev Python package

Open Gregstrq opened this issue 2 years ago • 6 comments

There are a couple of open issues about adding more information to the package. They have made no progress, in part due to the laziness of the authors of these issues (myself included).

On the other hand, it seems to me that the internal organization of the data in the package might also play a part (of course, this is totally subjective). Let's say you want to add information to the existing element type. First, you have to create a new type with the additional fields. Then you have to go over every line where an element is defined and fill in the new fields. That sounds a bit complicated. When you start thinking about adding isotopic information, it gets even more complicated.

The counterpart of PeriodicTable.jl in the Python community is mendeleev. Interestingly, it stores all the data as an SQLite database with several interconnected tables. Maybe we can take inspiration from that?

In Julia's case, I think we might not need a whole database; instead, we can store the data in a dataframe (or several linked dataframes). Overall, that should make working with the information rather convenient. For example, one can add a field to the elements by appending a column with the relevant data. One can also use the dataframe utilities to run all kinds of queries on the data.

This approach also lends itself nicely to introducing isotopic information: we can create an additional dataframe that stores the isotope-dependent data, and the proton number can serve as a foreign key into the main dataframe that stores the elements.
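To make the linked-tables idea concrete, here is a minimal sketch written in Python with the standard-library `sqlite3` module, only because that is what mendeleev builds on. The schema, table names, and column names are made up for illustration and are not mendeleev's actual layout; the helper names `build_db` and `isotopes_of` are hypothetical as well. The same shape would carry over to two dataframes joined on the proton number.

```python
import sqlite3


def build_db():
    """Create an in-memory database with two linked tables.

    The schema is a made-up illustration, not mendeleev's actual layout.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE elements ("
        " atomic_number INTEGER PRIMARY KEY,"
        " symbol TEXT NOT NULL UNIQUE,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE isotopes ("
        " atomic_number INTEGER NOT NULL REFERENCES elements(atomic_number),"
        " mass_number INTEGER NOT NULL,"
        " PRIMARY KEY (atomic_number, mass_number))"
    )
    conn.executemany(
        "INSERT INTO elements VALUES (?, ?, ?)",
        [(1, "H", "Hydrogen"), (2, "He", "Helium")],
    )
    conn.executemany(
        "INSERT INTO isotopes VALUES (?, ?)",
        [(1, 1), (1, 2), (1, 3), (2, 3), (2, 4)],
    )
    # A new per-element field is one new column, not a new type.
    conn.execute("ALTER TABLE elements ADD COLUMN period INTEGER")
    conn.execute("UPDATE elements SET period = 1")  # H and He are both in period 1
    conn.commit()
    return conn


def isotopes_of(conn, symbol):
    """Return the mass numbers recorded for the element with this symbol."""
    rows = conn.execute(
        "SELECT i.mass_number FROM isotopes AS i"
        " JOIN elements AS e ON e.atomic_number = i.atomic_number"
        " WHERE e.symbol = ? ORDER BY i.mass_number",
        (symbol,),
    ).fetchall()
    return [mass for (mass,) in rows]


conn = build_db()
print(isotopes_of(conn, "H"))
```

With the foreign key in place, an isotope row that points at an unknown proton number is rejected, which is the kind of consistency check that is hard to get when every element is written out by hand.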

We can also simply copy the database from mendeleev, provided that its maintainers are ok with that.

Gregstrq, Dec 10 '21 19:12

A first comment: a potential disadvantage is that the package wouldn't be as lightweight anymore. Generally speaking, I would recommend adding dependencies to a package only if they are necessary, not just convenient. In that spirit, if we decided to switch to a database backend, we should look for something lightweight and fast (our information is static).

carstenbauer, Dec 11 '21 09:12

Also note that Python has https://github.com/pkienzle/periodictable, which seems to store the information in regular dictionaries (no databases involved).

carstenbauer, Dec 11 '21 09:12

> In that spirit, if we decided to switch to a database backend, we should look for something lightweight and fast (our information is static).

I do not insist on doing everything with a database. We can use dataframes and simply serialize them to a file for storage.
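As a rough sketch of what "a table serialized to a file" could look like without any database dependency, here is a column-oriented table kept in a plain dictionary and round-tripped through JSON, using only the Python standard library. The column names and the helpers `save_table`, `load_table`, and `add_column` are hypothetical; a dataframe library would provide its own equivalents.

```python
import json
import os
import tempfile

# Column-oriented table: each key is a column, each list position is one element.
# The column names are illustrative, not the package's actual fields.
elements = {
    "number": [1, 2, 3],
    "symbol": ["H", "He", "Li"],
    "name": ["Hydrogen", "Helium", "Lithium"],
}


def add_column(table, name, values):
    """Return a new table with one extra column; lengths must match."""
    n_rows = len(next(iter(table.values())))
    if len(values) != n_rows:
        raise ValueError(f"expected {n_rows} values, got {len(values)}")
    return {**table, name: list(values)}


def save_table(table, path):
    """Write the table to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(table, f)


def load_table(path):
    """Read a table back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Add a field for every element in one step, then round-trip through a file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "elements.json")
    save_table(add_column(elements, "period", [1, 1, 2]), path)
    restored = load_table(path)

print(restored["period"])
```

Since the data is static, the file only has to be read once at load time, so the runtime cost of this layout should stay small.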

> Also note that Python has https://github.com/pkienzle/periodictable, which seems to store the information in regular dictionaries (no databases involved).

Of course, we can have the dataframe (database) backend as a separate package. We can even make it return the Element type, so that both packages share a uniform interface. But is it convenient to have two separate packages?
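One way to picture a table backend that still hands out a fixed element type is sketched below in Python. Everything in it is an assumption made for illustration: the `Element` dataclass only stands in for the existing Element type, and `ROWS` and `get_element` are hypothetical names, not anything either package provides.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Element:
    """Small fixed record, standing in for the existing Element type."""
    number: int
    symbol: str
    name: str


# Hypothetical table backend: one dict per row, free to carry extra columns.
ROWS = [
    {"number": 1, "symbol": "H", "name": "Hydrogen", "period": 1},
    {"number": 2, "symbol": "He", "name": "Helium", "period": 1},
]


def get_element(number, rows=ROWS):
    """Build an Element from the row with this atomic number."""
    wanted = [f.name for f in fields(Element)]
    for row in rows:
        if row["number"] == number:
            # Columns the Element type does not know about are ignored.
            return Element(**{key: row[key] for key in wanted})
    raise KeyError(number)


print(get_element(2))
```

The table can then grow new columns without touching the element type, while code written against the old interface keeps working.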

Gregstrq, Dec 11 '21 15:12

Mendeleev provides far more data than PeriodicTable, which, on the other hand, is lightweight and fast. Thus there may be room for two separate packages.

I have started a package named Mendeleev.jl. In its current state, it exports an array of Element_M structs, similar to PeriodicTable. Element_M has 70 fields, corresponding to the data in the main (only) table of the source database (isotopes are a separate table).

The plan is to convert the data to unitful / measurements values where applicable, add the data from PeriodicTable, add aliases for fields that have the same meaning but different names (e.g. number in PeriodicTable vs. atomic_number in python-Mendeleev), and add PeriodicTable-style indexing, so that Mendeleev.jl can later be used as a direct replacement for PeriodicTable.
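The alias and indexing ideas can be sketched as follows, in Python for illustration only. `ElementM` here is a three-field stand-in, not the real 70-field struct, and `ElementTable` with lookup by atomic number or symbol is an assumption about what "indexing" might cover, since the thread does not spell it out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementM:
    """Three-field stand-in for the much larger Element_M struct."""
    atomic_number: int
    symbol: str
    name: str

    @property
    def number(self):
        """Alias: PeriodicTable's `number` means the same as `atomic_number`."""
        return self.atomic_number


class ElementTable:
    """Hypothetical container that can be indexed by atomic number or symbol."""

    def __init__(self, elements):
        self._by_number = {e.atomic_number: e for e in elements}
        self._by_symbol = {e.symbol: e for e in elements}

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._by_number[key]
        return self._by_symbol[key]


elements = ElementTable([
    ElementM(1, "H", "Hydrogen"),
    ElementM(2, "He", "Helium"),
])

print(elements[1].number, elements["He"].atomic_number)
```

Because the alias is a read-only property over the same stored value, the two names cannot drift apart.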

P.S. Py-Mendeleev is under the MIT license, so no additional permission to reuse the data is necessary.

Eben60, Jun 26 '22 21:06

Everyone involved in these efforts, please consider joining forces. These are great initiatives, and it would be nice to see a single package doing it all instead of dividing the community of users in the long term 🙏🏽

juliohm, Nov 05 '22 12:11

See the discussion on Discourse.

My preferred solution would be to get Mendeleev.jl to v1.0.0, then transfer ownership to this or some other Julia organization (JuliaMolSim ??), while staying on as a maintainer myself. And then maybe officially retire PeriodicTable.

Eben60, Nov 05 '22 20:11