Chado icon indicating copy to clipboard operation
Chado copied to clipboard

Changes to Chado Installation

Open spficklin opened this issue 7 years ago • 3 comments

This issue imported from the Chado v1.4 requested changes google doc:

Changes to Chado Installation that Karl O. Pinc suggested

http://gmod.827538.n3.nabble.com/RFC-Chado-relase-v1-3-tp4048655p4048701.html

  • Do not delete tables before installing. (If the table already exists the transaction should roll back.)
  • Do not be so chatty, display only warnings and errors, not informational messages.
  • I would like to have it be possible to install Chado into it's own schema. The first step for this is to get rid of the multiple schemas that Chado currently uses.
  • I would like to see Chado be able to be installed modularly. At present this is rather-to-very difficult. (At least it's difficult if you want only Chado. I don't know the process if Tripal is involved.) See email link for ideas on how to do this.
  • In a follow up email, Siddhartha Basu suggested releasing just the DDL without the software.

spficklin avatar Jan 10 '18 18:01 spficklin

I believe it is already possible to install chado in it's own schema, though it is probably clunky. I worry that moving the stuff in their own schemas into the "main" one will break software. @kpinc , what use case do you want to address with putting Chado in it's own schema? If you just want to move it out of public, I don't see what you need to move the "other" schemas into the same chado schema. Alternatively, if you want to have multiple Chados (ugh) having the other schemas does present a problem. I imagine Tripal (@spficklin ?) would be the most impacted by a change, as well as the GBrowse adaptor (yikes, which I know people still use!).

scottcain avatar Aug 03 '18 18:08 scottcain

Hi Scott,

The short answer is that using schemas is good practice.

Apologies for the rambling answer below. My memory improved as I wrote.

On Fri, 03 Aug 2018 11:28:50 -0700 Scott Cain [email protected] wrote:

I believe it is already possible to install chado in it's own schema, though it is probably clunky.

We installed chado in its own schema. It was clunky, involving modification to the source in several places. I recall "public" being hard-coded.

I worry that moving the stuff in their own schemas into the "main" one will break software.

I'm not clear what this statement means. What is "the stuff"?

In the for what it's worth category, I don't see why the schema into which chado installs cannot be chosen by the user. If desired chado could be installed in the public schema. This should depend on the user's search_path setting -- whatever schema is first is where chado gets installed. The installation process which existed when we installed chado did this, but only "mostly".

@kpinc , what use case do you want to address with putting Chado in it's own schema?

In our case it's often a matter of namespace visibility. Chado has a lot of tables and many people with database access don't use chado. Others use chado in conjunction with tables in other schemas. (Demographic data, etc.) Most people use a generic Postgres interface (think pgAdmin). Lumping all the tables/views into a single schema presents a confusing number of relations in the user interface. Better that each user be able to choose the schema (s)he is working with and see those tables.

The search_path is set such that joins between schemas don't require schema qualification in the user's SQL. Unless the user wants to change things and schema-qualify tables located in little-used schemas.

In the general case some advantages of schemas are laid out in the "Database Schema Recommendations for an Application" section of:

https://wiki.postgresql.org/wiki/Database_Schema_Recommendations_for_an_Application

The reason why, by the way, the above recommends deleting the public schema is because of CVE-2018-1058. See:

https://wiki.postgresql.org/wiki/A_Guide_to_CVE-2018-1058:_Protect_Your_Search_Path

CVE-2018-1058 is another reason why the schema into which Chado installs should not be hardcoded.

FWIW, we have never had a public schema. Every schema has a dedicated purpose. This lets us clean up unused stuff, provides security, reduces clutter, etc. When chado wanted to install into a schema that didn't exist this presented a problem.

Security is a big point here. We have a fair number of users, most of which have no access to chado, much less write access to chado tables. Easy to control with permissions on schemas. But if chado were in the public schema, along with who knows what else, security would be a nightmare.

If you just want to move it out of public, I don't see what you need to move the "other" schemas into the same chado schema.

Again, I don't understand. Are you talking about the several other schemas that come with Chado? There's no reason why these shouldn't exist as independent schemas. It would be nice if the installation process was structured such that the names of these schemas were not hardcoded -- but that's an additional enhancement.

Alternatively, if you want to have multiple Chados (ugh) having the other schemas does present a problem.

Well, not if the other schemas can also be installed with names of the user's choosing. As long as you're going there it'd be nice to not have to install these other schemas. If they're not used it'd be nice to not have them. (I have no recollection now of whether we use them, but we probably don't.)

Multiple chados should probably be put into multiple databases, although just as with us and our multitude of schemas there's probably a case to be made somewhere for cross-Chado-instance joins.

(Our typical user has, if I'm remembering right, 5 schemas in their search path. Not counting the system schemas.)

I imagine Tripal (@spficklin ?) would be the most impacted by a change, as well as the GBrowse adaptor (yikes, which I know people still use!).

In theory, they should rely on the search_path and shouldn't have to be modified. Of course if they already hardcode schema qualification into their SQL this would be a problem.

Regards,

Karl [email protected] Free Software: "You don't pay back, you pay forward." -- Robert A. Heinlein

kpinc avatar Aug 03 '18 19:08 kpinc

Unfortunately this is I big change, and while I think it's a good idea to do, I don't have the bandwidth to do it now. Perhaps after 1.4.

scottcain avatar Dec 04 '18 01:12 scottcain