grist-core icon indicating copy to clipboard operation
grist-core copied to clipboard

SQLite UNION operator

Open Vladimir-Va opened this issue 3 years ago • 8 comments

Please consider adding the ability to combine data from multiple tables into a complete result set using the UNION and UNION ALL operators.

Vladimir-Va avatar Apr 26 '22 19:04 Vladimir-Va

Shouldn't this be an ideal candidate for some widget rather than built-in functionality? IDK

dumblob avatar Apr 26 '22 19:04 dumblob

Thank you @dumblob This is the creation of table views by SQL query. SQL Example:

SELECT City, Country FROM Customers
WHERE Country='Germany'
UNION or (UNION ALL)
SELECT City, Country FROM Suppliers
WHERE Country='Germany'
ORDER BY City;

The UNION operator is used to combine the result-set of two or more SELECT statements.

  • Every SELECT statement within UNION must have the same number of columns
  • The columns must also have similar data types
  • The columns in every SELECT statement must also be in the same order

Vladimir-Va avatar Apr 27 '22 20:04 Vladimir-Va

The purpose of the UNION operator is to workaround the inability of SQL to produce the correct output right away from the same data (thus one has to write multiple queries over the same data and then "merge" the results). Other proper use cases are unknown to me (because if there were any, one would anyway put all the data into one table before doing any querying).

Therefore if Grist wants people to keep data without duplicates across tables (and yes, the whole Grist ecosystem assumes there are no duplicates), it shouldn't encourage people to easily use UNION by providing this operator as first class citizen (and for advanced users there is always the possibility to write SQL directly or make a teeny tiny widget as I suggested above).

Thoughts?

dumblob avatar Apr 28 '22 12:04 dumblob

Thank you @dumblob I 100% support the concept of one data source and UNION does not violate this concept. For example tasks CPQ

If i have 3 different tables (with columns): ProductType1 (UUID, ProductName, PartNo, ColumnsN) ProductType2 (UUID, ProductName, PartNo, ColumnsM) AllProducts (UUID, ProductName, PartNo) In each ProductType table I configure PartNo and ProductName for this product type. Then I need to union all the data from the table ProductType1 and ProductType2, while making changes to any ProductType tables need same result in AllProducts table.

This can be done on the basis of one table ProductType (UUID, ProductName, PartNo, ColumnN, ColumnsM) but formula for calculate ProductName and PartNo based on ColumnsN for ProductType1 and ColumnsM for ProductType2. That is, if I have N ProductType i need N conditions for formula calculated ProductName and PartNo.

Vladimir-Va avatar Apr 28 '22 22:04 Vladimir-Va

@Vladimir-Va thanks for the example. It's actually a typical example of "bad design" IMHO.

IIUC the problem you described is that one would have to write lengthy queries due to the need to list larger number of product types if everything was in one table (as I suggested).

Yes, I agree that it's definitely a problem. I at the same time disagree though that splitting the one large table into several smaller ones is a (good) solution. I claim that in practice you anyway need to write the same (large) amount of queries (as you would with one large table) just to maintain all invariants about the data (especially when adding & removing product types).

I've actually never seen anything like this example in production (simply because adding/removing tables is much more complicated than adding/removing rows due to user facing software APIs not being built for that purpose). I've only seen such examples in academic circles, DB expert circles, and novice exercises.

But your mileage may vary.

The question here is whether such IMHO purely synthetic needs are worth the complications in Grist (especially complications this will bring to customers trying to iron over the consequences of their badly designed data tables).

What do you think?

dumblob avatar Apr 29 '22 07:04 dumblob

Thanks for the suggestion @Vladimir-Va! Support for SQL is on our roadmap (https://github.com/gristlabs/grist-core/issues/45). I'd expect for this feature that the results of arbitrary SQL queries (including those using UNION) will be viewable as a grid; we're still thinking about the degree to which editing would be possible, or automatic updates of cells as the source data changes.

paulfitz avatar Apr 29 '22 13:04 paulfitz

Thank you @dumblob for your detailed comment. Just the same, in real production and trading companies with complex technical products, it is normal practice to create product configurators to generate Part No, product name and descriptions. The main task of CPQ is actually to integrate these configurators into a single interface for uniformity of its use by managers of data use/reuse in ERP accounting and product configuration by engineers. In fact, this is an attempt to create Many-to-Any relationships. https://docs.getgrist.com/gKbvkPcC9AAJ/CPQ

  1. I tried adding a formula to the text field to calculate ProuctTypes/PartNoFormula to then call this text as a formula in Type1/PartNoByLinkTo (as far as I understand, this is impossible for security purposes).
  2. It is possible to combine all the option tables into a single table, but when adding options, we do not have the creation of dependent or filtered reference column type.
  3. We can add each Type Widget to ProuctTypes Page but how to use if Type=N widgets. Сall linked widgets dynamicaly?
  4. Combine in ProductsAll to be used for invoicing.

Vladimir-Va avatar May 01 '22 00:05 Vladimir-Va

Thank you @paulfitz Think that the columns on which the union takes place should be read-only, this will allow you not to update the widget form every time. If it is implemented as a simple table view (related to this request) then it is more than enough to make it read-only.

Vladimir-Va avatar May 01 '22 00:05 Vladimir-Va