sqlite_modern_cpp icon indicating copy to clipboard operation
sqlite_modern_cpp copied to clipboard

Adds an api to collect results into direct object and/or its stl-container

Open pPanda-beta opened this issue 7 years ago • 27 comments

I've created this api for a better ORM like solution which works with sqlite_modern_cpp.

This api can collect multiple results

  1. to a container like (vector, list , ...etc) of object . (collect::to)
  2. to a container like (vector, list , ...etc) of smart_pointers of object. (collect::to)
  3. as fresh newly created stl-container of object. (collect::as)
  4. as fresh newly created stl-container of smart_pointers of object. (collect::as)

Also this api can collect single results

  1. as smart pointer (std::shared_ptr) of an object. (collect::as)

SUPPORTS BOTH c++11 and c++14.

pPanda-beta avatar May 17 '17 17:05 pPanda-beta

@pPanda-beta Thanks for your PR :-)

I like your approach, it can save a lot of repetitive patterns. But ...

It can be really hard to write a decent true ORM in C++, that's why we have kept the library un-opinionated. Everyone is encouraged to write there own ORM if they need one.

I think it's better to set up a separate git repository for this ORM and mention it in the project Readme. My main reason is that an ORM is highly opinionated :-) Another reason is that there could be future implementations from other users and we can't ship all of them.

aminroosta avatar May 17 '17 17:05 aminroosta

So forget about orm. this api is not actually what orm is. Its more created for fetching list of objects directly instead of using functors.

but the already available library has ability to collect data in tupples. Does'nt that give any hint that its able to fetch whole record​ at once? So the problem is : If its able to fetch 1 record why can't several records?

pPanda-beta avatar May 17 '17 17:05 pPanda-beta

Aren't they look similar?

tuple<int, string> record;
db<<"SQL">>record;

And this one

vector<tuple<int,string>> records;
db<<"SQL"
    >>collect<int,string>::to(records);

pPanda-beta avatar May 17 '17 18:05 pPanda-beta

Hi,

i like it too, its a generic functor creator for standard stuff not an ORM at all. Maybe its a misunderstanding. I too dont like the idea of a "true" ORM in here but this isnt even close to one. Anyway there is one thing i don't like, its in the global namespace not sqlite. If that is fixed i could see a place for stuff like that in this wrapper? Thougths?

Killili avatar May 17 '17 18:05 Killili

Humm, yea i get your point @Killili @pPanda-beta Alright, some misunderstanding from my part :-)

Lets just make sure if we need to change anything or not! @zauguin can you please comment on this, with this PR we can get something similar to #103.

// From 103
auto stmt = db << "select age,name,weight from user where age > ? ;" << 21;
for(auto &&[age, name, weight] : stmt.as<int, string, double>()) {
}
// Interface in this PR
auto stmt = (db << "select age,name,weight from user where age > ? ;" << 21);
for(auto &&[a,n,w] : (stmt >> collect<int,string,double>::as(vector, tuple<int,string,double>)) {
}

aminroosta avatar May 17 '17 18:05 aminroosta

By the way i really liked if we could do

list<int> ids;
db << "select id from tbl;" >> ids;
// instead of the current interface
db << "select id from tbl;" >> collect<int>::to(ids);

But we can't because >> vector<T> is already taken by blobs :/

aminroosta avatar May 17 '17 18:05 aminroosta

@Killili Yes I was also thinking to put those in the same namespace "sqlite", but were stuck by a confusion that this is an extension to the library which is unaware about most of the internal properties of the original library. May be sqlite::ext or something like sqlite::collector will be another choice. I'm not sure and still confused. :(

pPanda-beta avatar May 17 '17 19:05 pPanda-beta

@aminroosta About #103 , yes iterators are always a more generic choice and aligned with [open-for-extension-close-for-modification] principle. May be there are a couple of differences with this PR also.

These are :

  1. Its backward-compatible with c++11, I'm not sure but may be #103 is compatible upto c++14
  2. as is a struct not a function, so we can save this as a tool to create containers. auto list_builder = collect<int>::as<list,int>(); or simply collect<int>::as<list,int> list_builder; and use them like auto list1=db<<"SQL">>list_builder;
  3. The syntax is yet not readable as we have to repeat the basic data types in both collect and as . also for function to we still have to provide the data types. Basically they solve different purposes collect<What we are collecting>::as<How we are storing> This is mainly for two reasons. 3.a) Some classes are Non-Copyable and "not default-constructible", for that we need to give support to smart-pointers. e.g. collect<int,string>::as<vector, shared_ptr<user>> 3.b) Some times we are not just collecting normal objects, there can be overloaded constructors with several basic data types. and the class database_binder has no clue which path to follow, for example : result -> int -> difficult_class_constructor_taking_int or result -> string -> difficult_class_constructor_taking_string . 3.c) Even sometimes we can solve great problems without extra coding. e.g. auto images = db <<"SELECT img_path FROM item ;" >>collect<string>::as<vector, cv::imagefile>();

pPanda-beta avatar May 17 '17 19:05 pPanda-beta

I just checked because i thought we had something like that already. While investigating #32 i had tried to make something like the interface @aminroosta mentioned. But i did not get it to work apparently and we went with the simple solution.

Killili avatar May 17 '17 19:05 Killili

This combines four features:

  1. std::shared_ptr support.
  2. Extract into containers
  3. The auto sth = db << "SQL;" >> collect<...>::as<...>(); interface
  4. Directly construct objects

I like all of them, but I would like them to usable independent of each other. For example with this we would have

std::shared_ptr<int> shared = db << "SQL;" >> collect<int>::as<std::shared_ptr,int>(); // OK
db << "SQL;" >> shared; // fails
std::unique_ptr<int> unique = db << "SQL;" >> collect<int>::as<std::unique_ptr,int>(); // fails
db << "SQL;" >> unique; // OK

So maybe we can split them up?

@aminroosta @pPanda-beta Regarding this and #103: For storing things in containers this has a much better interface than #103, but for iterating #103 has the adventage that for large result not all of them have to be stored but are discarded after each step. So they solve different problems.

zauguin avatar May 17 '17 22:05 zauguin

@pPanda-beta In your point 3.a the problem is with non-movable classes, there is no reason not to support, e.g. collect<int,string>::as<vector, user> for movable, non-copyable user.

zauguin avatar May 17 '17 22:05 zauguin

@pPanda-beta These are pure suggestions, feel free to disagree :-)

1- Drop the shared_ptr support, because nothing is actually shared and we already have unique_ptr.

~2- make >> vector<int> operator smarter! If the result column is of type blob and it's a single row extract blob data, but if it is multiple rows and the column is not blob data extract numeric data.~

3- support >> vector<tuple<int, string ...>> operator.

With 2 & 3 i think we almost don't need the collect<...>::to() interface anymore.

4- Drop support for user defined types! No >> collect<int,string>::as<vector, user>!

5- Change the collect<...>::as<...> interface, instead support >> (vector<int>())! Overload the >> operator for Rvalue references to std::vector!

So an example combining the above would be:

vector<int> ids;
db << "select id from tbl;" >> ids;
vector<int> numbers; // this is blob data
db << "select numbers from tbl where id = ?" << 1 >> numbers;

vector<tuple<int,string>> people;
db << "select age, name from tbl;" >> people;

// The ">> vector<T>&&" operator returns the given vector
auto ids = db << "select id from tbl;" >> (vector<int>());
auto numbers = db << "select numbers from tbl where id = ?" << 1 >> (vector<int>());
auto poeple = db << "select age,name from tbl;" >> (vector<tupe<int,string>>());

aminroosta avatar May 19 '17 08:05 aminroosta

@Killili I think i merged #32 too soon 😄

aminroosta avatar May 19 '17 11:05 aminroosta

@aminroosta I like 1, 3, 4 and 5. Your second point would lead to very surprising behaviour because the code behaves differently if e.g. your query for all rows behave differently iff the table has just one row. Additionally it would be quite hard to implement, SQLite only determines if a second row exists when the second row is queried. So we would have to extract the first result before we even know which type we need.

zauguin avatar May 19 '17 12:05 zauguin

@zauguin You are right about checking the number of rows. However can we only use sqlite3_value_type()(?) which returns SQLITE_INTEGER, SQLITE_TEXT, SQLITE_FLOAT and SQLITE_BLOB.

// single row blob
vector<int> vblob;
db << "select numbers from tbl where id = 1" >> vblob; // SQLITE_BLOB
// multiple row int
vector<int> vrows;
db << "select age from tbl;" >> vrows; // SQLITE_INTEGER
// multiple rows of blob data
vector<vector<int>> vblobrows; // !
db << "select numbers from tbl;" >> vblobrows;

Do you think that works for 2?

aminroosta avatar May 19 '17 12:05 aminroosta

@aminroosta

If we stop considering user-defined types then there is no requirement of shared_ptr or unique_ptr.

So support for both can be omitted together.

Now the I've two confusions.

  1. Why vector ? I mean why so much restriction? if its an api then it should be open to all variety. Like the idea of iterator in #103. An api should always be open for extension and close for modification, right? Currently using to and as we can collect rows in multiple data structures(linear). Like vector, array, list, forward_list, deque, ... etc Suppose the developer just needs to merge the results from two different queries from two different databases, they can simply pass the container to this api and collect results. Also only std::vector may not be very much comfortable with other apis. Like QList from Qt. With our current api it really doesn't care about the type, but with only vector people will be forced to use that or simply reject this api.

  2. Isn't just >> operator for both blob and rows, a little bit cryptic? What about the readability?

pPanda-beta avatar May 19 '17 15:05 pPanda-beta

@pPanda-beta unique_ptr can supported anyway, we could remove it, but I don't hink this warrents a breaking change, especially since it can be useful if you have big amounts of data.

Why vector? I think vector is mostly an example here. The same interface should be supported for all container types, but vector is the interesting case because the problem with BLOBs appears only with vector.

I'm completely with you that using >> vector for blobs and rows is very hard to read. If all other proposals of @aminroosta would be applied we would end up with sth. like this

// single row blob
vector<int> vblob;
db << "select numbers from tbl where id = 1" >> vblob;
// multiple row int
vector<int> vrows;
db << "select age from tbl;" >> collect(vrows);

Actually I would prefer it the other way around, like

// single row blob
vector<int> vblob;
db << "select numbers from tbl where id = 1" >> blob(vblob);
// multiple row int
vector<int> vrows;
db << "select age from tbl;" >> vrows;

because extracting all rows is most likely much more common than extracting a single blob. Obviously this would break programs which use the old interface.

zauguin avatar May 19 '17 16:05 zauguin

When i was writing the proposed interface i only thought about std::vector but now that i read @zauguin comment, i see no problem supporting other containers and that's really cool 👍

Isn't just >> operator for both blob and rows, a little bit cryptic?

Yes it is, i actually like the blob(vblob) syntax from @zauguin much more. The next version is v4.0, So i think its alright to break backward compatibility! (if it delivers a simpler interface). And we already have one such instance with statement-execution in #120

@pPanda-beta The main functional difference between the two interfaces is dropping support for user defined types. I know that's a debatable topic :-)

aminroosta avatar May 19 '17 17:05 aminroosta

@pPanda-beta If we want to keep the user defined type support, we could try to use a type-trait to document the columns required to serialize it. So you could write something like this

class user {
public:
  user(int id, std::string name) {...}
  ...
};
namespace sqlite {
  template<> struct serialize<user>: serialize<user, int, std::string> {};
}
...
int main() {
  ...
  auto users = db << "SELECT * FROM users;" >> (std::vector<user>());
  ...
}

This would allow us to combine this functionality with the short syntax. What do you think?

zauguin avatar May 19 '17 18:05 zauguin

@aminroosta @zauguin Thanks for the suggestion. function names blob or collect sounds more readable.

Now For collect<...>::as<...> part I was thinking that we can also overload type casting operator of database_binder, Like :

list<tuple<int,int>> rows = db<<"SELECT id,age FROM tbl ; ";

or

vector<tuple<int,int>> rows = db<<"SELECT id,age FROM tbl ; ";

For me both overloaded >> operator with rvalue [ e.g. vector() ] and this one seems readable. But as for blob currently the operator >> is already overloaded, its better not to make conflicts with it until it gets moved to blob().

The only part with which I'm still confused is that why not "user-define types"? I mean that effectively saves time right? From the api's perspective whats the difference in between vector<tuple<int,string>> and vector<User> and <int,string> ? The 2nd one constructs objects where the first one forces user to convert it from tuple to user-defined class;

pPanda-beta avatar May 19 '17 19:05 pPanda-beta

@pPanda-beta

I was thinking that we can also overload type casting operator of database_binder

It's a clever idea, but that's very similar to prepared statements syntax so i'm for >> operator with Rvalue containers :-)

But as for blob currently the operator >> is already overloaded, its better not to make conflicts with it until it gets moved to blob().

Exactly, this will also be part of the suggested interface implementation.

The only part with which I'm still confused is that why not "user-define types"? From the api's perspective whats the difference in between vector<tuple<int,string>> and vector and <int,string> ?

You are right, i am just a little bit reluctant in case of user defined types, because it looks like an ORM! Not that there is something inherently wrong with an ORM, it's just my personal opinion to keep the library as minimal as possible.

The proposed interface from @zauguin looks good to me, if you guys think we should support user defined types, i'm OK too :-)

aminroosta avatar May 19 '17 19:05 aminroosta

On a second thought, the cast overload isn't that bad either.

list<tuple<int,int>> rows = db<<"SELECT id,age FROM tbl ; ";

I do think it's a little confusing but the >> (vector<int>()) is also somewhat weird!

aminroosta avatar May 19 '17 19:05 aminroosta

@zauguin

Yaay, type-traits are also great. It leaves the implementation of serialization to the user! That's awesome. But being a graduation student I've very little knowledge about modern c++ syntax. And I really wanted​ to know if it supports overloaded constructors! Like

struct user
{
...
    user(int id, string name){...}
    user(string name){...}
};

//I don't know in this case how to specialize serialize may be

template<> struct deserialize​<user>::from<int,string> {}

template<> struct deserialize<user>::from<string> {}

but the real question is how the overloaded >> operator will detect which one to use just from a rvalue reference of vector</just/user>

pPanda-beta avatar May 19 '17 22:05 pPanda-beta

The idea was that the author of the class decides which constructor should be used and provides one of the two specializations. This is detected by operator >>, so operator >> "knows" which arguments it has to pass.

zauguin avatar May 19 '17 22:05 zauguin

@aminroosta

Talking about ORM I was looking around hiberlite and ibatis. Both of them has some conditions for the user-defined types. And surely these solutions never let any user-defined types independent of those apis. hiberlite uses some macros which makes the model (the user-defined type) quite dependent on the api.

Now in this collector api

  1. we don't even touch the user-defined class
  2. we don't touch any member of database and database_binder
  3. we don't even think about serializing from object

Now about size of code, I believe the amount of code will be almost same for both the cases. i.e. the std::tuple and user-defined type **Moreover the current api supports both user-defined type as well as std::tuple [Suddenly it came to my notice that std::tuple<int,string> is effectively​ an user-defined type having constructor with parameters (int,string) ] ** I can't reduce the code just by using tuple because the compiler must know about the parameters of the consttuctor. _ collect<int,string>::as<vector,user>() vs vector<tuple<int, string>()_ So we can't escape "What we are collecting" aka <int, string>

Btw, I also discovered that just by writing

as<...., typename T>

to

as<..., typename T=std::tuple<Args...> >

will allow the developer to write

collect<int,string>::as<vector>()

which will return

vector<tuple<int,string>>

so with a very little effort we can reduce a lot of pains of developers and the api will be more flexible, like the way they want to use it, without having any restriction!

pPanda-beta avatar May 19 '17 22:05 pPanda-beta

@zauguin

but there can be nessecity for both the constructors...

  1. I want all users who has finished otp verification and then I'll​ decide their ids
db<<"SELECT name FROM temp_user WHERE otp_verified=true ;"
  1. I want all users with their ids.
db<<"SELECT id,name FROM user ;"

so the "way of serialization" can vary from query to query, but if we bound it with our api we are restricting it to only one constructor... and also I believe (sorry this is extremely personal opinion) database is database, object is object, database is not a serialized set of object, rather than it persists data of a set of object. So a database api should always be unaware about object or the mapping, and it should indiscriminately allow all objects to be formed by the data stored in it.
Like same database, same api, and now someone will create tupple from it, someone will create user, someone will create person using least amount of code, and using the api in same fashion...

pPanda-beta avatar May 19 '17 23:05 pPanda-beta

@pPanda-beta Let's drop the traits then.

I would prefer to use an interface which doesn't depend on template template arguments, because they are overly restrictive. Custom containers may not be templates to begin with, may need additional arguments etc. Default arguments could still be used for collect, so this would bring us to

vector<tuple<int, string>> vec;
db << "SQL;" >> collect<>::to(vec);
// or
auto vec2 = db << "SQL;" >> collect<>::as<vector<tuple<int, string>>>();

Without user-defined types it could be

vector<tuple<int, string>> vec;
db << "SQL;" >> vec;
// or maybe
auto vec2 = db << "SQL;" >> vector<tuple<int, string>>();
// or even
vector<tuple<int, string>> vec3 = db << "SQL;";

I think the second one is much nicer. While I think that user-defined types would be nice to have if they would fit in, for me they aren't important enough to justify this interface change.

zauguin avatar May 20 '17 22:05 zauguin