risingwave
                                
                                
                                
                                    risingwave copied to clipboard
                            
                            
                            
                        Support `random()` in batch queries
Is your feature request related to a problem? Please describe.
Handy for generating testing data.
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response
The corresponding setseed function seems non-trivial. Note the following query is actually deterministic:
test=# select case n when 1 then setseed(0.4) else null end, random() from generate_series(1, 10) as t(n);
 case |       random        
------+---------------------
      |  0.6722690410337004
      |  0.9936586191615264
      | 0.19947461195735405
      |    0.90340354600701
      |   0.516998156895498
      |  0.7913052710048887
      |   0.531550972556847
      | 0.48302401286537666
      | 0.14525565255427209
      | 0.19221106009513123
(10 rows)
So what random() really does in PostgreSQL is mutating a session-level variable:
https://www.postgresql.org/docs/16/sql-set.html#:~:text=TO%20value.-,SEED,-Sets%20the%20internal
https://github.com/postgres/postgres/blob/REL_16_1/src/backend/utils/misc/guc_tables.c#L3695-L3704
https://github.com/postgres/postgres/blob/REL_16_1/src/common/pg_prng.c#L232-L234
Of course we can provide a weaker random() function first without the ability to generate deterministic sequence given a seed.
cc @wangrunji0408 may be more familiar with RNGs given experience working on madsim.
So what random() really does in PostgreSQL is mutating a session-level variable:
Of course we can provide a weaker random() function first without the ability to generate deterministic sequence given a seed.
Either LGTM. If the session-level variable approach involves lots of work, I will prefer the 2nd