[Good First Issue]StarRocks Hands-on Tasks 2024

Open wangsimo0 opened this issue 1 year ago • 7 comments

Hi Rockstars,

This is a list of proposed hands-on tasks. If you're new to StarRocks and eager to engage with the community, these issues are well suited for you to dive into :) They are a good way to gain hands-on experience and become familiar with StarRocks development. This is also an open list, so you are welcome to propose more tasks.

Please mention @kateshaowanjou or @wangsimo0 to book an issue, and add a comment in the issue you picked so that it won't be assigned to others. Always discuss the design with the community before you start developing. Some of the issues are really big, so don't hesitate to seek help from the community.

External Catalog related issues

Information Schema in External Catalog

In version 3.2 and later, StarRocks improves compatibility with BI tools by supporting the information_schema database in External Catalog. This feature is a valuable way to obtain structured metadata. Several views within information_schema currently return empty results, and work is underway to support them so that coverage is comprehensive. Because StarRocks follows the MySQL protocol, it aligns with MySQL's pattern for information_schema. We should maintain compatibility with MySQL, provide as much information as we can, and optimize for efficiency to minimize the time consumed.

  • [ ] Columns view
  • [ ] Views view

Trino's Compatibility Issues

In version 3.0 and later, StarRocks supports Trino's SQL_dialect mode; however, ongoing enhancements are necessary to further optimize this functionality.

New Functions

  • [ ] inverse_normal_cdf and normal_cdf @241600489 #38989
  • [ ] typeof @MicePilot #36245
  • [ ] regexp_split #37089
  • [x] boolor_agg,boolxor_agg,booland_agg #22949
  • [ ] from_iso8601_date(string),from_iso8601_timestamp(string) #40877
  • [ ] array_agg in window function #40881 @mygrsun
  • [ ] cardinality in HLL data type #40879
  • [ ] count(distinct) window function #46105 @yangzho12138
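For anyone picking up #38989, a small reference implementation can help when writing expected values for unit tests. The sketch below uses Python's standard `statistics.NormalDist` and assumes the Trino argument order `normal_cdf(mean, sd, v)` and `inverse_normal_cdf(mean, sd, p)`. It is only a way to cross-check numbers, not the StarRocks implementation, and the input checks are an assumption about how invalid arguments should be handled.

```python
from statistics import NormalDist


def normal_cdf(mean: float, sd: float, v: float) -> float:
    """P(X <= v) for X ~ Normal(mean, sd). Argument order assumed from Trino."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return NormalDist(mu=mean, sigma=sd).cdf(v)


def inverse_normal_cdf(mean: float, sd: float, p: float) -> float:
    """Value v such that normal_cdf(mean, sd, v) == p, for 0 < p < 1."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return NormalDist(mu=mean, sigma=sd).inv_cdf(p)


if __name__ == "__main__":
    # Round trip: the inverse of the CDF should recover the probability.
    v = inverse_normal_cdf(10.0, 2.0, 0.9)
    print(v, normal_cdf(10.0, 2.0, v))
```

Comparing a new StarRocks function against values produced this way, with a small tolerance, is one option for the regression tests attached to the pull request.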

Function Mapping

| Status | Trino's function/expression | StarRocks' function/expression | Comment | Assignee |
| --- | --- | --- | --- | --- |
| [ ] | map_agg(key, value) → map<K,V> | map() | | |
| [ ] | show schemas from <catalog_name> | Show databases from <catalog_name> | | |
| [ ] | array_sort(array(T), function(T, T, int)) -> array(T) | array_sortby(, array0 [, array1...]) | This one needs to pay attention to the input order. | |
| [ ] | sequence(start, stop), sequence(start, stop, step) (integer data types) | array_generate([start,] end [, step]) | | |
| [ ] | last_day_of_month(x) → date | last_day(x,'month'); | | |
| [ ] | map_from_entries(array(row(K, V))) -> map(K, V) | map_from_arrays | This one needs to pay attention to the transformation. SELECT map_from_entries(ARRAY[(1, 'x'), (2, 'y')]); is equivalent to SELECT map_from_arrays([1,2],['x','y']); | |
| [x] | current_catalog | catalog() | | thanks to @macroguo-ghy |
| [x] | current_schema | database() | | thanks to @macroguo-ghy |
| [x] | slice(x, start, length) → array | array_slice(input, offset, length) | | |
| [ ] | approx_set(x) → HyperLogLog | HLL_HASH(column_name) | | |
| [ ] | empty_approx_set() → HyperLogLog | HLL_EMPTY() | | |
| [ ] | merge(HyperLogLog) → HyperLogLog | HLL_RAW_AGG(hll) | | |
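The map_from_entries mapping is not a simple rename: the single array of (key, value) rows has to be split into one array of keys and one array of values. A minimal Python sketch of that transformation follows. The helper names `split_entries`, `sql_literal` and `to_map_from_arrays` are hypothetical and only illustrate the rewrite; they are not part of StarRocks, where the real change would operate on the parsed expression tree.

```python
from typing import Any, List, Sequence, Tuple


def split_entries(entries: Sequence[Tuple[Any, Any]]) -> Tuple[List[Any], List[Any]]:
    """Split [(k1, v1), (k2, v2), ...] into ([k1, k2, ...], [v1, v2, ...])."""
    keys = [k for k, _ in entries]
    values = [v for _, v in entries]
    return keys, values


def sql_literal(value: Any) -> str:
    """Render an int or str as a SQL literal (hypothetical helper)."""
    if isinstance(value, str):
        # Double any embedded single quotes, as SQL string literals require.
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def to_map_from_arrays(entries: Sequence[Tuple[Any, Any]]) -> str:
    """Build the map_from_arrays(...) call equivalent to map_from_entries(entries)."""
    keys, values = split_entries(entries)
    key_sql = ",".join(sql_literal(k) for k in keys)
    value_sql = ",".join(sql_literal(v) for v in values)
    return f"map_from_arrays([{key_sql}],[{value_sql}])"


if __name__ == "__main__":
    # Mirrors the example from the mapping list.
    print(to_map_from_arrays([(1, "x"), (2, "y")]))
```

The order of keys and values must be preserved while splitting, otherwise a key would end up paired with the wrong value.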

Other Enhancements

  • [ ] Apache Ranger's policy translator

StarRocks supports using the Hive service in Ranger to control access to Hive tables. However, we have found that some community users still want to manage all privileges in the StarRocks Ranger service, so we need a translator (maybe a script).

  • [x] Add catalog information in FE's query_detail @happut

After enabling query detail collection with admin set frontend config("enable_collect_query_detail_info"="true"), a user can get query details using curl -uroot: http://172.26.81.138:8030/api/query_detail?event_time=<unixtimestamp_value>. The returned information looks like ...."database":"simo","sql":"insert into abc values (1,2),(2,3)","user":"root".... There is no catalog information, such as "catalog":"default_catalog".
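A client consuming this endpoint could be sketched as follows. This is a minimal illustration, not a definitive client: the function names `query_detail_url` and `catalog_of` are hypothetical, the sample record contains only the fields quoted above, and it is an assumption that each query detail is a JSON object in which the new field would appear under the key "catalog".

```python
import json
from typing import Optional
from urllib.parse import urlencode


def query_detail_url(fe_host: str, http_port: int, event_time: int) -> str:
    """Build the FE query_detail URL for a given Unix timestamp."""
    query = urlencode({"event_time": event_time})
    return f"http://{fe_host}:{http_port}/api/query_detail?{query}"


def catalog_of(record: dict) -> Optional[str]:
    """Return the catalog of one query detail record, or None if it is absent."""
    return record.get("catalog")


# Only the fields quoted in the issue text; the real record has more.
sample = json.loads(
    '{"database":"simo","sql":"insert into abc values (1,2),(2,3)","user":"root"}'
)

if __name__ == "__main__":
    print(query_detail_url("172.26.81.138", 8030, 1700000000))
    # Without the enhancement the record carries no catalog information.
    print(catalog_of(sample))
```

Reading the field with `dict.get` keeps such a client working against both older FE versions, where the key is missing, and versions that include it.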

Apache Hudi & Delta Lake Compatibilities

  • [ ] Add Hudi sink (✨ HIGH priority)
  • [ ] Add Delta Lake sink (✨ HIGH priority)

More Connectors

  • [ ] Oracle catalog
  • [ ] Kudu catalog @predator4ann
  • [ ] StarRocks catalog
  • [ ] Greenplum catalog
  • [ ] SQL Server catalog
  • [x] Clickhouse catalog
  • [ ] Trino catalog
  • [ ] DB2 catalog
  • [ ] Druid catalog
  • [ ] Oceanbase catalog
  • [ ] SAP Hana catalog

More Compatibilities

  • [ ] Hive UDF compatible
  • [ ] Spark SQL compatible structure
  • [ ] Hive SQL compatible structure
  • [ ] Impala SQL compatible structure

wangsimo0 avatar Feb 06 '24 08:02 wangsimo0

I'd add iceberg tagging and branch query

alberttwong avatar Feb 08 '24 03:02 alberttwong

https://github.com/StarRocks/starrocks/issues/37959

alberttwong avatar Feb 08 '24 03:02 alberttwong

I want to pick #38989 @wangsimo0

241600489 avatar Mar 01 '24 02:03 241600489

I want to pick #40881 @wangsimo0

mygrsun avatar Apr 26 '24 06:04 mygrsun

I want to pick #37089 @wangsimo0

yangzho12138 avatar May 21 '24 03:05 yangzho12138

I want to pick #37089 @wangsimo0

You need to also comment under the issue #37089 so I can assign it to you. If you have any issues during the development process, I can introduce you to the relevant discussion group. https://853921.ma3you.cn/articles/b12e90J/

kateshaowanjou avatar May 21 '24 03:05 kateshaowanjou

I want to pick #46105 @wangsimo0

yangzho12138 avatar Jun 27 '24 06:06 yangzho12138

@wangsimo0 Hi, I want to add Delta Lake Compatibilities. Has this requirement been resolved?

FLAYhhh avatar Aug 06 '24 08:08 FLAYhhh

@wangsimo0 Hi, I want to add Delta Lake Compatibilities. Has this requirement been resolved?

Are you referring to the "Add Delta Lake sink" function? There's no one working on it at the moment and it'd be awesome if you are willing to give it a try!😎

kateshaowanjou avatar Aug 08 '24 02:08 kateshaowanjou

Sure thing! I'd be happy to take this on.

FLAYhhh avatar Aug 08 '24 08:08 FLAYhhh

@kateshaowanjou @wangsimo0 Can I pick this issue: https://github.com/StarRocks/starrocks/issues/38989 if it's not being worked on by anyone?

amoghmargoor avatar Aug 08 '24 10:08 amoghmargoor

Sure thing! I'd be happy to take this on.

This issue is not the easiest one so feel free to add my WeChat:wanjoushao if you need help!

kateshaowanjou avatar Aug 08 '24 11:08 kateshaowanjou

@kateshaowanjou @wangsimo0 We are migrating from Trino to StarRocks and working on the functions. Can I pick the map_agg issue?

Jcnessss avatar Aug 22 '24 02:08 Jcnessss

I want to pick this issue #46060 @wangsimo0 @kateshaowanjou

SoraNimi avatar Aug 25 '24 08:08 SoraNimi