Multiple-stream example needs clarification
When reading from a single stream as in this example https://github.com/googleapis/python-bigquery-storage/blob/6254bf2a588e69e2175df1c67edb514655d93e9d/samples/to_dataframe/main_test.py#L130-L139, the example can be confusing.
The example states that we can read from multiple streams in order to get data faster. It does not say that multiple streams may be created automatically when there is a lot of data, in which case you need to read from all of the streams to get every row.
Thanks!
Thanks for the report. Yes, we can improve these comments.
Note: I did recently change this sample to request a maximum of 1 stream so that the sample does not miss rows that have been assigned to other streams. https://github.com/googleapis/python-bigquery-storage/pull/114
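To illustrate why reading only one of several streams misses rows, here is a minimal sketch of the general pattern. It does not call the BigQuery Storage API: `read_stream`, `FAKE_STREAMS`, and `read_all_streams` are hypothetical stand-ins, and the sketch assumes each row is assigned to exactly one stream. With the real client, the per-stream read would presumably be done by passing each stream's name to `read_rows`, but that part is left out here.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in data: each stream holds a disjoint slice of the rows.
FAKE_STREAMS = {
    "stream-0": [{"id": 1}, {"id": 2}],
    "stream-1": [{"id": 3}, {"id": 4}],
    "stream-2": [{"id": 5}, {"id": 6}],
}


def read_stream(stream_name):
    """Hypothetical reader: return every row assigned to one stream."""
    return list(FAKE_STREAMS[stream_name])


def read_all_streams(stream_names, reader=read_stream, max_workers=4):
    """Read every stream concurrently and concatenate the rows.

    Threads are used on the assumption that the per-stream work is
    I/O-bound; pool.map keeps the results in the order of stream_names.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = list(pool.map(reader, stream_names))
    return [row for part in parts for row in part]


stream_names = list(FAKE_STREAMS)

# Reading only the first stream silently drops the rows in the others.
first_only = read_stream(stream_names[0])

# Reading all streams recovers the full result.
all_rows = read_all_streams(stream_names)
```

In this sketch `first_only` holds 2 of the 6 rows, while `all_rows` holds all 6. Requesting a maximum of 1 stream avoids the problem by putting every row in that single stream.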
Converting this to a feature request for a code sample handling multiple streams, as the original issue has been fixed as per Tim. Thanks!
@tswast So basically, we can loop over every stream, and each stream will contain a piece of the DataFrame? Are these pieces disjoint? To take advantage of multiple streams, do I need to use a library like multiprocessing? Thanks.
This issue was transferred from python-bigquery-storage to google-cloud-python as part of the work for https://github.com/googleapis/google-cloud-python/issues/10991.