Dominik Weckmüller
## Problem Same here. I'm trying to load a 6GB parquet file with 3 columns (two string columns and one with embeddings, arrays just like @aschmu described) in pandas with...
+1 See also https://github.com/qdrant/qdrant/discussions/1418
Very nice, thanks for writing the tutorial!
Very nice idea, I had something very similar in mind: #137
Yes absolutely! Just a disclaimer: my personal view might be somewhat biased towards the usage in SemanticFinder where I only append items to a variable, filter them and then perform...
Let's head over to https://github.com/do-me/SemanticFinder/discussions/32 so we don't get too off-topic for this issue!
Are there any updates on this issue? @BCorbeek did you investigate any further? For anyone parsing PDFs to text, losing text (particularly without knowing it) is probably the worst that...
That's something I definitely wouldn't expect! Just created a test pdf with LibreOffice but everything seems ok: [truncation_test_tika.pdf](https://github.com/chrismattmann/tika-python/files/11375870/truncation_test_tika.pdf) Content: ``` 5.8 abcd some words here, the sentence ends now 6.1...
On second thought, this test might not be representative, as there are myriad export options available. Also, it seems as if LibreOffice always adds line breaks (`240...
I also just tripped over this issue and can confirm what @cyril23 said: it's happening very rarely in let's say