Olivier Binette
Thanks! I mean specifically using the tree visualization component, not the whole dashboard. The logic in the npm package should help.
@ngmarchant thanks for the link! I'll look into it for further optimization, and then see if I can make a pull request.
For reference, here's the optimization I could get by reducing the memory usage a bit more: ```python def levenshtein(s, t): m = len(s) n = len(t) buffer = np.arange(max(n, m)...
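As a rough illustration of the memory-saving idea (not the code from the comment above, which is cut off), here is a minimal sketch of a Levenshtein distance that keeps a single row of the dynamic-programming table. The function name `levenshtein_single_row` is hypothetical, and it uses a plain Python list rather than a NumPy buffer so that it runs without dependencies.

```python
def levenshtein_single_row(s, t):
    """Edit distance between s and t using one row of the DP table."""
    # Put the shorter string along the row so the buffer has min(len) + 1 cells.
    if len(s) < len(t):
        s, t = t, s
    n = len(t)
    row = list(range(n + 1))  # distances from the empty prefix of s to prefixes of t
    for i, cs in enumerate(s, start=1):
        diag = row[0]  # D[i-1][0], needed for the substitution case
        row[0] = i     # D[i][0]: delete the first i characters of s
        for j in range(1, n + 1):
            above = row[j]  # D[i-1][j], saved before it is overwritten
            cost = 0 if cs == t[j - 1] else 1
            row[j] = min(above + 1,       # deletion
                         row[j - 1] + 1,  # insertion
                         diag + cost)     # substitution or match
            diag = above
    return row[n]
```

Memory is proportional to the length of the shorter string, while the running time stays proportional to the product of the two lengths.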
Importance weighting approaches may be the easiest to implement in a general framework. Weights can be defined through: 1. Stratification based on user-specific categorical features 2. A user-specified model for...
Thanks for the links @RobinL! I missed the part of the documentation with the custom comparison as SQL. With AWS Athena ([Trino SQL](https://trino.io/docs/current/functions/array.html)), I think `zip_with` and `reduce` would...
@RobinL That is good to know. I'm sure there is interest in providing more functionality for machine learning data preprocessing and NLP as well. Trino has some good stuff for...
@NickCrews I think that would work well for generic SQL backends. For SQL engines that support array data types (like Trino/AWS Athena or lists with duckdb), it would be more...
@RobinL The embeddings can be quite long. PatentsView uses the sent2vec package to embed and compare patent titles. These are of length 100 by default if I remember correctly. These...
@RobinL, @samnlindsay, duckdb has added [list_cosine_similarity](https://blog.lancedb.com/vector-similarity-search-with-duckdb-44dec043532a) as a function and Trino has added [cosine_similarity](https://trino.io/docs/current/functions/math.html). I'd be interested in working with your team to showcase this functionality, using deep embedding models (via...
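For readers unfamiliar with the quantity these SQL functions compute, here is a minimal pure-Python sketch of cosine similarity between two embedding vectors. The helper below is hypothetical and is not the duckdb or Trino implementation. Returning NaN for a zero-norm vector is an assumption of this sketch, and the engines may handle that case differently.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Undefined for zero vectors; NaN is this sketch's choice.
        return float("nan")
    return dot / (norm_u * norm_v)
```

Values near 1 indicate embeddings pointing in nearly the same direction, which is what a comparison level on embedded text fields would threshold on.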
I only get this error on my current laptop, which was recently upgraded from Pop!_OS 21.04 to Pop!_OS 21.10. I don't have any issue on my other Linux machine. Pop!_OS...