superduper [DOCS0-2] Fill out missing details in docs and check consistency

Read through structure, give live feedback, suggestions, and push improvements.

[ ] Getting started
[ ] Core API
[ ] Apply API
[ ] Execute API
[ ] Models
[ ] Data integrations
- [ ] MongoDB
- [ ] SQL
[ ] AI Integrations
- [ ] Anthropic
- [ ] Cohere
- [ ] OpenAI
- [ ] Jina
- [ ] Scikit-learn
- [ ] Torch
- [ ] Transformers
- [ ] vLLM
- [ ] LlamaCpp
[ ] Fundamentals
[ ] Production features

For each AI integration:

How to instantiate
(How to train)

Apr 05 '24 09:04 blythed

The Oracle tab is wrong. It refers to MSSQL.

https://docs.superduperdb.com/docs/docs/reusable_snippets/connect_to_superduperdb

May 14 '24 11:05 fnikolai

The Get useful sample data can be renamed to Fetch Dataset

May 14 '24 11:05 fnikolai

Tabs on Compute features have some spaces at the beginning

May 14 '24 11:05 fnikolai

Data integrations seems redundant and misleading.

If we want to keep it, then at least it should be similar to reusable snippets

May 14 '24 11:05 fnikolai

Create Vector-Index seems to have an issue with the tabs

May 14 '24 11:05 fnikolai

The SQL statement on Perform a vector search seems a bit odd.

select = query_table_or_collection.like(item, vector_index=vector_index_name, n=10).limit(10)

Why do we need limit(10) when we already have n=10?

May 14 '24 11:05 fnikolai

Connecting Listeners is empty

May 14 '24 11:05 fnikolai

On the PostgreSQL tab of Connect to SuperDuperDB, the user and password should both be superduper, to match the credentials of the Docker databases.
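
A minimal sketch of what the corrected snippet could look like. Only the user and password come from this comment; the port (the PostgreSQL default), the host, and the database name `test_db` are assumptions and may differ from the Docker setup.

```python
# Credentials matching the Docker databases, as requested above.
user = 'superduper'
password = 'superduper'

# Assumed values: 5432 is the PostgreSQL default port; host and
# database name are placeholders, not confirmed for the Docker setup.
port = 5432
host = 'localhost'
database = 'test_db'

uri = f"postgresql://{user}:{password}@{host}:{port}/{database}"
print(uri)

# With superduperdb installed and the database running, connect with:
# from superduperdb import superduper
# db = superduper(uri)
```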

May 14 '24 12:05 fnikolai

Change Mongo Connection to:

from superduperdb import superduper

user = 'superduper'
password = 'superduper'
port = 27017
host = 'localhost'
database = 'test_db'

db = superduper(f"mongodb://{user}:{password}@{host}:{port}/{database}")

May 14 '24 13:05 fnikolai

Also explain how to do pre-filtering and post-filtering of the data.

In general, any question that has been raised on Slack/Issues should be answered somewhere in the docs.

May 16 '24 14:05 fnikolai

from superduperdb import dtype is shown everywhere on Get Useful Samples but it returns

ImportError: cannot import name 'dtype' from 'superduperdb' (/home/superduper/superduperdb/superduperdb/__init__.py)

replace it with:

from superduperdb.backends.ibis.field_types import dtype

May 17 '24 01:05 fnikolai

curl -O s3://superduperdb-public-demo/images.zip returns curl: (1) Unsupported protocol or curl: (1) Protocol "s3" not supported or disabled in libcurl.

You need to replace it with https://superduperdb-public-demo.s3.amazonaws.com/images.zip

The same for video and audio.
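
The rewrite from the `s3://` form to the HTTPS form is mechanical, so it could be sketched as a small helper. The function name `s3_to_https` is hypothetical, and the sketch assumes a publicly readable bucket reachable through the virtual-hosted `s3.amazonaws.com` endpoint, as in the URL above.

```python
from urllib.parse import urlparse


def s3_to_https(s3_url: str) -> str:
    """Convert s3://bucket/key into https://bucket.s3.amazonaws.com/key.

    Hypothetical helper; only valid for publicly readable objects.
    """
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {s3_url!r}")
    # netloc is the bucket name, path is the object key with a leading slash.
    return f"https://{parsed.netloc}.s3.amazonaws.com{parsed.path}"


print(s3_to_https("s3://superduperdb-public-demo/images.zip"))
```

The resulting URL can be passed to `curl -O` unchanged.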

May 17 '24 01:05 fnikolai

On Multimodal Vector Search, in Define the embedding model datatype, the SQL and Mongo are swapped compared to the others.

May 17 '24 01:05 fnikolai

On Multimodal Vector Search the Image chunker is missing and therefore the Listener cannot be used.

May 17 '24 02:05 fnikolai

On Multimodal Vector Search, when choosing the Text chunker, the listener blocks forever.

May 17 '24 02:05 fnikolai

Add some comments on the Rest Config

This will make it easier for someone to understand what this is about.

https://github.com/SuperDuperDB/superduperdb/blob/main/deploy/rest/config.yaml

May 30 '24 20:05 fnikolai

Following this discussion: https://github.com/SuperDuperDB/superduperdb/issues/2122

we need:

Add to the FAQ that if users experience a dependency issue, they should uninstall the previous superduperdb installation.
Add instructions on how to install the framework directly from the repo.

May 30 '24 22:05 fnikolai

On Connect to SuperDuperDB, for Snowflake, add instructions for specifying the role and the database or warehouse within the connection URI.
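
As a starting point for those instructions, here is a sketch that builds a URI in the format documented by the snowflake-sqlalchemy dialect, where warehouse and role are query parameters. The helper name `snowflake_uri` and all argument values are hypothetical, and whether superduperdb passes this URI through unchanged is an assumption that should be verified.

```python
from urllib.parse import quote, urlencode


def snowflake_uri(user, password, account, database, schema, warehouse, role):
    """Build a snowflake-sqlalchemy style connection URI (hypothetical helper)."""
    # Percent-encode credentials so characters like '@' or '/' do not break the URI.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    # Warehouse and role are passed as query parameters.
    query = urlencode({"warehouse": warehouse, "role": role})
    return f"snowflake://{credentials}@{account}/{database}/{schema}?{query}"


# Placeholder values for illustration only.
print(snowflake_uri("my_user", "my_password", "my_account",
                    "MY_DB", "PUBLIC", "MY_WH", "MY_ROLE"))
```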

Jun 05 '24 11:06 fnikolai

Broken links

We need a smart way to detect broken links.
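
One possible approach, sketched with the standard library only: extract every absolute link from the Markdown sources and send a HEAD request to each. All names here are hypothetical, the regex only covers inline `[text](http...)` links (not relative links or reference-style links), and some servers reject HEAD requests, so a real checker would need a GET fallback.

```python
import re
from urllib.request import Request, urlopen

# Matches inline Markdown links with an absolute http(s) target.
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def extract_links(markdown: str) -> list[str]:
    """Return the unique absolute URLs found in a Markdown string, sorted."""
    return sorted(set(LINK_RE.findall(markdown)))


def is_broken(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request fails or returns an error status."""
    request = Request(url, method="HEAD", headers={"User-Agent": "docs-link-check"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status >= 400
    except OSError:
        # URLError, HTTPError and timeouts are all OSError subclasses.
        return True


def find_broken(markdown: str, check=is_broken) -> list[str]:
    """Return the links in `markdown` for which `check` reports a failure."""
    return [url for url in extract_links(markdown) if check(url)]
```

Running `find_broken` over each docs page in CI would flag dead links before they are published; the `check` argument makes it possible to swap in a stub for testing without network access.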

Others

python -m superduperdb config A commit removed the CLI for config; we should bring it back.
The right-hand sidebar of this page has been disrupted by outputs.
Apply Supplement the usage of db.load(uuid=xxxx)
Model add a link to custom models.
Listener Add a real-world example that introduces the key parameter.
VectorIndex
- Explain the differences and usage of A and B, and what problems they solve.
- Add a detailed explanation of the entire process behind the creation of the VectorIndex class, so that users can understand what exactly is happening.
DataType Supplement lazy_file in INFO
Schema
- Add an example for encode_data = Document(data).encode(schema=schema) and decode_data = Document.decode(encode_data, schema=schema), and show what the data looks like.
Table Update the note about Table for MongoDB; Table can now be used with MongoDB.
Dataset update the new message about the parameter pin
Template
- Show the data generated from template.export
- Introduce how to share and reuse the template
Setting up tables and encodings
- typo db['my-table'].insert[_many](data).execute()
Working with and inserting large pieces of data
- Deprecated method from superduperdb.backends.mongodb import Collection
- The error caused by the default of the info parameter
- .unpack(db): the unpack function no longer has a db parameter
Working with external data sources
Vector-search Introduce how to use pre-like and post-like, or link to vector_search_algorithm
Datalayer
- remove the dask things and add ray things
- remove the db.add and use db.apply
- Add db.load key method
- remove the db.predict, db.validate key methods
Command line interface
- Did not create a runnable command line for superduperdb; currently it is python3 -m superduperdb.
YAML/Json
Fix new changes
- @objectmodel -> @model
- predict_one -> predict

Jun 14 '24 03:06 jieguangzhou
