[DOCS0-2] Fill out missing details in docs and check consistency
Read through structure, give live feedback, suggestions, and push improvements.
- [ ] Getting started
- [ ] Core API
- [ ] Apply API
- [ ] Execute API
- [ ] Models
- [ ] Data integrations
- [ ] MongoDB
- [ ] SQL
- [ ] AI Integrations
- [ ] Anthropic
- [ ] Cohere
- [ ] OpenAI
- [ ] Jina
- [ ] Scikit-learn
- [ ] Torch
- [ ] Transformers
- [ ] vLLM
- [ ] LlamaCpp
- [ ] Fundamentals
- [ ] Production features
For each AI integration:
- How to instantiate
- (How to train)
The Oracle tab is wrong. It refers to MSSQL.
https://docs.superduperdb.com/docs/docs/reusable_snippets/connect_to_superduperdb
Get useful sample data can be renamed to Fetch Dataset.
The tabs on Compute features have some spaces at the beginning.
Data integrations seems redundant and misleading. If we want to keep it, it should at least be similar to reusable snippets.
Create Vector-Index seems to have an issue with the tabs.
The SQL statement on Perform a vector search seems a bit odd:
select = query_table_or_collection.like(item, vector_index=vector_index_name, n=10).limit(10)
Why do we need `limit(10)` when we already have `n=10`?
Connecting Listeners is empty.
On Postgresql in Connect to SuperDuperDB, the user and password should both be superduper, to match the credentials of the Docker databases.
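A minimal sketch of what the corrected Postgres snippet could look like, mirroring the Mongo snippet in this issue. Only the user and password come from the note above; the host, port, and database name are assumptions and should be checked against the Docker setup.

```python
# User and password as requested above; everything else is an assumed placeholder.
user = 'superduper'
password = 'superduper'
port = 5432            # assumption: default PostgreSQL port
host = 'localhost'     # assumption
database = 'test_db'   # assumption: same name as in the Mongo snippet

# Build the connection URI as a plain string; nothing is connected here.
postgres_uri = f"postgresql://{user}:{password}@{host}:{port}/{database}"
print(postgres_uri)
```

The resulting string would presumably be passed to `superduper(...)` in the same way as the Mongo URI.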
Change Mongo Connection to:
from superduperdb import superduper
user = 'superduper'
password = 'superduper'
port = 27017
host = 'localhost'
database = 'test_db'
db = superduper(f"mongodb://{user}:{password}@{host}:{port}/{database}")
Also explain how to do pre-filtering and post-filtering of the data.
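To make the request concrete, here is a library-free sketch of the difference between the two strategies. It does not use the superduperdb API; the helper names (`pre_filter_search`, `post_filter_search`) and the toy records are hypothetical, chosen only to illustrate the idea the docs should explain.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(records, query, n):
    """Return the n records most similar to the query vector."""
    return sorted(records, key=lambda r: cosine(r["vector"], query), reverse=True)[:n]


def pre_filter_search(records, query, predicate, n):
    # Pre-filtering: apply the filter first, then rank only the survivors.
    return top_n([r for r in records if predicate(r)], query, n)


def post_filter_search(records, query, predicate, n):
    # Post-filtering: rank everything first, then drop results failing the filter.
    return [r for r in top_n(records, query, n) if predicate(r)]


# Hypothetical toy data.
records = [
    {"id": 1, "brand": "a", "vector": [1.0, 0.0]},
    {"id": 2, "brand": "b", "vector": [0.9, 0.1]},
    {"id": 3, "brand": "a", "vector": [0.0, 1.0]},
]
query = [1.0, 0.0]


def is_brand_a(record):
    return record["brand"] == "a"


pre = pre_filter_search(records, query, is_brand_a, n=2)
post = post_filter_search(records, query, is_brand_a, n=2)
print([r["id"] for r in pre], [r["id"] for r in post])
```

In this sketch, post-filtering can return fewer than `n` results because the filter runs after the top-`n` cut, while pre-filtering always ranks within the filtered subset. The docs should state which behaviour each query form has.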
In general, any question that has been raised on Slack/Issues should be answered somewhere in the docs.
from superduperdb import dtype
is shown everywhere on Get Useful Samples, but it raises
ImportError: cannot import name 'dtype' from 'superduperdb' (/home/superduper/superduperdb/superduperdb/__init__.py)
Replace it with:
from superduperdb.backends.ibis.field_types import dtype
curl -O s3://superduperdb-public-demo/images.zip
returns curl: (1) Unsupported protocol or curl: (1) Protocol "s3" not supported or disabled in libcurl.
Replace it with https://superduperdb-public-demo.s3.amazonaws.com/images.zip
The same for video and audio.
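The rewrite from the `s3://` form to the HTTPS form is mechanical, so it can be sketched as a small helper. The function name `s3_to_https` is hypothetical, and the sketch assumes a public bucket reachable through the virtual-hosted-style `s3.amazonaws.com` endpoint, as in the replacement URL above.

```python
from urllib.parse import urlparse


def s3_to_https(s3_url: str) -> str:
    """Convert s3://bucket/key into https://bucket.s3.amazonaws.com/key."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {s3_url!r}")
    # The bucket is the netloc; the object key is the path (with its leading slash).
    return f"https://{parsed.netloc}.s3.amazonaws.com{parsed.path}"


url = s3_to_https("s3://superduperdb-public-demo/images.zip")
print(url)
```

The same conversion would apply to the video and audio archives, whose exact file names are not listed here.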
On Multimodal Vector Search, in Define the embedding model datatype, SQL and Mongo are swapped compared to the others.
On Multimodal Vector Search, the Image chunker is missing and therefore the Listener cannot be used.
On Multimodal Vector Search, when choosing Text chunker, the listener blocks forever.
Add some comments on the Rest Config
This will make it easier for someone to understand what this is about.
https://github.com/SuperDuperDB/superduperdb/blob/main/deploy/rest/config.yaml
Following this discussion: https://github.com/SuperDuperDB/superduperdb/issues/2122
we need:
- Add to the FAQ that if users experience a dependency issue, they should uninstall the previous superduperdb installation.
- Add instructions on how to install the framework directly from the repo.
On Connect to SuperDuperDB, for Snowflake, add instructions for using the role and the database or warehouse within the connection URI.
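As a starting point for those instructions, the sketch below builds a URI in the layout used by the Snowflake SQLAlchemy dialect (`snowflake://user:password@account/database/schema?warehouse=...&role=...`). The helper name `snowflake_uri` and all values are hypothetical placeholders, and whether the SuperDuperDB connector accepts every one of these parameters is an assumption to verify.

```python
from urllib.parse import quote, urlencode


def snowflake_uri(user, password, account, database, schema, warehouse=None, role=None):
    """Build a Snowflake connection URI with optional warehouse and role."""
    # Percent-encode credentials so special characters do not break the URI.
    base = (
        f"snowflake://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{account}/{database}/{schema}"
    )
    # Only include the query parameters that were actually provided.
    params = {k: v for k, v in (("warehouse", warehouse), ("role", role)) if v}
    return f"{base}?{urlencode(params)}" if params else base


# All values below are placeholders, not real credentials.
uri = snowflake_uri(
    "my_user", "my_password", "my_account", "MY_DB", "PUBLIC",
    warehouse="MY_WH", role="MY_ROLE",
)
print(uri)
```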
Broken links
We need a smart way to detect broken links.
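One possible approach, sketched with the standard library only: extract absolute links from the Markdown sources and probe each with a HEAD request. The function names and the sample text are hypothetical, and a real checker would also need to handle relative links, anchors, and servers that reject HEAD requests.

```python
import re
import urllib.error
import urllib.request

# Matches inline Markdown links with an absolute http(s) target.
LINK_RE = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def extract_links(markdown_text):
    """Return all absolute http(s) link targets found in the text."""
    return LINK_RE.findall(markdown_text)


def is_broken(url, timeout=10):
    """Return True if the URL cannot be fetched or answers with an error status."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-checker"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status >= 400
    except (urllib.error.URLError, TimeoutError, ValueError):
        # HTTPError is a subclass of URLError, so 4xx/5xx responses land here too.
        return True


sample = (
    "See [docs](https://docs.superduperdb.com/docs/docs/reusable_snippets/"
    "connect_to_superduperdb) and [local](./intro.md)."
)
links = extract_links(sample)
print(links)
```

Running `is_broken` over the extracted links in CI would flag dead URLs before a release; the network call is left out of the sample run on purpose.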
Others
- `python -m superduperdb config`: a commit removed the CLI for config; it should be brought back.
- The right-hand sidebar of this page has been disrupted by outputs.
- Apply: Supplement the usage of `db.load(uuid=xxxx)`.
- Model: Add a link to custom models.
- Listener: Supplement with a real-case introduction of the `key` parameter.
-
  - Explain the differences and usage of A and B, and what problems they solve.
  - Add a detailed explanation of the entire process behind the creation of the VectorIndex class, so that users can understand what exactly is happening.
- DataType: Supplement `lazy_file` in INFO.
-
  - Add an example for `encode_data = Document(data).encode(schema=schema)` and `decode_data = Document.decode(encode_data, schema=schema)`, and show what the data looks like.
- Table: Update with the new information about `Table` for MongoDB; we can use `Table` in MongoDB now.
- Dataset: Update with the new information about the parameter `pin`.
-
  - Show the data generated from `template.export`.
  - Introduce how to share and reuse the template.
- Setting up tables and encodings
  - Typo: `db['my-table'].insert[_many](data).execute()`
- Working with and inserting large pieces of data
  - Deprecated method: `from superduperdb.backends.mongodb import Collection`
  - An error is caused by the default value of the `info` parameter.
  - `.unpack(db)`: the `unpack` function does not have a `db` parameter now.
- Vector-search: Introduce how to use pre-like and post-like, or link to vector_search_algorithm.
-
  - Remove the `dask` things and add `ray` things.
  - Remove `db.add` and use `db.apply`.
  - Add the `db.load` key method.
  - Remove the `db.predict` and `db.validate` key methods.
-
  - There is no runnable command line for `superduperdb`; currently it is `python3 -m superduperdb`.
- Fix new changes
  - `@objectmodel` -> `@model`
  - `predict_one` -> `predict`