[DOCS0-2] Fill out missing details in docs and check consistency

Open blythed opened this issue 10 months ago • 9 comments

Read through structure, give live feedback, suggestions, and push improvements.

  • [ ] Getting started
  • [ ] Core API
  • [ ] Apply API
  • [ ] Execute API
  • [ ] Models
  • [ ] Data integrations
    • [ ] MongoDB
    • [ ] SQL
  • [ ] AI Integrations
    • [ ] Anthropic
    • [ ] Cohere
    • [ ] OpenAI
    • [ ] Jina
    • [ ] Scikit-learn
    • [ ] Torch
    • [ ] Transformers
    • [ ] vLLM
    • [ ] LlamaCpp
  • [ ] Fundamentals
  • [ ] Production features

For each AI integration:

  • How to instantiate
  • (How to train)

blythed avatar Apr 05 '24 09:04 blythed

The Oracle tab is wrong. It refers to MSSQL.

https://docs.superduperdb.com/docs/docs/reusable_snippets/connect_to_superduperdb

fnikolai avatar May 14 '24 11:05 fnikolai

The Get useful sample data can be renamed to Fetch Dataset

fnikolai avatar May 14 '24 11:05 fnikolai

Tabs on Compute features have some spaces at the beginning

fnikolai avatar May 14 '24 11:05 fnikolai

data integrations seems redundant and misleading.

If we want to keep it, then at least it should be similar to reusable snippets

fnikolai avatar May 14 '24 11:05 fnikolai

Create Vector-Index seems to have an issue with the tabs

fnikolai avatar May 14 '24 11:05 fnikolai

The SQL statement on Perform a vector search seems a bit odd.

select = query_table_or_collection.like(item, vector_index=vector_index_name, n=10).limit(10)        

Why do we need limit(10) when we have n=10 ?

fnikolai avatar May 14 '24 11:05 fnikolai

Connecting Listeners is empty

fnikolai avatar May 14 '24 11:05 fnikolai

On Postgresql of Connect to SuperDuperDB the user and password should be superduper in order to be in compliance with the credentials of the Docker databases.
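
A minimal sketch of what the PostgreSQL tab could look like with these credentials. The host, the port (5432, the PostgreSQL default), and the database name test_db are assumptions and not confirmed values from the Docker setup. The superduper(...) call is left commented out so the snippet runs without a live database.

```python
# Sketch only: host, port, and database are assumed values.
user = 'superduper'
password = 'superduper'
port = 5432  # PostgreSQL default port; assumed to match the Docker database
host = 'localhost'
database = 'test_db'

uri = f"postgresql://{user}:{password}@{host}:{port}/{database}"
print(uri)

# With a running database, the connection would then be:
# from superduperdb import superduper
# db = superduper(uri)
```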

fnikolai avatar May 14 '24 12:05 fnikolai

Change Mongo Connection to:

from superduperdb import superduper

user = 'superduper'
password = 'superduper'
port = 27017
host = 'localhost'
database = 'test_db'

db = superduper(f"mongodb://{user}:{password}@{host}:{port}/{database}")

fnikolai avatar May 14 '24 13:05 fnikolai

Also explain how to do pre-filtering and post-filtering of the data.

In general, any question that has been raised on Slack/Issues should be answered somewhere in the docs.

fnikolai avatar May 16 '24 14:05 fnikolai

from superduperdb import dtype is shown everywhere on Get Useful Samples but it returns

ImportError: cannot import name 'dtype' from 'superduperdb' (/home/superduper/superduperdb/superduperdb/__init__.py)

replace it with:

from superduperdb.backends.ibis.field_types import dtype

fnikolai avatar May 17 '24 01:05 fnikolai

curl -O s3://superduperdb-public-demo/images.zip returns curl: (1) Unsupported protocol or curl: (1) Protocol "s3" not supported or disabled in libcurl.

You need to replace it with https://superduperdb-public-demo.s3.amazonaws.com/images.zip

The same for video and audio.
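
To apply the same fix to every sample link in one pass, something like the following could be used. It is a sketch that assumes the buckets are public and reachable under the default virtual-hosted S3 endpoint; the helper name s3_to_https is hypothetical.

```python
from urllib.parse import urlparse


def s3_to_https(s3_url: str) -> str:
    """Convert s3://bucket/key into the bucket's virtual-hosted HTTPS URL."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {s3_url!r}")
    # Assumes a public bucket served from the default amazonaws.com endpoint.
    return f"https://{parsed.netloc}.s3.amazonaws.com{parsed.path}"


print(s3_to_https("s3://superduperdb-public-demo/images.zip"))
```

The resulting URL can be passed to curl -O unchanged.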

fnikolai avatar May 17 '24 01:05 fnikolai

On Multimodal Vector Search in Define the embedding model datatype the SQL and Mongo are swapped compared to others.

fnikolai avatar May 17 '24 01:05 fnikolai

On Multimodal Vector Search the Image chunker is missing and therefore the Listener cannot be used.

fnikolai avatar May 17 '24 02:05 fnikolai

On Multimodal Vector Search, when choosing the Text chunker, the listener blocks forever.

fnikolai avatar May 17 '24 02:05 fnikolai

Add some comments on the Rest Config

This will make it easier for someone to understand what this is about.

https://github.com/SuperDuperDB/superduperdb/blob/main/deploy/rest/config.yaml

fnikolai avatar May 30 '24 20:05 fnikolai

Following this discussion: https://github.com/SuperDuperDB/superduperdb/issues/2122

we need:

  1. Add to the FAQ that if users experience a dependency issue, they should uninstall the previous superduperdb installation.

  2. Add instructions on how to install the framework directly from the repo.

fnikolai avatar May 30 '24 22:05 fnikolai

On Connect to SuperDuperDB, for Snowflake, add the instructions to use the role and the database or warehouse within the connection URI
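
A sketch of how such a URI could be assembled, following the snowflake-sqlalchemy URI convention (warehouse and role as query parameters). Every value below is a placeholder, and whether superduper accepts this exact form should be verified before it goes into the docs.

```python
from urllib.parse import quote, urlencode

# All values are placeholders, not real credentials.
user = "my_user"
password = "my_password"
account = "my_account"
database = "my_database"
schema = "my_schema"
params = {"warehouse": "my_warehouse", "role": "my_role"}

# Percent-encode user and password so special characters do not break the URI.
uri = (
    f"snowflake://{quote(user, safe='')}:{quote(password, safe='')}@{account}"
    f"/{database}/{schema}?{urlencode(params)}"
)
print(uri)
```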

fnikolai avatar Jun 05 '24 11:06 fnikolai

Broken links

We need a smart way to detect broken links.
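
One possible starting point is a small script that pulls every http(s) URL out of the docs sources and probes it. The sketch below uses only the standard library; the function names are hypothetical, the regular expression is a rough approximation, and some servers reject HEAD requests, so a real checker would likely need a GET fallback and retries.

```python
import re
import urllib.error
import urllib.request

# Rough pattern: a URL ends at whitespace or a closing bracket/quote.
LINK_RE = re.compile(r"https?://[^\s)>\]\"']+")


def extract_links(text):
    """Return the unique http(s) URLs in text, in order of first appearance."""
    seen = []
    for url in LINK_RE.findall(text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def is_broken(url, timeout=10.0):
    """Return True if the URL cannot be fetched or answers with an HTTP error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status >= 400
    except (urllib.error.URLError, TimeoutError, ValueError):
        return True


sample = (
    "See [config](https://github.com/SuperDuperDB/superduperdb/blob/main/deploy/rest/config.yaml) "
    "and https://docs.superduperdb.com/docs/docs/reusable_snippets/connect_to_superduperdb."
)
print(extract_links(sample))
```

Running is_broken over the extracted links in CI would flag dead URLs on each docs build.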

Others

  • python -m superduperdb config A commit removed the CLI command for config; it should be restored.

  • The right-hand sidebar of this page has been disrupted by outputs.

  • Apply Supplement the usage of db.load(uuid=xxxx)

  • Model add a link to custom models.

  • Listener Add a real-world example that introduces the key parameter.

  • VectorIndex

    • Explain the differences and usage of A and B, and what problems they solve.
    • Add a detailed explanation of the entire process behind the creation of the VectorIndex class, so that users can understand what exactly is happening.
  • DataType Supplement lazy_file in INFO

  • Schema

    • Add an example for encode_data = Document(data).encode(schema=schema) and decode_data = Document.decode(encode_data, schema=schema), and show what the data looks like.
  • Table Update the note about Table for MongoDB: Table can now be used with MongoDB.

  • Dataset Update the note about the pin parameter.

  • Template

    • Show the data generated from template.export
    • Introduce how to share and reuse the template
  • Setting up tables and encodings

    • typo db['my-table'].insert[_many](data).execute()
  • Working with and inserting large pieces of data

    • Deprecated method from superduperdb.backends.mongodb import Collection
    • The error caused by the default value of the info parameter
    • .unpack(db): the unpack function no longer has a db parameter
  • Working with external data sources

  • Vector-search Introduce how to use pre-like and post-like, or link to vector_search_algorithm

  • Datalayer

    • remove the Dask content and add Ray content
    • remove the db.add and use db.apply
    • Add db.load key method
    • remove the db.predict, db.validate key methods
  • Command line interface

    • There is no standalone superduperdb command; currently the CLI must be run as python3 -m superduperdb.
  • YAML/Json

  • Fix new changes

    • @objectmodel -> @model
    • predict_one -> predict

jieguangzhou avatar Jun 14 '24 03:06 jieguangzhou