
[FEATURE] Support more connectors by leveraging available open source

Open stevereiner opened this issue 7 months ago • 0 comments

What is the use case?

Needing more sources and targets sooner. Save development time.

Describe the solution you'd like

Use connectors from existing open source projects.

From query-in-place projects (these have open source Python connector code):
- Swirl: content-focused sources
- MindsDB: data-focused sources (would think less important)

Many sources and targets also have Python APIs / SDKs available separately.

Sources:
- Box, SharePoint, Dropbox, Nuxeo, Alfresco (me todo), CMIS, Adobe AEM
- AWS boto3 (including S3), Microsoft azure-storage-blob, Google Cloud Client Libraries
- MongoDB via pymongo (as a doc content source, not a database)
- PostgreSQL via psycopg2, SQLAlchemy (would think database sources are less needed)

Targets:
- OpenSearch, ElasticSearch (these are important)
- Neo4j (neo4j, graphdatascience, neo4j-graphrag)
- Weaviate, Qdrant, Pinecone (pinecone package)
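To make the "wrap an existing SDK" idea concrete, here is a minimal sketch of how a thin adapter layer could look. Everything in it is hypothetical: `SourceDoc`, `SourceAdapter`, `CallableSourceAdapter`, and `iter_docs` are illustration names, not CocoIndex APIs, and the in-memory dict only stands in for a real SDK client (boto3, pymongo, etc.).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Protocol


@dataclass(frozen=True)
class SourceDoc:
    """One document pulled from a source (hypothetical shape)."""
    key: str
    content: bytes


class SourceAdapter(Protocol):
    """Hypothetical minimal contract a wrapped SDK would need to satisfy."""

    def list_keys(self) -> Iterable[str]: ...

    def read(self, key: str) -> bytes: ...


class CallableSourceAdapter:
    """Adapts any SDK that exposes a 'list' call and a 'fetch' call."""

    def __init__(
        self,
        list_fn: Callable[[], Iterable[str]],
        read_fn: Callable[[str], bytes],
    ) -> None:
        self._list_fn = list_fn
        self._read_fn = read_fn

    def list_keys(self) -> Iterable[str]:
        return self._list_fn()

    def read(self, key: str) -> bytes:
        return self._read_fn(key)


def iter_docs(adapter: SourceAdapter) -> Iterator[SourceDoc]:
    """Walk every key the adapter reports and yield its content."""
    for key in adapter.list_keys():
        yield SourceDoc(key=key, content=adapter.read(key))


# Stand-in for a real SDK client: a dict acting as a tiny object store.
fake_store = {"a.txt": b"hello", "b.txt": b"world"}
adapter = CallableSourceAdapter(
    list_fn=lambda: sorted(fake_store),
    read_fn=fake_store.__getitem__,
)
docs = list(iter_docs(adapter))
```

With a shape like this, each SDK in the lists above would only need two small functions (list and read) to become a source; targets could follow the same pattern with a write call.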

Additional context

I was thinking that unstructured.io, which has a ton of source and target connectors, would have some in the open source tier. But these are only for the enterprise UI and API.

Enterprise:
- https://unstructured.io/enterprise has a picture of all of them under "World Class Transformation and Orchestration ETL"
- https://docs.unstructured.io/ingestion/source-connectors/overview
- https://docs.unstructured.io/ingestion/destination-connectors/overview

Open Source (Apache 2.0):
- The open source tier supports many file formats: https://docs.unstructured.io/open-source/introduction/supported-file-types
- The open source tier covers data workflows for LLMs: https://docs.unstructured.io/open-source/introduction/overview
- Python APIs for partitioning, cleaning, extracting, staging, chunking, embedding
- Unstructured GitHub


❤️ Contributors, please refer to the 📙Contributing Guide. Unless the PR can be sent immediately (e.g. just a few lines of code), we recommend leaving a comment on the issue, such as "I'm working on it" or "Can I work on this issue?", to avoid duplicating work. Our Discord server is always open and friendly.

stevereiner, Jun 10 '25 03:06