Feature: Search capability
Pertaining WasmEdge #3504
This PR aims to add search capabilities to the RAG API Server. This functionality will be enabled through the search feature. It will also leverage the llamaedge-query-server to deliver nuanced search queries and provide the RAG API Server something to fall back on in case the RAG fails due to no RAG query matches.
Hello, I am a code review agent on flows.network. Here are my reviews of changed source code files in this PR.
Cargo.toml
Potential issues
-
Issue: Version locking for all dependencies is not recommended as it can lead to incompatibilities and missed out on security updates. Consider using a version range or the latest compatible version for most dependencies, except for those you have explicitly patched (e.g., "socket2", "reqwest", "hyper", "tokio").
-
Issue: The "llama-core" dependency is pinned to a specific version ("=0.17.0") and has the "logging" feature enabled, which might not be necessary or desirable in all deployment scenarios. Consider using a newer compatible version of "llama-core" if available, and make the usage of the "logging" feature optional or configurable based on the deployment environment.
Summary of changes
-
Addition of the "search" feature: The main change is the addition of a new optional feature named "search". This may potentially allow for improved search functionality, but without more context or information about its purpose and implementation, it's hard to provide a detailed explanation.
-
Updated Git branch reference for 'tokio': The git URL and branch used for the tokio dependency has been updated from an unspecified version to the "v1.3" branch of the "second-state/wasi_tokio" repository.
-
No other major changes were detected in the patch, such as removal or modification of existing features.
src/backend/ggml.rs
Potential issues
Based on the given Rust code, I have identified three major coding issues that need to be addressed:
-
Error Handling in
models_handlerfunction: The error handling for serializing the model list result is not wrapped within a match statement, which could lead to unhandled panics if the serialization fails. -
Error Handling in
embeddings_handlerandrag_query_handlerfunctions: These two functions have similar issues where error handling for creating HTTP response objects is not properly handled inside a match statement. This can cause unhandled panics if the response creation fails. -
Error Handling in
files_handlerfunction: The error handling for reading buffer from request body usingto_bytesfunction and converting it intoMultipartobject is not wrapped within a match statement, which could lead to unhandled panics if any of these operations fails.
Summary of changes
Key Changes:
-
Added Conditional Compilation (#cfg) for "search" feature to selectively include/exclude sections of code based on the feature flag during compilation. This is likely a new functionality that can perform web searches when enabled.
-
Introduced
web_search_allowedvariable under the "search" feature which will be set to true when no points are retrieved, thus enabling web search. -
Added logic to send a request to an external LlamaEdge query server for web searches when
web_search_allowedis true and the "search" feature flag is enabled. This includes parsing query, building requests, making HTTP requests, handling responses, error checking, and injecting search summary into conversation context if the decision from query server is successful.
src/main.rs
Potential issues
Based on a review of the provided source code for an LlamaEdge-RAG API Server, I've identified three major coding issues:
-
Error Handling and Logging Issue (Line 286): The error message being logged when parsing the socket address may not be clear to developers or users. It would be beneficial to provide a more descriptive error message that suggests potential solutions, such as "Invalid socket address format. Please use the IP:PORT syntax."
-
Validation Missing for URL (Line 154): There is no validation check in place for the Qdrant REST API URL provided by the user through the command line arguments. Adding a URL validity check could prevent unnecessary errors and improve the overall reliability of the server.
-
Server Configuration Issue (Line 208-214): The application does not validate whether the prompt template for chat models supports system messages while the RAG policy is set to "system_message." This can lead to runtime errors, especially if developers or users use chat models that do not support system messages in their templates. Adding a validation check here would ensure consistency and prevent unexpected behavior at runtime.
Summary of changes
Key Changes:
-
Added a new module
utilsand imported it. This suggests that the project might include additional utility functions or types. -
Introduced a new feature flag "search". Under this flag, added imports for
SearchArgumentsfrom theutilsmodule and a static variableSEARCH_ARGUMENTS. These changes likely indicate the addition of search functionality to the project. -
Modified the CLI arguments by adding two new options:
api-key, used to supply an API key for the endpoint, andquery-server-url, which is a required URL for the LlamaEdge query server. Also added a new optionsearch-backendwith a default value "tavily" that requires thequery-server-url. These changes indicate the extension of functionality to include internet searches.
src/utils.rs
Potential issues
-
In the
is_valid_urlfunction, there's no error handling for when parsing a URL fails. It would be beneficial to have some way of indicating that an invalid URL was provided, such as by returning a Result instead of a boolean. -
The conversion from
LogLevel::Criticaltolog::LevelFilter::Errorin the implementation ofFrom<LogLevel> for log::LevelFiltermay not be intentional. It's worth reviewing this mapping to ensure it meets the project's requirements correctly. -
The
SearchArgumentsstruct uses a string (search_backend) to represent an enumeration of possible search backends. Using an enumeration would make the code safer and more maintainable, as it would prevent invalid values from being used at compile-time.
Summary of changes
- Addition of a new struct, "SearchArguments", which is used to handle additional parameters for internet search in the context of a feature called "search".
- Introduction of three new fields within "SearchArguments": 'api_key', 'query_server_url', and 'search_backend'. This enables support for different search backends and query servers.
- The 'gen_chat_id' function is unaffected by this change as it remains unchanged, aside from potential usage of the new "SearchArguments" struct if it's enabled with the feature flag.
@suryyyansh Please fix the CI failure. Thank you!
Hi @suryyyansh
Could you please write documentation on how to use the search API server? Thanks. https://github.com/LlamaEdge/docs
Is there a GitHub repo called search-api-server? If so, could you please send us your repo link and add README.md about the project? Thanks.
@suryyyansh Could you please fix the conflicts? Thanks a lot.
@suryyyansh Could you please rebase this PR onto the latest code. Thanks a lot!