langchainjs
langchainjs copied to clipboard
Unsafe usage of string interpolation in SQL queries
APIs based on Postgres use the node-postgres library. This library's parameterized queries work for values but not for identifiers (source). In many places table and column identifiers can be set by developers. In these cases simple string interpolation is used to build the dynamic query. In PGVectorStore for example:
await this.pool.query(`
CREATE TABLE IF NOT EXISTS ${this.tableName} (
"${this.idColumnName}" uuid NOT NULL DEFAULT uuid_generate_v4() PRIMARY KEY,
"${this.contentColumnName}" text,
"${this.metadataColumnName}" jsonb,
"${this.vectorColumnName}" vector
);
`);
node-postgres also allows for multiple statements to be executed in one query. See here, also confirmed by myself.
This means that a table name such as
yougotpwned (
id SERIAL PRIMARY KEY
);
<malicious queries here>
CREATE TABLE IF NOT EXISTS legittable
could run arbitrary queries on the database through SQL injection.
Developers creating multi-tenant applications could face this problem if they use any kind of user provided content to generate table or column names. Developers may trust that this library (langchainjs) securely sanitizes these inputs.
In 8.11.0, pg-node introduced an escapeIdentifier
utility which may be used to sanitize table and column names, although someone with more expertise in security may be able to better determine its viability. Alternatively, the pg-format library can be used, however this library has been archived since last year with no official successor fork.
🤖
Thank you for bringing this to our attention, MJDeligan. You're absolutely right about the potential SQL injection vulnerability due to the unsafe usage of string interpolation in SQL queries.
As you've mentioned, the escapeIdentifier
utility introduced in pg-node version 8.11.0 could be a potential solution. However, to provide a more accurate solution, I would need a bit more information.
Could you please provide the following details:
- The version of LangChainJS you are currently using.
- The version of node-postgres library you are using.
- Any specific areas in the codebase where you've noticed this issue, apart from the PGVectorStore example you've already provided.
This information will help me understand the context better and provide a more accurate solution.
About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.