langchain community: test file triggers antivirus scan

Checked other resources

[X] I added a very descriptive title to this issue.
[X] I searched the LangChain documentation with the integrated search.
[X] I used the GitHub search to find a similar question and didn't find it.
[X] I am sure that this is a bug in LangChain rather than my code.
[X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Not applicable

Error Message and Stack Trace (if applicable)

No response

Description

The following files trigger antivirus detections, reported under the signature Emf.Exploit.CVE_2017-3122-6335825-0:

docs/docs/integrations/document_loaders/example_data/fake.vsdx
libs/community/tests/examples/fake.vsdx

These files were added in PR https://github.com/langchain-ai/langchain/pull/16171.

Details on the scan results: https://www.virustotal.com/gui/file/3b02db67f312bfb1a0ac430673c372ec92eabfaf2888030161d7841ae2120f5f/detection

Recommendation: remove the visio/media/image2.emf entry from the fake.vsdx archive. This is the entry that triggers the detection, and the tests that use the archive do not need it.
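
The removal recommended above could be sketched roughly as follows. A .vsdx file is a ZIP-based package, so the sketch copies every entry except the flagged one into a new archive using Python's standard `zipfile` module. The helper name `strip_entry` and the output filename are illustrative assumptions, not anything from the repository. The sketch also does not check whether other parts of the package (for example relationship files) still reference the removed image, so the result should be verified against the tests that use it.

```python
import zipfile


def strip_entry(src_path, dst_path, entry_name):
    """Copy the ZIP archive at src_path to dst_path, omitting entry_name.

    Returns True if the entry was present and skipped, False otherwise.
    (Hypothetical helper for illustration only.)
    """
    removed = False
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(
        dst_path, "w", zipfile.ZIP_DEFLATED
    ) as dst:
        for info in src.infolist():
            if info.filename == entry_name:
                # Skip the entry that triggers the detection.
                removed = True
                continue
            # Re-use the original ZipInfo so names and timestamps carry over.
            dst.writestr(info, src.read(info.filename))
    return removed


# Example usage (paths are assumptions; adjust to the local checkout):
# strip_entry(
#     "libs/community/tests/examples/fake.vsdx",
#     "fake_clean.vsdx",
#     "visio/media/image2.emf",
# )
```

Writing to a separate output file keeps the original intact until the cleaned archive has been confirmed to work.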

System Info

System Information

OS: Linux
OS Version: #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 12 10:22:43 UTC 2
Python Version: 3.11.4 (main, Jul 10 2023, 09:48:51) [GCC 11.3.0]

Package Information

langchain_core: 0.1.26
langchain: 0.1.9
langchain_community: 0.0.22
langsmith: 0.1.5
langchain_experimental: 0.0.52
langchain_openai: 0.0.7

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

Apr 15 '24 08:04 snopoke

I assume there's no easy way to test whether the file actually has a malicious payload? As far as I understand, if there were a malicious payload, this would affect users trying to open the file with specific versions of Adobe? I'll remove the example file from the docs -- this feels like a false positive from the antivirus software (but likely no easy way to confirm whether the contents are malicious or not)

Apr 18 '24 19:04 eyurtsev

> I assume there's no easy way to test whether the file actually has a malicious payload? As far as I understand, if there were a malicious payload, this would affect users trying to open the file with specific versions of Adobe? I'll remove the example file from the docs -- this feels like a false positive from the antivirus software (but likely no easy way to confirm whether the contents are malicious or not)

Correct. I assume this is a false positive and not actually harmful. However, it would be better to remove the file regardless, since false positives are frustrating and could lead people to add antivirus rules that also ignore genuinely harmful files.

Apr 20 '24 08:04 snopoke

I came here to report this as well. According to virustotal, only ClamAV identified this as a threat.

I see that the file from the docs directory was removed, but there appears to still be another, with the same hash (b84af575643ac927e21073f336370945), located in the latest master branch here.
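
For anyone who wants to look for further copies, here is a minimal sketch that walks a checkout and lists files whose digest matches a given value. It assumes the 32-character value quoted above is an MD5 digest, which is an inference from its length rather than something stated in the thread. `file_digest` and `find_copies` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_copies(root, digest, algorithm="md5", pattern="*.vsdx"):
    """List files under root matching pattern whose digest equals `digest`."""
    wanted = digest.lower()
    return sorted(
        str(p)
        for p in Path(root).rglob(pattern)
        if p.is_file() and file_digest(p, algorithm) == wanted
    )


# Example usage (run from the repository root; the digest type is assumed):
# print(find_copies(".", "b84af575643ac927e21073f336370945"))
```

The same `file_digest` helper with `algorithm="sha256"` can be used to compare a local file against the SHA-256 value in the VirusTotal URL from the description.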

Apr 22 '24 21:04 tdfacer
