Directory component doesn't take file types into consideration

Open EDLLT opened this issue 1 year ago • 11 comments

Bug Description

When using the Directory component, we can manually enter the types of files to include when it scans the directory.

The component has a bug: it doesn't pass on the "types" value that the user set, so the retrieve_file_paths function falls back to TEXT_FILE_TYPES from utils.py instead.

Since the issue seems to be straightforward, I'll attempt to fix it myself and submit a PR

Reproduction

  1. Create a Directory component.

  2. Set the path to a directory containing files whose extensions aren't in utils.py (e.g. .srt files).

  3. Set the types value to srt.

  4. Build the Directory component.

Expected behavior

It should include the .srt files. However, it ignores them.

Who can help?

No response

Operating System

24.04

Langflow Version

v1.0.18

Python Version

3.12

Screenshot

No response

Flow File

No response

EDLLT avatar Oct 03 '24 18:10 EDLLT

To address the issue where the Directory component doesn't consider user-specified file types, you can modify the component to ensure it correctly passes the types parameter to the retrieve_file_paths function. Here's a brief overview of how you can implement this:

  1. Modify the DirectoryComponent Class: Ensure that the types parameter is correctly passed from the DirectoryComponent to the retrieve_file_paths function. This involves updating the load_directory method to include the types parameter when calling retrieve_file_paths.

  2. Update the retrieve_file_paths Function: Ensure that the function uses the types parameter passed from the DirectoryComponent instead of defaulting to TEXT_FILE_TYPES. This can be done by checking if the types parameter is provided and using it to filter the file paths.

Here's a simplified example of how you might adjust the DirectoryComponent:

def load_directory(self) -> list[Data]:
    path = self.path
    types = self.types or []  # Ensure types is a list
    # Other parameters...

    resolved_path = self.resolve_path(path)
    file_paths = retrieve_file_paths(resolved_path, load_hidden, recursive, depth, types)  # Pass types here

    # Rest of the code...

By ensuring the types parameter is passed and used correctly, the component should include files with extensions specified by the user, such as .srt files. This should resolve the issue you're experiencing with the Directory component ignoring user-specified file types.

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Oct 03 '24 18:10 dosubot[bot]

@dosu That's almost correct, however passing in types as is would cause the default behavior of the component to include ALL types of files because an empty list overrides the default TEXT_FILE_TYPES list found in utils.

EDLLT avatar Oct 03 '24 19:10 EDLLT

To ensure the DirectoryComponent uses the default TEXT_FILE_TYPES when the types parameter is not specified, you can modify the load_directory method to handle this correctly. Instead of passing an empty list, you should pass None or use a conditional to default to TEXT_FILE_TYPES. Here's how you can adjust it:

def load_directory(self) -> list[Data]:
    path = self.path
    # Fall back to TEXT_FILE_TYPES when types is unset or empty
    types = self.types if self.types else TEXT_FILE_TYPES
    # Other parameters...

    resolved_path = self.resolve_path(path)
    file_paths = retrieve_file_paths(resolved_path, load_hidden, recursive, depth, types)  # Pass types here

    # Rest of the code...

This way, if self.types is not provided or is an empty list, it falls back to TEXT_FILE_TYPES, so only the default text file types are included when the user specifies none, and only the user's types are included otherwise [1].
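The fallback described above can be sketched outside Langflow as a small, self-contained filter. This is only an illustration, not Langflow's actual retrieve_file_paths: the names DEFAULT_TEXT_TYPES, normalize_types, and filter_by_types are hypothetical, and the default list is a short stand-in for the real TEXT_FILE_TYPES in utils.py.

```python
from pathlib import Path

# Hypothetical stand-in for TEXT_FILE_TYPES in Langflow's utils.py;
# the real list is longer and may differ.
DEFAULT_TEXT_TYPES = ["txt", "md", "json", "csv"]


def normalize_types(user_types):
    """Return the user's extensions (lowercased, no leading dot),
    or the defaults when none were given."""
    cleaned = [t.strip().lstrip(".").lower() for t in (user_types or []) if t.strip()]
    return cleaned or DEFAULT_TEXT_TYPES


def filter_by_types(paths, user_types=None):
    """Keep only the paths whose extension is in the effective type list."""
    allowed = set(normalize_types(user_types))
    return [p for p in paths if Path(p).suffix.lstrip(".").lower() in allowed]


files = ["notes.txt", "movie.srt", "image.png"]
print(filter_by_types(files))           # no types given: falls back to the defaults
print(filter_by_types(files, []))       # an empty list also falls back
print(filter_by_types(files, ["srt"]))  # the user's choice replaces the defaults
```

The key design point is that an empty list and None are treated the same way, so an unset field never turns into "match every file".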

dosubot[bot] avatar Oct 03 '24 19:10 dosubot[bot]

@dosu

Good bot. Now, I'd like to contribute, but when attempting to make the backend from Langflow's source code, I'm getting this error:

make backend
Installing backend dependencies
make: uv: No such file or directory
make: *** [Makefile:69: install_backend] Error 127

What is uv?

EDLLT avatar Oct 03 '24 19:10 EDLLT

The error you're encountering is due to the uv command not being installed. In the Langflow project, uv is a tool used for managing dependencies and running various tasks, such as installing dependencies, running tests, and building the project.

To resolve this error, you need to install uv using pipx. You can do this by running the following command:

pipx install uv

Alternatively, you can run the setup_uv target in the Makefile, which will also install uv for you:

make setup_uv

This will ensure that uv is installed and available for use in the Makefile commands [1].

dosubot[bot] avatar Oct 03 '24 19:10 dosubot[bot]

@dosu What is this and why is it failing?

CI Logs 0s

Run echo $JOBS_JSON
  echo $JOBS_JSON
  echo $RESULTS_JSON
  echo "Exiting with $EXIT_CODE"
  exit $EXIT_CODE
  shell: /usr/bin/bash -e {0}
  env:
    JOBS_JSON: {
      "test-backend": { "result": "skipped", "outputs": {} },
      "test-frontend": { "result": "skipped", "outputs": {} },
      "lint-backend": { "result": "skipped", "outputs": {} },
      "test-docs-build": { "result": "skipped", "outputs": {} },
      "set-ci-condition": { "result": "success", "outputs": { "should-run-ci": "false" } }
    }
    RESULTS_JSON: [ "skipped", "skipped", "skipped", "skipped", "success" ]
    EXIT_CODE: [1](https://github.com/langflow-ai/langflow/actions/runs/11169601064/job/31050687782?pr=4017#step:2:1)
{
  "test-backend": { "result": "skipped", "outputs": {} },
  "test-frontend": { "result": "skipped", "outputs": {} },
  "lint-backend": { "result": "skipped", "outputs": {} },
  "test-docs-build": { "result": "skipped", "outputs": {} },
  "set-ci-condition": { "result": "success", "outputs": { "should-run-ci": "false" } }
}
[ "skipped", "skipped", "skipped", "skipped", "success" ]
Exiting with 1
Error: Process completed with exit code 1.

EDLLT avatar Oct 03 '24 21:10 EDLLT

The CI jobs are being skipped because the should-run-ci output is set to false. This output is produced by the set-ci-condition job, which decides whether CI should run. CI runs if either of the following holds:

  • The pull request must have the 'lgtm' label and not be a draft, or
  • The event triggering the workflow must be workflow_dispatch or workflow_call.

Since neither condition is met, should-run-ci is false, causing all jobs to be skipped. The process exits with code 1 because the CI workflow is configured to do so when jobs are skipped and should-run-ci is false [1].
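As a rough model of the gating logic described above, the decision can be written as two small functions. This is a sketch only: the real logic lives in the repository's GitHub Actions workflow, and the names should_run_ci and exit_code_for, along with their parameters, are hypothetical. It models only the behavior mentioned in this thread (skipped jobs while should-run-ci is false), not how the workflow treats failed jobs.

```python
def should_run_ci(event_name, labels, is_draft):
    """Return True when CI should run, per the two criteria listed above."""
    # Criterion 1: the PR carries the 'lgtm' label and is not a draft.
    approved_pr = "lgtm" in labels and not is_draft
    # Criterion 2: the workflow was triggered manually or called by another workflow.
    manual_or_called = event_name in {"workflow_dispatch", "workflow_call"}
    return approved_pr or manual_or_called


def exit_code_for(job_results, run_ci):
    """Return 1 when jobs were skipped because CI was not allowed to run, else 0."""
    if not run_ci and "skipped" in job_results:
        return 1
    return 0


# The situation from the log: a pull request without the 'lgtm' label.
results = ["skipped", "skipped", "skipped", "skipped", "success"]
run_ci = should_run_ci("pull_request", labels=[], is_draft=False)
print(run_ci, exit_code_for(results, run_ci))
```

Under this model, adding the 'lgtm' label to a non-draft pull request is what flips should_run_ci to True on the next run.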

dosubot[bot] avatar Oct 03 '24 21:10 dosubot[bot]

@dosu What does lgtm stand for? What do you mean by workflow_dispatch or workflow_call?

EDLLT avatar Oct 03 '24 21:10 EDLLT

"lgtm" stands for "Looks Good To Me." It's commonly used in pull requests to indicate approval from reviewers.

In the context of CI workflows:

  • workflow_dispatch is a GitHub Actions event that allows a workflow to be manually triggered. It can include inputs specified by users when they trigger the workflow.

  • workflow_call is an event that allows a workflow to be called by another workflow, also with the ability to pass inputs from the calling workflow.

These events provide flexibility in triggering and managing workflows based on specific conditions or inputs [1][2][3].

dosubot[bot] avatar Oct 03 '24 21:10 dosubot[bot]

@dosu So in my case, I should simply wait for a maintainer to review my PR, right?

EDLLT avatar Oct 03 '24 21:10 EDLLT

It seems like this issue requires a deeper look, @ogabrielluiz could you please lend your expertise?

dosubot[bot] avatar Oct 03 '24 21:10 dosubot[bot]