databricks-vscode
[Feature] Support ignoring syncing certain paths
I'd like to use .gitignore-style syntax to specify which files should not be included in the directory sync (I am using workspace sync in my extension settings).
Context: I have a project with a large dataset of PDF files that I never want to sync to a remote repository. Likewise, when I work on a project that generates files, I do not want them to be synced to the workspace automatically.
Syncing these files triggers other errors, including delays when executing workloads.
Both local and system-wide .gitignore files should be respected when doing a directory sync.
If you have a .gitignore file but its contents are not respected when doing a sync, could you provide more details about your setup (e.g. which directories you want to ignore, (a subset of) the .gitignore contents, their paths, etc.)?
Thanks.
I see. I am developing on a large(r) monolith whose file size exceeds the current maximum. In order for me to sync this project using the vscode extension I need to add certain files that I do not want to sync to my .gitignore file. This allows the directory sync to succeed. However, this introduces another issue: I occasionally need to develop in areas of the monolith that I would like to exclude in the directory sync. So I need to remove them from the .gitignore file or add them directly.
If I understand correctly you would need a separate mechanism from .gitignore to control which subtrees are synchronized (e.g. through inclusion/exclusion) so you stay within file size / file count limits.
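To make the inclusion/exclusion idea concrete, here is a minimal Python sketch of such a filter. It is not how the extension decides what to sync; the function name `should_sync` and the pattern rules are assumptions for illustration, and the matching is deliberately simpler than real .gitignore semantics (a trailing-slash pattern is anchored at the project root, and `*` also matches across `/`).

```python
from fnmatch import fnmatch


def should_sync(path: str, include: list[str], exclude: list[str]) -> bool:
    """Decide whether a relative, POSIX-style path should be synced.

    Hypothetical helper: a path is synced when it matches at least one
    include pattern (or no include patterns are given) and matches no
    exclude pattern.
    """

    def matches(pattern: str) -> bool:
        # A trailing slash means "everything under this root-level directory".
        if pattern.endswith("/"):
            return path.startswith(pattern)
        # Otherwise fall back to shell-style globbing from the stdlib.
        return fnmatch(path, pattern)

    included = not include or any(matches(p) for p in include)
    excluded = any(matches(p) for p in exclude)
    return included and not excluded


if __name__ == "__main__":
    files = ["src/main.py", "data/scan.pdf", "docs/readme.md"]
    # Keep everything except the data/ subtree.
    print([f for f in files if should_sync(f, [], ["data/"])])
```

Keeping the include and exclude lists separate from .gitignore would let a subtree stay tracked in git while still being left out of the sync.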
What limit do you run into specifically?
I have a similar issue with a monorepo. I have a repo-level .gitignore as well as one in my project's local folder. The local one has a single line for the folder I am trying to exclude, i.e. data/. This is a rather large folder that is currently syncing even though I would like to exclude it.
If you require any additional information, please let me know.
Thank you!
I have a similar issue. I cannot sync my repo; it fails with the error message "Client.Timeout exceeded while awaiting headers".
Are you looking for includes/excludes beyond what is specified in your .gitignore files? We already skip syncing any file that matches a pattern in any .gitignore file.
I have a monorepo with multiple gitignore files. My project gitignore file just has one item data/. This folder was being synchronized despite being gitignored.
In v2 of the extension we now use Databricks Asset Bundles, and they provide a way to ignore files.
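As a rough sketch of what that can look like, a bundle's `databricks.yml` accepts a `sync` mapping with `include` and `exclude` lists. The bundle name below is hypothetical, the `data/` folder is the one mentioned earlier in this thread, and the exact pattern syntax should be checked against the bundle documentation:

```yaml
# databricks.yml (sketch; bundle name is hypothetical)
bundle:
  name: my_project

sync:
  exclude:
    # Keep the large data folder out of the workspace sync.
    - data/**
```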