Terraform-LS 0.29.x not working properly on Linux

Open jleechpe opened this issue 2 years ago • 7 comments

Server Version

0.29.0, 0.29.1, 0.29.2


Terraform Version

1.2.8

Client Version

ArchLinux
Emacs 28.1
- LSP-Mode (latest git)
- Eglot (Latest Git)

Terraform Configuration Files

<New File>

[|] is cursor

resource "azu[|]" "" {

}

Log Output

Shows successful completion right after initializing then error on subsequent (after file is modified): https://gist.github.com/jleechpe/3c449eab5885fc9149c5fa9a28d95261

Expected Behavior

Completions to populate

Actual Behavior

Emacs LSP-mode responds with "cannot find file: main.tf".
Emacs Eglot responds with json-rpc error "main.tf: file not found".

Steps to Reproduce

LSP-Mode

  1. Open a terraform file in Emacs
  2. Connect with lsp-mode (default settings)
  3. Try to complete -> First attempt succeeds
  4. Edit the file in any way
  5. Try to complete -> LSP errors with cannot find file

Eglot

  1. Open a terraform file in Emacs
  2. Connect with eglot (terraform-ls serve -log-file ~/tmp/tflog/log)
  3. Trigger Completion -> Succeeds
  4. Edit the file
  5. Trigger Completion -> JSON RPC error "main.tf: file not found"
  6. Undo edits to file (reverts back to initial state)
  7. Trigger Completion -> Succeeds as it did in 3.

Manually downloaded terraform-ls versions 0.29.2, 0.29.1, 0.29.0 and 0.28.1 and tested each with eglot, connecting using ~/downloads/terraform-ls/<version>/terraform-ls serve -> 0.29.x all error, 0.28.1 behaves as expected.

  • Tested on Windows with Emacs 29.0.50, Terraform 1.2.9 and terraform-ls 0.29.2 -> No issues
  • Tested on Windows with VSCode -> No issues
  • Tested on Archlinux with VSCode -> Same issue as in Emacs (worked on initial test but multiple tests since have the same issue as Emacs)

I was able once or twice to get terraform-ls 0.29.2 to work in Emacs after stopping/restarting the server multiple (10+) times, but this was inconsistent and I could not determine what made it start working.

Also see: https://github.com/emacs-lsp/lsp-mode/issues/3713 (issue on MacOS as well)

Edit: VSCode does not work on subsequent tests

jleechpe avatar Sep 08 '22 15:09 jleechpe

Hi @jleechpe Thank you for the report.

I don't see how the OS could be the factor here. I am willing to guess it's more of a race condition somewhere.

The initial log you shared would suggest that it happens even on a smaller config, but I have a reason to believe that there are some other configs (which you omitted from the comment), which the LS does still have to parse. I'm not implying they are the trigger of the bug, but it makes it unfortunately harder to reproduce the bug by just following the steps you outlined.

Completion of resource types will not work without either provider block or required_providers block entry. The log implies that you had none of these in the main.tf file yet you did get completion candidates for AzureRM, hence I'm assuming the provider block or required_providers block was in a separate file somewhere in the same folder.

Can you share more about the folder, or confirm that you were able to reproduce this with a single file in any other way?

btw. if you wish to share the logs more privately, you can do so via email at radek <at> hashicorp.com and/or use my public key to encrypt it (https://keybase.io/radeksimko).

radeksimko avatar Sep 21 '22 12:09 radeksimko

cc @psibi you mentioned here

I can confirm that I can reproduce in both lsp-mode as well as VS Code.

Is there a snippet of configuration which you can share that would reliably (or even just >50% of times) lead to reproduction?

radeksimko avatar Sep 21 '22 12:09 radeksimko

Alternatively @arcsteveio would you be able to share the whole /Users/sopalenski/dev/terraform_scratch folder (any *.tf or *.tfvars or *.tf.json and *.tfvars.json), or confirm that there aren't any aside from the main.tf in your original report at https://github.com/emacs-lsp/lsp-mode/issues/3713#issue-1364904527?

radeksimko avatar Sep 21 '22 13:09 radeksimko

The only reason I'm suspecting it's OS related (although Mac + Linux) is that I use the exact same Emacs config on a Windows machine and have 0 issues with terraform-ls 0.29.x.

I did have additional provider config I forgot to include

terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "test" {
  
}

GIF of reproducible steps to trigger the error on Archlinux. These are the same steps as before, but on a freshly created user profile with the bare minimum needed to start emacs and load terraform-ls for a file. I removed all of my emacs config, my terraform config for caching providers, and anything else that might have an effect.

  1. terraform init in folder with only a main.tf with the contents above
  2. Launch emacs
  3. Install terraform-mode and lsp-mode
  4. Visit main.tf and move point to inside the resource group definition
  5. Activate lsp-mode so that terraform-ls launches
  6. Trigger completion-at-point and see that completion behaves as expected
  7. Start completing and error regarding the file shows up (but can still complete the found completion)
  8. Remove the modifications (but file is still showing as modified) and completion no longer works when calling completion-at-point

https://ibb.co/zRkHrx3

jleechpe avatar Sep 21 '22 13:09 jleechpe

Alternatively @arcsteveio would you be able to share the whole /Users/sopalenski/dev/terraform_scratch folder (any *.tf or *.tfvars or *.tf.json and *.tfvars.json), or confirm that there aren't any aside from the main.tf in your original report at emacs-lsp/lsp-mode#3713 (comment)?

Sure, I can share the main.tf file. I have no other variable files. It's 30 lines so I'll paste it in the window.

provider "aws" {
  region = "us-east-1"
}

data "aws_subnet" "mgmt" {
  id = "subnet-0d7cf1345bb5add54"
}

data "aws_security_group" "mgmt_sg" {
  id = "sg-0f548d25b5df4479b"
}

resource "aws_instance" "scratch" {
  ami                  = "ami-05874fa9befd5c838"
  subnet_id            = data.aws_subnet.mgmt.id
  iam_instance_profile = "AmazonSSMRoleForInstancesQuickSetup"
  instance_type        = "t3.micro"
  key_name             = "devops_demo_key"
  security_groups      = [data.aws_security_group.mgmt_sg.id]
  tags = {
    Name             = "some-name"
    ab_deploy_method = "terraform"
    ab_deploy_owner  = "sopalenski"
    ab_deploy_env    = "dev"
  }
}

output "instance_id" {
  value = aws_instance.scratch.id
}

Happy to share more if it will help.

arcsteveio avatar Sep 21 '22 13:09 arcsteveio

Is there a snippet of configuration which you can share that would reliably (or even just >50% of times) lead to reproduction?

I see that others have already shared terraform files, so I won't add one more example (but do let me know, I can easily create one). Note that I have been able to trivially reproduce this on my Linux system (NixOS) in any of my terraform based projects.

psibi avatar Sep 21 '22 15:09 psibi

I'm able to reproduce the issue on Linux and macOS.

The backup file created by emacs seems to trigger the error. When you open a main.tf in emacs and modify the buffer, it will create a .#main.tf file in the same directory, which seems to be a special symlink. The language server is not able to parse the module configuration when encountering such files and errors (copied from the logs you provided):

 "OpTypeParseModuleConfiguration" for {"file:///home/<user>/tmp/tftest"} (err = open /home/<user>/tmp/tftest/.#main.tf: no such file or directory, deferredJobs: [])

I cannot reproduce the issue starting with a clean directory containing only a main.tf and only using VS Code on Linux. But if there is still a .#main.tf from emacs, the language server would show the same behavior.


Taking things one step further, one can force/reproduce the error by creating an invalid symlink: ln -sf sample invalid.tf

~/tmp/issue1067 via 💠 default
❯ ls -l
total 8
lrwxr-xr-x  1 dbanck  staff    6 Sep 23 12:53 invalid.tf@ -> sample
-rw-r--r--  1 dbanck  staff  209 Sep 23 11:20 main.tf

That will break the parsing for the module and all language server requests like completion or hover will fail with the same error:

[Error - 1:06:34 PM] Request textDocument/hover failed.
  Message: main.tf: file not found
  Code: -32098

I think we should update the module parsing to ignore inaccessible files and continue parsing.

dbanck avatar Sep 23 '22 11:09 dbanck

I'm guessing the reason the issue doesn't show up on Windows is that it handles those symlinks differently under the hood?

jleechpe avatar Sep 23 '22 12:09 jleechpe

Thank you for debugging this one @dbanck ! I got caught up in this

I can confirm that I can reproduce in both lsp-mode as well as VS Code.

It appears that emacs is effectively the trigger, since it creates the symlink. Once the symlink exists, other IDEs which run the LS can pick it up and surface the same problem, but they do not create these symlinks themselves, which explains why this isn't reproducible without emacs.

For context, this is likely related to the following issues/PRs around handling of hidden files:

  • https://github.com/hashicorp/terraform-ls/pull/971
  • https://github.com/hashicorp/terraform-ls/pull/968
  • https://github.com/hashicorp/terraform-ls/issues/972

I'd like to take a moment to think (again) about how we handle hidden files and the consequences of those decisions. AFAIK people do use symlinks which lead to genuinely parsable files, so we cannot just ignore symlinks entirely. Ignoring hidden files effectively means reverting https://github.com/hashicorp/terraform-ls/issues/972

I also want to check how this affects hidden *.tfvars since these are additionally not ignored by Terraform CLI.

One possible solution would be to treat such hidden files as more isolated individual files: they would not get parsed automatically with the rest of the *.tf files in the same folder, but only when they get opened, and their AST would be maintained separately from the other files, so that e.g. duplicates aren't reported between hidden and visible files.
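A very small sketch of that idea, purely illustrative and not taken from terraform-ls: the hypothetical function `partitionByVisibility` splits file names into those parsed together as the module and those that would be handled in isolation. It assumes "hidden" simply means the name starts with a dot.

```go
package main

import (
	"fmt"
	"strings"
)

// partitionByVisibility splits file names into two groups: module holds the
// names that would be parsed together, isolated holds dot-prefixed names that
// would only be parsed on their own when opened.
func partitionByVisibility(names []string) (module, isolated []string) {
	for _, name := range names {
		if strings.HasPrefix(name, ".") {
			isolated = append(isolated, name)
			continue
		}
		module = append(module, name)
	}
	return module, isolated
}

func main() {
	module, isolated := partitionByVisibility([]string{
		"main.tf", ".#main.tf", ".hidden.tf", "variables.tf",
	})
	fmt.Println("parsed with the module:", module)
	fmt.Println("kept isolated:", isolated)
}
```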

cc @chriswacker

radeksimko avatar Sep 26 '22 08:09 radeksimko

Would disabling backup files (or writing them somewhere else) be a workaround?

john2x avatar Oct 06 '22 02:10 john2x

Would disabling backup files (or writing them somewhere else) be a workaround?

I'm not an emacs user, but if it's possible to move or disable the backup files, this should mitigate the problem.

dbanck avatar Oct 06 '22 08:10 dbanck

My default configuration does have a different location for backup files but that doesn't fix the issue:

(setq backup-directory-alist '(("" . "~/.emacs.d/emacs_backup")))

I guess I can disable the creation of backup files, but that's something I would like to avoid.

psibi avatar Oct 06 '22 08:10 psibi

My default configuration does have a different location for backup files but that doesn't fix the issue:

Can you ensure that there are no hidden files left over from emacs inside your project directory? Furthermore, I can take a look at your LS output log to see if there are any references to unreadable files.

dbanck avatar Oct 06 '22 10:10 dbanck

I've been following along with this as I have the exact same issue, but have just made a potentially useful discovery and a bit of a work around.

So, emacs (depending on your config) is potentially creating several hidden files for slightly different purposes. Let's say you have a "main.tf" file, you might get:

  a) .main.tf.~undo-tree~ - which stores undo history
  b) .#main.tf - a lock file that exists only while unsaved changes are present in the file

I successfully moved the undo tree files to a different directory, using a config snippet like @psibi mentioned above, and made sure there were no old undo-tree files lying around, but still got the same issue occurring.

I then added another bit of config like:

(setq create-lockfiles nil)

which disables the second hidden file ever being created, and now terraform ls seems happy again.

These lockfiles cannot be stored in a different place, so this seems to be the only valid workaround. I would prefer not to disable them as I find them quite useful (I use syncthing to sync certain files between machines, and these lockfiles let me know if I have a certain file already open on another machine with unsaved changes). But perhaps useful to know.

anthonyfinch avatar Oct 06 '22 10:10 anthonyfinch

Actually, updating further on this, there is now a variable in emacs 28, lock-file-name-transforms that does indeed allow you to change what this file gets called, and therefore where it lives, so you could use this to avoid these files in the working directory (see: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=31908)

anthonyfinch avatar Oct 06 '22 11:10 anthonyfinch

As a temporary workaround I have switched to 0.28.1, which I can confirm works fine. I also volunteer to test out any PRs which will help in resolving this issue.

psibi avatar Oct 13 '22 11:10 psibi

I can confirm that on emacs, (setq create-lockfiles nil) was necessary for me to avoid the error. It clearly chokes on the unexpected symbolic link matching the *.tf filename pattern.

carl-reverb avatar Oct 13 '22 13:10 carl-reverb

Did something change in version 0.29 as to how it processes files in the directory?

Emacs lsp-mode has lockfiles listed as ignored "[/\\]\.#[^/\\]+\'" (parsing out the extra backslashes for legibility) for which events are not sent to the server.

This works with 0.28 but those files are obviously being seen in 0.29.
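For anyone who wants to check which paths the ignore pattern quoted above would cover, here is a small Go sketch. The translation is my own assumption: Emacs's `\'` (end of string) is written as `\z` in Go's regexp syntax, and the names `lockfilePattern` and `isEmacsLockfile` are invented for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// lockfilePattern is a Go translation of the lsp-mode ignore pattern: a path
// separator, then ".#", then a final path component running to the end.
var lockfilePattern = regexp.MustCompile(`[/\\]\.#[^/\\]+\z`)

// isEmacsLockfile reports whether path looks like an Emacs lock file.
func isEmacsLockfile(path string) bool {
	return lockfilePattern.MatchString(path)
}

func main() {
	paths := []string{
		"/home/user/tmp/tftest/.#main.tf",
		"/home/user/tmp/tftest/main.tf",
	}
	for _, p := range paths {
		fmt.Printf("%s lockfile=%v\n", p, isEmacsLockfile(p))
	}
}
```

Note that this only filters the events the client sends. It has no effect on what the server finds when it reads the directory itself.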

jleechpe avatar Oct 13 '22 14:10 jleechpe

I also get this problem on emacs 30.0.50 built from source. Not sure why the lsp is trying to read these files, but it sure is annoying!

ryanobjc avatar Jan 31 '23 05:01 ryanobjc

emacs+eglot with terraform-lsp works fine, whereas terraform-ls 0.30.2 still crashes on startup. Disabling lockfiles didn't help.

cmdcelp avatar Feb 20 '23 23:02 cmdcelp

@cmdcelp You probably want to open a different issue, as this issue is about the mishandling of hidden files created by emacs.

Also, for me terraform-ls 0.30.2 seems to be working well with emacs + lsp-mode.

psibi avatar Feb 21 '23 06:02 psibi

Nice to hear that the fix works for you @psibi 🙂

@cmdcelp Can you share more information about the crash you're encountering? Please create a new issue for this and attach the log files. Thanks!

dbanck avatar Feb 21 '23 10:02 dbanck

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Mar 24 '23 03:03 github-actions[bot]