fatal error: runtime: out of memory when uploading multiple ISO images in parallel with `vsphere_content_library_item`

Open lapawa opened this issue 2 years ago • 15 comments

Community Guidelines

  • [X] I have read and agree to the HashiCorp Community Guidelines .
  • [X] Vote on this issue by adding a 👍 reaction to the original issue initial description to help the maintainers prioritize.
  • [X] Do not leave "+1" or other comments that do not add relevant information or questions.
  • [x] If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Terraform

v1.1.7 on linux_amd64

Terraform Provider

v2.1.1

VMware vSphere

v7.0.3

Description

Uploading multiple ISO images in parallel (for example, 5.6 GB each) to a content library fails with an out-of-memory error.

Affected Resources

resource/vsphere_content_library_item

Terraform Configuration

https://gist.github.com/lapawa/2353174f8f5488f3cdf09322cfb331b9

Debug Output

No response

Panic Output

https://gist.github.com/lapawa/b9ffa0fe2fd872a9ea8c2d05008b585a

Expected Behavior

Completing apply with successfully uploaded items into the vSphere Content Library.

Actual Behavior

Content library and items were created but the content has 0 bytes and Terraform panics.

Steps to Reproduce

Create a vsphere_content_library resource and a vsphere_content_library_item resource with an ISO image and an HTTP URL as the content source.

Environment Details

No response

Screenshots

No response

References

No response

lapawa avatar Mar 29 '22 11:03 lapawa

Hello, lapawa! 🖐

Thank you for submitting an issue for this provider. The issue will now enter into the issue lifecycle.

If you want to contribute to this project, please review the contributing guidelines and information on submitting pull requests.

github-actions[bot] avatar Mar 29 '22 11:03 github-actions[bot]

Hi @lapawa, 👋

Please update the Terraform Configuration in the issue with the redacted configuration(s) for reproduction.

Ryan Johnson Staff II Solutions Architect | VMware, Inc.

tenthirtyam avatar Mar 29 '22 18:03 tenthirtyam

I've uploaded the configuration and added the gist link to the issue's description.

lapawa avatar Mar 31 '22 06:03 lapawa

Hi @lapawa - does the issue also present itself if you modify the API timeout in the provider block to increase the default? Is the issue consistent and repeatable?

Ryan

tenthirtyam avatar Apr 19 '22 04:04 tenthirtyam

It's the same with an api_timeout of 60 on the provider. It still crashes with: fatal error: runtime: out of memory and Error: The terraform-provider-vsphere plugin crashed!

And yes, it is repeatable. It only works when I upload a tiny ISO image like Debian netinstall.

lapawa avatar Apr 29 '22 11:04 lapawa

Thanks, @lapawa. I'll take a look and see if I can reproduce the error when time and prioritization permit.

Ryan Johnson Staff II Solutions Architect | VMware, Inc.

tenthirtyam avatar Apr 29 '22 15:04 tenthirtyam

I've taken quite some time to look at this issue, but have not been able to reproduce the exact same issue.

Rather, in my testing, I'm seeing that large or multiple remote (HTTP / HTTPS) .iso files are failing with a 500 Internal Server Error HTTP status code.

I've even modified the provider's CreateLibraryItem function (based on GH-1665) with the following snippets to call the desired function for remote .iso files:

func CreateLibraryItem(c *rest.Client, l *library.Library, name string, desc string, t string, file string, moid string) (*string, error) {
	// ...
	switch {
	case isLocal && isOva:
		return &id, uploadSession.deployLocalOva(file, ovfDescriptor)
	case isLocal && !isOva && !isIso:
		return &id, uploadSession.deployLocalOvf(file, ovfDescriptor)
	case isLocal && isIso:
		return &id, uploadSession.deployLocalIso(file)
	case !isLocal && isOva:
		return &id, uploadSession.deployRemoteOva(file, ovfDescriptor)
	case !isLocal && !isOva && !isIso: // <----- Updated
		return &id, uploadSession.deployRemoteOvf(file)
	case !isLocal && isIso: // <----- Added
		return &id, uploadSession.deployRemoteIso(file) // <----- Added
	}
	// ...
}

and

func (uploadSession *libraryUploadSession) deployRemoteIso(file string) error {
	ctx := context.TODO()
	_, err := uploadSession.ContentLibraryManager.AddLibraryItemFileFromURI(ctx, uploadSession.UploadSession, filepath.Base(file), file)
	if err != nil {
		return err
	}
	return uploadSession.ContentLibraryManager.WaitOnLibraryItemUpdateSession(ctx, uploadSession.UploadSession, time.Second*10, func() { log.Printf("Waiting...") })
}

This would be similar to:

govc library.import iso-test https://releases.ubuntu.com/jammy/ubuntu-22.04-live-server-amd64.iso

However, even though AddLibraryItemFileFromURI from vmware/govmomi is being used, the .iso files still fail with the following in the vSphere client:

Reason: Unable to update files in the library item. The source or destination may be slow or not responding.

It seems to be related to the size of the .iso file, the number of files being uploaded to the content library, or both. It does not seem to be related to HTTPS and SSL acceptance.

Labeling this issue with the community/contribution label in the event anyone has additional thoughts on how to enhance vsphere/internal/helper/contentlibrary/content_library_helper.go to help resolve this issue.
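As a starting point for anyone picking this up, the dispatch in the CreateLibraryItem switch above can be sketched in isolation. This is only an illustration: the function name chooseDeploy is hypothetical, and the way isLocal, isOva, and isIso are derived here (URL scheme and file extension) is an assumption, since the snippet above elides how the provider computes them.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
	"strings"
)

// chooseDeploy returns the name of the deploy method that a switch like the
// one in CreateLibraryItem would select for the given source.
// Assumption: a source is remote when it parses as an http or https URL, and
// the item type comes from the file extension. The provider may decide these
// flags differently.
func chooseDeploy(file string) string {
	isLocal := true
	name := file
	if u, err := url.Parse(file); err == nil && (u.Scheme == "http" || u.Scheme == "https") {
		isLocal = false
		name = u.Path // ignore any query string when reading the extension
	}
	ext := strings.ToLower(filepath.Ext(name))
	isOva := ext == ".ova"
	isIso := ext == ".iso"

	switch {
	case isLocal && isOva:
		return "deployLocalOva"
	case isLocal && isIso:
		return "deployLocalIso"
	case isLocal:
		return "deployLocalOvf"
	case isOva:
		return "deployRemoteOva"
	case isIso:
		return "deployRemoteIso"
	default:
		return "deployRemoteOvf"
	}
}

func main() {
	sources := []string{
		"/tmp/debian-netinst.iso",
		"https://releases.ubuntu.com/jammy/ubuntu-22.04-live-server-amd64.iso",
		"https://example.com/appliance.ova",
	}
	for _, s := range sources {
		fmt.Printf("%s -> %s\n", s, chooseDeploy(s))
	}
}
```

Keeping the decision in a pure function like this makes it easy to unit test that a remote .iso no longer falls through to the OVF path, independent of any vCenter connection.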

Ryan Johnson Staff II Solutions Architect | VMware, Inc.

tenthirtyam avatar Jun 06 '22 23:06 tenthirtyam

Hi @juanvalino,

Based on your work in GH-1664, I was curious if you'd like to research this issue alongside my prior comments and propose a contribution.

My review of the issue started to build off your contributions in PR GH-1665.

Let us know what you think.

Thanks! Ryan

Ryan Johnson Staff II Solutions Architect | VMware, Inc.

tenthirtyam avatar Jun 14 '22 22:06 tenthirtyam

@tenthirtyam I'm going to try to reproduce this on our platform. As far as I know, when you use remote files, vSphere is responsible for downloading the remote file into the content library. Perhaps there is some connectivity issue on the vSphere side that makes the download fail.

I'll keep you informed of my local testing.

juanvalino avatar Jun 16 '22 06:06 juanvalino

Much appreciated, @juanvalino!

I'm leaning toward the same line of reasoning since the download is not proxied.

Ryan

tenthirtyam avatar Jun 16 '22 10:06 tenthirtyam

@tenthirtyam in my tests, a single remote ISO of 10 GB (https://download.rockylinux.org/pub/rocky/8/isos/x86_64/Rocky-8.6-x86_64-dvd1.iso) ends with the same error: 500 Internal Server Error

Uploading the same ISO from the remote URL using the vCenter web user interface works perfectly.

Uploading the same ISO from a local file (not a remote URL) works perfectly.

Could it be some timeout during the vCenter API calls?

This problem is beyond my capabilities with Terraform/vCenter 🙁

juanvalino avatar Jun 23 '22 11:06 juanvalino

No worries @juanvalino - many thanks for your testing efforts. This confirms my suspicions as well, and I will review it more when I'm back from vacation.

Ryan Johnson Staff II Solutions Architect | VMware, Inc.

tenthirtyam avatar Jun 23 '22 14:06 tenthirtyam

I'll give this a try, but I'm guessing the TF provider is attempting to read the entire OVF/OVA into memory:

...
io/ioutil.ReadAll(...)
io/ioutil/ioutil.go:27
github.com/hashicorp/terraform-provider-vsphere/vsphere/internal/helper/ovfdeploy.GetOvfDescriptor(0xc0000451c0, 0x39, 0x10000, 0x0, 0x0, 0x0, 0x0)
...

sneal avatar Jul 26 '22 22:07 sneal

Tracked it down about a week ago.

It's doing a push to the target from the host Terraform is running on, rather than a pull initiated by the vCenter Server instance itself.

The root cause appears to be that the call to the AddLibraryItemFileFromURI function is not resulting in the desired pull.
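The push/pull distinction can be illustrated with two in-process HTTP servers standing in for the ISO mirror and the content library. This is a toy model under stated assumptions, not the vSphere API: push, pull, run, and the "uri" form field are all hypothetical. In the push path every payload byte crosses the local host; in the pull path only the URL does.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
	"sync/atomic"
)

// countingReader counts payload bytes that pass through this process.
type countingReader struct {
	r io.Reader
	n int64
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	atomic.AddInt64(&c.n, int64(n))
	return n, err
}

// push downloads src locally and streams it to dst.
// It returns the number of bytes relayed through this host.
func push(src, dst string) (int64, error) {
	resp, err := http.Get(src)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body := &countingReader{r: resp.Body}
	req, err := http.NewRequest(http.MethodPut, dst, body)
	if err != nil {
		return 0, err
	}
	out, err := http.DefaultClient.Do(req)
	if err != nil {
		return atomic.LoadInt64(&body.n), err
	}
	out.Body.Close()
	return atomic.LoadInt64(&body.n), nil
}

// pull sends only the source URL; dst fetches the content itself.
func pull(src, dst string) error {
	out, err := http.PostForm(dst, url.Values{"uri": {src}})
	if err != nil {
		return err
	}
	return out.Body.Close()
}

// run performs one push and one pull of a payload of the given size and
// returns the bytes relayed locally and the total bytes the destination stored.
func run(size int) (relayed, stored int64, err error) {
	payload := strings.Repeat("x", size)
	source := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, payload)
	}))
	defer source.Close()

	var received int64
	dest := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		in := io.Reader(r.Body)
		if r.Method == http.MethodPost { // pull: the destination fetches the URI itself
			resp, err := http.Get(r.FormValue("uri"))
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadGateway)
				return
			}
			defer resp.Body.Close()
			in = resp.Body
		}
		n, _ := io.Copy(io.Discard, in)
		atomic.AddInt64(&received, n)
	}))
	defer dest.Close()

	if relayed, err = push(source.URL, dest.URL); err != nil {
		return
	}
	if err = pull(source.URL, dest.URL); err != nil {
		return
	}
	return relayed, atomic.LoadInt64(&received), nil
}

func main() {
	fmt.Println(run(1 << 20))
}
```

The destination ends up with two copies of the payload, but only the pushed copy was relayed through the local process. That relay is where memory pressure can arise on the Terraform host if the transfer is buffered rather than streamed.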

Ryan Johnson Senior Staff Solutions Architect | Product Engineering @ VMware, Inc.

tenthirtyam avatar Aug 08 '22 01:08 tenthirtyam