[Golang dev-images] The temporary file that causes shell scripts to fail

Open Aisuko opened this issue 2 years ago • 2 comments

Hi, everyone. Thanks for working on this project. I am a big fan of the dev-containers project and have a lot of experience with dev containers. I hit an issue when I use the sed command in the mcr.microsoft.com/devcontainers/go:0-1.20-bullseye container: sed always leaves a temporary file behind, and that causes my shell scripts to fail.

However, the same scripts work well on my local laptop (M1 Pro). Could you please help me figure out the reason? Thanks.

Shell scripts

## GPT4ALL
gpt4all:
	git clone --recurse-submodules $(GPT4ALL_REPO) gpt4all
	cd gpt4all && git checkout -b build $(GPT4ALL_VERSION) && git submodule update --init --recursive --depth 1
	# This is hackish, but needed as both go-llama and go-gpt4allj have their own version of ggml..
	@find ./gpt4all -type f -name "*.c" -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} +
	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} +
	@find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} +
	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/gpt_/gptj_/g' {} +
	@find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/gpt_/gptj_/g' {} +
	@find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/set_console_color/set_gptj_console_color/g' {} +
	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/set_console_color/set_gptj_console_color/g' {} +
	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
	@find ./gpt4all -type f -name "*.go" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
	@find ./gpt4all -type f -name "*.h" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
	@find ./gpt4all -type f -name "*.txt" -exec sed -i'' -e 's/llama_/gptjllama_/g' {} +
	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/json_/json_gptj_/g' {} +
	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/void replace/void json_gptj_replace/g' {} +
	@find ./gpt4all -type f -name "*.cpp" -exec sed -i'' -e 's/::replace/::json_gptj_replace/g' {} +
	mv ./gpt4all/gpt4all-backend/llama.cpp/llama_util.h ./gpt4all/gpt4all-backend/llama.cpp/gptjllama_util.h

The devcontainer.json configuration is below.

// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-docker-compose
{
	"name": "Existing Docker Compose (Extend)",

	// Update the 'dockerComposeFile' list if you have more compose files or use different names.
	// The .devcontainer/docker-compose.yml file contains any overrides you need/want to make.
	"dockerComposeFile": [
		"../docker-compose.yaml",
		"docker-compose.yml"
	],

	// The 'service' property is the name of the service for the container that VS Code should
	// use. Update this value and .devcontainer/docker-compose.yml to the real service name.
	"service": "api",

	// The optional 'workspaceFolder' property is the path VS Code should open by default when
	// connected. This is typically a file mount in .devcontainer/docker-compose.yml
	"workspaceFolder": "/workspace",

	"features": {
		"ghcr.io/devcontainers/features/go:1": {},
		"ghcr.io/azutake/devcontainer-features/go-packages-install:0": {}
	},

	// Features to add to the dev container. More info: https://containers.dev/features.
	// "features": {},

	// Use 'forwardPorts' to make a list of ports inside the container available locally.
	// "forwardPorts": [],

	// Uncomment the next line if you want start specific services in your Docker Compose config.
	// "runServices": [],

	// Uncomment the next line if you want to keep your containers running after VS Code shuts down.
	// "shutdownAction": "none",

	// Uncomment the next line to run commands after the container is created.
	"postCreateCommand": "make prepare"

	// Configure tool-specific properties.
	// "customizations": {},

	// Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.
	// "remoteUser": "devcontainer"
}

Docker-compose file

version: '3.6'
services:
  # Update this to the name of the service you want to work with in your docker-compose.yml file
  api:
    # Uncomment if you want to override the service's Dockerfile to one in the .devcontainer 
    # folder. Note that the path of the Dockerfile and context is relative to the *primary* 
    # docker-compose.yml file (the first in the devcontainer.json "dockerComposeFile"
    # array). The sample below assumes your primary file is in the root of your project.
    #
    build:
      context: .
      dockerfile: .devcontainer/Dockerfile

    volumes:
      # Update this to wherever you want VS Code to mount the folder of your project
      - .:/workspace:cached

    # Uncomment the next four lines if you will use a ptrace-based debugger like C++, Go, and Rust.
    # cap_add:
    #   - SYS_PTRACE
    # security_opt:
    #   - seccomp:unconfined

    # Overrides default command so things don't shut down after the process ends.
    command: /bin/sh -c "while sleep 1000; do :; done"

Dockerfile

ARG GO_VERSION=1.20
FROM mcr.microsoft.com/devcontainers/go:0-$GO_VERSION-bullseye
RUN apt-get update && apt-get install -y cmake

Output of make gpt4all

vscode ➜ /workspace (master) $ make gpt4all
git clone --recurse-submodules https://github.com/go-skynet/gpt4all gpt4all
Cloning into 'gpt4all'...
remote: Enumerating objects: 3126, done.
remote: Counting objects: 100% (468/468), done.
remote: Compressing objects: 100% (88/88), done.
remote: Total 3126 (delta 418), reused 395 (delta 379), pack-reused 2658
Receiving objects: 100% (3126/3126), 9.13 MiB | 10.80 MiB/s, done.
Resolving deltas: 100% (2017/2017), done.
Submodule 'llama.cpp' (https://github.com/manyoso/llama.cpp.git) registered for path 'gpt4all-backend/llama.cpp'
Cloning into '/workspace/gpt4all/gpt4all-backend/llama.cpp'...
remote: Enumerating objects: 1977, done.        
remote: Counting objects: 100% (777/777), done.        
remote: Compressing objects: 100% (57/57), done.        
remote: Total 1977 (delta 732), reused 720 (delta 720), pack-reused 1200        
Receiving objects: 100% (1977/1977), 2.02 MiB | 8.33 MiB/s, done.
Resolving deltas: 100% (1281/1281), done.
Submodule path 'gpt4all-backend/llama.cpp': checked out '03ceb39c1e729bed4ad1dfa16638a72f1843bf0c'
cd gpt4all && git checkout -b build a330bfe26e9e35ca402e16df18973a3b162fb4db && git submodule update --init --recursive --depth 1
Switched to a new branch 'build'
# This is hackish, but needed as both go-llama and go-gpt4allj have their own version of ggml..
sed: couldn't open temporary file ./gpt4all/gpt4all-backend/llama.cpp/tests/sedCQKLDZ: Permission denied
make: *** [Makefile:46: gpt4all] Error 1

vscode ➜ /workspace (master) $ ls -la  ./gpt4all/gpt4all-backend/llama.cpp/tests/
ls: cannot access './gpt4all/gpt4all-backend/llama.cpp/tests/sedCQKLDZ': No such file or directory
total 32
drwxr-xr-x  8 vscode vscode   256 May 16 09:06 .
drwxr-xr-x 40 vscode vscode  1280 May 16 09:06 ..
-rw-r--r--  1 vscode vscode   498 May 16 09:06 CMakeLists.txt
-?????????  ? ?      ?          ?            ? sedCQKLDZ
-rw-r--r--  1 vscode vscode  1733 May 16 09:06 test-double-float.c
-rw-r--r--  1 vscode vscode  5156 May 16 09:06 test-quantize-fns.cpp
-rw-r--r--  1 vscode vscode 11553 May 16 09:06 test-quantize-perf.cpp
-rw-r--r--  1 vscode vscode  2680 May 16 09:06 test-tokenizer-0.cpp

Aisuko, May 16 '23 09:05

It looks like -exec sed -i'' -e 's/ggml_/ggml_gptj_/g' {} + is what creates the sedCQKLDZ file somehow.
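
One way to narrow this down is to run the same in-place edit on a scratch file in different directories, for example the bind-mounted /workspace and a container-local directory such as /tmp, and compare the results. The sketch below is only an illustration: probe_sed_inplace and the sed-probe.txt file name are hypothetical and not part of the project.

```shell
#!/bin/sh
# Hypothetical probe (not from the project): checks whether sed can edit a
# scratch file in place inside the directory passed as $1.
probe_sed_inplace() {
    dir=$1
    probe="$dir/sed-probe.txt"
    # Create a scratch file containing one of the prefixes the Makefile rewrites.
    printf 'ggml_init\n' > "$probe" || return 1
    # Same flags as the Makefile rule, applied to the scratch file only.
    if sed -i'' -e 's/ggml_/ggml_gptj_/g' "$probe"; then
        result=$(cat "$probe")
    else
        result="sed -i failed in $dir"
    fi
    rm -f "$probe"
    printf '%s\n' "$result"
}

# Example run against a fresh temporary directory.
probe_sed_inplace "$(mktemp -d)"
```

Running it once as probe_sed_inplace /workspace and once as probe_sed_inplace /tmp would show whether the failure is tied to the mounted folder or to sed itself.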

Aisuko, May 16 '23 12:05

In the container, sed with the -i'' option edits in place by creating a temporary file in the same folder as the file being edited, and creating that temporary file is what fails. It works well after replacing -i with an explicit temporary file and mv: -exec sh -c "sed 's/ggml_/ggml_bert_/g' {} > {}.tmp && mv {}.tmp {}" \;.
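
The workaround above can be sketched as a small reusable function. This is a minimal sketch rather than the project's actual Makefile rule: replace_in_tree is a hypothetical name, and the demo file is made up for illustration. It passes file names to the inner shell as positional arguments instead of embedding {} in the command string.

```shell
#!/bin/sh
# Hypothetical helper (not from the project): apply a sed expression to every
# file under DIR matching GLOB, without using sed -i.
# Usage: replace_in_tree DIR GLOB SED_EXPR
replace_in_tree() {
    dir=$1
    glob=$2
    expr=$3
    find "$dir" -type f -name "$glob" -exec sh -c '
        expr=$1; shift
        for f in "$@"; do
            # Write the result next to the original, then move it into place.
            sed -e "$expr" "$f" > "$f.tmp" && mv "$f.tmp" "$f" || exit 1
        done
    ' sh "$expr" {} +
}

# Demo on a throwaway directory.
demo=$(mktemp -d)
printf 'void ggml_init(void);\n' > "$demo/a.h"
replace_in_tree "$demo" '*.h' 's/ggml_/ggml_gptj_/g'
cat "$demo/a.h"
```

Because the redirect creates a new file, the rewritten file gets default permissions, so this approach does not preserve things like an executable bit on the original.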

However, I still do not know why it failed in the container environment.

Aisuko, May 16 '23 14:05