Type: Bug

Behaviour

Bug Description

The .env file parser in the Python extension does not support multiline environment variables, breaking compatibility with the standard python-dotenv library and causing corrupted/truncated values to be passed to Jupyter kernels.

Steps to reproduce:

1. Create a `.env` file with a multiline variable

Create a file named .env in your workspace root:

TEST_ENV_VAR_MULTILINE='{
  "key1": "value1",
  "key2": "value2"
}'
TEST_ENV_VAR_SIMPLE='simple_value'

2. Create a Jupyter notebook and check the environment

Create a new Jupyter notebook (.ipynb) and run this code in the first cell (before any load_dotenv() or imports):

import os
print("Simple var:", os.environ.get('TEST_ENV_VAR_SIMPLE'))
print("Multiline var:", repr(os.environ.get('TEST_ENV_VAR_MULTILINE')))

3. Observe the bug

Expected output:

Simple var: simple_value
Multiline var: '{\n  "key1": "value1",\n  "key2": "value2"\n}'

Actual output:

Simple var: simple_value
Multiline var: "'{"

The multiline variable is truncated to just the first line!

4. Compare with regular Python script

Run the same code in a regular Python script (not Jupyter):

# test.py
from dotenv import load_dotenv
import os

load_dotenv()
print("Multiline var:", repr(os.environ.get('TEST_ENV_VAR_MULTILINE')))

$ python test.py
Multiline var: '{\n  "key1": "value1",\n  "key2": "value2"\n}'  ✅ Works correctly!

Extension version: 2025.14.0
VS Code version: Code 1.104.1 (0f0d87fa9e96c856c5212fc86db137ac0d783365, 2025-09-17T23:36:24.973Z)
OS version: Linux x64 6.8.0-79-generic
Modes:

Python version (& distribution if applicable, e.g. Anaconda): 3.11.13
Type of virtual environment used (e.g. conda, venv, virtualenv, etc.): Venv
Value of the python.languageServer setting: Default

User Settings


venvPath: "<placeholder>"

languageServer: "Pylance"

Installed Extensions

Extension Name	Extension Id	Version
claude-code	Ant	2.0.0
copilot	Git	1.372.0
copilot-chat	Git	0.31.3
js-debug	ms-	1.104.0
js-debug-companion	ms-	1.1.3
jupyter	ms-	2025.8.0
jupyter-keymap	ms-	1.1.2
jupyter-renderers	ms-	1.3.0
material-theme	zhu	3.19.0
pylint	ms-	2025.2.0
python	ms-	2025.14.0
rainbow-csv	mec	3.22.0
remote-containers	ms-	0.427.0
ruff	cha	2025.26.0
vscode-js-profile-table	ms-	1.0.10
vscode-jupyter-cell-tags	ms-	0.1.9
vscode-jupyter-slideshow	ms-	0.1.6
vscode-pylance	ms-	2025.8.3
vscode-python-envs	ms-	1.8.0

System Info

Item	Value
CPUs	Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz (4 x 2900)
GPU Status	2d_canvas: enabled direct_rendering_display_compositor: disabled_off_ok gpu_compositing: enabled multiple_raster_threads: enabled_on opengl: enabled_on rasterization: enabled raw_draw: disabled_off_ok skia_graphite: disabled_off trees_in_viz: disabled_off video_decode: enabled video_encode: disabled_software vulkan: disabled_off webgl: enabled webgl2: enabled webgpu: disabled_off webnn: disabled_off
Load (avg)	5, 4, 4
Memory (System)	31.23GB (15.46GB free)
Process Argv
Screen Reader	no
VM	0%
DESKTOP_SESSION	ubuntu-xorg
XDG_CURRENT_DESKTOP	Unity
XDG_SESSION_DESKTOP	ubuntu-xorg
XDG_SESSION_TYPE	x11

Sep 29 '25 23:09 piskunow

Root Cause

I traced this to the .env parser implementation in:

File: src/client/common/variables/environment.ts (lines 125-166)

The parser consists of two functions that work together:

1. `parseEnvFile()` - The Main Parser (Lines 125-139)

export function parseEnvFile(lines: string | Buffer, baseVars?: EnvironmentVariables): EnvironmentVariables {
    const globalVars = baseVars ? baseVars : {};
    const vars: EnvironmentVariables = {};
    lines
        .toString()
        .split('\n')  // ← THE BUG: Splits by newline FIRST
        .forEach((line, _idx) => {
            const [name, value] = parseEnvLine(line);
            if (name === '') {
                return;
            }
            vars[name] = substituteEnvVars(value, vars, globalVars);
        });
    return vars;
}

The problem: .split('\n') breaks the entire file into lines before any quote-aware parsing happens, so a quoted value that spans several lines is torn apart before the parser can treat it as one value.

2. `parseEnvLine()` - The Line Parser (Lines 141-166)

function parseEnvLine(line: string): [string, string] {
    // Most of the following is an adaptation of the dotenv code:
    //   https://github.com/motdotla/dotenv/blob/master/lib/main.js#L32
    // We don't use dotenv here because it loses ordering, which is
    // significant for substitution.
    const match = line.match(/^\s*([a-zA-Z]\w*)\s*=\s*(.*?)?\s*$/);
    if (!match) {
        return ['', ''];
    }

    const name = match[1];
    let value = match[2];
    if (value && value !== '') {
        if (value[0] === "'" && value[value.length - 1] === "'") {
            value = value.substring(1, value.length - 1);
            value = value.replace(/\\n/gm, '\n');
        } else if (value[0] === '"' && value[value.length - 1] === '"') {
            value = value.substring(1, value.length - 1);
            value = value.replace(/\\n/gm, '\n');
        }
    } else {
        value = '';
    }

    return [name, value];
}

Additional problems in parseEnvLine():

Single-line regex: The pattern /^\s*([a-zA-Z]\w*)\s*=\s*(.*?)?\s*$/ only matches complete KEY=VALUE pairs within a single line
No state tracking: Doesn't track whether we're inside a quoted string
Only handles escaped newlines: replace(/\\n/gm, '\n') converts the two-character sequence \n into a real newline, but it never sees real newline characters in the source, because the file has already been split on them

Why the Comment About dotenv is Misleading

The code has this comment:

"We don't use dotenv here because it loses ordering, which is significant for substitution."

This reasoning is outdated for several reasons:

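Modern dotenv preserves order: Since ES2015, JavaScript objects keep insertion order for string keys that are not integer-like. Variable names accepted by the current parser must start with a letter, so they always fall into that category. The dotenv library (v16+) returns plain objects that preserve the order in which variables are defined in the file.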

Substitution can still work: The variable substitution logic (substituteEnvVars()) can be applied to the output of dotenv.parse() just as easily:

const parsed = dotenv.parse(lines);
for (const [name, value] of Object.entries(parsed)) {
    vars[name] = substituteEnvVars(value, vars, globalVars);
}

The custom parser broke a key feature: By reimplementing parsing from scratch, this code lost the multiline support that the original dotenv library provides correctly.
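The ordering argument above can be checked without dotenv. The following is a minimal sketch that relies only on standard JavaScript property enumeration; the two objects are hand-built stand-ins for a parser result, not output from either library. It also shows the one exception, integer-like keys, which the current parser's name pattern rules out.

```typescript
// Hand-built stand-in for a parser result; keys are inserted in file order.
const parsedInFileOrder: Record<string, string> = {};
parsedInFileOrder['ZETA'] = '1';
parsedInFileOrder['ALPHA'] = '${ZETA}/a';
parsedInFileOrder['_PRIVATE'] = 'x';

// Non-integer string keys are enumerated in insertion order.
console.log(Object.keys(parsedInFileOrder)); // [ 'ZETA', 'ALPHA', '_PRIVATE' ]

// Integer-like keys are the exception: they come first, in ascending order.
const withIntegerKey: Record<string, string> = {};
withIntegerKey['B'] = 'first';
withIntegerKey['10'] = 'second';
withIntegerKey['2'] = 'third';
console.log(Object.keys(withIntegerKey)); // [ '2', '10', 'B' ]
```

If a replacement parser accepts names that are purely numeric, the ordering guarantee would no longer hold for those names.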

Current Data Flow

The bug manifests through this chain:

File read: customEnvironmentVariablesProvider.node.ts reads .env
Parse: Calls parseEnvFile() which splits by \n first
Export to kernel: Parsed (corrupted) variables passed to kernelEnvVarsService.node.ts
Kernel spawn: kernelProcess.node.ts spawns kernel with corrupted environment
Result: Jupyter kernel inherits TEST_ENV_VAR_MULTILINE="'{" instead of the full JSON

What happens to the .env file:

Original content:

TEST_ENV_VAR_MULTILINE='{
  "key1": "value1",
  "key2": "value2"
}'

After .split('\n'):

[
  "TEST_ENV_VAR_MULTILINE='{",  // ← Only this line matches the regex
  '  "key1": "value1",',         // ← No '=' sign, skipped
  '  "key2": "value2"',          // ← No '=' sign, skipped
  "}'"                           // ← No '=' sign, skipped
]

Result: TEST_ENV_VAR_MULTILINE = "'{" (corrupted!)
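The trace above can be reproduced outside the extension with a short script. This is only a sketch: parseLineLikeExtension is an illustrative name, and it copies just the regex and the quote-stripping condition from parseEnvLine() as quoted in this issue.

```typescript
// Illustrative stand-in for parseEnvLine(); it reuses the regex and the
// quote-stripping condition quoted in this issue, nothing else.
function parseLineLikeExtension(line: string): [string, string] {
    const match = line.match(/^\s*([a-zA-Z]\w*)\s*=\s*(.*?)?\s*$/);
    if (!match) {
        return ['', ''];
    }
    let value = match[2] ?? '';
    const first = value[0];
    const last = value[value.length - 1];
    // Quotes are stripped only when the same quote opens AND closes the line.
    if ((first === "'" || first === '"') && first === last) {
        value = value.substring(1, value.length - 1).replace(/\\n/gm, '\n');
    }
    return [match[1] ?? '', value];
}

const envText = [
    "TEST_ENV_VAR_MULTILINE='{",
    '  "key1": "value1",',
    '  "key2": "value2"',
    "}'",
    "TEST_ENV_VAR_SIMPLE='simple_value'",
].join('\n');

// Same order of operations as parseEnvFile(): split first, then parse.
const parsedLines = envText.split('\n').map(parseLineLikeExtension);
parsedLines.forEach(([key, val]) => {
    console.log(JSON.stringify(key), JSON.stringify(val));
});
```

Only the first and last lines yield a name. The first keeps its opening quote because the closing quote sits three lines later.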

Impact

This bug affects:

✅ Affected

Jupyter notebooks in VSCode - Kernel inherits corrupted environment variables
Pydantic Settings - Reads from os.environ which has corrupted values
Any library that reads from os.environ before loading .env files
Real-world use cases:
- SSH/SSL private keys (multiline by nature)
- JSON Web Tokens (JWT)
- Certificates
- Pretty-printed JSON configurations
- SQL scripts

❌ Not Affected

Regular Python scripts - They use python-dotenv directly (works correctly)
Terminal/shell - Doesn't use VSCode's parser
Single-line environment variables - Work fine

Expected Behavior

The parser should handle multiline values the same way python-dotenv does:

Valid .env syntax per the dotenv standard:

# Multiline with actual newlines (should work)
MULTILINE='{
  "key": "value"
}'

# Multiline with escaped newlines (already works)
ESCAPED='{"key": "value",\n  "key2": "value2"}'

Both formats should be supported.

Proposed Solution

Replace the line-by-line parser with a state machine parser that:

Reads characters sequentially
Tracks quote state (inside single quote, double quote, or unquoted)
Only treats newline as a line separator when outside quotes
Preserves newlines within quoted values
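The core of these four points is the quote-aware split. The sketch below isolates that step with a hypothetical helper, splitLogicalLines, which is not part of the extension. It assumes a quote only opens at the start of a value (right after the first `=`), that a backslash inside quotes escapes the next character, and that an unterminated quote swallows the rest of the file. The existing per-line regex would still need to accept newlines inside the value (its `.` does not match them), so this is not a drop-in fix.

```typescript
// Hypothetical helper (not part of the extension): split .env text into
// logical lines, treating '\n' as a separator only outside quotes.
function splitLogicalLines(text: string): string[] {
    const result: string[] = [];
    let current = '';
    let quote: string | null = null; // quote character we are inside, if any
    let seenEquals = false; // first '=' of the current logical line was seen
    let atValueStart = false; // right after '=', before the first value char

    for (let i = 0; i < text.length; i++) {
        const ch = text[i];
        if (quote !== null) {
            current += ch;
            if (ch === '\\' && i + 1 < text.length) {
                current += text[++i]; // keep the escaped character as-is
            } else if (ch === quote) {
                quote = null; // closing quote
            }
        } else if (ch === '\n') {
            result.push(current);
            current = '';
            seenEquals = false;
            atValueStart = false;
        } else {
            current += ch;
            if (atValueStart && (ch === '"' || ch === "'")) {
                quote = ch; // a quote opens only at the start of a value
                atValueStart = false;
            } else if (atValueStart && !/\s/.test(ch)) {
                atValueStart = false; // unquoted value has begun
            } else if (ch === '=' && !seenEquals) {
                seenEquals = true;
                atValueStart = true;
            }
        }
    }
    if (current !== '') {
        result.push(current);
    }
    return result;
}

const demoLines = splitLogicalLines("A='{\n  \"k\": 1\n}'\nB=2\n# it's a comment\nC=3");
console.log(demoLines.length); // 4
```

The apostrophe in the comment line does not open a quote, because quotes are only recognised at the start of a value.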

Implementation Approach 1: State Machine Parser

export function parseEnvFile(lines: string | Buffer, baseVars?: EnvironmentVariables): EnvironmentVariables {
    const globalVars = baseVars ? baseVars : {};
    const vars: EnvironmentVariables = {};
    const content = lines.toString();

    let i = 0;

    while (i < content.length) {
        // Skip whitespace (but not newlines)
        while (i < content.length && content[i] !== '\n' && /\s/.test(content[i])) {
            i++;
        }

        // Skip empty lines and comments
        if (i >= content.length || content[i] === '\n') {
            i++;
            continue;
        }
        if (content[i] === '#') {
            while (i < content.length && content[i] !== '\n') i++;
            i++;
            continue;
        }

        // Parse variable name
        const nameStart = i;
        while (i < content.length && /[a-zA-Z0-9_]/.test(content[i])) {
            i++;
        }
        const name = content.substring(nameStart, i);

        if (!name) {
            while (i < content.length && content[i] !== '\n') i++;
            i++;
            continue;
        }

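        // Skip whitespace before '=' (but not newlines)
        while (i < content.length && content[i] !== '\n' && /\s/.test(content[i])) {
            i++;
        }

        // Require exactly one '='; otherwise this is not a KEY=VALUE line
        if (content[i] !== '=') {
            while (i < content.length && content[i] !== '\n') i++;
            i++;
            continue;
        }
        i++; // Skip '='

        // Skip whitespace after '=' (but not newlines)
        while (i < content.length && content[i] !== '\n' && /\s/.test(content[i])) {
            i++;
        }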

        // Parse value
        let value = '';
        if (i < content.length && content[i] !== '\n') {
            const quote = content[i];

            if (quote === '"' || quote === "'") {
                // Quoted value - can span multiple lines
                i++; // Skip opening quote
                let escaped = false;

                while (i < content.length) {
                    const char = content[i];

                    if (escaped) {
                        // Handle escape sequences
                        switch (char) {
                            case 'n': value += '\n'; break;
                            case 'r': value += '\r'; break;
                            case 't': value += '\t'; break;
                            case '\\': value += '\\'; break;
                            case quote: value += quote; break;
                            default: value += '\\' + char; break;
                        }
                        escaped = false;
                    } else if (char === '\\') {
                        escaped = true;
                    } else if (char === quote) {
                        break; // Closing quote found
                    } else {
                        value += char; // Include newlines!
                    }
                    i++;
                }

                if (i < content.length && content[i] === quote) {
                    i++; // Skip closing quote
                }
            } else {
                // Unquoted value - single line only
                const valueStart = i;
                while (i < content.length && content[i] !== '\n' && content[i] !== '#') {
                    i++;
                }
                value = content.substring(valueStart, i).trim();
            }
        }

        // Skip to next line
        while (i < content.length && content[i] !== '\n') i++;
        if (i < content.length) i++;

        // Store the variable
        if (name) {
            vars[name] = substituteEnvVars(value, vars, globalVars);
        }
    }

    return vars;
}

Implementation Approach 2: Use Standard `dotenv` Library (Recommended)

I strongly recommend this approach - delegate to the well-tested dotenv npm package instead of maintaining custom parsing logic:

import * as dotenv from 'dotenv';

export function parseEnvFile(lines: string | Buffer, baseVars?: EnvironmentVariables): EnvironmentVariables {
    const globalVars = baseVars ? baseVars : {};
    const parsed = dotenv.parse(lines);
    const vars: EnvironmentVariables = {};

    // Apply variable substitution to the parsed values
    // This maintains the ordering needed for substitution
    for (const [name, value] of Object.entries(parsed)) {
        vars[name] = substituteEnvVars(value, vars, globalVars);
    }

    return vars;
}

Why this is the better solution:

Battle-tested library: dotenv has 18+ million weekly downloads and has been refined over years
- Handles multiline values correctly
- Handles all quote types (single, double, backticks)
- Handles escape sequences properly
- Handles edge cases you haven't thought of

Preserves ordering: Modern JavaScript (ES2015+) guarantees object insertion order, so variable substitution works perfectly:

// Given .env:
// BASE_URL=https://example.com
// API_URL=${BASE_URL}/api

const parsed = dotenv.parse(envContent);
// parsed = { BASE_URL: 'https://example.com', API_URL: '${BASE_URL}/api' }

// Substitution happens in order:
for (const [name, value] of Object.entries(parsed)) {
    vars[name] = substituteEnvVars(value, vars, globalVars);
}
// Result: { BASE_URL: 'https://example.com', API_URL: 'https://example.com/api' }

Matches python-dotenv behavior: Since Python developers expect .env files to work the same in Python scripts and Jupyter, using the canonical dotenv library ensures consistency
Less maintenance burden:
- ✅ No custom parser to maintain
- ✅ Bug fixes handled by the community
- ✅ New features (like inline comments) come for free
- ✅ Security patches handled upstream
Already a dependency: The dotenv package is likely already in the dependency tree for other features
Minimal code change: Only ~10 lines changed, with exactly the same API
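The ordering point in the list above can be illustrated without either parser. This is a sketch under stated assumptions: substituteSketch is a hypothetical stand-in for substituteEnvVars(), whose real implementation is not shown in this issue and may treat unknown names differently, and resolveInOrder is an illustrative wrapper around the loop proposed here.

```typescript
type EnvMap = Record<string, string | undefined>;

// Hypothetical stand-in for substituteEnvVars(); the real function is not
// shown in this issue and may behave differently (e.g. for unknown names).
function substituteSketch(value: string, local: EnvMap, global: EnvMap): string {
    return value.replace(/\$\{([a-zA-Z]\w*)\}/g, (whole: string, ref: string) => {
        // Prefer variables defined earlier in the file, then the base
        // environment; leave unknown references untouched.
        return local[ref] ?? global[ref] ?? whole;
    });
}

// Apply substitution in enumeration order, as in the proposed loop.
function resolveInOrder(parsed: Record<string, string>, global: EnvMap = {}): EnvMap {
    const resolved: EnvMap = {};
    for (const [key, raw] of Object.entries(parsed)) {
        resolved[key] = substituteSketch(raw, resolved, global);
    }
    return resolved;
}

const ordered = resolveInOrder({
    BASE_URL: 'https://example.com',
    API_URL: '${BASE_URL}/api',
});
console.log(ordered.API_URL); // https://example.com/api

// A forward reference is not resolved, which is why file order matters.
const reversed = resolveInOrder({
    API_URL: '${BASE_URL}/api',
    BASE_URL: 'https://example.com',
});
console.log(reversed.API_URL); // ${BASE_URL}/api
```

Whatever parser is used, a test of this shape would confirm that it hands variables back in file order before substitution runs.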

Package details:

NPM: https://www.npmjs.com/package/dotenv
GitHub: https://github.com/motdotla/dotenv
Weekly downloads: 18+ million
Size: ~20KB minified

Suggested Tests

Tests should be added to verify multiline support:

test('Parse multiline value with single quotes', () => {
    const content = `TEST_VAR='{
  "key1": "value1",
  "key2": "value2"
}'`;

    const vars = parseEnvFile(content);

    assert.strictEqual(vars.TEST_VAR, '{\n  "key1": "value1",\n  "key2": "value2"\n}');
    assert.doesNotThrow(() => JSON.parse(vars.TEST_VAR));
});

test('Parse SSH private key (real-world use case)', () => {
    const content = `SSH_KEY="-----BEGIN RSA PRIVATE KEY-----
MIIEpAIBAAKCAQEA04up8hoqzS1+
-----END RSA PRIVATE KEY-----"`;

    const vars = parseEnvFile(content);

    assert.include(vars.SSH_KEY, '-----BEGIN RSA PRIVATE KEY-----');
    assert.include(vars.SSH_KEY, '\n');
});

test('Regression: Single-line values still work', () => {
    const content = `VAR1=value1
VAR2='value2'
VAR3="value3"`;

    const vars = parseEnvFile(content);

    assert.strictEqual(vars.VAR1, 'value1');
    assert.strictEqual(vars.VAR2, 'value2');
    assert.strictEqual(vars.VAR3, 'value3');
});

Additional Context

The same bug exists in vscode-jupyter: src/platform/common/variables/environment.node.ts
Both extensions share nearly identical parsing code
This is a silent failure - no error message, just corrupted data
Affects developers using Pydantic Settings, FastAPI, and other modern Python frameworks

Workarounds (Temporary)

Until this is fixed, users can:

Use single-line JSON:

JSON_VAR='{"key": "value", "key2": "value2"}'

Use escaped newlines:

JSON_VAR='{"key": "value",\n  "key2": "value2"}'

Clear os.environ in code before using Pydantic:

import os
os.environ.pop('CORRUPTED_VAR', None)
from dotenv import load_dotenv
load_dotenv()

Use a separate config file instead of embedding JSON in .env
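For the escaped-newline workaround, the single-line form can be generated rather than written by hand. The helper below is hypothetical (toEscapedEnvLine is not part of any extension or library). It assumes the value is wrapped in single quotes and that the current parser turns every two-character \n sequence back into a newline, as the parseEnvLine() code quoted earlier in this issue does.

```typescript
// Hypothetical helper: render a multiline value as one .env line using
// escaped newlines, the form the current parser already accepts.
function toEscapedEnvLine(varName: string, value: string): string {
    // Replace real newlines (LF or CRLF) with the two characters '\' and 'n'.
    const oneLine = value.replace(/\r?\n/g, '\\n');
    return `${varName}='${oneLine}'`;
}

const prettyJson = JSON.stringify({ key: 'value', key2: 'value2' }, null, 2);
console.log(toEscapedEnvLine('JSON_VAR', prettyJson));
```

One caveat: if the value already contains a literal \n sequence (for example a JSON string with an embedded newline), the current parser will turn that into a real newline too, so this round trip is not lossless for such values.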

References

dotenv specification: https://github.com/motdotla/dotenv
python-dotenv (correctly handles multiline): https://github.com/theskumar/python-dotenv
Related issue in vscode-jupyter: [link if you create one there]

Labels

bug, area-environments, area-jupyter, needs-investigation

Willingness to Contribute

I'm willing to submit a PR with the fix if the maintainers agree with the approach. Happy to use either the state machine parser or delegate to the dotenv npm package - whichever the team prefers!

Sep 29 '25 23:09 piskunow

Thank you for the extensive work both discovering this bug and tracking down a fix! I will review your approach and let you know so you can move forward with the correct PR. Thanks!

cc @DonJayamanne as it is notebooks related as well

Oct 03 '25 17:10 eleanorjboyd

seeing now that this doesn't occur for regular python files, this might be more in Jupyter's domain. @amunger any ideas here as Don is out for a bit?

Oct 03 '25 20:10 eleanorjboyd

Indeed, the .env file is parsed when launching a Jupyter kernel, not for Python scripts by default

Oct 03 '25 20:10 piskunow

Do you suggest opening another issue in vscode-jupyter? The same function is defined there: https://github.com/microsoft/vscode-jupyter/blob/952a5f7212890e079373736cfb37719b1fad9b80/src/platform/common/variables/environment.node.ts#L132

Oct 03 '25 21:10 piskunow

I'm not familiar with the env file parsing, so Don should definitely be the one to help with this.

Oct 06 '25 15:10 amunger

@DonJayamanne any ideas here?

Oct 07 '25 20:10 eleanorjboyd

Marking this as a feature request, as the Python extension also has its own .env parser and effort was put into it to ensure it lines up with user expectations (features) when running Python code. Trying to add support for the dotenv npm package could break those and introduce other issues. But agreed, it's worth looking into.

Oct 07 '25 22:10 DonJayamanne

"Python extension also has its own .env parser"

I came to say the same thing: Could the Jupyter extension leverage the settings + environment variable handling from the other extensions?

Nov 20 '25 05:11 afeld

`.env` Parser Does Not Support Multiline Environment Variables
