agent icon indicating copy to clipboard operation
agent copied to clipboard

Downloaded output file has extra line breaks

Open essen opened this issue 6 years ago • 7 comments

I can see extra CR invisible characters in the downloaded output files, the line endings become CRCRLF instead of just CRLF, and my browser and editors display an extra line break between each line as a result. For example:

https://builds.ninenines.eu/logs/cowlib/141/centos/output.txt

I'm not sure it's the agent's fault but considering the extra CRs occur only on lines of outputs from the configured commands, and not on the lines from the agent itself, I think it's likely to be something the agent is doing.

If it's inserted by the agent on purpose then an option to disable this behavior would be nice to have.

essen avatar Apr 18 '18 09:04 essen

Hi @essen, sorry about this and thanks for the report! Would you be able to drop us an email at [email protected] with a build where this is happening?

Is the agent in question a windows agent (gonna guess no based on the centos bit in the url)? Are you viewing the logs on windows?

lox avatar Apr 18 '18 12:04 lox

Nothing private so no need for sending an email. An example build is https://buildkite.com/ninenines/cowlib/builds/141#_ and if you can access this too the agents are https://buildkite.com/organizations/ninenines/agents

Basically the agents where this CRCRLF happens are all the v3.0.1 while the v2 ones are all fine (only CRLF there).

No Windows involved at any point, all Linux (Arch Linux on my machine) and happens when downloaded via browser and curl.

essen avatar Apr 18 '18 14:04 essen

Hi @essen I’ve just been looking at this now but I’m afraid those links have gone 404 since your report a couple of years ago. We haven’t had any reports of this since and I’m unable to reproduce it. I’m going to tentatively close this issue but if you (or anyone else watching) is still experiencing this issue or an issue like it please don’t hesitate to re-open this issue or leave a comment :bow:

keithduncan avatar Aug 30 '21 02:08 keithduncan

An example of a recent build, although the agents are a few versions behind: https://buildkite.com/organizations/ninenines/pipelines/ranch/builds/289/jobs/0de65494-6463-43f8-9209-7299ef376d96/download.txt

essen avatar Sep 01 '21 13:09 essen

Thanks @essen, I will take a look!

keithduncan avatar Sep 02 '21 02:09 keithduncan

I think this is behaving as expected. The Download and Raw Logs link both give you the unprocessed log stream as captured by the Buildkite Agent, including terminal escape codes and TTY overwrites.

For example the raw log includes multiple lines for remote: Counting objects: 5% (1/17)[K, remote: Counting objects: 11% (2/17)[K etc as part of the git operations but the rendered output compresses that to the last line remote: Compressing objects: 100% (3/3), done.

It is possible to get the rendered output from our API but you need an access token (even though the build and its log is public 😅):

curl --header 'Authorization: Bearer [redacted]' https://api.buildkite.com/v2/organizations/ninenines/pipelines/ranch/builds/289/jobs/0de65494-6463-43f8-9209-7299ef376d96/log.html >~/Downloads/logs.html.txt

This downloads the processed logs as shown on the Buildkite Builds page under the Logs tab.

Please let me know if that’s the behaviour you were observing, I’m happy to take another look if I’ve missed something! 😄

keithduncan avatar Sep 09 '21 06:09 keithduncan

I am talking about extra CR characters. This is a screenshot from an editor that can print "invisible" characters (SciTE here):

Screenshot from 2021-09-09 11-32-30

Instead of CRLF we end up with CRCRLF in the output.

In vim this shows up as extra ^M characters instead of extra empty lines, with ^M corresponding to the extra CR characters:

Screenshot from 2021-09-09 11-37-07

Note that the extra CR isn't present for the lines that correspond to the commands the agent runs (make -k ci... and make dialyze here):

Screenshot from 2021-09-09 11-40-03

It is only in the output from these commands that extra CR characters are present.

essen avatar Sep 09 '21 09:09 essen

Any update on this?

Using the downloaded logs from BK is very annoying because all the output appears interspersed with an extra newline.

travisdowns avatar Jan 31 '23 15:01 travisdowns

I've been doing some digging, and these extra CR characters come from the https://github.com/creack/pty library that we use to expose a Pseudoterminal to the commands running in the build job.

If you Write []byte("one\ntwo\r\nthree") to that package's PTY, you get back []byte("one\r\ntwo\r\n\r\nthree").

Running the Buildkite Agent with --no-pty avoids the issue, but the implications of that are probably worse than getting extra newlines in the raw log.

I'm keen to dig deeper and figure out if it's a bug in the library and whether that's fixable, or if it's something more inherent to running in a pseudoterminal.

pda avatar Oct 20 '23 10:10 pda

I think this is fundamental to running in PTY mode.

The PTY driver in the kernel is mapping LFCR LF, and terminal emulators (Terminal.app / xterm / iTerm2 etc) rely on that; CR meaning “go to the start of the lineandLFmeaning go down one line. Sending justLFto a terminal emulator results in stepped text, which we can see by turning off this mapping withstty -onlcr` (“Do not map NL to CR-NL on output):

$ printf "one\ntwo\nthree"
one
two
three

stty -onlcr

$ printf "one\ntwo\nthree"
one
   two
      three

This is often seen when a program puts the PTY into raw mode and then crashes; subsequent text comes out stepped like that.

So, if our raw log didn't use CR LF, then the terminal/HTML rendering of it will be garbled.

But! It looks like our github.com/buildkite/terminal-to-html terminal emulator doesn't have this issue:

	case '\n':
		p.screen.newLine()

parser.go#L210-L211

func (s *screen) newLine() {
	s.x = 0
	s.y++
}

screen.go#L258-L261

So, perhaps we could configure the PTY to disable that mapping, like stty -onlcr does. https://github.com/creack/pty has limited support for the ioctl interface to configure the PTY, but perhaps we could extend it.

pda avatar Oct 20 '23 11:10 pda

I've tested putting stty -onlcr in a Buildkite command step…

Here's the HTML-rendered log, looks fine:

image

Here's the raw log as rendered by Firefox 118:

image

And here's the terminal output from the agent itself via the not-commonly-used --write-job-logs-to-stdout flag: (macOS 14, iTerm2.app, tmux 3.3a, TERM=tmux-256color)

image

We can see that stty -onlcr fixes the raw log rendering, and doesn't break the terminal log rendering, so perhaps we can set that -onlcr flag on the PTY from the agent itself.

In man 4 termios it's called ONLCR c_oflag.

https://pkg.go.dev/golang.org/x/term#MakeRaw would disable ONLCR but it would probably disable too many other things too. Worth checking.

Or, https://pkg.go.dev/golang.org/x/sys/unix#IoctlSetTermios etc give us low-level interfaces to get/set the termios settings for the PTY… this patch seems to work, but needs some love and testing:

diff --git a/process/process.go b/process/process.go
index ed500f40..eca5e6d1 100644
--- a/process/process.go
+++ b/process/process.go
@@ -17,6 +17,7 @@ import (
 	"time"
 
 	"github.com/buildkite/agent/v3/logger"
+	"golang.org/x/sys/unix"
 )
 
 const (
@@ -171,6 +172,18 @@ func (p *Process) Run(ctx context.Context) error {
 			return err
 		}
 
+		// Ask the PTY driver not to turn \n into \r\n
+		// This might not work for Windows?
+		termios, err := unix.IoctlGetTermios(int(pty.Fd()), unix.TIOCGETA)
+		if err != nil {
+			return fmt.Errorf("unix.IoctlGetTermios: %w", err)
+		}
+		termios.Oflag &^= unix.ONLCR // Unset the output flag that maps NL (\n) to CRNL (\r\n)
+		err = unix.IoctlSetTermios(int(pty.Fd()), unix.TIOCSETA, termios)
+		if err != nil {
+			return fmt.Errorf("unix.IoctlSetTermios: %w", err)
+		}
+
 		// Make sure to close the pty at the end.
 		defer func() { _ = pty.Close() }()
 

In some testing it seems to turn \r\n into \n\n which might be related to termios input flags, but I couldn't find a combination of those that made sense. And I'm a bit concerned that some terminal programs might be depending on CR/\r for returning the cursor to the start of the current line in raw mode, etc. Ideally they'd use ESC [ G for that.

So, this patch might break things (e.g. progress indicators that update over time). More thought and testing needed.

pda avatar Oct 20 '23 13:10 pda