docker-hs
docker-hs copied to clipboard
Added log processing
closes #74
Hey @ilyakooo0 Thanks for the PR. :cake: I've only taken a cursory look as I'm short on time but I'll try and review it properly tomorrow. It seems the CI is erroring for unrelated reasons.
Actually, some of the builds errored because I used the new ConduitT
, which is not available in the older versions.
UPD: I now fixed it and builds are failing only for unrelated reasons
Hi everyone, I ran into the same problem yesterday, but found that the format is documented in https://docs.docker.com/engine/api/v1.40/#operation/ContainerAttach
Here's some (simplified) code to show how I deal with it at the moment (which seems to work fine):
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
testSink :: ConduitT ByteString Void IO ()
testSink = do
-- wait for data to become available
mbs <- await
-- the data might be `Nothing` if the conduit has been closed,
-- so lets decide what to do...
case mbs of
Nothing -> pure ()
-- otherwise, if there is data, send it to the output channel with
-- appropriate metadata added to it
Just bs -> do
-- the docker daemon attaches 8 byte headers to each line of
-- output, see:
-- https://docs.docker.com/engine/api/v1.40/#operation/ContainerAttach
-- the first byte identifies the stream (1 - stdout, 2 - stderr)
let streamType = BS.unpack (BS.take 1 bs)
let message = BS.drop 8 bs
-- do something with `message` according to `streamType`
-- resume waiting for input
testSink
-- put this somewhere else in your program to use it
getContainerLogsStream logOpts containerID $ CB.lines .| testSink
It is worth noting that it may not be sensible to deal with this directly in the implementation of getContainerLogsStream
since the 8 byte headers are only present if tty: false
for the container. If a user sets tty: true
in the CreateOpts
then the 8 byte header will not be present.
Hey @mbg thanks for the clarification and the example. It was always the intention of the library to leave the stream processing outside of the library ie. let the user define the sink that gets passed to the getContainerLogsSream function. So I'm not sure if it's wise to include this entire machinery in the library itself.
That said, there is too much abstration leaking outside of the library and the user has to know docker api internals to construct a proper sink....(ie. the message structure and what the first bytes mean and so on). So I think we need to address this somehow.
I'd love to get this one merged but I haven't had time to look into it in more detail. I still think we shouldn't handle the stream necessarily in the library but there's also too much abstraction leaking if we let the users implement it themselves.
Would an example stream that just prints stuff to stdout and a pointer to it in the docs be the best course of action? @ilyakooo0 @mbg what do you think?
IMO the main problem is that the raw stream produces invalid characters (as some of the data is not text at all). So, the stream is mostly useless unless you do the exact processing that is done in this PR even if you want to get the binary output of the executable.
I can't imagine a reason why someone would use this library and want to get the raw unprocessed stream.
I agree with @ilyakooo0. I think that there should be an implementation of this in the library, at the very least as an opt-in (i.e. even if it was just something along the lines of the testSink
from my comment that got exported). The two questions for me are:
- If the processing gets incorporated into
getContainerLogsStream
, how do we disable it if"tty": true
(as I noted in my comment). This is not an issue if the processing is provided by a separate function. - Users would probably want more than just the
ByteString
representing the message, but also the source of the message (stdout/stderr). So we might need to introduce a new data type to represent a line of output for that.
Neither of these are particularly big issues, they are just design decisions that need to be made.
@denibertovic have you had any more thoughts on this / have you made a decision?
@mbg I agree with your previous comment. This should be implemented as a separate function and not directly in getContainerLogsStream
so that it can handle the tty: true
case. We export that function and provide docs/example on how to use it with getContainerLogStream.
I haven't thought about your second point but introducing a new type for that seems fine to me.
I don't have time to work on this but I'd be happy to merge if the above changes were made.