Standard outputs streaming¶
BuildBox supports using the LogStream API for streaming the contents that a command writes to stdout and stderr.
The following diagram shows the flow of information when BuildBox is connected to BuildGrid, which has a built-in LogStream service implementation 5.
LogStream API¶
The LogStream API 1 defines a mechanism for transmitting logs that relies on the ByteStream API 2 for the actual transfer of the data.
An execution service provides a worker with a write_resource_name, to which the worker has exclusive access, and broadcasts an associated read_resource_name to clients that are interested in receiving those outputs.
Since the logs are uploaded to CAS after a job has finished running, the streams are only alive while the command is executing. Once the last reader closes its connection, the stream ends, and the outputs should be accessed via the stdout/stderr blobs stored in CAS.
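The split between the two resource names can be illustrated with a small in-memory model. This is only a sketch: ToyLogStreamService, its methods and the name layout used below are hypothetical stand-ins chosen for illustration, and no gRPC or ByteStream call is involved.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class ToyLogStream:
    """One stream, identified by a read name and a separate write name."""
    read_resource_name: str
    write_resource_name: str
    data: bytearray = field(default_factory=bytearray)
    finished: bool = False


class ToyLogStreamService:
    """In-memory stand-in for a LogStream service (hypothetical, no gRPC)."""

    def __init__(self):
        self._by_read_name = {}
        self._by_write_name = {}

    def create(self, parent):
        # Assumed name layout, used here only for illustration.
        read_name = f"{parent}/logstreams/{secrets.token_hex(8)}"
        # The extra random token is what makes write access exclusive in this
        # model: only the party that is handed the write name can append.
        write_name = f"{read_name}/{secrets.token_hex(8)}"
        stream = ToyLogStream(read_name, write_name)
        self._by_read_name[read_name] = stream
        self._by_write_name[write_name] = stream
        return stream

    def write(self, write_resource_name, chunk, finish=False):
        # Raises KeyError when the name is not a known write name.
        stream = self._by_write_name[write_resource_name]
        if stream.finished:
            raise ValueError("stream already finished")
        stream.data.extend(chunk)
        stream.finished = finish

    def read(self, read_resource_name, offset=0):
        # Readers only ever see the read name and can start at any offset.
        stream = self._by_read_name[read_resource_name]
        return bytes(stream.data[offset:])
```

In this model the execution service would call create(), hand write_resource_name to the worker and publish read_resource_name to interested clients. A client holding only the read name cannot write to the stream.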
BuildBox Runner¶
The Runner class provided by buildbox runner offers the --std{out, err}-file CLI options to allow the process that invokes the runner to redirect the stdout/stderr pipes of the command to those particular paths.
If no files are given, the runner will create and use temporary files that are deleted once their contents are uploaded into CAS. If paths are specified, those files will not be deleted after that upload operation, persisting after the runner process has exited.
buildbox-common¶
buildbox-common provides facilities to write to a LogStream and to monitor files and incrementally stream their contents as they are being updated.
In particular, it provides the StandardOutputStreamer class, which offers a high-level interface that takes a path, the address of a ByteStream endpoint and a resource name and streams the contents of an output file as it gets written.
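The idea of monitoring a file and forwarding its contents incrementally can be sketched as follows. This is not the StandardOutputStreamer implementation: FileTail, stream_file, the polling strategy and the send/command_finished callbacks are all assumptions made for the example, with send standing in for a write to the ByteStream resource.

```python
import time


class FileTail:
    """Tracks how much of a growing file has already been consumed."""

    def __init__(self, path):
        self.path = path
        self.offset = 0

    def poll(self):
        """Return the bytes appended since the last call (b"" if none)."""
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        self.offset += len(chunk)
        return chunk


def stream_file(tail, send, command_finished, interval=0.1):
    """Forward new data while the command runs, then flush what is left."""
    while not command_finished():
        chunk = tail.poll()
        if chunk:
            send(chunk)
        else:
            # Nothing new yet: wait a little before checking the file again.
            time.sleep(interval)
    # The command may have written more output just before exiting.
    last = tail.poll()
    if last:
        send(last)
```

Polling keeps the sketch portable. An implementation could equally rely on file-change notifications from the operating system.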
BuildBox worker¶
buildbox-worker scans for trailing metadata entries 4 that are attached to the UpdateBotSession() response coming from the execution server.
In particular, it looks for entries named "executeoperationmetadata-bin" whose value is a serialized ExecuteOperationMetadata 3 message m such that:
m.action_digest is equal to the digest of the pertinent action in the Lease, and
m.stdout_stream_name and m.stderr_stream_name point to ByteStream resource names.
If such a message is found, BuildBox worker will stream the outputs of that Action’s command to the given resource names.
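The matching rule above can be sketched with plain data classes in place of the protobuf messages. Digest and ExecuteOperationMetadata below are simplified stand-ins, find_stream_names and its parse argument (a caller-supplied deserializer) are hypothetical, and treating a non-empty string as "points to a resource name" is an assumption of this example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

METADATA_KEY = "executeoperationmetadata-bin"


@dataclass(frozen=True)
class Digest:
    hash: str
    size_bytes: int


@dataclass(frozen=True)
class ExecuteOperationMetadata:
    # Simplified stand-in carrying only the fields discussed above.
    action_digest: Digest
    stdout_stream_name: str = ""
    stderr_stream_name: str = ""


def find_stream_names(
    trailing_metadata: Iterable[Tuple[str, bytes]],
    lease_action_digest: Digest,
    parse: Callable[[bytes], ExecuteOperationMetadata],
) -> Optional[Tuple[str, str]]:
    """Return (stdout name, stderr name) for the lease's action, or None."""
    for key, value in trailing_metadata:
        if key != METADATA_KEY:
            continue
        m = parse(value)
        # Ignore metadata that describes a different action.
        if m.action_digest != lease_action_digest:
            continue
        # Assumption: both names must be set for streaming to be requested.
        if m.stdout_stream_name and m.stderr_stream_name:
            return m.stdout_stream_name, m.stderr_stream_name
    return None
```

A result of None means that no streaming was requested for the action in the lease.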
When streaming is requested, the worker will create a pair of files, set their paths in the --std{out, err}-file options it passes to the runner, and spawn two StandardOutputStreamer instances that will monitor those files and stream to the given resource names. Once the command has finished running, they will commit the streamed data and delete the files.
On-demand streaming¶
Because an execution service might not be able to predict whether some client will request streamed logs while a command is running, workers would have to stream the outputs unconditionally. That would waste network traffic whenever those logs end up unread.
To avoid that, BuildGrid implements a mechanism for signaling that at least one reader is interested in the logs, in effect creating the ByteStream write resource on demand. For that it offers the ByteStream.QueryWriteStatus() call, which will block on the server side and return either OK, meaning that the worker can start writing to the stream and a reader will receive the contents, or a NOT_FOUND error, which indicates that streaming can be skipped because the logs were not requested.
buildbox-worker will leverage that behavior and start streaming only when QueryWriteStatus() returns successfully for the ByteStream resource.
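The resulting decision on the worker side can be sketched like this. The function stream_if_requested and its callbacks are hypothetical: query_write_status stands in for the blocking ByteStream.QueryWriteStatus() call and write for a ByteStream write. Treating every non-OK status as "skip streaming" is an assumption of the example.

```python
from enum import Enum


class Status(Enum):
    OK = "OK"
    NOT_FOUND = "NOT_FOUND"


def stream_if_requested(query_write_status, resource_name, chunks, write):
    """Write chunks only if a reader asked for them; return bytes written."""
    # Stand-in for the blocking call: it returns once the server knows
    # whether at least one reader is interested in this stream.
    status = query_write_status(resource_name)
    if status is not Status.OK:
        # Nobody requested the logs, so no data is sent over the network.
        return 0
    written = 0
    for chunk in chunks:
        write(resource_name, chunk)
        written += len(chunk)
    return written
```

The outputs remain available from CAS after the command finishes, so skipping the stream does not lose them.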
LogStream tools¶
buildbox-tools contains some utilities for writing to a LogStream service and reading data from it. Those can be useful for testing purposes or for leveraging an existing LogStream service to stream other contents. (Note that the underlying data transfers, done via the ByteStream API, can contain arbitrary binary data.)