fix: replace stdio chown with prctl(PR_SET_DUMPABLE) and pipe-wrap

After setuid, the kernel clears the dumpable flag, making /proc/self/
entries owned by root. This broke open("/dev/stderr") for non-root
users inside subcontainers. The previous fix (chowning /proc/self/fd/*)
was dangerous because it chowned whatever file the FD pointed to (could
be the journal socket).

The proper fix is prctl(PR_SET_DUMPABLE, 1) after setuid, which restores
/proc/self/ ownership to the current uid.

Additionally, adds a `pipe-wrap` subcommand that wraps a child process
with piped stdout/stderr, relaying to the original FDs. This ensures all
descendants inherit pipes (which support re-opening via /proc/self/fd/N)
even when the outermost FDs are journal sockets. container-runtime.service
now uses this wrapper.

With pipe-wrap guaranteeing pipe-based FDs, the exec and launch non-TTY
paths no longer need their own pipe+relay threads, eliminating the bug
where exec would hang when a child daemonized (e.g. pg_ctl start).
This commit is contained in:
Aiden McClelland
2026-03-23 01:14:49 -06:00
parent 8d1e11e158
commit 2aa910a3e8
3 changed files with 98 additions and 15 deletions

View File

@@ -5,7 +5,7 @@ OnFailure=container-runtime-failure.service
[Service]
Type=simple
Environment=RUST_LOG=startos=debug
ExecStart=/usr/bin/node --experimental-detect-module --trace-warnings /usr/lib/startos/init/index.js
ExecStart=/usr/bin/start-container pipe-wrap /usr/bin/node --experimental-detect-module --trace-warnings /usr/lib/startos/init/index.js
Restart=no
[Install]