Unix.open_process* and file descriptors
Thursday 15 March 2012, by
Hi fellow OCamlers, long time no see! :-)
I would like to mention an issue with Unix.open_process* that may concern some of you.
Processes opened using Unix.open_process* on POSIX systems are spawned using fork(). Therefore, they inherit all file descriptors opened by the calling process.
This leads to many unfortunate minor issues. For instance, if the calling process has opened a port and crashes after starting the external process, that port remains open until the external process terminates.
More importantly, this is a source of potentially serious security issues: by default, processes spawned with Unix.open_process* have access to every open file or socket, allowing them to read those files’ contents or sniff network traffic.
The problem is not easy to fix using the standard POSIX API. If you have control over all open file descriptors in your code, you can set the close-on-exec flag on each of them, which makes sure that they are closed when the forked child calls exec(). However, for large programs this quickly turns out to be impractical, in particular if your program is using libraries that open file descriptors.
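To make the close-on-exec approach concrete, here is a minimal sketch using Unix.set_close_on_exec from the standard Unix library. The helper name open_cloexec is made up for this example, and /dev/null merely stands in for a sensitive file. Note that in a multi-threaded program there is a small window between opening the descriptor and setting the flag, so treat this as an illustration rather than a complete fix.

```ocaml
(* Build with the unix library, e.g.:
   ocamlfind ocamlopt -package unix -linkpkg cloexec.ml *)

(* Hypothetical helper (not part of the Unix module): open a file and
   immediately mark its descriptor close-on-exec. *)
let open_cloexec path flags perm =
  let fd = Unix.openfile path flags perm in
  Unix.set_close_on_exec fd;
  fd

let () =
  (* /dev/null is a stand-in for a file the child should not see. *)
  let fd = open_cloexec "/dev/null" [Unix.O_RDONLY] 0 in
  (* The command is started through fork() followed by exec(), so the
     descriptor is closed in the child when exec() happens. *)
  let ic = Unix.open_process_in "true" in
  ignore (Unix.close_process_in ic);
  Unix.close fd
```

Every descriptor opened elsewhere, including inside libraries, needs the same treatment, which is exactly why this gets impractical for large programs.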
Another common solution is to emulate BSD’s closefrom() function, which closes all file descriptors greater than or equal to a given number. You can find a portable but possibly very slow implementation of this function in openssh’s code.
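For illustration, here is what a brute-force emulation could look like in OCaml. This is not the openssh code, just a rough sketch under two loud assumptions: that Unix.file_descr is a plain integer underneath (true on Unix, not on Windows), which the Obj.magic cast relies on, and that 1024 is a good enough upper bound, since the Unix module does not expose the real limit. The names fd_of_int, close_from_slow and spawn_clean are invented for this example. Unix.open_process* offers no hook between fork() and exec(), so the sketch forks by hand; Unix._exit needs OCaml 4.12 or later.

```ocaml
(* ASSUMPTION: on Unix systems a Unix.file_descr is an int underneath.
   This cast is a hack and does not hold on Windows. *)
let fd_of_int : int -> Unix.file_descr = Obj.magic

(* Try to close every descriptor in [lowfd, max_fd), ignoring the ones
   that are not open. max_fd defaults to 1024, an assumed bound. *)
let close_from_slow ?(max_fd = 1024) lowfd =
  for n = lowfd to max_fd - 1 do
    try Unix.close (fd_of_int n) with Unix.Unix_error _ -> ()
  done

(* Fork, close everything above stderr in the child, then exec.
   Returns the child's pid in the parent. *)
let spawn_clean prog args =
  match Unix.fork () with
  | 0 ->
    close_from_slow 3;
    (try Unix.execv prog args with _ -> Unix._exit 127)
  | pid -> pid
```

The loop issues one close() call per candidate number whether or not it is open, which is where the slowness comes from when the limit is large.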
OCaml developers are aware of the situation but haven’t found a suitable fix for it. Here is a comment from Xavier Leroy:
Feel free to publicize this PR so that Unix experts out there can chime in.
This is an old issue that popped up during the development of Cash (the Caml shell), in particular. The "ideal" solution is to have all file descriptors in close-on-exec mode, but it requires discipline from the programmers. (File descriptors created by […] close-on-exec, but not those created by Unix.socket, for compatibility with the Unix specs.)
Also, a long time ago we experimented with a closeall()/closefrom() function that did not use /proc/<pid>/fd, and it was really slow to call close() on thousands of potential file descriptors. The /proc/<pid>/fd trick is nice but not terribly portable.
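To give an idea of the /proc trick mentioned in the quote, here is a Linux-only sketch that reads /proc/self/fd to find which descriptors are actually open and closes only those. The names open_fd_numbers, fd_of_int and close_from are invented for this example, and the Obj.magic cast assumes that Unix.file_descr is an int underneath (true on Unix, not on Windows). It needs OCaml 4.08 or later for List.filter_map.

```ocaml
(* ASSUMPTION: on Unix systems a Unix.file_descr is an int underneath. *)
let fd_of_int : int -> Unix.file_descr = Obj.magic

(* Linux-only: list the descriptor numbers open in the current process.
   The listing also contains the descriptor that Sys.readdir itself used
   to read the directory; it is already closed when this returns. *)
let open_fd_numbers () =
  Sys.readdir "/proc/self/fd"
  |> Array.to_list
  |> List.filter_map int_of_string_opt
  |> List.sort compare

(* Close every open descriptor whose number is >= lowfd. *)
let close_from lowfd =
  List.iter
    (fun n ->
      if n >= lowfd then
        (try Unix.close (fd_of_int n) with Unix.Unix_error _ -> ()))
    (open_fd_numbers ())

let () =
  (* Harmless usage: print the descriptors this process currently holds. *)
  List.iter (Printf.printf "%d ") (open_fd_numbers ());
  print_newline ()
```

Compared to blindly looping up to the descriptor limit, this only touches descriptors that exist, at the price of depending on a /proc filesystem.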
This issue can be a problem for some of you, so I wanted to document it properly with this post. It is particularly crucial if you delegate work to a less privileged system user through forked processes, for instance when implementing a CGI interface.
Take care y’all!