fork() and file descriptors with un-flushed output

thoni56
Maybe this is a FAQ, but I could not find it there ;-) or in the lists I searched:

In Cygwin, when you fork(), the child process shares the parent's file descriptors. If there happens to be unflushed output in the buffer of such a shared file descriptor, would it be output by both processes?

I have some empirical evidence to support this theory. I support cgreen, a C unit test and mock framework, which runs every test case in its own process using fork().

For many years I have seen that when running in a command window everything works as expected, but running in Emacs produces duplicated output. That has not bothered me much, but now I have implemented some further output routines in the reporting code, and everything just blew up!

The test case is run in a separate process using fork(), which messages back and then dies. The output from the runner (parent process) written to the file before the fork() is then output twice.

This behaviour changed to the expected one (printed only once) when an fflush() was added after the printf() in the parent process, before the fork(). I suspect this happens because of unflushed output in the file buffer, which is shared by the two processes: first flushed when the child dies, then flushed again by the parent at some point, not only duplicating output but also garbling it.

Is this a known behaviour? Unavoidable in cygwin? (Obviously not, if I'm on the right track with my guesswork...) If it is a bug, will it be fixed?

Re: fork() and file descriptors with un-flushed output

Daniel Colascione-5
On 8/27/2012 11:26 PM, thoni56 wrote:
> Is this a known behaviour? Unavoidable in cygwin? (Obviously not, if I'm on
> the right track with my guesswork...) If it is a bug, will it be fixed?

This behavior isn't Cygwin-specific. In fact, it's longstanding Unix behavior.
(The buffering problem is one reason you should, generally speaking, use
_exit(2), not exit(3), in a forked child.) Calling fflush(NULL) before the fork
will flush all stdio buffers for you and eliminate the problem, assuming you're
single-threaded.
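A minimal sketch of that advice, assuming a single-threaded parent: flush everything with fflush(NULL) before the fork, and leave the child through _exit() so it never flushes buffers it inherited. The names run_in_child() and example_task() are hypothetical and only serve the illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical example task standing in for a single test case. */
void example_task(void)
{
    puts("child: running test case");
}

/* Hypothetical helper: run task() in a forked child and return the
 * child's exit status, or -1 on error. */
int run_in_child(void (*task)(void))
{
    fflush(NULL);              /* empty every stdio output buffer first */

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        task();
        fflush(NULL);          /* write out what the child itself produced */
        _exit(0);              /* skip atexit handlers and stdio cleanup */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because _exit() does not flush stdio, the child flushes its own output explicitly before leaving; anything the parent had buffered was already written before the fork.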



Re: fork() and file descriptors with un-flushed output

Wolf Geldmacher
In reply to this post by thoni56
This is not a bug - it's a feature ;-)

The "issue" you are describing is in fact the standard behaviour
expected of fork() in any Unix/POSIX-compliant implementation.

From the fork man page on Linux:

> ...
> fork() creates a new process by duplicating the calling process. The
> new process, referred to as the child, is an exact duplicate of the
> calling process, referred to as the parent,
> ...

and yes, "exact duplicate" includes all data in buffers not yet flushed.

The difference in behaviour when you run your program from the terminal
vs. from Emacs stems from the "intelligence" built into the stdio
library, which looks at the type of file a stream is connected to and
automagically turns on full buffering if it is not connected to a
terminal, in order to optimize performance.
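As a rough illustration of that rule of thumb (the exact choice is up to the stdio implementation, so treat this as an assumption, and note that stderr is normally unbuffered regardless), the hypothetical helper below guesses the mode stdio would pick for a descriptor such as fileno(stdout):

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: guess the buffering mode a typical stdio
 * implementation picks for an output stream on this descriptor.
 * Terminals usually get line buffering; pipes and files usually get
 * full buffering. */
int likely_stdout_mode(int fd)
{
    return isatty(fd) ? _IOLBF : _IOFBF;
}
```

When output goes to a pipe or file rather than a terminal, this returns _IOFBF, which is the case where unflushed data survives until the fork.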

To avoid the duplication of data you can either explicitly turn off
buffering with setbuf() (and pay the associated performance penalty),
fflush() your open files before you fork (usually the easiest to
implement), or revert to the use of the basic OS functions open(),
read(), write(), close() (useful for special cases when not much of
stdio is needed - make sure you don't mix the two).
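Two of these options can be sketched as follows. This is only an illustration with made-up helper names: make_unbuffered() uses setvbuf(), the more general sibling of setbuf() (setbuf(stream, NULL) has the same effect), and write_all() bypasses stdio entirely by looping over write().

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: switch a stream to unbuffered mode. Must be
 * called before any other operation on the stream. Returns 0 on success. */
int make_unbuffered(FILE *stream)
{
    return setvbuf(stream, NULL, _IONBF, 0);
}

/* Hypothetical helper: write len bytes straight to fd with write(),
 * retrying on short writes and EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Since write_all() keeps no user-space buffer, there is nothing for a forked child to inherit and flush a second time.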

Cheers,
Wolf

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


Re: fork() and file descriptors with un-flushed output

thoni56
Thank you both! You learn something every day. Thank you very much!

The suggestions from Daniel worked beautifully!

/Thomas



