ECONNABORTED and ECONNRESET on TCP socket using recv()

Cygwin list mailing list
Hi all

Has anyone experienced getting ECONNABORTED and ECONNRESET on a local TCP
socket when using recv()?


We have a fairly complex application that, among other things, spawns child
processes (using posix_spawnp).

This is a simplified scenario:

- parent performs socket() + bind() + listen() to localhost
- parent spawns a client-child process
  - client-child is doing socket() + connect() to localhost
  - client-child is doing send()
  - client-child is doing recv() and getting ECONNRESET

- parent performs accept()
- parent spawns a server-child process
  - server-child is doing recv() and getting ECONNABORTED
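
The scenario above could be sketched roughly as follows, as a starting point
for a standalone test. This is only an illustration under several assumptions:
it uses fork() where the application uses posix_spawnp, the parent itself plays
the server role instead of spawning a server-child, and the helper name
loopback_roundtrip and the "ping"/"pong" payloads are made up for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a ping/pong round trip over a
   loopback TCP connection succeeds, -1 otherwise. */
int loopback_roundtrip(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    char buf[8];

    /* parent: socket() + bind() + listen() on localhost */
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the OS pick a free port */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }
    assert(addr.sin_port != 0);             /* a port must have been assigned */

    pid_t pid = fork();                     /* stand-in for posix_spawnp */
    if (pid < 0) {
        close(lfd);
        return -1;
    }

    if (pid == 0) {
        /* client-child: socket() + connect() + send() + recv() */
        close(lfd);
        int cfd = socket(AF_INET, SOCK_STREAM, 0);
        if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0)
            _exit(1);
        if (send(cfd, "ping", 4, 0) != 4)
            _exit(2);
        /* in the report, this recv() is where ECONNRESET showed up */
        ssize_t n = recv(cfd, buf, sizeof buf, 0);
        _exit(n == 4 && memcmp(buf, "pong", 4) == 0 ? 0 : 3);
    }

    /* parent: accept() and answer the client */
    int status = -1;
    int afd = accept(lfd, NULL, NULL);
    if (afd >= 0) {
        ssize_t n = recv(afd, buf, sizeof buf, 0);
        if (n == 4 && memcmp(buf, "ping", 4) == 0)
            send(afd, "pong", 4, 0);
        close(afd);
    }
    close(lfd);

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling loopback_roundtrip() from main() and checking for 0 gives a baseline
that should pass; the failing behaviour would then have to be introduced step
by step (spawned server-child, socket options, and so on).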


According to strace, both of these errors originate from
fhandler_socket_inet::recv_internal() (in my version it says line 1221).



Maybe there's some defect in our application (there's a lot of other stuff
going on as well), but it works on several Linux implementations, whereas this
error occurs deterministically on Cygwin.


I've searched the mail archives but cannot find any explanation or
cause.

Does anyone have any knowledge about this?



Best regards
Kristian

p.s.
   strace -f works the opposite way compared to most Linux implementations, btw
(as far as I understand)

--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple

Re: ECONNABORTED and ECONNRESET on TCP socket using recv()

Corinna Vinschen-2
On May  8 10:32, Kristian Ivarsson via Cygwin wrote:

> According to strace, both of these errors originate from
> fhandler_socket_inet::recv_internal() (in my version it says line 1221)

The errors are generated by the called Windows function WSARecvFrom.
We'd need a reproducible testcase for this to allow debugging.

> p.s.
>    strace -f works the opposite way compared to most Linux implementations, btw
> (as far as I understand)

It's a toggle; multiple -f on the command line will switch it multiple times.
The default of following forks is pretty old: it dates back to commit
f69af9b3d2352 from 2002.


Corinna

--
Corinna Vinschen
Cygwin Maintainer

Re: ECONNABORTED and ECONNRESET on TCP socket using recv()

Cygwin list mailing list
> The errors are generated by the called Windows function WSARecvFrom.
> We'd need a reproducible testcase for this to allow debugging.

The application is quite complex, so I guess it won't count as a test case,
and we still fail to reproduce this in a simple manner.


Looking at strace along with a Winsock trace revealed a few mysterious things,
though. According to the strace output there's a fork for every posix_spawnp,
i.e. it seems that two processes are created (the forked one later exits) but
they are somehow tied to the same Cygwin pid. The weird thing is that one of
the forked "ghost processes" gets a Winsock abort event, so my take on this is
that the duplication of socket descriptors somehow transfers ownership to the
wrong process, or perhaps there's some premature release or the like. The
"ghost process" getting the Winsock abort event is of a type that should
"inherit" the accept socket and is called "client-child" in the description
above.


The problem doesn't occur in the simplified test case, but if someone is
interested I can give guidance on building/testing the more complex
test cases.


Does anyone have a clue where we could find more information about this,
or does something obvious come to mind about how to proceed?


There are some comments in the implementation of
fhandler_socket_inet::recv_internal() that might be related to this, though
the comments do not really describe our scenario.


Best regards,
Kristian




Re: ECONNABORTED and ECONNRESET on TCP socket using recv()

Cygwin list mailing list
We discovered this to be a defect in our own code: some parts assumed that
struct linger always has two ints, but in Cygwin it has two shorts.
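
One way to avoid depending on the member types is to fill the platform's own
struct linger by field name and pass sizeof of that struct. The following is a
minimal sketch, not the code from our application; the helper names set_linger
and get_linger are made up for this example.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: enable or disable SO_LINGER on fd.
   The members are assigned by name, so it does not matter whether the
   platform declares them as int or as short. */
int set_linger(int fd, int enabled, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);
    lg.l_onoff = enabled ? 1 : 0;
    lg.l_linger = seconds;   /* converted to the member's declared type;
                                keep the value small enough for a short */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}

/* Hypothetical helper: read the current setting back.
   Returns 0 on success, -1 on failure. */
int get_linger(int fd, int *enabled, int *seconds)
{
    struct linger lg;
    socklen_t len = sizeof lg;

    if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len) < 0)
        return -1;
    *enabled = lg.l_onoff != 0;
    *seconds = lg.l_linger;
    return 0;
}
```

The point of the design is that no code outside these two functions needs to
know the layout of struct linger at all.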

This was discovered thanks to the following strace debug output

  if (optlen == (socklen_t) sizeof (int))
    debug_printf ("setsockopt optval=%x", *(int *) optval);

in fhandler_socket_inet::setsockopt, which in itself is kind of odd, i.e.
it seems to assume that an int is passed just because optlen is the same
size as an int (and struct linger happens to be just that size, so ... it kind
of helped us :-)
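
To illustrate the size coincidence, here is a small sketch of what such a debug
line would show when the option value is really a two-short linger struct.
Everything here is an assumption for illustration: struct short_linger only
mirrors the layout described above and is not the Cygwin header, view_as_int is
a made-up name, and the values in the comments hold for a little-endian machine
with 2-byte short and 4-byte int.

```c
#include <assert.h>
#include <string.h>

/* Assumption: stand-in for a struct linger made of two shorts. */
struct short_linger {
    unsigned short l_onoff;
    unsigned short l_linger;
};

/* Reads the first sizeof(unsigned int) bytes of optval as one integer,
   roughly what printing *(int *) optval with %x would display. */
unsigned int view_as_int(const void *optval)
{
    unsigned int v;

    assert(optval != NULL);
    memcpy(&v, optval, sizeof v);   /* memcpy sidesteps aliasing issues */
    return v;
}
```

With { l_onoff = 1, l_linger = 5 } the value is 0x50001 on little-endian, i.e.
both members packed into one int. If the caller had instead laid out two ints
{ 1, 5 }, the first four bytes read as 0x1, which through the two-short layout
means l_onoff = 1 and l_linger = 0. A zero linger timeout generally requests an
abortive close, which might be related to the resets we saw, but that last step
is a guess on top of what was actually verified.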

Keep up the good work
Kristian
