network
network copied to clipboard
Socket type is incorrect on Windows
Socket
is defined as holding a CInt
, but on Windows SOCKET
is an unsigned 64 bit value, usually UINT_PTR
. See [1] and [2] this means we may truncate a valid SOCKET
to something invalid.
[1] https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/include/psdk_inc/_socket_types.h [2] https://docs.microsoft.com/en-us/windows/win32/winsock/socket-data-type-2
@Mistuke Could you fix it by yourself?
Yes but I'm not sure how yet. The problem is it's an exported type and we have functions where the CInt
escapes into an external API.
I fear this may break a lot of packages.
- How large
CInt
is on Windows? - Do you mind if we release version 3.1.1.0 without fixing this?
- Should we change the error checking style for
socket()
andaccept()
on WIndows?
If we export the following CSocket
instead of CInt
, what will happen?
#if defined(mingw32_HOST_OS)
type CSocket = SOCKET
#else
type CSocket = CInt
#endif
How large CInt is on Windows?
Same as Linux, just 32-bits.
Do you mind if we release version 3.1.1.0 without fixing this?
Yeah I don't consider this a blocker, it's been like this for years so.
Should we change the error checking style for socket() and accept() on WIndows?
Yes as it returns SOCKET_ERROR
. so an unsigned -1
not a signed one.
But the problem is withFdSocket
, socketToFd
and mkSocket
all expose the CInt
, so what I meant is that changing this to the wider type on Windows would likely cause anything that uses these calls to fail..
Sorry but I did not know that the size of CInt
is 4 even on Unix.
Should we use CIntMax
instead?
Hmm I though that an fd
on Unix
is a signed 32-bit value? so CInt
is correct on Unix
no? For WIndows you'd need an unsigned value so CUIntMax
would be appropriate.
Do you have a source which explains that fd
on Unix
is limited to 32-bit?
Not directly, but all the docs for e.g. epoll
uses int
for fd
https://linux.die.net/man/2/epoll_ctl
and while not authoritative it is backed up by wikipedia https://en.wikipedia.org/wiki/File_descriptor
Wow. I misunderstood that sizeof int
on 64bit Unix is 8.
OK. It's 4. And epoll
's manual is a good evidence.
The size if the "int" type in C is implementation dependent. Historically (very long ago) it was even 16 bits. Now most Unix system have 32-bit ints, but some Unix variants (perhaps Cray systems) IIRC have had 64 bit ints.
This is why haskell has a CInt
data type rather than just using Int32
or similar. On any given Haskell runtime CInt
may be 32 or 64 bits.
In GHC, despite the documentation in base
https://hackage.haskell.org/package/base-4.12.0.0/docs/Foreign-C-Types.html, the actual type seems to be:
https://github.com/ghc/ghc/blob/21f0f56164f50844c2150c62f950983b2376f8b6/libraries/base/Foreign/C/Types.hs#L119-L122
and CInt
is defined indirectly via HTYPE_INT
, which in turn comes from configure.ac
:
https://github.com/ghc/ghc/blob/21f0f56164f50844c2150c62f950983b2376f8b6/libraries/base/configure.ac#L118-L175
So, the bottom line is that these types are platform-dependent. They do have Storable
instances, so that on any given platform the FFI can serialize and deserialize them portably regardless of size variations.
Now as for Sockets, indeed on Windows an OS Socket is NOT an int, and portable code needs to abstract the type (e.g. in Heimdal Kerberos: Windows: https://github.com/heimdal/heimdal/blob/afaaf3d89d86bb33f42a63767b41f57c24238aed/lib/roken/roken.h.in#L108 Unix: https://github.com/heimdal/heimdal/blob/afaaf3d89d86bb33f42a63767b41f57c24238aed/lib/roken/roken.h.in#L126 ).
This also means that sockets can't directly be converted back and forth to "file descriptors". The Haskell API can convert between Network.Socket Socket
s and OS integral SOCKET
s with the latter type being 32 bit signed on Unix and 64-bit unsigned on Windows.
@vdukhovni Thank you for your explanation. Very clear.
@kazu-yamamoto I opened an issue on GHC's gitlab that is motivated by this discussion: https://gitlab.haskell.org/ghc/ghc/issues/17335 Do you think I'm making a valid point, or is this unlikely to cause any confusion?
I just updated that issue, it seems that the top-matter (introductory text for the module) already discusses the portability caveat, but it is arguably too easy to miss when one is in a hurry?
Yes base has a few of those disclaimers around. In particular the Win32 stuff in base are mostly all wrong.
For the Win32
module I override the docs with ones generated in Windows.
But in a general sense they're wrong for all platforms where types differ not only by OS but by architecture as well.