Non-deterministic "Bad file descriptor" from hGetContents
Are there any troubleshooting steps you might suggest if I'm getting errors similar to #10 with new versions of text? I'm trying to do a shrink, but right now I can only reproduce the problem non-deterministically (yay) as part of the Cryptol test suite on OS X and Windows (our Linux hosts have not exhibited this failure).
Cryptol source files are specified as UTF-8, so we have a check that trying to load an ISO-8859-1 file fails, but without bringing down the whole interpreter. The script is just:
:l check31.iso8859.cry
:l check31.utf8.cry
Example output is here. You can see that on most runs of this script, we get the expected error message from the 8859 file followed by a successful load of Foo, the UTF-8 version. However, on some runs, the UTF-8 version fails when trying to load the Cryptol Prelude, which is a UTF-8 file, and which loads at the beginning of each of these attempts successfully.
Could there be some bit of internal state in text that is left in a bad state by trying to load an unexpected encoding? FWIW, the following attempt at a shrink couldn't trigger the problem:
module Main where

import qualified Control.Exception as X
import Control.Monad
import qualified Data.Text as T
import qualified Data.Text.IO as T
import System.IO

f = do
  let eLoadAndPrint :: FilePath -> IO (Either X.IOException T.Text)
      eLoadAndPrint path = X.try $ do
        h <- openFile path ReadMode
        T.hGetContents h
  pre1 <- eLoadAndPrint "lib/Cryptol.cry"
  print pre1
  iso <- eLoadAndPrint "tests/regression/check31.iso8859.cry"
  print iso
  pre2 <- eLoadAndPrint "lib/Cryptol.cry"
  print pre2

main = replicateM_ 10000 f
As requested, the GHCs where I've seen this include 7.8.3, 7.8.4, 7.10.2, and 7.10.3 for 64-bit Windows, 64-bit Mac, and both 32 and 64-bit Linux.
I don't know of any such internal state. My sorry attempt at a best guess is some kind of memory corruption problem, but that's fairly obvious.
I'm a little unclear from your description on the sequencing here. Is it this?
- You try to load the 8859 file. This always fails as expected.
- Sometimes an attempt to load the UTF-8 file immediately afterwards fails.
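For anyone wanting to exercise that two-step sequence without the Cryptol sources, here is a minimal self-contained sketch. It is not taken from the Cryptol test suite: the file names, the `readUtf8` and `demo` helpers, and the sample bytes are all made up for illustration, and it assumes that forcing the handle encoding to UTF-8 with `hSetEncoding` matches what the failing setup effectively does. Without that call, decoding follows the locale encoding, so results could differ between hosts.

```haskell
module Main where

import qualified Control.Exception as X
import qualified Data.Text as T
import qualified Data.Text.IO as T
import System.Directory (getTemporaryDirectory, removeFile)
import System.FilePath ((</>))
import System.IO

-- Hypothetical helper: read a whole file as Text, forcing UTF-8 decoding
-- so the result does not depend on the locale encoding.
readUtf8 :: FilePath -> IO (Either X.IOException T.Text)
readUtf8 path = X.try $ withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  T.hGetContents h

-- Write raw bytes; in binary mode each Char below 256 is written as one byte.
writeBytes :: FilePath -> String -> IO ()
writeBytes path bytes = withBinaryFile path WriteMode $ \h -> hPutStr h bytes

-- Step 1: decode an ISO-8859-1 file as UTF-8 (should fail).
-- Step 2: decode a UTF-8 file immediately afterwards (should succeed).
demo :: IO (Either X.IOException T.Text, Either X.IOException T.Text)
demo = do
  tmp <- getTemporaryDirectory
  let isoPath = tmp </> "shrink-iso8859.txt"  -- illustrative file names
      utfPath = tmp </> "shrink-utf8.txt"
  writeBytes isoPath "caf\xE9\n"      -- 0xE9 followed by 0x0A is invalid UTF-8
  writeBytes utfPath "caf\xC3\xA9\n"  -- the same text encoded as UTF-8
  bad  <- readUtf8 isoPath
  good <- readUtf8 utfPath
  mapM_ removeFile [isoPath, utfPath]
  pure (bad, good)

main :: IO ()
main = do
  (bad, good) <- demo
  print bad
  print good
```

If the first result is a `Left` and the second a `Right` on every iteration, the failure probably depends on something outside this sequence, such as how handles are shared or closed in the larger program.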
Given that UTF decoding has been completely rewritten since 2016, I suggest closing this issue. @Lysxia what do you think?