fs2 icon indicating copy to clipboard operation
fs2 copied to clipboard

Test http4s servers fail release bound port on shutdown when using fs2 3.12.0

Open mtomko opened this issue 4 months ago • 5 comments

This issue has been minorly discussed in discord in the cats-effect channel, but I have recently concluded that the most fruitful avenue for exploration may be in fs2 rather than cats-effect, although my suspicion is that the bug might be somewhere straddling the two libraries.

The general idea is that we have a number of integration tests that spin up http4s servers so the tests can run requests against them. The servers are run within Resource so we can guarantee that they shut down cleanly when the tests are finished. We are careful to not run tests in parallel and we've been setting the fork property so tests run in a forked JVM.

When cats-effect 3.6.0 came out, we upgraded cats-effect and fs2 and suddenly a number of our tests began failing with errors like:

==> X com.example.ce361.ReproSpec.shutdown 2  2.168s java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:565)
    at sun.nio.ch.ServerSocketChannelImpl.netBind(ServerSocketChannelImpl.java:344)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:301)
    at java.nio.channels.ServerSocketChannel.bind(ServerSocketChannel.java:224)
    at fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$7(SelectingSocketGroup.scala:112)
    at delay @ fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$6(SelectingSocketGroup.scala:107)
    at traverse @ fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$4(SelectingSocketGroup.scala:106)
    at flatMap @ fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$4(SelectingSocketGroup.scala:106)
    at *>$extension @ org.typelevel.keypool.internal.RequestSemaphore$.apply(RequestSemaphore.scala:79)

Our interpretation of the failure is that at the end of the resource scope, when the http4s servers should have shut down, they have not unbound their ports in the OS, so when a new test starts up and tries to start a new server (on the same port, because we just use a fixed port), it cannot bind to the test port and fails. As a result, we have been stuck back on cats-effect 3.5.7 and fs2 3.11.0 since early this year, but it's becoming increasingly untenable to be in this situation.

I have made a repro case in this repository, mirroring a common basic setup for our tests:

https://github.com/mtomko/ce361

Note that if you use fs2 3.11.0, it seems to work fine, but when you bump to 3.12.0, it fails.

I'm happy to do some legwork here but I think without a bit of guidance I might be hard pressed to make any progress. I was planning to begin by looking at the polling change introduced in 3.12.0 but I'm not familiar with the internals of either fs2 or cats-effect so I have a pretty steep learning curve.

mtomko avatar Aug 21 '25 21:08 mtomko

I am not an expert in these APIs but I noticed this code in SelectingSocketGroup.scala, in the definition of serverResource:

    Resource
      .make(F.delay(selector.provider.openServerSocketChannel())) { ch =>
        F.delay(ch.close())
      }
      .evalMap { serverCh =>
        val configure = address.traverse(_.resolve).flatMap { ip =>
          F.delay {
            serverCh.configureBlocking(false)
            serverCh.bind(
              new InetSocketAddress(
                ip.map(_.toInetAddress).orNull,
                port.map(_.value).getOrElse(0)
              )
            )
          }
        }

I'm curious because serverCh.bind is non-void - it returns something that can be closed, but it doesn't seem to be closed anywhere explicitly. It's possible that closing the channel created in the resource above will unbind serverCh too but what if it doesn't?

mtomko avatar Aug 21 '25 22:08 mtomko

I tested this and it doesn't seem to pan out. Looking at the Javadocs, I finally found that bind returns the socket it was called on to allow for chaining, so if the channel is closed at all (which it should be since it's a resource) the socket should be unbound.

mtomko avatar Aug 21 '25 22:08 mtomko

Also, including the output of sbt evicted for the fs2 3.11.0 case:

[info] Here are other dependency conflicts that were resolved:
[info]  * org.scala-lang:scala-library:2.13.16 is selected over {2.13.14, 2.13.15, 2.13.8, 2.13.5, 2.13.12, 2.13.11}
[info]      +- com.example:ce361_2.13:0.0.1-SNAPSHOT              (depends on 2.13.16)
[info]      +- io.circe:circe-numbers_2.13:0.14.13                (depends on 2.13.16)
[info]      +- org.typelevel:cats-kernel_2.13:2.13.0              (depends on 2.13.16)
[info]      +- org.typelevel:cats-core_2.13:2.13.0                (depends on 2.13.16)
[info]      +- io.circe:circe-generic_2.13:0.14.13                (depends on 2.13.16)
[info]      +- io.circe:circe-core_2.13:0.14.13                   (depends on 2.13.16)
[info]      +- com.chuusai:shapeless_2.13:2.3.13                  (depends on 2.13.16)
[info]      +- org.scala-lang:scala-reflect:2.13.16               (depends on 2.13.16)
[info]      +- org.typelevel:vault_2.13:3.6.0                     (depends on 2.13.14)
[info]      +- org.typelevel:log4cats-slf4j_2.13:2.7.0            (depends on 2.13.14)
[info]      +- org.typelevel:log4cats-core_2.13:2.7.0             (depends on 2.13.14)
[info]      +- org.typelevel:literally_2.13:1.1.0                 (depends on 2.13.8)
[info]      +- org.typelevel:keypool_2.13:0.4.10                  (depends on 2.13.14)
[info]      +- org.typelevel:jawn-parser_2.13:1.5.1               (depends on 2.13.11)
[info]      +- org.typelevel:jawn-fs2_2.13:2.4.0                  (depends on 2.13.8)
[info]      +- org.typelevel:cats-parse_2.13:1.0.0                (depends on 2.13.12)
[info]      +- org.typelevel:cats-mtl_2.13:1.3.1                  (depends on 2.13.8)
[info]      +- org.typelevel:cats-effect_2.13:3.6.3               (depends on 2.13.15)
[info]      +- org.typelevel:cats-effect-std_2.13:3.6.3           (depends on 2.13.15)
[info]      +- org.typelevel:cats-effect-kernel_2.13:3.6.3        (depends on 2.13.15)
[info]      +- org.typelevel:case-insensitive_2.13:1.4.2          (depends on 2.13.14)
[info]      +- org.scodec:scodec-bits_2.13:1.1.38                 (depends on 2.13.12)
[info]      +- org.log4s:log4s_2.13:1.10.0                        (depends on 2.13.5)
[info]      +- org.http4s:http4s-server_2.13:0.23.30              (depends on 2.13.15)
[info]      +- org.http4s:http4s-jawn_2.13:0.23.30                (depends on 2.13.15)
[info]      +- org.http4s:http4s-ember-server_2.13:0.23.30        (depends on 2.13.15)
[info]      +- org.http4s:http4s-ember-core_2.13:0.23.30          (depends on 2.13.15)
[info]      +- org.http4s:http4s-ember-client_2.13:0.23.30        (depends on 2.13.15)
[info]      +- org.http4s:http4s-dsl_2.13:0.23.30                 (depends on 2.13.15)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 2.13.8)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 2.13.15)
[info]      +- org.http4s:http4s-client_2.13:0.23.30              (depends on 2.13.15)
[info]      +- org.http4s:http4s-circe_2.13:0.23.30               (depends on 2.13.15)
[info]      +- io.circe:circe-jawn_2.13:0.14.8                    (depends on 2.13.14)
[info]      +- com.comcast:ip4s-core_2.13:3.6.0                   (depends on 2.13.14)
[info]      +- co.fs2:fs2-io_2.13:3.11.0                          (depends on 2.13.14)
[info]      +- co.fs2:fs2-core_2.13:3.11.0                        (depends on 2.13.14)
[info]  * co.fs2:fs2-core_2.13:3.11.0 is selected over 3.3.0
[info]      +- com.example:ce361_2.13:0.0.1-SNAPSHOT              (depends on 3.11.0)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 3.11.0)
[info]      +- co.fs2:fs2-io_2.13:3.11.0                          (depends on 3.11.0)
[info]      +- org.typelevel:jawn-fs2_2.13:2.4.0                  (depends on 3.3.0)
[info]  * io.circe:circe-core_2.13:0.14.13 is selected over {0.14.8}
[info]      +- io.circe:circe-generic_2.13:0.14.13                (depends on 0.14.13)
[info]      +- org.http4s:http4s-circe_2.13:0.23.30               (depends on 0.14.8)
[info]      +- io.circe:circe-jawn_2.13:0.14.8                    (depends on 0.14.8)
[info]  * org.typelevel:cats-effect_2.13:3.6.3 is selected over {3.5.4}
[info]      +- com.example:ce361_2.13:0.0.1-SNAPSHOT              (depends on 3.6.3)
[info]      +- org.typelevel:vault_2.13:3.6.0                     (depends on 3.5.4)
[info]      +- org.typelevel:log4cats-slf4j_2.13:2.7.0            (depends on 3.5.4)
[info]      +- com.comcast:ip4s-core_2.13:3.6.0                   (depends on 3.5.4)
[info]      +- co.fs2:fs2-core_2.13:3.11.0                        (depends on 3.5.4)
[info]  * org.typelevel:cats-effect-std_2.13:3.6.3 is selected over {3.5.7, 3.5.4}
[info]      +- org.typelevel:cats-effect_2.13:3.6.3               (depends on 3.6.3)
[info]      +- org.typelevel:log4cats-core_2.13:2.7.0             (depends on 3.5.4)
[info]      +- org.typelevel:keypool_2.13:0.4.10                  (depends on 3.5.4)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 3.5.7)
[info]  * org.scala-lang:scala-reflect:2.13.16 is selected over 2.13.12
[info]      +- org.typelevel:cats-parse_2.13:1.0.0                (depends on 2.13.12)
[info]  * org.scodec:scodec-bits_2.13:1.1.38 is selected over 1.1.34
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 1.1.38)
[info]      +- co.fs2:fs2-core_2.13:3.11.0                        (depends on 1.1.38)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 1.1.34)
[info]  * org.typelevel:jawn-parser_2.13:1.5.1 is selected over 1.4.0
[info]      +- io.circe:circe-jawn_2.13:0.14.8                    (depends on 1.5.1)
[info]      +- org.http4s:http4s-jawn_2.13:0.23.30                (depends on 1.5.1)
[info]      +- org.typelevel:jawn-fs2_2.13:2.4.0                  (depends on 1.4.0)
[info]  * org.typelevel:cats-core_2.13:2.13.0 is selected over {2.11.0, 2.10.0, 2.8.0}
[info]      +- io.circe:circe-core_2.13:0.14.13                   (depends on 2.13.0)
[info]      +- org.typelevel:vault_2.13:3.6.0                     (depends on 2.11.0)
[info]      +- org.typelevel:log4cats-core_2.13:2.7.0             (depends on 2.10.0)
[info]      +- org.typelevel:keypool_2.13:0.4.10                  (depends on 2.11.0)
[info]      +- org.typelevel:cats-mtl_2.13:1.3.1                  (depends on 2.8.0)
[info]      +- org.typelevel:cats-effect-kernel_2.13:3.6.3        (depends on 2.11.0)
[info]      +- org.typelevel:case-insensitive_2.13:1.4.2          (depends on 2.11.0)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 2.8.0)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 2.11.0)
[info]      +- com.comcast:ip4s-core_2.13:3.6.0                   (depends on 2.10.0)
[info]      +- co.fs2:fs2-core_2.13:3.11.0                        (depends on 2.11.0)
[info]  * org.typelevel:cats-effect-kernel_2.13:3.6.3 is selected over 3.3.14
[info]      +- org.typelevel:cats-effect-std_2.13:3.6.3           (depends on 3.6.3)
[info]      +- org.typelevel:cats-effect_2.13:3.6.3               (depends on 3.6.3)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 3.3.14)
[info]  * org.slf4j:slf4j-api:1.7.36 is selected over 1.7.30
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 1.7.36)
[info]      +- org.typelevel:log4cats-slf4j_2.13:2.7.0            (depends on 1.7.36)
[info]      +- org.log4s:log4s_2.13:1.10.0                        (depends on 1.7.30)

and here it is for 3.12.0:

[info] Here are other dependency conflicts that were resolved:
[info]  * org.scala-lang:scala-library:2.13.16 is selected over {2.13.14, 2.13.15, 2.13.8, 2.13.5, 2.13.12, 2.13.11}
[info]      +- com.example:ce361_2.13:0.0.1-SNAPSHOT              (depends on 2.13.16)
[info]      +- io.circe:circe-numbers_2.13:0.14.13                (depends on 2.13.16)
[info]      +- org.typelevel:cats-kernel_2.13:2.13.0              (depends on 2.13.16)
[info]      +- co.fs2:fs2-io_2.13:3.12.0                          (depends on 2.13.16)
[info]      +- org.typelevel:cats-core_2.13:2.13.0                (depends on 2.13.16)
[info]      +- io.circe:circe-generic_2.13:0.14.13                (depends on 2.13.16)
[info]      +- co.fs2:fs2-core_2.13:3.12.0                        (depends on 2.13.16)
[info]      +- io.circe:circe-core_2.13:0.14.13                   (depends on 2.13.16)
[info]      +- com.chuusai:shapeless_2.13:2.3.13                  (depends on 2.13.16)
[info]      +- org.scala-lang:scala-reflect:2.13.16               (depends on 2.13.16)
[info]      +- org.typelevel:vault_2.13:3.6.0                     (depends on 2.13.14)
[info]      +- org.typelevel:log4cats-slf4j_2.13:2.7.0            (depends on 2.13.14)
[info]      +- org.typelevel:log4cats-core_2.13:2.7.0             (depends on 2.13.14)
[info]      +- org.typelevel:literally_2.13:1.1.0                 (depends on 2.13.8)
[info]      +- org.typelevel:keypool_2.13:0.4.10                  (depends on 2.13.14)
[info]      +- org.typelevel:jawn-parser_2.13:1.5.1               (depends on 2.13.11)
[info]      +- org.typelevel:jawn-fs2_2.13:2.4.0                  (depends on 2.13.8)
[info]      +- org.typelevel:cats-parse_2.13:1.0.0                (depends on 2.13.12)
[info]      +- org.typelevel:cats-mtl_2.13:1.3.1                  (depends on 2.13.8)
[info]      +- org.typelevel:cats-effect_2.13:3.6.3               (depends on 2.13.15)
[info]      +- org.typelevel:cats-effect-std_2.13:3.6.3           (depends on 2.13.15)
[info]      +- org.typelevel:cats-effect-kernel_2.13:3.6.3        (depends on 2.13.15)
[info]      +- org.typelevel:case-insensitive_2.13:1.4.2          (depends on 2.13.14)
[info]      +- org.scodec:scodec-bits_2.13:1.1.38                 (depends on 2.13.12)
[info]      +- org.log4s:log4s_2.13:1.10.0                        (depends on 2.13.5)
[info]      +- org.http4s:http4s-server_2.13:0.23.30              (depends on 2.13.15)
[info]      +- org.http4s:http4s-jawn_2.13:0.23.30                (depends on 2.13.15)
[info]      +- org.http4s:http4s-ember-server_2.13:0.23.30        (depends on 2.13.15)
[info]      +- org.http4s:http4s-ember-core_2.13:0.23.30          (depends on 2.13.15)
[info]      +- org.http4s:http4s-ember-client_2.13:0.23.30        (depends on 2.13.15)
[info]      +- org.http4s:http4s-dsl_2.13:0.23.30                 (depends on 2.13.15)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 2.13.8)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 2.13.15)
[info]      +- org.http4s:http4s-client_2.13:0.23.30              (depends on 2.13.15)
[info]      +- org.http4s:http4s-circe_2.13:0.23.30               (depends on 2.13.15)
[info]      +- io.circe:circe-jawn_2.13:0.14.8                    (depends on 2.13.14)
[info]      +- com.comcast:ip4s-core_2.13:3.6.0                   (depends on 2.13.14)
[info]  * co.fs2:fs2-core_2.13:3.12.0 is selected over {3.11.0, 3.3.0}
[info]      +- com.example:ce361_2.13:0.0.1-SNAPSHOT              (depends on 3.12.0)
[info]      +- co.fs2:fs2-io_2.13:3.12.0                          (depends on 3.12.0)
[info]      +- org.typelevel:jawn-fs2_2.13:2.4.0                  (depends on 3.3.0)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 3.11.0)
[info]  * io.circe:circe-core_2.13:0.14.13 is selected over {0.14.8}
[info]      +- io.circe:circe-generic_2.13:0.14.13                (depends on 0.14.13)
[info]      +- org.http4s:http4s-circe_2.13:0.23.30               (depends on 0.14.8)
[info]      +- io.circe:circe-jawn_2.13:0.14.8                    (depends on 0.14.8)
[info]  * org.typelevel:cats-effect_2.13:3.6.3 is selected over {3.6.0, 3.5.4}
[info]      +- com.example:ce361_2.13:0.0.1-SNAPSHOT              (depends on 3.6.3)
[info]      +- org.typelevel:vault_2.13:3.6.0                     (depends on 3.5.4)
[info]      +- org.typelevel:log4cats-slf4j_2.13:2.7.0            (depends on 3.5.4)
[info]      +- com.comcast:ip4s-core_2.13:3.6.0                   (depends on 3.5.4)
[info]      +- co.fs2:fs2-core_2.13:3.12.0                        (depends on 3.6.0)
[info]  * org.typelevel:cats-effect-std_2.13:3.6.3 is selected over {3.5.7, 3.5.4}
[info]      +- org.typelevel:cats-effect_2.13:3.6.3               (depends on 3.6.3)
[info]      +- org.typelevel:log4cats-core_2.13:2.7.0             (depends on 3.5.4)
[info]      +- org.typelevel:keypool_2.13:0.4.10                  (depends on 3.5.4)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 3.5.7)
[info]  * org.scala-lang:scala-reflect:2.13.16 is selected over 2.13.12
[info]      +- org.typelevel:cats-parse_2.13:1.0.0                (depends on 2.13.12)
[info]  * org.scodec:scodec-bits_2.13:1.1.38 is selected over 1.1.34
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 1.1.38)
[info]      +- co.fs2:fs2-core_2.13:3.12.0                        (depends on 1.1.38)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 1.1.34)
[info]  * org.typelevel:jawn-parser_2.13:1.5.1 is selected over 1.4.0
[info]      +- io.circe:circe-jawn_2.13:0.14.8                    (depends on 1.5.1)
[info]      +- org.http4s:http4s-jawn_2.13:0.23.30                (depends on 1.5.1)
[info]      +- org.typelevel:jawn-fs2_2.13:2.4.0                  (depends on 1.4.0)
[info]  * org.typelevel:cats-core_2.13:2.13.0 is selected over {2.11.0, 2.10.0, 2.8.0}
[info]      +- io.circe:circe-core_2.13:0.14.13                   (depends on 2.13.0)
[info]      +- org.typelevel:vault_2.13:3.6.0                     (depends on 2.11.0)
[info]      +- org.typelevel:log4cats-core_2.13:2.7.0             (depends on 2.10.0)
[info]      +- org.typelevel:keypool_2.13:0.4.10                  (depends on 2.11.0)
[info]      +- org.typelevel:cats-mtl_2.13:1.3.1                  (depends on 2.8.0)
[info]      +- org.typelevel:cats-effect-kernel_2.13:3.6.3        (depends on 2.11.0)
[info]      +- org.typelevel:case-insensitive_2.13:1.4.2          (depends on 2.11.0)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 2.8.0)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 2.11.0)
[info]      +- com.comcast:ip4s-core_2.13:3.6.0                   (depends on 2.10.0)
[info]      +- co.fs2:fs2-core_2.13:3.12.0                        (depends on 2.11.0)
[info]  * co.fs2:fs2-io_2.13:3.12.0 is selected over 3.11.0
[info]      +- com.example:ce361_2.13:0.0.1-SNAPSHOT              (depends on 3.12.0)
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 3.11.0)
[info]  * org.typelevel:cats-effect-kernel_2.13:3.6.3 is selected over 3.3.14
[info]      +- org.typelevel:cats-effect-std_2.13:3.6.3           (depends on 3.6.3)
[info]      +- org.typelevel:cats-effect_2.13:3.6.3               (depends on 3.6.3)
[info]      +- org.http4s:http4s-crypto_2.13:0.2.4                (depends on 3.3.14)
[info]  * org.slf4j:slf4j-api:1.7.36 is selected over 1.7.30
[info]      +- org.http4s:http4s-core_2.13:0.23.30                (depends on 1.7.36)
[info]      +- org.typelevel:log4cats-slf4j_2.13:2.7.0            (depends on 1.7.36)
[info]      +- org.log4s:log4s_2.13:1.10.0                        (depends on 1.7.30)

mtomko avatar Aug 21 '25 22:08 mtomko

Still digging on this, and ran into:

https://bugs.openjdk.org/browse/JDK-5073504

There are no JavaDocs for ServerSocketChannel.close() -- the reader must default to the Channel.close() docs, which are not specific about the function of the close() method on a non-blocking ServerSocketChannel and the releasing of OS resources. Specifically, after closing a non-blocking ServerSocketChannel that has been registered with a Selector, the isOpen() method will return false. There is no reason for the developer to believe that the close operation is not complete at this point. However, the port has not been freed to the OS...and a subsequent open()/bind() to the same port number will fail. Even if you have de-registered the channel with the selector via SelectionKey.cancel(), the resource is still not freed. You must make another call to Selector.select() in order for the resources to be freed. In our case, our application knows that there are no other resources registered to the Selector and so there was no reason (or so we thought) to make another call to the Selector.

mtomko avatar Aug 22 '25 17:08 mtomko

It seems like this might be it. I checked out fs2 and modified the close logic slightly to wait until the channel no longer thinks it's registered before completing. It now passes my repro test case (with a slightly longer timeout, but I think that's acceptable for a test that opens and closes 1000 servers in a row).

Here's the new logic:

      .make(F.delay(selector.provider.openServerSocketChannel())) { ch =>
        def wait: F[Unit] = F.delay(ch.isRegistered()).ifM(F.sleep(2.millis) >> wait, F.unit)
        F.delay(ch.close()) >> wait
      }

mtomko avatar Aug 22 '25 17:08 mtomko