hot-reload not working with k3d filesystem on macOS (M1): "Bad file descriptor"

Open yanchogeorgiev opened this issue 3 years ago • 7 comments

Describe the bug Hot-reload is not working on macOS (M1). The router is unable to start. The same setup tested on Ubuntu works without any issues.

To Reproduce Steps to reproduce the behavior:

  1. Install k3d and tilt
  2. Load configuration from this example: https://www.apollographql.com/docs/router/containerization/kubernetes/
  3. The router is unable to start

Expected behavior The router should start and hot-reload should work as it does on Ubuntu.

Output

{"timestamp":"2022-08-08T10:53:02.456187Z","level":"INFO","fields":{"message":"Apollo Router v0.14.0 // (c) Apollo Graph, Inc. // Licensed as ELv2 (https://go.apollo.dev/elv2)"},"target":"apollo_router::executable"}
thread 'main' panicked at 'Failed to initialise file watching.: Notify(Io(Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }))', apollo-router/src/files.rs:22:14
stack backtrace:
   0:       0x40023b8a8d - std::backtrace_rs::backtrace::libunwind::trace::h22893a5306c091b4
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:       0x40023b8a8d - std::backtrace_rs::backtrace::trace_unsynchronized::h29c3bc6f9e91819d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:       0x40023b8a8d - std::sys_common::backtrace::_print_fmt::he497d8a0ec903793
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:66:5
   3:       0x40023b8a8d - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9c2a9d2774d81873
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:45:22
   4:       0x40023ddaac - core::fmt::write::hba4337c43d992f49
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/fmt/mod.rs:1194:17
   5:       0x40023b1b21 - std::io::Write::write_fmt::heb73de6e02cfabed
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/io/mod.rs:1655:15
   6:       0x40023ba8f5 - std::sys_common::backtrace::_print::h63c8b24acdd8e8ce
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:48:5
   7:       0x40023ba8f5 - std::sys_common::backtrace::print::h426700d6240cdcc2
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:35:9
   8:       0x40023ba8f5 - std::panicking::default_hook::{{closure}}::hc9a76eed0b18f82b
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:295:22
   9:       0x40023ba5a9 - std::panicking::default_hook::h2e88d02087fae196
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:314:9
  10:       0x40023bae42 - std::panicking::rust_panic_with_hook::habfdcc2e90f9fd4c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:698:17
  11:       0x40023bad27 - std::panicking::begin_panic_handler::{{closure}}::he054b2a83a51d2cd
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:588:13
  12:       0x40023b8f44 - std::sys_common::backtrace::__rust_end_short_backtrace::ha48b94ab49b30915
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:138:18
  13:       0x40023baa59 - rust_begin_unwind
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
  14:       0x40002bf163 - core::panicking::panic_fmt::h366d3a309ae17c94
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
  15:       0x40002bf253 - core::result::unwrap_failed::hddd78f4658ac7d0f
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/result.rs:1785:5
  16:       0x40008f94e0 - apollo_router::files::watch::h97b97d82fdad84f1
  17:       0x400051eac6 - apollo_router::router::ApolloRouter::serve::h3c5f8c36263d6ab0
  18:       0x40004432c9 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hefc342c9ce0bf595
  19:       0x400082e1dd - std::thread::local::LocalKey<T>::with::h250e38da0c18df15
  20:       0x4000999fca - tokio::park::thread::CachedParkThread::block_on::h86bf9862cf0a7622
  21:       0x40006f6f57 - tokio::runtime::thread_pool::ThreadPool::block_on::hc56a09f1564bc964
  22:       0x40005eeadd - tokio::runtime::Runtime::block_on::hf00e84efa2076f11
  23:       0x4000344836 - apollo_router::executable::main::h53f2296129dbd07f
  24:       0x40002bfb0b - router::main::h479ae5fda6e8c43a
  25:       0x40002bfae3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h28cdcd0d5b402e52
  26:       0x40002bfab9 - std::rt::lang_start::{{closure}}::h13104d833b4cbf97
  27:       0x40023ab3de - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::had4f69b3aefb47a8
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:259:13
  28:       0x40023ab3de - std::panicking::try::do_call::hf2ad5355fcafe775
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:492:40
  29:       0x40023ab3de - std::panicking::try::h0a63ac363423e61e
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:456:19
  30:       0x40023ab3de - std::panic::catch_unwind::h18088edcecb8693a
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panic.rs:137:14
  31:       0x40023ab3de - std::rt::lang_start_internal::{{closure}}::ha7dad166dc711761
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:128:48
  32:       0x40023ab3de - std::panicking::try::do_call::hda0c61bf3a57d6e6
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:492:40
  33:       0x40023ab3de - std::panicking::try::hbc940e68560040a9
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:456:19
  34:       0x40023ab3de - std::panic::catch_unwind::haed0df2aeb3fa368
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panic.rs:137:14
  35:       0x40023ab3de - std::rt::lang_start_internal::h9c06694362b5b80c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:128:20
  36:       0x40002bfbc2 - main
  37:       0x4005ac0d0a - __libc_start_main
  38:       0x40002bf9ee - _start
  39:                0x0 - <unknown>

Desktop (please complete the following information):

  • OS: macOS Monterey
  • Version 12.5

Additional context Maybe this is related: https://github.com/notify-rs/notify/issues/282

Additional question Is there a way to manually reload the supergraph schema without restart?

yanchogeorgiev avatar Aug 08 '22 10:08 yanchogeorgiev

I'm going to rename the issue, because the problem is caused by the filesystem you are running in k3d. Reload works fine on OS X on bare metal or Docker.

garypen avatar Aug 08 '22 11:08 garypen

I've verified this. Not sure what the fix is right now.

garypen avatar Aug 08 '22 11:08 garypen

Workaround: don't specify --hot-reload; use APOLLO_GRAPH_REF and APOLLO_KEY to download your supergraph from Studio.

garypen avatar Aug 08 '22 13:08 garypen

I have confirmed this issue also affects me on an M1 MacBook using Docker Desktop.

tcarrio avatar Aug 26 '22 17:08 tcarrio

Using the command:

docker run --rm -it \
    -p 8085:8085 \
    -v $(pwd)/schema.graphql:/dist/config/schema.graphql \
    -v $(pwd)/router.yaml:/dist/config/router.yaml \
    api-router

If I then update my Docker image to include the --hot-reload flag in the CMD I get the following error:

2022-08-26T17:40:31.857293Z ERROR apollo_router::executable: panicked at 'Failed to initialise file watching.: Notify(Io(Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }))', apollo-router/src/files.rs:22:14

tcarrio avatar Aug 26 '22 17:08 tcarrio

Additionally, running the macOS binary directly is successful:

2022-08-26T17:44:40.595993Z  INFO apollo_router::axum_http_server_factory: GraphQL endpoint exposed at http://0.0.0.0:8085/api/graphql 🚀

tcarrio avatar Aug 26 '22 17:08 tcarrio

This looks to be a known problem with the hotwatch library being used, which is a wrapper for the notify library. The recommendations in the notify documentation say to switch to PollWatcher. Making this switch might also fix issue #1695, which concerns watching files the router does not own.

https://docs.rs/notify/latest/notify/#known-problems
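
To illustrate why a polling approach sidesteps the "Bad file descriptor" failure, here is a minimal sketch of the idea using only the Rust standard library. It is not notify's PollWatcher API and not the router's code: the names FileStamp, stamp and PollingWatcher are hypothetical and exist only for this example. The point is that polling only needs ordinary metadata reads, so it does not depend on the kernel notification facility that fails to initialise on some bind-mounted or emulated filesystems.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical snapshot of the file attributes compared between polls.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileStamp {
    modified: SystemTime,
    len: u64,
}

/// Read the current modification time and length with a plain metadata call.
pub fn stamp(path: &Path) -> io::Result<FileStamp> {
    let meta = fs::metadata(path)?;
    Ok(FileStamp {
        modified: meta.modified()?,
        len: meta.len(),
    })
}

/// Hypothetical polling watcher: remembers the last snapshot of one file.
pub struct PollingWatcher {
    path: PathBuf,
    last: Option<FileStamp>,
}

impl PollingWatcher {
    /// Take the initial snapshot. A missing or unreadable file is stored as `None`.
    pub fn new(path: impl Into<PathBuf>) -> Self {
        let path = path.into();
        let last = stamp(&path).ok();
        PollingWatcher { path, last }
    }

    /// Return true when the file changed, appeared, or disappeared since the
    /// previous call. The caller decides how often to call this, for example
    /// from a loop that sleeps a few seconds between polls.
    pub fn poll(&mut self) -> bool {
        let current = stamp(&self.path).ok();
        let changed = current != self.last;
        self.last = current;
        changed
    }
}
```

A caller would construct PollingWatcher::new("/dist/config/schema.graphql") and reload the schema whenever poll() returns true. The trade-off is latency and some repeated metadata reads in exchange for working on filesystems where event-based watching cannot be set up.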

kmcrawford avatar Sep 04 '22 14:09 kmcrawford

I think this is related but happy to open a new issue. On version 1.3.0 I'm unable to run the following command. APOLLO_ROUTER_HOT_RELOAD=true causes it to panic and APOLLO_ROUTER_HOT_RELOAD=false will allow it to run.

docker run -p 4000:4000 \
  --env APOLLO_GRAPH_REF="<your graph>" \
  --env APOLLO_KEY="<your key>" \
  --env APOLLO_ROUTER_HOT_RELOAD=true \
  --mount "type=bind,source=</PATH/TO>/router.yaml,target=/dist/config/router.yaml" \
  --rm \
  ghcr.io/apollographql/router:v1.3.0

boggsey avatar Nov 11 '22 19:11 boggsey