nfs-ganesha
Syntax errors should cause nfsd to exit immediately
This is the second time I've found myself trying to diagnose a problem with nfs-ganesha only to find that it was actually a syntax error. The problem is that ganesha logs the syntax error very early on:
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] main :MAIN :EVENT :nfs-ganesha Starting: Ganesha Version 3.5
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] ganesha_yyerror :CONFIG :CRIT :Config file (/etc/ganesha/ganesha.conf:40) error: syntax error
But then it reports that it has successfully parsed the configuration file:
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
And continues to attempt to run, finally logging an error that is both inaccurate and unhelpful (before dying with a segmentation fault):
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] rados_kv_connect :CLIENT ID :EVENT :Failed to create pool: -1
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] rados_cluster_init :CLIENT ID :EVENT :Failed to connect to cluster: -1
Segmentation fault (core dumped)
The service should have stopped as soon as it hit the "error: syntax error".
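Until something like that exists, one stopgap is to scan the startup log for the parser's critical line before trusting any later message. This is a minimal sketch that assumes the log format shown above; the function name and the regex are illustrative and not part of Ganesha:

```python
import re

# Matches the parser's critical line as it appears in the logs above, e.g.
# "... ganesha_yyerror :CONFIG :CRIT :Config file (/etc/ganesha/ganesha.conf:40) error: syntax error"
SYNTAX_ERROR_RE = re.compile(
    r"ganesha_yyerror\s*:CONFIG\s*:CRIT\s*:Config file "
    r"\((?P<path>.+):(?P<line>\d+)\) error: (?P<msg>.*)$"
)


def find_config_syntax_errors(log_lines):
    """Return (path, line_number, message) for each parser error in the log."""
    errors = []
    for raw in log_lines:
        match = SYNTAX_ERROR_RE.search(raw.rstrip("\n"))
        if match:
            errors.append(
                (match.group("path"), int(match.group("line")), match.group("msg"))
            )
    return errors


if __name__ == "__main__":
    sample = [
        "22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] "
        "ganesha_yyerror :CONFIG :CRIT :Config file "
        "(/etc/ganesha/ganesha.conf:40) error: syntax error",
        "22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] "
        "nfs_set_param_from_conf :NFS STARTUP :EVENT :"
        "Configuration file successfully parsed",
    ]
    for path, line, msg in find_config_syntax_errors(sample):
        print(f"{path}:{line}: {msg}")
```

A wrapper script could refuse to report the service as healthy whenever this returns a non-empty list, regardless of the "successfully parsed" line that follows.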
This has come up in the past, with arguments on both sides. The argument that won was that adding a new export with a syntax error to a running Ganesha shouldn't cause it to suddenly exit; instead, it should keep running with the old (valid) config.
From the point of view of only configuring Ganesha at startup, I completely agree with you. But it turns out most products using Ganesha (and therefore most deployments of it) start Ganesha with a largely empty config, and add config at run-time.
Having ganesha exit only on startup in the event of a syntax error would seem to be the best of both worlds. What if it were an optional behavior? ganesha.nfsd --syntax-errors-are-fatal ... or something.
So what was the actual error?
I would consider an option to force exit on any config failure during startup, but as Daniel says, many users of Ganesha depend on reloading config and we definitely don't want Ganesha aborting because of a bad dynamic config.
Another problem that was observed with problem exports in clustered environments is the cluster software restarting Ganesha and then Ganesha bouncing up and down as it is continuously restarted with a bad export config (perhaps because an exported file system was offline so not an actual syntax error).
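To make the two cases in this discussion concrete, here is a rough sketch of the policy: a syntax error is optionally fatal at startup, while a failed reload keeps the running config. This is not Ganesha code. `parse_config`, `load_config`, `ConfigSyntaxError`, and the `fatal_at_startup` switch are all hypothetical names, and the parser is a stand-in that only understands `key = value;` lines:

```python
class ConfigSyntaxError(Exception):
    """Raised by the stand-in parser below; not a Ganesha type."""


def parse_config(text):
    """Hypothetical stand-in parser: accepts only 'key = value;' lines."""
    settings = {}
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if "=" not in line or not line.endswith(";"):
            raise ConfigSyntaxError(f"line {number}: syntax error")
        key, value = line[:-1].split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def load_config(text, running_config=None, fatal_at_startup=True):
    """Startup (no running config): optionally fatal on error.
    Reload (running config given): keep the running config on error."""
    try:
        return parse_config(text)
    except ConfigSyntaxError as err:
        if running_config is None:
            if fatal_at_startup:
                # Exit before any later subsystem can mask the real cause.
                raise SystemExit(f"config error at startup: {err}")
            return {}  # mimic today's behavior: carry on with no settings
        print(f"reload rejected ({err}); keeping running config")
        return running_config


if __name__ == "__main__":
    running = load_config("Grace_Period = 90;")
    # A bad dynamic update must not take the server down.
    running = load_config("Grace_Period = 60", running_config=running)
    print(running)
```

Passing an empty dict as `running_config` still counts as a reload, which matches deployments that start with a largely empty config and add to it at run-time.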
> So what was the actual error?
The actual error is in the description of this issue:
22/07/2021 00:23:56 : epoch 60f8ba9c : nfs1 : nfs-ganesha-17818[main] ganesha_yyerror :CONFIG :CRIT :Config file (/etc/ganesha/ganesha.conf:40) error: syntax error
> Another problem that was observed with problem exports in clustered environments is the cluster software restarting Ganesha and then Ganesha bouncing up and down as it is continuously restarted with a bad export config
...which is why I was suggesting making an "exit on errors" behavior optional.
What I meant was what is actually wrong with the config file? That will help me determine how best to address where the error should be handled.
> What I meant was what is actually wrong with the config file?
I believe it was a missing semicolon. E.g., given something like (note missing semicolon after the Grace_Period setting):
NFS_CORE_PARAM
{
    Enable_NLM = false;
    Enable_RQUOTA = false;
    Protocols = 4;
}
NFSv4
{
    RecoveryBackend = rados_cluster;
    Minor_Versions = 1,2;
    Grace_Period = 60
}
EXPORT
{
    ...
}
Starting ganesha results in:
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] main :MAIN :EVENT :nfs-ganesha Starting: Ganesha Version 3.5
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] ganesha_yyerror :CONFIG :CRIT :Config file (/etc/ganesha/ganesha.conf:13) error: syntax error
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] rados_cluster_grace_enforcing :CLIENT ID :EVENT :rados_cluster_grace_enforcing: ret=0
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] main :NFS STARTUP :WARN :No export entries found in configuration file !!!
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:12): Syntax error in statement
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:18): Unknown parameter (Protocols)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:19): Unknown parameter (Transports)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:20): Unknown parameter (Path)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:21): Unknown parameter (Pseudo)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:22): Unknown parameter (Access_Type)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:23): Unknown parameter (Attr_Expiration_Time)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:24): Unknown parameter (Squash)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] config_errs_to_log :CONFIG :WARN :Config File (/etc/ganesha/ganesha.conf:26): Unknown parameter (FSAL)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] lower_my_caps :NFS STARTUP :EVENT :CAP_SYS_RESOURCE was successfully removed for proper quota management in FSAL
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] lower_my_caps :NFS STARTUP :EVENT :currenty set capabilities are: =ep cap_sys_resource-ep
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_Init_svc :DISP :CRIT :Cannot acquire credentials for principal nfs
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_Init_admin_thread :NFS CB :EVENT :Admin thread initialized
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_rpc_cb_init_ccache :NFS STARTUP :WARN :gssd_refresh_krb5_machine_credential failed (-1765328160:2)
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_Start_threads :THREAD :EVENT :Starting delayed executor.
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_Start_threads :THREAD :EVENT :gsh_dbusthread was started successfully
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_Start_threads :THREAD :EVENT :admin thread was started successfully
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_Start_threads :THREAD :EVENT :reaper thread was started successfully
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_Start_threads :THREAD :EVENT :General fridge was started successfully
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_start :NFS STARTUP :EVENT : NFS SERVER INITIALIZED
04/08/2021 21:29:51 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
04/08/2021 21:30:01 : epoch 610b06cf : nfs1 : nfs-ganesha-15456[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
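Since the parser only reports "syntax error" with a line number that can be off from the actual mistake, a crude pre-flight check can help catch a missing semicolon like the one in the example config above. This is a heuristic sketch, not the real Ganesha grammar: it assumes one setting per line and `#` comments, and it would misjudge a value that legitimately contains `#` or spans several lines. The function name is illustrative:

```python
def find_missing_semicolons(config_text):
    """Return 1-based line numbers of 'key = value' lines lacking a ';'."""
    suspects = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        # Drop '#' comments (assumption: '#' never appears inside a value).
        line = raw.split("#", 1)[0].strip()
        # Ignore closing braces that share the line, e.g. "FSAL { Name = X; }".
        body = line.rstrip("} \t")
        if "=" in body and not body.endswith((";", "{")):
            suspects.append(number)
    return suspects


if __name__ == "__main__":
    sample = """NFSv4
{
    RecoveryBackend = rados_cluster;
    Minor_Versions = 1,2;
    Grace_Period = 60
}
"""
    for number in find_missing_semicolons(sample):
        print(f"line {number}: assignment does not end with ';'")
```

Running something like this before starting or reloading the server would have pointed at the Grace_Period line directly.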
I'd agree: failing and quitting on startup, while on reload logging a clearer error message saying that Ganesha is staying with the currently running config, is certainly the way to go.
It would be nice if the parser spat out more information about an error, and if someone a bit more familiar with Bison and Lex could dig into that.
I'd also be open to a patch to make export config errors on startup fatal.
I just added a -x command line option to nfs-ganesha to abort if ANY config errors are reported, even mild ones:
https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/552407
V5.0 has been released. Closing.