perl5 icon indicating copy to clipboard operation
perl5 copied to clipboard

POC - eliminate C preprocessor complexity

Open happy-barney opened this issue 3 years ago • 11 comments

This is an example how to eliminate C preprocessor complexity on example of one function.

This PR is made to support discussion on mailing list.

Complexity of PP functions:

pp lines paths variables file
pp_system 213 20 9 pp_sys.c
pp_gpwent 223 14 15 pp_sys.c
pp_aassign 649 13 10 pp_hot.c
pp_ftrread 120 12 5 pp_sys.c
pp_sselect 177 12 12 pp_sys.c
pp_uc 323 11 3 pp.c
pp_stat 233 10 10 pp_sys.c
pp_ghostent 91 10 8 pp_sys.c
pp_subst 396 9 2 pp_hot.c
pp_ehostent 64 9 10 pp_sys.c
pp_gnetent 72 9 7 pp_sys.c
pp_fc 207 8 3 pp.c
pp_ucfirst 347 7 1 pp.c
pp_ggrent 71 7 7 pp_sys.c
pp_fork 71 7 6 pp_sys.c
pp_lc 216 6 1 pp.c
pp_pow 180 6 6 pp.c
pp_truncate 106 6 6 pp_sys.c
pp_sysread 260 6 7 pp_sys.c
pp_gservent 64 6 4 pp_sys.c
pp_gprotoent 59 6 4 pp_sys.c
pp_crypt 44 5 5 pp.c
pp_divide 116 5 6 pp.c
pp_match 216 5 2 pp_hot.c
pp_chdir 88 5 2 pp_sys.c
pp_shostent 36 5 4 pp_sys.c
pp_ioctl 72 5 4 pp_sys.c
pp_syswrite 153 5 4 pp_sys.c
pp_multiply 192 4 7 pp.c
pp_regcomp 108 4 2 pp_ctl.c
pp_accept 60 4 5 pp_sys.c
pp_glob 58 4 3 pp_sys.c
pp_sysseek 46 4 2 pp_sys.c
pp_getpeername 63 4 4 pp_sys.c
pp_fileno 47 4 4 pp_sys.c
pp_waitpid 36 4 7 pp_sys.c
pp_readdir 51 4 5 pp_sys.c
pp_ftrowned 82 4 3 pp_sys.c
pp_sin 55 3 3 pp.c
pp_wait 27 3 6 pp_sys.c
pp_fttext 197 3 3 pp_sys.c
pp_telldir 31 3 4 pp_sys.c
pp_rename 27 3 1 pp_sys.c
pp_exec 38 3 1 pp_sys.c
pp_tms 30 3 2 pp_sys.c
pp_closedir 32 3 3 pp_sys.c
pp_getpgrp 21 3 2 pp_sys.c
pp_setpgrp 30 3 2 pp_sys.c
pp_undef 119 2 1 pp.c
pp_negate 43 2 1 pp.c
pp_scmp 17 2 1 pp.c
pp_substr 139 2 1 pp.c
pp_argelem 135 2 2 pp.c
pp_sle 41 2 1 pp.c
pp_vec 53 2 2 pp.c
pp_argdefelem 20 2 2 pp.c
pp_subtract 165 2 1 pp.c
pp_rand 49 2 2 pp.c
pp_quotemeta 77 2 1 pp.c
pp_split 450 2 1 pp.c
pp_flop 93 2 2 pp_ctl.c
pp_exit 25 2 1 pp_ctl.c
pp_goto 435 2 1 pp_ctl.c
pp_formline 487 2 1 pp_ctl.c
pp_leaveeval 57 2 1 pp_ctl.c
pp_sort 349 2 1 pp_sort.c
pp_aelem 81 2 1 pp_hot.c
pp_entersub 323 2 2 pp_hot.c
pp_add 215 2 1 pp_hot.c
pp_defined 53 2 1 pp_hot.c
pp_ftis 50 2 2 pp_sys.c
pp_getlogin 15 2 1 pp_sys.c
pp_chroot 12 2 1 pp_sys.c
pp_flock 26 2 1 pp_sys.c
pp_ssockopt 66 2 1 pp_sys.c
pp_semctl 19 2 3 pp_sys.c
pp_tell 33 2 2 pp_sys.c
pp_pipe_op 48 2 1 pp_sys.c
pp_shmwrite 29 2 3 pp_sys.c
pp_getppid 10 2 1 pp_sys.c
pp_die 57 2 1 pp_sys.c
pp_umask 29 2 1 pp_sys.c
pp_rewinddir 23 2 2 pp_sys.c
pp_syscall 71 2 1 pp_sys.c
pp_seekdir 25 2 2 pp_sys.c
pp_time 10 2 1 pp_sys.c
pp_semget 14 2 3 pp_sys.c
pp_alarm 32 2 1 pp_sys.c
pp_sockpair 43 2 5 pp_sys.c
pp_open_dir 25 2 2 pp_sys.c
pp_setpriority 14 2 1 pp_sys.c
pp_rmdir 18 2 1 pp_sys.c
pp_mkdir 26 2 1 pp_sys.c
pp_readlink 24 2 1 pp_sys.c
pp_getpriority 12 2 1 pp_sys.c

happy-barney avatar Oct 12 '22 16:10 happy-barney

This would create 2 new subdirectories in our code tree. Color me ... skeptical.

jkeenan avatar Oct 12 '22 20:10 jkeenan

Is this rearrangement based on some type of best practices document?

The PP() on Perl_do_kv is a good change.

Adding a common and well named macro to control which versions of the function is good, but I don't think it needs to be in a separate header, it could simply be close to the uses, at least for the dirhandle functions.

I don't think moving the various versions of the function bodies out to headers is improving the readability of the code. If I were going to rearrange this (at least for the dirhandle functions) I might do something like:

#if PERL_PLATFORM_HAS_DIR_HANDLE_FUNCTIONS
PP(pp_open_dir)
{
   /* implementation of opendir() */
}

PP(pp_readdir)
{
  /* implementation of readdir() */
}
...
#else 
PP(pp_open_dir)
{
    DIE(aTHX_ PL_no_dir_func, "opendir");
}

PP(pp_readdir)
{
    DIE(aTHX_ PL_no_dir_func, "readdir");
}

...
#endif

but the "not implemented" versions of these functions are trivial, so I'm not sure it's worthwhile.

tonycoz avatar Oct 13 '22 00:10 tonycoz

On Thu, 13 Oct 2022 at 02:21, Tony Cook @.***> wrote:

Is this rearrangement based on some type of best practices document?

The PP() on Perl_do_kv is a good change.

Adding a common and well named macro to control which versions of the function is good, but I don't think it needs to be in a separate header, it could simply be close to the uses, at least for the dirhandle functions.

I don't think moving the various versions of the function bodies out to headers is improving the readability of the code. If I were going to rearrange this (at least for the dirhandle functions) I might do something like:

This is POC showing multiple steps (and I agree that moving code into .inc files is little bit controversial - I made it to demonstrate).

I worked with alternative but was not able to deal with unused functions problem so far (example for pp_readdir):

  • there will be macro PERL_PP_READDIR with values Perl_pp_readdir_dirhandle or Perl_pp_readdir_not_implemented
  • this macro will be used instead of current token Perl_pp_readdir

#if PERL_PLATFORM_HAS_DIR_HANDLE_FUNCTIONS

This is also good idea, except instead of #else there should be #ifndef Plus mechanism for possible platform specific implementation (not capable to provide reasonable example yet)

Regardless on anything, I think that every PP function should have it's own file - it's much better when tracking history.

but the "not implemented" versions of these functions are trivial, so I'm

not sure it's worthwhile.

Basically I had three poc choices how to demonstrate approach:

  • these dir handles functions
  • elimination of amigaos4 token from generic codebase
  • pp_system

pp_system is too complex to be an effective example, but it will show at least reason for split:

  • pp_system_forky
  • pp_system_spawny

Regarding "not implemented" - will it be possible to create generic function Perl_pp_not_implemented?

Brano

Message ID: @.***>

happy-barney avatar Oct 13 '22 04:10 happy-barney

On Wed, 12 Oct 2022 at 22:44, James E Keenan @.***> wrote:

This would create 2 new subdirectories in our code tree. Color me ... skeptical.

Regarding directories - that will only help with potential contributors - directory will identify module. Then you can require module name in commit summary. Now, it's only known to current contributors.

  • there should be pp directory containing at least files like pp_system.c pp_readdir.c it's better for history tracking - eg git history pp/pp_system.c will not show you changes eg in pp_readdir

  • there should be directory platform with current directories moved there, eg platform/haiku Let's make it clear - not everyone knows every platform

There is also so many undocumented things in code base, like allowed and disallowed

  • C syntax constructs (eg recent c99)
  • shell syntax constructs (eg: can I safely use readlink -f ?)
  • makefile syntax constructs (eg: can I safely use include ?)

Disallowed (not supported by one/few platforms) code structures should be listed per platform.

(but that's for different thread, that belong to category "why I feel like perl5 code base is unwelcoming")

Message ID: @.***>

happy-barney avatar Oct 13 '22 04:10 happy-barney

@happy-barney I know this is a draft, but it is stalled and in conflict. Would you like to rebase it so it applies cleanly to blead and we can discuss further?

demerphq avatar Feb 07 '23 14:02 demerphq

@demerphq done

happy-barney avatar Feb 07 '23 19:02 happy-barney

@happy-barney it is in conflict again, if you want i can try to resolve the conflicts for you.

demerphq avatar Feb 19 '23 14:02 demerphq

conflict are in generated file due changes in generator.

I think that change should be in source even without this PoC so I extracted it into standalone PR #20835

happy-barney avatar Feb 20 '23 10:02 happy-barney

I resolved the conflicts on this anyway.

demerphq avatar Feb 20 '23 14:02 demerphq

rebased (with PP commit in blead), so it should not produce conflicts anymore

happy-barney avatar Feb 20 '23 16:02 happy-barney

@happy-barney I did that already yesterday after I merged. No harm doing it twice tho.

demerphq avatar Feb 21 '23 00:02 demerphq