need to optimize symbol lookup?
I used uftrace to analyze the flow and performance of shecc and found that strcmp() is where shecc spends the most time:
$ uftrace record out/shecc src/main.c # check stage0 for now, since stage1 didn't support something like gcc -pg
$ uftrace report
  Total time   Self time       Calls  Function
  ==========  ==========  ==========  ====================
  166.474 ms    0.776 us           1  main
  ...
   41.193 ms   41.193 ms     1008573  strcmp  //<- for stage1, this would be even higher since strcmp() is naive
This is not surprising, since strcmp() is called heavily by find_func() and other similar functions that use linear search. Do you think we need to use some kind of dictionary to optimize this? (I can add it. :-) )
After all, a hash table or trie would not be too complicated, so we could still keep the educational purpose. Or is this intentionally left for students to add?
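To make the proposal concrete, here is a minimal sketch of a hash-based lookup that could sit in front of a linear table. Only find_func() is a name from shecc; func_t's layout, FUNCS, funcs_idx, func_hash, hash_name(), add_func(), and the sizes are assumptions made for illustration. It is written in standard C, so it may need adjusting before shecc can compile it itself.

```c
#include <assert.h>
#include <string.h>

#define MAX_FUNCS 256 /* hypothetical capacity */
#define HASH_SIZE 512 /* power of two, larger than MAX_FUNCS so probing always stops */

typedef struct {
    char name[32]; /* hypothetical layout; real entries carry more fields */
} func_t;

func_t FUNCS[MAX_FUNCS];
int funcs_idx = 0;
int func_hash[HASH_SIZE]; /* 0 = empty slot, otherwise (index into FUNCS) + 1 */

/* FNV-1a string hash */
unsigned hash_name(const char *s)
{
    unsigned h = 2166136261u;
    while (*s) {
        h ^= (unsigned char) *s++;
        h *= 16777619u;
    }
    return h;
}

/* Return the entry for name, or NULL. strcmp() runs only on colliding slots. */
func_t *find_func(const char *name)
{
    unsigned slot = hash_name(name) & (HASH_SIZE - 1);
    while (func_hash[slot]) {
        func_t *f = &FUNCS[func_hash[slot] - 1];
        if (!strcmp(f->name, name))
            return f;
        slot = (slot + 1) & (HASH_SIZE - 1); /* linear probing */
    }
    return NULL;
}

/* Return the existing entry for name, or append a new one and index it. */
func_t *add_func(const char *name)
{
    func_t *f = find_func(name);
    if (f)
        return f;
    assert(funcs_idx < MAX_FUNCS);
    assert(strlen(name) < sizeof(FUNCS[0].name));
    unsigned slot = hash_name(name) & (HASH_SIZE - 1);
    while (func_hash[slot])
        slot = (slot + 1) & (HASH_SIZE - 1);
    f = &FUNCS[funcs_idx];
    strcpy(f->name, name);
    func_hash[slot] = ++funcs_idx; /* store index + 1 so 0 keeps meaning "empty" */
    return f;
}
```

With this layout a lookup typically costs one hash computation plus one or two strcmp() calls, instead of one strcmp() per registered function. The table is never resized or deleted from, which keeps the code short enough to stay readable.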
Thanks for pointing this out. I've never analyzed the internals. Feel free to submit potential improvements to the look-ups.
newlib implementation for strcmp and strncmp:
Hi @jserv, I have implemented (#58) the trie struct to enhance the find_func function. However, I encountered a syntax error on the first attempt when running make all. Surprisingly, the error disappeared, and the code compiled successfully when I ran make all again.
Here is the error message I received:
> make all
Target machine code switch to arm
CC+LD out/inliner
GEN out/libc.inc
CC out/src/main.o
In file included from src/main.c:14:
src/globals.c: In function ‘insert_trie_t’:
src/globals.c:53:18: warning: array subscript has type ‘char’ [-Wchar-subscripts]
53 | if(!obj->next[c]){
| ^
src/globals.c:54:18: warning: array subscript has type ‘char’ [-Wchar-subscripts]
54 | obj->next[c] = funcs_idx_trie++;
| ^
src/globals.c:56:33: warning: array subscript has type ‘char’ [-Wchar-subscripts]
56 | FUNCS_TRIE[obj->next[c]].next[i] = 0;
| ^
src/globals.c:57:29: warning: array subscript has type ‘char’ [-Wchar-subscripts]
57 | FUNCS_TRIE[obj->next[c]].index = 0;
| ^
src/globals.c:59:40: warning: array subscript has type ‘char’ [-Wchar-subscripts]
59 | insert_trie_t(&FUNCS_TRIE[obj->next[c]], name, index);
| ^
src/globals.c: In function ‘search_trie_t’:
src/globals.c:66:23: warning: array subscript has type ‘char’ [-Wchar-subscripts]
66 | else if(!obj->next[c])return 0;
| ^
src/globals.c:67:52: warning: array subscript has type ‘char’ [-Wchar-subscripts]
67 | else return search_trie_t(&FUNCS_TRIE[obj->next[c]], word);
| ^
LD out/shecc
SHECC out/shecc-stage1.elf
SHECC out/shecc-stage2.elf
out/shecc-stage1.elf: 1: Syntax error: word unexpected (expecting ")")
make: *** [Makefile:76: out/shecc-stage2.elf] Error 2
> make all
SHECC out/shecc-stage2.elf
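As a side note on the -Wchar-subscripts warnings in the log above: plain char may be signed, so a byte of 0x80 or higher would become a negative array index. The following is a minimal sketch of an array-backed trie that avoids this by converting each character to unsigned char first. FUNCS_TRIE, funcs_idx_trie, next, and index follow the names visible in the log; insert_trie(), search_trie(), the iterative structure, and the sizes are assumptions for illustration and are not the code from #58.

```c
#include <assert.h>

#define MAX_TRIE_NODES 1024 /* hypothetical capacity */
#define TRIE_FANOUT 256     /* one child slot per possible unsigned char value */

typedef struct {
    int index;             /* 0 = no name ends here, otherwise the caller's value */
    int next[TRIE_FANOUT]; /* 0 = no child, otherwise an index into FUNCS_TRIE */
} trie_t;

trie_t FUNCS_TRIE[MAX_TRIE_NODES]; /* node 0 is the root; globals start zeroed */
int funcs_idx_trie = 1;            /* next free node */

/* Map name to index (index must be nonzero, since 0 means "not found"). */
void insert_trie(const char *name, int index)
{
    trie_t *node = &FUNCS_TRIE[0];
    for (; *name; name++) {
        unsigned char c = (unsigned char) *name; /* never negative */
        if (!node->next[c]) {
            assert(funcs_idx_trie < MAX_TRIE_NODES);
            node->next[c] = funcs_idx_trie++;
        }
        node = &FUNCS_TRIE[node->next[c]];
    }
    node->index = index;
}

/* Return the value stored for name, or 0 if name was never inserted. */
int search_trie(const char *name)
{
    const trie_t *node = &FUNCS_TRIE[0];
    for (; *name; name++) {
        unsigned char c = (unsigned char) *name;
        if (!node->next[c])
            return 0;
        node = &FUNCS_TRIE[node->next[c]];
    }
    return node->index;
}
```

A lookup walks one node per character and never calls strcmp(). The cost is memory: each node holds TRIE_FANOUT child slots, so a smaller fan-out restricted to identifier characters would be a reasonable refinement.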
To investigate the issue, I examined the Makefile but couldn't determine the purpose of TARGET_EXEC. It appears that TARGET_EXEC is set to ARM_EXEC when the target is ARM in the config file:
TARGET_EXEC := $($(shell head -1 config | sed 's/.*: \([^ ]*\).*/\1/')_EXEC)
$(OUT)/$(STAGE2): $(OUT)/$(STAGE1)
$(VECHO) " SHECC\t$@\n"
$(Q)$(TARGET_EXEC) $(OUT)/$(STAGE1) -o $@ $(SRCDIR)/main.c
I would appreciate your guidance on how ARM_EXEC affects the command, and any advice you have on this matter.
To investigate the issue, I examined the Makefile but couldn't determine the purpose of TARGET_EXEC. It appears that TARGET_EXEC is set to ARM_EXEC when the target is ARM in the config file:
It is intentionally set. The expected sequence of commands for building is as follows:
make clean
make config
make
make check
Run make -n and/or make -dn to debug the build rules.
I see. Thank you for the explanation!