mmm.. I don't know. In that bug the linking was static. In this case it is dynamic, and instead of calling _init in the .so file, it calls _init in the exe file, which is what causes the crash.
I am presently in a bit of a mess. I have fixed InitSHLibs, so that it looks up the .ctors section. This required fixing a bug in LoadSegment. So far so good. But now I get a deadlock when the app closes down, and apparently (from what I can gather from Sashimi, before it dies) InitSHLibs is called AFTER CloseElf, which is probably the cause of the deadlock.
I cannot see any place inside elf.library where this could happen, so I gather the crtend.o or shcrtend.o is responsible. I cannot produce a disassembly of either of these. Can anyone else produce a disassembly?
Quote:
so I gather the crtend.o or shcrtend.o is responsible
IIRC there is no code inside crtend.o and shcrtend.o; they only include the end markers of the constructor/destructor arrays. The code for calling the functions is in crtbegin.o and shcrtbegin.o. If there is a problem with crtend.o or shcrtend.o, it can only be because of a wrong linking order, i.e. something is wrong in the specs file of GCC.
In the case of newlib, libc.so and libc.a are identical and libm.(a|so) isn't used (the functions are in libc.(a|so) instead), so there can't be a problem with that. But the missing libgcc(_s).so may be a problem.
Quote:
IIRC there is no code inside crtend.o and shcrtend.o; they only include the end markers of the constructor/destructor arrays. The code for calling the functions is in crtbegin.o and shcrtbegin.o. If there is a problem with crtend.o or shcrtend.o, it can only be because of a wrong linking order, i.e. something is wrong in the specs file of GCC.
This is good to know. There is definitely something wrong with the calling order, so now the question is just where. @MightyMax : Are you up for this?
No need to worry about the specs after all. It was simply a misread of the debug output on my part. Everything runs as it should.
Now, I have this :
Quote:
3.Work4.1:STICK/ctor_test> test_dyn
foo
bar
f 123
main result 123
3.Work4.1:STICK/ctor_test>
So this is a step in the right direction. What I am doing is just to look up the .ctors section and treat it as an array of (uint32 *) pointers to functions. So far, as you can see, the foo and bar functions are called. These are coupled in a single entry through a caller function. The .ctors section is 12 bytes long (0x0c), so three pointers. The other two are NULL. I am guessing that these other two represent the ctor() and ctor2() functions. Apparently there is an issue with incomplete relocation.
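To make the idea concrete, here is a minimal sketch in C of walking a section image as an array of function pointers, using the section size to get the entry count. This is not the actual elf.library code: the names ctor_fn, run_ctor_table and the demo_ stand-ins are made up for illustration, and calling the entries in ascending order is an assumption.

```c
#include <stddef.h>

/* Hypothetical type: a constructor takes no arguments and returns nothing. */
typedef void (*ctor_fn)(void);

/*
 * Walk an already relocated .ctors image of size_bytes bytes and call
 * every non-NULL entry. Returns the number of functions called and
 * stores the number of empty (NULL) slots in *null_slots.
 * Assumption: entries are called in ascending address order.
 * With 4-byte pointers, a 12-byte section gives 3 entries.
 */
static size_t run_ctor_table(ctor_fn const *table, size_t size_bytes,
                             size_t *null_slots)
{
    size_t count = size_bytes / sizeof(ctor_fn);
    size_t called = 0;

    *null_slots = 0;
    for (size_t i = 0; i < count; i++) {
        if (table[i] == NULL) {
            (*null_slots)++;      /* unresolved or empty slot: skip it */
            continue;
        }
        table[i]();
        called++;
    }
    return called;
}

/* Stand-ins for the situation described above: one valid entry, two NULL. */
static int demo_calls = 0;
static void demo_static_init(void) { demo_calls++; }
static ctor_fn const demo_table[3] = { demo_static_init, NULL, NULL };
```

Calling run_ctor_table(demo_table, sizeof demo_table, &nulls) on the stand-in table runs the one valid entry and reports the two empty slots, which mirrors the one-called, two-NULL picture above.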
Is it possible that someone could assist in analyzing the readelf output to try and figure out, which kind of relocation is missing? Thanks!
EDIT: As a matter of fact, it is worth taking a look at this:
Quote:
3.Work4.1:STICK/ctor_test> readelf -s lib.o
Symbol table '.symtab' contains 22 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS lib.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 SECTION LOCAL  DEFAULT    5
     6: 00000000     0 SECTION LOCAL  DEFAULT    6
     7: 000000a8    56 FUNC    LOCAL  DEFAULT    1 _ZN12_GLOBAL__N_115foo_ct
     8: 000000a8    56 FUNC    LOCAL  DEFAULT    1 _ZN12_GLOBAL__N_115foo_ct
     9: 00000000     1 OBJECT  LOCAL  DEFAULT    4 _ZN12_GLOBAL__N_1L18foo_c
    10: 000000e0    56 FUNC    LOCAL  DEFAULT    1 _ZN12_GLOBAL__N_115bar_ct
    11: 000000e0    56 FUNC    LOCAL  DEFAULT    1 _ZN12_GLOBAL__N_115bar_ct
    12: 00000001     1 OBJECT  LOCAL  DEFAULT    4 _ZN12_GLOBAL__N_1L18bar_c
    13: 00000184   128 FUNC    LOCAL  DEFAULT    1 _Z41__static_initializati
    14: 00000204    60 FUNC    LOCAL  DEFAULT    1 _GLOBAL__sub_I_lib.c
    15: 00000000     0 SECTION LOCAL  DEFAULT    8
    16: 00000000     0 SECTION LOCAL  DEFAULT   10
    17: 00000004    80 FUNC    GLOBAL DEFAULT    1 _Z3foov
    18: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND puts
    19: 00000058    80 FUNC    GLOBAL DEFAULT    1 _Z3barv
    20: 0000011c   100 FUNC    GLOBAL DEFAULT    1 _Z1fi
    21: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND printf
3.Work4.1:STICK/ctor_test>
Apparently the assembler has discarded the ctor(2) and dtor(2) functions (they are included in the assembly). So this is probably the reason for the two vacant slots in the .ctors section.
Now, why would the assembler discard entries? The discarded entries look almost exactly the same in the assembly as the entries that are included.
Quote:
This is good to know. There is definitely something wrong with the calling order, so now the question is just where. @MightyMax : Are you up for this?
Of course, it wouldn't make sense to stop now. After digging around in executables, sobjs etc., I am starting to dream about elfs and dwarfs. My "free" time will be limited over the next weeks/month, but I will try to keep up and investigate further.
Quote:
Is it possible that someone could assist in analyzing the readelf output to try and figure out, which kind of relocation is missing?
So only two entries. The missing entry is now the static_initialization function that calls foo() and bar().
Why is there a difference in size here? I think elf.library gives the correct output, because it seems to just chuck out what's on disk. So the theory is that readelf is 'inventing' entries as it goes along. But how?
I don't know what's wrong, but I'd suggest starting from known working versions, i.e. GCC 2.95.x/3.4x and the corresponding binutils 2.14.x, and moving to newer versions one by one to check exactly which version of GCC and/or binutils introduced the bugs. That way it should be much easier to find out which changes in GCC and/or binutils cause the bugs than by just trying to find out what's wrong with the current versions, which are up to 10 versions newer than the last working versions.
The problem was that the current implementation of __shlib_call_constructors would fail (= yield an incomplete array of constructors).
The reason for this problem is not 100% clear. But at least it has to do with the fact that the .ctors (and .dtors) section is somehow not read completely.
Doing the implementation inside elf.library does the job. This ensures that the section has the right size, that it is fully relocated, and that NULL pointers inside the section are handled.
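As an illustration of why the section size matters when NULL pointers can appear inside the section, here is a small C sketch that compares two ways of counting entries. It is not a claim about what __shlib_call_constructors actually does, since the real cause is not fully clear. The names init_fn, count_until_null, count_by_size and the demo_ stand-ins are hypothetical.

```c
#include <stddef.h>

/* Hypothetical type for an entry in a constructor table. */
typedef void (*init_fn)(void);

/* Counts entries the way a terminator-driven loop would:
 * it stops at the first NULL, so anything after a gap is lost. */
static size_t count_until_null(init_fn const *table, size_t max_entries)
{
    size_t n = 0;
    while (n < max_entries && table[n] != NULL)
        n++;
    return n;
}

/* Counts entries using the section size: every slot is visited,
 * and NULL slots are skipped instead of ending the walk. */
static size_t count_by_size(init_fn const *table, size_t size_bytes)
{
    size_t total = size_bytes / sizeof(init_fn);
    size_t n = 0;
    for (size_t i = 0; i < total; i++) {
        if (table[i] != NULL)
            n++;
    }
    return n;
}

/* Stand-in table with a NULL slot in the middle. */
static void demo_a(void) {}
static void demo_b(void) {}
static init_fn const demo_gap_table[3] = { demo_a, NULL, demo_b };
```

On the stand-in table, the terminator-style count sees only the first entry, while the size-based count finds both valid entries. That is the kind of "incomplete array" a walk can produce if it does not know the real section size.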