I'm a little late to this thread, but thankfully @joerg has explained most things. He and I are probably only two of a handful of developers who used -mbaserel - IIRC, both OWB and Timberwolf use it. IBrowse 2.5 used to use it, but I switched to using -msdata instead...
One of the good things about -mbaserel is that the compiler lets developers easily restore r2 for library interface functions, for example, much like __saveds did on OS3. There is no such thing for restoring r13 (the base register used with -msdata), so you have to do the r13 save/restore manually using asm instead.
AmiSSL is likely to always use -mbaserel because, although the data size that OpenSSL uses is not yet 64K, it's getting close and will no doubt go over that limit in the future. It is not feasible to patch OpenSSL to use less data.
GCC 4.0.4 (available as native and cross-compiler) was the last version to fully support -mbaserel and generate correct code, which is why we are stuck with it for AmiSSL. IIRC, the linker and binutils from 4.2.4 also handled baserel correctly, but the compiler itself created broken baserel code.
Libraries and startup code use the elf library CopyDataSegment() function to copy and relocate the data segment (segment, not section) and FreeDataSegmentCopy() when done with it. These functions were slightly buggy until I fixed them in V53.35.
But the most common use case is loading executables resident, which can be done on AmigaOS with the C:Resident command, or by setting the p(ure) (=re-entrant) and h(old) (=keep in memory) protection bits. For frequently used shell commands, for example, that's much faster: the command only has to be loaded from disk once (with code and rodata relocation done only once by elf.library) and it isn't unloaded after it exits. The next time, it's simply started from memory, and it can be executed by several shells at the same time. Of course that can only work if every task/shell running it has its own copy of .data/.bss; only .text (code) and .rodata (read-only data) are shared. Even if it's never executed by different tasks at the same time, the copy of .data/.bss is still required, since the program may modify global variables in .data/.bss but only works if they hold their initial values at program start.
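To illustrate the point about private .data/.bss copies, here is a minimal, portable sketch. It is not AmigaOS code: the pristine image stands in for the .data/.bss template, and each run gets its own copy so that changes to "globals" never leak into another run. All names (`data_image`, `instance_start`, `run_command`, `instance_end`) are hypothetical; on AmigaOS the copy would be made by elf.library rather than by malloc()/memcpy().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout of a program's writable globals (.data + .bss). */
struct globals {
    int counter;      /* initialised data: starts at 10 */
    int scratch[4];   /* bss-like data: starts zeroed   */
};

/* The pristine image, shared by all runs and never modified. */
static const struct globals data_image = { 10, { 0, 0, 0, 0 } };

/* Give one run its own private copy of the globals. */
struct globals *instance_start(void)
{
    struct globals *g = malloc(sizeof *g);
    if (g != NULL)
        memcpy(g, &data_image, sizeof *g);
    return g;
}

/* Shared "code": only touches the copy it is handed. */
int run_command(struct globals *g)
{
    g->counter += 1;
    g->scratch[0] = g->counter;
    return g->counter;
}

/* Release the private copy when the run is finished. */
void instance_end(struct globals *g)
{
    free(g);
}
```

Two runs started from the same image each see counter start at 10, no matter what the other run did to its own copy.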
Now things are starting to clear up for me. Now it all makes sense.
BTW, if I understand it correctly, it is a nice feature of AmigaOS that I can make executables resident (i.e. cache them in memory) and thus speed up execution and, if a program is run multiple times at "once", reduce memory usage. I wasn't aware of that!
@Futaura & Co
I think that binutils (both the old one and the new one in progress) has everything in place to support baserel, but gcc needs an update for this feature.
Anyway, it seems that gcc 2.x.y still supported baserel out of the box. So it seems that other targets used, or could have used, baserel as well. Does anyone know why it was dropped from gcc development?
If we shift the viewpoint more towards gcc and code generation: for me, baserel and the small data model are more or less the same concept, just with a different register. BUT I see that baserel does not suffer from the ~65K size limitation. How is that achieved?
@kas1e
So to test the new binutils support for baserel, we need a setup of gcc 4.0.4 and new binutils.
No problem, already done :) We need a test case now, to see whether it works with gcc 4.0.4 and binutils from adtools and/or the older binutils that came with gcc 4.x at the time.
Anyway, it seems that gcc 2.x.y still supported baserel out of the box. So it seems that other targets used, or could have used, baserel as well.
Maybe it was something similar, but since -mbaserel on AmigaOS had to use new BREL relocations it can't be the same.
Quote:
BUT I see that baserel does not suffer from the ~65K size limitation. How is that achieved?
Instead of a single 16-bit relative relocation (R_PPC_SDAREL16?) as in the small data model, which results in the 64KB size limit, baserel uses two BREL relocations, similar to the absolute _LO and _HI relocations.
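The arithmetic behind that can be sketched in plain C. A single signed 16-bit displacement only reaches a 64 KB window around the base register, while a high/low pair can rebuild any 32-bit offset. This assumes the BREL pair follows the same high-adjusted/low convention as the absolute @ha/@l relocations, which is not confirmed here; the function names are hypothetical.

```c
#include <stdint.h>

/* Does the offset fit a single signed 16-bit displacement? */
int fits_sdarel16(int32_t offset)
{
    return offset >= -32768 && offset <= 32767;
}

/* High-adjusted half: compensates for sign extension of the low half. */
uint16_t split_ha(int32_t offset)
{
    return (uint16_t)(((uint32_t)offset + 0x8000u) >> 16);
}

/* Low half, later used as a sign-extended 16-bit displacement. */
uint16_t split_lo(int32_t offset)
{
    return (uint16_t)((uint32_t)offset & 0xFFFFu);
}

/* Rebuild the offset the way a high/low instruction pair would. */
int32_t rebuild(uint16_t ha, uint16_t lo)
{
    uint32_t high = (uint32_t)ha << 16;
    uint32_t low  = (uint32_t)(int32_t)(int16_t)lo;  /* sign-extend */
    return (int32_t)(high + low);
}
```

The price of the pair is two instructions (and two relocations) per access instead of one, which is presumably why the small data model is preferred whenever the data fits.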
and binutils from adtools and/or the older binutils that came with gcc 4.x at the time
The adtools binutils you used in "gcc 9 and 10" can't work: just like the new binutils, it dumps all sections into a single segment. baserel needs a segment containing only data and bss, which can be copied by IElf->CopyDataSegment().
The adtools binutils you used in "gcc 9 and 10" can't work: just like the new binutils, it dumps all sections into a single segment.
As I wrote in another topic after the tests: it's the same with binutils 2.14 and gcc 4.0.4 too, the same single segment, at least for the simple hello world test case. Maybe binutils 2.14 only splits out the segments when -mbaserel is added; I will check.
I'm 100% sure I never got an all-in-one segment with all sections, but OTOH most of the time I used vlink instead of ld.
AFAIK elf.library only uses the segment-, not program- or section-, headers, but Futaura should have more information about the current versions of elf.library.
I don't think anything dumps all sections into a single segment, does it? What does "objdump -ph" say?
AFAIK, it has always been the case, at least from 2005 or so, that compilers generate two segments. That's what happens with GCC 4.x, 11 and VBCC, at least. The first segment contains .text and .rodata (physically marked as read-only when loaded, so anything that writes to this segment will cause a GR) and the second contains all the other data sections (.data, .bss, .sdata, .sbss, .ctors, .dtors, etc). If there are other versions of GCC that do dump all code and data into a single segment, then that would be an incorrect thing to do.
Regarding IElf->CopyDataSegment(), as the name suggests, it copies the entire data segment (not only .data and .bss). How does it know which segment is the data segment? The original elf.library used the segment containing the .data section, whereas the latest elf.library additionally looks for .bss, .sdata and .sbss too (in case there is no .data section). I use IElf->CopyDataSegment() with both -mbaserel and -msdata.
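The lookup described above can be sketched roughly as follows. This is not elf.library code: `struct section_info` and both functions are hypothetical, and the fallback order after .data (.bss, then .sdata, then .sbss) is an assumption, since only the set of names is given above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one ELF section. */
struct section_info {
    const char *name;   /* e.g. ".data"                  */
    int segment;        /* index of the segment it is in */
};

/* Return the segment holding the first section with that name, or -1. */
int segment_of(const struct section_info *s, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(s[i].name, name) == 0)
            return s[i].segment;
    return -1;
}

/* Prefer .data, then fall back to the other writable sections. */
int find_data_segment(const struct section_info *s, size_t n)
{
    static const char *const candidates[] = { ".data", ".bss", ".sdata", ".sbss" };
    for (size_t c = 0; c < sizeof candidates / sizeof candidates[0]; c++) {
        int seg = segment_of(s, n, candidates[c]);
        if (seg >= 0)
            return seg;
    }
    return -1;  /* no data segment found */
}
```

With only the first candidate in the list, this would behave like the original lookup described above, which fails for a program that has no .data section at all.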
I've not tried it, but if there were truly only a single segment containing both code and data, IElf->CopyDataSegment() would probably still work. However, it would obviously be suboptimal, unnecessarily copying all the code too each time.
A little OT, but I almost forgot: AmiSSL is also restricted to GCC 4.x because it relies heavily on -mcheck68kfuncptr, which I think Stefan Burstroem added, but which never made it into newer releases. This is required for 68k programs to be able to use AmiSSL PPC, because OpenSSL uses many function pointers which are user definable. The -mcheck68kfuncptr feature causes function pointers to be checked for whether they point to 68k or PPC code, automatically running the function via emulation if it is 68k.
If there are other versions of GCC that do dump all code and data into a single segment, then that would be an incorrect thing to do.
I think it may be because we are testing a simple assembler-based test case (only “as” and “ld” are involved, not the “GCC” binary with its ldscripts), which may just put everything into the .text segment:
.text
.global main
main:
lis %r3,.msg@ha #
la %r3,.msg@l(%r3) # printf("aaaa");
bl printf #
li %r3,0 # exit(0);
bl exit #
.msg:
.string "aaaa"
Quote:
What does “objdump -ph” say?
For this simple test case, it gives this:
$ objdump -ph test_new
test_new: file format elf32-big
Program Header:
LOAD off 0x00000054 vaddr 0x01000054 paddr 0x01000054 align 2**2
filesz 0x00000ffd memsz 0x00001024 flags rwx
So, when GCC is used, we do get different sections. When we use the simple assembler-based test case without ldscripts, we get what we get.
In other words, everything seems fine; it only acts up when we test assembler builds. When we use “GCC” (so with the ldscripts that come with it, etc.), .text is AX (not WAX) as expected, and there is more than one segment.
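For reference, the "flags rwx" column in the objdump output is just the program header's p_flags field decoded bit by bit. A small sketch, assuming the standard ELF permission bits (execute = 1, write = 2, read = 4); the enum and function names are hypothetical:

```c
#include <stdint.h>

/* Same values as PF_X, PF_W and PF_R in the ELF specification. */
enum {
    SEG_EXEC  = 0x1,
    SEG_WRITE = 0x2,
    SEG_READ  = 0x4
};

/* Format flags the way "objdump -p" shows them, e.g. "r-x".
 * out must have room for 4 bytes. */
void format_segment_flags(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & SEG_READ)  ? 'r' : '-';
    out[1] = (p_flags & SEG_WRITE) ? 'w' : '-';
    out[2] = (p_flags & SEG_EXEC)  ? 'x' : '-';
    out[3] = '\0';
}

/* A segment that is both writable and executable, like the one in test_new. */
int is_writable_code(uint32_t p_flags)
{
    return (p_flags & SEG_WRITE) && (p_flags & SEG_EXEC);
}
```

A normal two-segment build would be expected to show a code segment without the write bit and a data segment without the execute bit.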
Edited by kas1e on 2023/8/12 13:55:29 Edited by kas1e on 2023/8/12 14:04:13
In other words, everything seems fine; it only acts up when we test assembler builds.
As I wrote already, your assembler code is broken. Check the output of gcc -S test.c: IIRC it doesn't simply include ".text", ".rodata", etc., but has some options after them which are missing in your assembler code but are required to generate correct sections and segments.
@Futaura readelf output of the broken executables is in the gcc 9 and 10 thread: writable .text and everything in single segment.
The only difference for the binary builds is GCC (so with ldscripts) vs. "ld" without ldscripts.
Probably the prebuilt ldscripts that come with GCC for OS4 are what make the binary have more sections, have .text as AX instead of WAX, and so on. And when we use pure ld, we fall back to some broken defaults or something, dunno.
@joerg Damn, it was the -N option for "ld", which I used in the past for exactly these reasons: "do not page align data, do not make text readonly".
Blah, I simply reused an old command line of mine, and that is what caused the issues.
So both issues are a false alarm; it's just the -N flag I use when building asm code. Once it is removed, everything looks correct, even for my broken assembler code.
@joerg Quote:
@Futaura readelf output of the broken executables is in the gcc 9 and 10 thread: writable .text and everything in single segment.
The same issue, caused by the -N switch I use for ld :) It says exactly that: make text writable and don't page-align, so everything is put in one segment. Meaning 3 issues were reported because of me using "-N" (damn, only now do I remember why I used it before: I did so to create as simple and small a binary as possible for OS4, for hacking purposes. That's why I played with all those linker options, and now, 10 years later, I just copy+pasted that line and wasted 3 days of everyone's time for nothing ..)
BTW, I know it was only test code, but you should never put data in .text, even if it is read-only. Otherwise, the LTE is likely to bite you and mangle it (I found this out the hard way in OpenSSL).
@joerg, @Futaura Interesting: why does the -N switch also produce a single segment, while without -N there is more than one? The description says "Do not page align data, do not make text readonly". The second part is clear, but how does the first one affect the number of segments? I mean, we do not page-align data, so does that mean no data segment is created because we do not align it?
As if, when we do not page-align the data segment, it is automatically put into the code segment.
I've been following this thread with interest, as I've always used -N when linking my programs. IIRC, the alignment with the standard ldscript is 64K, which ends up leaving a lot of unused empty space in the executable on disk, and I suppose in memory as well. For modest-sized programs the waste is significant, and using -N substantially reduces it.
That's the benefit to using -N. Is there a downside?
@msteed Myself, I used -N only when writing this article, where I tried to make as small an executable as possible, so with just one section, no data and so on. Probably, if one wants a slightly smaller binary, without the zeroes that fill the 64K padding, then why not. It just won't be a typical binary, but there should still be no problems. IMHO, of course.
@msteed The alignment should be 4KB (page size), and it's required because there are 3 parts in AmigaOS executables:
- read-only, executable (.text, the program code)
- read-only, non-executable (.rodata, for example strings)
- writable data (.data, .bss, .sdata, .sbss, etc.)
On CPUs that use segment registers (256 MB each on AmigaOS; 60x, 750, 74xy) instead of the MMU to mark virtual memory as (non-)executable, it may not be required, since .text has to be copied or remapped to the executable segment anyway. But on CPUs which do use the MMU/TLB (for example 440/460) to mark individual pages as (non-)executable, it can't work correctly without page-aligned sections. I guess if you remove the alignment, with ld -N or by using a custom ldscript, some parts of .rodata may be set to be executable, which is of course wrong.
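To put numbers on the cost of that alignment, here is a small sketch of the usual round-up arithmetic. The 4 KB value comes from the post above; the function names are hypothetical, and the alignment is assumed to be a power of two.

```c
#include <stdint.h>

/* Round an offset up to the next multiple of align (a power of two). */
uint32_t align_up(uint32_t offset, uint32_t align)
{
    return (offset + align - 1u) & ~(align - 1u);
}

/* Number of padding bytes needed before the next aligned section. */
uint32_t padding_for(uint32_t offset, uint32_t align)
{
    return align_up(offset, align) - offset;
}
```

For a section ending at offset 300, padding_for(300, 4096) is 3796 bytes, and the worst case is one byte short of a full page. With a 64 KB alignment the same section would need 65236 bytes of padding.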
I was remembering the alignment incorrectly; it is actually 4K. When you're looking at a little Hello World program that's a few hundred bytes, a 4K alignment gap seems enormous.
As a test, I rebuilt my PassPocket program -- which was linked with -N -- using the normal ldscript. I got lucky with the alignment gap there; the program grew by only about 800 bytes, or less than 1%. Worst case it might have grown by nearly 4K, though that's still less than 5%.
To help understand the difference, I looked at the executable for both versions with a hex viewer. Both versions have an alignment gap between the .text section and the .rodata section, so .text can live in its own executable memory page. The page certainly could be made read-only as well, though the -N version's ELF file specifies it as writeable (is there a tool of some sort that would display this information?).
The standard version also has an alignment gap between the .rodata section and the .data section (and all the other writeable sections that follow), while the -N version does not. I suppose that makes the .rodata section writeable in the -N version, since it would live in the same memory page as the other writeable sections.
I'm not sure how big an issue that is. I guess if a bug causes the program to try to write to the .rodata section it would trigger a DSI that would point out the problem and allow it to be fixed, while the -N version would not get this protection. And there would be some degree of protection against some other program running wild and scribbling over the .rodata (or over the .text section, if that's write protected), though that's not something that's very likely to happen.