Anyway, it seems that gcc 2.x.y still supported baserel out of the box, so other targets apparently used baserel, or at least could use it.
Maybe it was something similar, but since -mbaserel on AmigaOS had to use new BREL relocations, it can't be the same.
Quote:
BUT i see that baserel doesn't suffer from the ~65K size limitation. How is that achieved?
Instead of a single 16 bit relative relocation (R_PPC_SDAREL16?) in the small data model, resulting in the 64KB size limit, baserel is using 2 BREL relocations, similar to absolute _LO and _HI relocations.
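To illustrate the split into two relocations described above, here is a minimal sketch of how a 32-bit base-relative offset can be carried by two 16-bit fields, in the same way the usual PowerPC @ha/@l pair does for absolute addresses. The function names (brel_ha, brel_lo, brel_combine, brel_selfcheck) are made up for this example and are not binutils or elf.library identifiers; the exact BREL relocation semantics on AmigaOS are assumed to follow the same arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* High half, adjusted so that adding the sign-extended low half
   reproduces the original offset (the usual PowerPC @ha convention). */
static uint16_t brel_ha(int32_t offset)
{
    return (uint16_t)(((uint32_t)offset + 0x8000u) >> 16);
}

/* Low 16 bits; the CPU sign-extends this displacement. */
static uint16_t brel_lo(int32_t offset)
{
    return (uint16_t)((uint32_t)offset & 0xFFFFu);
}

/* What an addis + addi (or addis + load/store) pair effectively computes. */
static int32_t brel_combine(uint16_t ha, uint16_t lo)
{
    uint32_t high = (uint32_t)ha << 16;
    uint32_t low  = (uint32_t)(int32_t)(int16_t)lo;   /* sign-extended */
    return (int32_t)(high + low);
}

/* Round-trip check for a few offsets, including ones beyond 64 KB. */
static void brel_selfcheck(void)
{
    assert(brel_combine(brel_ha(70000), brel_lo(70000)) == 70000);
    assert(brel_combine(brel_ha(0x12348000), brel_lo(0x12348000)) == 0x12348000);
    assert(brel_combine(brel_ha(-4), brel_lo(-4)) == -4);
}
```

With a single 16-bit field only offsets in a 64 KB window are reachable; with the pair, any 32-bit offset from the base register survives the round trip.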
Quote:
and binutils from adtools and/or older binutils coming in those times with gcc 4.x
The adtools binutils you used in "gcc 9 and 10" can't work: just like the new binutils, it dumps all sections into a single segment. baserel needs a segment with data and bss (only), which can be copied by IElf->CopyDataSegment().
Quote:
The adtools binutils you used in "gcc 9 and 10" can't work, just like the new binutils it dumps all sections into a single segment.
As I wrote in another topic after the tests: it's the same with binutils 2.14 and gcc 4.0.4 too, the same single segment. At least for the simple hello world test case. Maybe 2.14 only splits the segments when -mbaserel is added; I will check.
I'm 100% sure I never got an all-in-one segment with all sections, but OTOH most of the time I used vlink instead of ld.
AFAIK elf.library only uses the segment-, not program- or section-, headers, but Futaura should have more information about the current versions of elf.library.
I don't think anything dumps all sections into a single segment, does it? What does "objdump -ph" say?
AFAIK, at least since 2005 or so, compilers have always generated two segments. That's what happens with GCC 4.x, 11 and VBCC, at least. The first segment contains .text and .rodata (physically marked as read only when loaded - anything that writes to this segment will cause a GR) and the second contains all the other data sections (.data, .bss, .sdata, .sbss, .ctors, .dtors, etc). If there are other versions of GCC that do dump all code and data into a single segment, then that would be an incorrect thing to do.
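As a rough illustration of what "objdump -ph" reports, the following sketch walks the program headers of a big-endian ELF32 image held in memory and counts the LOAD segments, plus how many of them are both writable and executable. The field offsets come from the ELF32 format; the names (summarize_segments, struct seg_summary and the SEG_* constants) are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EHDR_SIZE   52u   /* size of the ELF32 file header     */
#define PHDR_SIZE   32u   /* size of one ELF32 program header  */
#define SEG_PT_LOAD 1u    /* p_type value of a loadable segment */
#define SEG_PF_X    1u    /* p_flags: executable */
#define SEG_PF_W    2u    /* p_flags: writable   */

struct seg_summary {
    unsigned load_segments;   /* number of PT_LOAD entries       */
    unsigned writable_exec;   /* PT_LOAD entries with W and X set */
};

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the image is not big-endian ELF32
   or is too short to hold the headers it announces. */
static int summarize_segments(const unsigned char *img, size_t len,
                              struct seg_summary *out)
{
    assert(img != NULL && out != NULL);
    out->load_segments = 0;
    out->writable_exec = 0;

    if (len < EHDR_SIZE || img[0] != 0x7F || img[1] != 'E' ||
        img[2] != 'L' || img[3] != 'F')
        return -1;
    if (img[4] != 1 || img[5] != 2)   /* ELFCLASS32, ELFDATA2MSB */
        return -1;

    uint32_t phoff     = be32(img + 28);   /* e_phoff     */
    uint16_t phentsize = be16(img + 42);   /* e_phentsize */
    uint16_t phnum     = be16(img + 44);   /* e_phnum     */
    if (phnum > 0 && phentsize < PHDR_SIZE)
        return -1;

    for (uint16_t i = 0; i < phnum; i++) {
        size_t off = (size_t)phoff + (size_t)i * phentsize;
        if (off + PHDR_SIZE > len)
            return -1;
        const unsigned char *ph = img + off;
        if (be32(ph) != SEG_PT_LOAD)       /* p_type */
            continue;
        out->load_segments++;
        uint32_t flags = be32(ph + 24);    /* p_flags */
        if ((flags & SEG_PF_W) && (flags & SEG_PF_X))
            out->writable_exec++;
    }
    return 0;
}
```

For an executable laid out as described above one would expect two LOAD segments and none that is writable and executable at the same time; a single "rwx" segment is the symptom discussed in this thread.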
Regarding IElf->CopyDataSegment(), as the name suggests, it copies the entire data segment (not only .data and .bss). How does it know which segment is the data segment? The original elf.library used the segment containing the .data section, whereas the latest elf.library additionally looks for .bss, .sdata and .sbss too (in case there is no .data section). I use IElf->CopyDataSegment() with both -mbaserel and -msdata.
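The selection rule described above can be sketched as follows. This is only an illustration of the stated behaviour, not elf.library code: struct section_info and find_data_segment are hypothetical, and the order in which the fallback sections are tried is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a section: its name and the index
   of the segment that contains it. Not an elf.library structure. */
struct section_info {
    const char *name;
    int         segment;
};

/* Returns the index of the segment treated as "the data segment",
   or -1 if none of the candidate sections exists. */
static int find_data_segment(const struct section_info *secs, size_t count)
{
    /* .data first (the original behaviour), then the fallbacks.
       The order of the fallbacks is an assumption of this sketch. */
    static const char *const candidates[] = {
        ".data", ".bss", ".sdata", ".sbss"
    };

    assert(secs != NULL || count == 0);

    for (size_t c = 0; c < sizeof candidates / sizeof candidates[0]; c++) {
        for (size_t i = 0; i < count; i++) {
            if (strcmp(secs[i].name, candidates[c]) == 0)
                return secs[i].segment;
        }
    }
    return -1;
}
```

With this rule a program that has no .data section at all, only .sbss for example, still gets its second segment picked as the data segment.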
I've not tried it, but if there were truly only a single segment containing both code and data, probably IElf->CopyDataSegment() might still work. However, it would obviously be suboptimal unnecessarily copying all the code too each time.
A little OT, but I almost forgot, AmiSSL is also restricted to GCC 4.x because it heavily relies on -mcheck68kfuncptr, which I think Stefan Burstroem added, but it never made it into newer releases. This is required for 68k programs to be able to use AmiSSL PPC because OpenSSL uses many function pointers which are user definable. The -mcheck68kfuncptr feature caused function pointers to be checked if they were 68k or PPC code, automatically running it via emulation if 68k.
Quote:
If there are other versions of GCC that do dump all code and data into a single segment, then that would be an incorrect thing to do.
I think it may be because we tested a simple assembler-based test case (only "as" and "ld" involved, not the "GCC" binary with its ldscripts), which may just put everything into the .text segment:
    .text
    .global main
main:
    lis %r3,.msg@ha        #
    la %r3,.msg@l(%r3)     # printf("aaaa");
    bl printf              #
    li %r3,0               # exit(0);
    bl exit                #
.msg:
    .string "aaaa"
Quote:
What does “objdump -ph” say?
For this simple test case, it prints this:
$ objdump -ph test_new
test_new: file format elf32-big
Program Header:
    LOAD off    0x00000054 vaddr 0x01000054 paddr 0x01000054 align 2**2
         filesz 0x00000ffd memsz 0x00001024 flags rwx
So, when GCC is used, we do get separate segments. When we use the simple assembler-based test case without ldscripts, we get the result above.
In other words, it all seems fine; it only acts up when we test assembler builds. When we use "GCC" (so with its ldscripts, etc.), .text is AX (not WAX) as expected, and there is more than one segment.
Edited by kas1e on 2023/8/12 13:55:29 Edited by kas1e on 2023/8/12 14:04:13
Quote:
In other words, it seems all fine, just when we test assembler builds, it acts.
As I wrote already, your assembler code is broken. Check the output of gcc -S test.c; IIRC it doesn't simply include ".text", ".rodata", etc., but has some options after them which are missing in your assembler code and are required to generate correct sections and segments.
@Futaura readelf output of the broken executables is in the gcc 9 and 10 thread: writable .text and everything in single segment.
The only difference between those binary builds is GCC (so with ldscripts) versus "ld" without ldscripts.
Probably the prebuilt ldscripts that come with GCC for OS4 are what give the binary more sections, make .text AX instead of WAX, and so on. And when we use plain ld, we fall back to some broken defaults or something, dunno.
@joerg Damn, it was the -N option for "ld", which I used in the past for exactly these reasons: "do not page align data, do not make text readonly".
Blah, I simply reused an old command line, and that is what caused the issues.
So both issues above are false alarms; it's just the -N flag I used when building the asm code. Once it is removed, everything looks correct, even for my broken assembler code.
@joerg Quote:
@Futaura readelf output of the broken executables is in the gcc 9 and 10 thread: writable .text and everything in single segment.
Same issue, caused by the -N switch I used for ld :) It says exactly that: make text writable and don't page-align, so everything ends up in one segment. Meaning 3 issues were reported because of me using "-N" (damn, only now I remember why I used it before: I was creating as simple and small a binary as possible for OS4, for hacking purposes, which is why I played with all those linker options, and now, 10 years later, I just copy+pasted that line and lost 3 days of all our time for nothing...)
BTW, I know it was only test code, but you should never put data in .text, even if it is read-only. Otherwise, the LTE is likely to bite you and mangle it (I found this out the hard way in OpenSSL).
@joerg, Futaura Interesting: why does the -N switch also produce a single segment, while without -N there is more than one? The description says "Do not page align data, do not make text readonly". The second part is clear, but how does the first part affect the number of segments? I mean, if we do not page-align the data, does that mean no separate data segment is created?
As if not page-aligning the data segment automatically puts it into the code segment.
I've been following this thread with interest, as I've always used -N when linking my programs. IIRC, the alignment with the standard ldscript is 64K, which ends up leaving a lot of unused empty space in the executable on disk, and I suppose in memory as well. For modest-sized programs the waste is significant, and using -N substantially reduces it.
That's the benefit to using -N. Is there a downside?
@msteed Myself, I used -N only when writing that article, where I tried to make the smallest possible executable, so with just one section, no data and so on. Probably, if one wants a slightly smaller binary, without the zeroes that fill the 64K padding, then why not. It just won't be a typical binary, but there should still be no problems. IMHO, of course.
@msteed The alignment should be 4KB (page size), and it's required because there are 3 parts in AmigaOS executables:
- read-only, executable (.text, the program code)
- read-only, non-executable (.rodata, for example strings)
- writable data (.data, .bss, .sdata, .sbss, etc.)
On CPUs that use segment registers (256 MB on AmigaOS; 60x, 750, 74xy) instead of the MMU to mark virtual memory as (non-)executable, it may not be required, since .text has to be copied or remapped to the executable segment anyway. But on CPUs which do use the MMU/TLB (for example 440/460) to mark individual pages as (non-)executable, it can't work correctly without page-aligned sections. I guess if you remove the alignment, with ld -N or by using a custom ldscript, some parts of .rodata may be set to be executable, which is of course wrong.
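The page-alignment argument above comes down to simple arithmetic, sketched here. The helper names (align_up, padding_to_page, share_page) and the PAGE_SIZE_4K constant are made up for this example; only the 4 KB page size is taken from the post.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_4K 4096u   /* MMU page size mentioned above */

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Number of padding bytes needed so the next part starts on a fresh page. */
static uint32_t padding_to_page(uint32_t end_of_part)
{
    return align_up(end_of_part, PAGE_SIZE_4K) - end_of_part;
}

/* Non-zero if two addresses fall into the same 4 KB page, i.e. they
   necessarily get the same per-page protection attributes. */
static int share_page(uint32_t addr_a, uint32_t addr_b)
{
    return (addr_a / PAGE_SIZE_4K) == (addr_b / PAGE_SIZE_4K);
}
```

If the end of one part and the start of the next share a page, the MMU cannot give them different attributes, which is why dropping the alignment can leave part of .rodata executable or writable. The padding is at most one byte short of 4 KB per boundary.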
I was remembering the alignment incorrectly; it is actually 4K. When you're looking at a little Hello World program that's a few hundred bytes, a 4K alignment gap seems enormous.
As a test, I rebuilt my PassPocket program -- which was linked with -N -- using the normal ldscript. I got lucky with the alignment gap there; the program grew by only about 800 bytes, or less than 1%. Worst case it might have grown by nearly 4K, though that's still less than 5%.
To help understand the difference, I looked at the executable for both versions with a hex viewer. Both versions have an alignment gap between the .text section and the .rodata section, so .text can live in its own executable memory page. The page certainly could be made read-only as well, though the -N version's ELF file specifies it as writeable (is there a tool of some sort that would display this information?).
The standard version also has an alignment gap between the .rodata section and the .data section (and all the other writeable sections that follow), while the -N version does not. I suppose that makes the .rodata section writeable in the -N version, since it would live in the same memory page as the other writeable sections.
I'm not sure how big an issue that is. I guess if a bug causes the program to try to write to the .rodata section it would trigger a DSI that would point out the problem and allow it to be fixed, while the -N version would not get this protection. And there would be some degree of protection against some other program running wild and scribbling over the .rodata (or over the .text section, if that's write protected), though that's not something that's very likely to happen.
I've played around some more with linking with and without -N, and what effect that has on the resulting executable.
elf.library allocates tracked memory when loading a program, so the TrackedAddr tab in Ranger conveniently shows all the loaded sections and the addresses they live at (the sizes of all the tracked memory allocations exactly match the section sizes that readelf reports).
Ranger doesn't report the memory attributes of those allocations, so I wrote a quick'n'dirty tool to call IMMU->GetMemoryAttrs() for any address I give it. With this I was able to confirm that the only difference between code linked with -N and code linked without it is that the .text section is writeable with -N, and is read-only with the standard ldscript. .rodata and .__newlib_version are read-only for both ldscript versions, and all the other sections are writeable for both versions.
Curiously, the .text sections are not marked as executable, at least not as reported by IMMU->GetMemoryAttrs(). The address of the loaded .text sections is notably different than all the other sections, so perhaps some other mechanism is used to set aside a block of memory as executable. (Maybe the segment registers joerg mentioned? This is an X1000...)
Leaving the code writeable is more of a risk, especially for a program like PassPocket that is trying to prevent malicious code from potentially messing with it. So I guess it's time to change my linking habits.
@All We are close to a beta state for a fresh release of the new binutils, but for that we need to pass some more tests in areas where we had issues before, and one of the last issues we want to sort out is the "baserel" stuff.
But testing this is not trivial (for us at least), as we don't know how to write a proper test case for baserel.
All we found is that adtools contains some old, unmaintained test cases, written years ago to be run on QEMU; there they are:
It may well be that back in the day -me500 was enough to tell AS to handle the proper opcodes, but today it surely isn't, so for the sake of the tests I passed -mcpu=e5500 instead to make it compile. But then the binary simply crashes.
All of this sounds kind of hardcore, so maybe someone can write a simple little test case for both the baserel and baserel-large parts (the ones we probably need to support), without all those CPU-specific opcodes, so we can test it?
It has to do with the MMU, so perhaps something can be rewritten to use the ExecSG MMU API instead.
For simple testing, does the baserel stuff need to use any registers/MMU/whatever at all? I mean, isn't it possible to write a very simple and basic test case that tests the baserel features?
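As one possible starting point for such a basic test, here is a sketch in plain C that only touches data: it places more than 64 KB in both .data and .bss and checks that values and a data-to-data pointer come out right. It is an assumption of this sketch that building it with -mbaserel makes the compiler and linker emit the base-relative relocations for these accesses; it does not exercise IElf->CopyDataSegment() or any MMU code, and the names (BIG_COUNT, big_bss, big_data, baserel_data_check) are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BIG_COUNT 40000   /* 40000 ints: well past 64 KB with 4-byte int */

int  big_bss[BIG_COUNT];                        /* zero-initialised: .bss  */
int  big_data[BIG_COUNT] = { 1, 2, 3 };         /* initialised: .data      */
int *ptr_into_data = &big_data[BIG_COUNT - 1];  /* pointer stored in data,
                                                   pointing into data      */
const char *text_ptr = "baserel";               /* pointer to a constant   */

/* Returns the number of failed checks (0 means everything matched). */
static int baserel_data_check(void)
{
    int failures = 0;
    size_t i;

    /* The point of the test: the data must not fit a 64 KB window. */
    assert(sizeof big_data > 65536 && sizeof big_bss > 65536);

    for (i = 0; i < BIG_COUNT; i++)
        big_bss[i] = (int)i;
    big_data[BIG_COUNT - 1] = 0x1234;

    if (big_bss[BIG_COUNT - 1] != BIG_COUNT - 1)
        failures++;
    if (big_data[0] != 1 || big_data[2] != 3)
        failures++;
    if (*ptr_into_data != 0x1234)      /* read back through the pointer */
        failures++;
    if (strcmp(text_ptr, "baserel") != 0)
        failures++;

    return failures;
}
```

A main() that calls baserel_data_check() and returns its result would turn this into a standalone program; comparing the readelf relocation output of the object file with and without -mbaserel would then show whether the BREL relocations are actually produced.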