Discussion:
Tuple and changes for m68k with -malign-int
John Paul Adrian Glaubitz
2023-08-26 11:00:02 UTC
Hi James!
I wasn't sure whether to send this to libc-alpha or here. This feels more like
a request for help, so I decided to play it safe. :)
I am CC'ing Debian's m68k mailing list and the Linux m68k kernel mailing list
to make sure we're getting enough exposure.
The Debian m68k maintainers proposed building their packages with -malign-int
last year, aligning to 32-bit instead of 16-bit, which improves compatibility
with some projects and should give better performance on 68020+, at the cost
of slightly increased memory usage. The mold linker is at least one project
that has been shown to work after making this change where it previously
didn't.
Not only mold but also most notably the following projects:

- LLVM
- Firebird Database
- OpenJDK
- Various Qt packages

It's a regular occurrence that a package doesn't build on m68k due to its unusual
default alignment. Thus, in order to keep the port alive in the future, I think
switching to 32-bit alignment by default is inevitable.
It goes against the traditional ABIs, but practically no m68k Linux binaries
are published outside of distributions, so this is not a concern. We need to
break the ABI anyway with time_t going 64-bit, so it makes sense to do these
two things at the same time.
Fully agreed.
We in Gentoo fully support this idea. We had hoped that Debian would take the
initiative, but we're not aware of any movement yet, and we're keen to make
this transition, so I'm here to get the ball rolling.
We haven't had a larger discussion yet and I didn't want to impose any changes
before we have agreed on how to move forward. Thanks a lot for finally starting
the discussion.
We think this warrants a new tuple, and we'd like to ensure that everyone gets
behind the same one. It is currently m68k-*-gnu. Perhaps it could be
m68k-*-gnu32 or m68k-*-gnu32a? I considered gnu32i (for int), but the flag
actually affects floats and doubles too. I don't really care what it is
though, so feel free to suggest something totally different.
I think -gnu32 sounds very reasonable. I'm actually also wondering what is being
used for other ports that are going to be rebuilt with 64-bit time_t. Maybe we
could use that naming scheme. I guess using "gnu32" for any 32-bit port with
64-bit time_t might not be the obvious choice.

So, while I like the gnu32 suffix, I would suggest we do some research first to find
out which triplet change is commonly being used for 32-bit ports switching
to 64-bit time_t.
Once that is agreed, I'm happy to put together the patch to automatically
enable the flag for this tuple in GCC. The part I do need help with is
necessary changes to glibc, if any. Assembly is not my area at all, so what I
came up with here was a total guess.
Thanks for already looking into the implementation details!
--- a/sysdeps/m68k/crti.S 2022-07-29 23:03:09.000000000 +0100
+++ b/sysdeps/m68k/crti.S 2022-11-30 21:41:52.710135230 +0000
@@ -56,7 +56,7 @@
#endif
- .align 2
+ .p2align 2
.globl _init
.hidden _init
@@ -74,7 +74,7 @@
#endif
- .align 2
+ .p2align 2
.globl _fini
.hidden _fini
I did try this out, and it largely seemed to work, although processes
occasionally hung. Perhaps this was unrelated.
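For readers unfamiliar with the two directives: as far as I understand the GNU assembler documentation, on m68k ".align N" takes a byte count, whereas ".p2align N" always takes a power of two. If that is right, the diff above changes the alignment of _init and _fini from 2 bytes to 4 bytes. A minimal C sketch of that arithmetic follows; the macro names ALIGN_UP and P2ALIGN_UP are made up for illustration and are not part of glibc or binutils.

```c
#include <stdint.h>

/* Round addr up to the next multiple of bytes (bytes must be a power of two). */
#define ALIGN_UP(addr, bytes) \
    (((uintptr_t)(addr) + (uintptr_t)(bytes) - 1) & ~((uintptr_t)(bytes) - 1))

/* Model of ".p2align p": align to 2^p bytes. */
#define P2ALIGN_UP(addr, p) ALIGN_UP((addr), (uintptr_t)1 << (p))

/* Address 6 is already on a 2-byte boundary (".align 2" read as a byte count)... */
_Static_assert(ALIGN_UP(6, 2) == 6, "6 is 2-byte aligned");
/* ...but ".p2align 2" pushes it to the next 4-byte boundary. */
_Static_assert(P2ALIGN_UP(6, 2) == 8, "6 rounds up to 8 for 4-byte alignment");
```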
It was a while back now and I can't remember if I also built the Linux kernel
with -malign-int. Does it need to match? Presumably it would at least give the
same kind of performance benefit?
I cannot comment on this at the moment, so let's wait for the more experienced
m68k kernel and toolchain folks to chime in.
Thanks for helping to keep m68k alive.
Thank you, too, and thanks for getting this rolling!

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Richard
2023-08-26 19:40:01 UTC
Post by John Paul Adrian Glaubitz
Hi James!
I wasn't sure whether to send this to libc-alpha or here. This feels more like
a request for help, so I decided to play it safe. :)
I am CC'ing Debian's m68k mailing list and the Linux m68k kernel mailing list
to make sure we're getting enough exposure.
The Debian m68k maintainers proposed building their packages with -malign-int
last year, aligning to 32-bit instead of 16-bit, which improves compatibility
with some projects and should give better performance on 68020+, at the cost
of slightly increased memory usage. The mold linker is at least one project
that has been shown to work after making this change where it previously
didn't.
a linker that is broken by a slightly unusual alignment isn't exactly a prime example.. if any project I would expect linkers and binary tools to pay attention to portability.
Post by John Paul Adrian Glaubitz
- LLVM
Ok .. too big to complain about.. and see above.
Post by John Paul Adrian Glaubitz
- OpenJDK
OpenJDK has not only that one problem.
Post by John Paul Adrian Glaubitz
It's a regular occurrence that a package doesn't build on m68k due to its unusual
default alignment.
Unfortunately. Some time ago m68k was not the only one with this problem?
Post by John Paul Adrian Glaubitz
Thus, in order to keep the port alive in the future, I think
switching to 32-bit alignment by default is inevitable.
Ok.
Post by John Paul Adrian Glaubitz
We need to
break the ABI anyway with time_t going 64-bit, so it makes sense to do these
two things at the same time.
What exactly will be broken? Afaics kernel ABIs have long been pretty carefully designed to avoid these problems and none of the mentioned examples should touch them anyway.

Thus.. is there any need to change the kernel ABI?

Richard
James Le Cuirot
2023-08-26 21:10:01 UTC
Post by Richard
Post by John Paul Adrian Glaubitz
Hi James!
I wasn't sure whether to send this to libc-alpha or here. This feels more like
a request for help, so I decided to play it safe. :)
I am CC'ing Debian's m68k mailing list and the Linux m68k kernel mailing list
to make sure we're getting enough exposure.
The Debian m68k maintainers proposed building their packages with -malign-int
last year, aligning to 32-bit instead of 16-bit, which improves compatibility
with some projects and should give better performance on 68020+, at the cost
of slightly increased memory usage. The mold linker is at least one project
that has been shown to work after making this change where it previously
didn't.
a linker that is broken by a slightly unusual alignment isn't exactly a prime example.. if any project I would expect linkers and binary tools to pay attention to portability.
Not the best example, I grant you, but it was the only one where I'd
personally witnessed it making a difference so far.
Post by Richard
Post by John Paul Adrian Glaubitz
It's a regular occurrence that a package doesn't build on m68k due to its unusual
default alignment.
Unfortunately. Some time ago m68k was not the only one with this problem?
Possibly, but I wouldn't know. I suspect it may be the only one still in use
with Linux. Gentoo supports most of the architectures to some degree, and I'm
not aware of any of those having this issue.
Post by Richard
Post by John Paul Adrian Glaubitz
We need to
break the ABI anyway with time_t going 64-bit, so it makes sense to do these
two things at the same time.
What exactly will be broken? Afaics kernel ABIs have long been pretty carefully designed to avoid these problems and none of the mentioned examples should touch them anyway.
Thus.. is there any need to change the kernel ABI?
I mentioned the kernel, but I'm not sure whether that's actually affected.
This is more about userland compatibility in the same way that arm-*-gnu,
arm-*-gnueabi, and arm-*-gnueabihf are incompatible with each other. I did try
mixing the latter two once. This was swiftly met with a segfault.

Of course, a tuple doesn't stop users from mixing these binaries, but it is a
good way to ensure that GCC enables the flag when appropriate. This is too
important to rely on CFLAGS.

As for time_t, I hadn't realised a different tuple was being proposed for
that, but a fellow Gentoo dev confirms. The breakage here is less severe but
still significant. I witnessed it first-hand on 32-bit ARM when GnuTLS started
using 64-bit time_t while curl was still expecting 32-bit, which led to HTTPS
requests failing because the certificate start/end dates were completely
wrong. At that point, we realised this is something that needs to be applied
system-wide.
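
To make the failure mode concrete, here is a rough C sketch of what can go wrong when one component stores a 64-bit timestamp and another reads it as 32-bit. This is not the actual GnuTLS or curl code; the type and function names are hypothetical, and the sketch only models the size mismatch.

```c
#include <stdint.h>
#include <string.h>

typedef int64_t lib_time_t;  /* hypothetical: library built with 64-bit time_t */
typedef int32_t app_time_t;  /* hypothetical: caller built with 32-bit time_t */

/* The caller only looks at the first sizeof(app_time_t) bytes
 * of the value the library stored. */
app_time_t read_with_old_abi(const lib_time_t *stored)
{
    app_time_t seen;
    memcpy(&seen, stored, sizeof seen);
    return seen;
}
```

On a big-endian machine such as m68k, the first four bytes are the high half of the 64-bit value, so a present-day timestamp would read back as 0. On a little-endian machine the low half is seen, which looks right only as long as the value still fits in 32 bits.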

I believe we're still waiting on consensus for that too. gnu64time anyone?
It's 2023, how about gnu🕛64? ;)
Geert Uytterhoeven
2023-08-28 07:00:02 UTC
On Sat, Aug 26, 2023 at 11:00 PM James Le Cuirot
Post by James Le Cuirot
Post by Richard
Post by John Paul Adrian Glaubitz
It's a regular occurrence that a package doesn't build on m68k due to its unusual
default alignment.
Unfortunately. Some time ago m68k was not the only one with this problem?
Possibly, but I wouldn't know. I suspect it may be the only one still in use
with Linux. Gentoo supports most of the architectures to some degree, and I'm
not aware of any of those having this issue.
AXIS CRIS was in the same (or a similar) boat, but support for CRIS
was dropped in Linux v4.17.

Gr{oetje,eeting}s,

Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ***@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
John Paul Adrian Glaubitz
2023-08-28 11:00:01 UTC
Post by Richard
a linker that is broken by a slightly unusual alignment isn't exactly a
prime example.. if any project I would expect linkers and binary tools
to pay attention to portability.
Portable shouldn't mean having to accommodate unreasonable design decisions
of other developers. It's perfectly fine to assume 32-bit natural alignment on
a 32-bit platform and I don't think it's fair to put the burden of adapting to
unusual design decisions onto upstream projects.

This kind of attitude was certainly one of the reasons why the Itanium architecture
failed. Its designers made weird decisions which made life hard for upstream developers
and most of them were happy when the architecture was finally abandoned.
Post by Richard
Post by John Paul Adrian Glaubitz
- LLVM
Ok .. too big to complain about.. and see above.
It's also nearly impossible to make LLVM work with 16-bit alignment because the code uses
certain packed data types which require 32-bit alignment or higher.
Post by Richard
Post by John Paul Adrian Glaubitz
- OpenJDK
OpenJDK has not only that one problem.
That's an unnecessary remark that is not helpful here. Please don't do that!
Post by Richard
Post by John Paul Adrian Glaubitz
It's a regular occurrence that a package doesn't build on m68k due to its unusual
default alignment.
Unfortunately. Some time ago m68k was not the only one with this problem?
Well, as mentioned above, other architectures with weird requirements such as Itanium
have been abandoned and most upstream projects were happy when this finally happened.
Post by Richard
Post by John Paul Adrian Glaubitz
Thus, in order to keep the port alive in the future, I think
switching to 32-bit alignment by default is inevitable.
Ok.
Post by John Paul Adrian Glaubitz
We need to
break the ABI anyway with time_t going 64-bit, so it makes sense to do these
two things at the same time.
What exactly will be broken? Afaics kernel ABIs have long been pretty carefully
designed to avoid these problems and none of the mentioned examples should touch them anyway.
Thus.. is there any need to change the kernel ABI?
I don't think this mandates changes to the kernel ABI.

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Richard
2023-08-28 12:20:01 UTC
Post by John Paul Adrian Glaubitz
Post by Richard
a linker that is broken by a slightly unusual alignment isn't exactly a
prime example.. if any project I would expect linkers and binary tools
to pay attention to portability.
Portable shouldn't mean having to accommodate unreasonable design decisions
of other developers. It's perfectly fine to assume 32-bit natural alignment on
a 32-bit platform and I don't think it's fair to put the burden of adapting to
unusual design decisions onto upstream projects.
Assuming anything that is not declared by the c standard is not good imho. The C lang is well known for its pitfalls and the basic binary tools ought not to set bad precedents ignoring those.

It is also reasonable to assume that on modern hw cache is filled in blocks of perhaps 1k or more and thus "unnatural" alignment might actually help performance because more fits into that one data burst.
Post by John Paul Adrian Glaubitz
Post by Richard
Thus.. is there any need to change the kernel ABI?
I don't think this mandates changes to the kernel ABI.
That would be really good, anything else could be handled by library versioning in a mostly backwards compatible way?

Richard
Geert Uytterhoeven
2023-08-28 12:30:01 UTC
Hi Richard,
Post by Richard
Post by John Paul Adrian Glaubitz
Post by Richard
a linker that is broken by a slightly unusual alignment isn't exactly a
prime example.. if any project I would expect linkers and binary tools
to pay attention to portability.
Portable shouldn't mean having to accommodate unreasonable design decisions
of other developers. It's perfectly fine to assume 32-bit natural alignment on
a 32-bit platform and I don't think it's fair to put the burden of adapting to
unusual design decisions onto upstream projects.
Assuming anything that is not declared by the c standard is not good imho. The C lang is well known for its pitfalls and the basic binary tools ought not to set bad precedents ignoring those.
It is also reasonable to assume that on modern hw cache is filled in blocks of perhaps 1k or more and thus "unnatural" alignment might actually help performance because more fits into that one data burst.
"1k" (I assume you mean 1 KiB?) is a bit much...

Note that on several architectures you cannot do unaligned accesses,
so you have to declare such a structure with __attribute__((__packed__)),
and thus not only live with the overhead of doing unaligned accesses
from the D-cache, but also in emulating them in software...
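
As a rough illustration, this is what such a declaration looks like with GCC or Clang. The struct and function names are made up, and the asserted numbers assume a 16-bit short and a 32-bit int.

```c
#include <stddef.h>

struct foo_packed {
    short x;
    int y;      /* placed directly after x, with no padding */
} __attribute__((__packed__));

_Static_assert(offsetof(struct foo_packed, y) == 2, "y follows x immediately");
_Static_assert(sizeof(struct foo_packed) == 6, "no padding anywhere");

/* Accessing the member through the packed type lets the compiler pick
 * a load sequence that is safe for a possibly misaligned address. */
int read_y(const struct foo_packed *p)
{
    return p->y;
}
```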

Gr{oetje,eeting}s,

Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ***@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
John Paul Adrian Glaubitz
2023-08-28 12:50:01 UTC
Post by Richard
Post by John Paul Adrian Glaubitz
Post by Richard
a linker that is broken by a slightly unusual alignment isn't exactly a
prime example.. if any project I would expect linkers and binary tools
to pay attention to portability.
Portable shouldn't mean having to accommodate unreasonable design decisions
of other developers. It's perfectly fine to assume 32-bit natural alignment on
a 32-bit platform and I don't think it's fair to put the burden of adapting to
unusual design decisions onto upstream projects.
Assuming anything that is not declared by the c standard is not good imho. The C
lang is well known for its pitfalls and the basic binary tools ought not to set
bad precedents ignoring those.
It is also reasonable to assume that on modern hw cache is filled in blocks of perhaps
1k or more and thus "unnatural" alignment might actually help performance because more
fits into that one data burst.
This is a very academic discussion really and doesn't really solve the problem we're
seeing. We're here to solve a technical problem, not to discuss whether something is
according to the C standard.

Upstream projects decide on their own what maintenance burden they're willing to accept
and what not. If they don't think it's reasonable to accommodate the specific m68k
alignment requirements, the burden of keeping these packages working is on the distribution
maintainers, meaning that I will have to continue spending time unbreaking packages like
OpenJDK in Debian, which I would prefer not to have to do in the future.

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Finn Thain
2023-08-27 01:10:01 UTC
Post by John Paul Adrian Glaubitz
...
- LLVM
- Firebird Database
- OpenJDK
- Various Qt packages
And potentially more in the future, which may be anticipated on the basis
that "those users don't need a stable ABI any more, so let's just ignore
the portability issues in our code and leave the problem to the distros
and toolchain developers".

That is the precedent you would set.

Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
Post by John Paul Adrian Glaubitz
It goes against the traditional ABIs, but practically no m68k Linux
binaries are published outside of distributions, so this is not a
concern.
It is of concern to some users (though not all, apparently).
Post by John Paul Adrian Glaubitz
We need to break the ABI anyway with time_t going 64-bit, so it makes
sense to do these two things at the same time.
Fully agreed.
If the kernel breaks the ABI, that's a bug, not an excuse. Either you're
okay with proliferation of incompatible binaries and tools or there are
some criteria (yet to be identified AFAIK) which permit this bug.

It's not difficult to foresee fragmentation because it follows from the
manpower shortage. There will always be sufficient manpower to produce a
break that pleases a few. There may never be enough manpower to produce a
stable ABI that pleases everyone for the foreseeable future.
Post by John Paul Adrian Glaubitz
I think -gnu32 sounds very reasonable.
You do? I think 32 is misleading in the absence of 16-bit or 64-bit
variants, and -gnu is misleading if other tooling like LLVM already
supports malign-int. Moreover, it's impossible to align to a bit count in
general. Not that you'd want to -- it's actually the natural alignment of
shorts that is at issue, AIUI.

So, for naming purposes, the proposal might be described as either the ABI
du jour (leading to -abi23 for 2023) or the new ABI for ever (leading to
-abin as in -gnuabin32 on MIPS).

If it's the former, perhaps you should not push it upstream. If it's the
latter, perhaps this redesign should seek to address real shortcomings
with the existing ABI, including problems which (for all I know) may have
entirely prevented some people from using it thus far. That is, it should
consider silicon beyond 680x0.
James Le Cuirot
2023-08-27 09:40:01 UTC
Post by Finn Thain
Post by John Paul Adrian Glaubitz
...
- LLVM
- Firebird Database
- OpenJDK
- Various Qt packages
And potentially more in the future, which may be anticipated on the basis
that "those users don't need a stable ABI any more, so let's just ignore
the portability issues in our code and leave the problem to the distros
and toolchain developers".
That is the precedent you would set.
Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
Some projects do accept patches. Yann Collet was even kind enough to fix this
in zstd themselves. On the other hand, we have had to fight to stop Python
from dropping m68k support entirely. The real problem is the effort required
to produce these patches. I haven't been able to wrap my head around this so
far, but I would still like to learn. I could see myself eventually fixing
mold, but LLVM feels like a very tall order.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
We need to break the ABI anyway with time_t going 64-bit, so it makes
sense to do these two things at the same time.
Fully agreed.
If the kernel breaks the ABI, that's a bug, not an excuse. Either you're
okay with proliferation of incompatible binaries and tools or there are
some criteria (yet to be identified AFAIK) which permit this bug.
If you're referring to time_t, the kernel is not breaking the ABI. New
syscalls were added to 32-bit architectures for 64-bit time_t. The
incompatibility is within userland, such as in the curl vs GnuTLS example I
mentioned.
Post by Finn Thain
It's not difficult to foresee fragmentation because it follows from the
manpower shortage. There will always be sufficient manpower to produce a
break that pleases a few. There may never be enough manpower to produce a
stable ABI that pleases everyone for the foreseeable future.
Since this is about userland, are you suggesting that all userland ABIs should
simultaneously support both 32-bit and 64-bit time_t? That would never happen,
especially when 32-bit time_t will naturally become useless.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
I think -gnu32 sounds very reasonable.
You do? I think 32 is misleading in the absence of 16-bit or 64-bit
variants, and -gnu is misleading if other tooling like LLVM already
supports malign-int. Moreover, it's impossible to align to a bit count in
general. Not that you'd want to -- it's actually the natural alignment of
shorts that is at issue, AIUI.
I picked -gnu because this is a variation on what we have already and I've
never heard of glibc using anything other than -gnu*. You still use -gnu when
building with Clang, so I'm not sure what Clang supporting -malign-int has to
do with it. Of course, glibc is not the only libc, but the others are not
compatible anyway and have their own tuples. They will presumably follow suit
though, as they have done in the past, e.g. -gnueabihf -> -musleabihf.
Post by Finn Thain
So, for naming purposes, the proposal might be described as either the ABI
du jour (leading to -abi23 for 2023) or the new ABI for ever (leading to
-abin as in -gnuabin32 on MIPS).
If it's the former, perhaps you should not push it upstream. If it's the
latter, perhaps this redesign should seek to address real shortcomings
with the existing ABI, including problems which (for all I know) may have
entirely prevented some people from using it thus far. That is, it should
consider silicon beyond 680x0.
I'm not sure what you mean here. I don't think anyone has been prevented from
using the existing ABI when it is the only m68k ABI on Linux. We *are*
considering other architectures with the time_t issue. I haven't heard anyone
shouting about any other common issues. They should really be shouting about
time_t, as it is somewhat pressing, but surprisingly little has been said
about it.

I do know that m68k Linux has been significantly slower since the transition
from linuxthreads to NPTL due to the lack of a spare register, but I gather
nothing can be done about that.
Richard
2023-08-27 11:30:02 UTC
Post by James Le Cuirot
Post by Finn Thain
If the kernel breaks the ABI, that's a bug, not an excuse.
...
...
Post by James Le Cuirot
I do know that m68k Linux has been significantly slower since the transition
from linuxthreads to NPTL due to the lack of a spare register, but I gather
nothing can be done about that.
Thanks for saying that. Radically redefining "c" after 35 years of existence for next to zero gain wasn't such a great idea imho.

I hope the kernel ABI can remain stable and everything else is the problem of libraries?

Richard
Geert Uytterhoeven
2023-08-28 07:10:01 UTC
On Sun, Aug 27, 2023 at 11:36 AM James Le Cuirot
Post by James Le Cuirot
Post by Finn Thain
Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
Some projects do accept patches. Yann Collet was even kind enough to fix this
in zstd themselves. On the other hand, we have had to fight to stop Python
from dropping m68k support entirely. The real problem is the effort required
to produce these patches. I haven't been able to wrap my head around this so
far, but I would still like to learn. I could see myself eventually fixing
mold, but LLVM feels like a very tall order.
Perhaps we need a new compiler warning: "hole in structure due to
non-natural alignment, please consider adding explicit padding"?

Gr{oetje,eeting}s,

Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ***@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Richard
2023-08-28 11:30:01 UTC
Post by Geert Uytterhoeven
On Sun, Aug 27, 2023 at 11:36 AM James Le Cuirot
Post by James Le Cuirot
Post by Finn Thain
Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
Some projects do accept patches. Yann Collet was even kind enough to fix this
in zstd themselves. On the other hand, we have had to fight to stop Python
from dropping m68k support entirely. The real problem is the effort required
to produce these patches. I haven't been able to wrap my head around this so
far, but I would still like to learn. I could see myself eventually fixing
mold, but LLVM feels like a very tall order.
Perhaps we need a new compiler warning: "hole in structure due to
non-natural alignment, please consider adding explicit padding"?
Sounds reasonable but I am afraid in 99% of cases this would be completely irrelevant and not break anything so the acceptance would be pretty low.

The problem arises only when people start doing "strange" things with such structs. Can we define strange things in a better way? It appears to me all modern c standards somewhat lack an attribute to mark a struct as being "special use" and thus emit more warnings and avoid some kinds of trickery.

Richard
Geert Uytterhoeven
2023-08-28 11:50:01 UTC
Hi Richard,
Post by Richard
Post by Geert Uytterhoeven
On Sun, Aug 27, 2023 at 11:36 AM James Le Cuirot
Post by James Le Cuirot
Post by Finn Thain
Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
Some projects do accept patches. Yann Collet was even kind enough to fix this
in zstd themselves. On the other hand, we have had to fight to stop Python
from dropping m68k support entirely. The real problem is the effort required
to produce these patches. I haven't been able to wrap my head around this so
far, but I would still like to learn. I could see myself eventually fixing
mold, but LLVM feels like a very tall order.
Perhaps we need a new compiler warning: "hole in structure due to
non-natural alignment, please consider adding explicit padding"?
Sounds reasonable but I am afraid in 99% of cases this would be completely irrelevant and not break anything so the acceptance would be pretty low.
The problem arises only when people start doing "strange" things with such structs. Can we define strange things in a better way? It appears to me all modern c standards somewhat lack an attribute to mark a struct as being "special use" and thus emit more warnings and avoid some kinds of trickery.
Do you consider

struct foo {
short x;
int y;
} bar;

a "strange" thing? In se it's not strange.

Unless someone starts doing:

assert(sizeof(struct foo) == 8);

or:

write(fd, &bar, sizeof(bar));

and expects this to be portable/interoperable (ignoring endianness
for now).

IIRC, there are similar issues with the alignment of long long and double
on some 32-bit platforms, where they would not be aligned naturally.

In Linux userspace APIs, we always[*] use natural alignment and
explicit padding.

[*] try to.
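
For illustration, here is a small sketch of the explicit-padding approach applied to the struct above. The name foo_explicit, the pad member and the helper foo_hole are made up for this example, and the numbers assume a 16-bit short and a 32-bit int. With the traditional m68k ABI, struct foo has y at offset 2 and a size of 6; with natural alignment, y is at offset 4 and the size is 8. The padded variant should have the same layout under both.

```c
#include <stddef.h>

struct foo {
    short x;
    int y;          /* offset 2 or 4, depending on the ABI */
};

struct foo_explicit {
    short x;
    short pad;      /* explicit padding instead of a compiler-inserted hole */
    int y;          /* offset 4 under both ABIs */
};

_Static_assert(offsetof(struct foo_explicit, y) == 4, "y must sit at offset 4");
_Static_assert(sizeof(struct foo_explicit) == 8, "size must not depend on the ABI");

/* Size of the implicit hole before y in the unpadded struct
 * (0 with 16-bit int alignment, 2 with natural alignment). */
size_t foo_hole(void)
{
    return offsetof(struct foo, y) - sizeof(short);
}
```

With the layout pinned down like this, a check such as assert(sizeof(struct foo_explicit) == 8) no longer depends on whether -malign-int is in effect.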

Gr{oetje,eeting}s,

Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ***@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Richard
2023-08-28 20:20:01 UTC
Post by Richard
Post by Geert Uytterhoeven
On Sun, Aug 27, 2023 at 11:36 AM James Le Cuirot
Perhaps we need a new compiler warning: "hole in structure due to
non-natural alignment, please consider adding explicit padding"?
Sounds reasonable but I am afraid in 99% of cases this would be completely irrelevant and not break anything so the acceptance would be pretty low.
On second thought, that warning might get some acceptance if it is formulated slightly differently.. making it more clear that the way the struct is arranged will waste memory in addition to creating potential portability problems?


Richard
Geert Uytterhoeven
2023-08-29 07:00:01 UTC
Hi Richard,
Post by Richard
Post by Richard
Post by Geert Uytterhoeven
On Sun, Aug 27, 2023 at 11:36 AM James Le Cuirot
Perhaps we need a new compiler warning: "hole in structure due to
non-natural alignment, please consider adding explicit padding"?
Sounds reasonable but I am afraid in 99% of cases this would be completely irrelevant and not break anything so the acceptance would be pretty low.
On second thought, that warning might get some acceptance if it is formulated slightly differently.. making it more clear that the way the struct is arranged will waste memory in addition to creating potential portability problems?
It will not always waste memory, only if some members can be moved
into holes.

Anyway, not wasting memory is merely an optimization.
Creating portability problems is a bug.

Gr{oetje,eeting}s,

Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ***@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Geert Uytterhoeven
2023-08-28 07:00:02 UTC
Post by Finn Thain
Post by John Paul Adrian Glaubitz
- LLVM
- Firebird Database
- OpenJDK
- Various Qt packages
And potentially more in the future, which may be anticipated on the basis
that "those users don't need a stable ABI any more, so let's just ignore
the portability issues in our code and leave the problem to the distros
and toolchain developers".
Indeed, the world is slowly turning into "everything is 64-bit little endian"...
Post by Finn Thain
Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
I guess you mean "ints" and "longs" instead of "shorts"?

Gr{oetje,eeting}s,

Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ***@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
John Paul Adrian Glaubitz
2023-08-28 11:20:01 UTC
Post by Geert Uytterhoeven
Post by Finn Thain
And potentially more in the future, which may be anticipated on the basis
that "those users don't need a stable ABI any more, so let's just ignore
the portability issues in our code and leave the problem to the distros
and toolchain developers".
Indeed, the world is slowly turning into "everything is 64-bit little endian"...
Well, if we want to prevent that from happening in the future, we should make sure that
the m68k port is prepared for the future.

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Finn Thain
2023-08-29 01:40:01 UTC
Post by John Paul Adrian Glaubitz
Post by Geert Uytterhoeven
Post by Finn Thain
And potentially more in the future, which may be anticipated on the
basis that "those users don't need a stable ABI any more, so let's
just ignore the portability issues in our code and leave the problem
to the distros and toolchain developers".
Indeed, the world is slowly turning into "everything is 64-bit little endian"...
Well, if we want to prevent that from happening in the future, we should make
sure that the m68k port is prepared for the future.
Agreed. And if we get it right, all those 64-bit architectures will not
find themselves in the same predicament m68k is in now, once vendors of
shiny 128-bit and 256-bit architectures start tossing them on the scrap
heap. How will they avoid that predicament? By following our lead, and
making struct member alignment decisions explicit.
John Paul Adrian Glaubitz
2023-08-28 11:20:01 UTC
Post by Finn Thain
Post by John Paul Adrian Glaubitz
- LLVM
- Firebird Database
- OpenJDK
- Various Qt packages
And potentially more in the future, which may be anticipated on the basis
that "those users don't need a stable ABI any more, so let's just ignore
the portability issues in our code and leave the problem to the distros
and toolchain developers".
It's reasonable to assume that a 32-bit architecture uses 32-bit alignment and
I understand every single upstream project that doesn't want to care about the obscure
design decisions of some ABI designers of the past.
Post by Finn Thain
That is the precedent you would set.
No, I wouldn't set such precedent. I would fix something that has been broken
for years and has caused endless headaches for people maintaining the m68k port
in Linux distributions.

And since we have to break the ABI anyway to be able to use 64-bit time_t, I don't
see any valid reason to stick to the problematic 16-bit alignment used by the current
ABI.
Post by Finn Thain
Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
The problem isn't upstream projects but the lack of manpower to work on all these
issues. Talk is cheap when there is hardly anyone doing this work.

I have invested a ton of work to get the m68k port into better shape and with the
help of the community, we even managed to land m68k support in LLVM. It was a HUGE
disappointment to me when the 16-bit alignment again caused trouble for a relevant
upstream project on m68k meaning that LLVM can currently not be used natively on
m68k.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
It goes against the traditional ABIs, but practically no m68k Linux
binaries are published outside of distributions, so this is not a
concern.
It is of concern to some users (though not all, apparently).
If these users really cared, they would actually help address these issues. I haven't
seen any contributions trying to address these issues outside my efforts and the efforts
of the Gentoo developers.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
We need to break the ABI anyway with time_t going 64-bit, so it makes
sense to do these two things at the same time.
Fully agreed.
If the kernel breaks the ABI, that's a bug, not an excuse. Either you're
okay with proliferation of incompatible binaries and tools or there are
some criteria (yet to be identified AFAIK) which permit this bug.
It's not difficult to foresee fragmentation because it follows from the
manpower shortage. There will always be sufficient manpower to produce a
break that pleases a few. There may never be enough manpower to produce a
stable ABI that pleases everyone for the foreseeable future.
Again, talk is cheap. Show me the code.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
I think -gnu32 sounds very reasonable.
You do? I think 32 is misleading in the absence of 16-bit or 64-bit
variants, and -gnu is misleading if other tooling like LLVM already
supports malign-int. Moreover, it's impossible to align to a bit count in
general. Not that you'd want to -- it's actually the natural alignment of
shorts that is at issue, AIUI.
Yes, I do and that's just my personal opinion. But as I said, I am open to
other naming suggestions.
Post by Finn Thain
So, for naming purposes, the proposal might be described as either the ABI
du jour (leading to -abi23 for 2023) or the new ABI for ever (leading to
-abin as in -gnuabin32 on MIPS).
That's why I suggested we look at how the ARM developers will name their
triplet when switching to 64-bit time_t on 32-bit ARM systems.
Post by Finn Thain
If it's the former, perhaps you should not push it upstream. If it's the
latter, perhaps this redesign should seek to address real shortcomings
with the existing ABI, including problems which (for all I know) may have
entirely prevented some people from using it thus far. That is, it should
consider silicon beyond 680x0.
It's a historic architecture. We don't have to redesign everything. It's enough
to address the most pressing issues and these are 16-bit alignment and 32-bit
time_t.

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
John Paul Adrian Glaubitz
2023-08-28 13:00:02 UTC
Hi Adhemerval!
If the idea is really to embark on a new ABI for m68k, it means a different
loader and raises the question: will it be interoperable with the current m68k ABI in the
sense that i686 is interoperable with x86_64? That would allow old binaries to keep
running, similar to what the old ABIs did for the 32-bit to 64-bit transition.
OK.
It would require taking care that any shared data structures (such as
pthread_mutex_t and the like) have the same layout and alignment, adding some support
to ldconfig to differentiate between DSOs with different ABIs (either through
e_flags as on ARM, PT_GNU_PROPERTY as used by aarch64 and x86_64, or something else),
bumping the required minimum kernel (for 64-bit time_t support), and checking the current
status of the port.
Understood.
I am bringing up the latter because I fixed some recent m68k build issues [1] that
seem to stem from gcc changes over the years (as hinted by Andreas Schwab), where the
compiler changed some internally defined flags and this was not reflected in glibc
(in short, it seems that -mcpu=680X0 does not always define __mc68020__).
The build fix is straightforward, but it raised the question of whether something
else is broken and has not been caught yet.
Waldemar Brodkorb has posted his results on running glibc 2.38 on qemu, and
they show a lot of regressions:
949 FAIL
3344 PASS
99 UNSUPPORTED
16 XFAIL
2 XPASS
I guess the math failures are from the extra rounding and exception testing, which
requires a fully compliant IEEE 754 FP unit (which I guess m68k does not provide).
The last m68k testsuite report was from the 2.26 release [3], running under ARAnyM,
which shows the port in better shape.
The FP failures are most likely the result of the limitations of the FPU emulation
in QEMU for m68k. ARAnyM is known to have much better FPU emulation support than
QEMU, so if you want to have more accurate results, you should test on ARAnyM.
I also noted that gcc on mc68060 changed __DEC_EVAL_METHOD__ to 2, which makes
the glibc tests fail to build (since they assume __DEC_EVAL_METHOD__ equals 0). This
again raises questions about how the math library would behave depending on the target
chip.
All of these issues, and the potential work required for a new ABI, make me wonder
whether it is really worth keeping *2* distinct ABIs for m68k. Yes, m68k can follow the
MIPS mess and have 28 different ABIs that fail to be fully interoperable, but
if you really want to go on this 'gnu32' journey, I think it will be
better to just deprecate the current m68k ABI, remove it from glibc, and move
everything to the new ABI.
I actually wouldn't have a problem with that. I don't plan on supporting the old
ABI with 16-bit alignment. After all, we had to change the ABI for TLS support
as well, didn't we?

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Andreas Schwab
2023-08-28 13:50:01 UTC
Post by John Paul Adrian Glaubitz
The FP failures are most likely the result of the limitations of the FPU emulation
in QEMU for m68k. ARAnyM is known to have much better FPU emulation support than
QEMU, so if you want to have more accurate results, you should test on ARAnyM.
No, you should test on real hardware. Neither ARAnyM nor QEMU comes
close.
Post by John Paul Adrian Glaubitz
After all, we had to change the ABI for TLS support as well, didn't
we?
Nope.
--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
John Paul Adrian Glaubitz
2023-08-29 11:00:02 UTC
Post by Andreas Schwab
Post by John Paul Adrian Glaubitz
The FP failures are most likely the result of the limitations of the FPU emulation
in QEMU for m68k. ARAnyM is known to have much better FPU emulation support than
QEMU, so if you want to have more accurate results, you should test on ARAnyM.
No, you should test on real hardware. Neither ARAnyM nor QEMU comes
close.
In an ideal world, I would be testing on real hardware, yes. Unfortunately, even on
my Amiga 4000 with 68060/50 MHz the testsuite would run for two weeks or so.
Post by Andreas Schwab
Post by John Paul Adrian Glaubitz
After all, we had to change the ABI for TLS support as well, didn't
we?
Nope.
So, any binaries from Debian Potato will still work against glibc 2.38 on m68k?

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Geert Uytterhoeven
2023-08-29 15:30:02 UTC
Hi Adrian,

On Tue, Aug 29, 2023 at 12:51 PM John Paul Adrian Glaubitz
Post by John Paul Adrian Glaubitz
Post by John Paul Adrian Glaubitz
After all, we had to change the ABI for TLS support as well, didn't
we?
Nope.
So, any binaries from Debian Potato will still work against glibc 2.38 on m68k?
They should.

I regularly boot filesys-ELF-2.0.x-1400K-2.gz, which was created in 1996,
right after m68k switched from a.out to ELF. Any failures are reported
(yes, this does happen, ca. once per decade), and fixed.

Gr{oetje,eeting}s,

Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ***@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Michael Schmitz
2023-08-29 20:40:01 UTC
Hi Adrian,
Post by John Paul Adrian Glaubitz
Post by Andreas Schwab
Post by John Paul Adrian Glaubitz
The FP failures are most likely the result of the limitations of the FPU emulation
in QEMU for m68k. ARAnyM is known to have much better FPU emulation support than
QEMU, so if you want to have more accurate results, you should test on ARAnyM.
No, you should test on real hardware. Neither ARAnyM nor QEMU comes
close.
In an ideal world, I would be testing on real hardware, yes. Unfortunately, even on
my Amiga 4000 with 68060/50 MHz the testsuite would run for two weeks or so.
Then, in a slightly less than ideal world, errors in FPU emulation
should be identified and corrected so emulation can be used to run
testsuites with confidence.
Post by John Paul Adrian Glaubitz
Post by Andreas Schwab
Post by John Paul Adrian Glaubitz
After all, we had to change the ABI for TLS support as well, didn't
we?
Nope.
So, any binaries from Debian Potato will still work against glibc 2.38 on m68k?
Haven't gone back to potato, but binaries from sarge still run against
glibc from bullseye (that's the latest test image I use - glibc 2.30 IIRC).

Geert kindly provided links to the old filesys-ELF ram disks so I can
try and extract binaries from those, but as far as I'm concerned, I'd
take Geert's word for it.

Some LD_PRELOAD and LD_LIBRARY_PATH hacking might be necessary to
provide libraries missing from a current system, but it can be done. And
I occasionally do use binaries that I've compiled on one particular
system but cannot readily rebuild on another.

Cheers,

    Michael
Post by John Paul Adrian Glaubitz
Adrian
James Le Cuirot
2023-08-28 13:50:02 UTC
Post by John Paul Adrian Glaubitz
Hi Adhemerval!
If the idea is really to embark on a new ABI for m68k, it means a different
loader and raises the question: will it be interoperable with the current m68k ABI in the
sense that i686 is interoperable with x86_64? That would allow old binaries to keep
running, similar to what the old ABIs did for the 32-bit to 64-bit transition.
OK.
To that, I would add: what old binaries? Linux on m68k is very obscure these
days, with Gentoo, Debian, and NixOS being the only major distributions still
supporting it. As the Gentoo m68k maintainer, I would not expect users to be
pulling binaries from elsewhere, and I imagine Adrian would say the same.
Where would you even get them from? I thought there might be a handful on
Aminet, but I cannot even find any there.

Upgrading an existing system might be awkward, but time_t alone will probably
warrant a reinstall. Having said that, I just tried a somewhat unscientific
experiment of running a bunch of random binaries from my 32-bit aligned system
on my 16-bit aligned one and nothing broke. I then tried the reverse and saw
stack smashing detection kicking in on anything more complex than ls.
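
One way to see why mixing the two alignments can corrupt memory is to compare how the same struct is laid out under each rule. The sketch below is an illustration only and is not taken from any of the affected packages. It uses `#pragma pack(2)` (accepted by GCC and Clang) to imitate the traditional m68k rule of capping member alignment at 2 bytes, and it assumes a host where a 32-bit integer is naturally 4-byte aligned. The struct and function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout, roughly what -malign-int would give on m68k:
   'value' is aligned to 4 bytes. */
struct record_align32 {
    uint8_t  tag;
    uint32_t value;
};

/* Imitation of the traditional m68k layout: member alignment is
   capped at 2 bytes, so 'value' follows 'tag' after one pad byte. */
#pragma pack(push, 2)
struct record_align16 {
    uint8_t  tag;
    uint32_t value;
};
#pragma pack(pop)

/* Number of bytes by which the two layouts disagree about the size. */
size_t layout_size_gap(void)
{
    return sizeof(struct record_align32) - sizeof(struct record_align16);
}

/* Nonzero if both layouts put 'value' at the same offset. */
int value_offsets_match(void)
{
    return offsetof(struct record_align32, value) ==
           offsetof(struct record_align16, value);
}
```

On such a host the sizes are 8 and 6 bytes and `value` sits at offset 4 versus 2. Code that believes the object is 8 bytes long can write past a 6-byte buffer allocated by code built with the other rule, which is the kind of overflow a stack protector reports.
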
Post by John Paul Adrian Glaubitz
I am bringing up the latter because I fixed some recent m68k build issues [1] that
seem to stem from gcc changes over the years (as hinted by Andreas Schwab), where the
compiler changed some internally defined flags and this was not reflected in glibc
(in short, it seems that -mcpu=680X0 does not always define __mc68020__).
The build fix is straightforward, but it raised the question of whether something
else is broken and has not been caught yet.
I had been aware of that issue for a while, but I wasn't able to figure it out
in a few minutes, and I never got around to looking deeper. Sorry for not
reporting it sooner.
Post by John Paul Adrian Glaubitz
Waldemar Brodkorb has posted his results on running glibc 2.38 on qemu, and
they show a lot of regressions:
949 FAIL
3344 PASS
99 UNSUPPORTED
16 XFAIL
2 XPASS
I guess the math failures are from the extra rounding and exception testing, which
requires a fully compliant IEEE 754 FP unit (which I guess m68k does not provide).
The last m68k testsuite report was from the 2.26 release [3], running under ARAnyM,
which shows the port in better shape.
The FP failures are most likely the result of the limitations of the FPU emulation
in QEMU for m68k. ARAnyM is known to have much better FPU emulation support than
QEMU, so if you want to have more accurate results, you should test on ARAnyM.
This is fairly typical of the math-related test failures I have seen from
other projects. I hadn't realised that QEMU's FPU emulation was lacking and
had just chalked it up to m68k's FP hardware having different capabilities.
Either way, I have never noticed any issues here when using software in
practice. Not that I've done any heavy number crunching on m68k, but who
would?
Post by John Paul Adrian Glaubitz
I also noted that gcc on mc68060 changed __DEC_EVAL_METHOD__ to 2, which makes
the glibc tests fail to build (since they assume __DEC_EVAL_METHOD__ equals 0). This
again raises questions about how the math library would behave depending on the target
chip.
All of these issues, and the potential work required for a new ABI, make me wonder
whether it is really worth keeping *2* distinct ABIs for m68k. Yes, m68k can follow the
MIPS mess and have 28 different ABIs that fail to be fully interoperable, but
if you really want to go on this 'gnu32' journey, I think it will be
better to just deprecate the current m68k ABI, remove it from glibc, and move
everything to the new ABI.
I actually wouldn't have a problem with that. I don't plan on supporting the old
ABI with 16-bit alignment. After all, we had to change the ABI for TLS support
as well, didn't we?
I don't want to force anyone here, but I'd also be fine with that. The only
downside, apart from compatibility, appears to be slightly increased memory
usage, and you're not exactly going to run modern Linux with 8MB RAM anyway.
John Paul Adrian Glaubitz
2023-08-29 11:00:01 UTC
Post by James Le Cuirot
Post by John Paul Adrian Glaubitz
Hi Adhemerval!
If the idea is really to embark on a new ABI for m68k, it means a different
loader and raises the question: will it be interoperable with the current m68k ABI in the
sense that i686 is interoperable with x86_64? That would allow old binaries to keep
running, similar to what the old ABIs did for the 32-bit to 64-bit transition.
OK.
To that, I would add: what old binaries? Linux on m68k is very obscure these
days, with Gentoo, Debian, and NixOS being the only major distributions still
supporting it. As the Gentoo m68k maintainer, I would not expect users to be
pulling binaries from elsewhere, and I imagine Adrian would say the same.
Where would you even get them from? I thought there might be a handful on
Aminet, but I cannot even find any there.
Fully agreed.
Post by James Le Cuirot
Upgrading an existing system might be awkward, but time_t alone will probably
warrant a reinstall. Having said that, I just tried a somewhat unscientific
experiment of running a bunch of random binaries from my 32-bit aligned system
on my 16-bit aligned one and nothing broke. I then tried the reverse and saw
stack smashing detection kicking in on anything more complex than ls.
Thanks so much for performing such tests. This is really appreciated and provides
valuable information that's very helpful for the transition process.
Post by James Le Cuirot
Post by John Paul Adrian Glaubitz
most likely the result of the limitations of the FPU emulation
in QEMU for m68k. ARAnyM is known to have much better FPU emulation support than
QEMU, so if you want to have more accurate results, you should test on ARAnyM.
This is fairly typical of the math-related test failures I have seen from
other projects. I hadn't realised that QEMU's FPU emulation was lacking and
had just chalked it up to m68k's FP hardware having different capabilities.
Either way, I have never noticed any issues here when using software in
practice. Not that I've done any heavy number crunching on m68k, but who
would?
There have always been FPU-related issues on both QEMU and ARAnyM, although it's
better on ARAnyM than on QEMU. This is a well-known issue.
Post by James Le Cuirot
Post by John Paul Adrian Glaubitz
I also noted that gcc on mc68060 changed __DEC_EVAL_METHOD__ to 2, which makes
the glibc tests fail to build (since they assume __DEC_EVAL_METHOD__ equals 0). This
again raises questions about how the math library would behave depending on the target
chip.
All of these issues, and the potential work required for a new ABI, make me wonder
whether it is really worth keeping *2* distinct ABIs for m68k. Yes, m68k can follow the
MIPS mess and have 28 different ABIs that fail to be fully interoperable, but
if you really want to go on this 'gnu32' journey, I think it will be
better to just deprecate the current m68k ABI, remove it from glibc, and move
everything to the new ABI.
I actually wouldn't have a problem with that. I don't plan on supporting the old
ABI with 16-bit alignment. After all, we had to change the ABI for TLS support
as well, didn't we?
I don't want to force anyone here, but I'd also be fine with that. The only
downside, apart from compatibility, appears to be slightly increased memory
usage, and you're not exactly going to run modern Linux with 8MB RAM anyway.
Agreed. And I finally want to be able to use Rust and LLVM on m68k ;-).

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Karoly Balogh
2023-08-29 22:10:01 UTC
Hi,
Post by John Paul Adrian Glaubitz
Post by James Le Cuirot
I don't want to force anyone here, but I'd also be fine with that. The only
downside, apart from compatibility, appears to be slightly increased memory
usage, and you're not exactly going to run modern Linux with 8MB RAM anyway.
Agreed. And I finally want to be able to use Rust and LLVM on m68k ;-).
So, let me get this straight (or from another perspective if you will) -
neither LLVM nor Rust is ready for prime time, because they can't accommodate
a decades-old established standard on our platform. But Linux maintainers
rush forward, and break^Wchange the ABI, so we can accommodate some
half-baked fancy new tools.

Sometime later someone realizes: if you want to support any other system
on m68k (Amiga, Atari, 68k Mac, *BSD, game consoles (embedded) you name
it), you still need to add support for the original alignment
restrictions, because on those systems you're not always going to be able
to recompile the $world. So that someone will have the skills to add the
needed changes to these tools, so they can finally mature and accommodate
more real world scenarios that are out there.

At that point Linux m68k will have broken its own ABI for no reason other than
that someone couldn't wait until the necessary work was done and chose to
hack around the problems instead.

Ask me if I've seen this already (elsewhere).

Best,
--
Charlie

(PS: Also, IMO the Itanium analogy is totally bogus. Itanium never had the
history and the historical significance of m68k, and the hardware has
always been an expensive toy for a select few, with few having any sort
of self-motivating emotional attachment to it. Also, where do you draw the
line? At which point are we going to do a little-endian ABI for m68k, so
upstream can ignore big endian? Don't laugh: apart from the well-known
ppc64le case by IBM, this has been done in an m68k context too,
but the other way around: a big-endian x86 GCC, so you can compile Amiga
ABI compatible libraries that contain native x86 code on emulators...)
Jeffrey Walton
2023-08-30 01:40:01 UTC
On Tue, Aug 29, 2023 at 5:53 PM Karoly Balogh via Libc-help
Post by Karoly Balogh
Post by John Paul Adrian Glaubitz
Post by James Le Cuirot
I don't want to force anyone here, but I'd also be fine with that. The only
downside, apart from compatibility, appears to be slightly increased memory
usage, and you're not exactly going to run modern Linux with 8MB RAM anyway.
Agreed. And I finally want to be able to use Rust and LLVM on m68k ;-).
So, let me get this straight (or from another perspective if you will) -
neither LLVM nor Rust is ready for prime time, because they can't accommodate
a decades-old established standard on our platform. But Linux maintainers
rush forward, and break^Wchange the ABI, so we can accommodate some
half-baked fancy new tools.
Regarding Rust, it is only guaranteed to work on x86_64 and Aarch64.
Other platforms are a roll of the dice. See
https://doc.rust-lang.org/nightly/rustc/platform-support.html.

In practice, we had to scrap a project that was based on Rust. It gave
us too many problems on armel, armhf, aarch64 and powerpc. Rust could
not even compile its own cargos. We rebooted and went back to C. (This
was several years ago, before Aarch64 became Tier I).
Post by Karoly Balogh
Sometime later someone realizes: if you want to support any other system
on m68k (Amiga, Atari, 68k Mac, *BSD, game consoles (embedded) you name
it), you still need to add support for the original alignment
restrictions, because on those systems you're not always going to be able
to recompile the $world. So that someone will have the skills to add the
needed changes to these tools, so they can finally mature and accommodate
more real world scenarios that are out there.
At that point Linux m68k will have broken its own ABI for no reason other than
that someone couldn't wait until the necessary work was done and chose to
hack around the problems instead.
Jeff
Adhemerval Zanella Netto
2023-08-28 13:10:01 UTC
Post by John Paul Adrian Glaubitz
Post by Finn Thain
Post by John Paul Adrian Glaubitz
- LLVM
- Firebird Database
- OpenJDK
- Various Qt packages
And potentially more in the future, which may be anticipated on the basis
that "those users don't need a stable ABI any more, so let's just ignore
the portability issues in our code and leave the problem to the distros
and toolchain developers".
It's reasonable to assume that a 32-bit architecture uses 32-bit alignment and
I understand every single upstream project that doesn't want to care about the obscure
design decisions of some ABI designers of the past.
Post by Finn Thain
That is the precedent you would set.
No, I wouldn't set such precedent. I would fix something that has been broken
for years and has caused endless headaches for people maintaining the m68k port
in Linux distributions.
And since we have to break the ABI anyway to be able to use 64-bit time_t, I don't
see any valid reason to stick to the problematic 16-bit alignment used by the current
ABI.
Post by Finn Thain
Moreover, why is it that only a few developers have a problem with making
explicit their decisions regarding alignment of shorts? What actual pain
does it cause them to accept a patch to make their struct layouts plain?
The problem isn't upstream projects but the lack of manpower to work on all these
issues. Talk is cheap when there is hardly anyone doing this work.
I have invested a ton of work to get the m68k port into better shape and with the
help of the community, we even managed to land m68k support in LLVM. It was a HUGE
disappointment to me when the 16-bit alignment again caused trouble for a relevant
upstream project on m68k meaning that LLVM can currently not be used natively on
m68k.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
It goes against the traditional ABIs, but practically no m68k Linux
binaries are published outside of distributions, so this is not a
concern.
It is of concern to some users (though not all, apparently).
If these users really cared, they would actually help address these issues. I haven't
seen any contributions trying to address these issues outside my efforts and the efforts
of the Gentoo developers.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
We need to break the ABI anyway with time_t going 64-bit, so it makes
sense to do these two things at the same time.
Fully agreed.
If the kernel breaks the ABI, that's a bug, not an excuse. Either you're
okay with proliferation of incompatible binaries and tools or there are
some criteria (yet to be identified AFAIK) which permit this bug.
It's not difficult to foresee fragmentation because it follows from the
manpower shortage. There will always be sufficient manpower to produce a
break that pleases a few. There may never be enough manpower to produce a
stable ABI that pleases everyone for the foreseeable future.
Again, talk is cheap. Show me the code.
Post by Finn Thain
Post by John Paul Adrian Glaubitz
I think -gnu32 sounds very reasonable.
You do? I think 32 is misleading in the absence of 16-bit or 64-bit
variants, and -gnu is misleading if other tooling like LLVM already
supports malign-int. Moreover, it's impossible to align to a bit count in
general. Not that you'd want to -- it's actually the natural alignment of
shorts that is at issue, AIUI.
Yes, I do and that's just my personal opinion. But as I said, I am open to
other naming suggestions.
Post by Finn Thain
So, for naming purposes, the proposal might be described as either the ABI
du jour (leading to -abi23 for 2023) or the new ABI for ever (leading to
-abin as in -gnuabin32 on MIPS).
That's why I suggested we look at how the ARM developers will name their
triplet when switching to 64-bit time_t on 32-bit ARM systems.
Post by Finn Thain
If it's the former, perhaps you should not push it upstream. If it's the
latter, perhaps this redesign should seek to address real shortcomings
with the existing ABI, including problems which (for all I know) may have
entirely prevented some people from using it thus far. That is, it should
consider silicon beyond 680x0.
It's a historic architecture. We don't have to redesign everything. It's enough
to address the most pressing issues and these are 16-bit alignment and 32-bit
time_t.
If the idea is really to embark on a new ABI for m68k, it means a different
loader and raises the question: will it be interoperable with the current m68k ABI in the
sense that i686 is interoperable with x86_64? That would allow old binaries to keep
running, similar to what the old ABIs did for the 32-bit to 64-bit transition.

It would require taking care that any shared data structures (such as
pthread_mutex_t and the like) have the same layout and alignment, adding some support
to ldconfig to differentiate between DSOs with different ABIs (either through
e_flags as on ARM, PT_GNU_PROPERTY as used by aarch64 and x86_64, or something else),
bumping the required minimum kernel (for 64-bit time_t support), and checking the current
status of the port.
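
As a rough illustration of the e_flags route, the sketch below pulls e_flags out of a big-endian ELF32 header and checks that the file is an m68k object. It is only a sketch: the function name is invented, the offsets and constants follow the generic ELF32 header layout (EM_68K is 4), and no flag bit is tested because which bit, if any, would mark a 32-bit-aligned ABI has not been decided.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets and constants from the ELF32 file header layout. */
enum {
    EI_CLASS_OFF   = 4,   /* e_ident[EI_CLASS] */
    EI_DATA_OFF    = 5,   /* e_ident[EI_DATA] */
    E_MACHINE_OFF  = 18,  /* 16-bit e_machine */
    E_FLAGS_OFF    = 36,  /* 32-bit e_flags */
    ELF32_EHDR_LEN = 52,
    CLASS_ELF32    = 1,   /* ELFCLASS32 */
    DATA_MSB       = 2,   /* ELFDATA2MSB (big endian) */
    MACHINE_68K    = 4    /* EM_68K */
};

/* Read e_flags from a big-endian ELF32 m68k header held in 'buf'.
   Returns 0 and stores the flags on success, -1 if the buffer is
   not an m68k ELF32 header. */
int m68k_read_e_flags(const unsigned char *buf, size_t len, uint32_t *flags)
{
    if (len < ELF32_EHDR_LEN || memcmp(buf, "\177ELF", 4) != 0)
        return -1;
    if (buf[EI_CLASS_OFF] != CLASS_ELF32 || buf[EI_DATA_OFF] != DATA_MSB)
        return -1;
    if (((buf[E_MACHINE_OFF] << 8) | buf[E_MACHINE_OFF + 1]) != MACHINE_68K)
        return -1;

    *flags = ((uint32_t)buf[E_FLAGS_OFF] << 24) |
             ((uint32_t)buf[E_FLAGS_OFF + 1] << 16) |
             ((uint32_t)buf[E_FLAGS_OFF + 2] << 8) |
             (uint32_t)buf[E_FLAGS_OFF + 3];
    return 0;
}
```

A tool in the position of ldconfig could call this on the first 52 bytes of each DSO and compare the result against whatever ABI marker ends up being chosen.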

I am bringing up the latter because I fixed some recent m68k build issues [1] that
seem to stem from gcc changes over the years (as hinted by Andreas Schwab), where the
compiler changed some internally defined flags and this was not reflected in glibc
(in short, it seems that -mcpu=680X0 does not always define __mc68020__).
The build fix is straightforward, but it raised the question of whether something
else is broken and has not been caught yet.
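
To illustrate the class of problem (this is not the actual glibc fix): code that treats `__mc68020__` as shorthand for "68020 or newer" takes the wrong branch when the compiler defines only a more specific CPU macro. The sketch below checks several macros instead. The function name is made up, and the exact set of macros defined for a given -mcpu value is an assumption that should be verified against the compiler in use.

```c
/* Return 1 when compiling for a 68020-or-later 680x0 CPU, else 0.
   Checking several macros avoids relying on __mc68020__ alone. */
int m68k_is_68020_or_later(void)
{
#if defined(__mc68020__) || defined(__mc68030__) || \
    defined(__mc68040__) || defined(__mc68060__)
    return 1;
#else
    return 0;
#endif
}
```

Running the cross compiler with `-dM -E` on an empty input lists the macros it predefines for the chosen -mcpu, which is a quick way to check the assumption.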

Waldemar Brodkorb has posted his results on running glibc 2.38 on qemu, and
they show a lot of regressions:

949 FAIL
3344 PASS
99 UNSUPPORTED
16 XFAIL
2 XPASS

I guess the math failures are from the extra rounding and exception testing, which
requires a fully compliant IEEE 754 FP unit (which I guess m68k does not provide).
The last m68k testsuite report was from the 2.26 release [3], running under ARAnyM,
which shows the port in better shape.

I also noted that gcc on mc68060 changed __DEC_EVAL_METHOD__ to 2, which makes
the glibc tests fail to build (since they assume __DEC_EVAL_METHOD__ equals 0). This
again raises questions about how the math library would behave depending on the target
chip.
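
For readers unfamiliar with evaluation methods, the standard C counterpart for binary floating point is `FLT_EVAL_METHOD` from `<float.h>`. Treating it as related to the macro named above is an assumption made for illustration. The sketch below reports which method the current compiler and flags select, using the meanings given in the C standard. The function names are invented.

```c
#include <float.h>

/* Describe a value of FLT_EVAL_METHOD as defined by C99/C11. */
const char *eval_method_name(int method)
{
    switch (method) {
    case 0:  return "operations evaluated in the type's own precision";
    case 1:  return "float and double evaluated as double";
    case 2:  return "all operations evaluated as long double";
    case -1: return "indeterminable";
    default: return "implementation-defined";
    }
}

/* The method selected by the current compiler and target flags. */
int current_eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD;
#else
    return -1;
#endif
}
```

A test suite that hard-codes the expectation of method 0 will see different intermediate results, or fail to build, on a target where the compiler reports 2.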

All of these issues, and the potential work required for a new ABI, make me wonder
whether it is really worth keeping *2* distinct ABIs for m68k. Yes, m68k can follow the
MIPS mess and have 28 different ABIs that fail to be fully interoperable, but
if you really want to go on this 'gnu32' journey, I think it will be
better to just deprecate the current m68k ABI, remove it from glibc, and move
everything to the new ABI.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=30740
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=30740#c16
[3] https://sourceware.org/glibc/wiki/Release/2.26#M68K
Finn Thain
2023-08-29 01:40:01 UTC
Post by John Paul Adrian Glaubitz
And since we have to break the ABI anyway to be able to use 64-bit
time_t
If you're worried about Y2038, aren't you jumping the gun? I reckon we
have about 10 years in which to figure out what a better m68k ABI should
look like.
Post by John Paul Adrian Glaubitz
I don't see any valid reason to stick to the problematic 16-bit
alignment used by the current ABI.
Well, here are a few reasons why all those padding patches you wrote were
a good thing (besides the obvious benefit of avoiding an ABI break):

- That code is now more portable among projects which care about
portability to 16-bit platforms etc.

- Explicit alignment reveals suboptimal cache footprint and wasted memory.

- Data structures often outlive the software that introduced them. It's
safe to say that the struct definitions you fixed will produce a benefit
you may never hear about, by virtue of code re-use.
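
The points above can be sketched with a made-up struct (none of these names come from the patches in question). Writing the padding out by hand makes the layout independent of whether 32-bit members are aligned to 2 or 4 bytes, and once the padding is visible it becomes obvious that reordering the members would save space.

```c
#include <stddef.h>
#include <stdint.h>

/* Padding written out by hand: the layout no longer depends on whether
   the ABI aligns 32-bit members to 2 or 4 bytes. */
struct sample_padded {
    uint8_t  kind;
    uint8_t  pad0[3];   /* explicit; would be implicit with 4-byte alignment */
    uint32_t length;
    uint8_t  flags;
    uint8_t  pad1[3];   /* tail padding made visible */
};

/* Same members reordered: the visible padding shrinks from 6 bytes to 2. */
struct sample_reordered {
    uint32_t length;
    uint8_t  kind;
    uint8_t  flags;
    uint8_t  pad0[2];
};

/* Compile-time checks that hold under both alignment rules. */
_Static_assert(offsetof(struct sample_padded, length) == 4, "length offset");
_Static_assert(sizeof(struct sample_padded) == 12, "padded size");
_Static_assert(sizeof(struct sample_reordered) == 8, "reordered size");

/* Bytes saved per object by the reordered layout. */
size_t bytes_saved_by_reordering(void)
{
    return sizeof(struct sample_padded) - sizeof(struct sample_reordered);
}
```

The static assertions turn the layout into something the compiler verifies on every target, so a mismatch shows up at build time instead of as memory corruption at run time.
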
Post by John Paul Adrian Glaubitz
...
Post by Finn Thain
If it's the former, perhaps you should not push it upstream. If it's
the latter, perhaps this redesign should seek to address real
shortcomings with the existing ABI, including problems which (for all
I know) may have entirely prevented some people from using it thus
far. That is, it should consider silicon beyond 680x0.
It's a historic architecture. We don't have to redesign everything.
Coldfire is still shipping (is it "historic" yet?). Not sure about Apollo
68080 and Buffee BP68040. Most likely TG68K and PiStorm will end up
gaining whatever features Linux requires (MMU etc.).

If we get the ABI right, such designs can benefit if it allows them to go
beyond 680x0 and better exploit the FPGA they may be implemented on.
(Dare I mention SMP?)

Considering just Coldfire for a moment, one question we could look at is,
how could the ABI be changed to permit the same binaries to work
efficiently on both kernels (CF and 680x0)?

It seems likely that ABI changes could help to accelerate 68k
emulators.

Inefficient thread local storage is an issue that might be addressed with
VDSO calls rather than an ABI break.
Eero Tamminen
2023-08-29 09:10:01 UTC
Permalink
Hi,
Post by Finn Thain
Post by John Paul Adrian Glaubitz
And since we have to break the ABI anyway to be able to use 64-bit
time_t
If you're worried about Y2038, aren't you jumping the gun? I reckon we
have about 10 years in which to figure out what a better m68k ABI should
look like.
Debian is discussing the LFS + time_t transition for the next release, for
all architectures. They are related, and if one needs to break /
transition the ABI, doing it once is better than doing it twice...

LWN has a summary of the discussion: https://lwn.net/Articles/938149/


- Eero
John Paul Adrian Glaubitz
2024-05-15 17:10:01 UTC
Permalink
Hi,
Post by John Paul Adrian Glaubitz
The Debian m68k maintainers proposed building their packages with -malign-int
last year, aligning to 32-bit instead of 16-bit, which improves compatibility
with some projects and should give better performance on 68020+, at the cost
of slightly increased memory usage. The mold linker is at least one project
that has been shown to work after making this change where it previously
didn't.
- LLVM
- Firebird Database
- OpenJDK
- Various Qt packages
We can now add CPython to this list as 3.13 requires 32-bit alignment [1]:

In file included from ../Include/internal/pycore_backoff.h:12,
from ../Include/internal/pycore_code.h:474,
from ../Include/internal/pycore_interp.h:16,
from ../Include/internal/pycore_runtime.h:17,
from ../Include/internal/pycore_pystate.h:12,
from ../Include/internal/pycore_critical_section.h:9,
from ../Python/critical_section.c:4:
../Python/critical_section.c:6:1: error: static assertion failed: "critical section must be aligned to at least 4 bytes"
6 | static_assert(_Alignof(_PyCriticalSection) >= 4,
| ^~~~~~~~~~~~~
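
For anyone who wants to see this kind of failure in isolation, here is a
minimal sketch. The two structs are hypothetical stand-ins and are NOT
CPython's definition of _PyCriticalSection, and the explicit alignment
specifier is only one possible way to satisfy such an assertion, not
necessarily what CPython or Debian would do. With 16-bit alignment a struct
of pointer-sized members has _Alignof equal to 2, which is what trips an
assertion like the one above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: alignment is whatever the ABI gives
   pointer-sized members (2 on traditional m68k, 4 or 8 elsewhere). */
struct plain_section {
    uintptr_t prev;
    void *mutex;
};

/* Same members, but the first one requests at least 4-byte alignment.
   The second specifier keeps the natural alignment as a lower bound,
   because _Alignas may not reduce a member's alignment; with several
   specifiers the strictest one applies. */
struct aligned_section {
    _Alignas(4) _Alignas(uintptr_t) uintptr_t prev;
    void *mutex;
};

/* Holds on any ABI, including one with 16-bit default alignment. */
_Static_assert(_Alignof(struct aligned_section) >= 4,
               "section must be aligned to at least 4 bytes");

/* Helpers for inspecting what the current compiler chose. */
size_t plain_section_alignment(void)
{
    return _Alignof(struct plain_section);
}

size_t aligned_section_alignment(void)
{
    return _Alignof(struct aligned_section);
}
```

On an m68k compiler, plain_section_alignment() would be expected to return 2
by default and 4 with -malign-int; aligned_section_alignment() is at least 4
in both cases.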

We should really make the switch now. It's certainly not getting better.

Adrian
[1] https://buildd.debian.org/status/fetch.php?pkg=python3.13&arch=m68k&ver=3.13.0%7Eb1-2&stamp=1715773703&raw=0
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913