There was a recent post to the gcc mailing list which showed an
interesting concept for dealing with large constants in an ISA:
splitting the instruction and constant streams. It can be found
at https://github.com/michaeljclark/glyph/ , and is named "glyph".
I think the problem the author is trying to solve is better addressed by
My 66000 (and I would absolutely _hate_ to write an assembler for it).
Still, I thought it worth mentioning.
On Sat, 8 Mar 2025 14:21:51 +0000, Thomas Koenig wrote:
> There was a recent post to the gcc mailing list which showed an
> interesting concept for dealing with large constants in an ISA:
> splitting the instruction and constant streams. It can be found
> at https://github.com/michaeljclark/glyph/ , and is named "glyph".
I knew a guy with that name at AMD--he did microcode--and did it well.
> I think the problem the author is trying to solve is better addressed by
> My 66000 (and I would absolutely _hate_ to write an assembler for it).
> Still, I thought it worth mentioning.
I took a quick look, and it seems that
a) too few registers
b) too many OpCode bits
although it does look easy to parse.
Thomas Koenig wrote:
> There was a recent post to the gcc mailing list which showed an
> interesting concept for dealing with large constants in an ISA:
> splitting the instruction and constant streams. It can be found
> at https://github.com/michaeljclark/glyph/ , and is named "glyph".
> I think the problem the author is trying to solve is better addressed by
> My 66000 (and I would absolutely _hate_ to write an assembler for it).
> Still, I thought it worth mentioning.
I have not looked at the link, but I would be quite surprised if the
idea isn't already covered by one or more Mill patents.
Mill does indeed split the instruction stream in two; it is one of the
enablers for supporting a lot more instructions per cycle.
On 2025-03-08 9:21 a.m., Thomas Koenig wrote:
> There was a recent post to the gcc mailing list which showed an
> interesting concept for dealing with large constants in an ISA:
> splitting the instruction and constant streams. It can be found
> at https://github.com/michaeljclark/glyph/ , and is named "glyph".
> I think the problem the author is trying to solve is better addressed by
> My 66000 (and I would absolutely _hate_ to write an assembler for it).
> Still, I thought it worth mentioning.
Found that post interesting.
As outlined, the immediate base register requires a double-wide link
register. This may be okay for code with 32-bit addresses running on a
64-bit machine. Otherwise, it would probably need to go through another
GPR to manage the immediate base register. That potentially means more
instructions in the function prolog / epilog code, and more instructions
at each function call.
I think splitting the code and constants into separate streams requires
another port (or ports) on the I$. The port may already be present if
jump-through-table (JTT) is supported.
I guess that the constant tables for a subroutine would be placed either
before or after the subroutine. I would not use the constant tables for
all constants. Small constants are better encoded directly in the
instruction. That means using bits to select between small constants and
relative addresses.
I think it is better to use a constant prefix / postfix instruction to
encode larger constants in the instruction stream, or to use a wider
instruction format. In Q+, constant postfixes can be used to override a
register spec, allowing immediate constants to be used with many more
instructions.
On Sat, 8 Mar 2025 17:53:34 +0000, MitchAlsup1 wrote:
> On Sat, 8 Mar 2025 14:21:51 +0000, Thomas Koenig wrote:
>> There was a recent post to the gcc mailing list which showed
>> interesting concept of dealing with large constants in an ISA:
>> Splitting a the instruction and constant stream. It can be found
>> at https://github.com/michaeljclark/glyph/ , and is named "glyph".
> I knew a guy with that name at AMD--he did microcode--and did it well.
>> I think the problem the author is trying to solve is better addressed by
>> My 66000 (and I would absolutely _hate_ to write an assembler for it).
>> Still, I thought it worth mentioning.
> I took a quick look, and it seems that
> a) too few registers
> b) too many OpCode bits
> although it does look easy to parse.
The length decode is wasteful of bits. There are 4 sizes of instructions,
16, 32, 64, and 128 bits, denoted by the first halfword having (respectively)
00, 01, 10, 11. But each successive halfword contains 2 bits that simply
waste entropy and could have been used for "other good stuff".
16-bit instructions get a 5-bit opcode, and the entire 32 instruction
space is already fully populated.
32-bit instructions get a 10-bit OpCode space. At this point I should
note that my entire OpCode instruction space has only 62 instructions.
64-bit instructions get a 20-bit OpCode space. Nobody is going to need
1M individual instructions.
So, a bit of rearrangement would provide for a healthy OpCode space
and more bits for registers, and possibly a 96-bit instruction
instead of a 128-bit instruction.
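To make the arithmetic above concrete, here is a minimal sketch in Python. The 2-bit size codes and the 5/10/20-bit OpCode widths are the ones quoted above; putting the size code in the low two bits of the first halfword is only an assumption for illustration (the real glyph encoding may differ), and the helper names are hypothetical.

```python
# Size code -> instruction length in bits, as listed in the post.
SIZE_BITS = {0b00: 16, 0b01: 32, 0b10: 64, 0b11: 128}

# OpCode field widths quoted in the post for the three smaller formats.
OPCODE_BITS = {16: 5, 32: 10, 64: 20}


def insn_bits(first_halfword: int) -> int:
    """Return the instruction length in bits.

    ASSUMPTION: the 2-bit size code sits in the low two bits of the
    first halfword; this placement is illustrative only.
    """
    return SIZE_BITS[first_halfword & 0b11]


def opcode_space(length_bits: int) -> int:
    """Number of distinct opcodes the quoted OpCode field can name."""
    return 1 << OPCODE_BITS[length_bits]


def wasted_size_bits(length_bits: int) -> int:
    """Bits spent on 2-bit codes in every halfword after the first."""
    return 2 * (length_bits // 16 - 1)
```

Under these assumptions a 64-bit instruction spends 6 bits on redundant size codes and a 128-bit instruction spends 14, while the 20-bit OpCode field names 1,048,576 opcodes.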
So, we are still missing::
a) a memory order model
b) a translation model
c) atomic instructions
d) external linkage {code and data}
e) thread support using his {ip, bp} construct
f) system call model
g) debug model
h) timers and counters
i) floating point
..
On 3/8/2025 5:07 PM, Robert Finch wrote:
> Found that post interesting.
> As outlined, the immediate base register requires a double-wide link
> register. This may be okay for code with 32b addresses running in a
> 64-bit machine. But otherwise would probably need to go through
> another GPR to manage the immediate base register. It is potentially
> more instructions in the function prolog / epilog code. And more
> instructions at function call.
> I think splitting the code and constant into separate streams requires
> another port(s) on the I$. The port may already be present if
> jump-through-table, JTT, is supported.
I found a few of the ideas questionable at best...
Possibly an IB-like use case could instead be handled by just using it
as a dedicated base register for constant loads. But this would have
similar latency to a traditional constant pool (which also sucks).
But if the constant is loaded directly inline, this could add extra
complexity and delay to the pipeline.
It almost seems like a case of "what if we took a constant pool, and
made it worse...".
Or, if a constant pool does have a strong enough use-case (say, one
wants fixed-length 16-bit ops), maybe treat it like a constant pool but
have a few special case helper ops.
Say, 16-bit ops:
MOV.L @IB+, Rn //load and advance 4 bytes
MOV.Q @IB+, Rn //load and advance 8 bytes
MOV.L (IB, Disp4n*4), Rn
MOV.Q (IB, Disp4n*4), Rn
Where, the displacement is negative to allow repeating a recently seen
prior value.
With the usual caveats of supporting auto-increment.
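A toy model of those helper ops, in Python, just to show the intended behavior. Everything beyond the mnemonics above is an assumption: little-endian constants, IB as a byte offset into a flat pool, and a 4-bit displacement field encoding 1..16 units of 4 bytes backwards from IB. The class and method names are hypothetical.

```python
import struct


class ConstStream:
    """Toy constant stream addressed by an immediate base (IB).

    ASSUMPTIONS: little-endian data, IB is a byte offset into `data`,
    and displacements count 4-byte units backwards from IB.
    """

    def __init__(self, data: bytes):
        self.data = data
        self.ib = 0

    def _read(self, offset: int, size: int) -> int:
        fmt = {4: "<I", 8: "<Q"}[size]
        return struct.unpack_from(fmt, self.data, offset)[0]

    def load_advance(self, size: int) -> int:
        """Model of MOV.L / MOV.Q @IB+, Rn: load, then advance IB."""
        value = self._read(self.ib, size)
        self.ib += size
        return value

    def load_back(self, disp4: int, size: int) -> int:
        """Model of MOV.L / MOV.Q (IB, Disp4n*4), Rn: reuse a prior value."""
        if not 1 <= disp4 <= 16:
            raise ValueError("disp4 out of range for the assumed encoding")
        offset = self.ib - disp4 * 4  # negative displacement from IB
        if offset < 0:
            raise ValueError("displacement reaches before the pool start")
        return self._read(offset, size)


pool = struct.pack("<IQ", 0xDEADBEEF, 0x0123456789ABCDEF)
cs = ConstStream(pool)
a = cs.load_advance(4)      # 32-bit constant, IB becomes 4
b = cs.load_advance(8)      # 64-bit constant, IB becomes 12
again = cs.load_back(2, 8)  # IB - 8: the same 64-bit constant again
```

The negative-displacement form is what lets a second use of a constant avoid a second copy in the pool.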
> I guess that the constant tables for a subroutine would be placed
> either before or after a subroutine. I would not use the constant
> tables for all constants. Small constants are better encoded directly
> in the instruction. That means using bits to select between small
> constants or relative addresses.
> I think it is better to use a constant prefix / postfix instruction to
> encode larger constants in the instruction stream. Or use a wider
> instruction format. In Q+ constant postfixes can be used to override a
> register spec, allowing immediate constants to be used with many more
> instructions.
Agree...
If one is already going to have a variable length encoding, why not make
it have decent inline immediate fields?...
One thought I had a while ago using a similar technique to glyph's was
to place constants at the beginning or the end of a cache line. Then the
immediate base register is not needed. The relative offsets would be in
terms of the current cache line. It has a couple of drawbacks though,
one being the need to branch around the constant data; could be done by
carefully maintaining the next fetch address. Another drawback is the
code is repositionable only at cache-line boundaries. Might make
assembling / linking code interesting.
On Sun, 9 Mar 2025 12:03:22 +0000, Robert Finch wrote:
> One thought I had a while ago using a similar technique to glyph's was
> to place constants at the beginning or the end of a cache line. Then the
> immediate base register is not needed. The relative offsets would be in
> terms of the current cache line. It has a couple of drawbacks though,
> one being the need to branch around the constant data; could be done by
> carefully maintaining the next fetch address. Another drawback is the
> code is repositionable only at cache-line boundaries. Might make
> assembling / linking code interesting.
If you put the constants at the end of the cache line, you will have
accessed the constants while decoding the instructions and you can
figure out when to jump to the next cache line without branching.
Robert Finch <robfi680@gmail.com> wrote:
> I think splitting the code and constant into separate streams requires
> another port(s) on the I$. The port may already be present if
> jump-through-table, JTT, is supported.
There is also the problem of additional cache (page, ...) misses with
the instruction stream. Maybe an extra "constant data" cache?
That would depend on how far the extra data is from the code.
But branches are going to be more expensive because it is not
only the PC that needs to be changed, but also the data pointer.
Thinking about this a bit more... conceptually, this is not so far
off from the /360 base pointer addressing mode, but with the base
pointer implied instead of explicit.
> I guess that the constant tables for a subroutine would be placed either
> before or after a subroutine.
Like what was usually done for the /360, I believe.
But much more "fun" could be had if the base pointer was supplied
by the caller. Want a routine that does something different?
Just call it with a different constant stream for instructions.
(OK, you could also pass an argument, but that would offer fewer
possibilities for quasi self-modifying code).
Robert Finch wrote:
> I think it is better to use a constant prefix / postfix instruction to
> encode larger constants in the instruction stream. Or use a wider
> instruction format. In Q+ constant postfixes can be used to override a
> register spec, allowing immediate constants to be used with many more
> instructions.
Yes, a kind of prefix instruction that says "here comes an immediate
value" and loads a 2, 4, or 8 byte immediate, with all the sign or zero
extend options, into a special constant register in the Decoder and
marks it valid.
The next instruction just says add immediate "ADDI rd, rs" and it
implies the constant it just stashed.
That relieves the consumer opcodes from having to encode all the
different variable immediate formats.
It could easily extend to multiple immediate prefix instructions, so
one can have instructions like store immediate STD [rd+imm1], imm2
by just adding a second constant register to the Decoder.
The only complication I can see is if the instruction producer-consumer
pair straddles pages and there is a page fault on the second.
I wouldn't want to have to save the stashed constant as "thread context",
so it should roll back to the start of the immediate instruction.
In which case the faulting RIP is the first instruction and the
faulting address is someplace in the second.
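A rough Python model of that producer/consumer pair and the rollback rule may help. It is only a sketch under stated assumptions: 4 KiB pages, a fetch that checks just the starting address of each instruction, and hypothetical names (`Decoder`, `imm_prefix`, `addi`, `PageFault`) that do not come from any real ISA.

```python
from dataclasses import dataclass
from typing import Optional

PAGE = 4096  # ASSUMPTION: 4 KiB pages, for illustration only


class PageFault(Exception):
    def __init__(self, rip: int, fault_addr: int):
        super().__init__(f"fault: rip={rip:#x} addr={fault_addr:#x}")
        self.rip = rip                # where execution restarts
        self.fault_addr = fault_addr  # address that missed


@dataclass
class Decoder:
    """Toy decoder holding one stashed immediate (hypothetical model)."""
    mapped_pages: set
    const: Optional[int] = None      # the special constant register
    const_rip: Optional[int] = None  # address of the prefix that set it

    def fetch(self, addr: int) -> None:
        if addr // PAGE not in self.mapped_pages:
            # Roll back to the pending prefix, if any, so the stashed
            # constant never has to be saved as thread context.
            rip = self.const_rip if self.const is not None else addr
            self.const = self.const_rip = None
            raise PageFault(rip, addr)

    def imm_prefix(self, addr: int, value: int) -> None:
        self.fetch(addr)
        self.const, self.const_rip = value, addr  # stash and mark valid

    def addi(self, addr: int, regs: dict, rd: str, rs: str) -> None:
        self.fetch(addr)
        if self.const is None:
            raise ValueError("ADDI without a pending immediate prefix")
        regs[rd] = regs[rs] + self.const
        self.const = self.const_rip = None  # constant is consumed


d = Decoder(mapped_pages={0})
regs = {"r1": 10, "r2": 0}
d.imm_prefix(0xFFC, 1000)             # last word of page 0
try:
    d.addi(0x1000, regs, "r2", "r1")  # first word of unmapped page 1
    fault = None
except PageFault as pf:
    fault = pf
```

Here the fault reports the prefix address as the restart point and the consumer's address as the faulting address, and the stashed constant is discarded rather than saved.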
MitchAlsup1 <mitchalsup@aol.com> wrote:
> On Sun, 9 Mar 2025 12:03:22 +0000, Robert Finch wrote:
>> One thought I had a while ago using a similar technique to glyph's was
>> to place constants at the beginning or the end of a cache line. Then the
>> immediate base register is not needed. The relative offsets would be in
>> terms of the current cache line. It has a couple of drawbacks though,
>> one being the need to branch around the constant data; could be done by
>> carefully maintaining the next fetch address. Another drawback is the
>> code is repositionable only at cache-line boundaries. Might make
>> assembling / linking code interesting.
> If you put the constants at the end of the cache line, you will have
> accessed the constants while decoding the instructions and you can
> figure out when to jump to the next cache line without branching.
Did I mention I would not like to write an assembler for that? :-)
On Sun, 9 Mar 2025 21:02:44 +0000, EricP wrote:
> Robert Finch wrote:
>> I think it is better to use a constant prefix / postfix instruction to
>> encode larger constants in the instruction stream. Or use a wider
>> instruction format. In Q+ constant postfixes can be used to override a
>> register spec, allowing immediate constants to be used with many more
>> instructions.
> Yes, a kind of prefix instruction that says "here comes an immediate
> value" and loads a 2, 4, or 8 byte immediate, with all the sign or zero
> extend options, into a special constant register in the Decoder and
> marks it valid.
> The next instruction just says add immediate "ADDI rd, rs" and it
> implies the constant it just stashed.
> That relieves the consumer opcodes from having to encode all the
> different variable immediate formats.
> It could easily extend to multiple immediate prefix instructions, so
> one can have instructions like store immediate STD [rd+imm1], imm2
> by just adding a second constant register to the Decoder.
> The only complication I can see is if the instruction producer-consumer
> pair straddles pages and there is a page fault on the second.
> I wouldn't want to have to save the stashed constant as "thread context",
> so it should roll back to the start of the immediate instruction.
Execute the instruction and the (preceding) constant as a single
instruction, so any fault leaves IP pointing at the constant.
> In which case the faulting RIP is the first instruction and the
> faulting address is someplace in the second.
On 2025-03-09 at 22:19, MitchAlsup1 wrote:
> On Sun, 9 Mar 2025 21:02:44 +0000, EricP wrote:
>> The only complication I can see is if the instruction producer-consumer
>> pair straddles pages and there is a page fault on the second.
>> I wouldn't want to have to save the stashed constant as "thread context",
>> so it should roll back to the start of the immediate instruction.
> Execute the instruction and the (preceding) constant as a single
> instruction, so any fault leaves IP pointing at the constant.
Then we have the page-crossing issue. Is it better to force the
compiler/assembler to align such instructions so that they never cross
page boundaries?
Marcus <m.delete@this.bitsnbites.eu> wrote:
> Then we have the page-crossing issue. Is it better to force the
> compiler/assembler to align such instructions so that they never cross
> page boundaries?
Power 10 chose to do so; actually, larger instructions cannot even
cross what is (likely) a cache-line boundary there. According to the
Power ISA Version 3.1, section 1.6:
"Prefixed instructions do not cross 64-byte instruction address
boundaries. When a prefixed instruction crosses a 64-byte boundary,
the system alignment error handler is invoked."
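What that rule means for an assembler can be sketched in a few lines of Python. The 64-byte boundary comes from the quoted text, and a prefixed instruction is taken as 8 bytes (prefix word plus suffix word); padding with 4-byte NOPs is one way a toolchain could satisfy the rule. The function names are hypothetical.

```python
BOUNDARY = 64  # from the quoted Power ISA text
WORD = 4       # ordinary instruction, also the size of a NOP
PREFIXED = 8   # prefix word + suffix word


def crosses_boundary(addr: int, size: int, boundary: int = BOUNDARY) -> bool:
    """True if the bytes [addr, addr + size) span a boundary-aligned address."""
    return addr // boundary != (addr + size - 1) // boundary


def place(addr: int, sizes):
    """Lay out instructions of the given sizes starting at addr.

    A 4-byte NOP is inserted before any instruction that would otherwise
    cross a 64-byte boundary. Sizes must not exceed BOUNDARY.
    Returns (list of (address, size) placements, number of NOPs inserted).
    """
    placed, nops = [], 0
    for size in sizes:
        while crosses_boundary(addr, size):
            addr += WORD  # emit one NOP and try again
            nops += 1
        placed.append((addr, size))
        addr += size
    return placed, nops


layout, nop_count = place(56, [WORD, PREFIXED])
```

Starting at offset 56, the 4-byte instruction fits, but the prefixed one would occupy bytes 60..67, so one NOP pushes it to offset 64.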
On 2025-03-22 11:04 a.m., Thomas Koenig wrote:
> Marcus <m.delete@this.bitsnbites.eu> wrote:
>> Then we have the page-crossing issue. Is it better to force the
>> compiler/assembler to align such instructions so that they never cross
>> page boundaries?
> Power 10 chose to do so; actually, larger instructions cannot
> cross a (likely) Cache line size there. According to the Power
> ISA Version 3.1, section 1.6:
> "Prefixed instructions do not cross 64-byte instruction address
> boundaries. When a prefixed instruction crosses a 64-byte boundary,
> the system alignment error handler is invoked."
In the latest test project, the LB650, which is similar to a PowerPC,
large constants are encoded at the end of the cache line. So, there is a
similar issue of code running into the constant area.
I have the assembler moving the code that overlaps to the next cache line.
It is confusing to look at listing files, as there are constants output
inline with the code. It makes it look like the code should not work.
"How does it know where to go for the next instruction?" is the question
that comes to mind.
For now, the hardware decoder takes the cheesy approach of marking
instructions fetched in the constant area as invalid. The constant area
gets fetched and loaded into the pipeline, but as NOPs.
It is quite a trick getting the assembler to place constants at the end
of the cache line and generate references to the constants. It is
interesting because I have *constants* being relocated by the assembler
/ linker. Normally there would not be a relocation associated with a
constant. A relocation reference to the constant is spit out by the
assembler, and the linker updates the index to the constant in the code.
It does not quite work yet. Constants are placed and code is moved, but
the linked program does not have the correct references yet.
Experimental, but looking like things will work.
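For readers trying to picture the layout, here is a small Python sketch of such a packing pass. It is not the LB650 assembler. The fixed 32-bit instructions and 16-bit constant packets are as stated elsewhere in this thread, while the 64-byte line size, the scheme of indexing packets back from the end of the line, and the function name are assumptions.

```python
LINE = 64   # ASSUMPTION: cache line size in bytes
INSN = 4    # fixed 32-bit instructions
PACKET = 2  # 16-bit constant packets


def pack_lines(program):
    """program: list of (mnemonic, constant or None), constants >= 0.

    Returns a list of lines; each line is a dict with
      'code':   [(mnemonic, const_index or None), ...] filled from the front
      'consts': [packet, ...] filled from the end of the line downwards.
    const_index is the position of the constant's low packet, counted in
    packets back from the end of the line (an assumed indexing scheme).
    """
    lines = [{"code": [], "consts": []}]
    for mnemonic, const in program:
        packets = []
        if const is not None:
            if const < 0:
                raise ValueError("sketch handles non-negative constants only")
            while True:  # split into 16-bit packets, low half first
                packets.append(const & 0xFFFF)
                const >>= 16
                if const == 0:
                    break
        line = lines[-1]
        used = INSN * len(line["code"]) + PACKET * len(line["consts"])
        if used + INSN + PACKET * len(packets) > LINE:
            # Code and constants would collide: move to the next line.
            line = {"code": [], "consts": []}
            lines.append(line)
        index = len(line["consts"]) if packets else None
        line["consts"].extend(packets)
        line["code"].append((mnemonic, index))
    return lines


lines = pack_lines([("li", 0x12345678)] * 9)
```

Each `li` here costs 4 bytes of code plus 4 bytes of constant, so eight fit in a line and the ninth moves to the next one. The per-instruction index is the value a relocation would have to patch.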
On 2025-03-23 at 04:06, Robert Finch wrote:
> In the latest test project, the LB650 similar to a PowerPC, large
> constants are encoded at the end of the cache line. So, there is a
> similar issue of code running into the constant area.
> I have the assembler moving the code that overlaps to the next cache
> line.
> It is confusing to look at listing files, as there are constants output
> inline with the code. Makes it look like the code should not work. How
> does it know where to go for the next instruction? Is the question that
> comes to mind.
> For now, the hardware decoder takes the cheezy approach of marking
> instructions fetched in the constant area as invalid. The constant area
> gets fetched and loaded into the pipeline, but as NOPs.
> It is quite a trick getting the assembler to place constants at the end
> of the cache line and generate references to the constants. It is
> interesting because I have *constants* being relocated by the assembler
> / linker. Normally there would not be a relocation associated with a
> constant. A relocation reference to the constant is spit out by the
> assembler, and the linker updates the index to the constant in the code.
> It does not quite work yet. Constants are placed and code is moved, but
> the linked program does not have the correct references yet.
> Experimental, but looking like things will work.
Although I have not tried any of these techniques, here are my thoughts.
Why not always place the constant next to (right after) the instruction
that references it, instead of at an offset within the cache line?
The effect should be very similar, but now you have a simpler offset
(it's always zero) and you eliminate the problem with having to keep
track of where the constants are in order to prevent the PC/IP from
running into the constant area.
On 2025-03-23 8:12 a.m., Marcus wrote:
> Although I have not tried any of these techniques, here are my thoughts.
> Why not always place the constant next to (right after) the instruction
> that references it, instead of at an offset within the cache line?
That is a very good idea. It is almost the same thing as using a
variable length instruction.
LB650 uses a smaller constant packet (16 bits) than the instruction, so
instructions would need to be alignable at 16-bit boundaries.
LB650 instructions are a fixed 32 bits. There is also the possibility of
sharing the same constants, although slim.
> The effect should be very similar, but now you have a simpler offset
> (it's always zero) and you eliminate the problem with having to keep
> track of where the constants are in order to prevent the PC/IP from
> running into the constant area.
I wish I had thought of that last night. But I have coded things now.
Got the compiler / assembler going. The listings are a few percent
shorter than the PowerPC. It may be due to bugs yet. I think the
difference may be the PowerPC burns up bits using pairs of instructions
for high/low halves of the constant.
The vbcc compiler for the PowerPC was modified.
Robert Finch <robfi680@gmail.com> wrote:
> In the latest test project, the LB650 similar to a PowerPC, large
> constants are encoded at the end of the cache line. So, there is a
> similar issue of code running into the constant area.
What is your motivation for this?
If you have an instruction including constant(s) which no longer
fits your cache line (say, 8 bytes left and 12 bytes needed)
it does not matter where you put the constants and where you
put the instructions - it will not fit, and you have to start
a new cache line.
I am not seeing an advantage over what Power 10 does, which is
just to add a NOP at the end if things don't fit on a cacheline.