• Instruction Parcel Size

    From Robert Finch@robfi680@gmail.com to comp.arch on Sat Mar 8 20:15:14 2025
    From Newsgroup: comp.arch

    Recently started Q+2 development.

    Trying to get code density closer to something like the 68k or VAX.
    Sounds like My66000 also has good code density using 32-bit parcels. If
    32-bit parcels work well, I have thought to try 24-bit parcels.

    Decided to stay away from other odd sized parcels which create
    addressing issues. Currently 3 sizes of instructions: 24 / 48 and
    96-bit. The 96-bit instructions are usually for encoding a 64-bit
    immediate. The first two opcode bits determine the size. 00=24 bit,
    01=48 bit, 10=96 bit, 11 (reserved)
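
A minimal sketch of the length decode described above. It assumes the size code sits in the two least-significant bits of the first 24-bit parcel (the post says only "the first two opcode bits"); the names `LENGTH_BITS`, `instruction_length` and `parcel_count` are hypothetical.

```python
# Size code -> instruction length in bits, as listed in the post:
# 00 = 24-bit, 01 = 48-bit, 10 = 96-bit, 11 = reserved.
LENGTH_BITS = {0b00: 24, 0b01: 48, 0b10: 96}


def instruction_length(first_parcel: int) -> int:
    """Return the instruction length in bits.

    Assumption: the size code is the low two bits of the first parcel.
    """
    size_code = first_parcel & 0b11
    if size_code not in LENGTH_BITS:
        raise ValueError("size code 11 is reserved")
    return LENGTH_BITS[size_code]


def parcel_count(first_parcel: int) -> int:
    """Number of 24-bit parcels the instruction occupies."""
    return instruction_length(first_parcel) // 24
```

Every length is a multiple of 24 bits, so a fetch unit only ever has to step by 1, 2 or 4 parcels.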

    ADD, AND, OR, EOR, CMP have 24-bit instruction forms:
    iiiii-aaaaa-ttttt-ooooooo-00 <- immediate
    bbbbb-aaaaa-ttttt-ooooooo-00 <- register
    24-bit instruction forms allow using only the first 32 registers.
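
A sketch of how the 24-bit form could be packed and unpacked. It assumes the leftmost field in the diagram is the most significant, so the size code lands in bits 1:0 and the opcode in bits 8:2; `Short24`, `encode24` and `decode24` are hypothetical names.

```python
from typing import NamedTuple


class Short24(NamedTuple):
    b_or_imm: int  # bbbbb (register) or iiiii (immediate), 5 bits
    ra: int        # aaaaa, 5 bits
    rt: int        # ttttt, 5 bits
    opcode: int    # ooooooo, 7 bits
    size: int      # 00 for a 24-bit instruction


def encode24(b_or_imm: int, ra: int, rt: int, opcode: int) -> int:
    """Pack the fields; the size code is fixed at 00."""
    for value, width in ((b_or_imm, 5), (ra, 5), (rt, 5), (opcode, 7)):
        if not 0 <= value < (1 << width):
            raise ValueError("field out of range")
    return (b_or_imm << 19) | (ra << 14) | (rt << 9) | (opcode << 2)


def decode24(word: int) -> Short24:
    """Unpack a 24-bit instruction word into its fields."""
    if not 0 <= word < (1 << 24):
        raise ValueError("not a 24-bit value")
    return Short24(
        b_or_imm=(word >> 19) & 0x1F,
        ra=(word >> 14) & 0x1F,
        rt=(word >> 9) & 0x1F,
        opcode=(word >> 2) & 0x7F,
        size=word & 0x3,
    )
```

The 5-bit register fields are what limit this form to the first 32 registers.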

    ADD, AND, OR, EOR, CMP 48-bit instruction forms:
    fffffff-oooo-ccccccc-bbbbbbb-aaaaaaa-ttttttt-ooooooo-01
    48-bit instruction forms support a sign control bit on the register
    spec, along with 64 registers.

    LOAD / STORE word size have 24-bit forms
    ddddd-aaaaa-ttttt-ooooooo-00

    Conditional Branches (compare and branch) are 48-bit
    pp-R-TTTTTTTTTTTTTTTTTTT-aaaaaa-bbbbbb-A-ffff-ooooooo-01

    With load / store / basic arithmetic as 24-bit and compare-and-branch
    as 48-bit, a good portion of instructions should occupy the same or
    less storage space than in a 32-bit ISA.

    I have not written the assembler yet, so nothing to measure.


    --- Synchronet 3.20c-Linux NewsLink 1.2
  • From mitchalsup@mitchalsup@aol.com (MitchAlsup1) to comp.arch on Sun Mar 9 01:43:07 2025
    From Newsgroup: comp.arch

    On Sun, 9 Mar 2025 1:15:14 +0000, Robert Finch wrote:

    Recently started Q+2 development.

    Trying to get code density closer to something like the 68k or VAX.
    Sounds like My66000 also has good code density using 32-bit parcels. If
    32-bit parcels work well, I have thought to try 24-bit parcels.

    Ok, go all Quadriblock on me .... see if I care !

    Decided to stay away from other odd sized parcels which create
    addressing issues. Currently 3 sizes of instructions: 24 / 48 and
    96-bit. The 96-bit instructions are usually for encoding a 64-bit
    immediate. The first two opcode bits determine the size. 00=24 bit,
    01=48 bit, 10=96 bit, 11 (reserved)

    You might find 72-bit instructions useful in carrying a 32-bit
    immediate.

    ADD, AND, OR, EOR, CMP have 24-bit instruction forms:
    iiiii-aaaaa-ttttt-ooooooo-00 <- immediate
    bbbbb-aaaaa-ttttt-ooooooo-00 <- register
    24-bit instruction forms allow using only the first 32 registers.

    A 6-bit OpCode might let 1 more bit into immediate, or similar
    sign control over register operand. Remember, this is the highly
    used OpCode category. So, we have 32 Imm5(6) OpCodes.

    ADD, AND, OR, EOR, CMP 48-bit instruction forms:
    fffffff-oooo-ccccccc-bbbbbbb-aaaaaaa-ttttttt-ooooooo-01
    48-bit instruction forms support a sign control bit on the register
    spec, along with 64 registers.

    It really looks like you are forming 2×24-bit instructions into
    (wait for it) 2×24-bit containers. Sign control on operands is
    useful; 5-register operands may not be so. I am not against this
    format, but I think you are wasting a lot of entropy here.

    LOAD / STORE word size have 24-bit forms
    ddddd-aaaaa-ttttt-ooooooo-00

    Where do you get stack and structure displacements ?? This is one
    place My 66000 ISA is significantly better than RISC-V.

    Conditional Branches (compare and branch) are 48-bit
    pp-R-TTTTTTTTTTTTTTTTTTT-aaaaaa-bbbbbb-A-ffff-ooooooo-01

    Careful choice of oooooo may allow it to contain the condition
    in the ffff field expanding the displacement to 25-effective
    bits.

    With load / store / basic arithmetic as 24-bit and 48-bit
    compare-and-branch a good portion of instructions should occupy the same
    or less storage space than a 32-bit ISA.

    Questions remain wrt floating point constants and large integer
    constants.

    I have not written the assembler yet, so nothing to measure.

    The data will be interesting.
  • From Robert Finch@robfi680@gmail.com to comp.arch on Sat Mar 8 23:14:49 2025
    From Newsgroup: comp.arch

    On 2025-03-08 8:43 p.m., MitchAlsup1 wrote:
    On Sun, 9 Mar 2025 1:15:14 +0000, Robert Finch wrote:

    Recently started Q+2 development.

    Trying to get code density closer to something like the 68k or VAX.
    Sounds like My66000 also has good code density using 32-bit parcels. If
    32-bit parcels work well, I have thought to try 24-bit parcels.

    Ok, go all Quadriblock on me .... see if I care !

    Decided to stay away from other odd sized parcels which create
    addressing issues. Currently 3 sizes of instructions: 24 / 48 and
    96-bit. The 96-bit instructions are usually for encoding a 64-bit
    immediate. The first two opcode bits determine the size. 00=24 bit,
    01=48 bit, 10=96 bit, 11 (reserved)

    You might find 72-bit instructions useful in carrying a 32-bit
    immediate.

    Yeah, I was sure I wanted to support 23-bit immediates (in the 48-bit
    format) and 64-bit immediates (96-bit format), but had not decided on
    supporting other sizes, perhaps 128-bit immediates. A 72-bit format would
    allow over 40 bits for immediates.

    ADD, AND, OR, EOR, CMP have 24-bit instruction forms:
    iiiii-aaaaa-ttttt-ooooooo-00    <- immediate
    bbbbb-aaaaa-ttttt-ooooooo-00 <- register
    24-bit instruction forms allow using only the first 32 registers.

    A 6-bit OpCode might let 1 more bit into immediate, or similar
    sign control over register operand. Remember, this is the highly
    used OpCode category. So, we have 32 Imm5(6) OpCodes.

    Yeah, I have been scratching my head over how to free up another opcode
    bit. But the decode is simple. ooooooo always refers to the same opcode
    even if longer/shorter forms do not make sense.
    There are only about 23 opcodes free out of 128. That is 105
    instructions. Sometimes the opcode bits are used to determine the
    precision of the operation. So, ooooooo is sometimes ooooopp. The data
    type is also sometimes encoded in ooooooo (ooooTTT).


    ADD, AND, OR, EOR, CMP 48-bit instruction forms:
    fffffff-oooo-ccccccc-bbbbbbb-aaaaaaa-ttttttt-ooooooo-01
    48-bit instruction forms support a sign control bit on the register
    spec, along with 64 registers.

    It really looks like you are forming 2×24-bit instructions into
    (wait for it) 2×24-bit containers. Sign control on operands is
    useful; 5-register operands may not be so. I am not against this
    format, but I think you are wasting a lot of entropy here.

    It is almost two instructions, but there is only a single target
    register. That way in some cases up to 8 ops per clock can be processed
    instead of just 4. The compiler does not do a good job of making use of
    the dual ops yet.
    There are only four register operands. fffffff is a function code. oooo
    is a second operation between the result of (a op b) and c. The seven
    bit register spec includes sign control (there are only 64 GPRs).
    aaaaaaa is really saaaaaa.
    Register ops do not need anything beyond 48 bits, so there is a lot
    wasted, as the length field could also be 10 or 11, both of which are
    not used.
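
A sketch of unpacking the 48-bit register form as described: a function code, a second 4-bit operation applied between (a op b) and c, and four 7-bit register specs that each split into a sign-control bit plus a 6-bit register number. The bit positions assume the leftmost field is most significant; what the sign-control bit actually does is not stated, so the code only extracts it. `RegSpec`, `regspec` and `decode48_reg` are hypothetical names.

```python
from typing import NamedTuple


class RegSpec(NamedTuple):
    sign: int  # s bit of saaaaaa (meaning not specified in the thread)
    reg: int   # 6-bit register number, 0..63


def regspec(bits7: int) -> RegSpec:
    """Split a 7-bit register spec into sign control and register number."""
    return RegSpec(sign=(bits7 >> 6) & 1, reg=bits7 & 0x3F)


def decode48_reg(word: int) -> dict:
    """Unpack fffffff-oooo-ccccccc-bbbbbbb-aaaaaaa-ttttttt-ooooooo-01."""
    if not 0 <= word < (1 << 48):
        raise ValueError("not a 48-bit value")
    if word & 0b11 != 0b01:
        raise ValueError("size code is not 01 (48-bit)")
    return {
        "func": (word >> 41) & 0x7F,           # fffffff
        "op2": (word >> 37) & 0xF,             # oooo, second operation
        "rc": regspec((word >> 30) & 0x7F),    # ccccccc
        "rb": regspec((word >> 23) & 0x7F),    # bbbbbbb
        "ra": regspec((word >> 16) & 0x7F),    # aaaaaaa
        "rt": regspec((word >> 9) & 0x7F),     # ttttttt
        "opcode": (word >> 2) & 0x7F,          # ooooooo
    }
```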


    LOAD / STORE word size have 24-bit forms
    ddddd-aaaaa-ttttt-ooooooo-00

    Where do you get stack and structure displacements ?? This is one
    place My 66000 ISA is significantly better than RISC-V.

    ddddd is the displacement, which is multiplied by 8 (word size).
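
A small sketch of the reach this gives. The thread does not say whether ddddd is signed, so both readings are shown; `byte_offset` is a hypothetical helper.

```python
WORD_BYTES = 8  # the displacement is scaled by the 8-byte word size


def byte_offset(d5: int, signed: bool = False) -> int:
    """Byte offset encoded by the 5-bit ddddd field.

    Assumption: the signedness is not given in the thread, so it is a
    parameter here rather than a fact about the ISA.
    """
    if not 0 <= d5 < 32:
        raise ValueError("ddddd is a 5-bit field")
    if signed and d5 >= 16:
        d5 -= 32  # two's-complement sign extension of 5 bits
    return d5 * WORD_BYTES
```

Read as unsigned, the field covers offsets 0 to 248 bytes (32 word slots); read as signed, it covers -128 to +120 bytes.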

    Conditional Branches (compare and branch) are 48-bit
    pp-R-TTTTTTTTTTTTTTTTTTT-aaaaaa-bbbbbb-A-ffff-ooooooo-01

    Careful choice of oooooo may allow it to contain the condition
    in the ffff field expanding the displacement to 25-effective
    bits.

    There is a whole row (8 opcodes) in the opcode table dedicated to the
    branch data type. The data type could be integer, unsigned integer,
    float, decimal float, posit or capability.

    ooooooo = 0101TTT (the same breakdown is used for loads and stores).
    TTT
    000 - unsigned
    001 - signed
    010 - float
    011 - decimal float
    100 - posit
    101 - capability

    1000TTT = loads
    1001TTT = stores
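
The row/type split above can be sketched as a lookup. The row values and TTT codes are taken from the post; treating TTT codes 110 and 111 as unassigned is an assumption, and `classify` is a hypothetical name.

```python
# TTT data-type codes from the post.
DATA_TYPES = {
    0b000: "unsigned",
    0b001: "signed",
    0b010: "float",
    0b011: "decimal float",
    0b100: "posit",
    0b101: "capability",
}

# Upper four opcode bits select the row: 0101 branch, 1000 load, 1001 store.
ROWS = {0b0101: "branch", 0b1000: "load", 0b1001: "store"}


def classify(opcode: int):
    """Return (kind, data_type) for a 7-bit opcode, or None for other rows.

    data_type is None for TTT codes the post does not list (assumed unassigned).
    """
    if not 0 <= opcode < 128:
        raise ValueError("opcode is a 7-bit field")
    kind = ROWS.get((opcode >> 3) & 0xF)
    if kind is None:
        return None
    return kind, DATA_TYPES.get(opcode & 0b111)
```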

    It may be possible to gain a bit. I did not think it was critical. There
    are 19 displacement bits. 'A' indicates relative or absolute addressing
    for the target. Only relative addressing is really needed so a bit could
    be gained there as well. 'R' indicates to use register 'c' as the
    target, another bit that may not be needed. But it allows branch to
    register / conditional return.


    With load / store / basic arithmetic as 24-bit and 48-bit
    compare-and-branch a good portion of instructions should occupy the same
    or less storage space than a 32-bit ISA.

    Questions remain wrt floating point constants and large integer
    constants.

    iiiiiiiiiiiiiiiiiiiiiii-pp-saaaaaa-stttttt-ooooooo-01 <- immediate
    iii--65 bits--iii-pp-saaaaaa-stttttt-ooooooo-10 <- immediate

    Immediate postfixes may override register specs. Useful for encoding
    float constants too.

    iiiiiiiiiiiii-SS-1111100-00 (18-bit immediate postfix - 24 bit insn)
    iii--36 bits--iii-SS-1111100-01 (41-bit immediate postfix - 48 bit insn)
    iii--79 bits--iii-SS-1111100-10 (89-bit immediate postfix - 96 bit insn)

    SS selects which register spec to override.


    I have not written the assembler yet, so nothing to measure.

    The data will be interesting.

  • From Robert Finch@robfi680@gmail.com to comp.arch on Sun Mar 9 08:36:59 2025
    From Newsgroup: comp.arch

    Conditional Branches (compare and branch) are 48-bit
    pp-R-TTTTTTTTTTTTTTTTTTT-aaaaaa-bbbbbb-A-ffff-ooooooo-01

    Careful choice of oooooo may allow it to contain the condition
    in the ffff field expanding the displacement to 25-effective
    bits.

    Gained a bit in the displacement field by allocating another row of
    opcodes and moving the 'R' bit into the opcode. So, now it is

    pp-TTTTTTTTTTTTTTTTTTTT-aaaaaa-bbbbbb-A-ffff-oooRooo-01
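
A sketch of unpacking this revised 48-bit branch layout. It assumes the leftmost field is most significant, that the 20-bit T field is a signed displacement, and that R is the middle bit of the 7-bit opcode as the oooRooo diagram suggests; `sign_extend` and `decode_branch48` are hypothetical names.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as two's complement."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def decode_branch48(word: int) -> dict:
    """Unpack pp-T(20)-aaaaaa-bbbbbb-A-ffff-oooRooo-01."""
    if not 0 <= word < (1 << 48):
        raise ValueError("not a 48-bit value")
    if word & 0b11 != 0b01:
        raise ValueError("size code is not 01 (48-bit)")
    opcode = (word >> 2) & 0x7F
    return {
        "pp": (word >> 46) & 0x3,
        # Displacement in parcels; signedness is an assumption.
        "disp": sign_extend((word >> 26) & 0xFFFFF, 20),
        "ra": (word >> 20) & 0x3F,
        "rb": (word >> 14) & 0x3F,
        "absolute": (word >> 13) & 1,      # 'A' bit
        "cond": (word >> 9) & 0xF,         # ffff
        "opcode": opcode,
        "to_register": (opcode >> 3) & 1,  # 'R' bit, now inside the opcode
    }
```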

    A longer form of branches could also be made using a 96-bit instruction:

    pp-{68{T}}-aaaaaa-bbbbbb-A-ffff-oooRooo-10

    Been pondering coming up with shorter-form (24-bit) branches, maybe by
    comparing to zero, BEQZ / BNEZ. Say,

    TTTTTTTTT-aaaaaa-ooooooo-00

    would be good only for word-size integer value comparisons, but that
    might work a significant portion of the time.

    Having 20 T's gives 21.5 bits of effective displacement, as the
    displacement T's are multiplied by three.
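
The arithmetic behind that figure: scaling a 20-bit field by the 3-byte parcel size adds log2(3), about 1.585 bits, giving roughly 21.58 effective bits. The sketch below also computes the byte reach, assuming the displacement field is signed; `effective_bits` and `byte_reach` are hypothetical names.

```python
import math

PARCEL_BYTES = 3  # a 24-bit parcel is three bytes


def effective_bits(field_bits: int) -> float:
    """Bits of byte-address reach from a field scaled by the parcel size."""
    return field_bits + math.log2(PARCEL_BYTES)


def byte_reach(field_bits: int) -> tuple:
    """(lowest, highest) byte displacement, assuming a signed field."""
    lo = -(1 << (field_bits - 1))
    hi = (1 << (field_bits - 1)) - 1
    return lo * PARCEL_BYTES, hi * PARCEL_BYTES
```

Under the same assumptions, the proposed 9-bit field of the 24-bit branch form would reach from -768 to +765 bytes.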

    Using up eight of the free opcodes, so there are only about 13 left now,
    but I think it was worth it to get a branch displacement bit.

    Hmmm, I could get rid of the 'A' bit by moving it to a control register.
    One would likely want absolute addressing for branches for the entire
    program, not just one-at-a-time selection.



  • From mitchalsup@mitchalsup@aol.com (MitchAlsup1) to comp.arch on Sun Mar 9 18:09:22 2025
    From Newsgroup: comp.arch

    On Sun, 9 Mar 2025 12:36:59 +0000, Robert Finch wrote:

    Conditional Branches (compare and branch) are 48-bit
    pp-R-TTTTTTTTTTTTTTTTTTT-aaaaaa-bbbbbb-A-ffff-ooooooo-01

    Careful choice of oooooo may allow it to contain the condition
    in the ffff field expanding the displacement to 25-effective
    bits.

    Gained a bit in the displacement field by allocating another row of
    opcodes and moving the 'R' bit into the opcode. So, now it is

    pp-TTTTTTTTTTTTTTTTTTTT-aaaaaa-bbbbbb-A-ffff-oooRooo-01

    a longer form of branches could also be made using a 96-bit instruction

    pp-{68{T}}-aaaaaa-bbbbbb-A-ffff-oooRooo-10

    Been pondering coming up with shorter-form (24-bit) branches, maybe by
    comparing to zero, BEQZ / BNEZ. Say,

    My 66000 uses 29 of the available 32 Conditions in the compare to
    zero and branch instruction. 6 signed integer, 4 unsigned integer,
    8 float, 8 double, and I stuck SVC, SVR, and RET in this instruction
    too.

    TTTTTTTTT-aaaaaa-ooooooo-00

    would be good only for word-size integer value comparisons, but that
    might work a significant portion of the time.

    Having 20 T's gives 21.5 bits of effective displacement, as the
    displacement T's are multiplied by three.

    Using up eight of the free opcodes, so there are only about 13 left now,
    but I think it was worth it to get a branch displacement bit.

    Hmmm, I could get rid of the 'A' bit by moving it to a control register.
    One would likely want absolute addressing for branches for the entire
    program, not just one-at-a-time selection.