• VLIW: The Road Less Travelled

    From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Tue May 26 05:53:46 2026
    From Newsgroup: comp.arch

    A backgrounder from the ever-dependable Asianometry channel
    <https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
    beginning with the PhD student who invented the concept, who then
    went on to found a company (Multiflow) to capitalize on it as part of
    the short-lived “mini-supercomputer” boomlet of the early-to-mid
    1980s.

    It was a truly tough concept to prove in practice. And in the end it
    seems to have been in vain. Seems the infamous HP/Intel Itanium
    project also took in some ex-employees of the mini-super companies.

    So what happened? RISC happened. That turned out to be a much more
    practical way of achieving leaps in performance.

    I think I saw someone in the video comments say VLIW is still used
    in DSP. But I think they mean SIMD, not VLIW.
    --- Synchronet 3.22a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.arch on Tue May 26 08:54:41 2026
    From Newsgroup: comp.arch

    On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:
    A backgrounder from the ever-dependable Asianometry channel
    <https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
    beginning with the PhD student who invented the concept, who then
    went on to found a company (Multiflow) to capitalize on it as part of
    the short-lived “mini-supercomputer” boomlet of the early-to-mid
    1980s.

    It was a truly tough concept to prove in practice. And in the end it
    seems to have been in vain. Seems the infamous HP/Intel Itanium
    project also took in some ex-employees of the mini-super companies.

    So what happened? RISC happened. That turned out to be a much more
    practical way of achieving leaps in performance.

    I think I saw someone in the video comments say VLIW is still used
    in DSP. But I think they mean SIMD, not VLIW.

    I have not watched the video. But it is certainly the case that some
    DSP's are referred to as VLIW, with good reason. The key feature of
    powerful DSP's is not that they can do SIMD-style operations (they can
    often do that as well), but that they can do multiple different
    operations in the same cycle, with explicit control. A DSP core might
    be doing two separate MAC operations along with matching loads and
    stores, where these memory operations are for different memory areas and
    buses, and with different rules for auto-incrementing, cyclic wrapping,
    etc. There's a count register to decrement and check for 0, breaking
    out of the loop. VLIW-style DSP's can let you do the whole lot in one
    or two instructions.
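
    As a rough illustration, here is a plain C model of the kind of loop
    body described above. It is a sketch only and does not follow any real
    DSP's instruction set: the names dual_mac and dual_acc, the split into a
    linear coefficient array and a circular sample buffer, and the modulo
    wrap are all assumptions made for the example. Each pass through the
    loop stands for work that a VLIW-style DSP could issue as one wide
    instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Two accumulators, one per MAC unit (illustrative layout). */
typedef struct {
    int64_t acc0;   /* accumulator fed by MAC unit 0 */
    int64_t acc1;   /* accumulator fed by MAC unit 1 */
} dual_acc;

/* Software model of a dual-MAC inner loop.
 * coeff    : coefficients, read linearly (stands in for one memory/bus)
 * ring     : circular sample buffer (stands in for a second memory/bus)
 * ring_len : number of entries in the circular buffer
 * ring_pos : starting index into the circular buffer
 * pairs    : loop count, i.e. number of tap pairs to process */
dual_acc dual_mac(const int16_t *coeff, const int16_t *ring,
                  size_t ring_len, size_t ring_pos, size_t pairs)
{
    dual_acc a = {0, 0};
    while (pairs-- != 0) {                        /* count register: decrement and test */
        int16_t c0 = coeff[0];                    /* two loads from the linear memory */
        int16_t c1 = coeff[1];
        int16_t s0 = ring[ring_pos];              /* two loads from the circular memory */
        int16_t s1 = ring[(ring_pos + 1) % ring_len];
        a.acc0 += (int32_t)c0 * s0;               /* MAC unit 0 */
        a.acc1 += (int32_t)c1 * s1;               /* MAC unit 1 */
        coeff += 2;                               /* auto-increment addressing */
        ring_pos = (ring_pos + 2) % ring_len;     /* cyclic (wrap-around) addressing */
    }
    return a;
}
```

    On a conventional scalar core each commented line is at least one
    instruction; the point of the explicit-parallel DSP encoding is that
    the programmer or compiler states up front that they all happen in the
    same cycle.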

  • From BGB@cr88192@gmail.com to comp.arch on Wed May 27 14:07:00 2026
    From Newsgroup: comp.arch

    On 5/26/2026 1:54 AM, David Brown wrote:
    On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:
    A backgrounder from the ever-dependable Asianometry channel
    <https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
    beginning with the PhD student who invented the concept, who then
    went on to found a company (Multiflow) to capitalize on it as part of
    the short-lived “mini-supercomputer” boomlet of the early-to-mid
    1980s.

    It was a truly tough concept to prove in practice. And in the end it
    seems to have been in vain. Seems the infamous HP/Intel Itanium
    project also took in some ex-employees of the mini-super companies.

    So what happened? RISC happened. That turned out to be a much more
    practical way of achieving leaps in performance.

    I think I saw someone in the video comments say VLIW is still used
    in DSP. But I think they mean SIMD, not VLIW.

    I have not watched the video.  But it is certainly the case that some
    DSP's are referred to as VLIW, with good reason.  The key feature of powerful DSP's is not that they can do SIMD-style operations (they can
    often do that as well), but that they can do multiple different
    operations in the same cycle, with explicit control.  A DSP core might
    be doing two separate MAC operations along with matching loads and
    stores, where these memory operations are for different memory areas and buses, and with different rules for auto-incrementing, cyclic wrapping, etc.  There's a count register to decrement and check for 0, breaking
    out of the loop.  VLIW-style DSP's can let you do the whole lot in one
    or two instructions.


    Yes, pretty much.


    I can probably also admit that the general approach towards "WEX"
    in my own ISA designs (tagging the instruction words), along with the
    ASM syntax (using '|' to specify parallel instructions), took a "not
    exactly small" inspiration from the TMS320 DSP's (though a very similar
    approach was taken in the PIC32 / Xtensa ISA).

    This can be contrasted with the 128b/3 approach taken by IA64 (three
    instructions per 128-bit bundle).


    I had originally been looking into a 64/3 encoding as an alt-mode for
    BJX1, where, say, a 64-bit bundle holds either:
      3x 21-bit instructions;
      Or, 2x 31 bits;
      Or, a single 64-bit instruction.
    Say:
      0xxx: 3x 21 (2R/2RI only)
      10xx: 2x 31 (3R/3RI)
      11xx: Single Instr
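
    The bit budget works out as 1 tag bit + 3x21 = 64 and 2 tag bits +
    2x31 = 64, which leaves 62 payload bits for the single-instruction
    form. A minimal sketch of a decoder for such a bundle follows. Since
    this format was never implemented, everything beyond the three tag
    patterns is an assumption made for illustration: the tag is taken to
    sit in the most-significant bits, slots are taken to be packed from the
    low bits upward, the "x" bits are treated as payload, and the names
    decode_bundle, bundle and bundle_kind are hypothetical.

```c
#include <stdint.h>

typedef enum { BUNDLE_3X21, BUNDLE_2X31, BUNDLE_SINGLE } bundle_kind;

typedef struct {
    bundle_kind kind;
    int         count;     /* number of valid entries in slot[] */
    uint64_t    slot[3];   /* raw instruction payloads; slot[0] is the lowest bits */
} bundle;

/* Split one 64-bit bundle into its instruction payloads.
 * Assumed layout: tag in the top bits, slots packed upward from bit 0. */
bundle decode_bundle(uint64_t w)
{
    bundle b = { BUNDLE_SINGLE, 0, {0, 0, 0} };

    if ((w >> 63) == 0) {
        /* Tag 0xxx: 1 tag bit + 3 x 21-bit instructions */
        b.kind = BUNDLE_3X21;
        b.count = 3;
        for (int i = 0; i < 3; i++)
            b.slot[i] = (w >> (21 * i)) & 0x1FFFFFull;
    } else if ((w >> 62) == 2) {
        /* Tag 10xx: 2 tag bits + 2 x 31-bit instructions */
        b.kind = BUNDLE_2X31;
        b.count = 2;
        for (int i = 0; i < 2; i++)
            b.slot[i] = (w >> (31 * i)) & 0x7FFFFFFFull;
    } else {
        /* Tag 11xx: 2 tag bits, remaining 62 bits are one instruction */
        b.kind = BUNDLE_SINGLE;
        b.count = 1;
        b.slot[0] = w & 0x3FFFFFFFFFFFFFFFull;
    }
    return b;
}
```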

    There would have been separate modes:
      A SH-4 compatibility mode:
        Would run 32-bit SH-4 code;
      A dropped 64A mode:
        First attempt at 64-bit SH-4, used a modal encoding scheme;
        Needed to toggle status bits to bank-swap various instructions;
        Was super annoying...
      A 64C mode:
        Ran a modified version of the SH encoding scheme;
        Dropped/reworked things to avoid modal encodings.
      A considered VLIW mode:
        Would have run the 64/3 bundles;
        Never really got implemented (idea was abandoned);
        Switched to instead using an 80-bit encoding in the 64C mode:
          Bundle format followed a 16-bit escape op.

    After this, there were a few offshoot paths:
      SH-4 => B32V (a stripped down SH-4) => BTSR1
      BTSR1 x BJX1-64C => BJX2
    Then the encoding jostled around, before landing on what I now call XG1.

    The original bundle format was dropped in favor of explicitly tagging
    the 32-bit encodings (after 32-bit encodings became the dominant core
    ISA, and the 16-bit encodings switched to being secondary). It then
    gained predication and jumbo prefixes and similar.

    Then from there:
    XG2: Dropped the 16-bit encodings, reusing the bits for expanding the
    entire ISA to 64 registers.

    XG3: Reworked the encoding scheme for XG2 to be more compatible with
    sharing the same execution context and encoding scheme as RV64G.
    Essentially unifying XG2 and RV64G into a singular ISA (but then
    necessary inconsistencies began to emerge between XG2 and XG3).



    Ironically, the main "nail in the coffin" for my "WEX" approach, was
    realizing that I could achieve nearly the same effect without the
    explicit tagging.

    The compiler shuffles the instructions into "basically the right order",
    but then logic similar to what the compiler used to tag the instructions
    post-shuffle is instead mostly handled in hardware (and more flexibly,
    as the pipeline mechanics are no longer rigidly baked into the program,
    even if the scheduling does still affect performance).
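
    To make the idea concrete, here is a minimal sketch of the sort of
    pairing test that could be run either by a compiler (to set a tag bit)
    or by decode hardware (at fetch time). It is not taken from any actual
    core: the insn struct, the NO_REG marker, the single-memory-port rule
    and the can_pair name are all assumptions, and it only models a 2-wide
    in-order pipeline.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_REG 0xFF   /* hypothetical marker for "no register in this field" */

/* Minimal decoded-instruction summary (illustrative fields only). */
typedef struct {
    uint8_t rd;         /* destination register, or NO_REG */
    uint8_t rs1, rs2;   /* source registers, or NO_REG */
    bool    is_mem;     /* load or store */
    bool    is_branch;  /* any control transfer */
} insn;

/* Can instruction b issue in the same cycle as the preceding instruction a? */
bool can_pair(const insn *a, const insn *b)
{
    if (a->is_branch)
        return false;                 /* b may not execute at all */
    if (a->is_mem && b->is_mem)
        return false;                 /* assumed: one memory port */
    if (a->rd != NO_REG) {
        if (b->rs1 == a->rd || b->rs2 == a->rd)
            return false;             /* read-after-write: b needs a's result */
        if (b->rd == a->rd)
            return false;             /* write-after-write: same destination */
    }
    /* Write-after-read is harmless here: a reads its sources before
     * b's result is written back. */
    return true;
}
```

    With explicit tagging the answer to can_pair is frozen into the binary;
    with the hardware check the same binary can run on a core with
    different rules, and the compiler's scheduling only changes how often
    the answer is "yes".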

    Some "features", like specifying which instructions are allowed in which
    lane, or slightly alternate semantics for the instruction if used in
    Lane 2, etc, have gone away.

    The mechanism worked for RISC-V and turned out to be a lot cheaper than
    initial expectations, and (when I added XG3) I could naturally extend it
    to cover XG3 as well.


    The jumbo prefixes are in some way a vestige of the old mechanism, but
    the approach is still effective as a "make instructions bigger" thing
    (and scales more gracefully than "new blue sky encoding spaces" at each
    size).

    More so when most of what one needs from a larger encoding is not some
    new/novel instruction, but usually just an older instruction with a
    bigger immediate (and one can slot the new instructions into spaces
    where extending the old instruction in a given way is invalid or does
    not make sense).
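
    A minimal sketch of the "same instruction, bigger immediate" idea: a
    prefix word supplies extra high-order immediate bits, and the decoder
    glues them onto the base instruction's own immediate field. The field
    widths below (10 bits in the base instruction, 22 in the prefix) and
    the function names are hypothetical, picked only to make the example
    concrete; they are not the widths used by XG1/XG2/XG3 or by any RISC-V
    extension.

```c
#include <stdint.h>

#define BASE_IMM_BITS   10   /* hypothetical: immediate bits in the base instruction */
#define PREFIX_IMM_BITS 22   /* hypothetical: extra immediate bits in the prefix */

/* Sign-extend the low `bits` bits of v (1 <= bits <= 63). */
static int64_t sign_extend(uint64_t v, int bits)
{
    uint64_t m = 1ull << (bits - 1);   /* position of the sign bit */
    v &= (1ull << bits) - 1;           /* discard anything above the field */
    return (int64_t)((v ^ m) - m);
}

/* No prefix: the base field is the whole immediate. */
int64_t imm_plain(uint32_t base_imm)
{
    return sign_extend(base_imm, BASE_IMM_BITS);
}

/* With prefix: the prefix payload supplies the high bits, the base field
 * the low bits, and the sign now comes from the prefix. */
int64_t imm_with_prefix(uint32_t prefix_payload, uint32_t base_imm)
{
    uint64_t lo = base_imm & ((1u << BASE_IMM_BITS) - 1);
    uint64_t hi = prefix_payload & ((1u << PREFIX_IMM_BITS) - 1);
    return sign_extend((hi << BASE_IMM_BITS) | lo,
                       BASE_IMM_BITS + PREFIX_IMM_BITS);
}
```

    The opcode and register fields of the base instruction are untouched,
    so the prefixed form needs no new decode path beyond widening the
    immediate.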

    Well, and lacking another better approach that works in both my own
    encoding schemes and when bolted onto RISC-V.

    ...


    Had also on/off considered the possibility of a BJX3 project, but have
    not yet done so. The main transition here would be to effectively subset
    my current project (and hopefully get back to a simpler CPU core).

    Though, ironically, most of this involved prerequisite work to try to
    get things into a state where the RV64G+XG3 mode could be
    self-supporting and with meaningful cost savings.

    But, it is more work than it would likely seem trying to get things to a
    state where XG3 can take over from XG1 in terms of managing OS stuff
    (and one could maybe debate whether it might be better to do a
    hard-switch vs my current more incremental approach).


    But, this has resulted in some recent breaking changes in XG3 trying to
    eliminate some of the remaining "ugly gotchas" in the encoding scheme.

    Though, one can debate if RV64G+XG3 is actually the right path forward.

    Well, and more specific points, like what specific features should be nominally implemented in hardware vs left to trap and emulate, but this
    is more of a configuration-time thing.

    ...


  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed May 27 21:23:49 2026
    From Newsgroup: comp.arch


    David Brown <david.brown@hesbynett.no> posted:

    On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:
    A backgrounder from the ever-dependable Asianometry channel
    <https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
    beginning with the PhD student who invented the concept, who then
    went on to found a company (Multiflow) to capitalize on it as part of
    the short-lived “mini-supercomputer” boomlet of the early-to-mid
    1980s.

    It was a truly tough concept to prove in practice. And in the end it
    seems to have been in vain. Seems the infamous HP/Intel Itanium
    project also took in some ex-employees of the mini-super companies.

    So what happened? RISC happened. That turned out to be a much more practical way of achieving leaps in performance.

    I think I saw someone in the video comments say VLIW is still used
    in DSP. But I think they mean SIMD, not VLIW.

    I have not watched the video. But it is certainly the case that some
    DSP's are referred to as VLIW, with good reason. The key feature of powerful DSP's is not that they can do SIMD-style operations (they can
    often do that as well), but that they can do multiple different
    operations in the same cycle, with explicit control.

    MIMD--DSPs are designed to run small kernels of code where each cycle
    the core can {access memory, crunch some numbers, consider a branch}
    Which makes them ideal for {filtering, FFTing, some cyphers, store&
    forward network controllers}. These are the kinds of algorithms that
    fit the VLIW mantra well.

    {compilers, Assemblers, Operating Systems, File systems, ISRs, DPC's,
    GUIs, synchronization, multi-threaded shared libraries, ...} not so much.

    But for the applications they are suited for, they might save more than
    70% of the power needed by a GBOoO at less than 30% the die area (and
    cost). In other words, they pay a smaller von Neumann tax. About the
    only thing that can have lower area and power costs would be hardwired
    algorithms {Texture, Rasterization, Interpolation, ...} which GPUs
    have figured out.

    A DSP core might
    be doing two separate MAC operations along with matching loads and
    stores, where these memory operations are for different memory areas and buses, and with different rules for auto-incrementing, cyclic wrapping,
    etc. There's a count register to decrement and check for 0, breaking
    out of the loop. VLIW-style DSP's can let you do the whole lot in one
    or two instructions.

  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Wed May 27 22:39:39 2026
    From Newsgroup: comp.arch

    On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some cyphers,
    store& forward network controllers}. These are the kinds of
    algorithms that fit the VLIW mantra well.

    MIMD is not VLIW, though. MIMD means the processors are decoupled and
    run asynchronously. At least I thought that’s what it meant. VLIW
    implies some kind of coupling, while still allowing different units to
    perform different operations, unlike SIMD.

    Then there is MISD, aka dataflow.
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu May 28 01:33:29 2026
    From Newsgroup: comp.arch


    Lawrence D’Oliveiro <ldo@nz.invalid> posted:

    On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some cyphers,
    store& forward network controllers}. These are the kinds of
    algorithms that fit the VLIW mantra well.

    MIMD is not VLIW,

    You cannot have VLIW without MIMD. You can have MIMD without VLIW.

    though. MIMD means the processors are decoupled and
    run asynchronously.

    Not necessarily, one can have a single core that is MIMD all by itself.

    At least I thought that’s what it meant.

    GBOoO can be considered MIMD, but most do not. Flynn's taxonomy only
    requires multiple instructions operating on multiple different pieces
    of data simultaneously.

    VLIW
    implies some kind of coupling, while still allowing different units to perform different operations, unlike SIMD.

    Fisher's VLIW requires that the MI part has been compiled such that
    the DECODE unit can decode and execute all the instructions
    simultaneously.

    Then there is MISD, aka dataflow.

    Does "real" Data-flow even have instructions ? or flow control ?
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Thu May 28 05:36:44 2026
    From Newsgroup: comp.arch

    On Thu, 28 May 2026 01:33:29 GMT, MitchAlsup wrote:

    On Wed, 27 May 2026 22:39:39 -0000 (UTC), Lawrence D’Oliveiro wrote:

    MIMD is not VLIW,

    You cannot have VLIW without MIMD. You can have MIMD without VLIW.

    VLIW implies multiple function units, of course. The “MI” in “MIMD” implies multiple instruction fetch/decode units, surely. Otherwise
    it would be just “superscalar”.

    MIMD means the processors are decoupled and run asynchronously.

    Not necessarily, one can have a single core that is MIMD all by itself.

    Umm, what does the “M” stand for, again?

    VLIW implies some kind of coupling, while still allowing different
    units to perform different operations, unlike SIMD.

    Fisher's VLIW requires that the MI part has been compiled such that
    the DECODE unit can decode and execute all the instructions
    simultaneously.

    If by “simultaneously” you mean “in lockstep” rather than just “concurrently”, then “MI” on its own does not imply that.

    Then there is MISD, aka dataflow.

    Does "real" Data-flow even have instructions ? or flow control ?

    I’m not familiar with hardware uses of it, but I can point at software systems that implement it. Yes, they have instructions, and flow
    control.
  • From David Brown@david.brown@hesbynett.no to comp.arch on Thu May 28 08:23:39 2026
    From Newsgroup: comp.arch

    On 27/05/2026 23:23, MitchAlsup wrote:

    David Brown <david.brown@hesbynett.no> posted:

    On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:
    A backgrounder from the ever-dependable Asianometry channel
    <https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
    beginning with the PhD student who invented the concept, who then
    went on to found a company (Multiflow) to capitalize on it as part of
    the short-lived “mini-supercomputer” boomlet of the early-to-mid
    1980s.

    It was a truly tough concept to prove in practice. And in the end it
    seems to have been in vain. Seems the infamous HP/Intel Itanium
    project also took in some ex-employees of the mini-super companies.

    So what happened? RISC happened. That turned out to be a much more
    practical way of achieving leaps in performance.

    I think I saw someone in the video comments say VLIW is still used
    in DSP. But I think they mean SIMD, not VLIW.

    I have not watched the video. But it is certainly the case that some
    DSP's are referred to as VLIW, with good reason. The key feature of
    powerful DSP's is not that they can do SIMD-style operations (they can
    often do that as well), but that they can do multiple different
    operations in the same cycle, with explicit control.

    MIMD--DSPs are designed to run small kernels of code where each cycle
    the core can {access memory, crunch some numbers, consider a branch}
    Which makes them ideal for {filtering, FFTing, some cyphers, store&
    forward network controllers}. These are the kinds of algorithms that
    fit the VLIW mantra well.

    Absolutely. They are highly specialised devices. They can be a real
    PITA for anything that is /not/ a DSP kernel - trying to do something as simple as efficient UART serial communication on a DSP with 16-bit
    CHAR_BIT is not fun.


    {compilers, Assemblers, Operating Systems, File systems, ISRs, DPC's,
    GUIs, synchronization, multi-threaded shared libraries, ...} not so much.

    But for the applications they are suited for, they might save more than
    70% of the power needed by a GBOoO at less than 30% the die area (and
    cost). In other words, they pay a smaller vonNeumann tax. About the
    only thing that can be lower area and power costs would be hardwired algorithms {Texture, Rasterization, Interpolation, ...} which GPUs
    have figured out.

    A DSP core might
    be doing two separate MAC operations along with matching loads and
    stores, where these memory operations are for different memory areas and
    buses, and with different rules for auto-incrementing, cyclic wrapping,
    etc. There's a count register to decrement and check for 0, breaking
    out of the loop. VLIW-style DSP's can let you do the whole lot in one
    or two instructions.


  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu May 28 10:38:35 2026
    From Newsgroup: comp.arch

    On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:
    On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some cyphers,
    store& forward network controllers}. These are the kinds of
    algorithms that fit the VLIW mantra well.

    MIMD is not VLIW, though. MIMD means the processors are decoupled and
    run asynchronously. At least I thought that’s what it meant. VLIW
    implies some kind of coupling, while still allowing different units to perform different operations, unlike SIMD.

    Then there is MISD, aka dataflow.

    The naming is based on the number of independent instruction pointers.
    VLIW has 1 IP, MIMD has multiple IP.


  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu May 28 10:45:27 2026
    From Newsgroup: comp.arch

    On 2026-May-28 10:38, EricP wrote:
    On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:
    On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some cyphers,
    store& forward network controllers}. These are the kinds of
    algorithms that fit the VLIW mantra well.

    MIMD is not VLIW, though. MIMD means the processors are decoupled and
    run asynchronously. At least I thought that’s what it meant. VLIW
    implies some kind of coupling, while still allowing different units to
    perform different operations, unlike SIMD.

    Then there is MISD, aka dataflow.

    The naming is based on the number of independent instruction pointers.
    VLIW has 1 IP, MIMD has multiple IP.

    I added the word independent above because Mill has 2 IP but
    they are not independent - it has one branch destination IP
    and then a bidirectional instruction fetch.



  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu May 28 18:01:06 2026
    From Newsgroup: comp.arch


    Lawrence D’Oliveiro <ldo@nz.invalid> posted:

    On Thu, 28 May 2026 01:33:29 GMT, MitchAlsup wrote:

    On Wed, 27 May 2026 22:39:39 -0000 (UTC), Lawrence D’Oliveiro wrote:

    MIMD is not VLIW,

    You cannot have VLIW without MIMD. You can have MIMD without VLIW.

    VLIW implies multiple function units, of course. The “MI” in “MIMD” implies multiple instruction fetch/decode units, surely. Otherwise
    it would be just “superscalar”.

    MIMD means the processors are decoupled and run asynchronously.

    Not necessarily, one can have a single core that is MIMD all by itself.

    Umm, what does the “M” stand for, again?

    Multi-instruction or Multi-data

    A single core can perform multi-instruction on multi-data.

    VLIW implies some kind of coupling, while still allowing different
    units to perform different operations, unlike SIMD.

    Fisher's VLIW requires that the MI part has been compiled such that
    the DECODE unit can decode and execute all the instructions
    simultaneously.

    If by “simultaneously” you mean “in lockstep” rather than just “concurrently”, then “MI” on its own does not imply that.

    It is not supposed to, that is what VLIW stands for.

    Then there is MISD, aka dataflow.

    Does "real" Data-flow even have instructions ? or flow control ?

    I’m not familiar with hardware uses of it, but I can point at software systems that implement it. Yes, they have instructions, and flow
    control.

    See Arvind MIT data-flow
  • From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu May 28 18:04:12 2026
    From Newsgroup: comp.arch


    EricP <ThatWouldBeTelling@thevillage.com> posted:

    On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:
    On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some cyphers,
    store& forward network controllers}. These are the kinds of
    algorithms that fit the VLIW mantra well.

    MIMD is not VLIW, though. MIMD means the processors are decoupled and
    run asynchronously. At least I thought that’s what it meant. VLIW
    implies some kind of coupling, while still allowing different units to perform different operations, unlike SIMD.

    Then there is MISD, aka dataflow.

    The naming is based on the number of independent instruction pointers.
    VLIW has 1 IP, MIMD has multiple IP.

    Burroughs BSP had a single IP orchestrating 16 'cores' {more GPU-like}
    and was considered MIMD because many instructions were executed
    simultaneously each execution being performed on its own data.
  • From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu May 28 14:31:21 2026
    From Newsgroup: comp.arch

    On 2026-May-28 14:04, MitchAlsup wrote:

    EricP <ThatWouldBeTelling@thevillage.com> posted:

    On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:
    On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some cyphers,
    store& forward network controllers}. These are the kinds of
    algorithms that fit the VLIW mantra well.

    MIMD is not VLIW, though. MIMD means the processors are decoupled and
    run asynchronously. At least I thought that’s what it meant. VLIW
    implies some kind of coupling, while still allowing different units to
    perform different operations, unlike SIMD.

    Then there is MISD, aka dataflow.

    The naming is based on the number of independent instruction pointers.
    VLIW has 1 IP, MIMD has multiple IP.

    Burroughs BSP had a single IP orchestrating 16 'cores' {more GPU-like}
    and was considered MIMD because many instructions were executed
    simultaneously each execution being performed on its own data.

    That's SIMD then, like ILLIAC-IV.




  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Thu May 28 14:35:11 2026
    From Newsgroup: comp.arch

    On 5/27/2026 2:23 PM, MitchAlsup wrote:

    David Brown <david.brown@hesbynett.no> posted:

    On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:
    A backgrounder from the ever-dependable Asianometry channel
    <https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
    beginning with the PhD student who invented the concept, who then
    went on to found a company (Multiflow) to capitalize on it as part of
    the short-lived “mini-supercomputer” boomlet of the early-to-mid
    1980s.

    It was a truly tough concept to prove in practice. And in the end it
    seems to have been in vain. Seems the infamous HP/Intel Itanium
    project also took in some ex-employees of the mini-super companies.

    So what happened? RISC happened. That turned out to be a much more
    practical way of achieving leaps in performance.

    I think I saw someone in the video comments say VLIW is still used
    in DSP. But I think they mean SIMD, not VLIW.

    I have not watched the video. But it is certainly the case that some
    DSP's are referred to as VLIW, with good reason. The key feature of
    powerful DSP's is not that they can do SIMD-style operations (they can
    often do that as well), but that they can do multiple different
    operations in the same cycle, with explicit control.

    MIMD--DSPs are designed to run small kernels of code where each cycle
    the core can {access memory, crunch some numbers, consider a branch}
    Which makes them ideal for {filtering, FFTing, some cyphers, store&
    forward network controllers}. These are the kinds of algorithms that
    fit the VLIW mantra well.

    Basically, kind of akin to compute shaders..

    [...]
  • From George Neuner@gneuner2@comcast.net to comp.arch on Thu May 28 18:02:05 2026
    From Newsgroup: comp.arch

    On Thu, 28 May 2026 01:33:29 GMT, MitchAlsup
    <user5857@newsgrouper.org.invalid> wrote:


    Lawrence D’Oliveiro <ldo@nz.invalid> posted:

    On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some cyphers,
    store& forward network controllers}. These are the kinds of
    algorithms that fit the VLIW mantra well.

    MIMD is not VLIW,

    You cannot have VLIW without MIMD. You can have MIMD without VLIW.

    VLIW is multiple /operations/ in a single instruction. Ivan Godard
    coined "MOMD" to describe this.

    MIMD is multiple instructions - any or each of which may be VLIW.
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Fri May 29 02:35:19 2026
    From Newsgroup: comp.arch

    On Thu, 28 May 2026 14:35:11 -0700, Chris M. Thomasson wrote:

    On 5/27/2026 2:23 PM, MitchAlsup wrote:

    MIMD--DSPs are designed to run small kernels of code where each
    cycle the core can {access memory, crunch some numbers, consider a
    branch} Which makes them ideal for {filtering, FFTing, some
    cyphers, store& forward network controllers}. These are the kinds
    of algorithms that fit the VLIW mantra well.

    Basically, kind of akin to compute shaders..

    Given that the specs for such things regularly talk about “pipelines”,
    you might say such shader ensembles fit the MISD/dataflow paradigm ...
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Fri May 29 02:36:26 2026
    From Newsgroup: comp.arch

    On Thu, 28 May 2026 14:31:21 -0400, EricP wrote:

    On 2026-May-28 14:04, MitchAlsup wrote:

    EricP <ThatWouldBeTelling@thevillage.com> posted:

    The naming is based on the number of independent instruction
    pointers. VLIW has 1 IP, MIMD has multiple IP.

    Burroughs BSP had a single IP orchestrating 16 'cores' {more
    GPU-like} and was considered MIMD because many instructions were
    executed simultaneously each execution being performed on its own
    data.

    That's SIMD then, like ILLIAC-IV.

    I would agree.