Forum: War Ensemble BBS

VLIW: The Road Less Travelled

From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Tue May 26 05:53:46 2026

From Newsgroup: comp.arch

A backgrounder from the ever-dependable Asianometry channel
<https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
beginning with the PhD student who invented the concept, who then
went on to found a company (Multiflow) to capitalize on it as part of
the short-lived “mini-supercomputer” boomlet of the early-to-mid
1980s.

It was a truly tough concept to prove in practice. And in the end it
seems to have been in vain. Seems the infamous HP/Intel Itanium
project also took in some ex-employees of the mini-super companies.

So what happened? RISC happened. That turned out to be a much more
practical way of achieving leaps in performance.

I think I saw someone in the video comments say VLIW is still used
in DSP. But I think they mean SIMD, not VLIW.
--- Synchronet 3.22a-Linux NewsLink 1.2

From David Brown@david.brown@hesbynett.no to comp.arch on Tue May 26 08:54:41 2026

From Newsgroup: comp.arch

On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:

A backgrounder from the ever-dependable Asianometry channel
<https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
beginning with the PhD student who invented the concept, who then
went on to found a company (Multiflow) to capitalize on it as part of
the short-lived “mini-supercomputer” boomlet of the early-to-mid
1980s.

It was a truly tough concept to prove in practice. And in the end it
seems to have been in vain. Seems the infamous HP/Intel Itanium
project also took in some ex-employees of the mini-super companies.

So what happened? RISC happened. That turned out to be a much more
practical way of achieving leaps in performance.

I think I saw someone in the video comments say VLIW is still used
in DSP. But I think they mean SIMD, not VLIW.

I have not watched the video. But it is certainly the case that some
DSP's are referred to as VLIW, with good reason. The key feature of
powerful DSP's is not that they can do SIMD-style operations (they can
often do that as well), but that they can do multiple different
operations in the same cycle, with explicit control. A DSP core might
be doing two separate MAC operations along with matching loads and
stores, where these memory operations are for different memory areas and buses, and with different rules for auto-incrementing, cyclic wrapping,
etc. There's a count register to decrement and check for 0, breaking
out of the loop. VLIW-style DSP's can let you do the whole lot in one
or two instructions.
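
As a rough illustration of the kind of loop body described above, here is a minimal Python model of a dual-MAC FIR step. It is only a sketch: the names CircularPointer and dual_mac_fir are made up for this example, the two Python lists stand in for the separate memory areas, and no real DSP instruction set is being modelled. Everything inside the loop body is the work a wide DSP instruction might encode in one go.

```python
# Sketch only: hypothetical names, not taken from any real DSP toolchain.

class CircularPointer:
    """Address register with auto-increment and cyclic (modulo) wrapping."""

    def __init__(self, length, start=0):
        self.length = length
        self.offset = start % length

    def load_and_advance(self, memory, step=1):
        # Load from the current address, then post-modify with wrap-around.
        value = memory[self.offset]
        self.offset = (self.offset + step) % self.length
        return value


def dual_mac_fir(x_mem, c_mem, newest):
    """One FIR output sample, two multiply-accumulates per loop iteration.

    x_mem is a circular delay line (index 'newest' holds the latest sample),
    c_mem holds the coefficients; both are assumed to have the same even length.
    """
    taps = len(c_mem)
    assert taps % 2 == 0 and len(x_mem) == taps
    x_ptr = CircularPointer(taps, start=newest)  # walks backwards in time
    c_ptr = CircularPointer(taps, start=0)       # walks forwards
    acc0 = acc1 = 0
    count = taps // 2                            # loop count register
    while True:
        # Two MACs, four loads, four pointer updates ...
        acc0 += c_ptr.load_and_advance(c_mem) * x_ptr.load_and_advance(x_mem, step=-1)
        acc1 += c_ptr.load_and_advance(c_mem) * x_ptr.load_and_advance(x_mem, step=-1)
        # ... plus decrement-and-test of the count register.
        count -= 1
        if count == 0:
            break
    return acc0 + acc1
```

With x_mem = [1, 2, 3, 4], c_mem = [1, 10, 100, 1000] and newest = 1, the samples are consumed in the order 2, 1, 4, 3, wrapping from index 0 back to index 3.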


From BGB@cr88192@gmail.com to comp.arch on Wed May 27 14:07:00 2026

From Newsgroup: comp.arch

On 5/26/2026 1:54 AM, David Brown wrote:

On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:

A backgrounder from the ever-dependable Asianometry channel
<https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
beginning with the PhD student who invented the concept, who then
went on to found a company (Multiflow) to capitalize on it as part of
the short-lived “mini-supercomputer” boomlet of the early-to-mid
1980s.

It was a truly tough concept to prove in practice. And in the end it
seems to have been in vain. Seems the infamous HP/Intel Itanium
project also took in some ex-employees of the mini-super companies.

So what happened? RISC happened. That turned out to be a much more
practical way of achieving leaps in performance.

I think I saw someone in the video comments say VLIW is still used
in DSP. But I think they mean SIMD, not VLIW.

I have not watched the video. But it is certainly the case that some
DSP's are referred to as VLIW, with good reason. The key feature of powerful DSP's is not that they can do SIMD-style operations (they can
often do that as well), but that they can do multiple different
operations in the same cycle, with explicit control. A DSP core might
be doing two separate MAC operations along with matching loads and
stores, where these memory operations are for different memory areas and buses, and with different rules for auto-incrementing, cyclic wrapping, etc. There's a count register to decrement and check for 0, breaking
out of the loop. VLIW-style DSP's can let you do the whole lot in one
or two instructions.

Yes, pretty much.

I can probably also admit that the general approach towards "WEX"
in my own ISA designs (tagging the instruction words), along with the
ASM syntax (using '|' to specify parallel instructions), took a "not
exactly small" inspiration from the TMS320 DSP's (though a very similar
approach was taken in the PIC32 / Xtensa ISA).

This can be contrasted with the 128b/3 approach taken by IA64.

I had originally been looking into a 64/3 encoding as an alt-mode for
BJX1, where, say:
  64-bit bundle holds either 3x 21-bit instructions;
  Or, 2x 31 bits;
  Or, a single 64-bit instruction.
Say:
  0xxx: 3x 21 (2R/2RI only)
  10xx: 2x 31 (3R/3RI)
  11xx: Single Instr
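
To make the tag scheme concrete, here is a small Python sketch of how such a 64-bit bundle could be split. The post only gives the tag patterns and field widths; placing the tag in the most significant bits, packing the first instruction highest, and the name split_bundle are all assumptions made for this example.

```python
MASK64 = (1 << 64) - 1

def split_bundle(bundle):
    """Split a 64-bit bundle into (kind, [instruction fields]).

    Assumed layout (not specified in the post): tag bits at the top,
    instruction fields packed below them, first field in the highest bits.
    """
    bundle &= MASK64
    if (bundle >> 63) == 0:
        # 0xxx: 1 tag bit + 3 x 21 bits = 64 bits
        fields = [(bundle >> shift) & ((1 << 21) - 1) for shift in (42, 21, 0)]
        return "3x21", fields
    if (bundle >> 62) == 0b10:
        # 10xx: 2 tag bits + 2 x 31 bits = 64 bits
        fields = [(bundle >> shift) & ((1 << 31) - 1) for shift in (31, 0)]
        return "2x31", fields
    # 11xx: single instruction; the 62 bits below the tag are the payload
    return "single", [bundle & ((1 << 62) - 1)]
```

The arithmetic works out exactly: one tag bit leaves 63 = 3 x 21 bits, and two tag bits leave 62 = 2 x 31 bits.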

There would have been separate modes:
  An SH-4 compatibility mode:
    Would run 32-bit SH-4 code;
  Dropped 64A mode:
    First attempt at 64-bit SH-4, used a modal encoding scheme
    Needed to toggle status bits to bank-swap various instructions.
    Was super annoying...
  A 64C mode:
    Ran a modified version of the SH encoding scheme.
    Dropped/reworked things to avoid modal encodings.
  A considered VLIW mode:
    Would have run the 64/3 bundles;
    Never really got implemented (idea was abandoned).
    Switched to instead using an 80-bit encoding in the 64C mode:
      Bundle format followed a 16-bit escape op.

After this, there were a few offshoot paths:
  SH-4 => B32V (a stripped down SH-4) => BTSR1
  BTSR1 x BJX1-64C => BJX2
Then the encoding jostled around, before landing on what I now call XG1.

The original bundle format was dropped in favor of explicitly tagging
the 32-bit encodings (after 32-bit encodings became the dominant core
ISA, and the 16-bit encodings switched to being secondary). It then
gained predication and jumbo prefixes and similar.

Then from there:
XG2: Dropped the 16-bit encodings, reusing the bits for expanding the
entire ISA to 64 registers.

XG3: Reworked the encoding scheme for XG2 to be more compatible with
sharing the same execution context and encoding scheme as RV64G.
Essentially unifying XG2 and RV64G into a singular ISA (but then
necessary inconsistencies began to emerge between XG2 and XG3).

Ironically, the main "nail in the coffin" for my "WEX" approach was
realizing that I could achieve nearly the same effect without the
explicit tagging.

The compiler still shuffles the instructions into "basically the right
order", but logic similar to what the compiler used to tag the
instructions post-shuffle is instead mostly handled in hardware (and more
flexibly, as the pipeline mechanics are no longer rigidly baked into the
program, even if the scheduling does still affect performance).
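
For readers who have not seen this sort of check, here is a generic Python sketch of in-order grouping: adjacent instructions are put in the same issue group unless the later one reads or overwrites a register written earlier in that group. This is not the actual logic of the hardware described above; the Instr tuple, the function names and the width of 2 are all assumptions for illustration, and real hardware would also check lane and functional-unit constraints.

```python
from collections import namedtuple

# Hypothetical instruction record: mnemonic, destination register, source registers.
Instr = namedtuple("Instr", "name dest srcs")

def can_co_issue(group, candidate):
    """True if 'candidate' has no RAW or WAW hazard against the current group."""
    for earlier in group:
        if earlier.dest is None:
            continue
        if earlier.dest in candidate.srcs:   # read-after-write
            return False
        if earlier.dest == candidate.dest:   # write-after-write
            return False
    return True

def form_issue_groups(program, width=2):
    """Greedily group adjacent instructions, preserving program order."""
    groups, current = [], []
    for instr in program:
        if current and (len(current) == width or not can_co_issue(current, instr)):
            groups.append(current)
            current = []
        current.append(instr)
    if current:
        groups.append(current)
    return groups
```

The same function could be run by a compiler to emit explicit tags or by a decoder at run time, which is roughly the trade-off being described: the grouping rule stays similar, only where it is evaluated changes.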

Some "features", like specifying which instructions are allowed in which
lane, or slightly alternate semantics for the instruction if used in
Lane 2, etc, have gone away.

The mechanism worked for RISC-V and turned out to be a lot cheaper than
initially expected, and (when I added XG3) I could naturally extend it
to cover XG3 as well.

The jumbo prefixes are in some way a vestige of the old mechanism, but
the approach is still effective as a "make instructions bigger" thing
(and scales more gracefully than "new blue sky encoding spaces" at each
size).

More so when most of what one needs from a larger encoding is not some
new/novel instruction, but usually just an older instruction with a
bigger immediate (and one can slot the new instructions into spaces
where extending the old instruction in a given way is invalid or does
not make sense).
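
The "same instruction, bigger immediate" idea can be sketched in a few lines of Python. Every encoding detail here is invented for the example (32-bit words, an 8-bit opcode in the top bits, a 10-bit immediate in the low bits, opcode 0xFF acting as the prefix); it is not the XG1/XG2/XG3 or RISC-V encoding, and sign extension is ignored.

```python
JUMBO_OPCODE = 0xFF   # hypothetical prefix opcode
IMM_BITS = 10         # hypothetical base immediate width

def decode_stream(words):
    """Decode 32-bit words into (opcode, immediate) pairs.

    A prefix word carries 24 extra immediate bits that are glued on top of
    the immediate of the instruction that follows it.
    """
    decoded = []
    pending_high = None
    for word in words:
        opcode = (word >> 24) & 0xFF
        if opcode == JUMBO_OPCODE:
            pending_high = word & 0xFFFFFF   # stash the extra bits
            continue
        imm = word & ((1 << IMM_BITS) - 1)
        if pending_high is not None:
            imm |= pending_high << IMM_BITS  # widen the existing immediate
            pending_high = None
        decoded.append((opcode, imm))
    return decoded
```

The base instruction decodes the same way with or without the prefix, which is why this scales more gently than defining a fresh encoding space for every instruction size.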

Well, and lacking another better approach that works in both my own
encoding schemes and when bolted onto RISC-V.

...

Had also on/off considered the possibility of a BJX3 project, but have
not yet done so. The main transition here would be to effectively subset
my current project (and hopefully get back to a simpler CPU core).

Though, ironically, most of this involved prerequisite work to try to
get things into a state where the RV64G+XG3 mode could be
self-supporting and with meaningful cost savings.

But, it is more work than it would likely seem trying to get things to a
state where XG3 can take over from XG1 in terms of managing OS stuff
(and one could maybe debate whether it might be better to do a
hard-switch vs my current more incremental approach).

But, this has resulted in some recent breaking changes in XG3 trying to eliminate some of the remaining "ugly gotchas" in the encoding scheme.

Though, one can debate if RV64G+XG3 is actually the right path forward.

Well, and more specific points, like what specific features should be nominally implemented in hardware vs left to trap and emulate, but this
is more of a configuration-time thing.

...


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Wed May 27 21:23:49 2026

From Newsgroup: comp.arch

David Brown <david.brown@hesbynett.no> posted:

On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:

A backgrounder from the ever-dependable Asianometry channel
<https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
beginning with the PhD student who invented the concept, who then
went on to found a company (Multiflow) to capitalize on it as part of
the short-lived “mini-supercomputer” boomlet of the early-to-mid
1980s.

It was a truly tough concept to prove in practice. And in the end it
seems to have been in vain. Seems the infamous HP/Intel Itanium
project also took in some ex-employees of the mini-super companies.

So what happened? RISC happened. That turned out to be a much more practical way of achieving leaps in performance.

I think I saw someone in the video comments say VLIW is still used
in DSP. But I think they mean SIMD, not VLIW.

I have not watched the video. But it is certainly the case that some
DSP's are referred to as VLIW, with good reason. The key feature of powerful DSP's is not that they can do SIMD-style operations (they can
often do that as well), but that they can do multiple different
operations in the same cycle, with explicit control.

MIMD--DSPs are designed to run small kernels of code where each cycle
the core can {access memory, crunch some numbers, consider a branch},
which makes them ideal for {filtering, FFTing, some cyphers, store &
forward network controllers}. These are the kinds of algorithms that
fit the VLIW mantra well.

{Compilers, assemblers, operating systems, file systems, ISRs, DPCs,
GUIs, synchronization, multi-threaded shared libraries, ...} not so much.

But for the applications they are suited for, they might save more than
70% of the power needed by a GBOoO at less than 30% the die area (and
cost). In other words, they pay a smaller von Neumann tax. About the
only things with lower area and power costs would be hardwired
algorithms {Texture, Rasterization, Interpolation, ...}, which GPUs
have figured out.

A DSP core might
be doing two separate MAC operations along with matching loads and
stores, where these memory operations are for different memory areas and buses, and with different rules for auto-incrementing, cyclic wrapping,
etc. There's a count register to decrement and check for 0, breaking
out of the loop. VLIW-style DSP's can let you do the whole lot in one
or two instructions.


From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Wed May 27 22:39:39 2026

From Newsgroup: comp.arch

On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some cyphers,
store& forward network controllers}. These are the kinds of
algorithms that fit the VLIW mantra well.

MIMD is not VLIW, though. MIMD means the processors are decoupled and
run asynchronously. At least I thought that’s what it meant. VLIW
implies some kind of coupling, while still allowing different units to
perform different operations, unlike SIMD.

Then there is MISD, aka dataflow.

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu May 28 01:33:29 2026

From Newsgroup: comp.arch

Lawrence D’Oliveiro <ldo@nz.invalid> posted:

On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some cyphers,
store& forward network controllers}. These are the kinds of
algorithms that fit the VLIW mantra well.

MIMD is not VLIW,

You cannot have VLIW without MIMD. You can have MIMD without VLIW.

though. MIMD means the processors are decoupled and
run asynchronously.

Not necessarily, one can have a single core that is MIMD all by itself.

At least I thought that’s what it meant.

GBOoO can be considered MIMD, but most do not. Flynn's taxonomy only
requires multiple instructions operating on multiple different pieces
of data simultaneously.

VLIW
implies some kind of coupling, while still allowing different units to perform different operations, unlike SIMD.

Fisher's VLIW requires that the MI part has been compiled such that
the DECODE unit can decode and execute all the instructions
simultaneously.

Then there is MISD, aka dataflow.

Does "real" data-flow even have instructions? Or flow control?

From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Thu May 28 05:36:44 2026

From Newsgroup: comp.arch

On Thu, 28 May 2026 01:33:29 GMT, MitchAlsup wrote:

On Wed, 27 May 2026 22:39:39 -0000 (UTC), Lawrence D’Oliveiro wrote:

MIMD is not VLIW,

You cannot have VLIW without MIMD. You can have MIMD without VLIW.

VLIW implies multiple function units, of course. The “MI” in “MIMD” implies multiple instruction fetch/decode units, surely. Otherwise
it would be just “superscalar”.

MIMD means the processors are decoupled and run asynchronously.

Not necessarily, one can have a single core that is MIMD all by itself.

Umm, what does the “M” stand for, again?

VLIW implies some kind of coupling, while still allowing different
units to perform different operations, unlike SIMD.

Fisher's VLIW requires that the MI part has been compiled such that
the DECODE unit can decode and execute all the instructions
simultaneously.

If by “simultaneously” you mean “in lockstep” rather than just “concurrently”, then “MI” on its own does not imply that.

Then there is MISD, aka dataflow.

Does "real" Data-flow even have instructions ? or flow control ?

I’m not familiar with hardware uses of it, but I can point at software systems that implement it. Yes, they have instructions, and flow
control.

From David Brown@david.brown@hesbynett.no to comp.arch on Thu May 28 08:23:39 2026

From Newsgroup: comp.arch

On 27/05/2026 23:23, MitchAlsup wrote:

David Brown <david.brown@hesbynett.no> posted:

On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:

A backgrounder from the ever-dependable Asianometry channel
<https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
beginning with the PhD student who invented the concept, who then
went on to found a company (Multiflow) to capitalize on it as part of
the short-lived “mini-supercomputer” boomlet of the early-to-mid
1980s.

It was a truly tough concept to prove in practice. And in the end it
seems to have been in vain. Seems the infamous HP/Intel Itanium
project also took in some ex-employees of the mini-super companies.

So what happened? RISC happened. That turned out to be a much more
practical way of achieving leaps in performance.

I think I saw someone in the video comments say VLIW is still used
in DSP. But I think they mean SIMD, not VLIW.

I have not watched the video. But it is certainly the case that some
DSP's are referred to as VLIW, with good reason. The key feature of
powerful DSP's is not that they can do SIMD-style operations (they can
often do that as well), but that they can do multiple different
operations in the same cycle, with explicit control.

MIMD--DSPs are designed to run small kernels of code where each cycle
the core can {access memory, crunch some numbers, consider a branch}
Which makes them ideal for {filtering, FFTing, some cyphers, store&
forward network controllers}. These are the kinds of algorithms that
fit the VLIW mantra well.

Absolutely. They are highly specialised devices. They can be a real
PITA for anything that is /not/ a DSP kernel - trying to do something as simple as efficient UART serial communication on a DSP with 16-bit
CHAR_BIT is not fun.
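
To show why this gets awkward, here is a small Python model of what UART handling tends to look like when the smallest addressable unit is 16 bits: octets from the wire either waste half of each word or have to be packed and unpacked by hand. The function names and the low-octet-first packing order are assumptions for this sketch, not taken from any particular DSP or compiler.

```python
def pack_octets(octets):
    """Pack 8-bit UART octets two per 16-bit word (low octet first, by assumption)."""
    words = []
    for i, octet in enumerate(octets):
        if i % 2 == 0:
            words.append(octet & 0xFF)          # start a new word, low half
        else:
            words[-1] |= (octet & 0xFF) << 8    # fill the high half
    return words

def get_octet(words, index):
    """Fetch octet number 'index' from the packed buffer with shift and mask."""
    word = words[index // 2]
    return (word >> (8 * (index % 2))) & 0xFF
```

Every access to an individual octet costs a divide, a shift and a mask, which is the sort of overhead a byte-addressable CPU never sees.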

{compilers, Assemblers, Operating Systems, File systems, ISRs, DPC's,
GUIs, synchronization, multi-threaded shared libraries, ...} not so much.

But for the applications they are suited for, they might save more than
70% of the power needed by a GBOoO at less than 30% the die area (and
cost). In other words, they pay a smaller von Neumann tax. About the
only thing that can be lower area and power costs would be hardwired algorithms {Texture, Rasterization, Interpolation, ...} which GPUs
have figured out.

A DSP core might
be doing two separate MAC operations along with matching loads and
stores, where these memory operations are for different memory areas and
buses, and with different rules for auto-incrementing, cyclic wrapping,
etc. There's a count register to decrement and check for 0, breaking
out of the loop. VLIW-style DSP's can let you do the whole lot in one
or two instructions.


From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu May 28 10:38:35 2026

From Newsgroup: comp.arch

On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:

On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some cyphers,
store& forward network controllers}. These are the kinds of
algorithms that fit the VLIW mantra well.

MIMD is not VLIW, though. MIMD means the processors are decoupled and
run asynchronously. At least I thought that’s what it meant. VLIW
implies some kind of coupling, while still allowing different units to perform different operations, unlike SIMD.

Then there is MISD, aka dataflow.

The naming is based on the number of independent instruction pointers.
VLIW has 1 IP, MIMD has multiple IPs.


From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu May 28 10:45:27 2026

From Newsgroup: comp.arch

On 2026-May-28 10:38, EricP wrote:

On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:

On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some cyphers,
store& forward network controllers}. These are the kinds of
algorithms that fit the VLIW mantra well.

MIMD is not VLIW, though. MIMD means the processors are decoupled and
run asynchronously. At least I thought that’s what it meant. VLIW
implies some kind of coupling, while still allowing different units to
perform different operations, unlike SIMD.

Then there is MISD, aka dataflow.

The naming is based on the number of independent instruction pointers.
VLIW has 1 IP, MIMD has multiple IPs.

I added the word independent above because Mill has 2 IP but
they are not independent - it has one branch destination IP
and then a bidirectional instruction fetch.


From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu May 28 18:01:06 2026

From Newsgroup: comp.arch

Lawrence D’Oliveiro <ldo@nz.invalid> posted:

On Thu, 28 May 2026 01:33:29 GMT, MitchAlsup wrote:

On Wed, 27 May 2026 22:39:39 -0000 (UTC), Lawrence D’Oliveiro wrote:

MIMD is not VLIW,

You cannot have VLIW without MIMD. You can have MIMD without VLIW.

VLIW implies multiple function units, of course. The “MI” in “MIMD” implies multiple instruction fetch/decode units, surely. Otherwise
it would be just “superscalar”.

MIMD means the processors are decoupled and run asynchronously.

Not necessarily, one can have a single core that is MIMD all by itself.

Umm, what does the “M” stand for, again?

Multi-instruction or Multi-data

A single core can perform multi-instruction on multi-data.

VLIW implies some kind of coupling, while still allowing different
units to perform different operations, unlike SIMD.

Fisher's VLIW requires that the MI part has been compiled such that
the DECODE unit can decode and execute all the instructions
simultaneously.

If by “simultaneously” you mean “in lockstep” rather than just “concurrently”, then “MI” on its own does not imply that.

It is not supposed to, that is what VLIW stands for.

Then there is MISD, aka dataflow.

Does "real" Data-flow even have instructions ? or flow control ?

I’m not familiar with hardware uses of it, but I can point at software systems that implement it. Yes, they have instructions, and flow
control.

See Arvind's data-flow work at MIT.

From MitchAlsup@user5857@newsgrouper.org.invalid to comp.arch on Thu May 28 18:04:12 2026

From Newsgroup: comp.arch

EricP <ThatWouldBeTelling@thevillage.com> posted:

On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:

On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some cyphers,
store& forward network controllers}. These are the kinds of
algorithms that fit the VLIW mantra well.

MIMD is not VLIW, though. MIMD means the processors are decoupled and
run asynchronously. At least I thought that’s what it meant. VLIW
implies some kind of coupling, while still allowing different units to perform different operations, unlike SIMD.

Then there is MISD, aka dataflow.

The naming is based on the number of independent instruction pointers.
VLIW has 1 IP, MIMD has multiple IPs.

Burroughs BSP had a single IP orchestrating 16 'cores' {more GPU-like}
and was considered MIMD because many instructions were executed
simultaneously each execution being performed on its own data.

From EricP@ThatWouldBeTelling@thevillage.com to comp.arch on Thu May 28 14:31:21 2026

From Newsgroup: comp.arch

On 2026-May-28 14:04, MitchAlsup wrote:

EricP <ThatWouldBeTelling@thevillage.com> posted:

On 2026-May-27 18:39, Lawrence D’Oliveiro wrote:

On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some cyphers,
store& forward network controllers}. These are the kinds of
algorithms that fit the VLIW mantra well.

MIMD is not VLIW, though. MIMD means the processors are decoupled and
run asynchronously. At least I thought that’s what it meant. VLIW
implies some kind of coupling, while still allowing different units to
perform different operations, unlike SIMD.

Then there is MISD, aka dataflow.

The naming is based on the number of independent instruction pointers.
VLIW has 1 IP, MIMD has multiple IPs.

Burroughs BSP had a single IP orchestrating 16 'cores' {more GPU-like}
and was considered MIMD because many instructions were executed simultaneously each execution being performed on its own data.

That's SIMD then, like ILLIAC-IV.


From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.arch on Thu May 28 14:35:11 2026

From Newsgroup: comp.arch

On 5/27/2026 2:23 PM, MitchAlsup wrote:

David Brown <david.brown@hesbynett.no> posted:

On 26/05/2026 07:53, Lawrence D’Oliveiro wrote:

A backgrounder from the ever-dependable Asianometry channel
<https://www.youtube.com/watch?v=J7157XB8rxc> on the history of VLIW,
beginning with the PhD student who invented the concept, who then
went on to found a company (Multiflow) to capitalize on it as part of
the short-lived “mini-supercomputer” boomlet of the early-to-mid
1980s.

It was a truly tough concept to prove in practice. And in the end it
seems to have been in vain. Seems the infamous HP/Intel Itanium
project also took in some ex-employees of the mini-super companies.

So what happened? RISC happened. That turned out to be a much more
practical way of achieving leaps in performance.

I think I saw someone in the video comments say VLIW is still used
in DSP. But I think they mean SIMD, not VLIW.

I have not watched the video. But it is certainly the case that some
DSP's are referred to as VLIW, with good reason. The key feature of
powerful DSP's is not that they can do SIMD-style operations (they can
often do that as well), but that they can do multiple different
operations in the same cycle, with explicit control.

MIMD--DSPs are designed to run small kernels of code where each cycle
the core can {access memory, crunch some numbers, consider a branch}
Which makes them ideal for {filtering, FFTing, some cyphers, store&
forward network controllers}. These are the kinds of algorithms that
fit the VLIW mantra well.

Basically, kind of akin to compute shaders...

[...]

From George Neuner@gneuner2@comcast.net to comp.arch on Thu May 28 18:02:05 2026

From Newsgroup: comp.arch

On Thu, 28 May 2026 01:33:29 GMT, MitchAlsup
<user5857@newsgrouper.org.invalid> wrote:

Lawrence D’Oliveiro <ldo@nz.invalid> posted:

On Wed, 27 May 2026 21:23:49 GMT, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some cyphers,
store& forward network controllers}. These are the kinds of
algorithms that fit the VLIW mantra well.

MIMD is not VLIW,

You cannot have VLIW without MIMD. You can have MIMD without VLIW.

VLIW is multiple /operations/ in a single instruction. Ivan Godard
coined "MOMD" to describe this.

MIMD is multiple instructions - any or each of which may be VLIW.

From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Fri May 29 02:35:19 2026

From Newsgroup: comp.arch

On Thu, 28 May 2026 14:35:11 -0700, Chris M. Thomasson wrote:

On 5/27/2026 2:23 PM, MitchAlsup wrote:

MIMD--DSPs are designed to run small kernels of code where each
cycle the core can {access memory, crunch some numbers, consider a
branch} Which makes them ideal for {filtering, FFTing, some
cyphers, store& forward network controllers}. These are the kinds
of algorithms that fit the VLIW mantra well.

Basically, kind of akin to compute shaders..

Given that the specs for such things regularly talk about “pipelines”,
you might say such shader ensembles fit the MISD/dataflow paradigm ...

From Lawrence D’Oliveiro@ldo@nz.invalid to comp.arch on Fri May 29 02:36:26 2026

From Newsgroup: comp.arch

On Thu, 28 May 2026 14:31:21 -0400, EricP wrote:

On 2026-May-28 14:04, MitchAlsup wrote:

EricP <ThatWouldBeTelling@thevillage.com> posted:

The naming is based on the number of independent instruction
pointers. VLIW has 1 IP, MIMD has multiple IPs.

Burroughs BSP had a single IP orchestrating 16 'cores' {more
GPU-like} and was considered MIMD because many instructions were
executed simultaneously each execution being performed on its own
data.

That's SIMD then, like ILLIAC-IV.

I would agree.
