sometimes i got such pieces of code like

if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}

technically i would need to add elses - but the question is, if i do
that, does the code really have a chance to be slower without else
(except on some very primitive compilers)? not adding else makes the
code shorter.. so i'm not really sure which is better
On 31/10/2024 12:11, fir wrote:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses - but the question is whether to do that
Why not ...
switch (n)
{
    case 1:
        /* something */
        break;
    case 2:
        /* etc ... */
    default:
        /* something else */
}
... ?
does the code really have a chance to be slower without else (except on
some very primitive compilers)?
not adding else makes the code shorter.. so i'm not really sure which
is better
The size of teh [sic] source code won't make any difference to the size
of the executable - so aim for readability first and foremost.
Richard Harnden wrote:
switch is literally a flawed construction (i was even writing, or at
least thinking about, why, last time - but sadly i literally forgot my
arguments; i would need to find those notes)
so i forgot why, but not that it is flawed - i mean the form of
this c switch has some error
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
but the question is, if i do that, does the code really have a
chance to be slower without else (except on some very
primitive compilers)? not adding else makes the code
shorter.. so i'm not really sure which is better
I am all for code literalism, and want it to emphasize the
most natural method of execution. Therefore, I usually
append `return' or `goto' to each then-block in a tabular
if-sequence as above. I believe the performance will depend
on the compiler.
On 31/10/2024 12:11, fir wrote:
std::unreachable();
On 2024-10-31, fir wrote:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses - but the question is whether to do that
does the code really have a chance to be slower without else (except on
some very primitive compilers)?
In the above, all conditionals are always checked -- that is, the truth
of a previous conditional statement has no bearing on subsequent tests.
This leads to the potential of tests going off in directions you hadn't
necessarily anticipated.
However, 'if .. else if .. else' will only check subsequent conditionals
if the prior statements were false. So for the case that n=2, you're
only ever testing the two cases "if (n==1)" (which is false) and
"else if (n==2)". The computer just skips to the end of the set of
statements.
Given this MWE (my own terrible code aside ;) ):
#include <stdio.h>

int main(void){
  int n=0;
  printf ("all if, n=%d\n",n);
  if (n==0) { printf ("n: %d\n",n); n++;}
  if (n==1) { printf ("n: %d\n",n); n++;}
  if (n==2) { printf ("n: %d\n",n); n++;}
  if (n==3) { printf ("n: %d\n",n); n++;}
  if (n==4) { printf ("n: %d\n",n); n++;}
  printf ("all if completed, n=%d\n",n);
  n=3;
  printf ("with else if, n=%d\n",n);
  if (n==0) { printf ("n: %d\n",n); n++;}
  else if (n==1) { printf ("n: %d\n",n); n++;}
  else if (n==2) { printf ("n: %d\n",n); n++;}
  else if (n==3) { printf ("n: %d\n",n); n++;}
  else { printf ("n: %d\n",n); n++;}
  printf ("with else if completed, n=%d\n",n);
  return 0;
}
You'll get the output:
all if, n=0
n: 0
n: 1
n: 2
n: 3
n: 4
all if completed, n=5
with else if, n=3
n: 3
with else if completed, n=4
HTH :)
Dan Purgert wrote:
On 2024-10-31, fir wrote:
i do not modify n in those {} blocks, so this example is not much relevant
my question is more like: what is a matter of better style
switch(a);
case(1) {}
case(2) {}
case(3) {}
case(4) {}
case(5) {}
On 2024-10-31, fir wrote:
Dan Purgert wrote:
On 2024-10-31, fir wrote:
i do not modify n in those {} blocks, so this example is not much relevant
I'm using that as a simplified case to force the issue. "n" could be modified anywhere, just so long as it is "between" any two of the test
cases being checked.
my question is more like: what is a matter of better style
If it is a series of related conditions, then "if .. else if .. else".
On 10/31/24 09:15, Anton Shepelev wrote:
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
He has indicated that the value of n is not changed inside any of the
if-clauses. A sufficiently sophisticated compiler could notice that
fact, and also that each of the conditions tests the same variable, and
as a result it could generate the same kind of code as if it had been
written with 'else', so it won't generate unnecessary condition tests.
It might, in fact, generate the same kind of code which would have been
generated if it had been coded properly, as a switch statement, so it
might use a jump table, if appropriate.
But it's better to write it as a switch statement in the first place, so
you don't have to rely upon the compiler being sufficiently
sophisticated to get the best results.
There are several clear patterns here: you're testing the same variable
'n' against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers or
even label pointers could be used).
Bart wrote:
so in short this group seems to have no conclusion, but is tolerant of
various approaches, as it seems
imo the else ladder is like the most proper, but i don't like it optically,
switch case i also don't like (as far as i remember i never use it in
my code; for years i haven't used even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended, but it's fully not clear how)
as to those pointer tables i'm not sure, but i measured it once and
it was (not sure as to this, as i don't remember exactly) slow, maybe
dependent on architecture, so it's not worth using (if i remember
correctly)
On 01/11/2024 11:32, fir wrote:
Well, personally I don't like that repetition, that's why I mentioned
the patterns. You're writing 'n' 5 times, '==' 5 times, and you're
writing out the numbers 1, 2, 3, 4, 5.
I also don't like the lack of exclusivity.
However I don't need to use C. If those 'somethings' were simple, or
were expressions, I could use syntax like this:
(n | s1, s2, s3, s4, s5)
If they were more elaborate statements, I would use a heavier syntax,
but still one where 'n' is only written once, and I don't need to repeat '=='.
In the C version, you could mistakenly write 'm' instead of 'n', or '=' instead of '=='; it's more error prone, and a compiler might not be able
to detect it.
In C, you could probably do something like this:

#define or else if

if (x == a) {}
or (x == b) {}
or (x == c) {}
Bart wrote:
Well, personally I don't like that repetition, that's why I mentioned
the patterns. You're writing 'n' 5 times, '==' 5 times, and you're
writing out the numbers 1, 2, 3, 4, 5.
I also don't like the lack of exclusivity.
However I don't need to use C. If those 'somethings' were simple, or
were expressions, I could use syntax like this:
(n | s1, s2, s3, s4, s5)
on C ground, more suitable is

{s1,s2,s3,s4,s5}[n]
//which is just array indexing
On 01/11/2024 12:55, fir wrote:
on C ground, more suitable is
{s1,s2,s3,s4,s5}[n]
//which is just array indexing
No, it's specifically not array indexing, as only one of s1 - s5 is
evaluated, or nothing is when n is not in range, e.g. n is 100.
You could try something like that in C:

  int x;
  x = ((int[]){(puts("a"),10), (puts("b"),20), (puts("c"),30), (puts("d"),40)})[3];
  printf("X=%d\n", x);
The output is:
a
b
c
d
X=40
Showing that all elements are evaluated first. If the index is 100, the
result is also undefined.
Bart wrote:
On 01/11/2024 12:55, fir wrote:
:-O
what is this? first time i see such a thing
fir wrote:
Bart wrote:
i'm surprised that it works - but in fact i meant that this syntax is
old-C compatible, but such a thing like
{printf("ONE"), printf("TWO"), printf("THREE")} [2]

shouldn't evaluate all - just the one that is selected,
like array tab[23] doesn't evaluate anything other than tab[23]
On 01/11/2024 14:17, fir wrote:
It's a 'compound literal'. It allows you to have the same {...}
initialisation data format, but anywhere, not just when initialising.
However it always needs a cast:

  (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2];

This prints ONETWOTHREE; it also then indexes the 3rd value of the
array, which is 5, as returned by printf, so this:

  printf("%d\n", (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2]);

prints ONETWOTHREE5
On 01/11/2024 16:59, Bart wrote:
It's a 'compound literal'. It allows you to have the same {...}
initialisation data format, but anywhere, not just when initialising.
However it always needs a cast:

  (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2];

This prints ONETWOTHREE; it also then indexes the 3rd value of the
array, which is 5, as returned by printf, so this:

  printf("%d\n", (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2]);

prints ONETWOTHREE5
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each has
to have some side-effect to illustrate that.
A true N-way-select construct (C only really has ?:) would evaluate only one, and would deal with an out-of-range condition.
(In my implementations, a default/else branch value must be provided if
the whole thing is expected to return a value.)
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each has
to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
You are free to choose the
rules you want for your own language, but you are not free to dictate
what you think the rules should be for others. (You are welcome to /opinions/, of course.)
(In my implementations, a default/else branch value must be provided
if the whole thing is expected to return a value.)
OK, if that's what you want. My preference, if I were putting together what /I/ thought was an ideal language for /my/ use, would be heavy use
of explicit specifications and contracts for code, so that a
default/else branch is either disallowed (if the selection covers
all legal values) or required (if the selection is abbreviated). A
default value "just in case" is, IMHO, worse than useless.
On 31/10/2024 19:16, James Kuyper wrote:
On 10/31/24 09:15, Anton Shepelev wrote:
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
He has indicated that the value of n is not changed inside any of the
if-clauses. A sufficiently sophisticated compiler could notice that
fact, and also that each of the conditions is on the same variable, and
as a result it could generate the same kind of code as if it had been
written with 'else', so it won't generate unnecessary condition tests.
It might, in fact, generate the same kind of code which would have been
generated if it had been coded properly, as a switch statement, so it
might use a jump table, if appropriate.
But it's better to write it as a switch statement in the first place, so
you don't have to rely upon the compiler being sufficiently
sophisticated to get the best results.
I disagree entirely.
It is best to write the code in the way that makes most sense -
whatever gives the best clarity and makes the programmer's intentions
obvious to readers, and with the least risk of errors. Consider the maintainability of the code - is it likely to be changed in the
future, or adapted and re-used in other contexts? If so, that should
be a big influence on how you structure the source code. Can a
different structure make it less likely for errors to occur unnoticed?
For example, if the controlling value can be an enumeration then with
a switch, a good compiler can check if there are accidentally
unhandled cases (and even a poor compiler can check for duplicates).
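As a concrete illustration of that last point, here is a sketch (my names) of an enum-controlled switch; with gcc or clang, -Wall enables -Wswitch, which warns if a case for one of the enumerators is deleted:

```c
#include <assert.h>
#include <string.h>

enum colour { RED, GREEN, BLUE };

/* No default: deleting any case below draws a -Wswitch warning,
   because an enumerator would then be unhandled. */
const char *colour_name(enum colour c) {
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    return "";   /* unreachable for valid enum values */
}
```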
On 11/1/24 04:56, David Brown wrote:
On 31/10/2024 19:16, James Kuyper wrote:
On 10/31/24 09:15, Anton Shepelev wrote:
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
He has indicated that the value of n is not changed inside any of the
if-clauses. A sufficiently sophisticated compiler could notice that
fact, and also that each of the conditions is on the same variable, and
as a result it could generate the same kind of code as if it had been
written with 'else', so it won't generate unnecessary condition tests.
It might, in fact, generate the same kind of code which would have been
generated if it had been coded properly, as a switch statement, so it
might use a jump table, if appropriate.
But it's better to write it as a switch statement in the first place, so
you don't have to rely upon the compiler being sufficiently
sophisticated to get the best results.
I disagree entirely.
It is best to write the code in the way that makes most sense -
whatever gives the best clarity and makes the programmer's intentions
obvious to readers, and with the least risk of errors. Consider the
maintainability of the code - is it likely to be changed in the
future, or adapted and re-used in other contexts? If so, that should
be a big influence on how you structure the source code. Can a
different structure make it less likely for errors to occur unnoticed?
For example, if the controlling value can be an enumeration then with
a switch, a good compiler can check if there are accidentally
unhandled cases (and even a poor compiler can check for duplicates).
I don't see those criteria as conflicting with my advice. A switch seems
to me to be unambiguously the clearest way of writing this logic, for
precisely the same reason it also makes it easier for unsophisticated compilers to optimize it - what needs to be done is clearer both to the compiler and to the human reader.
David Brown wrote:
It is best to write the code in the way that makes most sense - whatever
gives the best clarity and makes the programmer's intentions obvious to
readers, and with the least risk of errors.
the fact is it is somewhat hard to say which is more obvious to readers
if(key=='A') Something();
else if(key=='B') Something();
else if(key=='C') Something();
else if(key=='D') Something();
or
if(key=='A') Something();
if(key=='B') Something();
if(key=='C') Something();
if(key=='D') Something();
imo the second is more for humans, but logically it's a bit different
because the else chain only goes forward on "false", while a new statement
runs on both "true" and "false"
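fir's distinction is real and observable when a branch body modifies the tested variable; a small sketch (the names and the key-mutation are mine, purely to expose the difference):

```c
#include <assert.h>
#include <string.h>

/* With bare ifs, setting key to 'B' inside the 'A' branch makes the
   'B' branch run as well; with else-if it cannot. */
int run_bare_ifs(char key, char *log) {
    int n = 0;
    if (key == 'A') { strcat(log, "A"); key = 'B'; n++; }
    if (key == 'B') { strcat(log, "B"); n++; }        /* also taken! */
    return n;
}

int run_else_chain(char key, char *log) {
    int n = 0;
    if (key == 'A')      { strcat(log, "A"); key = 'B'; n++; }
    else if (key == 'B') { strcat(log, "B"); n++; }   /* skipped */
    return n;
}
```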
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each
has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
In general, an if-else-if chain (which was the point of the OP), would evaluate only one branch.
So would a switch-case construct if sensibly
implemented (in C's version, anything goes).
The same applies to C's c?a:b operator: only one of a or b is evaluated,
not both.
(This is also why implementing if, switch, ?: via functions, which lots are keen to do in the reddit PL forum, requires closures, lazy evaluation or other advanced features.)
You are free to choose the rules you want for your own language, but
you are not free to dictate what you think the rules should be for
others. (You are welcome to /opinions/, of course.)
(In my implementations, a default/else branch value must be provided
if the whole thing is expected to return a value.)
OK, if that's what you want. My preference, if I were putting
together what /I/ thought was an ideal language for /my/ use, would be
heavy use of explicit specifications and contracts for code, so that a
default/else branch is either disallowed (if the selection
covers all legal values) or required (if the selection is
abbreviated). A default value "just in case" is, IMHO, worse than
useless.
All such multiway constructs in my languages (there are 4, one of which
does the job of both 'if' and C's ?:) have an optional else branch. A
missing 'else' has a notional 'void' type.
But it becomes mandatory if the whole thing returns a value, to satisfy
the type system, because otherwise it will try and match with 'void'.
SOMETHING needs to happen when none of the branches are executed; what
value would be returned then? The behaviour needs to be defined. You
don't want to rely on compiler analysis for this stuff.
In C on the other hand, the ':' of '?:' is always needed, even when it
is not expected to yield a value. Hence you often see things like
this:
p == NULL ? puts("error"): 0;
Here, gcc at least, also requires the types of the two branches to
match, even though the whole construct yields no common value.
Meanwhile
I allow this (if I was keen on a compact form):
(p = nil | print "error")
No else is needed.
Bart wrote:
ral clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be use
so in short this group seems to have no conclusion but is tolerant
of various approaches as it seems
imo the else ladder is like most proper but i don't like it
optically; switch-case i also don't like (as far as i remember i
never use it in my code, for years i haven't used even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended, but it's fully not clear how)
On 01/11/2024 20:47, Bart wrote:
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each
has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
Yes, it is.
I don't disagree that such an "select one of these and evaluate only
that" construct can be a useful thing, or a perfectly good alternative
to an "evaluate all of these then select one of them" construct. But you are completely wrong to think that one of these two is somehow the "true" or only correct way to have a selection.
In some languages, the construct for "A or B" will evaluate both, then
"or" them. In other languages, it will evaluate "A" then only evaluate
"B" if necessary. In others, expressions "A" and "B" cannot have side-effects, so the evaluation or not makes no difference. All of
these are perfectly valid design choices for a language.
In general, an if-else-if chain (which was the point of the OP), would
evaluate only one branch.
It evaluates all the conditionals down the chain until it hits a "true" result, then evaluates the body of the "if" that matches, then skips the rest.
(Of course generated code can evaluate all sorts of things in different orders, as long as observable behaviour - side-effects - are correct.)
So would a switch-case construct if sensibly implemented (in C's
version, anything goes).
C's switch is perfectly simply and clearly defined. It is not "anything goes". The argument to the switch is evaluated once, then control jumps
to the label of the switch case, then evaluation continues from that point. It is totally straight-forward.
You might not like the "fall-through" concept or the way C's switch does
not quite fit with structured programming. If so, I'd agree entirely.
The requirement for lots of "break;" statements in most C switch uses is
a source of countless errors in C coding and IMHO a clear mistake in the language design. But that does not hinder C's switch statements from
being very useful, very easy to understand (when used sensibly), and
with no doubts about how they work (again, when used sensibly).
The same applies to C's c?a:b operator: only one of a or b is
evaluated, not both.
You are conflating several ideas, then you wrote something that you
/know/ is pure FUD about C's switch statements.
So writing "The same
applies" makes no sense.
You are, of course, correct that in "c ? a : b", "c" is evaluated first
and then one and only one of "a" and "b".
(This is also why implementing if, switch, ?: via functions, which lots
are keen to do in the reddit PL forum, requires closures, lazy
evaluation or other advanced features.)
Yes, you'd need something like that to implement such "short-circuit" operators using functions in C. In other languages, things may be different.
But it becomes mandatory if the whole thing returns a value, to
satisfy the type system, because otherwise it will try and match with
'void'.
Your language, your choice.
I'd question the whole idea of having a
construct that can evaluate to something of different types in the first place, whether or not it returns a value, but that's your choice.
SOMETHING needs to happen when none of the branches are executed; what
value would be returned then? The behaviour needs to be defined. You
don't want to rely on compiler analysis for this stuff.
In my hypothetical language described above, it never happens that none
of the branches are executed.
Do you feel you need to write code like this?
const char * flag_to_text_A(bool b) {
if (b == true) {
return "It's true!";
} else if (b == false) {
return "It's false!";
} else {
return "Schrödinger's cat has escaped!";
}
}
When you have your "else" or "default" clause that is added for
something that can't ever happen, how do you test it?
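One common C answer to David's question is to make the impossible branch trap rather than fabricate a value; a sketch under the assumption that aborting on a can't-happen case is acceptable (C23 also offers unreachable()):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Instead of inventing a return value for the impossible case, trap
   it loudly; this version takes 0/1 to model a bool. */
const char *flag_to_text(int b) {
    switch (b) {
    case 1:  return "It's true!";
    case 0:  return "It's false!";
    default: abort();   /* can't happen for a genuine bool */
    }
}
```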
In C on the other hand, the ':' of '?:' is always needed, even when it
is not expected to yield a value. Hence you often see this things like
this:
p == NULL ? puts("error"): 0;
Given that the ternary operator chooses between two things, it seems
fairly obvious that you need two alternatives to choose from - having a choice operator without at least two choices would be rather useless.
I can't say I have ever seen the ternary operator used like this. There
are a few C programmers that like to code with everything as
expressions, using commas instead of semicolons, but they are IMHO
mostly just being smart-arses. It's a lot more common to write :
if (!p) puts("error");
Meanwhile I allow this (if I was keen on a compact form):
(p = nil | print "error")
No else is needed.
In C you could write :
p == NULL || puts("error");
which is exactly the same structure.
I think all of these, including your construct in your language, are smart-arse choices compared to a simple "if" statement, but personal
styles and preferences vary.
On 02/11/2024 11:41, David Brown wrote:
On 01/11/2024 20:47, Bart wrote:
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common
method would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so
each has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would
evaluate only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
Yes, it is.
Then we disagree on what 'multi-way' select might mean. I think it
means branching, even if notionally, on one-of-N possible code paths.
The whole construct may or may not return a value. If it does, then
one of the N paths must be a default path.
...
fir <fir@grunge.pl> writes:
Bart wrote:
ral clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be use
so in short this group seems to have no conclusion but is tolerant
of various approaches as it seems
imo the else ladder is like most proper but i don't like it
optically; switch-case i also don't like (as far as i remember i
never use it in my code, for years i haven't used even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended, but it's fully not clear how)
I think you should have confidence in your own opinion. All
you're getting from other people is their opinion about what is
easier to understand, or "clear", or "readable", etc. As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
There is a case where using 'else' is necessary, when there is a
catchall action for circumstances matching "none of the above".
Alternatively a 'break' or 'continue' or 'goto' or 'return' may
be used to bypass subsequent cases, but you get the idea.
With the understanding that I am offering no more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown above.
Bart wrote:
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown
above.
as to if: when thinking of it, the if construct has such parts
if X then S else E
and the keyword if is not necessary imo, as the expression X returns a
logical value, so then can be used on it without if
X then {}
X else {}
i would prefer to denote (at least temporarily) then as ->
and else as ~>, then you can build constructs like
a -> b -> c -> d ~> e ~> f
where the arrows take the logical value of their left side
(if a true then b, if b true then c, if c true then d, if
d false then e, and if e false then f)
but sometimes one needs to attach an else to some previous expression,
and i think about how it could be done; maybe just parentheses can be used
a (->b->c) ~>z
if a true then b and if b true then c, but if a false then z
Bart wrote:
...
as to this switch: as i said, C has some syntax that resembles switch
and it is
[2] { printf("one"), printf("two"), printf("three") }
i mean it is like this compound something you posted
{ printf("one"), printf("two"), printf("three") } [2]
but with the "key" on the left, to illustrate the analogy to
switch(n) {case 0: printf("one"); case 1: printf("two"); case 2: printf("three"); }
imo the resemblance gives something to think about
the difference is this compound (array-like) example doesn't use defined
keys, so it seems some should be added
[n] {{1: printf("one")},{2: printf("two")},{3: printf("three")} }
so those deductions on switch give the above imo
the question is if some things couldn't be omitted for simplicity
[key] {'A': printf("one"); 'B': printf("two"); 'C': printf("three"); }
something like that
(instead of
switch(key)
{
case 'A': printf("one"); break;
case 'B': printf("two"); break;
case 'C': printf("three"); break;
}
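For what it's worth, the nearest standard-C analogue of fir's "[key] { ... }" notation may be a lookup table built with C99 designated initializers; a sketch (the function name is mine):

```c
#include <assert.h>
#include <string.h>

/* Keys appear next to their values, as in fir's sketch; unlisted keys
   fall through to a default. */
const char *word_for(unsigned char key) {
    static const char *const words[256] = {
        ['A'] = "one",
        ['B'] = "two",
        ['C'] = "three",
    };
    return words[key] ? words[key] : "?";   /* default path */
}
```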
On 03/11/2024 00:26, fir wrote:
Bart wrote:
...
as to this switch as i said the C jas some syntax that resembles
switch and it is
[2] { printf("one"), printf("two"), printf("three") }
i mean it is like this compound sometheng you posted
{ printf("one"), printf("two"), printf("three") } [2]
but with "key" on the left to ilustrate the analogy to
swich(n) {case 0: printf("one"); case 1: printf("two"); case 2:
rintf("three") }
imo the resemblance gives to think
the difference is this compound (array-like) example dont uses defined
keys so it semms some should be added
[n] {{1: printf("one")},{2: printf("two")},{3: printf("three")} }
so those deduction on switch gives the above imo
the question is if some things couldnt be ommitted for simplicity
[key] {'A': printf("one"); 'B': printf("two"); 'C': printf("three"}; }
something like that
(insted of
switch(key)
{
case 'A': printf("one"); break;
case 'B': printf("two"); break;
case 'C': printf("three"}; break;
}
Here the switch looks clearer. Write it with 300 cases instead of 3,
then that becomes obvious.
The first time I wrote a big C program, I used a syntax like this:
switch (x)
when 'A', 'B' then printf("one")
when 'C' then printf("two")
else printf("three")
endsw
This needed to be converted to normal C before compiling, but the macro system wasn't quite up to the job (even making use of gnu C, which allows for
lists of case labels).
Instead I used a script to do the conversion, which needed 1:1 line correspondence. The result was something like this:
switch (x) {
break; case 'A': case 'B': printf("one");
break; case 'C': printf("two");
break; default: printf("three");
}
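Bart's generated layout really is equivalent to an ordinary switch: a break; placed before the next case label terminates the previous case, and the very first break; is simply unreachable. A sketch:

```c
#include <assert.h>
#include <string.h>

/* The break before each case label ends the preceding case; the first
   break is dead code. Behaviour matches a conventional switch. */
const char *classify(int x) {
    const char *r;
    switch (x) {
    break; case 1: r = "one";
    break; case 2: r = "two";
    break; default: r = "other";
    }
    return r;
}
```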
On 03/11/2024 01:21, fir wrote:
Bart wrote:
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown
above.
as to if: when thinking of it, the if construct has such parts
if X then S else E
and the keyword if is not necessary imo, as the expression X returns a
logical value, so then can be used on it without if
X then {}
X else {}
i would prefer to denote (at least temporarily) then as ->
and else as ~>, then you can build constructs like
a -> b -> c -> d ~> e ~> f
where the arrows take the logical value of their left side
(if a true then b, if b true then c, if c true then d, if
d false then e, and if e false then f)
but sometimes one needs to attach an else to some previous expression,
and i think about how it could be done; maybe just parentheses can be used
a (->b->c) ~>z
if a true then b and if b true then c, but if a false then z
C already has this (I've added parentheses for clarity):
(a ? (b ? c : -) : z)
This shows you haven't provided a branch for b being false.
Also it's not clear if you intended for b to be evaluated twice; I've
assumed only once as it is nonsense otherwise.
Bart wrote:
C already has this (I've added parentheses for clarity):
(a ? (b ? c : -) : z)
This shows you haven't provided a branch for b being false.
coz you don't need to provide such a branch
in c you need to put that ":" each time? (if so, some design error was made,
as it would be better not to oblige its existence)
Bart wrote:
On 03/11/2024 00:26, fir wrote:
Bart wrote:
...
as to this switch: as i said, C has some syntax that resembles
switch and it is
[2] { printf("one"), printf("two"), printf("three") }
i mean it is like this compound something you posted
{ printf("one"), printf("two"), printf("three") } [2]
but with the "key" on the left, to illustrate the analogy to
switch(n) {case 0: printf("one"); case 1: printf("two"); case 2: printf("three"); }
imo the resemblance gives something to think about
the difference is this compound (array-like) example doesn't use defined
keys, so it seems some should be added
[n] {{1: printf("one")},{2: printf("two")},{3: printf("three")} }
so those deductions on switch give the above imo
the question is if some things couldn't be omitted for simplicity
[key] {'A': printf("one"); 'B': printf("two"); 'C': printf("three"); }
something like that
(instead of
switch(key)
{
case 'A': printf("one"); break;
case 'B': printf("two"); break;
case 'C': printf("three"); break;
}
Here the switch looks clearer. Write it with 300 cases instead of 3,
then that becomes obvious.
depends on what one understands by clearer - imo not
this []{;;;} at least is like logically derived from other c syntax
and as to switch-case: the word case is ok imo, but the word switch is
overall like wrong imo; switch could be better replaced by two
words, "select" and maybe "goto", as the switch that selects could use
select and the one that does goto could use the word goto
goto key;
'A': printf("a");
'B': printf("b");
'C': printf("c");
overall there is also the possibility to do it such a way
void foo()
{
"a" { printf("aaa"); } //definitions, not calls themselves
"b" { printf("bbb"); }
"c" { printf("ccc"); }
"a";"b";"c"; //calls (???)
// would need maybe some syntax to call it (many could be chosen)
// "a"() ? foo."a" ? foo.[key] ?
maybe this would be the best if established, as this is more syntactic "low level"
}
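fir's define-blocks-then-call-by-key sketch maps fairly naturally onto a table of function pointers, which is standard C (the "label pointers" route Bart mentioned earlier is a gcc extension); a sketch with illustrative names:

```c
#include <assert.h>
#include <string.h>

typedef const char *(*handler)(void);

/* The "definitions" from fir's sketch become ordinary functions... */
static const char *do_a(void) { return "aaa"; }
static const char *do_b(void) { return "bbb"; }
static const char *do_c(void) { return "ccc"; }

/* ...and the "call by key" becomes an indexed table lookup. */
const char *dispatch(unsigned char key) {
    static const handler table[256] = {
        ['a'] = do_a, ['b'] = do_b, ['c'] = do_c,
    };
    return table[key] ? table[key]() : "";   /* "" for unknown keys */
}
```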
On 02/11/2024 11:41, David Brown wrote:
On 01/11/2024 20:47, Bart wrote:
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common
method would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each
has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
Yes, it is.
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
I don't disagree that such an "select one of these and evaluate only
that" construct can be a useful thing, or a perfectly good alternative
to to an "evaluate all of these then select one of them" construct.
But you are completely wrong to think that one of these two is somehow
the "true" or only correct way to have a selection.
In some languages, the construct for "A or B" will evaluate both, then
"or" them. In other languages, it will evaluate "A" then only
evaluate "B" if necessary. In others, expressions "A" and "B" cannot
have side-effects, so the evaluation or not makes no difference. All
of these are perfectly valid design choices for a language.
Those are logical operators that may or may not short-circuit.
One feature of my concept of 'multi-way select' is that there is one or
more controlling expressions which determine which path is followed.
So, I'd be interested in what you think of as a multi-way select which
may evaluate more than one branch. Or was it that 'or' example?
In general, an if-else-if chain (which was the point of the OP),
would evaluate only one branch.
It evaluates all the conditionals down the chain until it hits a
"true" result, then evaluates the body of the "if" that matches, then
skips the rest.
I don't count evaluating the conditionals: here it is the branches that count (since it is one of those that is 'selected' via those
conditionals), and here you admit that only one is executed.
(Of course generated code can evaluate all sorts of things in
different orders, as long as observable behaviour - side-effects - are
correct.)
So would a switch-case construct if sensibly implemented (in C's
version, anything goes).
C's switch is perfectly simply and clearly defined. It is not
"anything goes". The argument to the switch is evaluated once, then
control jumps to the label of the switch case, then evaluation
continues from that point. It is totally straight-forward.
It's pretty much the complete opposite of straightforward, as you go on
to demonstrate.
C 'switch' looks like it might be properly structured if written
sensibly. The reality is different: what follows `switch (x)` is just
ONE C statement, often a compound statement.
Case labels can located ANYWHERE within that statement, including within nested statements (eg. inside a for-statement), and including
'default:', which could go before all the case labels!
The only place they can't go is within a further nested switch, which
has its own set of case-labels.
Control transfers to any matching case-label or 'default:' and just keeps executing code within that ONE statement, unless it hits 'break;'.
It is totally chaotic. This is what I mean by 'anything goes'. This is a valid switch statement for example: 'switch (x);'.
You can't use such a statement as a solid basis for a multi-way
construct that returns a value, since it is, in general, impossible to sensibly enumerate the N branches.
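Bart's claim that case labels can sit inside nested statements is easy to demonstrate with a Duff's-device-shaped switch that jumps into the middle of a loop; contrived but standard C (the function is my own illustration):

```c
#include <assert.h>

/* Case labels 1 and 2 live INSIDE the for loop's body; switch jumps
   straight into the loop, falls through, and the break exits the for
   (not the switch). */
int count_from(int n) {
    int count = 0;
    switch (n) {
    case 0:
        for (;;) {
    case 1:
            count++;    /* falls through to the next label */
    case 2:
            count++;
            break;      /* exits the for loop */
        }
    }
    return count;       /* 0 when no label matched */
}
```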
You might not like the "fall-through" concept or the way C's switch
does not quite fit with structured programming. If so, I'd agree
entirely.
Good.
The requirement for lots of "break;" statements in most C switch uses
is a source of countless errors in C coding and IMHO a clear mistake
in the language design. But that does not hinder C's switch
statements from being very useful, very easy to understand (when used
sensibly), and with no doubts about how they work (again, when used
sensibly).
The same applies to C's c?a:b operator: only one of a or b is
evaluated, not both.
You are conflating several ideas, then you wrote something that you
/know/ is pure FUD about C's switch statements.
It wasn't.
YOU wrote FUD when you called them straightforward. I would
bet you that the majority of C programmers don't know just how weird
switch is.
So writing "The same applies" makes no sense.
'The same applies' was in reference to this previous remark of mine:
"In general, an if-else-if chain (which was the point of the OP), would evaluate only one branch. So would a switch-case construct if sensibly implemented (in C's version, anything goes). "
You are, of course, correct that in "c ? a : b", "c" is evaluated
first and then one and only one of "a" and "b".
And here you confirm that it does in fact apply: only one branch is executed.
You can't apply it to C's switch as there is no rigorous way of even determining what is a branch. Maybe it is a span between 2 case labels?
But then, one of those might be in a different nested statement!
(This is also why implementing if, switch, ?: via functions, which lots
are keen to do in the reddit PL forum, requires closures, lazy
evaluation or other advanced features.)
Yes, you'd need something like that to implement such "short-circuit"
operators using functions in C. In other languages, things may be
different.
Yes, short-circuit operators would need the same features. That's why
it's easier to build this stuff into a core language than to try and
design a language where 90% of the features are there to implement what should be core features.
But it becomes mandatory if the whole thing returns a value, to
satisfy the type system, because otherwise it will try and match with
'void'.
Your language, your choice.
These things tend to come about because that is the natural order that
comes through. It's something I observed rather than decided.
I'd question the whole idea of having a construct that can evaluate
to something of different types in the first place, whether or not it
returns a value, but that's your choice.
If the result of a multi-way execution doesn't yield a value to be used, then the types don't matter.
If it does, then they DO matter, as they have to be compatible types in
a static language.
This is just common sense; I don't know why you're questioning it. (I'd quite like to see a language of your design!)
SOMETHING needs to happen when none of the branches are executed;
what value would be returned then? The behaviour needs to be defined.
You don't want to rely on compiler analysis for this stuff.
In my hypothetical language described above, it never happens that
none of the branches are executed.
Do you feel you need to write code like this?
const char * flag_to_text_A(bool b) {
if (b == true) {
return "It's true!";
} else if (b == false) {
return "It's false!";
} else {
return "Schrödinger's cat has escaped!";
}
}
When you have your "else" or "default" clause that is added for
something that can't ever happen, how do you test it?
I write code like this:
func F(b) =
if X then
A # 'return' is optional
elsif Y then
B
fi
end
As it is, it requires 'else' (because this is a value-returning function).
X Y A B are arbitrary expressions. The need for 'else' is determined
during type analysis. Whether it will ever execute the default path
would be up to extra analysis, that I don't do, and would anyway be done later.
You can't design a language like this where valid syntax depends on
compiler and what it might or might not discover when analysing the code.
The rule instead is simple: where a multi-path construct yields a value, then it needs the default branch, always.
A compiler /might/ figure out it isn't needed, and not generate that bit
of code. (Or as I suggested, it might insert a suitable branch.)
You seem to like putting the onus on compiler writers to have to analyse programs to the limit.
(Note that my example is for dynamic code; there X Y may only be known
at runtime anyway.)
In my languages, the last statement of a function can be arbitrarily
complex and nested; there could be dozens of points where a return value
is needed.
In C on the other hand, the ':' of '?:' is always needed, even when
it is not expected to yield a value. Hence you often see things
like this:
p == NULL ? puts("error"): 0;
Given that the ternary operator chooses between two things, it seems
fairly obvious that you need two alternatives to choose from - having
a choice operator without at least two choices would be rather useless.
It seems you are just arguing in the defence of C rather than
objectively, and being contradictory in the process.
For example, earlier you said I'm wrong to insist on a default path for multi-way ops when it is expected to yield a value. But here you say it
is 'obvious' for the ?: multi-way operator to insist on a default path
even when any value is not used.
This is on top of saying that I'm spreading 'FUD' about switch and that
is it really a perfectly straightforward feature!
Now *I* am wary of trusting your judgement.
I can't say I have ever seen the ternary operator used like this.
There are a few C programmers that like to code with everything as
expressions, using commas instead of semicolons, but they are IMHO
mostly just being smart-arses. It's a lot more common to write :
if (!p) puts("error");
Well, it happens, and I've seen it (and I've had to ensure my C compiler deals with it when it comes up, which it has). Maybe some instances of
it are hidden behind macros.
Meanwhile I allow this (if I was keen on a compact form):
(p = nil | print "error")
No else is needed.
In C you could write :
p == NULL || puts("error");
which is exactly the same structure.
This is new to me. So this is another possibility for the OP?
It's an untidy feature however; it's abusing || in similar ways to those
who separate things with commas to avoid needing a compound statement.
It is also error prone as it is unintuitive: you probably meant one of:
p != NULL || puts("error");
p == NULL && puts("error");
There are also limitations: what follows || or && needs to be something
that returns a type that can be coerced to an 'int' type.
(Note that the '|' in my example is not 'or'; it means 'then':
( c | a ) # these are exactly equivalent
if c then a fi
( c | a | b ) # so are these
if c then a else b fi
There is no restriction on what a and b are, statements or expressions, unless the whole returns some value.)
I think all of these, including your construct in your language, are
smart-arse choices compared to a simple "if" statement, but personal
styles and preferences vary.
C's if statement is rather limited. As it is only if-else,
if-else-if sequences must be emulated using nested if-else-(if-else-(if-else ...)).
Misleading indentation needs to be used to stop nested ifs disappearing
to the right. When a coding style mandates braces around if branches, an exception needs to be made for if-else-if chains (otherwise you will end
up with }}}}}}}... at the end).
And the whole thing cannot return a value; a separate ?: feature (whose branches must be expressions) is needed.
It is also liable to the 'dangling else' problem, and error prone due to
braces being optional.
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown above.
On 02/11/2024 21:44, Bart wrote:
I would disagree on that definition, yes. A "multi-way selection" would mean, to me, a selection of one of N possible things - nothing more than that. It is far too general a phrase to say that it must involve
branching of some sort ("notional" or otherwise).
And it is too general
to say if you are selecting one of many things to do, or doing many
things and selecting one.
The whole construct may or may not return a value. If it does, then
one of the N paths must be a default path.
No, that is simply incorrect. For one thing, you can say that it is perfectly fine for the selection construct to return a value sometimes
and not at other times.
It's fine if it never returns at all for some
cases. It's fine to give selection choices for all possible inputs.
It's fine to say that the input must be a value for which there is a
choice.
What I see here is that you don't like C's constructs (that may be for
good reasons, it may be from your many misunderstandings about C, or it
may be from your knee-jerk dislike of everything C related).
You have
some different selection constructs in your own language, which you /do/ like. (It would be very strange for you to have constructs that you
don't like in your own personal one-man language.)
One feature of my concept of 'multi-way select' is that there is one
or more controlling expressions which determine which path is followed.
Okay, that's fine for /your/ language's multi-way select construct. But other people and other languages may do things differently.
There are plenty of C programmers - including me - who would have
preferred to have "switch" be a more structured construct which could
not be intertwined with other constructs in this way. That does not
mean "switch" is not clearly defined - nor does it hinder almost every real-world use of "switch" from being reasonably clear and structured.
It does, however, /allow/ people to use "switch" in more complex and
less clear ways.
You are confusing "this makes it possible to write messy code" with a
belief that messy code is inevitable or required. And you are
forgetting that it is always possible to write messy or incomprehensible code in any language, with any construct.
You can't use such a statement as a solid basis for a multi-way
construct that returns a value, since it is, in general, impossible to
sensibly enumerate the N branches.
It is simple and obvious to enumerate the branches in almost all
real-world cases of switch statements. (And /please/ don't faff around with cherry-picked examples you have found somewhere as if they were representative of anything.)
So if I understand correctly, you are saying that chains of if/else, an imaginary version of "switch", and the C ternary operator all evaluate
the same things in the same way, while with C's switch you have no idea
what happens?
That is true, if you cherry-pick what you choose to
ignore in each case until it fits your pre-conceived ideas.
No, what you call "natural" is entirely subjective. You have looked at
a microscopic fraction of code written in a tiny proportion of
programming languages within a very narrow set of programming fields.
That's not criticism - few people have looked at anything more.
What I /do/ criticise is that your assumption that this almost
negligible experience gives you the right to decide what is "natural" or "true", or how programming languages or tools "should" work.
You need
to learn that other people have different ideas, needs, opinions or preferences.
I'd question the whole idea of having a construct that can evaluate
to something of different types in the first place, whether or not it
returns a value, but that's your choice.
If the result of a multi-way execution doesn't yield a value to be
used, then the types don't matter.
Of course they do.
This is just common sense; I don't know why you're questioning it.
(I'd quite like to see a language of your design!)
def foo(n) :
if n == 1 : return 10
if n == 2 : return 20
if n == 3 : return
That's Python, quite happily having a multiple choice selection that sometimes does not return a value.
Yes, that is a dynamically typed
language, not a statically type language.
std::optional<int> foo(int n) {
if (n == 1) return 10;
if (n == 2) return 20;
if (n == 3) return {};
}
That's C++, a statically typed language, with a multiple choice
selection that sometimes does not return a value - the return type
supports values of type "int" and non-values.
X Y A B are arbitrary expressions. The need for 'else' is determined
during type analysis. Whether it will ever execute the default path
would be up to extra analysis, that I don't do, and would anyway be
done later.
But if it is not possible for neither X nor Y to be true, then how
would you test the "else" clause? Surely you are not proposing that programmers be required to write lines of code that will never be
executed and cannot be tested?
You can't design a language like this where valid syntax depends on
compiler and what it might or might not discover when analysing the code.
Why not? It is entirely reasonable to say that a compiler for a
language has to be able to do certain types of analysis.
Anyone who is convinced that their own personal preferences are more "natural" or inherently superior to all other alternatives, and can't justify their claims other than saying that everything else is "a mess",
is just navel-gazing.
Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Bart wrote:
There are several clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be used)
so in short this group seems to have no conclusion but is tolerant
for various approaches as it seems
imo the else ladder is like most proper but i dont like it
optically, switch case i also dont like (used as far as i remember
never in my code, for years dont use even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended but its fully not clear how,
I think you should have confidence in your own opinion. All
you're getting from other people is their opinion about what is
easier to understand, or "clear", or "readable", etc. As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
There is a case where using 'else' is necessary, when there is a
catchall action for circumstances matching "none of the above".
Alternatively a 'break' or 'continue' or 'goto' or 'return' may
be used to bypass subsequent cases, but you get the idea.
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Bart wrote:
There are several clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be used)
so in short this group seems to have no conclusion but is tolerant
for various approaches as it seems
imo the else ladder is like most proper but i dont like it
optically, switch case i also dont like (used as far as i remember
never in my code, for years dont use even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended but its fully not clear how,
I think you should have confidence in your own opinion. All
you're getting from other people is their opinion about what is
easier to understand, or "clear", or "readable", etc. As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
There is a case where using 'else' is necessary, when there is a
catchall action for circumstances matching "none of the above".
Alternatively a 'break' or 'continue' or 'goto' or 'return' may
be used to bypass subsequent cases, but you get the idea.
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
[...]
Here, the question was, can:
if (c1) s1;
else if (c2) s2;
always be rewritten as:
if (c1) s1;
if (c2) s2;
[...]
On 04.11.2024 12:56, Bart wrote:
[...]
Here, the question was, can:
if (c1) s1;
else if (c2) s2;
always be rewritten as:
if (c1) s1;
if (c2) s2;
Erm, no. The question was even more specific.
It had (per example)
not only all ci disjunct but also defined as a linear sequence of
natural numbers! - In other languages [than "C"] this may be more
important since [historically] there were specific constructs for
that case; see e.g. 'switch' definitions in Simula, or the 'case'
statement of Algol 68, both mapping elements onto an array[1..N];
labels in the first case, and expressions in the latter case. So
in "C" we could at least consider using something similar, like,
say, arrays of function pointers indexed by those 'n'.
I'd suggest that by just pointing it out.)
I'm a bit astonished, BTW, about this huge emphasis on the topic
"opinions" in later posts of this thread. The OP asked (even in
the subject) about "practice" which actually invites if not asks
for providing opinions (besides practical experiences).
[...] As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
That the OP's example contained some clear patterns has already been
covered (I did so anyway).
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
Bart wrote:
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
fir wrote:
Bart wrote:
note in fact both array usage like tab[5] and function call like foo()
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
are analogues to switch case - as when you call functions the call is like switch and function definition sets are 'cases'
Bart wrote:
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
On 04/11/2024 15:06, fir wrote:
fir wrote:
Bart wrote:
note in fact both array usage like tab[5] and function call like foo()
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
are analogues to switch case - as when you call functions the call is
like switch and function definition sets are 'cases'
Yes, switch could be implemented via a table of label pointers, but it
needs a GNU extension.
For example this switch:
#include <stdio.h>
int main(void) {
for (int i=0; i<10; ++i) {
switch(i) {
case 7: case 2: puts("two or seven"); break;
case 5: puts("five"); break;
default: puts("other");
}
}
}
Could also be written like this:
#include <stdio.h>
int main(void) {
void* table[] = {
&&Lother, &&Lother, &&L27, &&Lother, &&Lother, &&L5,
&&Lother, &&L27, &&Lother, &&Lother};
for (int i=0; i<10; ++i) {
goto *table[i];
L27: puts("two or seven"); goto Lend;
L5: puts("five"); goto Lend;
Lother: puts("other");
Lend:;
}
}
(A compiler may generate something like this, although it will be range-checked if needed. In practice, small numbers of cases, or cases where the
values are too spread out, might be implemented as if-else chains.)
On 03/11/2024 17:00, David Brown wrote:
On 02/11/2024 21:44, Bart wrote:
I would disagree on that definition, yes. A "multi-way selection"
would mean, to me, a selection of one of N possible things - nothing
more than that. It is far too general a phrase to say that it must
involve branching of some sort ("notional" or otherwise).
Not really. If the possible options involving actions written in-line,
and you only want one of those executed, then you need to branch around
the others!
And it is too general to say if you are selecting one of many things
to do, or doing many things and selecting one.
Sorry, but this is the key part. You are not evaluating N things and selecting one; you are evaluating ONLY one of N things.
For X, it builds a list by evaluating all the elements, and returns the value of the last. For Y, it evaluates only ONE element (using internal switch, so branching), which again is the last.
You don't seem keen on keeping these concepts distinct?
The whole construct may or may not return a value. If it does, then
one of the N paths must be a default path.
No, that is simply incorrect. For one thing, you can say that it is
perfectly fine for the selection construct to return a value sometimes
and not at other times.
How on earth is that going to satisfy the type system? You're saying
it's OK to have this:
int x = if (randomfloat()<0.5) 42;
Or even this, which was discussed recently, and which is apparently
valid C:
int F(void) {
if (randomfloat()<0.5) return 42;
}
In the first example, you could claim that no assignment takes place
with a false condition (so x contains garbage). In the second example,
what value does F return when the condition is false?
You can't hide behind your vast hyper-optimising compiler; the language needs to say something about it.
My language will not allow it. Most people would say that that's a good thing. You seem to want to take the perverse view that such code should
be allowed to return garbage values or have undefined behaviour.
After all, this is C! But please tell me, what would be the downside of
not allowing it?
It's fine if it never returns at all for some
cases. It's fine to give selection choices for all possible inputs.
It's fine to say that the input must be a value for which there is a
choice.
What I see here is that you don't like C's constructs (that may be for
good reasons, it may be from your many misunderstandings about C, or
it may be from your knee-jerk dislike of everything C related).
With justification. 0010 means 8 in C? Jesus.
It's hardly knee-jerk either since I first looked at it in 1982, when my
own language barely existed. My opinion has not improved.
You have some different selection constructs in your own language,
which you /do/ like. (It would be very strange for you to have
constructs that you don't like in your own personal one-man language.)
It's a one-man language but most of its constructs and features are universal. And therefore can be used for comparison.
One feature of my concept of 'multi-way select' is that there is one
or more controlling expressions which determine which path is followed.
Okay, that's fine for /your/ language's multi-way select construct.
But other people and other languages may do things differently.
FGS, /how/ different? To select WHICH path or which element requires
some input. That's the controlling expression.
Or maybe with your ideal language, you can select an element of an array without bothering to provide an index!
There are plenty of C programmers - including me - who would have
preferred to have "switch" be a more structured construct which could
not be intertwined with other constructs in this way. That does not
mean "switch" is not clearly defined - nor does it hinder almost every
real-world use of "switch" from being reasonably clear and structured.
It does, however, /allow/ people to use "switch" in more complex and
less clear ways.
Try and write a program which takes any arbitrary switch construct (that usually means written by someone else, because obviously all yours will
be sensible), and cleanly isolates all the branches including the
default branch.
Hint: the lack of 'break' in a non-empty span between two case labels
will blur the line. So will a conditional break (example below unless
it's been culled).
You are confusing "this makes it possible to write messy code" with a
belief that messy code is inevitable or required. And you are
forgetting that it is always possible to write messy or
incomprehensible code in any language, with any construct.
I can't write that randomfloat example in my language.
I can't leave out
a 'break' in a switch statement (it's not meaningful). It is impossible
to do the crazy things you can do with switch in C.
Yes, with most languages you can write nonsense programs, but that
doesn't give the language a licence to forget basic rules and common
sense, and just allow any old rubbish even if clearly wrong:
int F() {
F(1, 2.3, "four", F,F,F,F(),F(F()));
F(42);
}
This is apparently valid C. It is impossible to write this in my language.
You can't use such a statement as a solid basis for a multi-way
construct that returns a value, since it is, in general, impossible
to sensibly enumerate the N branches.
It is simple and obvious to enumerate the branches in almost all
real-world cases of switch statements. (And /please/ don't faff
around with cherry-picked examples you have found somewhere as if they
were representative of anything.)
Oh, right. I'm not allowed to use counter-examples to lend weight to my comments. In that case, perhaps you shouldn't be allowed to use your sensible examples either. After all we don't know what someone will feed
to a compiler.
But, suppose C was upgraded so that switch could return a value. For
that, you'd need the value at the end of each branch. OK, here's a
simple one:
y = switch (x) {
case 12:
if (c) case 14: break;
100;
case 13:
200;
break;
}
Any ideas? I will guess that x=12/c=false or x=13 will yield 200. What
about x=12/c=true, or x=14, or x = anything else?
So if I understand correctly, you are saying that chains of if/else,
an imaginary version of "switch", and the C ternary operator all
evaluate the same things in the same way, while with C's switch you
have no idea what happens?
Yes. With C's switch, you can't /in-general/ isolate things into
distinct blocks. You might have a stab if you stick to a subset of C and follow various guidelines, in an effort to make 'switch' look normal.
See the example above.
That is true, if you cherry-pick what you choose to ignore in each
case until it fits your pre-conceived ideas.
You're the one who's cherry-picking examples of C!
Here is my attempt at
converting the above switch into my syntax (using a tool derived from my
C compiler):
switch x
when 12 then
if c then
fi
100
fallthrough
when 13 then
200
end switch
It doesn't attempt to deal with fallthrough, and misses out that
14-case, and that conditional break. It's not easy; I might have better
luck with assembly!
No, what you call "natural" is entirely subjective. You have looked
at a microscopic fraction of code written in a tiny proportion of
programming languages within a very narrow set of programming fields.
I've worked with systems programming and have done A LOT in the 15 years until the mid 90s. That included pretty much everything involved in
writing graphical applications given only a text-based disk OS that
provided file-handling.
Plus of course devising and implementing everything needed to run my own systems language. (After the mid 90s, Windows took over half the work.)
That's not criticism - few people have looked at anything more.
Very few people use their own languages, especially over such a long
period, also use them to write commercial applications, or create
languages for others to use.
What I /do/ criticise is that your assumption that this almost
negligible experience gives you the right to decide what is "natural"
or "true", or how programming languages or tools "should" work.
So, in your opinion, 'switch' should work how it works in C? That is the most intuitive and natural way of implementing it?
You need to learn that other people have different ideas, needs,
opinions or preferences.
Most people haven't got a clue about devising PLs.
I'd question the whole idea of having a construct that can
evaluate to something of different types in the first place, whether
or not it returns a value, but that's your choice.
If the result of a multi-way execution doesn't yield a value to be
used, then the types don't matter.
Of course they do.
Of course they don't! Here, F, G and H return int, float and void* respectively:
if (c1) F();
else if (c2) G();
else H();
C will not complain that those branches yield different types. But you
say it should do? Why?
You're just being contradictory for the sake of it aren't you?!
This is just common sense; I don't know why you're questioning it.
(I'd quite like to see a language of your design!)
def foo(n) :
if n == 1 : return 10
if n == 2 : return 20
if n == 3 : return
That's Python, quite happily having a multiple choice selection that
sometimes does not return a value.
Python /always/ returns some value. If one isn't provided, it returns
None. Which means checking that a function returns an explicit value
goes out the window. Delete the 10 and 20 (or the entire body), and it
still 'works'.
Yes, that is a dynamically typed language, not a statically typed
language.
std::optional<int> foo(int n) {
if (n == 1) return 10;
if (n == 2) return 20;
if (n == 3) return {};
}
That's C++, a statically typed language, with a multiple choice
selection that sometimes does not return a value - the return type
supports values of type "int" and non-values.
So what happens when n is 4? Does it return garbage (so that's bad).
Does it arrange to return some special value of 'optional' that means no value?
In that case, the type still does matter, but the language is
providing that default path for you.
X Y A B are arbitrary expressions. The need for 'else' is determined
during type analysis. Whether it will ever execute the default path
would be up to extra analysis, that I don't do, and would anyway be
done later.
But if it can never happen that neither X nor Y is true, then how
would you test the "else" clause? Surely you are not proposing that
programmers be required to write lines of code that will never be
executed and cannot be tested?
Why not? They still have to write 'end', or do you propose that can be
left out if control never reaches the end of the function?!
(In earlier versions of my dynamic language, the compiler would insert
an 'else' branch if one was needed, returning 'void'.
I decided that requiring an explicit 'else' branch was better and more failsafe.)
You can't design a language like this where valid syntax depends on
compiler and what it might or might not discover when analysing the
code.
Why not? It is entirely reasonable to say that a compiler for a
language has to be able to do certain types of analysis.
This was the first part of your example:
const char * flag_to_text_A(bool b) {
if (b == true) {
return "It's true!";
} else if (b == false) {
return "It's false!";
/I/ would question why you'd want to make the second branch conditional
in the first place. Write an 'else' there, and the issue doesn't arise.
Because I can't see the point of deliberately writing code that usually takes two paths, when either:
(1) you know that one will never be taken, or
(2) you're not sure, but don't make any provision in case it is
Fix that first rather relying on compiler writers to take care of your
badly written code.
And also, you keep belittling my abilities and my language, when C allows:
int F(void) {}
How about getting your house in order first.
Anyone who is convinced that their own personal preferences are more
"natural" or inherently superior to all other alternatives, and can't
justify their claims other than saying that everything else is "a
mess", is just navel-gazing.
I wrote more here but the post is already too long.
Let's just say that
'messy' is a fair assessment of C's conditional features, since you can write this:
On 03/11/2024 21:00, Bart wrote:
To my mind, this is a type of "multi-way selection" :
(const int []){ a, b, c }[n];
I can't see any good reason to exclude it as fitting the descriptive
phrase.
And if "a", "b" and "c" are not constant, but require
evaluation of some sort, it does not change things. Of course if these required significant effort to evaluate,
or had side-effects, then you
would most likely want a "multi-way selection" construction that did the selection first, then the evaluation - but that's a matter of programmer choice, and does not change the terms.
I am very keen on keeping the concepts distinct in cases where it
matters.
int x = if (randomfloat()<0.5) 42;
In C, no. But when we have spread to other languages, including hypothetical languages, there's nothing to stop that. Not only could it
be supported by the run-time type system, but it would be possible to
have compile-time types that are more flexible
and only need to be
"solidified" during code generation. That might allow the language to track things like "uninitialised" or "no value" during compilation
without having them part of a real type (such as std::optional<> or a C
It doesn't return a value. That is why it is UB to try to use that non-existent value.
My language will not allow it. Most people would say that that's a
good thing. You seem to want to take the perverse view that such code
should be allowed to return garbage values or have undefined behaviour.
Is your idea of "most people" based on a survey of more than one person?
Note that I have not suggested returning garbage values - I have
suggested that a language might support handling "no value" in a
convenient and safe manner.
Totally independent of and orthogonal to that, I strongly believe that
there is no point in trying to define behaviour for something that
cannot happen,
With justification. 0010 means 8 in C? Jesus.
I think the word "neighbour" is counter-intuitive to spell.
Once a thread here has wandered this far off-topic, it is perhaps not unreasonable to draw comparisons with your one-man language.
The real problem with your language is that you think it is perfect
int F() {
F(1, 2.3, "four", F,F,F,F(),F(F()));
F(42);
It is undefined behaviour in C. Programmers are expected to write
sensible code.
If I were the designer of the C language and the maintainer of the C standards, you might have a point. C is not /my/ language.
We can agree that C /lets/ people write messy code. It does not
/require/ it. And I have never found a programming language that stops people writing messy code.
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
To my mind, this is a type of "multi-way selection" :
(const int []){ a, b, c }[n];
I can't see any good reason to exclude it as fitting the descriptive
phrase.
And if "a", "b" and "c" are not constant, but require evaluation of
some sort, it does not change things. Of course if these required
significant effort to evaluate,
Or you had a hundred of them.
or had side-effects, then you would most likely want a "multi-way
selection" construction that did the selection first, then the
evaluation - but that's a matter of programmer choice, and does not
change the terms.
You still don't get how different the concepts are.
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
Here is a summary of C vs my language.
I am very keen on keeping the concepts distinct in cases where it
matters.
I know, you like to mix things up. I like clear lines:
func F:int ... Always returns a value
proc P ... Never returns a value
and only need to be "solidified" during code generation. That might
allow the language to track things like "uninitialised" or "no value"
during compilation without having them part of a real type (such as
std::optional<> or a C
But you are always returning an actual type in agreement with the
language. That is my point. You're not choosing to just fall off that
cliff and return garbage or just crash.
However, your example with std::optional did just that, despite having
that type available.
It doesn't return a value. That is why it is UB to try to use that
non-existent value.
And why it is so easy to avoid that UB.
Note that I have not suggested returning garbage values - I have
suggested that a language might support handling "no value" in a
convenient and safe manner.
But in C it is garbage.
Totally independent of and orthogonal to that, I strongly believe that
there is no point in trying to define behaviour for something that
cannot happen,
But it could for n==4.
EVERYBODY agrees that leading zero octals in C were a terrible idea. You can't say it's just me thinks that!
int F() {
F(1, 2.3, "four", F,F,F,F(),F(F()));
F(42);
It is undefined behaviour in C. Programmers are expected to write
sensible code.
But it would be nice if the language stopped people writing such things, yes?
Can you tell me which other current languages, other than C++ and
assembly, allow such nonsense?
None? So it's not just me and my language then! Mine is lower level and still plenty unsafe, but it has somewhat higher standards.
If I were the designer of the C language and the maintainer of the C
standards, you might have a point. C is not /my/ language.
You do like to defend it though.
We can agree that C /lets/ people write messy code. It does not
/require/ it. And I have never found a programming language that
stops people writing messy code.
I had included half a dozen points that made C's 'if' error-prone and confusing, which would not occur in my syntax because it is better designed.
You seem to be incapable of drawing a line between what a language can enforce, and what a programmer is free to express.
Or rather, because a programmer has so much freedom anyway, let's not
bother with any lines at all! Just have a language that simply doesn't
care.
On 04/11/2024 20:50, Bart wrote:
But it could for n==4.
Again, you /completely/ miss the point.
If you have a function (or construct) that returns a correct value for inputs 1, 2 and 3, and you never pass it the value 4 (or anything else), then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
On 04/11/2024 20:50, Bart wrote:
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
To my mind, this is a type of "multi-way selection" :
(const int []){ a, b, c }[n];
I can't see any good reason to exclude it as fitting the descriptive
phrase.
And if "a", "b" and "c" are not constant, but require evaluation of
some sort, it does not change things. Of course if these required
significant effort to evaluate,
Or you had a hundred of them.
or had side-effects, then you would most likely want a "multi-way
selection" construction that did the selection first, then the
evaluation - but that's a matter of programmer choice, and does not
change the terms.
You still don't get how different the concepts are.
Yes, I do. I also understand how they are sometimes exactly the same thing, depending on the language, and how they can often have the same
end result, depending on the details, and how they can often be
different, especially in the face of side-effects or efficiency concerns.
Look, it's really /very/ simple.
A) You can have a construct that says "choose one of these N things to execute and evaluate, and return that value (if any)".
B) You can have a construct that says "here are N things, select one of
them to return as a value".
Both of these can reasonably be called "multi-way selection" constructs.
Some languages can have one as a common construct, other languages may have the other, and many support both in some way. Pretty much any language that allows the programmer to have control over execution order will let you do both in some way, even if there is not a clear language construct for it and you have to write it manually in code.
Mostly type A will be most efficient if there is a lot of effort
involved in putting together the things to select. Type B is likely to
be most efficient if you already have the collection of things to choose from (it can be as simple as an array lookup), if the creation of the collection can be done in parallel (such as in some SIMD uses), or if
the cpu can generate them all before it has established the selection
index.
Sometimes type A will be the simplest and clearest in the code,
sometimes type B will be the simplest and clearest in the code.
Both of these constructs are "multi-way selections".
Your mistake is in thinking that type A is all there is and all that matters, possibly because you feel you have a better implementation for
it than C has. (I think that you /do/ have a nicer switch than C, but
that does not justify limiting your thinking to it.)
On 04/11/2024 22:25, David Brown wrote:
On 04/11/2024 20:50, Bart wrote:
But it could for n==4.
Again, you /completely/ miss the point.
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything
else), then there is no undefined behaviour no matter what the code
looks like for values other than 1, 2 and 3. If someone calls that
function with input 4, then /their/ code has the error - not the code
that doesn't handle an input 4.
This is the wrong kind of thinking.
If this was a library function then, sure, you can stipulate a set of
input values, but that's at a different level, where you are writing
code on top of a working, well-specified language.
You don't make use of holes in the language that can cause a crash. That is, by allowing a function to run into an internal RET op with no provision for a result. That's if there even is a RET; perhaps your compilers are so confident that that path is not taken, or you hint it
won't be, that they won't bother!
It will start executing whatever random bytes follow the function.
As I said in my last post, a missing return value caused an internal
error in one of my C implementations because a pushed return value was missing.
How should that be fixed, via a hack in the implementation which pushes
some random value to avoid an immediate crash? And then what?
Let the user - the author of the function - explicitly provide that
value then at least that can be documented: if N isn't in 1..3, then F returns so and so.
You know that makes perfect sense, but because you've got used to that dangerous feature in C you think it's acceptable.
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
On 02.11.2024 19:09, Tim Rentsch wrote:
[...] As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
This is certainly true for one-man-shows.
Hardly suited for most professional contexts I worked in.
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering [nothing] more than my
own opinion, I can say that I might use any of the patterns
mentioned, depending on circumstances. I don't think any one
approach is either always right or always wrong.
maybe, but some may have strong arguments (for using this and
not that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for
an 'else' branch, which is dependent on the compiler performing some arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
I think this is all very dependent on what you mean by "all input values".
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's /their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
This is, IMHO, just nonsense and misunderstands the contract between function writers and function users.
Further, I am confident that these people are quite happy to write code like:
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
Perhaps, in a mistaken belief that it makes the code "safe", they will add :
if (!p) return 0;
at the start of the function. But they will not check that "p" actually points to an array of two ints (how could they?), nor will they check
for integer overflow (and what would they do if it happened?).
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a short-cut for conveniently handling a wide range of valid input values -
it is never a tool for handling /invalid/ input values.
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for
an 'else' branch, which is dependent on the compiler performing some arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
Even if you went with the first, what happens if the compiler can't guarantee that all values of a selector are covered; should it report
that, or say nothing?
What happens if you do need 'else', but later change things so all bases
are covered; will the compiler report it as being unnecessary, so that
you remove it?
Now, C doesn't have such a feature to test out (ie. that is a construct
with an optional 'else' branch, the whole of which returns a value). The nearest is function return values:
int F(int n) {
if (n==1) return 10;
if (n==2) return 20;
}
Here, neither tcc nor gcc reports that you might run off the end of the function. It will return garbage if called with anything other than 1 or 2.
gcc will say something with enough warning levels (reaches end of
non-void function). But it will say the same here:
int F(unsigned char c) {
if (c<128) return 10;
if (c>=128) return 20;
}
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
I think this is all very dependent on what you mean by "all input values".
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's
/their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
Well, some languages treat types more seriously than C. In Pascal
the type of your input would be 0..10 and all input values would be
handled. Sure, when the domain is too complicated to express in a type
then it could be a documented restriction. Still, it makes sense to
signal an error if a value goes outside the handled range, so in a sense all
values of the input type are handled: either you get a valid answer or
a clear error.
This is, IMHO, just nonsense and misunderstands the contract between
function writers and function users.
Further, I am confident that these people are quite happy to write code
like :
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
I do not think that people wanting strong type checking are happy
with C. Simply, either they use a different language or use C
without bitching, but aware of its limitations.
I certainly would
be quite unhappy with code above. It is possible that I would still
use it as a compromise (say if it was desirable to have single
prototype but handle points in spaces of various dimensions),
but my first attempt would be something like:
typedef struct {int p[2];} two_int;
....
Perhaps, in a mistaken belief that it makes the code "safe", they will add :
if (!p) return 0;
at the start of the function. But they will not check that "p" actually
points to an array of two ints (how could they?), nor will they check
for integer overflow (and what would they do if it happened?).
I am certainly unhappy with overflow handling in current hardware
and by extension with overflow handling in C.
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a
short-cut for conveniently handling a wide range of valid input values -
it is never a tool for handling /invalid/ input values.
Well, a default can signal an error, which frequently is the right handling
of invalid input values.
On 05/11/2024 13:42, Waldek Hebisch wrote:
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's /their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
Perhaps, in a mistaken belief that it makes the code "safe", they will
add :
if (!p) return 0;
at the start of the function. But they will not check that "p" actually points to an array of two ints (how could they?), nor will they check
for integer overflow (and what would they do if it happened?).
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it
means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
Bart <bc@freeuk.com> wrote:
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for
an 'else' branch, which is dependent on the compiler performing some
arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
Well, frequently it is easier to do a bad job than a good one.
Normally you do not need very complex analysis:
On 05/11/2024 20:33, David Brown wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
(defmacro nsel (expr . clauses)
  ^(caseql ,expr ,*[mapcar list 1 clauses]))

(nsel 1 (prinl "one") (prinl "two") (prinl "three"))
"one"
(nsel (+ 1 1) (prinl "one") (prinl "two") (prinl "three"))
"two"
(nsel (+ 1 3) (prinl "one") (prinl "two") (prinl "three"))
nil
(nsel (+ 1 2) (prinl "one") (prinl "two") (prinl "three"))
"three"
(macroexpand-1 '(nsel x a b c d))
(caseql x (1 a)
On 05/11/2024 20:33, David Brown wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
On 2024-11-05, Bart <bc@freeuk.com> wrote:
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way
select:
x := (n | a, b, c, ... | z)
This looks quite error-prone. You have to count carefully that
the cases match the intended values. If an entry is
inserted, all the remaining ones shift to a higher value.
You've basically taken a case construct and auto-generated
the labels starting from 1.
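The shift hazard described above can be avoided in plain C by giving each case an explicit label; a minimal sketch (the enum and function names are my own invention, not code from the thread):

```c
/* With explicit labels, inserting a new entry does not renumber
   the existing cases, unlike positional selection. */
enum { SEL_ONE = 1, SEL_TWO = 2, SEL_THREE = 3 };

static const char *sel_name(int n)
{
    switch (n) {
    case SEL_ONE:   return "one";
    case SEL_TWO:   return "two";
    case SEL_THREE: return "three";
    default:        return "other";
    }
}
```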
On 04/11/2024 20:50, Bart wrote:
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
Here is a summary of C vs my language.
<snip the irrelevant stuff>
I am very keen on keeping the concepts distinct in cases where it
matters.
I know, you like to mix things up. I like clear lines:
func F:int ... Always returns a value
proc P ... Never returns a value
Oh, you /know/ that, do you? And how do you "know" that? Is that
because you still think I am personally responsible for the C language,
and that I think C is the be-all and end-all of perfect languages?
I agree that it can make sense to divide different types of "function".
I disagree that whether or not a value is returned has any significant relevance. I see no difference, other than minor syntactic issues,
between "int foo(...)" and "void foo(int * result, ...)".
If you have a function (or construct) that returns a correct value for inputs 1, 2 and 3, and you never pass it the value 4 (or anything else), then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
I agree that this is a terrible idea. <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60523>
But picking one terrible idea in C does not mean /everything/ in C is a terrible idea! /That/ is what you got wrong, as you do so often.
Can you tell me which other current languages, other than C++ and
assembly, allow such nonsense?
Python.
Of course, it is equally meaningless in Python as it is in C.
I defend it if that is appropriate. Mostly, I /explain/ it to you. It
is bizarre that people need to do that for someone who claims to have written a C compiler, but there it is.
I'm glad you didn't - it would be a waste of effort.
You /do/ understand that I use top-quality tools with carefully chosen warnings, set to throw fatal errors, precisely because I want a language that has a lot more "lines" and restrictions than your little tools?
/Every/ C programmer uses a restricted subset of C - some more
restricted than others. I choose to use a very strict subset of C for
my work, because it is the best language for the tasks I need to do. (I also use a very strict subset of C++ when it is a better choice.)
On 05/11/2024 13:29, David Brown wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's
/their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
Your example is an improvement on your previous ones. At least it
attempts to deal with out-of-range conditions!
However there is still the question of providing that return type. If 'unreachable' is not a special language feature, then this can fail
either if the language requires the 'return' keyword, or 'unreachable' doesn't yield a compatible type (even if it never returns because it's
an error handler).
Getting that right will satisfy both the language (if it cared more
about such matters than C apparently does), and the casual reader
curious about how the function contract is met (that is, supplying that promised return int type if or when it returns).
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
Perhaps, in a mistaken belief that it makes the code "safe", they will
add :
if (!p) return 0;
at the start of the function. But they will not check that "p"
actually points to an array of two ints (how could they?), nor will
they check for integer overflow (and what would they do if it happened?).
This is a different category of error.
Here's a related example of what I'd class as a language error:
int a;
a = (exit(0), &a);
A type mismatch error is usually reported. However, the assignment is
never done because it never returns from that exit() call.
I expect you wouldn't think much of a compiler that didn't report such
an error because that code is never executed.
But to me that is little different from running into the end of a function without the proper provisions for a valid return value.
On 04/11/2024 22:25, David Brown wrote:
On 04/11/2024 20:50, Bart wrote:
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
Here is a summary of C vs my language.
<snip the irrelevant stuff>
I am very keen on keeping the concepts distinct in cases where it
matters.
I know, you like to mix things up. I like clear lines:
func F:int ... Always returns a value
proc P ... Never returns a value
Oh, you /know/ that, do you? And how do you "know" that? Is that
because you still think I am personally responsible for the C
language, and that I think C is the be-all and end-all of perfect
languages?
I agree that it can make sense to divide different types of
"function". I disagree that whether or not a value is returned has any
significant relevance. I see no difference, other than minor
syntactic issues, between "int foo(...)" and "void foo(int * result,
...)".
I don't use functional concepts; my functions may or may not be pure.
But the difference between value-returning and non-value returning
functions to me is significant:
                     Func    Proc
  return x;           Y       N
  return;             N       Y
  hit final }         N       Y
  Pure                ?       Unlikely
  Side-effects        ?       Likely
  Call within expr    Y       N
  Call standalone     ?       Y
Having a clear distinction helps me focus more precisely on how a
routine has to work.
In C, the syntax is dreadful: not only can you barely distinguish a
function from a procedure (even without attributes, user types and
macros add in), but you can hardly tell them apart from variable declarations.
In fact, function declarations can even be declared in the middle of a
set of variable declarations.
You can learn a lot about the underlying structure of a language by implementing it. So when I generate IL from C for example, I found the
need to have separate instructions to call functions and procedures, and separate return instructions too.
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything
else), then there is no undefined behaviour no matter what the code
looks like for values other than 1, 2 and 3. If someone calls that
function with input 4, then /their/ code has the error - not the code
that doesn't handle an input 4.
No. The function they are calling is badly formed. There should never be
any circumstance where a value-returning function terminates (hopefully
by running into RET) without an explicit set return value.
I agree that this is a terrible idea.
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60523>
But picking one terrible idea in C does not mean /everything/ in C is
a terrible idea! /That/ is what you got wrong, as you do so often.
What the language does is generally fine. /How/ it does is generally terrible. (Type syntax; no 'fun' keyword; = vs ==; operator precedence; format codes; 'break' in switch; export by default; struct T vs typedef
T; dangling 'else'; optional braces; ... there's reams of this stuff!)
So actually, I'm not wrong. There have been discussions about all of
these and a lot more.
Can you tell me which other current languages, other than C++ and
assembly, allow such nonsense?
Python.
Of course, it is equally meaningless in Python as it is in C.
Python at least can trap the errors. Once you fix the unlimited
recursion, it will detect the wrong number of arguments. In C, before
C23 anyway, any number and types of arguments is legal in that example.
I defend it if that is appropriate. Mostly, I /explain/ it to you.
It is bizarre that people need to do that for someone who claims to
have written a C compiler, but there it is.
It is bizarre that the ins and outs of C, a supposedly simple language,
are so hard to understand.
I'm glad you didn't - it would be a waste of effort.
I guessed that. You seemingly don't care that C is a messy language with many quirks; you just work around it by using a subset, with some help
from your compiler in enforcing that subset.
So you're using a strict dialect. The trouble is that everyone else
using C will either be using their own dialect incompatible with yours,
or are stuck using the messy language and laid-back compilers operating
in lax mode by default.
I'm interested in fixing things at source - within a language.
You /do/ understand that I use top-quality tools with carefully chosen
warnings, set to throw fatal errors, precisely because I want a
language that has a lot more "lines" and restrictions than your little
tools? /Every/ C programmer uses a restricted subset of C - some more
restricted than others. I choose to use a very strict subset of C for
my work, because it is the best language for the tasks I need to do.
(I also use a very strict subset of C++ when it is a better choice.)
I'd guess only 1% of your work with C involves the actual language, and
99% using additional tooling.
With me it's mostly about the language.
On 06/11/2024 15:40, Bart wrote:
There are irrelevant differences in syntax, which could easily disappear entirely if a language supported a default initialisation value when a return gives no explicit value. (i.e., "T foo() { return; }; T x =
foo();" could be treated in the same way as "T x;" in a static initialisation context.)
Then you list some things that may or may not happen, which are of
course totally irrelevant. If you list the differences between bikes
and cars, you don't include "some cars are red" and "bikes are unlikely
to be blue".
It's a pointless distinction. Any function or procedure can be morphed into the other form without any difference in the semantic meaning of
the code, requiring just a bit of re-arrangement at the caller site:
int foo(int x) { int y = ...; return y; }
void foo(int * res, int x) { int y = ...; *res = y; }
void foo(int x) { ... ; return; }
int foo(int x) { ... ; return 0; }
There is no relevance in the division here, which is why most languages don't make a distinction unless they do so simply for syntactic reasons.
In C, the syntax is dreadful: not only can you barely distinguish a
function from a procedure (even without attributes, user types and
macros add in), but you can hardly tell them apart from variable
declarations.
As always, you are trying to make your limited ideas of programming languages appear to be correct, universal, obvious or "natural" by
saying things that you think are flaws in C. That's not how a
discussion works, and it is not a way to convince anyone of anything.
The fact that C does not have a keyword used in the declaration or definition of a function does not in any way mean that there is the slightest point in your artificial split between "func" and "proc" functions.
(It doesn't matter that I too prefer a clear keyword for defining
functions in a language.)
That is solely from your choice of an IL.
Of course you are wrong!
If there was an alternative language that I thought would be better for
the tasks I have, I'd use that. (Actually, a subset of C++ is often better, so I use that when I can.)
What do you think I should do instead? Whine in newsgroups to people
that don't write language standards (for C or anything else) and don't
make compilers?
Make my own personal language that is useless to
everyone else and holds my customers to ransom by being the only person
that can work with their code?
On 05/11/2024 23:48, Bart wrote:
On 05/11/2024 13:29, David Brown wrote:
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
"unreachable()" is a C23 standardisation of a feature found in most
high-end compilers. For gcc and clang, there is
__builtin_unreachable(), and MSVC has its version.
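A sketch of how the function above might be written portably across the compilers named (the MSVC spelling `__assume(0)` and the no-op fallback are my assumptions; treat this as an illustration, not the poster's code):

```c
/* Portable spelling of "unreachable": C23 puts unreachable() in
   <stddef.h>; older gcc/clang and MSVC have their own builtins. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
  #include <stddef.h>
#elif defined(__GNUC__)
  #define unreachable() __builtin_unreachable()
#elif defined(_MSC_VER)
  #define unreachable() __assume(0)
#else
  #define unreachable() ((void)0)   /* fallback: no-op */
#endif

/* Integer square root for 0 <= x <= 10; out-of-range input is the
   caller's contract violation. */
static int small_int_sqrt(int x)
{
    if (x == 0) return 0;
    if (x < 4)  return 1;
    if (x < 9)  return 2;
    if (x < 16) return 3;
    unreachable();
}
```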
Getting that right will satisfy both the language (if it cared more
about such matters than C apparently does), and the casual reader
curious about how the function contract is met (that is, supplying
that promised return int type if or when it returns).
C gets it right here. There is no need for a return type when there is
no return - indeed, trying to force some sort of type or "default" value
would be counterproductive. It would be confusing to the reader, add untestable and unexecutable source code,
Let's now look at another alternative - have the function check for validity, and return some kind of error signal if the input is invalid. There are two ways to do this - we can have a value of the main return type acting as an error signal, or we can have an additional return value.
All in all, we have a significant costs in various aspects, with no real benefit, all in the name of a mistaken belief that we are avoiding
undefined behaviour.
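The two ways mentioned can be sketched like this (a minimal illustration under my own naming, not code from the post):

```c
#include <stdbool.h>

/* Style 1: a value of the return type doubles as the error signal. */
static int small_sqrt_sentinel(int x)
{
    if (x < 0 || x > 10) return -1;     /* -1 means "invalid input" */
    int r = 0;
    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Style 2: the status is a separate return value. */
static bool small_sqrt_checked(int x, int *out)
{
    if (x < 0 || x > 10)
        return false;
    *out = small_sqrt_sentinel(x);
    return true;
}
```

Style 1 steals a value from the result range; style 2 forces every caller to handle two results. That trade-off is part of the cost referred to above.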
On 06/11/2024 14:50, David Brown wrote:
C gets it right here. There is no need for a return type when there
is no return
There is no return for only half the function! A function with a return
type is a function that CAN return. If it can't ever return, then make
it a procedure.
Take this function where N can never be zero; is this the right way to
write it in C:
int F(int N) {
if (N==0) unreachable();
return abc/N; // abc is a global with value 100
}
It doesn't look right. If I compile it with gcc (using __builtin_unreachable), and call F(0), then it crashes. So it doesn't do much, does it?!
On 06/11/2024 15:47, David Brown wrote:
On 06/11/2024 15:40, Bart wrote:
There are irrelevant differences in syntax, which could easily
disappear entirely if a language supported a default initialisation
value when a return gives no explicit value. (i.e., "T foo() {
return; }; T x = foo();" could be treated in the same way as "T x;" in
a static initialisation context.)
You wrote:
T foo () {return;} # definition?
T x = foo(); # call?
I'm not quite sure what you're saying here. That a missing return value
in a non-void function would default to all-zeros?
Maybe. A rather pointless feature just to avoid writing '0', and which
now introduces a new opportunity for a silent error (accidentally
forgetting a return value).
It's not quite the same as a static initialisation, which is zeroed
when a program starts.
Then you list some things that may or may not happen, which are of
course totally irrelevant. If you list the differences between bikes
and cars, you don't include "some cars are red" and "bikes are
unlikely to be blue".
Yes; if you're using a vehicle, or planning a journey or any related
thing, it helps to remember if it's a bike or a car! At least here you acknowledge the difference.
But I guess you find those likely/unlikely macros of gcc pointless too.
If I know something is a procedure, then I also know it is likely to
change global state, that I might need to deal with a return value, and
a bunch of other stuff.
Boldly separating the two with either FUNC or PROC denotations I find
helps tremendously. YM-obviously-V, but you can't have a go at me for my view.
If I really found it a waste of time, the distinction would have been dropped decades ago.
It's a pointless distinction. Any function or procedure can be
morphed into the other form without any difference in the semantic
meaning of the code, requiring just a bit of re-arrangement at the
caller site:
int foo(int x) { int y = ...; return y; }
void foo(int * res, int x) { int y = ...; *res = y; }
void foo(int x) { ... ; return; }
int foo(int x) { ... ; return 0; }
There is no relevance in the division here, which is why most
languages don't make a distinction unless they do so simply for
syntactic reasons.
As I said, you like to mix things up. You disagreed. I'm not surprised.
Here you've demonstrated how a function that returns results by value
can be turned into a procedure that returns a result by reference.
So now, by-value and by-reference are the same thing?
I listed seven practical points of difference between functions and procedures, and above is an eighth point, but you just dismiss them.
Is there any point in this?
I do like taking what some think as a single feature and having
dedicated versions, because I find it helpful.
That includes functions, loops, control flow and selections.
In C, the syntax is dreadful: not only can you barely distinguish a
function from a procedure (even without attributes, user types and
macros add in), but you can hardly tell them apart from variable
declarations.
As always, you are trying to make your limited ideas of programming
languages appear to be correct, universal, obvious or "natural" by
saying things that you think are flaws in C. That's not how a
discussion works, and it is not a way to convince anyone of anything.
The fact that C does not have a keyword used in the declaration or
definition of a function does not in any way mean that there is the
slightest point in your artificial split between "func" and "proc"
functions.
void F();
void (*G);
void *H();
void (*I)();
OK, 4 things declared here. Are they procedures, functions, variables,
or pointers to functions? (I avoided using a typedef in place of 'void'
to make things easier.)
I /think/ they are as follows: procedure, pointer variable, function (returning void*), and pointer to a procedure. But I had to work at it,
even though the examples are very simple.
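One way to check that reading is to name the "procedure" type with a typedef (my illustration, not from either post):

```c
/* The same four declarations, rewritten so the intent is visible. */
typedef void Proc(void);   /* a "procedure": function returning void */

Proc  F;          /* F: procedure                                   */
void (*G);        /* G: pointer-to-void variable (parens redundant) */
void *H(void);    /* H: function returning void*                    */
Proc *I;          /* I: pointer to a procedure                      */
```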
I don't know about you, but I prefer syntax like this:
proc F
ref void G
ref proc H
func I -> ref void
Now come on, scream at me again for preferring a nice syntax for
programming, one which just tells me at a glance what it means without having to work it out.
(It doesn't matter that I too prefer a clear keyword for defining
functions in a language.)
Why? Don't your smart tools tell you all that anyway?
That is solely from your choice of an IL.
The IL design also falls into place from the natural way these things
have to work.
Of course you are wrong!
You keep saying that. But then you also keep saying, from time to time,
that you agree that something in C was a bad idea. So I'm still wrong
when calling out the same thing?
If there was an alternative language that I thought would be better
for the tasks I have, I'd use that. (Actually, a subset of C++ is
often better, so I use that when I can.)
What do you think I should do instead? Whine in newsgroups to people
that don't write language standards (for C or anything else) and don't
make compilers?
What makes you think I'm whining? The thread opened up a discussion
about multi-way selections, and it got into how it could be done with features from other languages.
I gave some examples from mine, as I'm very familiar with that, and it
uses simple features that are easy to grasp and appreciate. You could
have done the same from ones you know.
But you just hate the idea that I have my own language to draw on, whose syntax is very sweet ('serious' languages hate such syntax for some
reason, and is usually relegated to scripting languages.)
I guess then you just have to belittle and insult me, my languages and
my views at every opportunity.
Make my own personal language that is useless to everyone else and
holds my customers to ransom by being the only person that can work
with their code?
Plenty of companies use DSLs. But isn't that sort of what you do? That
is, using 'C' with a particular interpretation or enforcement of the
rules, which needs to go in hand with a particular compiler, version,
sets of options and assorted makefiles.
I for one would never be able to build one of your programs. It might as well be written in your in-house language with proprietary tools.
On 06/11/2024 20:38, Bart wrote:
void F();
void (*G);
void *H();
void (*I)();
OK, 4 things declared here. Are they procedures, functions, variables,
or pointers to functions? (I avoided using a typedef in place of
'void' to make things easier.)
I /think/ they are as follows: procedure, pointer variable, function
(returning void*), and pointer to a procedure. But I had to work at
it, even though the examples are very simple.
I don't know about you, but I prefer syntax like this:
proc F
ref void G
ref proc H
func I -> ref void
It is not the use of a keyword for functions that I disagree with, nor
am I arguing for C's syntax or against your use of "ref" or ordering. I simply don't think there is much to be gained by using "proc F" instead
of "func F -> void" (assuming that's the right syntax) - or just "func F".
But I think there is quite a bit to be gained if the func/proc
distinction told us something useful and new, rather than just the
existence or lack of a return type.
On 06/11/2024 14:50, David Brown wrote:
On 05/11/2024 23:48, Bart wrote:
On 05/11/2024 13:29, David Brown wrote:
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
"unreachable()" is a C23 standardisation of a feature found in most
high-end compilers. For gcc and clang, there is
__builtin_unreachable(), and MSVC has its version.
So it's a kludge.
Cool, I can create one of those too:
func smallsqrt(int x)int =
if
elsif x=0 then 0
elsif x<4 then 1
elsif x<9 then 2
elsif x<16 then 3
dummyelse int.min
fi
end
'dummyelse' is a special version of 'else' that tells the compiler that control will (should) never arrive there. ATM it does nothing but inform
the reader of that and to remind the author. But later stages of the compiler can choose not to generate code for it, or to generate error-reporting code.
(BTW your example lets through negative values; I haven't fixed that.)
This is all a large and complex subject. But it's not really the point
of the discussion.
On 07/11/2024 13:23, Bart wrote:
On 06/11/2024 14:50, David Brown wrote:
On 05/11/2024 23:48, Bart wrote:
On 05/11/2024 13:29, David Brown wrote:
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
"unreachable()" is a C23 standardisation of a feature found in most
high-end compilers. For gcc and clang, there is
__builtin_unreachable(), and MSVC has its version.
So it's a kludge.
You mean it is something you don't understand? Think of this as an opportunity to learn something new.
'dummyelse' is a special version of 'else' that tells the compiler
that control will (should) never arrive there. ATM it does nothing but
inform the reader of that and to remind the author. But later stages
of the compiler can choose not to generate code for it, or to generate
error-reporting code.
You are missing the point - that is shown clearly by the "int.min".
You have your way of doing things, and have no interest in learning
anything else or even bothering to listen or think.
Your bizarre hatred of C is overpowering for you.
On 02/11/2024 21:44, Bart wrote:
(Note that the '|' is my example is not 'or'; it means 'then':
( c | a ) # these are exactly equivalent
if c then a fi
( c | a | b ) # so are these
if c then a else b fi
There is no restriction on what a and b are, statements or
expressions, unless the whole returns some value.)
Ah, so your language has a disastrous choice of syntax here so that
sometimes "a | b" means "or", and sometimes it means "then" or
"implies", and sometimes it means "else".
Why have a second syntax with
a confusing choice of operators when you have a perfectly good "if /
then / else" syntax?
Or if you feel an operator adds a lot to the
language here, why not choose one that would make sense to people, such
as "=>" - the common mathematical symbol for "implies".
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
Where again one of these elements is evaluated, selected by n (here
having the values of 1, 2, 3, ... compared with true, false above, but
there need to be at least 2 elements inside |...| to distinguish them).
I applied it also to other statements that can be provide values, such
as if-elsif chains and switch, but there the selection might be
different (eg. a series of tests are done sequentially until a true one).
I don't know how it got turned into 'multi-way'.
On 03.11.2024 18:00, David Brown wrote:
On 02/11/2024 21:44, Bart wrote:
(Note that the '|' is my example is not 'or'; it means 'then':
( c | a ) # these are exactly equivalent
if c then a fi
( c | a | b ) # so are these
if c then a else b fi
There is no restriction on what a and b are, statements or
expressions, unless the whole returns some value.)
Ah, so your language has a disastrous choice of syntax here so that
sometimes "a | b" means "or", and sometimes it means "then" or
"implies", and sometimes it means "else".
(I can't comment on the "other use" of the same syntax in the
"poster's language" since it's not quoted here.)
But it's not uncommon in programming languages that operators
are context specific, and may mean different things depending
on context.
You are saying "disastrous choice of syntax". - Wow! Hard stuff.
I suggest to cool down before continuing reading further. :-)
Incidentally above syntax is what Algol 68 supports;
Or if you feel an operator adds a lot to the
language here, why not choose one that would make sense to people, such
as "=>" - the common mathematical symbol for "implies".
This is, as an opinion, of course arguable. It's certainly also
influenced where one is coming from (i.e. personal expertise
from other languages).
The detail of what symbols are used is
not that important to me, if it fits to the overall language
design.
From the high-level languages I used in my life I was almost
always "missing" something with conditional expressions. I
don't want separate and restricted syntaxes (plural!) in "C"
(for statements and expressions respectively), for example.
Some are lacking conditional expressions completely. Others
support the syntax with a 'fi' end-terminator and simplify
structures (and add to maintainability) supporting 'else-if'.
And few allow 'if' expressions on the left-hand side of an
assignment. (Algol 68 happens to support everything I need.
Unfortunately it's a language I never used professionally.)
I'm positive that folks who use languages that support those
syntactic forms wouldn't like to miss them. (Me for sure.)
("disastrous syntax" - I'm still laughing... :-)
On 03.11.2024 18:00, David Brown wrote:
or using the respective alternative forms with ( a | b | c) ,
or ( a | b ) where no 'ELSE' is required. (And there's also
the 'ELIF' and the '|:' as alternative form available.)
BTW, the same symbols can also be used as an alternative form
of the 'case' statement; the semantic distinction is made by
context, e.g. the types involved in the construct.
Bart, out of interest; have you invented that syntax for your
language yourself of borrowed it from another language (like
Algol 68)?
On 08/11/2024 17:37, Janis Papanagnou wrote:
BTW, the same symbols can also be used as an alternative form
of the 'case' statement; the semantic distinction is made by
context, e.g. the types involved in the construct.
You mean whether the 'a' in '(a | b... | c)' has type Bool rather than Int?
I've always discriminated on the number of terms between the two |s:
either 1, or more than 1.
Bart, out of interest; have you invented that syntax for your
language yourself of borrowed it from another language (like
Algol 68)?
It was heavily inspired by the syntax (not the semantics) of Algol68,
even though I'd never used it at that point.
I like that it solved the annoying begin-end aspect of Algol60/Pascal
syntax where you have to write the clunky:
[snip examples]
I enhanced it by not needing stropping (and so not allowing embedded
spaces within names); allowing redundant semicolons while at the same
time, turning newlines into semicolons when a line obviously didn't
continue; plus allowing ordinary 'end' or 'end if' to be used as well as 'fi'.
My version then can look like this, a bit less forbidding than Algol68:
if cond then
s1
s2
else
s3
s4
end
On 08/11/2024 18:37, Janis Papanagnou wrote:
The | operator means "or" in the OP's language (AFAIK - only he actually knows the language). So "(a | b | c)" in that language will sometimes
mean the same as "(a | b | c)" in C, and sometimes it will mean the same
as "(a ? b : c)" in C.
There may be some clear distinguishing feature that disambiguates these
uses. But this is a one-man language - there is no need for a clear
syntax or grammar, documentation, consistency in the language, or a consideration for corner cases or unusual uses.
Incidentally above syntax is what Algol 68 supports;
Yes, he said later that Algol 68 was the inspiration for it. Algol 68
was very successful in its day - but there are good reasons why many of
its design choices have been left behind long ago in newer languages.
This is as opinion of course arguable. It's certainly also
influenced where one is coming from (i.e. personal expertise
from other languages).
The language here is "mathematics". I would not expect anyone who even considers designing a programming language to be unfamiliar with that
symbol.
The detail of what symbols are used is
not that important to me, if it fits to the overall language
design.
I am quite happy with the same symbol being used for very different
meanings in different contexts. C's use of "*" for indirection and for multiplication is rarely confusing. Using | for "bitwise or" and also
using it for a "pipe" operator would probably be fine - only one
operation makes sense for the types involved. But here the two
operations - "bitwise or" (or logical or) and "choice" can apply to to
the same types of operands. That's what makes it a very poor choice of syntax.
(For comparison, Algol 68 uses "OR", "∨" or "\/" for the "or" operator, thus it does not have this confusion.)
[...]
I've nothing (much) against the operation - it's the choice of operator
that is wrong.
This was the first part of your example:
const char * flag_to_text_A(bool b) {
    if (b == true) {
        return "It's true!";
    } else if (b == false) {
        return "It's false!";
/I/ would question why you'd want to make the second branch conditional
in the first place.
Write an 'else' there, and the issue doesn't arise.
Because I can't see the point of deliberately writing code that usually
takes two paths, when either:
(1) you know that one will never be taken, or
(2) you're not sure, but don't make any provision in case it is
Fix that first rather relying on compiler writers to take care of your
badly written code.
[...]
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything else),
then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with
input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
On 06/11/2024 07:26, Kaz Kylheku wrote:
On 2024-11-05, Bart <bc@freeuk.com> wrote:
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way
select:
x := (n | a, b, c, ... | z)
This looks quite error-prone. You have to count carefully that
the cases match the intended values. If an entry is
inserted, all the remaining ones shift to a higher value.
You've basically taken a case construct and auto-generated
the labels starting from 1.
It's a version of Algol68's case construct:
x := CASE n IN a, b, c OUT z ESAC
which also has the same compact form I use. I only use the compact
version because n is usually small, and it is intended to be used within
an expression: print (n | "One", "Two", "Three" | "Other").
This is an actual example (from my first scripting language; not written by
me):
Crd[i].z := (BendAssen |P.x, P.y, P.z)
An out-of-bounds index yields 'void' (via a '| void' part inserted by
the compiler). This is one of my examples from that era:
xt := (messa | 1,1,1, 2,2,2, 3,3,3)
yt := (messa | 3,2,1, 3,2,1, 3,2,1)
Algol68 didn't have 'switch', but I do, as well as a separate
case...esac statement that is more general. Those are better for
multi-line constructs.
As for being error prone because values can get out of step, so is a
function call like this:
f(a, b, c, d, e)
But I also have keyword arguments.
On 04.11.2024 23:25, David Brown wrote:
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything else),
then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with
input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
Well, it's a software system design decision whether you want to
make the caller test the preconditions for every function call,
or let the callee take care of unexpected input, or both.
We had always followed the convention to avoid all undefined
situations and always define every 'else' case by some sensible
behavior, at least writing a notice into a log-file, but also
to "fix" the runtime situation to be able to continue operating.
(Note, I was mainly writing server-side software where this was
especially important.)
That's one reason why (as elsethread mentioned) I dislike 'else'
to handle a defined value; I prefer an explicit 'if' and use the
else for reporting unexpected situations (that practically never
appear, or, with the diagnostics QA-evaluated, asymptotically
disappearing).
(For pure binary predicates there's no errors branch, of course.)
Janis
PS: One of my favorite IT-gotchas is the plane crash where the
code specified landing procedure functions for height < 50.0 ft
and for height > 50.0 ft conditions, which mostly worked since
the height got polled only every couple seconds, and the case
height = 50.0 ft happened only very rarely due to the typical
descent characteristics during landing.
On 08.11.2024 23:24, Bart wrote:
On 08/11/2024 17:37, Janis Papanagnou wrote:
BTW, the same symbols can also be used as an alternative form
of the 'case' statement; the semantic distinction is made by
context, e.g. the types involved in the construct.
You mean whether the 'a' in '(a | b... | c)' has type Bool rather than Int?
I've always discriminated on the number of terms between the two |s:
either 1, or more than 1.
I suppose in a [historic] "C" like language it's impossible to
distinguish on type here (given that there was no 'bool' type
[in former times] in "C"). - But I'm not quite sure whether
you're speaking here about your "C"-like language or some other
language you implemented.
if cond then
    s1
    s2
else
    s3
    s4
end
(Looks a lot more like a scripting language without semicolons.)
On 08.11.2024 19:18, David Brown wrote:
On 08/11/2024 18:37, Janis Papanagnou wrote:
The language here is "mathematics". I would not expect anyone who even
considers designing a programming language to be unfamiliar with that
symbol.
Mathematics, unfortunately, [too] often has several symbols for
the same thing. (It's in that respect not very different from
programming languages, where you can [somewhat] rely on + - * /
but beyond that it gets tighter.)
Programming languages have the additional problem that you don't
have all necessary symbols available, so language designers have
to map them onto existing symbols. (Also Unicode in modern times
does not solve that, since languages typically rely on ASCII,
or some 8-bit extension, at most; full Unicode support, I think,
is rare, especially on the lexical language level. Some allow
them in strings, some in identifiers; but in language keywords?)
BTW, in Algol 68 you can define operators, so you can define
"OP V" or "OP ^" (for 'or' and 'and', respectively), but we cannot
define (e.g.) "OP ·" (a middle dot, e.g. for multiplication).[*]
The detail of what symbols are used is
not that important to me, if it fits to the overall language
design.
I am quite happy with the same symbol being used for very different
meanings in different contexts. C's use of "*" for indirection and for
multiplication is rarely confusing. Using | for "bitwise or" and also
using it for a "pipe" operator would probably be fine - only one
operation makes sense for the types involved. But here the two
operations - "bitwise or" (or logical or) and "choice" - can apply to
the same types of operands. That's what makes it a very poor choice of
syntax.
Well, I'm more used (from mathematics) to 'v' and '^' than to '|'
and '&', respectively. But that doesn't prevent me from accepting
other symbols like '|' to have some [mathematical] meaning, or
even different meanings depending on context. In mathematics it's
not different; same symbols are used in different contexts with
different semantics. (And there's also the mentioned problem of
non-coherent literature WRT used mathematics' symbols.)
(For comparison, Algol 68 uses "OR", "∨" or "\/" for the "or" operator,
thus it does not have this confusion.)
Actually, while I like Algol 68's flexibility, there's in some
cases (to my liking) too many variants. This had partly been
necessary, of course, due to the (even more) restricted character
sets (e.g. 6-bit characters) available in the 1960's.
The two options for conditionals I consider very useful, though,
and it also produces very legible and easily understandable code.
[...]
I've nothing (much) against the operation - it's the choice of operator
that is wrong.
Well, on opinions there's nothing more to discuss, I suppose.
Bart wrote:
On 06/11/2024 07:26, Kaz Kylheku wrote:
On 2024-11-05, Bart <bc@freeuk.com> wrote:
[...] I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
This looks quite error-prone. You have to count carefully that
the cases match the intended values. If an entry is
inserted, all the remaining ones shift to a higher value.
You've basically taken a case construct and auto-generated
the labels starting from 1.
It's a version of Algol68's case construct:
x := CASE n IN a, b, c OUT z ESAC
which also has the same compact form I use. I only use the compact
version because n is usually small, and it is intended to be used within
an expression: print (n | "One", "Two", "Three" | "Other").
[...]
An out-of-bounds index yields 'void' (via a '| void' part inserted by
the compiler). This is one of my examples from that era:
xt := (messa | 1,1,1, 2,2,2, 3,3,3)
yt := (messa | 3,2,1, 3,2,1, 3,2,1)
still, the more C-compatible version would look better imo
xt = {1,1,1, 2,2,2, 3,3,3}[messa];
yt = {3,2,1, 3,2,1, 3,2,1}[messa];
[...]
On 09/11/2024 07:54, Janis Papanagnou wrote:
Well, it's a software system design decision whether you want to
make the caller test the preconditions for every function call,
or let the callee take care of unexpected input, or both.
Well, I suppose it is their decision - they can do the right thing, or
the wrong thing, or both.
I believe I explained in previous posts why it is the /caller's/ responsibility to ensure pre-conditions are fulfilled, and why anything
else is simply guaranteeing extra overheads while giving you less
information for checking code correctness. But I realise that could
have been lost in the mass of posts, so I can go through it again if you want.
[...]
(On security boundaries, system call interfaces, etc., where the caller
could be malicious or incompetent in a way that damages something other
than their own program, you have to treat all inputs as dangerous and sanitize them, just like data from external sources. That's a different matter, and not the real focus here.)
We had always followed the convention to avoid all undefined
situations and always define every 'else' case by some sensible
behavior, at least writing a notice into a log-file, but also
to "fix" the runtime situation to be able to continue operating.
(Note, I was mainly writing server-side software where this was
especially important.)
You can't "fix" bugs in the caller code by writing to a log file.
Sometimes you can limit the damage, however.
If you can't trust the people writing the calling code, then that should
be the focus of your development process - find a way to be sure that
the caller code is right. That's where you want your conventions, or to focus code reviews, training, automatic test systems - whatever is appropriate for your team and project. Make sure callers pass correct
data to the function, and the function can do its job properly.
Sometimes it makes sense to specify functions differently, and accept a
wider input. Maybe instead of saying "this function will return the
integer square root of numbers between 0 and 10", you say "this function
will return the integer square root if given a number between 0 and 10,
and will log a message and return -1 for other int values". Fair enough
- now you've got a new function where it is very easy for the caller to ensure the preconditions are satisfied. But be very aware of the costs
- you have now destroyed the "purity" of the function, and lost the key mathematical relation between the input and output. (You have also made everything much less efficient.)
[...]
On 09/11/2024 05:51, Janis Papanagnou wrote:
[...]
Sure, I appreciate all this. We must do the best we can - I am simply
saying that using | for this operation is far from the best choice.
Well, I'm more used (from mathematics) to 'v' and '^' than to '|'
and '&', respectively. But that doesn't prevent me from accepting
other symbols like '|' to have some [mathematical] meaning, or
even different meanings depending on context. In mathematics it's
not different; same symbols are used in different contexts with
different semantics. (And there's also the mentioned problem of
non-coherent literature WRT used mathematics' symbols.)
We are - unfortunately, perhaps - constrained by common keyboards and
ASCII (for the most part). "v" and "^" are poor choices for "or" and
"and" - "∨" and "∧" would be much nicer, but are hard to type.
For
better or worse, the programming world has settled on "|" and "&" as practical alternatives.
("+" and "." are often used in boolean logic,
and can be typed on normal keyboards, but would quickly be confused with other uses of those symbols.)
[...]
Well, on opinions there's nothing more to discuss, I suppose.
Opinions can be justified, and that discussion can be interesting.
Purely subjective opinion is less interesting.
On 05/11/2024 19:53, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for an 'else' branch, which is dependent on the compiler performing some
arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
Well, frequently it is easier to do bad job, than a good one.
I assume that you consider the simple solution the 'bad' one?
I'd consider a more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how complex the input.
With the simple solution, the worst that can happen is that you have to write a dummy 'else' branch, perhaps with a dummy zero value.
If control never reaches that point, it will never be executed (at
worst, it may need to skip an instruction).
But if the compiler is clever enough (optionally clever, it is not a requirement!), then it could eliminate that code.
A bonus is that when debugging, you can comment out all or part of the previous lines, but the 'else' now catches those untested cases.
normally you do not need very complex analysis:
I don't want to do any analysis at all! I just want a mechanical
translation as effortlessly as possible.
I don't like unbalanced code within a function because it's wrong and
can cause problems.
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
The whole construct may or may not return a value. If it does, then one of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
I think this is all very dependent on what you mean by "all input values".
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
    if (x == 0) return 0;
    if (x < 4) return 1;
    if (x < 9) return 2;
    if (x < 16) return 3;
    unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's
/their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
Well, some languages treat types more seriously than C. In Pascal
the type of your input would be 0..10 and all input values would be
handled. Sure, when the domain is too complicated to express in a type
then it could be a documented restriction. Still, it makes sense to
signal an error if a value goes outside the handled range, so in a sense all
values of the input type are handled: either you get a valid answer or
a clear error.
No, it does not make sense to do that. Just because the C language does
not currently (maybe once C++ gets contracts, C will copy them) have a
way to specify input sets other than by types, does not mean that
functions in C always have a domain matching all possible combinations
of bits in the underlying representation of the parameter's types.
It might be a useful fault-finding aid temporarily to add error messages
for inputs that are invalid but can physically be squeezed into the parameters. That won't stop people making incorrect declarations of the function and passing completely different parameter types to it, or
finding other ways to break the requirements of the function.
And in general there is no way to check the validity of the inputs - you usually have no choice but to trust the caller. It's only in simple
cases, like the example above, that it would be feasible at all.
There are, of course, situations where the person calling the function
is likely to be incompetent, malicious, or both, and where there can be serious consequences for what you might prefer to consider as invalid
input values.
You have that for things like OS system calls - it's no
different than dealing with user inputs or data from external sources.
But you handle that by extending the function - increase the range of
valid inputs and appropriate outputs. You no longer have a function
that takes a number between 0 and 10 and returns the integer square root
- you now have a function that takes a number between -(2^31) and
(2^31 - 1) and returns the integer square root if the input is in the
range 0 to 10 or halts the program with an error message for other
inputs in the wider range. It's a different function, with a wider set
of inputs - and again, it is specified to give particular results for particular inputs.
I certainly would
be quite unhappy with the code above. It is possible that I would still
use it as a compromise (say if it was desirable to have single
prototype but handle points in spaces of various dimensions),
but my first attempt would be something like:
typedef struct {int p[2];} two_int;
....
I think you'd quickly find that limiting and awkward in C (but it might
be appropriate in other languages).
But don't misunderstand me - I am
all in favour of finding ways in code that make input requirements
clearer or enforceable within the language - never put anything in
comments if you can do it in code. You could reasonably do this in C
for the first example :
// Do not use this directly
extern int small_int_sqrt_implementation(int x);
// Return the integer square root of numbers between 0 and 10
static inline int small_int_sqrt(int x) {
    assert(x >= 0 && x <= 10);
    return small_int_sqrt_implementation(x);
}
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a
short-cut for conveniently handling a wide range of valid input values - it is never a tool for handling /invalid/ input values.
Well, default can signal error which frequently is right handling
of invalid input values.
Will that somehow fix the bug in the code that calls the function?
It can be a useful debugging and testing aid, certainly, but it does not make the code "correct" or "safe" in any sense.
On 09/11/2024 03:57, Janis Papanagnou wrote:
[...] - But I'm not quite sure whether
you're speaking here about your "C"-like language or some other
language you implemented.
I currently have three HLL implementations:
* For my C subset language (originally I had some enhancements, now
dropped)
* For my 'M' systems language inspired by A68 syntax
* For my 'Q' scripting language, with the same syntax, more or less
The remark was about those last two.
if cond then
    s1
    s2
else
    s3
    s4
end
(Looks a lot more like a scripting language without semicolons.)
This is what I've long suspected: that people associate clear, pseudo-code-like syntax with scripting languages.
[...]
On 09.11.2024 12:06, David Brown wrote:
On 09/11/2024 07:54, Janis Papanagnou wrote:
Well, it's a software system design decision whether you want to
make the caller test the preconditions for every function call,
or let the callee take care of unexpected input, or both.
Well, I suppose it is their decision - they can do the right thing, or
the wrong thing, or both.
I believe I explained in previous posts why it is the /caller's/
responsibility to ensure pre-conditions are fulfilled, and why anything
else is simply guaranteeing extra overheads while giving you less
information for checking code correctness. But I realise that could
have been lost in the mass of posts, so I can go through it again if you
want.
I haven't read all the posts, or rather, I just skipped most posts;
it's too time consuming.
Since you explicitly elaborated - thanks! - I will read this one...
[...]
(On security boundaries, system call interfaces, etc., where the caller
could be malicious or incompetent in a way that damages something other
than their own program, you have to treat all inputs as dangerous and
sanitize them, just like data from external sources. That's a different
matter, and not the real focus here.)
We had always followed the convention to avoid all undefined
situations and always define every 'else' case by some sensible
behavior, at least writing a notice into a log-file, but also
to "fix" the runtime situation to be able to continue operating.
(Note, I was mainly writing server-side software where this was
especially important.)
You can't "fix" bugs in the caller code by writing to a log file.
Sometimes you can limit the damage, however.
I spoke more generally of fixing situations (not only bugs).
If you can't trust the people writing the calling code, then that should
be the focus of your development process - find a way to be sure that
the caller code is right. That's where you want your conventions, or to
focus code reviews, training, automatic test systems - whatever is
appropriate for your team and project. Make sure callers pass correct
data to the function, and the function can do its job properly.
Yes.
Sometimes it makes sense to specify functions differently, and accept a
wider input. Maybe instead of saying "this function will return the
integer square root of numbers between 0 and 10", you say "this function
will return the integer square root if given a number between 0 and 10,
and will log a message and return -1 for other int values". Fair enough
- now you've got a new function where it is very easy for the caller to
ensure the preconditions are satisfied. But be very aware of the costs
- you have now destroyed the "purity" of the function, and lost the key
mathematical relation between the input and output. (You have also made
everything much less efficient.)
I disagree with the "much less" generalization. I also think that when
weighing performance versus safety my preferences might be different;
I'm only speaking about a "rule of thumb", not about the actual (IMO) necessity(!) to make these decisions depending on the project context.
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It might be a useful fault-finding aid temporarily to add error messages
for inputs that are invalid but can physically be squeezed into the
parameters. That won't stop people making incorrect declarations of the
function and passing completely different parameter types to it, or
finding other ways to break the requirements of the function.
And in general there is no way to check the validity of the inputs - you
usually have no choice but to trust the caller. It's only in simple
cases, like the example above, that it would be feasible at all.
There are, of course, situations where the person calling the function
is likely to be incompetent, malicious, or both, and where there can be
serious consequences for what you might prefer to consider as invalid
input values.
You apparently exclude the possibility of competent persons making a
mistake. AFAIK industry statistics show that code developed by
good developers using a rigorous process still contains a substantial
number of bugs. So, it makes sense to have as much as possible
verified mechanically. Which in common practice means depending on
type checks. In less common practice you may have some theorem
proving framework checking assertions about input arguments,
then the assertions take role of types.
But don't misunderstand me - I am
all in favour of finding ways in code that make input requirements
clearer or enforceable within the language - never put anything in
comments if you can do it in code. You could reasonably do this in C
for the first example :
// Do not use this directly
extern int small_int_sqrt_implementation(int x);
// Return the integer square root of numbers between 0 and 10
static inline int small_int_sqrt(int x) {
    assert(x >= 0 && x <= 10);
    return small_int_sqrt_implementation(x);
}
Hmm, why extern implementation and static wrapper? I would do
the opposite.
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a
short-cut for conveniently handling a wide range of valid input values - it is never a tool for handling /invalid/ input values.
Well, default can signal error which frequently is right handling
of invalid input values.
Will that somehow fix the bug in the code that calls the function?
It can be a useful debugging and testing aid, certainly, but it does not
make the code "correct" or "safe" in any sense.
There is a concept of "partial correctness": if the code finishes, it returns
a correct value. A variation of this is: if the code finishes without
signaling an error, it returns correct values. Such a condition may be
much easier to verify than "full correctness" and in many cases
is almost as useful. In particular, mathematicians are _very_
unhappy when a program returns incorrect results. But they are used
to programs which can not deliver results, either because of
lack of resources or because the needed case was not implemented.
When dealing with math formulas there are frequently various
restrictions on parameters, like we can only divide by a nonzero
quantity. By signaling an error when the restrictions are not
satisfied we ensure that successful completion means that the
restrictions were satisfied. Of course that alone does not
mean that the result is correct, but correctness of the "general"
case is usually _much_ easier to ensure. In other words,
failing restrictions are a major source of errors, and signaling
errors effectively eliminates it.
In a world of perfect programmers, they would check restrictions
before calling any function depending on them, or prove that
restrictions on arguments to a function imply correctness of
calls made by the function. But the world is imperfect and in the
real world extra runtime checks are quite useful.
On 10/11/2024 07:57, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Type checks can be extremely helpful, and strong typing greatly reduces
the errors in released code by catching them early (at compile time).
And temporary run-time checks are also helpful during development or debugging.
But extra run-time checks are costly (and I don't mean just in run-time performance, which is only an issue in a minority of situations). They
mean more code - which means more scope for errors, and more code that
must be checked and maintained. Usually this code can't be tested well
in final products - precisely because it is there to handle a situation
that never occurs.
A function should accept all input values - once you have made clear what the acceptable input values can be. A "default" case is just a short-cut for conveniently handling a wide range of valid input values - it is never a tool for handling /invalid/ input values.
Well, default can signal error which frequently is right handling
of invalid input values.
Will that somehow fix the bug in the code that calls the function?
It can be a useful debugging and testing aid, certainly, but it does not make the code "correct" or "safe" in any sense.
There is a concept of "partial correctness": if the code finishes, it returns
a correct value. A variation of this is: if the code finishes without
signaling an error, it returns correct values. Such a condition may be
much easier to verify than "full correctness" and in many cases
is almost as useful. In particular, mathematicians are _very_
unhappy when a program returns incorrect results. But they are used
to programs which can not deliver results, either because of
lack of resources or because the needed case was not implemented.
When dealing with math formulas there are frequently various
restrictions on parameters, like we can only divide by a nonzero
quantity. By signaling an error when the restrictions are not
satisfied we ensure that successful completion means that the
restrictions were satisfied. Of course that alone does not
mean that the result is correct, but correctness of the "general"
case is usually _much_ easier to ensure. In other words,
failing restrictions are a major source of errors, and signaling
errors effectively eliminates it.
Yes, out-of-band signalling in some way is a useful way to indicate a problem, and can allow parameter checking without losing the useful
results of a function. This is the principle behind exceptions in many languages - then functions either return normally with correct results,
or you have a clearly abnormal situation.
In a world of perfect programmers, they would check restrictions
before calling any function depending on them, or prove that
restrictions on arguments to a function imply correctness of
calls made by the function. But the world is imperfect and in the
real world extra runtime checks are quite useful.
Runtime checks in a function can be useful if you know the calling code might not be perfect and the function is going to take responsibility
for identifying that situation. Programmers will often be writing both
the caller and callee code, and put temporary debugging and test checks wherever it is most convenient.
But I think being too enthusiastic about putting checks in the wrong
place - the callee function - can hide the real problems, or make the
callee code writer less careful about getting their part of the code correct.
Bart <bc@freeuk.com> wrote:
I assume that you consider the simple solution the 'bad' one?
You wrote about _always_ requiring 'else' regardless of whether it is
needed or not. Yes, I consider this bad.
I'd consider a more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally, time spent _using_ a compiler should be bigger than time
spent writing the compiler. If a compiler gets enough use, that
justifies some complexity.
I am mainly concerned with clarity and correctness of source code.
Dummy 'else' doing something may hide errors.
Dummy 'else' signaling an
error means that something which could be a compile-time error is
only detected at runtime.
A compiler that detects most errors of this sort is IMO better than a
compiler which makes no effort to detect them. And clearly, once the
problem is formulated in a sufficiently general way, it becomes
unsolvable. So I do not expect a general solution, but I do expect
reasonable effort.
normally you do not need very complex analysis:
I don't want to do any analysis at all! I just want a mechanical
translation as effortlessly as possible.
I don't like unbalanced code within a function because it's wrong and
can cause problems.
Well, I demand more from compiler than you do...
David Brown <david.brown@hesbynett.no> wrote:
Runtime checks in a function can be useful if you know the calling code
might not be perfect and the function is going to take responsibility
for identifying that situation. Programmers will often be writing both
the caller and callee code, and put temporary debugging and test checks
wherever it is most convenient.
But I think being too enthusiastic about putting checks in the wrong
place - the callee function - can hide the real problems, or make the
callee code writer less careful about getting their part of the code
correct.
IME it is the opposite: not having checks in the called function simply delays
the moment when an error is detected. Getting errors early helps focus on
tricky problems or misconceptions. And it motivates programmers to
be more careful.
Concerning the correct place for checks: one could argue that a check
should be close to the place where the result of the check matters, which
frequently is in the called function.
And frequently a check requires
computation that is done by the called function as part of normal
processing, but would be extra code in the caller.
On 11/11/2024 20:09, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
Concerning correct place for checks: one could argue that check
should be close to place where the result of check matters, which
frequently is in called function.
No, there I disagree. The correct place for the checks should be close
to where the error is, and that is in the /calling/ code. If the called function is correctly written, reviewed, tested, documented and
considered "finished", why would it be appropriate to add extra code to
that in order to test and debug some completely different part of the code?
The place where the result of the check /really/ matters, is the calling code. And that is also the place where you can most easily find the
error, since the error is in the calling code, not the called function.
And it is most likely to be the code that you are working on at the time
- the called function is already written and tested.
And frequently check requires
computation that is done by called function as part of normal
processing, but would be extra code in the caller.
It is more likely to be the opposite in practice.
And for much of the time, the called function has no real practical way
to check the parameters anyway. A function that takes a pointer
parameter - not an uncommon situation - generally has no way to check
the validity of the pointer. It can't check that the pointer actually points to useful source data or an appropriate place to store data.
All it can do is check for a null pointer, which is usually a fairly
useless thing to do (unless the specifications for the function make the pointer optional). After all, on most (but not all) systems you already have a "free" null pointer check - if the caller code has screwed up and passed a null pointer when it should not have done, the program will
quickly crash when the pointer is used for access. Many compilers
provide a way to annotate function declarations to say that a pointer
must not be null, and can then spot at least some such errors at compile time. And of course the calling code will very often be passing the
address of an object in the call - since that can't be null, a check in
the function is pointless.
Once you get to more complex data structures, the possibility for the
caller to check the parameters gets steadily less realistic.
So now your practice of having functions "always" check their parameters leaves the people writing calling code with a false sense of security - usually you /don't/ check the parameters, you only ever do the simple checks that the caller could (and should!) do if they were realistic. You've
got the maintenance and cognitive overload of extra source code for your various "asserts" and other checks, regardless of any run-time costs
(which are often irrelevant, but occasionally very important).
You will note that much of this - for both sides of the argument - uses words like "often", "generally" or "frequently". It is important to appreciate that programming spans a very wide range of situations, and I don't want to be too categorical about things. I have already said
there are situations when parameter checking in called functions can
make sense. I've no doubt that for some people and some types of
coding, such cases are a lot more common than what I see in my coding.
Note also that when you can use tools to automate checks, such as
"sanitize" options in compilers or different languages that have more in-built checks, the balance differs. You will generally pay a run-time cost for those checks, but you don't have the same kind of source-level costs - your code is still clean, clear, and amenable to correctness checking, without hiding the functionality of the code in a mass of unnecessary explicit checks. This is particularly good for debugging,
and the run-time costs might not be important. (But if run-time costs
are not important, there's a good chance that C is not the best language
to be using in the first place.)
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
Well, it's a blue moon when someone nails it. Most of them fall
for my little gotcha hook, line, and sinker.
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
David Brown <david.brown@hesbynett.no> wrote:
On 11/11/2024 20:09, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
Concerning correct place for checks: one could argue that check
should be close to place where the result of check matters, which
frequently is in called function.
No, there I disagree. The correct place for the checks should be close
to where the error is, and that is in the /calling/ code. If the called
function is correctly written, reviewed, tested, documented and
considered "finished", why would it be appropriate to add extra code to
that in order to test and debug some completely different part of the code?
The place where the result of the check /really/ matters, is the calling
code. And that is also the place where you can most easily find the
error, since the error is in the calling code, not the called function.
And it is most likely to be the code that you are working on at the time
- the called function is already written and tested.
And frequently check requires
computation that is done by called function as part of normal
processing, but would be extra code in the caller.
It is more likely to be the opposite in practice.
And for much of the time, the called function has no real practical way
to check the parameters anyway. A function that takes a pointer
parameter - not an uncommon situation - generally has no way to check
the validity of the pointer. It can't check that the pointer actually
points to useful source data or an appropriate place to store data.
All it can do is check for a null pointer, which is usually a fairly
useless thing to do (unless the specifications for the function make the
pointer optional). After all, on most (but not all) systems you already
have a "free" null pointer check - if the caller code has screwed up and
passed a null pointer when it should not have done, the program will
quickly crash when the pointer is used for access. Many compilers
provide a way to annotate function declarations to say that a pointer
must not be null, and can then spot at least some such errors at compile
time. And of course the calling code will very often be passing the
address of an object in the call - since that can't be null, a check in
the function is pointless.
Well, in a sense pointers are easy: if you do not play nasty tricks
with casts, then type checks do a significant part of the checking. Of
course, a pointer may be uninitialized (but compiler warnings help a lot
here), memory may be overwritten, etc. But overwritten memory is
rather special: if you checked that the content of memory is correct,
but it is overwritten after the check, then the earlier check does not
help. Anyway, the main point is ensuring that the pointed-to data
satisfies the expected conditions.
Once you get to more complex data structures, the possibility for the
caller to check the parameters gets steadily less realistic.
So now your practice of having functions "always" check their parameters
leaves the people writing calling code with a false sense of security -
usually you /don't/ check the parameters, you only ever do the simple checks
that the caller could (and should!) do if they were realistic. You've
got the maintenance and cognitive overload of extra source code for your
various "asserts" and other checks, regardless of any run-time costs
(which are often irrelevant, but occasionally very important).
You will note that much of this - for both sides of the argument - uses
words like "often", "generally" or "frequently". It is important to
appreciate that programming spans a very wide range of situations, and I
don't want to be too categorical about things. I have already said
there are situations when parameter checking in called functions can
make sense. I've no doubt that for some people and some types of
coding, such cases are a lot more common than what I see in my coding.
Note also that when you can use tools to automate checks, such as
"sanitize" options in compilers or different languages that have more
in-built checks, the balance differs. You will generally pay a run-time
cost for those checks, but you don't have the same kind of source-level
costs - your code is still clean, clear, and amenable to correctness
checking, without hiding the functionality of the code in a mass of
unnecessary explicit checks. This is particularly good for debugging,
and the run-time costs might not be important. (But if run-time costs
are not important, there's a good chance that C is not the best language
to be using in the first place.)
Our experience differs. As a silly example, consider a parser
which produces a parse tree. The caller is supposed to pass a syntactically
correct string as an argument. However, checking syntactic correctness
requires almost the same effort as producing the parse tree, so it is
usual that the parser both checks correctness and produces the result.
I have computations that are quite different from parsing but
in some cases share the same characteristic: checking correctness of
the arguments requires complex computation similar to producing the
actual result. More frequently, the called routine can check various
invariants which with high probability can detect errors. Doing
the same check in the caller is impractical.
Most of my coding is in languages other than C. One of the languages
that I use essentially forces the programmer to insert checks in
some places. For example, unions are tagged, and one can use a
specific variant only after checking that it is the current
variant. Similarly, fall-through control structures may lead
to a type error at compile time. But signalling an error is considered
type safe, so code which checks for an unhandled case and signals an
error is accepted as type correct. Unhandled cases frequently
lead to type errors. There is some overhead, but IMO it is acceptable.
The language in question is garbage collected, so many memory-related
problems go away.
Frequently checks come as a natural byproduct of computations. When
handling tree-like structures in C, IME the simplest code is usually
recursive, with the base case being the null pointer. When the base
case should not occur, we get a check instead of a computation.
Skipping such checks also puts a cognitive load on the reader:
the normal pattern has a corresponding case, so the reader does not know
whether the case was omitted by accident or cannot occur. A comment
may clarify this, but an error check is equally clear.
On 16/11/2024 09:42, Stefan Ram wrote:
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
FGS please turn the 'hip lingo' generator down a few notches!
On Sat, 16 Nov 2024 09:42:49 +0000, Stefan Ram wrote:
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
Well, it's a blue moon when someone nails it. Most of them fall
for my little gotcha hook, line, and sinker.
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
If I read your code correctly, you have actually included not one,
but TWO curveballs. Well done!
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider the much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally, time spent _using_ a compiler should be bigger than time
spent writing the compiler. If a compiler gets enough use, that
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
The sort of analysis you're implying I don't think belongs in the kind
of compiler I prefer. Even if it did, it would be later on in the
process than the point where the above restriction is checked, so
wouldn't exist in one of my compilers anyway.
I don't like open-ended tasks like this where compilation time could end
up being anything. If you need to keep recompiling the same module, then
you don't want to repeat that work each time.
I am mainly concerned with clarity and correctness of source code.
So am I. I try to keep my syntax clean and uncluttered.
Dummy 'else' doing something may hide errors.
So can 'unreachable'.
Dummy 'else' signaling an
error means that something which could be a compile-time error is
only detected at runtime.
A compiler that detects most errors of this sort is IMO better than a
compiler which makes no effort to detect them. And clearly, once the
problem is formulated in a sufficiently general way, it becomes
unsolvable. So I do not expect a general solution, but I do expect
reasonable effort.
So how would David Brown's example work:
int F(int n) {
if (n==1) return 10;
if (n==2) return 20;
}
/You/ know that values -2**31 to 0 and 3 to 2**31-1 are impossible; the
compiler doesn't. It's likely to tell you that you may run off the end
of the function.
So what do you want the compiler to do here? If I try it:
func F(int n)int =
if n=1 then return 10 fi
if n=2 then return 20 fi
end
It says 'else needed' (in that last statement). I can also shut it up
like this:
func F(int n)int = # int is i64 here
if n=1 then return 10 fi
if n=2 then return 20 fi
0
end
Since now that last statement is the '0' value (any int value will do).
What should my compiler report instead? What analysis should it be
doing? What would that save me from typing?
normally you do not need very complex analysis:
I don't want to do any analysis at all! I just want a mechanical
translation as effortlessly as possible.
I don't like unbalanced code within a function because it's wrong and
can cause problems.
Well, I demand more from compiler than you do...
Perhaps you're happy for it to be bigger and slower too. Most of my
projects build more or less instantly. Here 'ms' is a version that runs programs directly from source (the first 'ms' is 'ms.exe' and subsequent ones are 'ms.m' the lead module):
c:\bx>ms ms ms ms ms ms ms ms ms ms ms ms ms ms ms ms hello
Hello World! 21:00:45
This builds and runs 15 successive generations of itself in memory
before building and running hello.m; it took 1 second in all. (Now try
that with gcc!)
Here:
c:\cx>tm \bx\mm -runp cc sql
Compiling cc.m to <pcl>
Compiling sql.c to sql.exe
This compiles my C compiler from source but then it /interprets/ the IR produced. This interpreted compiler took 6 seconds to build the 250Kloc
test file, and it's a very slow interpreter (it's used for testing and debugging).
(gcc -O0 took a bit longer to build sql.c! About 7 seconds but it is
using a heftier windows.h.)
If I run the C compiler from source as native code (\bx\ms cc sql) then building the compiler *and* sql.c takes 1/3 of a second.
You can't do this stuff with the compilers David Brown uses; I'm
guessing you can't do it with your prefered ones either.
[...]
My preferences are very much weighted towards correctness, not
efficiency. That includes /knowing/ that things are correct, not just passing some tests. [...]
I wonder what happened to Stefan. He used to make perfectly good posts.
Then he disappeared for a bit, and came back with this new "style".
Given that this "new" Stefan can write posts with interesting C content,
such as this one, and has retained his ugly coding layout and
non-standard Usenet format, I have to assume it's still the same person behind the posts.
On 10.11.2024 16:13, David Brown wrote:
[...]
My preferences are very much weighted towards correctness, not
efficiency. That includes /knowing/ that things are correct, not just
passing some tests. [...]
I agree with you. But given what you write, I'm also sure you know
what's achievable in theory, what's an avid wish, and what's really
possible.
Yet there are also projects that don't seem to care, where
speedy delivery is the primary goal. Guaranteeing formal correctness
was never an issue in the industry contexts I worked in, and I
was always glad when I had a good test environment, with good test
coverage and continuous refinement of tests. Informal documentation,
factual checks of the arguments, and actual tests were what kept the
quality of our project deliveries at a high level.
On 16.11.2024 17:38, David Brown wrote:
I wonder what happened to Stefan. He used to make perfectly good posts.
Then he disappeared for a bit, and came back with this new "style".
Given that this "new" Stefan can write posts with interesting C content,
such as this one, and has retained his ugly coding layout and
non-standard Usenet format, I have to assume it's still the same person
behind the posts.
Sorry that I cannot resist asking what you consider "non-standard
Usenet format", given that your posts don't consider line length.
(Did the "standards" change during the past three decades maybe?
Do we use only those parts of the "standards" that we like and
ignore others? Or does it boil down to Netiquette is no standard?)
Janis, just curious and no offense intended :-)
On 16.11.2024 17:38, David Brown wrote:
I wonder what happened to Stefan. He used to make perfectly good
posts. Then he disappeared for a bit, and came back with this new
"style".
Given that this "new" Stefan can write posts with interesting C
content, such as this one, and has retained his ugly coding layout
and non-standard Usenet format, I have to assume it's still the
same person behind the posts.
Sorry that I cannot resist asking what you consider "non-standard
Usenet format", given that your posts don't consider line length.
(Did the "standards" change during the past three decades maybe?
Do we use only those parts of the "standards" that we like and
ignore others? Or does it boil down to Netiquette is no standard?)
There are a great variety of projects, [...]
Of course testing is important, at many levels. But the time to test
your code is when you are confident that it is correct - testing is not
an alternative to writing code that is as clearly correct as you are
able to make it.
On 11/16/24 04:42, Stefan Ram wrote:
...
[...]
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
Nice. It did take a little while for me to figure out what was wrong,
but since I knew that something was wrong, I did eventually find it -
without first running the program.
Bart <bc@freeuk.com> wrote:
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider the much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally, time spent _using_ a compiler should be bigger than time
spent writing the compiler. If a compiler gets enough use, that
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
More generally, I want to minimize the time spent by the programmer,
that is, the _sum over all iterations leading to a correct program_ of
compile time and "think time". A compiler that compiles slower
but allows fewer iterations due to better diagnostics may win.
Also, humans perceive a 0.1s delay almost like no delay at all.
So it does not matter if a single compilation step is 0.1s or
0.1ms. Modern computers can do a lot of work in 0.1s.
Yes. This may lead to some complexity. The simple approach is to
avoid obviously useless recompilation ('make' is doing this).
A more complicated approach may keep some intermediate data and
try to "validate" it first. If a previous analysis is still valid,
then it can be reused. If something significant changes, then
it needs to be re-done. But many changes have only a very local
effect, so at least theoretically re-using analyses could
save substantial time.
Since now that last statement is the '0' value (any int value wil do).
What should my compiler report instead? What analysis should it be
doing? What would that save me from typing?
Currently, in the typed language that I use, a literal translation of
the example hits a hole in the checks, that is, the code is accepted.
Concerning the needed analyses: one thing needed is a representation of
the type, either a Pascal range type or an enumeration type (the example
is _very_ unnatural, because in modern programming magic numbers
are avoided and there would be some symbolic representation
adding meaning to the numbers). Second, the compiler must recognize
that this is a "multiway switch" and collect the conditions.
Once
you have such a representation (which may be desirable for other
reasons) it is easy to determine the set of handled values. More
precisely, in this example we just have a small number of discrete
values; a more ambitious compiler may have a list of ranges.
If the type also specifies a list of values or a list of ranges, then
it is easy to check whether all values of the type are handled.
You can't do this stuff with the compilers David Brown uses; I'm
guessing you can't do it with your prefered ones either.
To recompile the typed system I use (about 0.4M lines) on a new fast
machine I need about 53s. But that is kind of cheating:
- this time is for a parallel build using 20 logical cores
- the compiler is not in the language it compiles (but in an untyped
version of it)
- actual compilation of the compiler is a small part of total
compile time
On a slow machine compile time can be as large as 40 minutes.
An untyped system that I use has about 0.5M lines and recompiles
itself in 16s on the same machine. This one uses a single core.
On a slow machine compile time may be closer to 2 minutes.
Again, compiler compile time is only a part of build time.
Actually, one time-intensive part is creating the index for the included
documentation.
Another is C compilation for a library file
(the system has image-processing functions and the low-level part of
image processing is done in C). Recompilation starts from a
minimal version of the system; rebuilding this minimal
version takes 3.3s.
Anyway, I do not need the cascaded recompilation that you present.
Both systems above have incremental compilation, the second one
at statement/function level: it offers an interactive prompt
which takes a statement from the user, compiles it and immediately
executes it. Such a statement may define a function or perform compilation.
Even on a _very_ slow machine there is no noticeable delay due to
compilation, unless you feed the system some oversized statement
or function (presumably from a file).
On 19/11/2024 01:53, Waldek Hebisch wrote:
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
That's not the complexity I had in mind. The 100-200MB sizes of
LLVM-based compilers are not because they use hash-tables over linear
search.
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
Bart <bc@freeuk.com> writes:
On 19/11/2024 01:53, Waldek Hebisch wrote:
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
That's not the complexity I had in mind. The 100-200MB sizes of
LLVM-based compilers are not because they use hash-tables over linear
search.
You still have this irrational obsession with the amount of disk
space consumed by a compiler suite - one that is useful to a massive
number of developers (esp. compared with the user-base of your
compiler).
The amount of disk space consumed by a compilation suite is
a meaningless statistic. 10MByte disks are a relic of the
distant past.
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
And Tiny C is useless for the majority of real-world applications.
How many people are using your compiler to build production applications?
On 19.11.2024 09:19, David Brown wrote:
[...]
There are a great variety of projects, [...]
I don't want the theme to get out of hand, so just one amendment to...
Of course testing is important, at many levels. But the time to test
your code is when you are confident that it is correct - testing is not
an alternative to writing code that is as clearly correct as you are
able to make it.
Sounds like early-days practice, where code is written, "defined" at
some point as "correct", and then tests are written (sometimes
by the same folks who implemented the code) to prove that the code
does what is expected, or the tests have been spared because it was
"clear" that the code is "correct" (sort of).
Since the 1990s we've had other principles, yes, "on many levels"
(as you started your paragraph). At all levels there's some sort of
specification (or description) that defines the expected outcome
and behavior; tests [at levels higher than unit-tests] are written
if not in parallel then usually by separate groups. The decoupling
is important, the "first implement, then test" serializing certainly
not.
Of course every responsible programmer tries to create correct code,
supported by their own experience and by the project's regulatory means. But
that doesn't guarantee correct code. Neither do tests guarantee that.
But tests have been, IME, more effective in supporting correctness
than being "confident that it is correct" (as you say).
On 19/11/2024 01:53, Waldek Hebisch wrote:
Another example, building 40Kloc interpreter from source then running it
in memory:
c:\qx>tm \bx\mm -run qq hello
Compiling qq.m to memory
Hello, World! 19-Nov-2024 15:38:47
TM: 0.11
c:\qx>tm qq hello
Hello, World! 19-Nov-2024 15:38:49
TM: 0.05
The second version runs a precompiled EXE. So building from source added only 90ms.
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider a much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the
two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally time spent _using_ a compiler should be bigger than time
spent writing the compiler. If the compiler gets enough use, it
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
On 19/11/2024 01:53, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider a much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally time spent _using_ a compiler should be bigger than time
spent writing the compiler. If the compiler gets enough use, it
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
That's not the complexity I had in mind. The 100-200MB sizes of
LLVM-based compilers are not because they use hash-tables over linear search.
More generaly, I want to minimize time spent by the programmer,
that is _sum over all iterations leading to correct program_ of
compile time and "think time". Compiler that compiles slower,
but allows less iterations due to better diagnostics may win.
Also, humans perceive 0.1s delay almost like no delay at all.
So it does not matter if single compilation step is 0.1s or
0.1ms. Modern computers can do a lot of work in 0.1s.
What's the context of this 0.1 seconds? Do you consider it long or short?
My tools can generally build my apps from scratch in 0.1 seconds; big compilers tend to take a lot longer. Only Tiny C is in that ballpark.
So I'm failing to see your point here. Maybe you picked up that 0.1
seconds from an earlier post of mine and are suggesting I ought to be
able to do a lot more analysis within that time?
Yes. This may lead to some complexity. Simple approach is to
avoid obviously useless recompilation ('make' is doing this).
More complicated approach may keep some intermediate data and
try to "validate" them first. If previous analysis is valid,
then it can be reused. If something significant changes, than
it needs to be re-done. But many changes only have very local
effect, so at least theoretically re-using analyses could
save substantial time.
I consider compilation: turning textual source code into a form that can
be run, typically binary native code, to be a completely routine task
that should be as simple and as quick as flicking a light switch.
While anything else that might be a deep analysis of that program I
consider to be a quite different task. I'm not saying there is no place
for it, but I don't agree it should be integrated into every compiler
and always invoked.
Right now that last statement is the '0' value (any int value will do).
What should my compiler report instead? What analysis should it be
doing? What would that save me from typing?
Currently, in the typed language that I use, a literal translation of
the example hits a hole in the checks; that is, the code is accepted.
Concerning the needed analyses: one thing needed is a representation of
the type, either a Pascal range type or an enumeration type (the example
is _very_ unnatural, because in modern programming magic numbers
are avoided and there would be some symbolic representation
adding meaning to the numbers). Second, the compiler must recognize
that this is a "multiway switch" and collect the conditions.
The example came from C. Even if written as a switch, C switches do not return values (and also are hard to even analyse as to which branch is which).
In my languages, switches can return values, and a switch written as the last statement of a function is considered to do so, even if each branch uses an explicit 'return'. Then, it will consider a missing ELSE a 'hole'.
It will not do any analysis of the range other than what is necessary to implement switch (duplicate values, span of values, range-checking when using jump tables).
So the language may require you to supply a dummy 'else x' or 'return
x'; so what?
The alternative appears to be one of:
* Instead of 'else' or 'return', to write 'unreachable', which puts some
trust, not in the programmer, but in some person calling your function
who does not have sight of the source code, to avoid calling it with
invalid arguments
Once
you have such a representation (which may be desirable for other
reasons) it is easy to determine the set of handled values. More
precisely, in this example we just have a small number of discrete
values. A more ambitious compiler may keep a list of ranges.
If the type also specifies a list of values or ranges, then
it is easy to check whether all values of the type are handled.
The types are typically plain integers, with ranges from 2**8 to 2**64.
The ranges associated with application needs will be more arbitrary.
If talking about a language with ranged integer types, then there might
be more point to it, but that is itself a can of worms. (It's hard to do without getting halfway to implementing Ada.)
You can't do this stuff with the compilers David Brown uses; I'm
guessing you can't do it with your preferred ones either.
To recompile the typed system I use (about 0.4M lines) on a new fast
machine I need about 53s. But that is kind of cheating:
- this time is for a parallel build using 20 logical cores
- the compiler is not in the language it compiles (but in an untyped
version of it)
- actual compilation of the compiler is a small part of total
compile time
On a slow machine compile time can be as large as 40 minutes.
40 minutes for 400K lines? That's 160 lines per second; how old is this machine? Is the compiler written in Python?
An untyped system that I use has about 0.5M lines and recompiles
itself in 16s on the same machine. This one uses a single core.
On a slow machine compile time may be closer to 2 minutes.
So 4K to 30Klps.
Again, compiler compile time is only a part of build time.
Actually, one time-intensive part is creating the index for the included
documentation.
Which is not going to be part of a routine build.
Another is C compilation for a library file
(system has image-processing functions and low-level part of
image processing is done in C). Recompilation starts from a
minimal version of the system; rebuilding this minimal
version takes 3.3s.
My language tools work on a whole program, where a 'program' is a single
EXE or DLL file (or a single OBJ file in some cases).
A 'build' then turns N source files into 1 binary file. This is the task
I am talking about.
A complete application may have several such binaries and a bunch of
other stuff. Maybe some source code is generated by a script. This part
is open-ended.
However each of my current projects is a single, self-contained binary
by design.
Anyway, I do not need the cascaded recompilation that you present.
Both systems above have incremental compilation, the second one
at statement/function level: it offers an interactive prompt
which takes a statement from the user, compiles it, and immediately
executes it. Such a statement may define a function or perform a computation.
Even on a _very_ slow machine there is no noticeable delay due to
compilation, unless you feed the system some oversized statement
or function (presumably from a file).
This sounds like a REPL system. There, each line is a new part of the program which is processed, executed and discarded.
In that regard, it
is not really what I am talking about, which is AOT compilation of a
program represented by a bunch of source files.
Or can a new line redefine something, perhaps a function definition, previously entered amongst the last 100,000 lines? Can a new line
require compilation of something typed 50,000 lines ago?
What happens if you change the type of a global; are you saying that
none of the program codes needs revising?
An untyped system
What do you mean by an untyped system? To me it usually means
dynamically typed.
On 19/11/2024 15:51, Bart wrote:
On 19/11/2024 01:53, Waldek Hebisch wrote:
Another example, building 40Kloc interpreter from source then running it
in memory:
c:\qx>tm \bx\mm -run qq hello
Compiling qq.m to memory
Hello, World! 19-Nov-2024 15:38:47
TM: 0.11
c:\qx>tm qq hello
Hello, World! 19-Nov-2024 15:38:49
TM: 0.05
The second version runs a precompiled EXE. So building from source added
only 90ms.
Sorry, that should be 60ms. Running that interpreter from source only
takes 1/16th of a second longer, not 1/11th of a second.
BTW I didn't remark on the range of your (WH's) figures. They spanned
from 40 minutes for a build down to instant, but it's not clear which
languages they are for, which tools are used, or which machines. Or how
much work they have to do to get those faster times, or what work they
don't do: I'm guessing it's not processing 0.5M lines for that fastest time.
So it was hard to formulate a response.
All my timings are either for C or my systems language, running on one
core on the same PC.
Bart <bc@freeuk.com> wrote:
It is related: both gcc and LLVM are doing analyses that in the
past were deemed impractically expensive (both in time and in space).
Those analyses work now thanks to smart algorithms that
significantly reduced resource usage. I know that you consider
this too expensive.
What's the context of this 0.1 seconds? Do you consider it long or short?
Context is interactive response. It means "pretty fast for interactive
use".
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
So I'm failing to see your point here. Maybe you picked up that 0.1
seconds from an earlier post of mine and are suggesting I ought to be
able to do a lot more analysis within that time?
This 0.1s is an old thing. My point is that if you are compiling a simple
change, then you should be able to do more in this time. In normal
development, source files bigger than 10000 lines are relatively
rare, so once you get into the range of 50000-100000 lines per second,
making the compiler faster is of marginal utility.
We clearly differ on the question of what is routine. Creating a usable
executable is a rare task; once an executable is created it can be used
for a long time. OTOH development is routine, and for this one wants
to know if a change is correct.
Already a simple thing would be an improvement: make the compiler aware of
the error routine (if you do not have one, add one) so that when you
signal an error the compiler will know that there is no need for a normal
return value.
Which is not going to be part of a routine build.
In a sense a build is not routine. A build is done for two purposes:
- to install a working system from sources, which includes
documentation
- to check that the build works properly after changes; this also
should check the documentation build.
Normal development goes on without rebuilding the system.
I know. But this is not what I do. The build produces multiple
artifacts, some of them executable, some loadable code (but _not_
in a form recognized by the operating system), some essentially
non-executable (like documentation).
This sounds like a REPL system. There, each line is a new part of the
program which is processed, executed and discarded.
First, I am writing about two different systems. Both have a REPL.
Lines typed at the REPL are "discarded", but their effect may last
a long time.
What happens if you change the type of a global; are you saying that
none of the program codes needs revising?
In the typed system there are no global "library" variables; all data
is encapsulated in modules and normally accessed in an abstract way,
by calling appropriate functions. So, in "clean" code you
can recompile a single module and the whole system works.
Bart <bc@freeuk.com> wrote:
BTW I didn't remark on the range of your (WH's) figures. They spanned 40
minutes for a build to instant, but it's not clear for which languages
they are, which tools are used and which machines. Or how much work they
have to do to get those faster times, or what work they don't do: I'm
guessing it's not processing 0.5M lines for that fastest time.
As I wrote, there are 2 different systems; if interested you can fetch
them from github.
I do not think I will use your system language. And for a C compiler,
at least currently, it does not make a big difference to me if your
compiler can do 1Mloc or 5Mloc on my machine; both are "pretty fast".
What matters more is support of debugging output, supporting
targets that I need (like ARM or Risc-V), good diagnostics
and optimization.
I recently installed TinyC on a small Risc-V
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
Well, it's a blue moon when someone nails it. Most of them fall
for my little gotcha hook, line, and sinker.
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you
might get by optimising it is vital!
Here I might borrow one of your arguments and suggest such a speed-up is
only necessary on a rare production build.
I recently installed TinyC on a small Risc-V
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Only 20,000KB? My first compilers worked on 64KB systems, not all of
which was available either.
None of my recent products will do so now, but they will still fit on a
floppy disk.
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
BTW I didn't remark on the range of your (WH's) figures. They spanned 40
minutes for a build to instant, but it's not clear for which languages
they are, which tools are used and which machines. Or how much work they
have to do to get those faster times, or what work they don't do: I'm
guessing it's not processing 0.5M lines for that fastest time.
As I wrote, there are 2 different systems; if interested you can fetch
them from github.
Do you have a link? Probably I won't attempt to build but I can see what
it looks like.
I do not think I will use your system language. And for a C compiler,
at least currently, it does not make a big difference to me if your
compiler can do 1Mloc or 5Mloc on my machine; both are "pretty fast".
What matters more is support of debugging output, supporting
targets that I need (like ARM or Risc-V), good diagnostics
and optimization.
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you might get by optimising it is vital!
Here I might borrow one of your arguments and suggest such a speed-up is only necessary on a rare production build.
I recently installed TinyC on a small Risc-V
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Only 20,000KB? My first compilers worked on 64KB systems, not all of
which was available either.
None of my recent products will do so now, but they will still fit on a floppy disk.
BTW why don't you use a cross-compiler? That's what David Brown would say.
Bart <bc@freeuk.com> writes:
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you
might get by optimising it is vital!
I don't consider it funny at all, rather it is simply the way things
should be. One compiles once.
One's customer runs the resulting
executable perhaps millions of times.
Here I might borrow one of your arguments and suggest such a speed-up is
only necessary on a rare production build.
And again, you've clearly never worked with any significantly
large project. Like for instance an operating system.
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Only 20,000KB? My first compilers worked on 64KB systems, not all of
which was available either.
My first compilers worked on a 4KW PDP-8. Not that I have any
interest in _ever_ working in such a constrained environment
ever again.
None of my recent products will do so now, but they will still fit on a
floppy disk.
And, nobody cares.
On 19/11/2024 22:40, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It is related: both gcc and LLVM are doing analyses that in the
past were deemed impractically expensive (both in time and in space).
Those analyses work now thanks to smart algorithms that
significantly reduced resource usage. I know that you consider
this too expensive.
How long would LLVM take to compile itself on one core? (Here I'm not
even sure what LLVM is; if you download the binary, it's about 2.5GB,
but a typical LLVM compiler might be 100+ MB. But I guess it will be a
while in either case.)
I have a product now that is like a mini-LLVM backend. It can build into
a standalone library of under 0.2MB, which can directly produce EXEs, or
it can interpret. Building that product from scratch takes 60ms.
That is my kind of product.
What's the context of this 0.1 seconds? Do you consider it long or short?
Context is interactive response. It means "pretty fast for interactive
use".
It's less than the time to press and release the Enter key.
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
So I'm failing to see your point here. Maybe you picked up that 0.1
seconds from an earlier post of mine and are suggesting I ought to be
able to do a lot more analysis within that time?
This 0.1s is an old thing. My point is that if you are compiling a simple
change, then you should be able to do more in this time. In normal
development, source files bigger than 10000 lines are relatively
rare, so once you get into the range of 50000-100000 lines per second,
making the compiler faster is of marginal utility.
I *AM* doing more in that time! It just happens to be stuff you appear
to have no interest in:
* I write whole-program compilers: you always process all source files
of an application. The faster the compiler, the bigger the scale of app
it becomes practical on.
* That means no headaches with dependencies (it goes in hand with a
decent module scheme)
* I can change one tiny corner of the program, say add an /optional/
argument to a function, which requires compiling all call-sites across
the program, and the next compilation will take care of everything
* If I were to do more with optimisation (there is lots that can be done without getting into the heavy stuff), it automatically applies to the
whole program
* I can choose to run applications from source code, without generating discrete binary files, just like a script language
* I can choose (with my new backend) to interpret programs in this
static language. (Interpretation gives better debugging opportunities)
* I don't need to faff around with object files or linkers
Module-based independent compilation and having to link 'object files'
is stone-age stuff.
We clearly differ on the question of what is routine. Creating a usable
executable is a rare task; once an executable is created it can be used
for a long time. OTOH development is routine, and for this one wants
to know if a change is correct.
I take it then that you have some other way of doing test runs of a
program without creating an executable?
It's difficult to tell from your comments.
Already a simple thing would be an improvement: make the compiler aware of
the error routine (if you do not have one, add one) so that when you
signal an error the compiler will know that there is no need for a normal
return value.
OK, but what does that buy me? Saving a few bytes for a return
instruction in a function? My largest program, which is 0.4MB, already
only occupies 0.005% of the machine's 8GB.
Which is not going to be part of a routine build.
In a sense a build is not routine. A build is done for two purposes:
- to install a working system from sources, which includes
documentation
- to check that the build works properly after changes; this also
should check the documentation build.
Normal development goes on without rebuilding the system.
We must be talking at cross-purposes then.
Either you're developing using interpreted code, or you must have some
means of converting source code to native code, but for some reason you don't use 'compile' or 'build' to describe that process.
Or maybe your REPL/incremental process can run for days doing
incremental changes without doing a full compile.
It seems quite mysterious.
I might run my compiler hundreds of times a day (at 0.1 seconds a time,
600 builds would occupy one whole minute in the day!). I often do it for frivolous purposes, such as trying to get some output lined up just
right. Or just to make sure something has been recompiled since it's so quick it's hard to tell.
I know. But this is not what I do. The build produces multiple
artifacts, some of them executable, some loadable code (but _not_
in a form recognized by the operating system), some essentially
non-executable (like documentation).
So, 'build' means something different to you. I use 'build' just as a
change from writing 'compile'.
This sounds like a REPL system. There, each line is a new part of the
program which is processed, executed and discarded.
First, I am writing about two different systems. Both have REPL.
Lines typed at REPL are "discarded", but their effect may last
long time.
My last big app used a compiled core but most user-facing functionality
was done using an add-on script language. This meant I could develop
such modules from within a working application, which provided a rich, persistent environment.
Changes to the core program required a rebuild and a restart.
However the whole thing was an application, not a language.
What happens if you change the type of a global; are you saying that
none of the program codes needs revising?
In the typed system there are no global "library" variables; all data
is encapsulated in modules and normally accessed in an abstract way,
by calling appropriate functions. So, in "clean" code you
can recompile a single module and the whole system works.
I used module-at-a-time compilation until 10-12 years ago. The module
scheme had to be upgraded at the same time, but it took several goes to
get it right.
Now I wouldn't go back. Who cares about compiling a single module that
may or may not affect a bunch of others? Just compile the lot!
If a project's scale becomes too big, then it should be split into independent program units, for example a core EXE file and a bunch of
DLLs; that's the new granularity. Or a lot of functionality can be off-loaded to scripts, as I used to do.
(My scripting language code still needs bytecode compilation, and I also
use whole-program units there, but the bytecode compiler goes up to 2Mlps.)
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
BTW I didn't remark on the range of your (WH's) figures. They spanned 40
minutes for a build to instant, but it's not clear for which languages
they are, which tools are used and which machines. Or how much work they
have to do to get those faster times, or what work they don't do: I'm
guessing it's not processing 0.5M lines for that fastest time.
As I wrote, there are 2 different systems; if interested you can fetch
them from github.
Do you have a link? Probably I won't attempt to build but I can see what
it looks like.
[...]
All I have been arguing against is the idea of blindly putting in
validity tests for parameters in functions, as though it were a habit
that by itself leads to fewer bugs in code.
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I do not think I will use your system language. And for a C compiler,
at least currently, it does not make a big difference to me if your
compiler can do 1Mloc or 5Mloc on my machine; both are "pretty fast".
What matters more is support of debugging output, supporting
targets that I need (like ARM or Risc-V), good diagnostics
and optimization.
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you might get by optimising it is vital!
BTW why don't you use a cross-compiler? That's what David Brown would say.
On 15/11/2024 19:50, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 11/11/2024 20:09, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
Concerning the correct place for checks: one could argue that a check
should be close to the place where the result of the check matters, which
frequently is in the called function.
No, there I disagree. The correct place for the checks should be close
to where the error is, and that is in the /calling/ code. If the called
function is correctly written, reviewed, tested, documented and
considered "finished", why would it be appropriate to add extra code to
that in order to test and debug some completely different part of the code?
The place where the result of the check /really/ matters is the calling
code. And that is also the place where you can most easily find the
error, since the error is in the calling code, not the called function.
And it is most likely to be the code that you are working on at the time
- the called function is already written and tested.
And frequently the check requires
computation that is done by the called function as part of normal
processing, but would be extra code in the caller.
It is more likely to be the opposite in practice.
And for much of the time, the called function has no real practical way
to check the parameters anyway. A function that takes a pointer
parameter - not an uncommon situation - generally has no way to check
the validity of the pointer. It can't check that the pointer actually
points to useful source data or an appropriate place to store data.
All it can do is check for a null pointer, which is usually a fairly
useless thing to do (unless the specifications for the function make the
pointer optional). After all, on most (but not all) systems you already
have a "free" null pointer check - if the caller code has screwed up and
passed a null pointer when it should not have done, the program will
quickly crash when the pointer is used for access. Many compilers
provide a way to annotate function declarations to say that a pointer
must not be null, and can then spot at least some such errors at compile
time. And of course the calling code will very often be passing the
address of an object in the call - since that can't be null, a check in
the function is pointless.
Well, in a sense pointers are easy: if you do not play nasty tricks
with casts then type checks do a significant part of the checking. Of
course, a pointer may be uninitialized (but compiler warnings help a lot
here), memory may be overwritten, etc. But overwritten memory is
rather special: if you checked that the content of memory is correct,
but it is overwritten after the check, then the earlier check does not
help. Anyway, the main point is ensuring that the pointed-to data
satisfies the expected conditions.
That does not match reality. Pointers are far and away the biggest
source of errors in C code. Use after free, buffer overflows, mixups of
who "owns" the pointer - the scope for errors is boundless. You are
correct that type systems can catch many potential types of errors - unfortunately, people /do/ play nasty tricks with type checks.
Conversions of pointer types are found all over the place in C
programming, especially conversions back and forth with void* pointers.
All this means that invalid pointer parameters are very much a real
issue - but are typically impossible to check in the called function.
The way you avoid getting errors in your pointers is being careful about having the right data in the first place, so you only call functions
with valid parameters. You do this by having careful control about the ownership and lifetime of pointers, and what they point to, keeping conventions in the names of your pointers and functions to indicate who
owns what, and so on. And you use sanitizers and similar tools during testing and debugging to distinguish between tests that worked by luck,
and ones that worked reliably. (And of course you may consider other languages than C that help you express your requirements in a clearer
manner or with better automatic checking.)
Put the same effort and due diligence into the rest of your code, and suddenly you find your checks for other kinds of parameters in functions
are irrelevant as you are now making sure you call functions with appropriate valid inputs.
Once you get to more complex data structures, the possibility for the
caller to check the parameters gets steadily less realistic.
So now your practice of having functions "always" check their parameters
leaves the people writing calling code with a false sense of security -
usually you /don't/ check the parameters, you only ever do the simple
checks that the caller could (and should!) do if they were realistic.
You've got the maintenance and cognitive overload of extra source code
for your various "asserts" and other checks, regardless of any run-time
costs
(which are often irrelevant, but occasionally very important).
You will note that much of this - for both sides of the argument - uses
words like "often", "generally" or "frequently". It is important to
appreciate that programming spans a very wide range of situations, and I
don't want to be too categorical about things. I have already said
there are situations when parameter checking in called functions can
make sense. I've no doubt that for some people and some types of
coding, such cases are a lot more common than what I see in my coding.
Note also that when you can use tools to automate checks, such as
"sanitize" options in compilers or different languages that have more
in-built checks, the balance differs. You will generally pay a run-time
cost for those checks, but you don't have the same kind of source-level
costs - your code is still clean, clear, and amenable to correctness
checking, without hiding the functionality of the code in a mass of
unnecessary explicit checks. This is particularly good for debugging,
and the run-time costs might not be important. (But if run-time costs
are not important, there's a good chance that C is not the best language
to be using in the first place.)
Our experience differs. As a silly example, consider a parser
which produces a parse tree. The caller is supposed to pass a
syntactically correct string as an argument. However, checking syntactic
correctness requires almost the same effort as producing the parse tree,
so it is usual that the parser both checks correctness and produces the result.
The trick here is to avoid producing a syntactically invalid string in
the first place. Solve the issue at the point where there is a mistake
in the code!
(If you are talking about a string that comes from outside the code in
some way, then of course you need to check it - and if that is most conveniently done during the rest of parsing, then that is fair enough.)
I have computations that are quite different from parsing but
in some cases share the same characteristic: checking correctness of
arguments requires complex computation similar to producing the
actual result. More frequently, the called routine can check various
invariants which with high probability can detect errors. Doing
the same check in the caller is impractical.
I think you are misunderstanding me - maybe I have been unclear. I am
saying that it is the /caller's/ responsibility to make sure that the
parameters it passes are correct, not the /callee's/ responsibility.
That does not mean that the caller has to add checks to get the
parameters right - it means the caller has to use correct parameters.
Think of this like walking near a cliff-edge. Checking parameters
before the call is like having a barrier at the edge of the cliff. My
recommendation is that you know where the cliff edge is, and don't walk
there.
On 20/11/2024 02:33, Bart wrote:
It's funny how nobody seems to care about the speed of compilers
(which can vary by 100:1), but for the generated programs, the 2:1
speedup you might get by optimising it is vital!
To understand this, you need to understand the benefits of a program
running quickly.
Let's look at the main ones:
There is usually a point where a program is "fast enough" - going faster makes no difference. No one is ever going to care if a compilation
takes 1 second or 0.1 seconds, for example.
It doesn't take much thought to realise that for most developers, the
speed of their compiler is not actually a major concern in comparison to
the speed of other programs.
While writing code, and testing and debugging it, a given build might
only be run a few times, and compile speed is a bit more relevant. Generally, however, most programs are run far more often, and for far longer, than their compilation time.
And as usual, you miss out the fact that toy compilers - like yours, or TinyC - miss all the other features developers want from their tools. I want debugging information, static error checking, good diagnostics,
support for modern language versions (that's primarily C++ rather than
C), useful extensions, compact code, correct code generation, and most importantly of all, support for the target devices I want.
I wouldn't
care if your compiler can run at a billion lines per second and gcc took
an hour to compile - I still wouldn't be interested in your compiler
because it does not generate code for the devices I use. Even if it
did, it would be useless to me, because I can trust the code gcc
generates and I cannot trust the code your tool generates.
And even if
your tool did everything else I need, and you could convince me that it
is something a professional could rely on, I'd still use gcc for the
better quality generated code, because that translates to money saved
for my customers.
BTW why don't you use a cross-compiler? That's what David Brown would
say.
That is almost certainly what he normally does. It can still be fun to play around with things like TinyC, even if it is of no practical use
for the real development.
Bart <bc@freeuk.com> wrote:
Either you're developing using interpreted code, or you must have some
means of converting source code to native code, but for some reason you
don't use 'compile' or 'build' to describe that process.
Or maybe your REPL/incremental process can run for days doing
incremental changes without doing a full compile.
Yes.
It seems quite mysterious.
There is nothing mysterious here. In the typed system each module has
a vector (one-dimensional array) called the domain vector, containing
among other things references to called functions. All inter-module
calls are indirect ones; they take the thing to call from the domain
vector. When a module starts execution the references point to a
runtime routine doing similar work to a dynamic linker. The first call
goes to the runtime support routine, which finds the needed code and
replaces the reference in the domain vector.
When a module is recompiled, references in domain vectors are
reinitialized to point to the runtime. So searches are run again
and if needed pick up the new routine.
Note that there is a global table keeping info (including types)
about all exported routines from all modules. This table is used
when compiling a module and also by the search process at runtime.
The effect is that after recompilation of a single module I have a
runnable executable in memory, including the code of the new module.
If you wonder about compiling the same module many times: the system
has a garbage collector and unused code is garbage collected.
So, when the old version is replaced by a new one the old becomes
garbage and will be collected in due time.
The other system is similar in principle, but there is no need
for runtime search and domain vectors.
I might run my compiler hundreds of times a day (at 0.1 seconds a time,
600 builds would occupy one whole minute in the day!). I often do it for
frivolous purposes, such as trying to get some output lined up just
right. Or just to make sure something has been recompiled since it's so
quick it's hard to tell.
I know. But this is not what I do. A build produces multiple
artifacts, some of them executable, some loadable code (but _not_
in a form recognized by the operating system), some essentially
non-executable (like documentation).
So, 'build' means something different to you. I use 'build' just as a
change from writing 'compile'.
Build means creating new fully-functional system. That involves
possibly multiple compilations and whatever else is needed.
On 20/11/2024 16:15, David Brown wrote:
On 20/11/2024 02:33, Bart wrote:
It's funny how nobody seems to care about the speed of compilers
(which can vary by 100:1), but for the generated programs, the 2:1
speedup you might get by optimising it is vital!
To understand this, you need to understand the benefits of a program
running quickly.
As I said, people are preoccupied with that for programs in general. But when it comes to compilers, it doesn't apply! Clearly, you are implying
that those benefits don't matter when the program is a compiler.
Let's look at the main ones:
<snip>
OK. I guess you missed the bits here and in another post, where I
suggested that enabling optimisation is fine for production builds.
For the routine ones that I do 100s of times a day, where test runs are
generally very short, then I don't want to hang about waiting for a
compiler that is taking 30 times longer than necessary for no good reason.
There is usually a point where a program is "fast enough" - going
faster makes no difference. No one is ever going to care if a
compilation takes 1 second or 0.1 seconds, for example.
If you look at all the interactions people have with technology, with
GUI apps, even with mechanical things, a 1 second latency is generally disastrous.
A one-second delay between pressing a key and seeing a character appear
on a display or any other feedback would drive most people up the wall.
But 0.1 is perfectly fine.
It doesn't take much thought to realise that for most developers, the
speed of their compiler is not actually a major concern in comparison
to the speed of other programs.
Most developers are stuck with what there is. Naturally they will make
the best of it. Usually by finding 100 ways or 100 reasons to avoid
running the compiler.
While writing code, and testing and debugging it, a given build might
only be run a few times, and compile speed is a bit more relevant.
Generally, however, most programs are run far more often, and for far
longer, than their compilation time.
Developing code is the critical bit.
Even when a test run takes a bit longer as you need to set things up,
when you do need to change something and run it again, you don't want
any pointless delay.
Neither do you want to waste /your/ time pandering to a compiler's
slowness by writing makefiles and defining dependencies.
Or even
splitting things up into tiny modules.
I don't want to care about that
at all. Here's my bunch of source files, just build the damn thing, and
do it now!
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
As I said, no one is ever going to care if a compilation takes 1 second
or 0.1 seconds.
So your advice is that developers should be stuck
Which do you think an employer (or amateur programmer) would prefer?
a) A compiler that runs in 0.1 seconds with little static checking
b) A compiler that runs in 10 seconds but spots errors saving 6 hours debugging time
I might spend an hour or two writing code (including planing,
organising, reading references, etc.) and then 5 seconds building it.
Then there might be anything from a few minutes to a few hours testing
or debugging.
But using a good compiler saves a substantial amount of developer time
<snip the rest to save time>
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
On 21/11/2024 15:50, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest
application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
The one I remember most was 'TKB' I think it was, running on ICL 4/72
(360 clone). It took up most of the memory. It was used to link my small
Fortran programs.
Bart <bc@freeuk.com> writes:
On 21/11/2024 15:50, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest
application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
The one I remember most was 'TKB' I think it was, running on ICL 4/72
(360 clone). It took up most of the memory. It was used to link my small
Fortran programs.
So you generalize from your one non-standard experience to the entire ecosystem.
Typical Bart.
On 21/11/2024 16:10, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 15:50, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest
application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
The one I remember most was 'TKB' I think it was, running on ICL 4/72
(360 clone). It took up most of the memory. It was used to link my small
Fortran programs.
So you generalize from your one non-standard experience to the entire ecosystem.
Typical Bart.
Typical Scott. Did you post just to do a bit of bart-bashing?
Have you also considered that your experience of building operating
systems might itself be non-standard?
Bart <bc@freeuk.com> wrote:
...or to just always require 'else', with a dummy value if necessary?
Well, frequently it is easier to do a bad job than a good one.
I assume that you consider the simple solution the 'bad' one?
You wrote about _always_ requiring 'else' regardless of whether it is
needed or not. Yes, I consider this bad.
On 20/11/2024 21:17, Bart wrote:
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
And presumably you also advise doing so on a bargain basement
single-core computer from at least 15 years ago?
Sure. That's when you run a production build. I can even do that myself
on some programs (the ones where my C transpiler still works) and pass
them through gcc -O3. Then it might run 30% faster.
int main(void) {
int a;
int* p = 0;
a = *p;
}
Here's what happens with my C compiler when told to interpret it:
c:\cx>cc -i c
Compiling c.c to c.(int)
Error: Null ptr access
Here's what happens with gcc:
c:\cx>gcc c.c
c:\cx>a
<crashes>
Is there some option to insert such a check with gcc? I've no idea; most people don't.
Bart <bc@freeuk.com> wrote:
int main(void) {
int a;
int* p = 0;
a = *p;
}
Here's what happens with my C compiler when told to interpret it:
c:\cx>cc -i c
Compiling c.c to c.(int)
Error: Null ptr access
Here's what happens with gcc:
c:\cx>gcc c.c
c:\cx>a
<crashes>
Is there some option to insert such a check with gcc? I've no idea; most
people don't.
I would do
gcc -g c.c
gdb a.out
run
and gdb would show me the place with the bad access. Things like bounds
checking of array access or overflow checking make a big difference.
Null pointer access is reliably detected by hardware, so no big
deal. Say what your 'cc' will do with the following function:
int
foo(int n) {
    int a[10];
    int i;
    int res = 0;
    for(i = 0; i <= 10; i++) {
        a[i] = n + i;
    }
    for(i = 0; i <= 10; i++) {
        res += a[i];
    }
    return res;
}
Here gcc at compile time says:
foo.c: In function ‘foo’:
foo.c:15:17: warning: iteration 10 invokes undefined behavior [-Waggressive-loop-optimizations]
15 | res += a[i];
| ~^~~
foo.c:14:18: note: within this loop
14 | for(i = 0; i <= 10; i++) {
| ~~^~~~~
Bart <bc@freeuk.com> wrote:
Sure. That's when you run a production build. I can even do that
myself on some programs (the ones where my C transpiler still
works) and pass it through gcc-O3. Then it might run 30% faster.
On fast machine running Dhrystone 2.2a I get:
tcc-0.9.28rc 20000000
gcc-12.2 -O 64184852
gcc-12.2 -O2 83194672
clang-14 -O 83194672
clang-14 -O2 85763288
so with -O2 this is more than 4 times faster. Dhrystone correlates
reasonably with the runtime of tight compute-intensive programs.
Compilers started to cheat on the original Dhrystone, so there are
bigger benchmarks like SPEC INT. But Dhrystone 2 has modifications
to make cheating harder, so I think it is still a reasonable
benchmark. Actually, the difference may be much bigger; for example
in image processing both clang and gcc can use vector instructions,
which may give an additional speedup of order 8-16.
30% above means that you are much better than tcc or your program
is behaving badly (I have programs that make intensive use of
memory; here the effect of optimization would be smaller, but still
of order 2).
On Fri, 22 Nov 2024 12:33:29 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
Bart <bc@freeuk.com> wrote:
Sure. That's when you run a production build. I can even do that
myself on some programs (the ones where my C transpiler still
works) and pass it through gcc-O3. Then it might run 30% faster.
On fast machine running Dhrystone 2.2a I get:
tcc-0.9.28rc 20000000
gcc-12.2 -O 64184852
gcc-12.2 -O2 83194672
clang-14 -O 83194672
clang-14 -O2 85763288
so with -O2 this is more than 4 times faster. Dhrystone correlates
reasonably with the runtime of tight compute-intensive programs.
Compilers started to cheat on the original Dhrystone, so there are
bigger benchmarks like SPEC INT. But Dhrystone 2 has modifications
to make cheating harder, so I think it is still a reasonable
benchmark. Actually, the difference may be much bigger; for example
in image processing both clang and gcc can use vector instructions,
which may give an additional speedup of order 8-16.
30% above means that you are much better than tcc or your program
is behaving badly (I have programs that make intensive use of
memory; here the effect of optimization would be smaller, but still
of order 2).
gcc -O is not what Bart was talking about. It is quite similar to -O1.
Try gcc -O0.
With regard to speedup, I had run only one or two benchmarks with tcc
and my results were close to those of Bart. gcc -O0 is very similar to
tcc in the speed of the exe, but compiles several times slower. The
gcc -O2 exe is about 2.5 times faster.
I'd guess I could construct a case where gcc successfully vectorizes
some floating-point loop calculation and shows a 10x speedup vs tcc on
modern Zen5 hardware. But that would not be typical.
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
I did a CS degree actually. I also spent a year programming, working for
the ARC and SRC (UK research councils).
But since you are being so condescending, I think /your/ problem is in having to use C. I briefly mentioned that a 'better language' can help.
While I don't claim that my language is particularly safe, mine is
somewhat safer than C in its type system, and far less error prone in
its syntax and its overall design (for example, a function's details are always defined in exactly one place, so less maintenance and fewer
things to get wrong).
So, half the options in your C compilers are to help get around those shortcomings.
You also seem proud that in this example:
int F(int n) {
if (n==1) return 10;
if (n==2) return 20;
}
You can use 'unreachable()', a new C feature, to silence compiler
messages about running into the end of the function, something I
considered a complete hack.
My language requires a valid return value from the last statement. In
that it's similar to the Rust example I posted 9 hours ago.
Yet the gaslighting here suggested what I chose to do was completely wrong.
And presumably you also advise doing so on a bargain basement
single-core computer from at least 15 years ago?
Another example of you acknowledging that compilation speed can be a problem. So a brute force approach to speed is what counts for you.
If you found that it took several hours to drive 20 miles from A to B,
your answer would be to buy a car that goes at 300mph, rather than doing endless detours along the way.
Or another option is to think about each journey extremely carefully,
and then only do the trip once a week!
On 22/11/2024 12:33, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Sure. That's when you run a production build. I can even do that myself
on some programs (the ones where my C transpiler still works) and pass
it through gcc-O3. Then it might run 30% faster.
On fast machine running Dhrystone 2.2a I get:
tcc-0.9.28rc 20000000
gcc-12.2 -O 64184852
gcc-12.2 -O2 83194672
clang-14 -O 83194672
clang-14 -O2 85763288
so with 02 this is more than 4 times faster. Dhrystone correlated
resonably with runtime of tight compute-intensive programs.
Compiler started to cheat on original Dhrystone, so there are
bigger benchmarks like SPEC INT. But Dhrystone 2 has modifications
to make cheating harder, so I think it is still reasonable
benchmark. Actually, difference may be much bigger, for example
in image processing both clang and gcc can use vector intructions,
with may give additional speedup of order 8-16.
30% above means that you are much better than tcc or your program
is badly behaving (I have programs that make intensive use of
memory, here effect of optimization would be smaller, but still
of order 2).
The 30% applies to my typical programs, not benchmarks. Sure, gcc -O3
can do a lot of aggressive optimisations when everything is contained
within one short module and most runtime is spent in clear bottlenecks.
Real apps, like say my compilers, are different. They tend to use
globals more, program flow is more disseminated. The bottlenecks are
harder to pin down.
But, OK, here's the first sizeable benchmark that I thought of (I can't
find a reliable Dhrystone one; perhaps you can post a link).