sometimes i got such pieces of code like

if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}

technically i would need to add elses - but the question is, if i do
that, does the code really have a chance to be slower without else
(except on some very primitive compilers)? not adding else makes the
code shorter.. so i'm not really sure which is better
On 31/10/2024 12:11, fir wrote:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses - but the question is whether to do that
Why not ...
switch (n)
{
    case 1:
        /* something */
        break;
    case 2:
        /* etc ... */
    default:
        /* something else */
}
... ?
does the code really have a chance to be slower without else (except on
some very primitive compilers)?
not adding else makes the code shorter.. so i'm not really sure which
is better
The size of teh [sic] source code won't make any difference to the size
of the executable - so aim for readability first and foremost.
Richard Harnden wrote:
switch is literally a flawed construction (i was even writing, or at
least thinking about, why, last time - but sadly i literally forgot my
arguments; i would need to find those notes)
so i forgot why, but not that it is flawed - i mean the form of
this c switch has some error
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
but the question is, if i do that, does the code really have a
chance to be slower without else (except on some very
primitive compilers)? not adding else makes the code
shorter.. so i'm not really sure which is better
I am all for code literalism, and want it to emphasize the
most natural method of execution. Therefore, I usually
append `return' or `goto' to each then-block in a tabular
if-sequence as above. I believe the performance will depend
on the compiler.
On 31/10/2024 12:11, fir wrote:
std::unreachable();
On 2024-10-31, fir wrote:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses - but the question is whether to do that
does the code really have a chance to be slower without else (except on
some very primitive compilers)?
In the above, all conditionals are always checked -- that is, the truth
of a previous conditional statement has no bearing on subsequent tests.
This leads to the potential of tests going off in directions you hadn't
necessarily anticipated.
However, 'if .. else if .. else' will only check subsequent conditionals
if the prior statements were false. So for the case that n=2, you're
only ever testing the two cases "if (n==1)" (which is false) and
"else if (n==2)". The computer just skips to the end of the set of
statements.
Given this MWE (my own terrible code aside ;) ):
#include <stdio.h>

int main(void){
  int n=0;
  printf ("all if, n=%d\n",n);
  if (n==0) { printf ("n: %d\n",n); n++;}
  if (n==1) { printf ("n: %d\n",n); n++;}
  if (n==2) { printf ("n: %d\n",n); n++;}
  if (n==3) { printf ("n: %d\n",n); n++;}
  if (n==4) { printf ("n: %d\n",n); n++;}
  printf ("all if completed, n=%d\n",n);
  n=3;
  printf ("with else if, n=%d\n",n);
  if (n==0) { printf ("n: %d\n",n); n++;}
  else if (n==1) { printf ("n: %d\n",n); n++;}
  else if (n==2) { printf ("n: %d\n",n); n++;}
  else if (n==3) { printf ("n: %d\n",n); n++;}
  else { printf ("n: %d\n",n); n++;}
  printf ("with else if completed, n=%d\n",n);
  return 0;
}
You'll get the output:
all if, n=0
n: 0
n: 1
n: 2
n: 3
n: 4
all if completed, n=5
with else if, n=3
n: 3
with else if completed, n=4
HTH :)
Dan Purgert wrote:
On 2024-10-31, fir wrote:
i do not modify n in those {} blocks, so this example is not much relevant
my question is more like: what is a matter of better style
switch(a);
case(1) {}
case(2) {}
case(3) {}
case(4) {}
case(5) {}
On 2024-10-31, fir wrote:
Dan Purgert wrote:
On 2024-10-31, fir wrote:
i do not modify n in those {} blocks, so this example is not much relevant
I'm using that as a simplified case to force the issue. "n" could be modified anywhere, just so long as it is "between" any two of the test
cases being checked.
my question is more like: what is a matter of better style
If it is a series of related conditions, then "if .. else if .. else".
On 10/31/24 09:15, Anton Shepelev wrote:
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
He has indicated that the value of n is not changed inside any of the
if-clauses. A sufficiently sophisticated compiler could notice that
fact, and also that each of the conditions tests the same variable, and
as a result it could generate the same kind of code as if it had been
written with 'else', so it won't generate unnecessary condition tests.
It might, in fact, generate the same kind of code which would have been
generated if it had been coded properly, as a switch statement, so it
might use a jump table, if appropriate.
But it's better to write it as a switch statement in the first place, so
you don't have to rely upon the compiler being sufficiently
sophisticated to get the best results.
There are several clear patterns here: you're testing the same variable
'n' against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers or
even label pointers could be used).
Bart wrote:
so in short this group seems to have no conclusion, but is tolerant of
various approaches, as it seems
imo the else ladder is like the most proper, but i don't like it optically,
switch case i also don't like (as far as i remember i never use it in
my code; for years i haven't used even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended, but it's fully not clear how)
as to those pointer tables i'm not sure, but i measured it once and
it was (not sure as to this, as i don't remember exactly) slow, maybe
dependent on architecture, so it's not worth using (if i remember
correctly)
On 01/11/2024 11:32, fir wrote:
Well, personally I don't like that repetition, that's why I mentioned
the patterns. You're writing 'n' 5 times, '==' 5 times, and you're
writing out the numbers 1, 2, 3, 4, 5.
I also don't like the lack of exclusivity.
However I don't need to use C. If those 'somethings' were simple, or
were expressions, I could use syntax like this:
(n | s1, s2, s3, s4, s5)
If they were more elaborate statements, I would use a heavier syntax,
but still one where 'n' is only written once, and I don't need to repeat '=='.
In the C version, you could mistakenly write 'm' instead of 'n', or '=' instead of '=='; it's more error prone, and a compiler might not be able
to detect it.
In C, you could probably do something like this:

#define or else if

if (x == a) {}
or (x == b) {}
or (x == c) {}
Bart wrote:
Well, personally I don't like that repetition, that's why I mentioned
the patterns. You're writing 'n' 5 times, '==' 5 times, and you're
writing out the numbers 1, 2, 3, 4, 5.
I also don't like the lack of exclusivity.
However I don't need to use C. If those 'somethings' were simple, or
were expressions, I could use syntax like this:
(n | s1, s2, s3, s4, s5)
on C ground, more suitable is

{s1,s2,s3,s4,s5}[n]
//which is just array indexing
On 01/11/2024 12:55, fir wrote:
on C ground, more suitable is
{s1,s2,s3,s4,s5}[n]
//which is just array indexing
No, it's specifically not array indexing, as only one of s1 - s5 is
evaluated, or nothing is when n is not in range, e.g. n is 100.
You could try something like that in C:

  int x;
  x = ((int[]){(puts("a"),10), (puts("b"),20), (puts("c"),30), (puts("d"),40)})[3];
  printf("X=%d\n", x);
The output is:
a
b
c
d
X=40
Showing that all elements are evaluated first. If the index is 100, the
result is also undefined.
Bart wrote:
On 01/11/2024 12:55, fir wrote:
:-O
what is this? first time i see such a thing
fir wrote:
Bart wrote:
i'm surprised that it works - but in fact i meant that this syntax is
old-C compatible, but such a thing like
{printf("ONE"), printf("TWO"), printf("THREE")} [2]

shouldn't evaluate all - just the one that is selected,
like array tab[23] doesn't evaluate anything other than tab[23]
On 01/11/2024 14:17, fir wrote:
It's a 'compound literal'. It allows you to have the same {...}
initialisation data format, but anywhere, not just when initialising.
However it always needs a cast:

  (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2];

This prints ONETWOTHREE; it also then indexes the 3rd value of the
array, which is 5, as returned by printf, so this:

  printf("%d\n", (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2]);

prints ONETWOTHREE5
On 01/11/2024 16:59, Bart wrote:
It's a 'compound literal'. It allows you to have the same {...}
initialisation data format, but anywhere, not just when initialising.
However it always needs a cast:

  (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2];

This prints ONETWOTHREE; it also then indexes the 3rd value of the
array, which is 5, as returned by printf, so this:

  printf("%d\n", (int[]){printf("ONE"), printf("TWO"), printf("THREE")}[2]);

prints ONETWOTHREE5
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each has
to have some side-effect to illustrate that.
A true N-way-select construct (C only really has ?:) would evaluate only one, and would deal with an out-of-range condition.
(In my implementations, a default/else branch value must be provided if
the whole thing is expected to return a value.)
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each has
to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
You are free to choose the
rules you want for your own language, but you are not free to dictate
what you think the rules should be for others. (You are welcome to /opinions/, of course.)
(In my implementations, a default/else branch value must be provided
if the whole thing is expected to return a value.)
OK, if that's what you want. My preference, if I were putting together what /I/ thought was an ideal language for /my/ use, would be heavy use
of explicit specifications and contracts for code, so that a
default/else branch is either disallowed (if the selection covers
all legal values) or required (if the selection is abbreviated). A
default value "just in case" is, IMHO, worse than useless.
On 31/10/2024 19:16, James Kuyper wrote:
On 10/31/24 09:15, Anton Shepelev wrote:
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
He has indicated that the value of n is not changed inside any of the
if-clauses. A sufficiently sophisticated compiler could notice that
fact, and also that each of the conditions is on the same variable, and
as a result it could generate the same kind of code as if it had been
written with 'else', so it won't generate unnecessary condition tests.
It might, in fact, generate the same kind of code which would have been
generated if it had been coded properly, as a switch statement, so it
might use a jump table, if appropriate.
But it's better to write it as a switch statement in the first place, so
you don't have to rely upon the compiler being sufficiently
sophisticated to get the best results.
I disagree entirely.
It is best to write the code in the way that makes most sense -
whatever gives the best clarity and makes the programmer's intentions
obvious to readers, and with the least risk of errors. Consider the maintainability of the code - is it likely to be changed in the
future, or adapted and re-used in other contexts? If so, that should
be a big influence on how you structure the source code. Can a
different structure make it less likely for errors to occur unnoticed?
For example, if the controlling value can be an enumeration then with
a switch, a good compiler can check if there are accidentally
unhandled cases (and even a poor compiler can check for duplicates).
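As a concrete illustration of that last point, here is a sketch (my names) of an enum-controlled switch; with gcc or clang, -Wall enables -Wswitch, which warns if a case for one of the enumerators is deleted:

```c
#include <assert.h>
#include <string.h>

enum colour { RED, GREEN, BLUE };

/* No default: deleting any case below draws a -Wswitch warning,
   because an enumerator would then be unhandled. */
const char *colour_name(enum colour c) {
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    return "";   /* unreachable for valid enum values */
}
```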
On 11/1/24 04:56, David Brown wrote:
On 31/10/2024 19:16, James Kuyper wrote:
On 10/31/24 09:15, Anton Shepelev wrote:
fir:
sometimes i got such pieces of code like
if(n==1) {/*something*/}
if(n==2) {/*something*/}
if(n==3) {/*something*/}
if(n==4) {/*something*/}
if(n==5) {/*something*/}
technically i would need to add elses
Why?
He has indicated that the value of n is not changed inside any of the
if-clauses. A sufficiently sophisticated compiler could notice that
fact, and also that each of the conditions is on the same variable, and
as a result it could generate the same kind of code as if it had been
written with 'else', so it won't generate unnecessary condition tests.
It might, in fact, generate the same kind of code which would have been
generated if it had been coded properly, as a switch statement, so it
might use a jump table, if appropriate.
But it's better to write it as a switch statement in the first place, so
you don't have to rely upon the compiler being sufficiently
sophisticated to get the best results.
I disagree entirely.
It is best to write the code in the way that makes most sense -
whatever gives the best clarity and makes the programmer's intentions
obvious to readers, and with the least risk of errors. Consider the
maintainability of the code - is it likely to be changed in the
future, or adapted and re-used in other contexts? If so, that should
be a big influence on how you structure the source code. Can a
different structure make it less likely for errors to occur unnoticed?
For example, if the controlling value can be an enumeration then with
a switch, a good compiler can check if there are accidentally
unhandled cases (and even a poor compiler can check for duplicates).
I don't see those criteria as conflicting with my advice. A switch seems
to me to be unambiguously the clearest way of writing this logic, for
precisely the same reason it also makes it easier for unsophisticated compilers to optimize it - what needs to be done is clearer both to the compiler and to the human reader.
David Brown wrote:
It is best to write the code in the way that makes most sense - whatever
gives the best clarity and makes the programmer's intentions obvious to
readers, and with the least risk of errors.
the fact is it is somewhat hard to say which is more obvious to readers
if(key=='A') Something();
else if(key=='B') Something();
else if(key=='C') Something();
else if(key=='D') Something();
or
if(key=='A') Something();
if(key=='B') Something();
if(key=='C') Something();
if(key=='D') Something();
imo the second is more for humans, but logically it's a bit different
because the else chain only goes forward on "false", while a new statement
runs on both "true" and "false"
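fir's distinction is real and observable when a branch body modifies the tested variable; a small sketch (the names and the key-mutation are mine, purely to expose the difference):

```c
#include <assert.h>
#include <string.h>

/* With bare ifs, setting key to 'B' inside the 'A' branch makes the
   'B' branch run as well; with else-if it cannot. */
int run_bare_ifs(char key, char *log) {
    int n = 0;
    if (key == 'A') { strcat(log, "A"); key = 'B'; n++; }
    if (key == 'B') { strcat(log, "B"); n++; }        /* also taken! */
    return n;
}

int run_else_chain(char key, char *log) {
    int n = 0;
    if (key == 'A')      { strcat(log, "A"); key = 'B'; n++; }
    else if (key == 'B') { strcat(log, "B"); n++; }   /* skipped */
    return n;
}
```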
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each
has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
In general, an if-else-if chain (which was the point of the OP), would evaluate only one branch.
So would a switch-case construct if sensibly
implemented (in C's version, anything goes).
The same applies to C's c?a:b operator: only one of a or b is evaluated,
not both.
(This is also why implementing if, switch, ?: via functions, which lots are keen to do in the reddit PL forum, requires closures, lazy evaluation or other advanced features.)
You are free to choose the rules you want for your own language, but
you are not free to dictate what you think the rules should be for
others. (You are welcome to /opinions/, of course.)
(In my implementations, a default/else branch value must be provided
if the whole thing is expected to return a value.)
OK, if that's what you want. My preference, if I were putting
together what /I/ thought was an ideal language for /my/ use, would be
heavy use of explicit specifications and contracts for code, so that a
default/else branch is either disallowed (if the selection
covers all legal values) or required (if the selection is
abbreviated). A default value "just in case" is, IMHO, worse than
useless.
All such multiway constructs in my languages (there are 4, one of which
does the job of both 'if' and C's ?:) have an optional else branch. A
missing 'else' has a notional 'void' type.
But it becomes mandatory if the whole thing returns a value, to satisfy
the type system, because otherwise it will try and match with 'void'.
SOMETHING needs to happen when none of the branches are executed; what
value would be returned then? The behaviour needs to be defined. You
don't want to rely on compiler analysis for this stuff.
In C on the other hand, the ':' of '?:' is always needed, even when it
is not expected to yield a value. Hence you often see things like
this:
p == NULL ? puts("error"): 0;
Here, gcc at least, also requires the types of the two branches to
match, even though the whole construct yields no common value.
Meanwhile
I allow this (if I was keen on a compact form):
(p = nil | print "error")
No else is needed.
Bart wrote:
ral clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be use
so in short this group seems to have no conclusion but is tolerant
of various approaches as it seems
imo the else ladder is like most proper but i don't like it
optically; switch-case i also don't like (as far as i remember i
never use it in my code, for years i haven't used even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended, but it's fully not clear how)
On 01/11/2024 20:47, Bart wrote:
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common method
would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each
has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
Yes, it is.
I don't disagree that such an "select one of these and evaluate only
that" construct can be a useful thing, or a perfectly good alternative
to an "evaluate all of these then select one of them" construct. But you are completely wrong to think that one of these two is somehow the "true" or only correct way to have a selection.
In some languages, the construct for "A or B" will evaluate both, then
"or" them. In other languages, it will evaluate "A" then only evaluate
"B" if necessary. In others, expressions "A" and "B" cannot have side-effects, so the evaluation or not makes no difference. All of
these are perfectly valid design choices for a language.
In general, an if-else-if chain (which was the point of the OP), would
evaluate only one branch.
It evaluates all the conditionals down the chain until it hits a "true" result, then evaluates the body of the "if" that matches, then skips the rest.
(Of course generated code can evaluate all sorts of things in different orders, as long as observable behaviour - side-effects - are correct.)
So would a switch-case construct if sensibly implemented (in C's
version, anything goes).
C's switch is perfectly simply and clearly defined. It is not "anything goes". The argument to the switch is evaluated once, then control jumps
to the label of the switch case, then evaluation continues from that point. It is totally straight-forward.
You might not like the "fall-through" concept or the way C's switch does
not quite fit with structured programming. If so, I'd agree entirely.
The requirement for lots of "break;" statements in most C switch uses is
a source of countless errors in C coding and IMHO a clear mistake in the language design. But that does not hinder C's switch statements from
being very useful, very easy to understand (when used sensibly), and
with no doubts about how they work (again, when used sensibly).
The same applies to C's c?a:b operator: only one of a or b is
evaluated, not both.
You are conflating several ideas, then you wrote something that you
/know/ is pure FUD about C's switch statements.
So writing "The same
applies" makes no sense.
You are, of course, correct that in "c ? a : b", "c" is evaluated first
and then one and only one of "a" and "b".
(This is also why implementing if, switch, ?: via functions, which lots
are keen to do in the reddit PL forum, requires closures, lazy
evaluation or other advanced features.)
Yes, you'd need something like that to implement such "short-circuit" operators using functions in C. In other languages, things may be different.
But it becomes mandatory if the whole thing returns a value, to
satisfy the type system, because otherwise it will try and match with
'void'.
Your language, your choice.
I'd question the whole idea of having a
construct that can evaluate to something of different types in the first place, whether or not it returns a value, but that's your choice.
SOMETHING needs to happen when none of the branches are executed; what
value would be returned then? The behaviour needs to be defined. You
don't want to rely on compiler analysis for this stuff.
In my hypothetical language described above, it never happens that none
of the branches are executed.
Do you feel you need to write code like this?
const char * flag_to_text_A(bool b) {
if (b == true) {
return "It's true!";
} else if (b == false) {
return "It's false!";
} else {
return "Schrödinger's cat has escaped!";
}
}
When you have your "else" or "default" clause that is added for
something that can't ever happen, how do you test it?
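One common C answer to David's question is to make the impossible branch trap rather than fabricate a value; a sketch under the assumption that aborting on a can't-happen case is acceptable (C23 also offers unreachable()):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Instead of inventing a return value for the impossible case, trap
   it loudly; this version takes 0/1 to model a bool. */
const char *flag_to_text(int b) {
    switch (b) {
    case 1:  return "It's true!";
    case 0:  return "It's false!";
    default: abort();   /* can't happen for a genuine bool */
    }
}
```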
In C on the other hand, the ':' of '?:' is always needed, even when it
is not expected to yield a value. Hence you often see this things like
this:
p == NULL ? puts("error"): 0;
Given that the ternary operator chooses between two things, it seems
fairly obvious that you need two alternatives to choose from - having a choice operator without at least two choices would be rather useless.
I can't say I have ever seen the ternary operator used like this. There
are a few C programmers that like to code with everything as
expressions, using commas instead of semicolons, but they are IMHO
mostly just being smart-arses. It's a lot more common to write :
if (!p) puts("error");
Meanwhile I allow this (if I was keen on a compact form):
(p = nil | print "error")
No else is needed.
In C you could write :
p == NULL || puts("error");
which is exactly the same structure.
I think all of these, including your construct in your language, are smart-arse choices compared to a simple "if" statement, but personal
styles and preferences vary.
On 02/11/2024 11:41, David Brown wrote:
On 01/11/2024 20:47, Bart wrote:
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common
method would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so
each has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would
evaluate only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
Yes, it is.
Then we disagree on what 'multi-way' select might mean. I think it
means branching, even if notionally, on one-of-N possible code paths.
The whole construct may or may not return a value. If it does, then
one of the N paths must be a default path.
...
fir <fir@grunge.pl> writes:
Bart wrote:
ral clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be use
so in short this group seems to have no conclusion but is tolerant
of various approaches as it seems
imo the else ladder is like most proper but i don't like it
optically; switch-case i also don't like (as far as i remember i
never use it in my code, for years i haven't used even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended, but it's fully not clear how)
I think you should have confidence in your own opinion. All
you're getting from other people is their opinion about what is
easier to understand, or "clear", or "readable", etc. As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
There is a case where using 'else' is necessary, when there is a
catchall action for circumstances matching "none of the above".
Alternatively a 'break' or 'continue' or 'goto' or 'return' may
be used to bypass subsequent cases, but you get the idea.
With the understanding that I am offering no more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown above.
Bart wrote:
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown
above.
as to if: when thinking of it, the if construct has such parts
if X then S else E
and the keyword if is not necessary imo, as the expression X returns a
logical value, so then can be used on it without if
X then {}
X else {}
i would prefer to denote (at least temporarily) then as ->
and else as ~>, then you can build constructs like
a -> b -> c -> d ~> e ~> f
where the arrows take the logical value of their left side
(if a true then b, if b true then c, if c true then d, if
d false then e, and if e false then f)
but sometimes one needs to attach an else to some previous expression,
and i think about how it could be done; maybe just parentheses can be used
a (->b->c) ~>z
if a true then b and if b true then c, but if a false then z
Bart wrote:
...
as to this switch: as i said, C has some syntax that resembles switch
and it is
[2] { printf("one"), printf("two"), printf("three") }
i mean it is like this compound something you posted
{ printf("one"), printf("two"), printf("three") } [2]
but with the "key" on the left, to illustrate the analogy to
switch(n) {case 0: printf("one"); case 1: printf("two"); case 2: printf("three"); }
imo the resemblance gives something to think about
the difference is this compound (array-like) example doesn't use defined
keys, so it seems some should be added
[n] {{1: printf("one")},{2: printf("two")},{3: printf("three")} }
so those deductions on switch give the above imo
the question is if some things couldn't be omitted for simplicity
[key] {'A': printf("one"); 'B': printf("two"); 'C': printf("three"); }
something like that
(instead of
switch(key)
{
case 'A': printf("one"); break;
case 'B': printf("two"); break;
case 'C': printf("three"); break;
}
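For what it's worth, the nearest standard-C analogue of fir's "[key] { ... }" notation may be a lookup table built with C99 designated initializers; a sketch (the function name is mine):

```c
#include <assert.h>
#include <string.h>

/* Keys appear next to their values, as in fir's sketch; unlisted keys
   fall through to a default. */
const char *word_for(unsigned char key) {
    static const char *const words[256] = {
        ['A'] = "one",
        ['B'] = "two",
        ['C'] = "three",
    };
    return words[key] ? words[key] : "?";   /* default path */
}
```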
On 03/11/2024 00:26, fir wrote:
Bart wrote:
...
as to this switch as i said the C jas some syntax that resembles
switch and it is
[2] { printf("one"), printf("two"), printf("three") }
i mean it is like this compound sometheng you posted
{ printf("one"), printf("two"), printf("three") } [2]
but with "key" on the left to ilustrate the analogy to
swich(n) {case 0: printf("one"); case 1: printf("two"); case 2:
rintf("three") }
imo the resemblance gives to think
the difference is this compound (array-like) example dont uses defined
keys so it semms some should be added
[n] {{1: printf("one")},{2: printf("two")},{3: printf("three")} }
so those deduction on switch gives the above imo
the question is if some things couldnt be ommitted for simplicity
[key] {'A': printf("one"); 'B': printf("two"); 'C': printf("three"}; }
something like that
(insted of
switch(key)
{
case 'A': printf("one"); break;
case 'B': printf("two"); break;
case 'C': printf("three"}; break;
}
Here the switch looks clearer. Write it with 300 cases instead of 3,
then that becomes obvious.
The first time I wrote a big C program, I used a syntax like this:
switch (x)
when 'A', 'B' then printf("one")
when 'C' then printf("two")
else printf("three")
endsw
This needed to be converted to normal C before compiling, but the macro system wasn't quite up to the job (even making use of gnu C, which allows for
lists of case labels).
Instead I used a script to do the conversion, which needed 1:1 line correspondence. The result was something like this:
switch (x) {
break; case 'A': case 'B': printf("one");
break; case 'C': printf("two");
break; default: printf("three");
}
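Bart's generated layout really is equivalent to an ordinary switch: a break; placed before the next case label terminates the previous case, and the very first break; is simply unreachable. A sketch:

```c
#include <assert.h>
#include <string.h>

/* The break before each case label ends the preceding case; the first
   break is dead code. Behaviour matches a conventional switch. */
const char *classify(int x) {
    const char *r;
    switch (x) {
    break; case 1: r = "one";
    break; case 2: r = "two";
    break; default: r = "other";
    }
    return r;
}
```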
On 03/11/2024 01:21, fir wrote:
Bart wrote:
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown
above.
as to if: when thinking of it, the if construct has such parts
if X then S else E
and the keyword if is not necessary imo, as the expression X returns a
logical value, so then can be used on it without if
X then {}
X else {}
i would prefer to denote (at least temporarily) then as ->
and else as ~>, then you can build constructs like
a -> b -> c -> d ~> e ~> f
where the arrows take the logical value of their left side
(if a true then b, if b true then c, if c true then d, if
d false then e, and if e false then f)
but sometimes one needs to attach an else to some previous expression,
and i think about how it could be done; maybe just parentheses can be used
a (->b->c) ~>z
if a true then b and if b true then c, but if a false then z
C already has this (I've added parentheses for clarity):
(a ? (b ? c : -) : z)
This shows you haven't provided a branch for b being false.
Also it's not clear if you intended for b to be evaluated twice; I've
assumed only once as it is nonsense otherwise.
Bart wrote:
C already has this (I've added parentheses for clarity):
(a ? (b ? c : -) : z)
This shows you haven't provided a branch for b being false.
coz you don't need to provide such a branch
in c you need to put that ":" each time? (if so, some design error was made,
as it would be better not to oblige its existence)
Bart wrote:
On 03/11/2024 00:26, fir wrote:
Bart wrote:
...
as to this switch: as i said, C has some syntax that resembles
switch and it is
[2] { printf("one"), printf("two"), printf("three") }
i mean it is like this compound something you posted
{ printf("one"), printf("two"), printf("three") } [2]
but with the "key" on the left, to illustrate the analogy to
switch(n) {case 0: printf("one"); case 1: printf("two"); case 2: printf("three"); }
imo the resemblance gives something to think about
the difference is this compound (array-like) example doesn't use defined
keys, so it seems some should be added
[n] {{1: printf("one")},{2: printf("two")},{3: printf("three")} }
so those deductions on switch give the above imo
the question is if some things couldn't be omitted for simplicity
[key] {'A': printf("one"); 'B': printf("two"); 'C': printf("three"); }
something like that
(instead of
switch(key)
{
case 'A': printf("one"); break;
case 'B': printf("two"); break;
case 'C': printf("three"); break;
}
Here the switch looks clearer. Write it with 300 cases instead of 3,
then that becomes obvious.
depends on what one understands by clearer - imo not
this []{;;;} at least is like logically derived from other c syntax
and as to switch-case: the word case is ok imo, but the word switch is
overall like wrong imo; switch could be better replaced by two
words, "select" and maybe "goto", as the switch that selects could use
select and the one that does goto could use the word goto
goto key;
'A': printf("a");
'B': printf("b");
'C': printf("c");
overall there is also the possibility to do it such a way
void foo()
{
"a" { printf("aaa"); } //definitions, not calls themselves
"b" { printf("bbb"); }
"c" { printf("ccc"); }
"a";"b";"c"; //calls (???)
// would need maybe some syntax to call it (many could be chosen)
// "a"() ? foo."a" ? foo.[key] ?
maybe this would be the best if established, as this is more syntactic "low level"
}
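fir's define-blocks-then-call-by-key sketch maps fairly naturally onto a table of function pointers, which is standard C (the "label pointers" route Bart mentioned earlier is a gcc extension); a sketch with illustrative names:

```c
#include <assert.h>
#include <string.h>

typedef const char *(*handler)(void);

/* The "definitions" from fir's sketch become ordinary functions... */
static const char *do_a(void) { return "aaa"; }
static const char *do_b(void) { return "bbb"; }
static const char *do_c(void) { return "ccc"; }

/* ...and the "call by key" becomes an indexed table lookup. */
const char *dispatch(unsigned char key) {
    static const handler table[256] = {
        ['a'] = do_a, ['b'] = do_b, ['c'] = do_c,
    };
    return table[key] ? table[key]() : "";   /* "" for unknown keys */
}
```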
On 02/11/2024 11:41, David Brown wrote:
On 01/11/2024 20:47, Bart wrote:
On 01/11/2024 18:47, David Brown wrote:
On 01/11/2024 19:05, Bart wrote:
On 01/11/2024 17:35, David Brown wrote:
What you have written here is all correct, but a more common
method would be to avoid having three printf's :
void shout_a_number(int n) {
printf( (const char* []) { "ONE", "TWO", "THREE" } [n] );
}
That's more likely to match what people would want.
I was also trying to show that all elements are evaluated, so each
has to have some side-effect to illustrate that.
Fair enough.
A true N-way-select construct (C only really has ?:) would evaluate
only one, and would deal with an out-of-range condition.
That's a matter of opinion and design choice, rather than being
requirements for a "true" select construct.
I don't think it's just opinion.
Yes, it is.
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
I don't disagree that such an "select one of these and evaluate only
that" construct can be a useful thing, or a perfectly good alternative
to to an "evaluate all of these then select one of them" construct.
But you are completely wrong to think that one of these two is somehow
the "true" or only correct way to have a selection.
In some languages, the construct for "A or B" will evaluate both, then
"or" them. In other languages, it will evaluate "A" then only
evaluate "B" if necessary. In others, expressions "A" and "B" cannot
have side-effects, so the evaluation or not makes no difference. All
of these are perfectly valid design choices for a language.
Those are logical operators that may or may not short-circuit.
One feature of my concept of 'multi-way select' is that there is one or
more controlling expressions which determine which path is followed.
So, I'd be interested in what you think of as a multi-way select which
may evaluate more than one branch. Or was it that 'or' example?
In general, an if-else-if chain (which was the point of the OP),
would evaluate only one branch.
It evaluates all the conditionals down the chain until it hits a
"true" result, then evaluates the body of the "if" that matches, then
skips the rest.
I don't count evaluating the conditionals: here it is the branches that count (since it is one of those that is 'selected' via those
conditionals), and here you admit that only one is executed.
(Of course generated code can evaluate all sorts of things in
different orders, as long as observable behaviour - side-effects - are
correct.)
So would a switch-case construct if sensibly implemented (in C's
version, anything goes).
C's switch is perfectly simply and clearly defined. It is not
"anything goes". The argument to the switch is evaluated once, then
control jumps to the label of the switch case, then evaluation
continues from that point. It is totally straight-forward.
It's pretty much the complete opposite of straightforward, as you go on
to demonstrate.
C 'switch' looks like it might be properly structured if written
sensibly. The reality is different: what follows `switch (x)` is just
ONE C statement, often a compound statement.
Case labels can located ANYWHERE within that statement, including within nested statements (eg. inside a for-statement), and including
'default:', which could go before all the case labels!
The only place they can't go is within a further nested switch, which
has its own set of case-labels.
Control transfers to any matching case-label or 'default:' and just keeps executing code within that ONE statement, unless it hits 'break;'.
It is totally chaotic. This is what I mean by 'anything goes'. This is a valid switch statement for example: 'switch (x);'.
You can't use such a statement as a solid basis for a multi-way
construct that returns a value, since it is, in general, impossible to sensibly enumerate the N branches.
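Bart's claim that case labels can sit inside nested statements is easy to demonstrate with a Duff's-device-shaped switch that jumps into the middle of a loop; contrived but standard C (the function is my own illustration):

```c
#include <assert.h>

/* Case labels 1 and 2 live INSIDE the for loop's body; switch jumps
   straight into the loop, falls through, and the break exits the for
   (not the switch). */
int count_from(int n) {
    int count = 0;
    switch (n) {
    case 0:
        for (;;) {
    case 1:
            count++;    /* falls through to the next label */
    case 2:
            count++;
            break;      /* exits the for loop */
        }
    }
    return count;       /* 0 when no label matched */
}
```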
You might not like the "fall-through" concept or the way C's switch
does not quite fit with structured programming. If so, I'd agree
entirely.
Good.
The requirement for lots of "break;" statements in most C switch uses
is a source of countless errors in C coding and IMHO a clear mistake
in the language design. But that does not hinder C's switch
statements from being very useful, very easy to understand (when used
sensibly), and with no doubts about how they work (again, when used
sensibly).
The same applies to C's c?a:b operator: only one of a or b is
evaluated, not both.
You are conflating several ideas, then you wrote something that you
/know/ is pure FUD about C's switch statements.
It wasn't.
YOU wrote FUD when you called them straightforward. I would
bet you that the majority of C programmers don't know just how weird
switch is.
So writing "The same applies" makes no sense.
'The same applies' was in reference to this previous remark of mine:
"In general, an if-else-if chain (which was the point of the OP), would evaluate only one branch. So would a switch-case construct if sensibly implemented (in C's version, anything goes). "
You are, of course, correct that in "c ? a : b", "c" is evaluated
first and then one and only one of "a" and "b".
And here you confirm that it does in fact apply: only one branch is executed.
You can't apply it to C's switch as there is no rigorous way of even determining what is a branch. Maybe it is a span between 2 case labels?
But then, one of those might be in a different nested statement!
(This is also why implementing if, switch, ?: via functions, which lots
are keen to do in the reddit PL forum, requires closures, lazy
evaluation or other advanced features.)
Yes, you'd need something like that to implement such "short-circuit"
operators using functions in C. In other languages, things may be
different.
Yes, short-circuit operators would need the same features. That's why
it's easier to build this stuff into a core language than to try and
design a language where 90% of the features are there to implement what should be core features.
But it becomes mandatory if the whole thing returns a value, to
satisfy the type system, because otherwise it will try and match with
'void'.
Your language, your choice.
These things tend to come about because that is the natural order that
comes through. It's something I observed rather than decided.
I'd question the whole idea of having a construct that can evaluate
to something of different types in the first place, whether or not it
returns a value, but that's your choice.
If the result of a multi-way execution doesn't yield a value to be used, then the types don't matter.
If it does, then they DO matter, as they have to be compatible types in
a static language.
This is just common sense; I don't know why you're questioning it. (I'd quite like to see a language of your design!)
SOMETHING needs to happen when none of the branches are executed;
what value would be returned then? The behaviour needs to be defined.
You don't want to rely on compiler analysis for this stuff.
In my hypothetical language described above, it never happens that
none of the branches are executed.
Do you feel you need to write code like this?
const char * flag_to_text_A(bool b) {
if (b == true) {
return "It's true!";
} else if (b == false) {
return "It's false!";
} else {
return "Schrödinger's cat has escaped!";
}
}
When you have your "else" or "default" clause that is added for
something that can't ever happen, how do you test it?
I write code like this:
func F(b) =
if X then
A # 'return' is optional
elsif Y then
B
fi
end
As it is, it requires 'else' (because this is a value-returning function).
X Y A B are arbitrary expressions. The need for 'else' is determined
during type analysis. Whether it will ever execute the default path
would be up to extra analysis, that I don't do, and would anyway be done later.
You can't design a language like this where valid syntax depends on
compiler and what it might or might not discover when analysing the code.
The rule instead is simple: where a multi-path construct yields a value, then it needs the default branch, always.
A compiler /might/ figure out it isn't needed, and not generate that bit
of code. (Or as I suggested, it might insert a suitable branch.)
You seem to like putting the onus on compiler writers to have to analyse programs to the limit.
(Note that my example is for dynamic code; there X Y may only be known
at runtime anyway.)
In my languages, the last statement of a function can be arbitrarily
complex and nested; there could be dozens of points where a return value
is needed.
In C on the other hand, the ':' of '?:' is always needed, even when
it is not expected to yield a value. Hence you often see things
like this:
p == NULL ? puts("error"): 0;
Given that the ternary operator chooses between two things, it seems
fairly obvious that you need two alternatives to choose from - having
a choice operator without at least two choices would be rather useless.
It seems you are just arguing in the defence of C rather than
objectively, and being contradictory in the process.
For example, earlier you said I'm wrong to insist on a default path for multi-way ops when it is expected to yield a value. But here you say it
is 'obvious' for the ?: multi-way operator to insist on a default path
even when any value is not used.
This is on top of saying that I'm spreading 'FUD' about switch and that
is it really a perfectly straightforward feature!
Now *I* am wary of trusting your judgement.
I can't say I have ever seen the ternary operator used like this.
There are a few C programmers that like to code with everything as
expressions, using commas instead of semicolons, but they are IMHO
mostly just being smart-arses. It's a lot more common to write :
if (!p) puts("error");
Well, it happens, and I've seen it (and I've had to ensure my C compiler deals with it when it comes up, which it has). Maybe some instances of
it are hidden behind macros.
Meanwhile I allow this (if I was keen on a compact form):
(p = nil | print "error")
No else is needed.
In C you could write :
p == NULL || puts("error");
which is exactly the same structure.
This is new to me. So this is another possibility for the OP?
It's an untidy feature however; it's abusing || in similar ways to those
who separate things with commas to avoid needing a compound statement.
It is also error prone as it is unintuitive: you probably meant one of:
p != NULL || puts("error");
p == NULL && puts("error");
There are also limitations: what follows || or && needs to be something
that returns a type that can be coerced to an 'int' type.
(Note that the '|' in my example is not 'or'; it means 'then':
( c | a ) # these are exactly equivalent
if c then a fi
( c | a | b ) # so are these
if c then a else b fi
There is no restriction on what a and b are, statements or expressions, unless the whole returns some value.)
I think all of these, including your construct in your language, are
smart-arse choices compared to a simple "if" statement, but personal
styles and preferences vary.
C's if statement is rather limited. As it is only if-else,
if-else-if sequences must be emulated using nested if-else-(if-else-(if-else ...)).
Misleading indentation needs to be used to stop nested ifs disappearing
to the right. When a coding style mandates braces around if branches, an exception needs to be made for if-else-if chains (otherwise you will end
up with }}}}}}}... at the end).
And the whole thing cannot return a value; a separate ?: feature (whose branches must be expressions) is needed.
It is also liable to the 'dangling else' problem, and error prone due to
braces being optional.
It's a mess. By contrast, my if statements look like this:
if then elsif then ... [else] fi
'elsif' is a part of the syntax. The whole thing can return a value.
There is a compact form (not for elsif, that would be too much) as shown above.
On 02/11/2024 21:44, Bart wrote:
I would disagree on that definition, yes. A "multi-way selection" would mean, to me, a selection of one of N possible things - nothing more than that. It is far too general a phrase to say that it must involve
branching of some sort ("notional" or otherwise).
And it is too general
to say if you are selecting one of many things to do, or doing many
things and selecting one.
The whole construct may or may not return a value. If it does, then
one of the N paths must be a default path.
No, that is simply incorrect. For one thing, you can say that it is perfectly fine for the selection construct to return a value sometimes
and not at other times.
It's fine if it never returns at all for some
cases. It's fine to give selection choices for all possible inputs.
It's fine to say that the input must be a value for which there is a
choice.
What I see here is that you don't like C's constructs (that may be for
good reasons, it may be from your many misunderstandings about C, or it
may be from your knee-jerk dislike of everything C related).
You have
some different selection constructs in your own language, which you /do/ like. (It would be very strange for you to have constructs that you
don't like in your own personal one-man language.)
One feature of my concept of 'multi-way select' is that there is one
or more controlling expressions which determine which path is followed.
Okay, that's fine for /your/ language's multi-way select construct. But other people and other languages may do things differently.
There are plenty of C programmers - including me - who would have
preferred to have "switch" be a more structured construct which could
not be intertwined with other constructs in this way. That does not
mean "switch" is not clearly defined - nor does it hinder almost every real-world use of "switch" from being reasonably clear and structured.
It does, however, /allow/ people to use "switch" in more complex and
less clear ways.
You are confusing "this makes it possible to write messy code" with a
belief that messy code is inevitable or required. And you are
forgetting that it is always possible to write messy or incomprehensible code in any language, with any construct.
You can't use such a statement as a solid basis for a multi-way
construct that returns a value, since it is, in general, impossible to
sensibly enumerate the N branches.
It is simple and obvious to enumerate the branches in almost all
real-world cases of switch statements. (And /please/ don't faff around with cherry-picked examples you have found somewhere as if they were representative of anything.)
So if I understand correctly, you are saying that chains of if/else, an imaginary version of "switch", and the C ternary operator all evaluate
the same things in the same way, while with C's switch you have no idea
what happens?
That is true, if you cherry-pick what you choose to
ignore in each case until it fits your pre-conceived ideas.
No, what you call "natural" is entirely subjective. You have looked at
a microscopic fraction of code written in a tiny proportion of
programming languages within a very narrow set of programming fields.
That's not criticism - few people have looked at anything more.
What I /do/ criticise is that your assumption that this almost
negligible experience gives you the right to decide what is "natural" or "true", or how programming languages or tools "should" work.
You need
to learn that other people have different ideas, needs, opinions or preferences.
I'd question the whole idea of having a construct that can evaluate
to something of different types in the first place, whether or not it
returns a value, but that's your choice.
If the result of a multi-way execution doesn't yield a value to be
used, then the types don't matter.
Of course they do.
This is just common sense; I don't know why you're questioning it.
(I'd quite like to see a language of your design!)
def foo(n) :
if n == 1 : return 10
if n == 2 : return 20
if n == 3 : return
That's Python, quite happily having a multiple choice selection that sometimes does not return a value.
Yes, that is a dynamically typed
language, not a statically type language.
std::optional<int> foo(int n) {
if (n == 1) return 10;
if (n == 2) return 20;
if (n == 3) return {};
}
That's C++, a statically typed language, with a multiple choice
selection that sometimes does not return a value - the return type
supports values of type "int" and non-values.
X Y A B are arbitrary expressions. The need for 'else' is determined
during type analysis. Whether it will ever execute the default path
would be up to extra analysis, that I don't do, and would anyway be
done later.
But if it is not possible for neither X nor Y to be true, then how
would you test the "else" clause? Surely you are not proposing that programmers be required to write lines of code that will never be
executed and cannot be tested?
You can't design a language like this where valid syntax depends on
compiler and what it might or might not discover when analysing the code.
Why not? It is entirely reasonable to say that a compiler for a
language has to be able to do certain types of analysis.
Anyone who is convinced that their own personal preferences are more "natural" or inherently superior to all other alternatives, and can't justify their claims other than saying that everything else is "a mess",
is just navel-gazing.
Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Bart wrote:
There are several clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be used)
so in short this group seems to have no conclusion but is tolerant
for various approaches as it seems
imo the else ladder is like most proper but i dont like it
optically, switch case i also dont like (used as far as i remember
never in my code, for years dont use even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended but its fully not clear how,
I think you should have confidence in your own opinion. All
you're getting from other people is their opinion about what is
easier to understand, or "clear", or "readable", etc. As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
There is a case where using 'else' is necessary, when there is a
catchall action for circumstances matching "none of the above".
Alternatively a 'break' or 'continue' or 'goto' or 'return' may
be used to bypass subsequent cases, but you get the idea.
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Bart wrote:
There are several clear patterns here: you're testing the same variable 'n'
against several mutually exclusive alternatives, which also happen
to be consecutive values.
C is short of ways to express this, if you want to keep those
'somethings' as inline code (otherwise arrays of function pointers
or even label pointers could be used)
so in short this group seems to have no conclusion but is tolerant
for various approaches as it seems
imo the else ladder is like most proper but i dont like it
optically, switch case i also dont like (used as far as i remember
never in my code, for years dont use even one)
so i personally would use bare ifs and maybe elses occasionally
(and switch should be mended but its fully not clear how,
I think you should have confidence in your own opinion. All
you're getting from other people is their opinion about what is
easier to understand, or "clear", or "readable", etc. As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
There is a case where using 'else' is necessary, when there is a
catchall action for circumstances matching "none of the above".
Alternatively a 'break' or 'continue' or 'goto' or 'return' may
be used to bypass subsequent cases, but you get the idea.
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
[...]
Here, the question was, can:
if (c1) s1;
else if (c2) s2;
always be rewritten as:
if (c1) s1;
if (c2) s2;
[...]
On 04.11.2024 12:56, Bart wrote:
[...]
Here, the question was, can:
if (c1) s1;
else if (c2) s2;
always be rewritten as:
if (c1) s1;
if (c2) s2;
Erm, no. The question was even more specific.
It had (per example)
not only all ci disjunct but also defined as a linear sequence of
natural numbers! - In other languages [than "C"] this may be more
important since [historically] there were specific constructs for
that case; see e.g. 'switch' definitions in Simula, or the 'case'
statement of Algol 68, both mapping elements onto an array[1..N];
labels in the first case, and expressions in the latter case. So
in "C" we could at least consider using something similar, like,
say, arrays of function pointers indexed by those 'n'.
I'd suggest that by just pointing it out.)
I'm a bit astonished, BTW, about this huge emphasis on the topic
"opinions" in later posts of this thread. The OP asked (even in
the subject) about "practice" which actually invites if not asks
for providing opinions (besides practical experiences).
[...] As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
That the OP's example contained some clear patterns has already been
covered (I did so anyway).
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
Bart wrote:
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
fir wrote:
Bart wrote:
note in fact both array usage like tab[5] and function call like foo()
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
are analogues to switch case - as when you call functions the call is like switch and function definition sets are 'cases'
Bart wrote:
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
On 04/11/2024 15:06, fir wrote:
fir wrote:
Bart wrote:
note in fact both array usage like tab[5] and function call like foo()
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering more than my own opinion,
I can say that I might use any of the patterns mentioned, depending
on circumstances. I don't think any one approach is either always
right or always wrong.
maybe, but some may have some strong arguments (for using this and not
that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
overall when you think and discuss such things some conclusions may
appear - and often some do for me, though they are not always very clear
or 'hard'
overall from this thread i noted that switch (which i already dont
like) is bad.. note those two elements of switch, that is "switch"
and "case", are in a weird not obvious relation in c (and how will it
work when you mix it etc)
what i concluded was that if you do things such way
a { } //this is analogon to case - named block
b { } //this is analogon to case - named block
n() // here by "()" i noted a call of some variable that may yield a
'call' to a, b, c, d, e, f //(in that case n would be some enum or
pointer)
c( ) //this is analogon to case - named block
d( ) //this is analogon to case - named block
then everything is clear - this call just selects and calls a block, and
blocks themselves are just definitions and are skipped in execution until
"called"
this is an example of some conclusion for me from this thread - and i think
such codes as this my own initial example should probably be done such
way (though it is not c, i know)
are analogues to switch case - as when you call functions the call is
like switch and function definition sets are 'cases'
Yes, switch could be implemented via a table of label pointers, but it
needs a GNU extension.
For example this switch:
#include <stdio.h>
int main(void) {
for (int i=0; i<10; ++i) {
switch(i) {
case 7: case 2: puts("two or seven"); break;
case 5: puts("five"); break;
default: puts("other");
}
}
}
Could also be written like this:
#include <stdio.h>
int main(void) {
void* table[] = {
&&Lother, &&Lother, &&L27, &&Lother, &&Lother, &&L5,
&&Lother, &&L27, &&Lother, &&Lother};
for (int i=0; i<10; ++i) {
goto *table[i];
L27: puts("two or seven"); goto Lend;
L5: puts("five"); goto Lend;
Lother: puts("other");
Lend:;
}
}
(A compiler may generate something like this, although it will be range-checked if needed. In practice, small numbers of cases, or cases where the
values are too spread out, might be implemented as if-else chains.)
On 03/11/2024 17:00, David Brown wrote:
On 02/11/2024 21:44, Bart wrote:
I would disagree on that definition, yes. A "multi-way selection"
would mean, to me, a selection of one of N possible things - nothing
more than that. It is far too general a phrase to say that it must
involve branching of some sort ("notional" or otherwise).
Not really. If the possible options involving actions written in-line,
and you only want one of those executed, then you need to branch around
the others!
And it is too general to say if you are selecting one of many things
to do, or doing many things and selecting one.
Sorry, but this is the key part. You are not evaluating N things and selecting one; you are evaluating ONLY one of N things.
For X, it builds a list by evaluating all the elements, and returns the value of the last. For Y, it evaluates only ONE element (using internal switch, so branching), which again is the last.
You don't seem keen on keeping these concepts distinct?
The whole construct may or may not return a value. If it does, then
one of the N paths must be a default path.
No, that is simply incorrect. For one thing, you can say that it is
perfectly fine for the selection construct to return a value sometimes
and not at other times.
How on earth is that going to satisfy the type system? You're saying
it's OK to have this:
int x = if (randomfloat()<0.5) 42;
Or even this, which was discussed recently, and which is apparently
valid C:
int F(void) {
if (randomfloat()<0.5) return 42;
}
In the first example, you could claim that no assignment takes place
with a false condition (so x contains garbage). In the second example,
what value does F return when the condition is false?
You can't hide behind your vast hyper-optimising compiler; the language needs to say something about it.
My language will not allow it. Most people would say that that's a good thing. You seem to want to take the perverse view that such code should
be allowed to return garbage values or have undefined behaviour.
After all, this is C! But please tell me, what would be the downside of
not allowing it?
It's fine if it never returns at all for some
cases. It's fine to give selection choices for all possible inputs.
It's fine to say that the input must be a value for which there is a
choice.
What I see here is that you don't like C's constructs (that may be for
good reasons, it may be from your many misunderstandings about C, or
it may be from your knee-jerk dislike of everything C related).
With justification. 0010 means 8 in C? Jesus.
It's hardly knee-jerk either since I first looked at it in 1982, when my
own language barely existed. My opinion has not improved.
You have some different selection constructs in your own language,
which you /do/ like. (It would be very strange for you to have
constructs that you don't like in your own personal one-man language.)
It's a one-man language but most of its constructs and features are universal. And therefore can be used for comparison.
One feature of my concept of 'multi-way select' is that there is one
or more controlling expressions which determine which path is followed.
Okay, that's fine for /your/ language's multi-way select construct.
But other people and other languages may do things differently.
FGS, /how/ different? To select WHICH path or which element requires
some input. That's the controlling expression.
Or maybe with your ideal language, you can select an element of an array without bothering to provide an index!
There are plenty of C programmers - including me - who would have
preferred to have "switch" be a more structured construct which could
not be intertwined with other constructs in this way. That does not
mean "switch" is not clearly defined - nor does it hinder almost every
real-world use of "switch" from being reasonably clear and structured.
It does, however, /allow/ people to use "switch" in more complex and
less clear ways.
Try and write a program which takes any arbitrary switch construct (that usually means written by someone else, because obviously all yours will
be sensible), and cleanly isolates all the branches including the
default branch.
Hint: the lack of 'break' in a non-empty span between two case labels
will blur the line. So will a conditional break (example below unless
it's been culled).
You are confusing "this makes it possible to write messy code" with a
belief that messy code is inevitable or required. And you are
forgetting that it is always possible to write messy or
incomprehensible code in any language, with any construct.
I can't write that randomfloat example in my language.
I can't leave out
a 'break' in a switch statement (it's not meaningful). It is impossible
to do the crazy things you can do with switch in C.
Yes, with most languages you can write nonsense programs, but that
doesn't give the language a licence to forget basic rules and common
sense, and just allow any old rubbish even if clearly wrong:
int F() {
F(1, 2.3, "four", F,F,F,F(),F(F()));
F(42);
}
This is apparently valid C. It is impossible to write this in my language.
You can't use such a statement as a solid basis for a multi-way
construct that returns a value, since it is, in general, impossible
to sensibly enumerate the N branches.
It is simple and obvious to enumerate the branches in almost all
real-world cases of switch statements. (And /please/ don't faff
around with cherry-picked examples you have found somewhere as if they
were representative of anything.)
Oh, right. I'm not allowed to use counter-examples to lend weight to my comments. In that case, perhaps you shouldn't be allowed to use your sensible examples either. After all we don't know what someone will feed
to a compiler.
But, suppose C was upgraded so that switch could return a value. For
that, you'd need the value at the end of each branch. OK, here's a
simple one:
y = switch (x) {
case 12:
if (c) case 14: break;
100;
case 13:
200;
break;
}
Any ideas? I will guess that x=12/c=false or x=13 will yield 200. What
about x=12/c=true, or x=14, or x = anything else?
So if I understand correctly, you are saying that chains of if/else,
an imaginary version of "switch", and the C ternary operator all
evaluate the same things in the same way, while with C's switch you
have no idea what happens?
Yes. With C's switch, you can't /in-general/ isolate things into
distinct blocks. You might have a stab if you stick to a subset of C and follow various guidelines, in an effort to make 'switch' look normal.
See the example above.
That is true, if you cherry-pick what you choose to ignore in each
case until it fits your pre-conceived ideas.
You're the one who's cherry-picking examples of C!
Here is my attempt at
converting the above switch into my syntax (using a tool derived from my
C compiler):
switch x
when 12 then
if c then
fi
100
fallthrough
when 13 then
200
end switch
It doesn't attempt to deal with fallthrough, and misses out that
14-case, and that conditional break. It's not easy; I might have better
luck with assembly!
No, what you call "natural" is entirely subjective. You have looked
at a microscopic fraction of code written in a tiny proportion of
programming languages within a very narrow set of programming fields.
I've worked with systems programming and have done A LOT in the 15 years until the mid 90s. That included pretty much everything involved in
writing graphical applications given only a text-based disk OS that
provided file-handling.
Plus of course devising and implementing everything needed to run my own systems language. (After the mid 90s, Windows took over half the work.)
That's not criticism - few people have looked at anything more.
Very few people use their own languages, especially over such a long
period, also use them to write commercial applications, or create
languages for others to use.
What I /do/ criticise is that your assumption that this almost
negligible experience gives you the right to decide what is "natural"
or "true", or how programming languages or tools "should" work.
So, in your opinion, 'switch' should work how it works in C? That is the most intuitive and natural way of implementing it?
You need to learn that other people have different ideas, needs,
opinions or preferences.
Most people haven't got a clue about devising PLs.
I'd question the whole idea of having a construct that can
evaluate to something of different types in the first place, whether
or not it returns a value, but that's your choice.
If the result of a multi-way execution doesn't yield a value to be
used, then the types don't matter.
Of course they do.
Of course they don't! Here, F, G and H return int, float and void* respectively:
if (c1) F();
else if (c2) G();
else H();
C will not complain that those branches yield different types. But you
say it should do? Why?
You're just being contradictory for the sake of it aren't you?!
This is just common sense; I don't know why you're questioning it.
(I'd quite like to see a language of your design!)
def foo(n) :
if n == 1 : return 10
if n == 2 : return 20
if n == 3 : return
That's Python, quite happily having a multiple choice selection that
sometimes does not return a value.
Python /always/ returns some value. If one isn't provided, it returns
None. Which means checking that a function returns an explicit value
goes out the window. Delete the 10 and 20 (or the entire body), and it
still 'works'.
Yes, that is a dynamically typed language, not a statically typed
language.
std::optional<int> foo(int n) {
if (n == 1) return 10;
if (n == 2) return 20;
if (n == 3) return {};
}
That's C++, a statically typed language, with a multiple choice
selection that sometimes does not return a value - the return type
supports values of type "int" and non-values.
So what happens when n is 4? Does it return garbage (so that's bad).
Does it arrange to return some special value of 'optional' that means no value?
In that case, the type still does matter, but the language is
providing that default path for you.
X Y A B are arbitrary expressions. The need for 'else' is determined
during type analysis. Whether it will ever execute the default path
would be up to extra analysis, that I don't do, and would anyway be
done later.
But if it can never happen that neither X nor Y is true, then how
would you test the "else" clause? Surely you are not proposing that
programmers be required to write lines of code that will never be
executed and cannot be tested?
Why not? They still have to write 'end', or do you propose that can be
left out if control never reaches the end of the function?!
(In earlier versions of my dynamic language, the compiler would insert
an 'else' branch if one was needed, returning 'void'.
I decided that requiring an explicit 'else' branch was better and more failsafe.)
You can't design a language like this where valid syntax depends on
compiler and what it might or might not discover when analysing the
code.
Why not? It is entirely reasonable to say that a compiler for a
language has to be able to do certain types of analysis.
This was the first part of your example:
const char * flag_to_text_A(bool b) {
if (b == true) {
return "It's true!";
} else if (b == false) {
return "It's false!";
/I/ would question why you'd want to make the second branch conditional
in the first place. Write an 'else' there, and the issue doesn't arise.
Because I can't see the point of deliberately writing code that usually takes two paths, when either:
(1) you know that one will never be taken, or
(2) you're not sure, but don't make any provision in case it is
Fix that first rather relying on compiler writers to take care of your
badly written code.
And also, you keep belittling my abilities and my language, when C allows:
int F(void) {}
How about getting your house in order first.
Anyone who is convinced that their own personal preferences are more
"natural" or inherently superior to all other alternatives, and can't
justify their claims other than saying that everything else is "a
mess", is just navel-gazing.
I wrote more here but the post is already too long.
Let's just say that
'messy' is a fair assessment of C's conditional features, since you can write this:
On 03/11/2024 21:00, Bart wrote:
To my mind, this is a type of "multi-way selection" :
(const int []){ a, b, c }[n];
I can't see any good reason to exclude it as fitting the descriptive
phrase.
And if "a", "b" and "c" are not constant, but require
evaluation of some sort, it does not change things. Of course if these required significant effort to evaluate,
or had side-effects, then you
would most likely want a "multi-way selection" construction that did the selection first, then the evaluation - but that's a matter of programmer choice, and does not change the terms.
I am very keen on keeping the concepts distinct in cases where it
matters.
int x = if (randomfloat()<0.5) 42;
In C, no. But when we have spread to other languages, including hypothetical languages, there's nothing to stop that. Not only could it
be supported by the run-time type system, but it would be possible to
have compile-time types that are more flexible
and only need to be
"solidified" during code generation. That might allow the language to track things like "uninitialised" or "no value" during compilation
without having them part of a real type (such as std::optional<> or a C
It doesn't return a value. That is why it is UB to try to use that non-existent value.
My language will not allow it. Most people would say that that's a
good thing. You seem to want to take the perverse view that such code
should be allowed to return garbage values or have undefined behaviour.
Is your idea of "most people" based on a survey of more than one person?
Note that I have not suggested returning garbage values - I have
suggested that a language might support handling "no value" in a
convenient and safe manner.
Totally independent of and orthogonal to that, I strongly believe that
there is no point in trying to define behaviour for something that
cannot happen,
With justification. 0010 means 8 in C? Jesus.
I think the word "neighbour" is counter-intuitive to spell.
Once a thread here has wandered this far off-topic, it is perhaps not unreasonable to draw comparisons with your one-man language.
The real problem with your language is that you think it is perfect
int F() {
F(1, 2.3, "four", F,F,F,F(),F(F()));
F(42);
It is undefined behaviour in C. Programmers are expected to write
sensible code.
If I were the designer of the C language and the maintainer of the C standards, you might have a point. C is not /my/ language.
We can agree that C /lets/ people write messy code. It does not
/require/ it. And I have never found a programming language that stops people writing messy code.
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
To my mind, this is a type of "multi-way selection" :
(const int []){ a, b, c }[n];
I can't see any good reason to exclude it as fitting the descriptive
phrase.
And if "a", "b" and "c" are not constant, but require evaluation of
some sort, it does not change things. Of course if these required
significant effort to evaluate,
Or you had a hundred of them.
or had side-effects, then you would most likely want a "multi-way
selection" construction that did the selection first, then the
evaluation - but that's a matter of programmer choice, and does not
change the terms.
You still don't get how different the concepts are.
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
Here is a summary of C vs my language.
I am very keen on keeping the concepts distinct in cases where it
matters.
I know, you like to mix things up. I like clear lines:
func F:int ... Always returns a value
proc P ... Never returns a value
and only need to be "solidified" during code generation. That might
allow the language to track things like "uninitialised" or "no value"
during compilation without having them part of a real type (such as
std::optional<> or a C
But you are always returning an actual type in agreement with the
language. That is my point. You're not choosing to just fall off that
cliff and return garbage or just crash.
However, your example with std::optional did just that, despite having
that type available.
It doesn't return a value. That is why it is UB to try to use that
non-existent value.
And why it is so easy to avoid that UB.
Note that I have not suggested returning garbage values - I have
suggested that a language might support handling "no value" in a
convenient and safe manner.
But in C it is garbage.
Totally independent of and orthogonal to that, I strongly believe that
there is no point in trying to define behaviour for something that
cannot happen,
But it could for n==4.
EVERYBODY agrees that leading zero octals in C were a terrible idea. You can't say it's just me thinks that!
int F() {
F(1, 2.3, "four", F,F,F,F(),F(F()));
F(42);
It is undefined behaviour in C. Programmers are expected to write
sensible code.
But it would be nice if the language stopped people writing such things, yes?
Can you tell me which other current languages, other than C++ and
assembly, allow such nonsense?
None? So it's not just me and my language then! Mine is lower level and still plenty unsafe, but it has somewhat higher standards.
If I were the designer of the C language and the maintainer of the C
standards, you might have a point. C is not /my/ language.
You do like to defend it though.
We can agree that C /lets/ people write messy code. It does not
/require/ it. And I have never found a programming language that
stops people writing messy code.
I had included half a dozen points that made C's 'if' error-prone and confusing, which would not occur in my syntax because it is better designed.
You seem to be incapable of drawing a line between what a language can enforce, and what a programmer is free to express.
Or rather, because a programmer has so much freedom anyway, let's not
bother with any lines at all! Just have a language that simply doesn't
care.
On 04/11/2024 20:50, Bart wrote:
But it could for n==4.
Again, you /completely/ miss the point.
If you have a function (or construct) that returns a correct value for inputs 1, 2 and 3, and you never pass it the value 4 (or anything else), then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
On 04/11/2024 20:50, Bart wrote:
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
To my mind, this is a type of "multi-way selection" :
(const int []){ a, b, c }[n];
I can't see any good reason to exclude it as fitting the descriptive
phrase.
And if "a", "b" and "c" are not constant, but require evaluation of
some sort, it does not change things. Of course if these required
significant effort to evaluate,
Or you had a hundred of them.
or had side-effects, then you would most likely want a "multi-way
selection" construction that did the selection first, then the
evaluation - but that's a matter of programmer choice, and does not
change the terms.
You still don't get how different the concepts are.
Yes, I do. I also understand how they are sometimes exactly the same thing, depending on the language, and how they can often have the same
end result, depending on the details, and how they can often be
different, especially in the face of side-effects or efficiency concerns.
Look, it's really /very/ simple.
A) You can have a construct that says "choose one of these N things to execute and evaluate, and return that value (if any)".
B) You can have a construct that says "here are N things, select one of
them to return as a value".
Both of these can reasonably be called "multi-way selection" constructs.
Some languages can have one as a common construct, other languages may have the other, and many support both in some way. Pretty much any language that allows the programmer to have control over execution order will let you do both in some way, even if there is not a clear language construct for it and you have to write it manually in code.
Mostly type A will be most efficient if there is a lot of effort
involved in putting together the things to select. Type B is likely to
be most efficient if you already have the collection of things to choose from (it can be as simple as an array lookup), if the creation of the collection can be done in parallel (such as in some SIMD uses), or if
the cpu can generate them all before it has established the selection
index.
Sometimes type A will be the simplest and clearest in the code,
sometimes type B will be the simplest and clearest in the code.
Both of these constructs are "multi-way selections".
Your mistake is in thinking that type A is all there is and all that matters, possibly because you feel you have a better implementation for
it than C has. (I think that you /do/ have a nicer switch than C, but
that does not justify limiting your thinking to it.)
On 04/11/2024 22:25, David Brown wrote:
On 04/11/2024 20:50, Bart wrote:
But it could for n==4.
Again, you /completely/ miss the point.
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything
else), then there is no undefined behaviour no matter what the code
looks like for values other than 1, 2 and 3. If someone calls that
function with input 4, then /their/ code has the error - not the code
that doesn't handle an input 4.
This is the wrong kind of thinking.
If this was a library function then, sure, you can stipulate a set of
input values, but that's at a different level, where you are writing
code on top of a working, well-specified language.
You don't make use of holes in the language that can cause a crash. That is, by allowing a function to run into an internal RET op with no provision for a result. That's if there even is a RET; perhaps your compilers are so confident that that path is not taken, or you hint it
won't be, that they won't bother!
It will start executing whatever random bytes follow the function.
As I said in my last post, a missing return value caused an internal
error in one of my C implementations because a pushed return value was missing.
How should that be fixed, via a hack in the implementation which pushes
some random value to avoid an immediate crash? And then what?
Let the user - the author of the function - explicitly provide that
value then at least that can be documented: if N isn't in 1..3, then F returns so and so.
You know that makes perfect sense, but because you've got used to that dangerous feature in C you think it's acceptable.
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
On 02.11.2024 19:09, Tim Rentsch wrote:
[...] As long as
the code is logically correct you are free to choose either
style, and it's perfectly okay to choose the one that you find
more appealing.
This is certainly true for one-man-shows.
Hardly suited for most professional contexts I worked in.
On 04/11/2024 04:00, Tim Rentsch wrote:
fir <fir@grunge.pl> writes:
Tim Rentsch wrote:
With the understanding that I am offering [nothing] more than my
own opinion, I can say that I might use any of the patterns
mentioned, depending on circumstances. I don't think any one
approach is either always right or always wrong.
maybe, but some may have strong arguments (for using this and
not that) i may overlook
I acknowledge the point, but you haven't gotten any arguments,
only opinions.
Pretty much everything about PL design is somebody's opinion.
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for
an 'else' branch, which is dependent on the compiler performing some arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
I think this is all very dependent on what you mean by "all input values".
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's /their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
This is, IMHO, just nonsense and misunderstands the contract between function writers and function users.
Further, I am confident that these people are quite happy to write code like:
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
Perhaps, in a mistaken belief that it makes the code "safe", they will add :
if (!p) return 0;
at the start of the function. But they will not check that "p" actually points to an array of two ints (how could they?), nor will they check
for integer overflow (and what would they do if it happened?).
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a short-cut for conveniently handling a wide range of valid input values -
it is never a tool for handling /invalid/ input values.
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for
an 'else' branch, which is dependent on the compiler performing some arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
Even if you went with the first, what happens if the compiler can't guarantee that all values of a selector are covered; should it report
that, or say nothing?
What happens if you do need 'else', but later change things so all bases
are covered; will the compiler report it as being unnecessary, so that
you remove it?
Now, C doesn't have such a feature to test out (ie. that is a construct
with an optional 'else' branch, the whole of which returns a value). The nearest is function return values:
int F(int n) {
if (n==1) return 10;
if (n==2) return 20;
}
Here, neither tcc nor gcc reports that you might run off the end of the function. It will return garbage if called with anything other than 1 or 2.
gcc will say something with enough warning levels (reaches end of
non-void function). But it will say the same here:
int F(unsigned char c) {
if (c<128) return 10;
if (c>=128) return 20;
}
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
I think this is all very dependent on what you mean by "all input values".
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's
/their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
Well, some languages treat types more seriously than C. In Pascal
the type of your input would be 0..10 and all input values would be
handled. Sure, when the domain is too complicated to express in a type
then it could be a documented restriction. Still, it makes sense to
signal an error if a value goes outside the handled range, so in a sense all
values of the input type are handled: either you get a valid answer or
a clear error.
This is, IMHO, just nonsense and misunderstands the contract between
function writers and function users.
Further, I am confident that these people are quite happy to write code
like :
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
I do not think that people wanting strong type checking are happy
with C. Simply, either they use a different language or use C
without bitching, but aware of its limitations.
I certainly would
be quite unhappy with code above. It is possible that I would still
use it as a compromise (say if it was desirable to have single
prototype but handle points in spaces of various dimensions),
but my first attempt would be something like:
typedef struct {int p[2];} two_int;
....
Perhaps, in a mistaken belief that it makes the code "safe", they will add :
if (!p) return 0;
at the start of the function. But they will not check that "p" actually
points to an array of two ints (how could they?), nor will they check
for integer overflow (and what would they do if it happened?).
I am certainly unhappy with overflow handling in current hardware
and by extension with overflow handling in C.
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a
short-cut for conveniently handling a wide range of valid input values -
it is never a tool for handling /invalid/ input values.
Well, a default can signal an error, which frequently is the right handling
of invalid input values.
On 05/11/2024 13:42, Waldek Hebisch wrote:
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's /their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
Perhaps, in a mistaken belief that it makes the code "safe", they will
add :
if (!p) return 0;
at the start of the function. But they will not check that "p" actually points to an array of two ints (how could they?), nor will they check
for integer overflow (and what would they do if it happened?).
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it
means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
Bart <bc@freeuk.com> wrote:
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one
of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for
an 'else' branch, which is dependent on the compiler performing some
arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
Well, frequently it is easier to do a bad job than a good one.
Normally you do not need very complex analysis:
On 05/11/2024 20:33, David Brown wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
(defmacro nsel (expr . clauses)
  ^(caseql ,expr ,*[mapcar list 1 clauses]))

(nsel 1 (prinl "one") (prinl "two") (prinl "three"))
"one"
(nsel (+ 1 1) (prinl "one") (prinl "two") (prinl "three"))
"two"
(nsel (+ 1 3) (prinl "one") (prinl "two") (prinl "three"))
nil
(nsel (+ 1 2) (prinl "one") (prinl "two") (prinl "three"))
"three"
(macroexpand-1 '(nsel x a b c d))
(caseql x (1 a)
On 05/11/2024 20:33, David Brown wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means
branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
On 2024-11-05, Bart <bc@freeuk.com> wrote:
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way
select:
x := (n | a, b, c, ... | z)
This looks quite error-prone. You have to count carefully that
the cases match the intended values. If an entry is
inserted, all the remaining ones shift to a higher value.
You've basically taken a case construct and auto-generated
the labels starting from 1.
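The shift hazard described above can be avoided in plain C by giving each case an explicit label; a minimal sketch (the enum and function names are my own invention, not code from the thread):

```c
/* With explicit labels, inserting a new entry does not renumber
   the existing cases, unlike positional selection. */
enum { SEL_ONE = 1, SEL_TWO = 2, SEL_THREE = 3 };

static const char *sel_name(int n)
{
    switch (n) {
    case SEL_ONE:   return "one";
    case SEL_TWO:   return "two";
    case SEL_THREE: return "three";
    default:        return "other";
    }
}
```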
On 04/11/2024 20:50, Bart wrote:
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
Here is a summary of C vs my language.
<snip the irrelevant stuff>
I am very keen on keeping the concepts distinct in cases where it
matters.
I know, you like to mix things up. I like clear lines:
func F:int ... Always returns a value
proc P ... Never returns a value
Oh, you /know/ that, do you? And how do you "know" that? Is that
because you still think I am personally responsible for the C language,
and that I think C is the be-all and end-all of perfect languages?
I agree that it can make sense to divide different types of "function".
I disagree that whether or not a value is returned has any significant relevance. I see no difference, other than minor syntactic issues,
between "int foo(...)" and "void foo(int * result, ...)".
If you have a function (or construct) that returns a correct value for inputs 1, 2 and 3, and you never pass it the value 4 (or anything else), then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
I agree that this is a terrible idea. <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60523>
But picking one terrible idea in C does not mean /everything/ in C is a terrible idea! /That/ is what you got wrong, as you do so often.
Can you tell me which other current languages, other than C++ and
assembly, allow such nonsense?
Python.
Of course, it is equally meaningless in Python as it is in C.
I defend it if that is appropriate. Mostly, I /explain/ it to you. It
is bizarre that people need to do that for someone who claims to have written a C compiler, but there it is.
I'm glad you didn't - it would be a waste of effort.
You /do/ understand that I use top-quality tools with carefully chosen warnings, set to throw fatal errors, precisely because I want a language that has a lot more "lines" and restrictions than your little tools?
/Every/ C programmer uses a restricted subset of C - some more
restricted than others. I choose to use a very strict subset of C for
my work, because it is the best language for the tasks I need to do. (I also use a very strict subset of C++ when it is a better choice.)
On 05/11/2024 13:29, David Brown wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's
/their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
Your example is an improvement on your previous ones. At least it
attempts to deal with out-of-range conditions!
However there is still the question of providing that return type. If 'unreachable' is not a special language feature, then this can fail
either if the language requires the 'return' keyword, or 'unreachable' doesn't yield a compatible type (even if it never returns because it's
an error handler).
Getting that right will satisfy both the language (if it cared more
about such matters than C apparently does), and the casual reader
curious about how the function contract is met (that is, supplying that promised return int type if or when it returns).
// Take a pointer to an array of two ints, add them, and return the sum
int sum_two_ints(const int * p) {
return p[0] + p[1];
}
Perhaps, in a mistaken belief that it makes the code "safe", they will
add :
if (!p) return 0;
at the start of the function. But they will not check that "p"
actually points to an array of two ints (how could they?), nor will
they check for integer overflow (and what would they do if it happened?).
This is a different category of error.
Here's a related example of what I'd class as a language error:
int a;
a = (exit(0), &a);
A type mismatch error is usually reported. However, the assignment is
never done because it never returns from that exit() call.
I expect you wouldn't think much of a compiler that didn't report such
an error because that code is never executed.
But to me that is little different from running into the end of a function without the proper provisions for a valid return value.
On 04/11/2024 22:25, David Brown wrote:
On 04/11/2024 20:50, Bart wrote:
On 04/11/2024 16:35, David Brown wrote:
On 03/11/2024 21:00, Bart wrote:
Here is a summary of C vs my language.
<snip the irrelevant stuff>
I am very keen on keeping the concepts distinct in cases where it
matters.
I know, you like to mix things up. I like clear lines:
func F:int ... Always returns a value
proc P ... Never returns a value
Oh, you /know/ that, do you? And how do you "know" that? Is that
because you still think I am personally responsible for the C
language, and that I think C is the be-all and end-all of perfect
languages?
I agree that it can make sense to divide different types of
"function". I disagree that whether or not a value is returned has any
significant relevance. I see no difference, other than minor
syntactic issues, between "int foo(...)" and "void foo(int * result,
...)".
I don't use functional concepts; my functions may or may not be pure.
But the difference between value-returning and non-value returning
functions to me is significant:
                     Func    Proc
  return x;           Y       N
  return;             N       Y
  hit final }         N       Y
  Pure                ?       Unlikely
  Side-effects        ?       Likely
  Call within expr    Y       N
  Call standalone     ?       Y
Having a clear distinction helps me focus more precisely on how a
routine has to work.
In C, the syntax is dreadful: not only can you barely distinguish a
function from a procedure (even without attributes, user types and
macros add in), but you can hardly tell them apart from variable declarations.
In fact, function declarations can even be declared in the middle of a
set of variable declarations.
You can learn a lot about the underlying structure of a language by implementing it. So when I generate IL from C for example, I found the
need to have separate instructions to call functions and procedures, and separate return instructions too.
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything
else), then there is no undefined behaviour no matter what the code
looks like for values other than 1, 2 and 3. If someone calls that
function with input 4, then /their/ code has the error - not the code
that doesn't handle an input 4.
No. The function they are calling is badly formed. There should never be
any circumstance where a value-returning function terminates (hopefully
by running into RET) without an explicit set return value.
I agree that this is a terrible idea.
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60523>
But picking one terrible idea in C does not mean /everything/ in C is
a terrible idea! /That/ is what you got wrong, as you do so often.
What the language does is generally fine. /How/ it does is generally terrible. (Type syntax; no 'fun' keyword; = vs ==; operator precedence; format codes; 'break' in switch; export by default; struct T vs typedef
T; dangling 'else'; optional braces; ... there's reams of this stuff!)
So actually, I'm not wrong. There have been discussions about all of
these and a lot more.
Can you tell me which other current languages, other than C++ and
assembly, allow such nonsense?
Python.
Of course, it is equally meaningless in Python as it is in C.
Python at least can trap the errors. Once you fix the unlimited
recursion, it will detect the wrong number of arguments. In C, before
C23 anyway, any number and types of arguments is legal in that example.
I defend it if that is appropriate. Mostly, I /explain/ it to you.
It is bizarre that people need to do that for someone who claims to
have written a C compiler, but there it is.
It is bizarre that the ins and outs of C, a supposedly simple language,
are so hard to understand.
I'm glad you didn't - it would be a waste of effort.
I guessed that. You seemingly don't care that C is a messy language with many quirks; you just work around it by using a subset, with some help
from your compiler in enforcing that subset.
So you're using a strict dialect. The trouble is that everyone else
using C will either be using their own dialect incompatible with yours,
or are stuck using the messy language and laid-back compilers operating
in lax mode by default.
I'm interested in fixing things at source - within a language.
You /do/ understand that I use top-quality tools with carefully chosen
warnings, set to throw fatal errors, precisely because I want a
language that has a lot more "lines" and restrictions than your little
tools? /Every/ C programmer uses a restricted subset of C - some more
restricted than others. I choose to use a very strict subset of C for
my work, because it is the best language for the tasks I need to do.
(I also use a very strict subset of C++ when it is a better choice.)
I'd guess only 1% of your work with C involves the actual language, and
99% using additional tooling.
With me it's mostly about the language.
On 06/11/2024 15:40, Bart wrote:
There are irrelevant differences in syntax, which could easily disappear entirely if a language supported a default initialisation value when a return gives no explicit value. (i.e., "T foo() { return; }; T x =
foo();" could be treated in the same way as "T x;" in a static initialisation context.)
Then you list some things that may or may not happen, which are of
course totally irrelevant. If you list the differences between bikes
and cars, you don't include "some cars are red" and "bikes are unlikely
to be blue".
It's a pointless distinction. Any function or procedure can be morphed into the other form without any difference in the semantic meaning of
the code, requiring just a bit of re-arrangement at the caller site:
int foo(int x) { int y = ...; return y; }
void foo(int * res, int x) { int y = ...; *res = y; }
void foo(int x) { ... ; return; }
int foo(int x) { ... ; return 0; }
There is no relevance in the division here, which is why most languages don't make a distinction unless they do so simply for syntactic reasons.
In C, the syntax is dreadful: not only can you barely distinguish a
function from a procedure (even without attributes, user types and
macros add in), but you can hardly tell them apart from variable
declarations.
As always, you are trying to make your limited ideas of programming languages appear to be correct, universal, obvious or "natural" by
saying things that you think are flaws in C. That's not how a
discussion works, and it is not a way to convince anyone of anything.
The fact that C does not have a keyword used in the declaration or definition of a function does not in any way mean that there is the slightest point in your artificial split between "func" and "proc" functions.
(It doesn't matter that I too prefer a clear keyword for defining
functions in a language.)
That is solely from your choice of an IL.
Of course you are wrong!
If there was an alternative language that I thought would be better for
the tasks I have, I'd use that. (Actually, a subset of C++ is often better, so I use that when I can.)
What do you think I should do instead? Whine in newsgroups to people
that don't write language standards (for C or anything else) and don't
make compilers?
Make my own personal language that is useless to
everyone else and holds my customers to ransom by being the only person
that can work with their code?
On 05/11/2024 23:48, Bart wrote:
On 05/11/2024 13:29, David Brown wrote:
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
"unreachable()" is a C23 standardisation of a feature found in most
high-end compilers. For gcc and clang, there is
__builtin_unreachable(), and MSVC has its version.
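A sketch of how the function above might be written portably across the compilers named (the MSVC spelling `__assume(0)` and the no-op fallback are my assumptions; treat this as an illustration, not the poster's code):

```c
/* Portable spelling of "unreachable": C23 puts unreachable() in
   <stddef.h>; older gcc/clang and MSVC have their own builtins. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
  #include <stddef.h>
#elif defined(__GNUC__)
  #define unreachable() __builtin_unreachable()
#elif defined(_MSC_VER)
  #define unreachable() __assume(0)
#else
  #define unreachable() ((void)0)   /* fallback: no-op */
#endif

/* Integer square root for 0 <= x <= 10; out-of-range input is the
   caller's contract violation. */
static int small_int_sqrt(int x)
{
    if (x == 0) return 0;
    if (x < 4)  return 1;
    if (x < 9)  return 2;
    if (x < 16) return 3;
    unreachable();
}
```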
Getting that right will satisfy both the language (if it cared more
about such matters than C apparently does), and the casual reader
curious about how the function contract is met (that is, supplying
that promised return int type if or when it returns).
C gets it right here. There is no need for a return type when there is
no return - indeed, trying to force some sort of type or "default" value
would be counterproductive. It would be confusing to the reader, add untestable and unexecutable source code,
Let's now look at another alternative - have the function check for validity, and return some kind of error signal if the input is invalid. There are two ways to do this - we can have a value of the main return type acting as an error signal, or we can have an additional return value.
All in all, we have a significant costs in various aspects, with no real benefit, all in the name of a mistaken belief that we are avoiding
undefined behaviour.
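The two ways mentioned can be sketched like this (a minimal illustration under my own naming, not code from the post):

```c
#include <stdbool.h>

/* Style 1: a value of the return type doubles as the error signal. */
static int small_sqrt_sentinel(int x)
{
    if (x < 0 || x > 10) return -1;     /* -1 means "invalid input" */
    int r = 0;
    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Style 2: the status is a separate return value. */
static bool small_sqrt_checked(int x, int *out)
{
    if (x < 0 || x > 10)
        return false;
    *out = small_sqrt_sentinel(x);
    return true;
}
```

Style 1 steals a value from the result range; style 2 forces every caller to handle two results. That trade-off is part of the cost referred to above.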
On 06/11/2024 14:50, David Brown wrote:
C gets it right here. There is no need for a return type when there
is no return
There is no return for only half the function! A function with a return
type is a function that CAN return. If it can't ever return, then make
it a procedure.
Take this function where N can never be zero; is this the right way to
write it in C:
int F(int N) {
if (N==0) unreachable();
return abc/N; // abc is a global with value 100
}
It doesn't look right. If I compile it with gcc (using __builtin_unreachable), and call F(0), then it crashes. So it doesn't do much, does it?!
On 06/11/2024 15:47, David Brown wrote:
On 06/11/2024 15:40, Bart wrote:
There are irrelevant differences in syntax, which could easily
disappear entirely if a language supported a default initialisation
value when a return gives no explicit value. (i.e., "T foo() {
return; }; T x = foo();" could be treated in the same way as "T x;" in
a static initialisation context.)
You wrote:
T foo () {return;} # definition?
T x = foo(); # call?
I'm not quite sure what you're saying here. That a missing return value
in a non-void function would default to all-zeros?
Maybe. A rather pointless feature just to avoid writing '0', and which
now introduces a new opportunity for a silent error (accidentally
forgetting a return value).
It's not quite the same as a static initialisation, which is zeroed
when a program starts.
Then you list some things that may or may not happen, which are of
course totally irrelevant. If you list the differences between bikes
and cars, you don't include "some cars are red" and "bikes are
unlikely to be blue".
Yes; if you're using a vehicle, or planning a journey or any related
thing, it helps to remember if it's a bike or a car! At least here you acknowledge the difference.
But I guess you find those likely/unlikely macros of gcc pointless too.
If I know something is a procedure, then I also know it is likely to
change global state, that I might need to deal with a return value, and
a bunch of other stuff.
Boldly separating the two with either FUNC or PROC denotations I find
helps tremendously. YM-obviously-V, but you can't have a go at me for my view.
If I really found it a waste of time, the distinction would have been dropped decades ago.
It's a pointless distinction. Any function or procedure can be
morphed into the other form without any difference in the semantic
meaning of the code, requiring just a bit of re-arrangement at the
caller site:
int foo(int x) { int y = ...; return y; }
void foo(int * res, int x) { int y = ...; *res = y; }
void foo(int x) { ... ; return; }
int foo(int x) { ... ; return 0; }
There is no relevance in the division here, which is why most
languages don't make a distinction unless they do so simply for
syntactic reasons.
As I said, you like to mix things up. You disagreed. I'm not surprised.
Here you've demonstrated how a function that returns results by value
can be turned into a procedure that returns a result by reference.
So now, by-value and by-reference are the same thing?
I listed seven practical points of difference between functions and procedures, and above is an eighth point, but you just dismiss them.
Is there any point in this?
I do like taking what some think as a single feature and having
dedicated versions, because I find it helpful.
That includes functions, loops, control flow and selections.
In C, the syntax is dreadful: not only can you barely distinguish a
function from a procedure (even without attributes, user types and
macros add in), but you can hardly tell them apart from variable
declarations.
As always, you are trying to make your limited ideas of programming
languages appear to be correct, universal, obvious or "natural" by
saying things that you think are flaws in C. That's not how a
discussion works, and it is not a way to convince anyone of anything.
The fact that C does not have a keyword used in the declaration or
definition of a function does not in any way mean that there is the
slightest point in your artificial split between "func" and "proc"
functions.
void F();
void (*G);
void *H();
void (*I)();
OK, 4 things declared here. Are they procedures, functions, variables,
or pointers to functions? (I avoided using a typedef in place of 'void'
to make things easier.)
I /think/ they are as follows: procedure, pointer variable, function (returning void*), and pointer to a procedure. But I had to work at it,
even though the examples are very simple.
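One way to check that reading is to name the "procedure" type with a typedef (my illustration, not from either post):

```c
/* The same four declarations, rewritten so the intent is visible. */
typedef void Proc(void);   /* a "procedure": function returning void */

Proc  F;          /* F: procedure                                   */
void (*G);        /* G: pointer-to-void variable (parens redundant) */
void *H(void);    /* H: function returning void*                    */
Proc *I;          /* I: pointer to a procedure                      */
```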
I don't know about you, but I prefer syntax like this:
proc F
ref void G
ref proc H
func I -> ref void
Now come on, scream at me again for preferring a nice syntax for
programming, one which just tells me at a glance what it means without having to work it out.
(It doesn't matter that I too prefer a clear keyword for defining
functions in a language.)
Why? Don't your smart tools tell you all that anyway?
That is solely from your choice of an IL.
The IL design also falls into place from the natural way these things
have to work.
Of course you are wrong!
You keep saying that. But then you also keep saying, from time to time,
that you agree that something in C was a bad idea. So I'm still wrong
when calling out the same thing?
If there was an alternative language that I thought would be better
for the tasks I have, I'd use that. (Actually, a subset of C++ is
often better, so I use that when I can.)
What do you think I should do instead? Whine in newsgroups to people
that don't write language standards (for C or anything else) and don't
make compilers?
What makes you think I'm whining? The thread opened up a discussion
about multi-way selections, and it got into how it could be done with features from other languages.
I gave some examples from mine, as I'm very familiar with that, and it
uses simple features that are easy to grasp and appreciate. You could
have done the same from ones you know.
But you just hate the idea that I have my own language to draw on, whose syntax is very sweet ('serious' languages hate such syntax for some
reason, and is usually relegated to scripting languages.)
I guess then you just have to belittle and insult me, my languages and
my views at every opportunity.
Make my own personal language that is useless to everyone else and
holds my customers to ransom by being the only person that can work
with their code?
Plenty of companies use DSLs. But isn't that sort of what you do? That
is, using 'C' with a particular interpretation or enforcement of the
rules, which needs to go in hand with a particular compiler, version,
sets of options and assorted makefiles.
I for one would never be able to build one of your programs. It might as well be written in your in-house language with proprietary tools.
On 06/11/2024 20:38, Bart wrote:
void F();
void (*G);
void *H();
void (*I)();
OK, 4 things declared here. Are they procedures, functions, variables,
or pointers to functions? (I avoided using a typedef in place of
'void' to make things easier.)
I /think/ they are as follows: procedure, pointer variable, function
(returning void*), and pointer to a procedure. But I had to work at
it, even though the examples are very simple.
I don't know about you, but I prefer syntax like this:
proc F
ref void G
ref proc H
func I -> ref void
It is not the use of a keyword for functions that I disagree with, nor
am I arguing for C's syntax or against your use of "ref" or ordering. I simply don't think there is much to be gained by using "proc F" instead
of "func F -> void" (assuming that's the right syntax) - or just "func F".
But I think there is quite a bit to be gained if the func/proc
distinction told us something useful and new, rather than just the
existence or lack of a return type.
On 06/11/2024 14:50, David Brown wrote:
On 05/11/2024 23:48, Bart wrote:
On 05/11/2024 13:29, David Brown wrote:
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
"unreachable()" is a C23 standardisation of a feature found in most
high-end compilers. For gcc and clang, there is
__builtin_unreachable(), and MSVC has its version.
So it's a kludge.
Cool, I can create one of those too:
func smallsqrt(int x)int =
if
elsif x=0 then 0
elsif x<4 then 1
elsif x<9 then 2
elsif x<16 then 3
dummyelse int.min
fi
end
'dummyelse' is a special version of 'else' that tells the compiler that control will (should) never arrive there. ATM it does nothing but inform
the reader of that and to remind the author. But later stages of the compiler can choose not to generate code for it, or to generate error-reporting code.
(BTW your example lets through negative values; I haven't fixed that.)
This is all a large and complex subject. But it's not really the point
of the discussion.
On 07/11/2024 13:23, Bart wrote:
On 06/11/2024 14:50, David Brown wrote:
On 05/11/2024 23:48, Bart wrote:
On 05/11/2024 13:29, David Brown wrote:
int small_int_sqrt(int x) {
if (x == 0) return 0;
if (x < 4) return 1;
if (x < 9) return 2;
if (x < 16) return 3;
unreachable();
}
"unreachable()" is a C23 standardisation of a feature found in most
high-end compilers. For gcc and clang, there is
__builtin_unreachable(), and MSVC has its version.
So it's a kludge.
You mean it is something you don't understand? Think of this as an opportunity to learn something new.
'dummyelse' is a special version of 'else' that tells the compiler
that control will (should) never arrive there. ATM it does nothing but
inform the reader of that and to remind the author. But later stages
of the compiler can choose not to generate code for it, or to generate
error-reporting code.
You are missing the point - that is shown clearly by the "int.min".
You have your way of doing things, and have no interest in learning
anything else or even bothering to listen or think.
Your bizarre hatred of C is overpowering for you.
On 02/11/2024 21:44, Bart wrote:
(Note that the '|' is my example is not 'or'; it means 'then':
( c | a ) # these are exactly equivalent
if c then a fi
( c | a | b ) # so are these
if c then a else b fi
There is no restriction on what a and b are, statements or
expressions, unless the whole returns some value.)
Ah, so your language has a disastrous choice of syntax here so that
sometimes "a | b" means "or", and sometimes it means "then" or
"implies", and sometimes it means "else".
Why have a second syntax with
a confusing choice of operators when you have a perfectly good "if /
then / else" syntax?
Or if you feel an operator adds a lot to the
language here, why not choose one that would make sense to people, such
as "=>" - the common mathematical symbol for "implies".
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
Where again one of these elements is evaluated, selected by n (here
having the values of 1, 2, 3, ... compared with true, false above, but
there need to be at least 2 elements inside |...| to distinguish them).
I applied it also to other statements that can be provide values, such
as if-elsif chains and switch, but there the selection might be
different (eg. a series of tests are done sequentially until a true one).
I don't know how it got turned into 'multi-way'.
On 03.11.2024 18:00, David Brown wrote:
On 02/11/2024 21:44, Bart wrote:
(Note that the '|' is my example is not 'or'; it means 'then':
( c | a ) # these are exactly equivalent
if c then a fi
( c | a | b ) # so are these
if c then a else b fi
There is no restriction on what a and b are, statements or
expressions, unless the whole returns some value.)
Ah, so your language has a disastrous choice of syntax here so that
sometimes "a | b" means "or", and sometimes it means "then" or
"implies", and sometimes it means "else".
(I can't comment on the "other use" of the same syntax in the
"poster's language" since it's not quoted here.)
But it's not uncommon in programming languages that operators
are context specific, and may mean different things depending
on context.
You are saying "disastrous choice of syntax". - Wow! Hard stuff.
I suggest to cool down before continuing reading further. :-)
Incidentally above syntax is what Algol 68 supports;
Or if you feel an operator adds a lot to the
language here, why not choose one that would make sense to people, such
as "=>" - the common mathematical symbol for "implies".
This is, as an opinion, of course arguable. It's certainly also
influenced where one is coming from (i.e. personal expertise
from other languages).
The detail of what symbols are used is
not that important to me, if it fits to the overall language
design.
From the high-level languages I used in my life I was almost
always "missing" something with conditional expressions. I
don't want separate and restricted syntaxes (plural!) in "C"
(for statements and expressions respectively), for example.
Some are lacking conditional expressions completely. Others
support the syntax with a 'fi' end-terminator and simplify
structures (and add to maintainability) supporting 'else-if'.
And few allow 'if' expressions on the left-hand side of an
assignment. (Algol 68 happens to support everything I need.
Unfortunately it's a language I never used professionally.)
I'm positive that folks who use languages that support those
syntactic forms wouldn't like to miss them. (Me for sure.)
("disastrous syntax" - I'm still laughing... :-)
On 03.11.2024 18:00, David Brown wrote:
or using the respective alternative forms with ( a | b | c) ,
or ( a | b ) where no 'ELSE' is required. (And there's also
the 'ELIF' and the '|:' as alternative form available.)
BTW, the same symbols can also be used as an alternative form
of the 'case' statement; the semantic distinction is made by
context, e.g. the types involved in the construct.
Bart, out of interest; have you invented that syntax for your
language yourself of borrowed it from another language (like
Algol 68)?
On 08/11/2024 17:37, Janis Papanagnou wrote:
BTW, the same symbols can also be used as an alternative form
of the 'case' statement; the semantic distinction is made by
context, e.g. the types involved in the construct.
You mean whether the 'a' in '(a | b... | c)' has type Bool rather than Int?
I've always discriminated on the number of terms between the two |s:
either 1, or more than 1.
Bart, out of interest; have you invented that syntax for your
language yourself of borrowed it from another language (like
Algol 68)?
It was heavily inspired by the syntax (not the semantics) of Algol68,
even though I'd never used it at that point.
I like that it solved the annoying begin-end aspect of Algol60/Pascal
syntax where you have to write the clunky:
[snip examples]
I enhanced it by not needing stropping (and so not allowing embedded
spaces within names); allowing redundant semicolons while at the same
time, turning newlines into semicolons when a line obviously didn't
continue; plus allowing ordinary 'end' or 'end if' to be used as well as 'fi'.
My version then can look like this, a bit less forbidding than Algol68:
if cond then
s1
s2
else
s3
s4
end
On 08/11/2024 18:37, Janis Papanagnou wrote:
The | operator means "or" in the OP's language (AFAIK - only he actually knows the language). So "(a | b | c)" in that language will sometimes
mean the same as "(a | b | c)" in C, and sometimes it will mean the same
as "(a ? b : c)" in C.
There may be some clear distinguishing feature that disambiguates these
uses. But this is a one-man language - there is no need for a clear
syntax or grammar, documentation, consistency in the language, or a consideration for corner cases or unusual uses.
Incidentally above syntax is what Algol 68 supports;
Yes, he said later that Algol 68 was the inspiration for it. Algol 68
was very successful in its day - but there are good reasons why many of
its design choices have been left behind long ago in newer languages.
This is as opinion of course arguable. It's certainly also
influenced where one is coming from (i.e. personal expertise
from other languages).
The language here is "mathematics". I would not expect anyone who even considers designing a programming language to be unfamiliar with that
symbol.
The detail of what symbols are used is
not that important to me, if it fits to the overall language
design.
I am quite happy with the same symbol being used for very different
meanings in different contexts. C's use of "*" for indirection and for multiplication is rarely confusing. Using | for "bitwise or" and also
using it for a "pipe" operator would probably be fine - only one
operation makes sense for the types involved. But here the two
operations - "bitwise or" (or logical or) and "choice" can apply to to
the same types of operands. That's what makes it a very poor choice of syntax.
(For comparison, Algol 68 uses "OR", "∨" or "\/" for the "or" operator, thus it does not have this confusion.)
[...]
I've nothing (much) against the operation - it's the choice of operator
that is wrong.
This was the first part of your example:
const char * flag_to_text_A(bool b) {
    if (b == true) {
        return "It's true!";
    } else if (b == false) {
        return "It's false!";
/I/ would question why you'd want to make the second branch conditional
in the first place.
Write an 'else' there, and the issue doesn't arise.
Because I can't see the point of deliberately writing code that usually
takes two paths, when either:
(1) you know that one will never be taken, or
(2) you're not sure, but don't make any provision in case it is
Fix that first rather relying on compiler writers to take care of your
badly written code.
[...]
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything else),
then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with
input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
On 06/11/2024 07:26, Kaz Kylheku wrote:
On 2024-11-05, Bart <bc@freeuk.com> wrote:
Well, it started off as 2-way select, meaning constructs like this:
x = c ? a : b;
x := (c | a | b)
Where one of two branches is evaluated. I extended the latter to N-way
select:
x := (n | a, b, c, ... | z)
This looks quite error-prone. You have to count carefully that
the cases match the intended values. If an entry is
inserted, all the remaining ones shift to a higher value.
You've basically taken a case construct and auto-generated
the labels starting from 1.
It's a version of Algol68's case construct:
x := CASE n IN a, b, c OUT z ESAC
which also has the same compact form I use. I only use the compact
version because n is usually small, and it is intended to be used within
an expression: print (n | "One", "Two", "Three" | "Other").
This is an actual example (from my first scripting language; not written by
me):
Crd[i].z := (BendAssen |P.x, P.y, P.z)
An out-of-bounds index yields 'void' (via a '| void' part inserted by
the compiler). This is one of my examples from that era:
xt := (messa | 1,1,1, 2,2,2, 3,3,3)
yt := (messa | 3,2,1, 3,2,1, 3,2,1)
Algol68 didn't have 'switch', but I do, as well as a separate
case...esac statement that is more general. Those are better for
multi-line constructs.
As for being error prone because values can get out of step, so is a
function call like this:
f(a, b, c, d, e)
But I also have keyword arguments.
On 04.11.2024 23:25, David Brown wrote:
If you have a function (or construct) that returns a correct value for
inputs 1, 2 and 3, and you never pass it the value 4 (or anything else),
then there is no undefined behaviour no matter what the code looks like
for values other than 1, 2 and 3. If someone calls that function with
input 4, then /their/ code has the error - not the code that doesn't
handle an input 4.
Well, it's a software system design decision whether you want to
make the caller test the preconditions for every function call,
or let the callee take care of unexpected input, or both.
We had always followed the convention to avoid all undefined
situations and always define every 'else' case by some sensible
behavior, at least writing a notice into a log-file, but also
to "fix" the runtime situation to be able to continue operating.
(Note, I was mainly writing server-side software where this was
especially important.)
That's one reason why (as elsethread mentioned) I dislike 'else'
to handle a defined value; I prefer an explicit 'if' and use the
else for reporting unexpected situations (that practically never
appear, or, with the diagnostics QA-evaluated, asymptotically
disappearing).
(For pure binary predicates there's no errors branch, of course.)
Janis
PS: One of my favorite IT-gotchas is the plane crash where the
code specified landing procedure functions for height < 50.0 ft
and for height > 50.0 ft conditions, which mostly worked since
the height got polled only every couple seconds, and the case
height = 50.0 ft happened only very rarely due to the typical
descent characteristics during landing.
On 08.11.2024 23:24, Bart wrote:
On 08/11/2024 17:37, Janis Papanagnou wrote:
BTW, the same symbols can also be used as an alternative form
of the 'case' statement; the semantic distinction is made by
context, e.g. the types involved in the construct.
You mean whether the 'a' in '(a | b... | c)' has type Bool rather than Int?
I've always discriminated on the number of terms between the two |s:
either 1, or more than 1.
I suppose in a [historic] "C" like language it's impossible to
distinguish on type here (given that there was no 'bool' type
[in former times] in "C"). - But I'm not quite sure whether
you're speaking here about your "C"-like language or some other
language you implemented.
if cond then
    s1
    s2
else
    s3
    s4
end
(Looks a lot more like a scripting language without semicolons.)
On 08.11.2024 19:18, David Brown wrote:
On 08/11/2024 18:37, Janis Papanagnou wrote:
The language here is "mathematics". I would not expect anyone who even
considers designing a programming language to be unfamiliar with that
symbol.
Mathematics, unfortunately, [too] often has several symbols for
the same thing. (It's in that respect not very different from
programming languages, where you can [somewhat] rely on + - * /
but beyond that it gets tighter.)
Programming languages have the additional problem that you don't
have all necessary symbols available, so language designers have
to map them onto existing symbols. (Also Unicode in modern times
does not solve that, since languages typically rely on ASCII,
or some 8-bit extension, at most; full Unicode support, I think,
is rare, especially on the lexical language level. Some allow
them in strings, some in identifiers; but in language keywords?)
BTW, in Algol 68 you can define operators, so you can define
"OP V" or "OP ^" (for 'or' and 'and', respectively), but we cannot
define (e.g.) "OP ·" (a middle dot, e.g. for multiplication).[*]
The detail of what symbols are used is
not that important to me, if it fits to the overall language
design.
I am quite happy with the same symbol being used for very different
meanings in different contexts. C's use of "*" for indirection and for
multiplication is rarely confusing. Using | for "bitwise or" and also
using it for a "pipe" operator would probably be fine - only one
operation makes sense for the types involved. But here the two
operations - "bitwise or" (or logical or) and "choice" - can apply to
the same types of operands. That's what makes it a very poor choice of
syntax.
Well, I'm more used (from mathematics) to 'v' and '^' than to '|'
and '&', respectively. But that doesn't prevent me from accepting
other symbols like '|' to have some [mathematical] meaning, or
even different meanings depending on context. In mathematics it's
not different; same symbols are used in different contexts with
different semantics. (And there's also the mentioned problem of
non-coherent literature WRT used mathematics' symbols.)
(For comparison, Algol 68 uses "OR", "∨" or "\/" for the "or" operator,
thus it does not have this confusion.)
Actually, while I like Algol 68's flexibility, there's in some
cases (to my liking) too many variants. This had partly been
necessary, of course, due to the (even more) restricted character
sets (e.g. 6-bit characters) available in the 1960's.
The two options for conditionals I consider very useful, though,
and it also produces very legible and easily understandable code.
[...]
I've nothing (much) against the operation - it's the choice of operator
that is wrong.
Well, on opinions there's nothing more to discuss, I suppose.
Bart wrote:
On 06/11/2024 07:26, Kaz Kylheku wrote:
On 2024-11-05, Bart <bc@freeuk.com> wrote:
[...] I extended the latter to N-way select:
x := (n | a, b, c, ... | z)
This looks quite error-prone. You have to count carefully that
the cases match the intended values. If an entry is
inserted, all the remaining ones shift to a higher value.
You've basically taken a case construct and auto-generated
the labels starting from 1.
It's a version of Algol68's case construct:
x := CASE n IN a, b, c OUT z ESAC
which also has the same compact form I use. I only use the compact
version because n is usually small, and it is intended to be used within
an expression: print (n | "One", "Two", "Three" | "Other").
[...]
An out-of-bounds index yields 'void' (via a '| void' part inserted by
the compiler). This is one of my examples from that era:
xt := (messa | 1,1,1, 2,2,2, 3,3,3)
yt := (messa | 3,2,1, 3,2,1, 3,2,1)
still, the more C-compatible version would look better imo
xt = {1,1,1, 2,2,2, 3,3,3}[messa];
yt = {3,2,1, 3,2,1, 3,2,1}[messa];
[...]
On 09/11/2024 07:54, Janis Papanagnou wrote:
Well, it's a software system design decision whether you want to
make the caller test the preconditions for every function call,
or let the callee take care of unexpected input, or both.
Well, I suppose it is their decision - they can do the right thing, or
the wrong thing, or both.
I believe I explained in previous posts why it is the /caller's/ responsibility to ensure pre-conditions are fulfilled, and why anything
else is simply guaranteeing extra overheads while giving you less
information for checking code correctness. But I realise that could
have been lost in the mass of posts, so I can go through it again if you want.
[...]
(On security boundaries, system call interfaces, etc., where the caller
could be malicious or incompetent in a way that damages something other
than their own program, you have to treat all inputs as dangerous and sanitize them, just like data from external sources. That's a different matter, and not the real focus here.)
We had always followed the convention to avoid all undefined
situations and always define every 'else' case by some sensible
behavior, at least writing a notice into a log-file, but also
to "fix" the runtime situation to be able to continue operating.
(Note, I was mainly writing server-side software where this was
especially important.)
You can't "fix" bugs in the caller code by writing to a log file.
Sometimes you can limit the damage, however.
If you can't trust the people writing the calling code, then that should
be the focus of your development process - find a way to be sure that
the caller code is right. That's where you want your conventions, or to focus code reviews, training, automatic test systems - whatever is appropriate for your team and project. Make sure callers pass correct
data to the function, and the function can do its job properly.
Sometimes it makes sense to specify functions differently, and accept a
wider input. Maybe instead of saying "this function will return the
integer square root of numbers between 0 and 10", you say "this function
will return the integer square root if given a number between 0 and 10,
and will log a message and return -1 for other int values". Fair enough
- now you've got a new function where it is very easy for the caller to ensure the preconditions are satisfied. But be very aware of the costs
- you have now destroyed the "purity" of the function, and lost the key mathematical relation between the input and output. (You have also made everything much less efficient.)
[...]
On 09/11/2024 05:51, Janis Papanagnou wrote:
[...]
Sure, I appreciate all this. We must do the best we can - I am simply
saying that using | for this operation is far from the best choice.
Well, I'm more used (from mathematics) to 'v' and '^' than to '|'
and '&', respectively. But that doesn't prevent me from accepting
other symbols like '|' to have some [mathematical] meaning, or
even different meanings depending on context. In mathematics it's
not different; same symbols are used in different contexts with
different semantics. (And there's also the mentioned problem of
non-coherent literature WRT used mathematics' symbols.)
We are - unfortunately, perhaps - constrained by common keyboards and
ASCII (for the most part). "v" and "^" are poor choices for "or" and
"and" - "∨" and "∧" would be much nicer, but are hard to type.
For
better or worse, the programming world has settled on "|" and "&" as practical alternatives.
("+" and "." are often used in boolean logic,
and can be typed on normal keyboards, but would quickly be confused with other uses of those symbols.)
[...]
Well, on opinions there's nothing more to discuss, I suppose.
Opinions can be justified, and that discussion can be interesting.
Purely subjective opinion is less interesting.
On 05/11/2024 19:53, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 05/11/2024 12:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
OK.
The whole construct may or may not return a value. If it does, then one of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
What's easier to implement in a language: to have a conditional need for an 'else' branch, which is dependent on the compiler performing some
arbitrarily complex levels of analysis on some arbitrarily complex set
of expressions...
...or to just always require 'else', with a dummy value if necessary?
Well, frequently it is easier to do bad job, than a good one.
I assume that you consider the simple solution the 'bad' one?
I'd consider a more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how complex the input.
With the simple solution, the worst that can happen is that you have to write a dummy 'else' branch, perhaps with a dummy zero value.
If control never reaches that point, it will never be executed (at
worst, it may need to skip an instruction).
But if the compiler is clever enough (optionally clever, it is not a requirement!), then it could eliminate that code.
A bonus is that when debugging, you can comment out all or part of the previous lines, but the 'else' now catches those untested cases.
normally you do not need very complex analysis:
I don't want to do any analysis at all! I just want a mechanical
translation as effortlessly as possible.
I don't like unbalanced code within a function because it's wrong and
can cause problems.
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Then we disagree on what 'multi-way' select might mean. I think it means branching, even if notionally, on one-of-N possible code paths.
OK.
I appreciate this is what Bart means by that phrase, but I don't agree
with it. I'm not sure if that is covered by "OK" or not!
You may prefer your own definition, but Bart's is a reasonable one.
The only argument I can make here is that I have not seen "multi-way
select" as a defined phrase with a particular established meaning.
The whole construct may or may not return a value. If it does, then one of the N paths must be a default path.
You need to cover all input values. This is possible when there
is a reasonably small number of possibilities. For example, a switch on
a char variable which covers all possible values does not need a default
path. A default is needed only when the number of possibilities is too
large to explicitly give all of them. And some languages allow
ranges, so that you may be able to cover all values with a small
number of ranges.
I think this is all very dependent on what you mean by "all input values".
Supposing I declare this function:
// Return the integer square root of numbers between 0 and 10
int small_int_sqrt(int x);
To me, the range of "all input values" is integers from 0 to 10. I
could implement it as :
int small_int_sqrt(int x) {
    if (x == 0) return 0;
    if (x < 4) return 1;
    if (x < 9) return 2;
    if (x < 16) return 3;
    unreachable();
}
If the user asks for small_int_sqrt(-10) or small_int_sqrt(20), that's
/their/ fault and /their/ problem. I said nothing about what would
happen in those cases.
But some people seem to feel that "all input values" means every
possible value of the input types, and thus that a function like this
should return a value even when there is no correct value in and no
correct value out.
Well, some languages treat types more seriously than C. In Pascal
the type of your input would be 0..10 and all input values would be
handled. Sure, when the domain is too complicated to express in a type
then it could be a documented restriction. Still, it makes sense to
signal an error if a value goes outside the handled range, so in a sense all
values of the input type are handled: either you get a valid answer or
a clear error.
No, it does not make sense to do that. Just because the C language does
not currently (maybe once C++ gets contracts, C will copy them) have a
way to specify input sets other than by types, does not mean that
functions in C always have a domain matching all possible combinations
of bits in the underlying representation of the parameter's types.
It might be a useful fault-finding aid temporarily to add error messages
for inputs that are invalid but can physically be squeezed into the parameters. That won't stop people making incorrect declarations of the function and passing completely different parameter types to it, or
finding other ways to break the requirements of the function.
And in general there is no way to check the validity of the inputs - you usually have no choice but to trust the caller. It's only in simple
cases, like the example above, that it would be feasible at all.
There are, of course, situations where the person calling the function
is likely to be incompetent, malicious, or both, and where there can be serious consequences for what you might prefer to consider as invalid
input values.
You have that for things like OS system calls - it's no
different than dealing with user inputs or data from external sources.
But you handle that by extending the function - increase the range of
valid inputs and appropriate outputs. You no longer have a function
that takes a number between 0 and 10 and returns the integer square root
- you now have a function that takes a number between -(2^31) and
(2^31 - 1) and returns the integer square root if the input is in the
range 0 to 10 or halts the program with an error message for other
inputs in the wider range. It's a different function, with a wider set
of inputs - and again, it is specified to give particular results for particular inputs.
I certainly would
be quite unhappy with the code above. It is possible that I would still
use it as a compromise (say if it was desirable to have single
prototype but handle points in spaces of various dimensions),
but my first attempt would be something like:
typedef struct {int p[2];} two_int;
....
I think you'd quickly find that limiting and awkward in C (but it might
be appropriate in other languages).
But don't misunderstand me - I am
all in favour of finding ways in code that make input requirements
clearer or enforceable within the language - never put anything in
comments if you can do it in code. You could reasonably do this in C
for the first example :
// Do not use this directly
extern int small_int_sqrt_implementation(int x);
// Return the integer square root of numbers between 0 and 10
static inline int small_int_sqrt(int x) {
    assert(x >= 0 && x <= 10);
    return small_int_sqrt_implementation(x);
}
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a
short-cut for conveniently handling a wide range of valid input values - it is never a tool for handling /invalid/ input values.
Well, default can signal error which frequently is right handling
of invalid input values.
Will that somehow fix the bug in the code that calls the function?
It can be a useful debugging and testing aid, certainly, but it does not make the code "correct" or "safe" in any sense.
On 09/11/2024 03:57, Janis Papanagnou wrote:
[...] - But I'm not quite sure whether
you're speaking here about your "C"-like language or some other
language you implemented.
I currently have three HLL implementations:
* For my C subset language (originally I had some enhancements, now
dropped)
* For my 'M' systems language inspired by A68 syntax
* For my 'Q' scripting language, with the same syntax, more or less
The remark was about those last two.
if cond then
    s1
    s2
else
    s3
    s4
end
(Looks a lot more like a scripting language without semicolons.)
This is what I've long suspected: that people associate clear, pseudo-code-like syntax with scripting languages.
[...]
On 09.11.2024 12:06, David Brown wrote:
On 09/11/2024 07:54, Janis Papanagnou wrote:
Well, it's a software system design decision whether you want to
make the caller test the preconditions for every function call,
or let the callee take care of unexpected input, or both.
Well, I suppose it is their decision - they can do the right thing, or
the wrong thing, or both.
I believe I explained in previous posts why it is the /caller's/
responsibility to ensure pre-conditions are fulfilled, and why anything
else is simply guaranteeing extra overheads while giving you less
information for checking code correctness. But I realise that could
have been lost in the mass of posts, so I can go through it again if you
want.
I haven't read all the posts, or rather, I just skipped most posts;
it's too time consuming.
Since you explicitly elaborated - thanks! - I will read this one...
[...]
(On security boundaries, system call interfaces, etc., where the caller
could be malicious or incompetent in a way that damages something other
than their own program, you have to treat all inputs as dangerous and
sanitize them, just like data from external sources. That's a different
matter, and not the real focus here.)
We had always followed the convention to avoid all undefined
situations and always define every 'else' case by some sensible
behavior, at least writing a notice into a log-file, but also
to "fix" the runtime situation to be able to continue operating.
(Note, I was mainly writing server-side software where this was
especially important.)
You can't "fix" bugs in the caller code by writing to a log file.
Sometimes you can limit the damage, however.
I spoke more generally of fixing situations (not only bugs).
If you can't trust the people writing the calling code, then that should
be the focus of your development process - find a way to be sure that
the caller code is right. That's where you want your conventions, or to
focus code reviews, training, automatic test systems - whatever is
appropriate for your team and project. Make sure callers pass correct
data to the function, and the function can do its job properly.
Yes.
Sometimes it makes sense to specify functions differently, and accept a
wider input. Maybe instead of saying "this function will return the
integer square root of numbers between 0 and 10", you say "this function
will return the integer square root if given a number between 0 and 10,
and will log a message and return -1 for other int values". Fair enough
- now you've got a new function where it is very easy for the caller to
ensure the preconditions are satisfied. But be very aware of the costs
- you have now destroyed the "purity" of the function, and lost the key
mathematical relation between the input and output. (You have also made
everything much less efficient.)
I disagree with the "much less" generalization. I also think that when
weighing performance versus safety my preferences might be different;
I'm only speaking about a "rule of thumb", not about the actual (IMO) necessity(!) to make these decisions depending on the project context.
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It might be a useful fault-finding aid temporarily to add error messages
for inputs that are invalid but can physically be squeezed into the
parameters. That won't stop people making incorrect declarations of the
function and passing completely different parameter types to it, or
finding other ways to break the requirements of the function.
And in general there is no way to check the validity of the inputs - you
usually have no choice but to trust the caller. It's only in simple
cases, like the example above, that it would be feasible at all.
There are, of course, situations where the person calling the function
is likely to be incompetent, malicious, or both, and where there can be
serious consequences for what you might prefer to consider as invalid
input values.
You apparently exclude the possibility of competent persons making a
mistake. AFAIK industry statistics show that code developed by
good developers using a rigorous process still contains a substantial
number of bugs. So, it makes sense to have as much as possible
verified mechanically. Which in common practice means depending on
type checks. In less common practice you may have some theorem
proving framework checking assertions about input arguments,
then the assertions take role of types.
But don't misunderstand me - I am
all in favour of finding ways in code that make input requirements
clearer or enforceable within the language - never put anything in
comments if you can do it in code. You could reasonably do this in C
for the first example :
// Do not use this directly
extern int small_int_sqrt_implementation(int x);
// Return the integer square root of numbers between 0 and 10
static inline int small_int_sqrt(int x) {
    assert(x >= 0 && x <= 10);
    return small_int_sqrt_implementation(x);
}
Hmm, why extern implementation and static wrapper? I would do
the opposite.
A function should accept all input values - once you have made clear
what the acceptable input values can be. A "default" case is just a
short-cut for conveniently handling a wide range of valid input values - it is never a tool for handling /invalid/ input values.
Well, default can signal error which frequently is right handling
of invalid input values.
Will that somehow fix the bug in the code that calls the function?
It can be a useful debugging and testing aid, certainly, but it does not
make the code "correct" or "safe" in any sense.
There is a concept of "partial correctness": if the code finishes, it returns
a correct value. A variation of this is: if the code finishes without
signaling an error, it returns correct values. Such a condition may be
much easier to verify than "full correctness" and in many cases
is almost as useful. In particular, mathematicians are _very_
unhappy when a program returns incorrect results. But they are used
to programs which can not deliver results, either because of
lack of resources or because the needed case was not implemented.
When dealing with math formulas there are frequently various
restrictions on parameters, like we can only divide by a nonzero
quantity. By signaling an error when the restrictions are not
satisfied we ensure that successful completion means that the
restrictions were satisfied. Of course that alone does not
mean that the result is correct, but correctness of the "general"
case is usually _much_ easier to ensure. In other words,
failing restrictions are a major source of errors, and signaling
errors effectively eliminates it.
In a world of perfect programmers, they would check restrictions
before calling any function depending on them, or prove that
restrictions on arguments to a function imply correctness of
calls made by the function. But the world is imperfect and in the
real world extra runtime checks are quite useful.
On 10/11/2024 07:57, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 20:39, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 05/11/2024 13:42, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Type checks can be extremely helpful, and strong typing greatly reduces
the errors in released code by catching them early (at compile time).
And temporary run-time checks are also helpful during development or debugging.
But extra run-time checks are costly (and I don't mean just in run-time performance, which is only an issue in a minority of situations). They
mean more code - which means more scope for errors, and more code that
must be checked and maintained. Usually this code can't be tested well
in final products - precisely because it is there to handle a situation
that never occurs.
A function should accept all input values - once you have made clear what the acceptable input values can be. A "default" case is just a short-cut for conveniently handling a wide range of valid input values - it is never a tool for handling /invalid/ input values.
Well, default can signal error which frequently is right handling
of invalid input values.
Will that somehow fix the bug in the code that calls the function?
It can be a useful debugging and testing aid, certainly, but it does not make the code "correct" or "safe" in any sense.
There is a concept of "partial correctness": if the code finishes, it returns
a correct value. A variation of this is: if the code finishes without
signaling an error, it returns correct values. Such a condition may be
much easier to verify than "full correctness" and in many cases
is almost as useful. In particular, mathematicians are _very_
unhappy when a program returns incorrect results. But they are used
to programs which can not deliver results, either because of
lack of resources or because the needed case was not implemented.
When dealing with math formulas there are frequently various
restrictions on parameters, like we can only divide by a nonzero
quantity. By signaling an error when the restrictions are not
satisfied we ensure that successful completion means that the
restrictions were satisfied. Of course that alone does not
mean that the result is correct, but correctness of the "general"
case is usually _much_ easier to ensure. In other words,
failing restrictions are a major source of errors, and signaling
errors effectively eliminates it.
Yes, out-of-band signalling in some way is a useful way to indicate a problem, and can allow parameter checking without losing the useful
results of a function. This is the principle behind exceptions in many languages - then functions either return normally with correct results,
or you have a clearly abnormal situation.
In a world of perfect programmers, they would check restrictions
before calling any function depending on them, or prove that
restrictions on arguments to a function imply correctness of
calls made by the function. But the world is imperfect and in the
real world extra runtime checks are quite useful.
Runtime checks in a function can be useful if you know the calling code might not be perfect and the function is going to take responsibility
for identifying that situation. Programmers will often be writing both
the caller and callee code, and put temporary debugging and test checks wherever it is most convenient.
But I think being too enthusiastic about putting checks in the wrong
place - the callee function - can hide the real problems, or make the
callee code writer less careful about getting their part of the code correct.
Bart <bc@freeuk.com> wrote:
I assume that you consider the simple solution the 'bad' one?
You wrote about _always_ requiring 'else' regardless of whether it is
needed or not. Yes, I consider this bad.
I'd consider a more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally, time spent _using_ a compiler should be bigger than time
spent writing the compiler. If a compiler gets enough use, that
justifies some complexity.
I am mainly concerned with clarity and correctness of source code.
Dummy 'else' doing something may hide errors.
Dummy 'else' signaling an
error means that something which could be a compile-time error is
only detected at runtime.
A compiler that detects most errors of this sort is IMO better than a
compiler which makes no effort to detect them. And clearly, once the
problem is formulated in a sufficiently general way, it becomes
unsolvable. So I do not expect a general solution, but I do expect
reasonable effort.
normally you do not need very complex analysis:
I don't want to do any analysis at all! I just want a mechanical
translation as effortlessly as possible.
I don't like unbalanced code within a function because it's wrong and
can cause problems.
Well, I demand more from compiler than you do...
David Brown <david.brown@hesbynett.no> wrote:
Runtime checks in a function can be useful if you know the calling code
might not be perfect and the function is going to take responsibility
for identifying that situation. Programmers will often be writing both
the caller and callee code, and put temporary debugging and test checks
wherever it is most convenient.
But I think being too enthusiastic about putting checks in the wrong
place - the callee function - can hide the real problems, or make the
callee code writer less careful about getting their part of the code
correct.
IME it is the opposite: not having checks in the called function simply delays
the moment when an error is detected. Getting errors early helps focus on
tricky problems or misconceptions. And it motivates programmers to
be more careful.
Concerning the correct place for checks: one could argue that a check
should be close to the place where the result of the check matters, which
frequently is in the called function.
And frequently a check requires
computation that is done by the called function as part of normal
processing, but would be extra code in the caller.
On 11/11/2024 20:09, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
Concerning correct place for checks: one could argue that check
should be close to place where the result of check matters, which
frequently is in called function.
No, there I disagree. The correct place for the checks should be close
to where the error is, and that is in the /calling/ code. If the called function is correctly written, reviewed, tested, documented and
considered "finished", why would it be appropriate to add extra code to
that in order to test and debug some completely different part of the code?
The place where the result of the check /really/ matters, is the calling code. And that is also the place where you can most easily find the
error, since the error is in the calling code, not the called function.
And it is most likely to be the code that you are working on at the time
- the called function is already written and tested.
And frequently check requires
computation that is done by called function as part of normal
processing, but would be extra code in the caller.
It is more likely to be the opposite in practice.
And for much of the time, the called function has no real practical way
to check the parameters anyway. A function that takes a pointer
parameter - not an uncommon situation - generally has no way to check
the validity of the pointer. It can't check that the pointer actually points to useful source data or an appropriate place to store data.
All it can do is check for a null pointer, which is usually a fairly
useless thing to do (unless the specifications for the function make the pointer optional). After all, on most (but not all) systems you already have a "free" null pointer check - if the caller code has screwed up and passed a null pointer when it should not have done, the program will
quickly crash when the pointer is used for access. Many compilers
provide a way to annotate function declarations to say that a pointer
must not be null, and can then spot at least some such errors at compile time. And of course the calling code will very often be passing the
address of an object in the call - since that can't be null, a check in
the function is pointless.
Once you get to more complex data structures, the possibility for the
caller to check the parameters gets steadily less realistic.
So now your practice of having functions "always" check their parameters leaves the people writing calling code with a false sense of security - usually you /don't/ check the parameters, you only ever do the simple checks that the caller could (and should!) do if they were realistic. You've
got the maintenance and cognitive overload of extra source code for your various "asserts" and other checks, regardless of any run-time costs
(which are often irrelevant, but occasionally very important).
You will note that much of this - for both sides of the argument - uses words like "often", "generally" or "frequently". It is important to appreciate that programming spans a very wide range of situations, and I don't want to be too categorical about things. I have already said
there are situations when parameter checking in called functions can
make sense. I've no doubt that for some people and some types of
coding, such cases are a lot more common than what I see in my coding.
Note also that when you can use tools to automate checks, such as
"sanitize" options in compilers or different languages that have more in-built checks, the balance differs. You will generally pay a run-time cost for those checks, but you don't have the same kind of source-level costs - your code is still clean, clear, and amenable to correctness checking, without hiding the functionality of the code in a mass of unnecessary explicit checks. This is particularly good for debugging,
and the run-time costs might not be important. (But if run-time costs
are not important, there's a good chance that C is not the best language
to be using in the first place.)
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
Well, it's a blue moon when someone nails it. Most of them fall
for my little gotcha hook, line, and sinker.
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
David Brown <david.brown@hesbynett.no> wrote:
On 11/11/2024 20:09, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
Concerning correct place for checks: one could argue that check
should be close to place where the result of check matters, which
frequently is in called function.
No, there I disagree. The correct place for the checks should be close
to where the error is, and that is in the /calling/ code. If the called
function is correctly written, reviewed, tested, documented and
considered "finished", why would it be appropriate to add extra code to
that in order to test and debug some completely different part of the code?
The place where the result of the check /really/ matters, is the calling
code. And that is also the place where you can most easily find the
error, since the error is in the calling code, not the called function.
And it is most likely to be the code that you are working on at the time
- the called function is already written and tested.
And frequently check requires
computation that is done by called function as part of normal
processing, but would be extra code in the caller.
It is more likely to be the opposite in practice.
And for much of the time, the called function has no real practical way
to check the parameters anyway. A function that takes a pointer
parameter - not an uncommon situation - generally has no way to check
the validity of the pointer. It can't check that the pointer actually
points to useful source data or an appropriate place to store data.
All it can do is check for a null pointer, which is usually a fairly
useless thing to do (unless the specifications for the function make the
pointer optional). After all, on most (but not all) systems you already
have a "free" null pointer check - if the caller code has screwed up and
passed a null pointer when it should not have done, the program will
quickly crash when the pointer is used for access. Many compilers
provide a way to annotate function declarations to say that a pointer
must not be null, and can then spot at least some such errors at compile
time. And of course the calling code will very often be passing the
address of an object in the call - since that can't be null, a check in
the function is pointless.
Well, in a sense pointers are easy: if you do not play nasty tricks
with casts, then type checks do a significant part of the checking. Of
course, a pointer may be uninitialized (but compiler warnings help a lot
here), memory may be overwritten, etc. But overwritten memory is
rather special: if you checked that the content of memory is correct,
but it is overwritten after the check, then the earlier check does not
help. Anyway, the main point is ensuring that the pointed-to data
satisfies the expected conditions.
Once you get to more complex data structures, the possibility for the
caller to check the parameters gets steadily less realistic.
So now your practice of having functions "always" check their parameters
leaves the people writing calling code with a false sense of security -
usually you /don't/ check the parameters, you only ever do the simple checks
that the caller could (and should!) do if they were realistic. You've
got the maintenance and cognitive overload of extra source code for your
various "asserts" and other checks, regardless of any run-time costs
(which are often irrelevant, but occasionally very important).
You will note that much of this - for both sides of the argument - uses
words like "often", "generally" or "frequently". It is important to
appreciate that programming spans a very wide range of situations, and I
don't want to be too categorical about things. I have already said
there are situations when parameter checking in called functions can
make sense. I've no doubt that for some people and some types of
coding, such cases are a lot more common than what I see in my coding.
Note also that when you can use tools to automate checks, such as
"sanitize" options in compilers or different languages that have more
in-built checks, the balance differs. You will generally pay a run-time
cost for those checks, but you don't have the same kind of source-level
costs - your code is still clean, clear, and amenable to correctness
checking, without hiding the functionality of the code in a mass of
unnecessary explicit checks. This is particularly good for debugging,
and the run-time costs might not be important. (But if run-time costs
are not important, there's a good chance that C is not the best language
to be using in the first place.)
Our experience differs. As a silly example, consider a parser
which produces a parse tree. The caller is supposed to pass a syntactically
correct string as an argument. However, checking syntactic correctness
requires almost the same effort as producing the parse tree, so it is
usual that the parser both checks correctness and produces the result.
I have computations that are quite different from parsing but
in some cases share the same characteristic: checking correctness of
the arguments requires complex computation similar to producing the
actual result. More frequently, the called routine can check various
invariants which with high probability can detect errors. Doing
the same check in the caller is impractical.
Most of my coding is in languages other than C. One of the languages
that I use essentially forces the programmer to insert checks in
some places. For example, unions are tagged, and one can use a
specific variant only after checking that it is the current
variant. Similarly, fall-through control structures may lead
to a type error at compile time. But signalling an error is considered
type safe, so code which checks for an unhandled case and signals an
error is accepted as type correct. Unhandled cases frequently
lead to type errors. There is some overhead, but IMO it is acceptable.
The language in question is garbage collected, so many memory-related
problems go away.
Frequently checks come as a natural byproduct of computations. When
handling tree-like structures in C, IME the simplest code is usually
recursive, with the base case being the null pointer. When the base
case should not occur, we get a check instead of a computation.
Skipping such checks also puts a cognitive load on the reader:
the normal pattern has a corresponding case, so the reader does not know
whether the case was omitted by accident or cannot occur. A comment
may clarify this, but an error check is equally clear.
On 16/11/2024 09:42, Stefan Ram wrote:
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
FGS please turn the 'hip lingo' generator down a few notches!
On Sat, 16 Nov 2024 09:42:49 +0000, Stefan Ram wrote:
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
Well, it's a blue moon when someone nails it. Most of them fall
for my little gotcha hook, line, and sinker.
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
If I read your code correctly, you have actually included not one,
but TWO curveballs. Well done!
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider the much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally, time spent _using_ a compiler should be bigger than time
spent writing the compiler. If a compiler gets enough use, that
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
The sort of analysis you're implying I don't think belongs in the kind
of compiler I prefer. Even if it did, it would be later on in the
process than the point where the above restriction is checked, so
wouldn't exist in one of my compilers anyway.
I don't like open-ended tasks like this where compilation time could end
up being anything. If you need to keep recompiling the same module, then
you don't want to repeat that work each time.
I am mainly concerned with clarity and correctness of source code.
So am I. I try to keep my syntax clean and uncluttered.
Dummy 'else' doing something may hide errors.
So can 'unreachable'.
Dummy 'else' signaling an
error means that something which could be a compile-time error is
only detected at runtime.
A compiler that detects most errors of this sort is IMO better than a
compiler which makes no effort to detect them. And clearly, once the
problem is formulated in a sufficiently general way, it becomes
unsolvable. So I do not expect a general solution, but I do expect
reasonable effort.
So how would David Brown's example work:
int F(int n) {
if (n==1) return 10;
if (n==2) return 20;
}
/You/ know that values -2**31 to 0 and 3 to 2**31-1 are impossible; the
compiler doesn't. It's likely to tell you that you may run off the end
of the function.
So what do you want the compiler to do here? If I try it:
func F(int n)int =
if n=1 then return 10 fi
if n=2 then return 20 fi
end
It says 'else needed' (in that last statement). I can also shut it up
like this:
func F(int n)int = # int is i64 here
if n=1 then return 10 fi
if n=2 then return 20 fi
0
end
Since now that last statement is the '0' value (any int value will do).
What should my compiler report instead? What analysis should it be
doing? What would that save me from typing?
normally you do not need very complex analysis:
I don't want to do any analysis at all! I just want a mechanical
translation as effortlessly as possible.
I don't like unbalanced code within a function because it's wrong and
can cause problems.
Well, I demand more from compiler than you do...
Perhaps you're happy for it to be bigger and slower too. Most of my
projects build more or less instantly. Here 'ms' is a version that runs programs directly from source (the first 'ms' is 'ms.exe' and subsequent ones are 'ms.m' the lead module):
c:\bx>ms ms ms ms ms ms ms ms ms ms ms ms ms ms ms ms hello
Hello World! 21:00:45
This builds and runs 15 successive generations of itself in memory
before building and running hello.m; it took 1 second in all. (Now try
that with gcc!)
Here:
c:\cx>tm \bx\mm -runp cc sql
Compiling cc.m to <pcl>
Compiling sql.c to sql.exe
This compiles my C compiler from source but then it /interprets/ the IR produced. This interpreted compiler took 6 seconds to build the 250Kloc
test file, and it's a very slow interpreter (it's used for testing and debugging).
(gcc -O0 took a bit longer to build sql.c! About 7 seconds but it is
using a heftier windows.h.)
If I run the C compiler from source as native code (\bx\ms cc sql) then building the compiler *and* sql.c takes 1/3 of a second.
You can't do this stuff with the compilers David Brown uses; I'm
guessing you can't do it with your prefered ones either.
[...]
My preferences are very much weighted towards correctness, not
efficiency. That includes /knowing/ that things are correct, not just passing some tests. [...]
I wonder what happened to Stefan. He used to make perfectly good posts.
Then he disappeared for a bit, and came back with this new "style".
Given that this "new" Stefan can write posts with interesting C content,
such as this one, and has retained his ugly coding layout and
non-standard Usenet format, I have to assume it's still the same person behind the posts.
On 10.11.2024 16:13, David Brown wrote:
[...]
My preferences are very much weighted towards correctness, not
efficiency. That includes /knowing/ that things are correct, not just
passing some tests. [...]
I agree with you. But given what you write, I'm also sure you know
what's achievable in theory, what's an avid wish, and what's really
possible.
Yet there are also projects that don't seem to care, where
speedy delivery is the primary goal. Guaranteeing formal correctness
was never an issue in the industry contexts I worked in, and I
was always glad when I had a good test environment, with good test
coverage and continuous refinement of tests. Informal documentation,
factual checks of the arguments, and actual tests were what kept the
quality of our project deliveries at a high level.
On 16.11.2024 17:38, David Brown wrote:
I wonder what happened to Stefan. He used to make perfectly good posts.
Then he disappeared for a bit, and came back with this new "style".
Given that this "new" Stefan can write posts with interesting C content,
such as this one, and has retained his ugly coding layout and
non-standard Usenet format, I have to assume it's still the same person
behind the posts.
Sorry that I cannot resist asking what you consider "non-standard
Usenet format", given that your posts don't consider line length.
(Did the "standards" change during the past three decades maybe?
Do we use only those parts of the "standards" that we like and
ignore others? Or does it boil down to Netiquette is no standard?)
Janis, just curious and no offense intended :-)
On 16.11.2024 17:38, David Brown wrote:
I wonder what happened to Stefan. He used to make perfectly good
posts. Then he disappeared for a bit, and came back with this new
"style".
Given that this "new" Stefan can write posts with interesting C
content, such as this one, and has retained his ugly coding layout
and non-standard Usenet format, I have to assume it's still the
same person behind the posts.
Sorry that I cannot resist asking what you consider "non-standard
Usenet format", given that your posts don't consider line length.
(Did the "standards" change during the past three decades maybe?
Do we use only those parts of the "standards" that we like and
ignore others? Or does it boil down to Netiquette is no standard?)
There are a great variety of projects, [...]
Of course testing is important, at many levels. But the time to test
your code is when you are confident that it is correct - testing is not
an alternative to writing code that is as clearly correct as you are
able to make it.
On 11/16/24 04:42, Stefan Ram wrote:
...
[...]
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
Nice. It did take a little while for me to figure out what was wrong,
but since I knew that something was wrong, I did eventually find it -
without first running the program.
Bart <bc@freeuk.com> wrote:
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider the much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally, time spent _using_ a compiler should be bigger than time
spent writing the compiler. If a compiler gets enough use, that
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
More generally, I want to minimize the time spent by the programmer,
that is, the _sum over all iterations leading to a correct program_ of
compile time and "think time". A compiler that compiles slower
but allows fewer iterations due to better diagnostics may win.
Also, humans perceive a 0.1s delay almost like no delay at all.
So it does not matter if a single compilation step is 0.1s or
0.1ms. Modern computers can do a lot of work in 0.1s.
Yes. This may lead to some complexity. The simple approach is to
avoid obviously useless recompilation ('make' is doing this).
A more complicated approach may keep some intermediate data and
try to "validate" it first. If a previous analysis is still valid,
then it can be reused. If something significant changes, then
it needs to be re-done. But many changes have only a very local
effect, so at least theoretically re-using analyses could
save substantial time.
Since now that last statement is the '0' value (any int value wil do).
What should my compiler report instead? What analysis should it be
doing? What would that save me from typing?
Currently, in the typed language that I use, a literal translation of
the example hits a hole in the checks, that is, the code is accepted.
Concerning the needed analyses: one thing needed is a representation of
the type, either a Pascal range type or an enumeration type (the example
is _very_ unnatural, because in modern programming magic numbers
are avoided and there would be some symbolic representation
adding meaning to the numbers). Second, the compiler must recognize
that this is a "multiway switch" and collect the conditions.
Once
you have such a representation (which may be desirable for other
reasons) it is easy to determine the set of handled values. More
precisely, in this example we just have a small number of discrete
values; a more ambitious compiler may have a list of ranges.
If the type also specifies a list of values or a list of ranges, then
it is easy to check whether all values of the type are handled.
You can't do this stuff with the compilers David Brown uses; I'm
guessing you can't do it with your prefered ones either.
To recompile the typed system I use (about 0.4M lines) on a new fast
machine I need about 53s. But that is kind of cheating:
- this time is for a parallel build using 20 logical cores
- the compiler is not in the language it compiles (but in an untyped
version of it)
- actual compilation of the compiler is a small part of total
compile time
On a slow machine compile time can be as large as 40 minutes.
An untyped system that I use has about 0.5M lines and recompiles
itself in 16s on the same machine. This one uses a single core.
On a slow machine compile time may be closer to 2 minutes.
Again, compiler compile time is only a part of build time.
Actually, one time-intensive part is creating the index for the included
documentation.
Another is C compilation for a library file
(the system has image-processing functions and the low-level part of
image processing is done in C). Recompilation starts from a
minimal version of the system; rebuilding this minimal
version takes 3.3s.
Anyway, I do not need the cascaded recompilation that you present.
Both systems above have incremental compilation, the second one
at statement/function level: it offers an interactive prompt
which takes a statement from the user, compiles it and immediately
executes it. Such a statement may define a function or perform compilation.
Even on a _very_ slow machine there is no noticeable delay due to
compilation, unless you feed the system some oversized statement
or function (presumably from a file).
On 19/11/2024 01:53, Waldek Hebisch wrote:
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
That's not the complexity I had in mind. The 100-200MB sizes of
LLVM-based compilers are not because they use hash-tables over linear
search.
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
Bart <bc@freeuk.com> writes:
On 19/11/2024 01:53, Waldek Hebisch wrote:
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
That's not the complexity I had in mind. The 100-200MB sizes of
LLVM-based compilers are not because they use hash-tables over linear
search.
You still have this irrational obsession with the amount of disk
space consumed by a compiler suite - one that is useful to a massive
number of developers (esp. compared with the user-base of your
compiler).
The amount of disk space consumed by a compilation suite is
a meaningless statistic. 10MByte disks are a relic of the
distant past.
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
And Tiny C is useless for the majority of real-world applications.
How many people are using your compiler to build production applications?
On 19.11.2024 09:19, David Brown wrote:
[...]
There are a great variety of projects, [...]
I don't want the theme to get out of hand, so just one amendment to...
Of course testing is important, at many levels. But the time to test
your code is when you are confident that it is correct - testing is not
an alternative to writing code that is as clearly correct as you are
able to make it.
Sounds like early-days practice, where code is written, "defined" at
some point as "correct", and then tests are written (sometimes
by the same folks who implemented the code) to prove that the code
does what is expected, or the tests have been spared because it was
"clear" that the code is "correct" (sort of).
Since the 1990s we've had other principles, yes, "on many levels"
(as you started your paragraph). At all levels there's some sort of
specification (or description) that defines the expected outcome
and behavior; tests [at levels higher than unit-tests] are written
if not in parallel then usually by separate groups. The decoupling
is important, the "first implement, then test" serializing certainly
not.
Of course every responsible programmer tries to create correct code,
supported by their own experience and by the project's regulatory means. But
that doesn't guarantee correct code. Neither do tests guarantee that.
But tests have been, IME, more effective in supporting correctness
than being "confident that it is correct" (as you say).
On 19/11/2024 01:53, Waldek Hebisch wrote:
Another example, building 40Kloc interpreter from source then running it
in memory:
c:\qx>tm \bx\mm -run qq hello
Compiling qq.m to memory
Hello, World! 19-Nov-2024 15:38:47
TM: 0.11
c:\qx>tm qq hello
Hello, World! 19-Nov-2024 15:38:49
TM: 0.05
The second version runs a precompiled EXE. So building from source added only 90ms.
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider a much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the
two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally time spent _using_ a compiler should be bigger than time
spent writing the compiler. If the compiler gets enough use, it
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
On 19/11/2024 01:53, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
On 10/11/2024 06:00, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I would consider a much more elaborate one, putting the onus on external
tools and still having an unpredictable result, to be the poorer of the two.
You want to create a language that is easily compilable, no matter how
complex the input.
Normally time spent _using_ a compiler should be bigger than time
spent writing the compiler. If the compiler gets enough use, it
justifies some complexity.
That doesn't add up: the more the compiler gets used, the slower it
should get?!
More complicated does not mean slower. Binary search or hash tables
are more complicated than linear search, but for larger data may
be much faster.
That's not the complexity I had in mind. The 100-200MB sizes of
LLVM-based compilers are not because they use hash-tables over linear search.
More generaly, I want to minimize time spent by the programmer,
that is _sum over all iterations leading to correct program_ of
compile time and "think time". Compiler that compiles slower,
but allows less iterations due to better diagnostics may win.
Also, humans perceive 0.1s delay almost like no delay at all.
So it does not matter if single compilation step is 0.1s or
0.1ms. Modern computers can do a lot of work in 0.1s.
What's the context of this 0.1 seconds? Do you consider it long or short?
My tools can generally build my apps from scratch in 0.1 seconds; big compilers tend to take a lot longer. Only Tiny C is in that ballpark.
So I'm failing to see your point here. Maybe you picked up that 0.1
seconds from an earlier post of mine and are suggesting I ought to be
able to do a lot more analysis within that time?
Yes. This may lead to some complexity. Simple approach is to
avoid obviously useless recompilation ('make' is doing this).
More complicated approach may keep some intermediate data and
try to "validate" them first. If previous analysis is valid,
then it can be reused. If something significant changes, than
it needs to be re-done. But many changes only have very local
effect, so at least theoretically re-using analyses could
save substantial time.
I consider compilation: turning textual source code into a form that can
be run, typically binary native code, to be a completely routine task
that should be as simple and as quick as flicking a light switch.
While anything else that might be a deep analysis of that program I
consider to be a quite different task. I'm not saying there is no place
for it, but I don't agree it should be integrated into every compiler
and always invoked.
Right now that last statement is the '0' value (any int value will do).
What should my compiler report instead? What analysis should it be
doing? What would that save me from typing?
Currently, in the typed language that I use, a literal translation of
the example hits a hole in the checks; that is, the code is accepted.
Concerning the needed analyses: one thing needed is a representation of
the type, either a Pascal range type or an enumeration type (the example
is _very_ unnatural, because in modern programming magic numbers
are avoided and there would be some symbolic representation
adding meaning to the numbers). Second, the compiler must recognize
that this is a "multiway switch" and collect the conditions.
The example came from C. Even if written as a switch, C switches do not return values (and also are hard to even analyse as to which branch is which).
In my languages, switches can return values, and a switch written as the last statement of a function is considered to do so, even if each branch uses an explicit 'return'. Then, it will consider a missing ELSE a 'hole'.
It will not do any analysis of the range other than what is necessary to implement switch (duplicate values, span of values, range-checking when using jump tables).
So the language may require you to supply a dummy 'else x' or 'return
x'; so what?
The alternative appears to be one of:
* Instead of 'else' or 'return', to write 'unreachable', which puts some
trust, not in the programmer, but in some person calling your function
who does not have sight of the source code, to avoid calling it with
invalid arguments
Once
you have such a representation (which may be desirable for other
reasons) it is easy to determine the set of handled values. More
precisely, in this example we just have a small number of discrete
values. A more ambitious compiler may keep a list of ranges.
If the type also specifies a list of values or ranges, then
it is easy to check whether all values of the type are handled.
The types are typically plain integers, with ranges from 2**8 to 2**64.
The ranges associated with application needs will be more arbitrary.
If talking about a language with ranged integer types, then there might
be more point to it, but that is itself a can of worms. (It's hard to do without getting halfway to implementing Ada.)
You can't do this stuff with the compilers David Brown uses; I'm
guessing you can't do it with your preferred ones either.
To recompile the typed system I use (about 0.4M lines) on a new fast
machine I need about 53s. But that is kind of cheating:
- this time is for a parallel build using 20 logical cores
- the compiler is not in the language it compiles (but in an untyped
version of it)
- actual compilation of the compiler is a small part of total
compile time
On a slow machine compile time can be as large as 40 minutes.
40 minutes for 400K lines? That's 160 lines per second; how old is this machine? Is the compiler written in Python?
An untyped system that I use has about 0.5M lines and recompiles
itself in 16s on the same machine. This one uses a single core.
On a slow machine compile time may be closer to 2 minutes.
So 4K to 30Klps.
Again, compiler compile time is only a part of build time.
Actually, one time-intensive part is creating the index for the included
documentation.
Which is not going to be part of a routine build.
Another is C compilation for a library file
(system has image-processing functions and low-level part of
image processing is done in C). Recompilation starts from a
minimal version of the system; rebuilding this minimal
version takes 3.3s.
My language tools work on a whole program, where a 'program' is a single
EXE or DLL file (or a single OBJ file in some cases).
A 'build' then turns N source files into 1 binary file. This is the task
I am talking about.
A complete application may have several such binaries and a bunch of
other stuff. Maybe some source code is generated by a script. This part
is open-ended.
However each of my current projects is a single, self-contained binary
by design.
Anyway, I do not need the cascaded recompilation that you present.
Both systems above have incremental compilation, the second one
at statement/function level: it offers an interactive prompt
which takes a statement from the user, compiles it, and immediately
executes it. Such a statement may define a function or perform a computation.
Even on a _very_ slow machine there is no noticeable delay due to
compilation, unless you feed the system some oversized statement
or function (presumably from a file).
This sounds like a REPL system. There, each line is a new part of the program which is processed, executed and discarded.
In that regard, it
is not really what I am talking about, which is AOT compilation of a
program represented by a bunch of source files.
Or can a new line redefine something, perhaps a function definition, previously entered amongst the last 100,000 lines? Can a new line
require compilation of something typed 50,000 lines ago?
What happens if you change the type of a global; are you saying that
none of the program codes needs revising?
An untyped system
What do you mean by an untyped system? To me it usually means
dynamically typed.
On 19/11/2024 15:51, Bart wrote:
On 19/11/2024 01:53, Waldek Hebisch wrote:
Another example, building 40Kloc interpreter from source then running it
in memory:
c:\qx>tm \bx\mm -run qq hello
Compiling qq.m to memory
Hello, World! 19-Nov-2024 15:38:47
TM: 0.11
c:\qx>tm qq hello
Hello, World! 19-Nov-2024 15:38:49
TM: 0.05
The second version runs a precompiled EXE. So building from source added
only 90ms.
Sorry, that should be 60ms. Running that interpreter from source only
takes 1/16th of a second longer, not 1/11th of a second.
BTW I didn't remark on the range of your (WH's) figures. They spanned
from 40 minutes for a build down to instant, but it's not clear which
languages they are for, which tools are used, or which machines. Or how
much work they have to do to get those faster times, or what work they
don't do: I'm guessing it's not processing 0.5M lines for that fastest time.
So it was hard to formulate a response.
All my timings are either for C or my systems language, running on one
core on the same PC.
Bart <bc@freeuk.com> wrote:
It is related: both gcc and LLVM are doing analyses that in the
past were deemed impractically expensive (both in time and in space).
Those analyses work now thanks to smart algorithms that
significantly reduced resource usage. I know that you consider
this too expensive.
What's the context of this 0.1 seconds? Do you consider it long or short?
Context is interactive response. It means "pretty fast for interactive
use".
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
So I'm failing to see your point here. Maybe you picked up that 0.1
seconds from an earlier post of mine and are suggesting I ought to be
able to do a lot more analysis within that time?
This 0.1s is an old thing. My point is that if you are compiling a simple
change, then you should be able to do more in this time. In normal
development, source files bigger than 10000 lines are relatively
rare, so once you get into the range of 50000-100000 lines per second,
making the compiler faster is of marginal utility.
We clearly differ on the question of what is routine. Creating a usable
executable is a rare task; once an executable is created it can be used
for a long time. OTOH development is routine, and for this one wants
to know if a change is correct.
Already a simple thing would be an improvement: make the compiler aware of
the error routine (if you do not have one, add one) so that when you
signal an error the compiler will know that there is no need for a normal
return value.
Which is not going to be part of a routine build.
In a sense a build is not routine. A build is done for two purposes:
- to install a working system from sources, which includes
documentation
- to check that the build works properly after changes; this also
should check the documentation build.
Normal development goes on without rebuilding the system.
I know. But this is not what I do. The build produces multiple
artifacts, some of them executable, some loadable code (but _not_
in a form recognized by the operating system), some essentially
non-executable (like documentation).
This sounds like a REPL system. There, each line is a new part of the
program which is processed, executed and discarded.
First, I am writing about two different systems. Both have a REPL.
Lines typed at the REPL are "discarded", but their effect may last
a long time.
What happens if you change the type of a global; are you saying that
none of the program codes needs revising?
In the typed system there are no global "library" variables; all data
is encapsulated in modules and normally accessed in an abstract way,
by calling appropriate functions. So, in "clean" code you
can recompile a single module and the whole system works.
Bart <bc@freeuk.com> wrote:
BTW I didn't remark on the range of your (WH's) figures. They spanned 40
minutes for a build to instant, but it's not clear for which languages
they are, which tools are used and which machines. Or how much work they
have to do to get those faster times, or what work they don't do: I'm
guessing it's not processing 0.5M lines for that fastest time.
As I wrote, there are 2 different systems; if interested you can fetch
them from github.
I do not think I will use your system language. And for a C compiler,
at least currently, it does not make a big difference to me if your
compiler can do 1Mloc or 5Mloc on my machine; both are "pretty fast".
What matters more is support of debugging output, supporting
targets that I need (like ARM or Risc-V), good diagnostics
and optimization.
I recently installed TinyC on a small Risc-V
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Dan Purgert <dan@djph.net> wrote or quoted:
if (n==0) { printf ("n: %u\n",n); n++;}
if (n==1) { printf ("n: %u\n",n); n++;}
if (n==2) { printf ("n: %u\n",n); n++;}
if (n==3) { printf ("n: %u\n",n); n++;}
if (n==4) { printf ("n: %u\n",n); n++;}
printf ("all if completed, n=%u\n",n);
My bad if the following instruction structure's already been hashed
out in this thread, but I haven't been following the whole convo!
In my C 101 classes, after we've covered "if" and "else",
I always throw this program up on the screen and hit the newbies
with this curveball: "What's this bad boy going to spit out?".
Well, it's a blue moon when someone nails it. Most of them fall
for my little gotcha hook, line, and sinker.
#include <stdio.h>
const char * english( int const n )
{ const char * result;
if( n == 0 )result = "zero";
if( n == 1 )result = "one";
if( n == 2 )result = "two";
if( n == 3 )result = "three";
else result = "four";
return result; }
void print_english( int const n )
{ printf( "%s\n", english( n )); }
int main( void )
{ print_english( 0 );
print_english( 1 );
print_english( 2 );
print_english( 3 );
print_english( 4 ); }
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you
might get by optimising it is vital!
Here I might borrow one of your arguments and suggest such a speed-up is
only necessary on a rare production build.
I recently installed TinyC on a small Risc-V
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Only 20,000KB? My first compilers worked on 64KB systems, not all of
which was available either.
None of my recent products will do so now, but they will still fit on a
floppy disk.
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
BTW I didn't remark on the range of your (WH's) figures. They spanned 40
minutes for a build to instant, but it's not clear for which languages
they are, which tools are used and which machines. Or how much work they
have to do to get those faster times, or what work they don't do: I'm
guessing it's not processing 0.5M lines for that fastest time.
As I wrote, there are 2 different systems; if interested you can fetch
them from github.
Do you have a link? Probably I won't attempt to build but I can see what
it looks like.
I do not think I will use your system language. And for a C compiler,
at least currently, it does not make a big difference to me if your
compiler can do 1Mloc or 5Mloc on my machine; both are "pretty fast".
What matters more is support of debugging output, supporting
targets that I need (like ARM or Risc-V), good diagnostics
and optimization.
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you might get by optimising it is vital!
Here I might borrow one of your arguments and suggest such a speed-up is only necessary on a rare production build.
I recently installed TinyC on a small Risc-V
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Only 20,000KB? My first compilers worked on 64KB systems, not all of
which was available either.
None of my recent products will do so now, but they will still fit on a floppy disk.
BTW why don't you use a cross-compiler? That's what David Brown would say.
Bart <bc@freeuk.com> writes:
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you
might get by optimising it is vital!
I don't consider it funny at all, rather it is simply the way things
should be. One compiles once.
One's customer runs the resulting
executable perhaps millions of times.
Here I might borrow one of your arguments and suggest such a speed-up is
only necessary on a rare production build.
And again, you've clearly never worked with any significantly
large project. Like for instance an operating system.
machine; I think that the available memory (64MB in all, about 20MB
available to user programs) is too small to run gcc or clang.
Only 20,000KB? My first compilers worked on 64KB systems, not all of
which was available either.
My first compilers worked on a 4KW PDP-8. Not that I have any
interest in _ever_ working in such a constrained environment
ever again.
None of my recent products will do so now, but they will still fit on a
floppy disk.
And, nobody cares.
On 19/11/2024 22:40, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
It is related: both gcc and LLVM are doing analyses that in the
past were deemed impractically expensive (both in time and in space).
Those analyses work now thanks to smart algorithms that
significantly reduced resource usage. I know that you consider
this too expensive.
How long would LLVM take to compile itself on one core? (Here I'm not
even sure what LLVM is; if you download the binary, it's about 2.5GB,
but a typical LLVM compiler might be 100+ MB. But I guess it will be a
while in either case.)
I have a product now that is like a mini-LLVM backend. It can build into
a standalone library of under 0.2MB, which can directly produce EXEs, or
it can interpret. Building that product from scratch takes 60ms.
That is my kind of product.
What's the context of this 0.1 seconds? Do you consider it long or short?
Context is interactive response. It means "pretty fast for interactive
use".
It's less than the time to press and release the Enter key.
My tools can generally build my apps from scratch in 0.1 seconds; big
compilers tend to take a lot longer. Only Tiny C is in that ballpark.
So I'm failing to see your point here. Maybe you picked up that 0.1
seconds from an earlier post of mine and are suggesting I ought to be
able to do a lot more analysis within that time?
This 0.1s is an old thing. My point is that if you are compiling a simple
change, then you should be able to do more in this time. In normal
development, source files bigger than 10000 lines are relatively
rare, so once you get into the range of 50000-100000 lines per second,
making the compiler faster is of marginal utility.
I *AM* doing more in that time! It just happens to be stuff you appear
to have no interest in:
* I write whole-program compilers: you always process all source files
of an application. The faster the compiler, the bigger the scale of app
it becomes practical on.
* That means no headaches with dependencies (it goes in hand with a
decent module scheme)
* I can change one tiny corner of the program, say add an /optional/
argument to a function, which requires compiling all call-sites across
the program, and the next compilation will take care of everything
* If I were to do more with optimisation (there is lots that can be done without getting into the heavy stuff), it automatically applies to the
whole program
* I can choose to run applications from source code, without generating discrete binary files, just like a script language
* I can choose (with my new backend) to interpret programs in this
static language. (Interpretation gives better debugging opportunities)
* I don't need to faff around with object files or linkers
Module-based independent compilation and having to link 'object files'
is stone-age stuff.
We clearly differ on the question of what is routine. Creating a usable
executable is a rare task; once an executable is created it can be used
for a long time. OTOH development is routine, and for this one wants
to know if a change is correct.
I take it then that you have some other way of doing test runs of a
program without creating an executable?
It's difficult to tell from your comments.
Already a simple thing would be an improvement: make the compiler aware of
the error routine (if you do not have one, add one) so that when you
signal an error the compiler will know that there is no need for a normal
return value.
OK, but what does that buy me? Saving a few bytes for a return
instruction in a function? My largest program, which is 0.4MB, already
only occupies 0.005% of the machine's 8GB.
Which is not going to be part of a routine build.
In a sense a build is not routine. A build is done for two purposes:
- to install a working system from sources, which includes
documentation
- to check that the build works properly after changes; this also
should check the documentation build.
Normal development goes on without rebuilding the system.
We must be talking at cross-purposes then.
Either you're developing using interpreted code, or you must have some
means of converting source code to native code, but for some reason you don't use 'compile' or 'build' to describe that process.
Or maybe your REPL/incremental process can run for days doing
incremental changes without doing a full compile.
It seems quite mysterious.
I might run my compiler hundreds of times a day (at 0.1 seconds a time,
600 builds would occupy one whole minute in the day!). I often do it for frivolous purposes, such as trying to get some output lined up just
right. Or just to make sure something has been recompiled since it's so quick it's hard to tell.
I know. But this is not what I do. The build produces multiple
artifacts, some of them executable, some loadable code (but _not_
in a form recognized by the operating system), some essentially
non-executable (like documentation).
So, 'build' means something different to you. I use 'build' just as a
change from writing 'compile'.
This sounds like a REPL system. There, each line is a new part of the
program which is processed, executed and discarded.
First, I am writing about two different systems. Both have REPL.
Lines typed at REPL are "discarded", but their effect may last
long time.
My last big app used a compiled core but most user-facing functionality
was done using an add-on script language. This meant I could develop
such modules from within a working application, which provided a rich, persistent environment.
Changes to the core program required a rebuild and a restart.
However the whole thing was an application, not a language.
What happens if you change the type of a global; are you saying that
none of the program codes needs revising?
In the typed system there are no global "library" variables; all data
is encapsulated in modules and normally accessed in an abstract way,
by calling appropriate functions. So, in "clean" code you
can recompile a single module and the whole system works.
I used module-at-a-time compilation until 10-12 years ago. The module
scheme had to be upgraded at the same time, but it took several goes to
get it right.
Now I wouldn't go back. Who cares about compiling a single module that
may or may not affect a bunch of others? Just compile the lot!
If a project's scale becomes too big, then it should be split into independent program units, for example a core EXE file and a bunch of
DLLs; that's the new granularity. Or a lot of functionality can be off-loaded to scripts, as I used to do.
(My scripting language code still needs bytecode compilation, and I also
use whole-program units there, but the bytecode compiler goes up to 2Mlps.)
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
BTW I didn't remark on the range of your (WH's) figures. They spanned 40
minutes for a build to instant, but it's not clear for which languages
they are, which tools are used and which machines. Or how much work they
have to do to get those faster times, or what work they don't do: I'm
guessing it's not processing 0.5M lines for that fastest time.
As I wrote, there are 2 different systems; if interested you can fetch
them from github.
Do you have a link? Probably I won't attempt to build but I can see what
it looks like.
[...]
All I have been arguing against is the idea of blindly putting in
validity tests for parameters in functions, as though it were a habit
that by itself leads to fewer bugs in code.
On 19/11/2024 23:41, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
I do not think I will use your system language. And for a C compiler,
at least currently, it does not make a big difference to me if your
compiler can do 1Mloc or 5Mloc on my machine; both are "pretty fast".
What matters more is support of debugging output, supporting
targets that I need (like ARM or Risc-V), good diagnostics
and optimization.
It's funny how nobody seems to care about the speed of compilers (which
can vary by 100:1), but for the generated programs, the 2:1 speedup you might get by optimising it is vital!
BTW why don't you use a cross-compiler? That's what David Brown would say.
On 15/11/2024 19:50, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
On 11/11/2024 20:09, Waldek Hebisch wrote:
David Brown <david.brown@hesbynett.no> wrote:
Concerning the correct place for checks: one could argue that a check
should be close to the place where the result of the check matters, which
frequently is in the called function.
No, there I disagree. The correct place for the checks should be close
to where the error is, and that is in the /calling/ code. If the called
function is correctly written, reviewed, tested, documented and
considered "finished", why would it be appropriate to add extra code to
that in order to test and debug some completely different part of the code?
The place where the result of the check /really/ matters is the calling
code. And that is also the place where you can most easily find the
error, since the error is in the calling code, not the called function.
And it is most likely to be the code that you are working on at the time
- the called function is already written and tested.
And frequently the check requires
computation that is done by the called function as part of normal
processing, but would be extra code in the caller.
It is more likely to be the opposite in practice.
And for much of the time, the called function has no real practical way
to check the parameters anyway. A function that takes a pointer
parameter - not an uncommon situation - generally has no way to check
the validity of the pointer. It can't check that the pointer actually
points to useful source data or an appropriate place to store data.
All it can do is check for a null pointer, which is usually a fairly
useless thing to do (unless the specifications for the function make the
pointer optional). After all, on most (but not all) systems you already
have a "free" null pointer check - if the caller code has screwed up and
passed a null pointer when it should not have done, the program will
quickly crash when the pointer is used for access. Many compilers
provide a way to annotate function declarations to say that a pointer
must not be null, and can then spot at least some such errors at compile
time. And of course the calling code will very often be passing the
address of an object in the call - since that can't be null, a check in
the function is pointless.
Well, in a sense pointers are easy: if you do not play nasty tricks
with casts then type checks do a significant part of the checking. Of
course, a pointer may be uninitialized (but compiler warnings help a lot
here), memory may be overwritten, etc. But overwritten memory is
rather special: if you checked that the content of memory is correct,
but it is overwritten after the check, then the earlier check does not
help. Anyway, the main point is ensuring that the pointed-to data
satisfies the expected conditions.
That does not match reality. Pointers are far and away the biggest
source of errors in C code. Use after free, buffer overflows, mixups of
who "owns" the pointer - the scope for errors is boundless. You are
correct that type systems can catch many potential types of errors - unfortunately, people /do/ play nasty tricks with type checks.
Conversions of pointer types are found all over the place in C
programming, especially conversions back and forth with void* pointers.
All this means that invalid pointer parameters are very much a real
issue - but are typically impossible to check in the called function.
The way you avoid getting errors in your pointers is being careful about having the right data in the first place, so you only call functions
with valid parameters. You do this by having careful control about the ownership and lifetime of pointers, and what they point to, keeping conventions in the names of your pointers and functions to indicate who
owns what, and so on. And you use sanitizers and similar tools during testing and debugging to distinguish between tests that worked by luck,
and ones that worked reliably. (And of course you may consider other languages than C that help you express your requirements in a clearer
manner or with better automatic checking.)
Put the same effort and due diligence into the rest of your code, and suddenly you find your checks for other kinds of parameters in functions
are irrelevant as you are now making sure you call functions with appropriate valid inputs.
Once you get to more complex data structures, the possibility for the
caller to check the parameters gets steadily less realistic.
So now your practice of having functions "always" check their parameters
leaves the people writing calling code with a false sense of security -
usually you /don't/ check the parameters, you only ever do the simple
checks that the caller could (and should!) do if they were realistic.
You've got the maintenance and cognitive overload of extra source code
for your various "asserts" and other checks, regardless of any run-time
costs
(which are often irrelevant, but occasionally very important).
You will note that much of this - for both sides of the argument - uses
words like "often", "generally" or "frequently". It is important to
appreciate that programming spans a very wide range of situations, and I
don't want to be too categorical about things. I have already said
there are situations when parameter checking in called functions can
make sense. I've no doubt that for some people and some types of
coding, such cases are a lot more common than what I see in my coding.
Note also that when you can use tools to automate checks, such as
"sanitize" options in compilers or different languages that have more
in-built checks, the balance differs. You will generally pay a run-time
cost for those checks, but you don't have the same kind of source-level
costs - your code is still clean, clear, and amenable to correctness
checking, without hiding the functionality of the code in a mass of
unnecessary explicit checks. This is particularly good for debugging,
and the run-time costs might not be important. (But if run-time costs
are not important, there's a good chance that C is not the best language
to be using in the first place.)
Our experience differs. As a silly example, consider a parser
which produces a parse tree. The caller is supposed to pass a
syntactically correct string as an argument. However, checking syntactic
correctness requires almost the same effort as producing the parse tree,
so it is usual that the parser both checks correctness and produces the result.
The trick here is to avoid producing a syntactically invalid string in
the first place. Solve the issue at the point where there is a mistake
in the code!
(If you are talking about a string that comes from outside the code in
some way, then of course you need to check it - and if that is most conveniently done during the rest of parsing, then that is fair enough.)
I have computations that are quite different from parsing but
in some cases share the same characteristic: checking correctness of
arguments requires complex computation similar to producing the
actual result. More frequently, the called routine can check various
invariants which with high probability can detect errors. Doing
the same check in the caller is impractical.
I think you are misunderstanding me - maybe I have been unclear. I am
saying that it is the /caller's/ responsibility to make sure that the
parameters it passes are correct, not the /callee's/ responsibility.
That does not mean that the caller has to add checks to get the
parameters right - it means the caller has to use correct parameters.
Think of this like walking near a cliff-edge. Checking parameters
before the call is like having a barrier at the edge of the cliff. My
recommendation is that you know where the cliff edge is, and don't walk
there.
On 20/11/2024 02:33, Bart wrote:
It's funny how nobody seems to care about the speed of compilers
(which can vary by 100:1), but for the generated programs, the 2:1
speedup you might get by optimising it is vital!
To understand this, you need to understand the benefits of a program
running quickly.
Let's look at the main ones:
There is usually a point where a program is "fast enough" - going faster makes no difference. No one is ever going to care if a compilation
takes 1 second or 0.1 seconds, for example.
It doesn't take much thought to realise that for most developers, the
speed of their compiler is not actually a major concern in comparison to
the speed of other programs.
While writing code, and testing and debugging it, a given build might
only be run a few times, and compile speed is a bit more relevant. Generally, however, most programs are run far more often, and for far longer, than their compilation time.
And as usual, you miss out the fact that toy compilers - like yours, or TinyC - miss all the other features developers want from their tools. I want debugging information, static error checking, good diagnostics,
support for modern language versions (that's primarily C++ rather than
C), useful extensions, compact code, correct code generation, and most importantly of all, support for the target devices I want.
I wouldn't
care if your compiler can run at a billion lines per second and gcc took
an hour to compile - I still wouldn't be interested in your compiler
because it does not generate code for the devices I use. Even if it
did, it would be useless to me, because I can trust the code gcc
generates and I cannot trust the code your tool generates.
And even if
your tool did everything else I need, and you could convince me that it
is something a professional could rely on, I'd still use gcc for the
better quality generated code, because that translates to money saved
for my customers.
BTW why don't you use a cross-compiler? That's what David Brown would
say.
That is almost certainly what he normally does. It can still be fun to play around with things like TinyC, even if it is of no practical use
for the real development.
Bart <bc@freeuk.com> wrote:
Either you're developing using interpreted code, or you must have some
means of converting source code to native code, but for some reason you
don't use 'compile' or 'build' to describe that process.
Or maybe your REPL/incremental process can run for days doing
incremental changes without doing a full compile.
Yes.
It seems quite mysterious.
There is nothing mysterious here. In the typed system each module has
a vector (one-dimensional array) called the domain vector, containing
among other things references to called functions. All inter-module
calls are indirect ones; they take the thing to call from the domain
vector. When a module starts execution the references point to a
runtime routine doing similar work to a dynamic linker. The first call
goes to the runtime support routine, which finds the needed code and
replaces the reference in the domain vector.
When a module is recompiled, references in domain vectors are
reinitialized to point to the runtime. So searches are run again
and if needed pick up the new routine.
Note that there is a global table keeping info (including types)
about all exported routines from all modules. This table is used
when compiling a module and also by the search process at runtime.
The effect is that after recompilation of a single module I have a
runnable executable in memory, including the code of the new module.
If you wonder about compiling the same module many times: the system
has a garbage collector and unused code is garbage collected.
So, when the old version is replaced by a new one the old becomes
garbage and will be collected in due time.
The other system is similar in principle, but there is no need
for runtime search and domain vectors.
I might run my compiler hundreds of times a day (at 0.1 seconds a time,
600 builds would occupy one whole minute in the day!). I often do it for
frivolous purposes, such as trying to get some output lined up just
right. Or just to make sure something has been recompiled since it's so
quick it's hard to tell.
I know. But this is not what I do. A build produces multiple
artifacts, some of them executable, some loadable code (but _not_
in a form recognized by the operating system), some essentially
non-executable (like documentation).
So, 'build' means something different to you. I use 'build' just as a
change from writing 'compile'.
Build means creating new fully-functional system. That involves
possibly multiple compilations and whatever else is needed.
On 20/11/2024 16:15, David Brown wrote:
On 20/11/2024 02:33, Bart wrote:
It's funny how nobody seems to care about the speed of compilers
(which can vary by 100:1), but for the generated programs, the 2:1
speedup you might get by optimising it is vital!
To understand this, you need to understand the benefits of a program
running quickly.
As I said, people are preoccupied with that for programs in general. But when it comes to compilers, it doesn't apply! Clearly, you are implying
that those benefits don't matter when the program is a compiler.
Let's look at the main ones:
<snip>
OK. I guess you missed the bits here and in another post, where I
suggested that enabling optimisation is fine for production builds.
For the routine ones that I do 100s of times a day, where test runs are
generally very short, then I don't want to hang about waiting for a
compiler that is taking 30 times longer than necessary for no good reason.
There is usually a point where a program is "fast enough" - going
faster makes no difference. No one is ever going to care if a
compilation takes 1 second or 0.1 seconds, for example.
If you look at all the interactions people have with technology, with
GUI apps, even with mechanical things, a 1 second latency is generally disastrous.
A one-second delay between pressing a key and seeing a character appear
on a display or any other feedback would drive most people up the wall.
But 0.1 is perfectly fine.
It doesn't take much thought to realise that for most developers, the
speed of their compiler is not actually a major concern in comparison
to the speed of other programs.
Most developers are stuck with what there is. Naturally they will make
the best of it. Usually by finding 100 ways or 100 reasons to avoid
running the compiler.
While writing code, and testing and debugging it, a given build might
only be run a few times, and compile speed is a bit more relevant.
Generally, however, most programs are run far more often, and for far
longer, than their compilation time.
Developing code is the critical bit.
Even when a test run takes a bit longer as you need to set things up,
when you do need to change something and run it again, you don't want
any pointless delay.
Neither do you want to waste /your/ time pandering to a compiler's
slowness by writing makefiles and defining dependencies.
Or even
splitting things up into tiny modules.
I don't want to care about that
at all. Here's my bunch of source files, just build the damn thing, and
do it now!
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
As I said, no one is ever going to care if a compilation takes 1 second
or 0.1 seconds.
So your advice is that developers should be stuck
Which do you think an employer (or amateur programmer) would prefer?
a) A compiler that runs in 0.1 seconds with little static checking
b) A compiler that runs in 10 seconds but spots errors saving 6 hours debugging time
I might spend an hour or two writing code (including planing,
organising, reading references, etc.) and then 5 seconds building it.
Then there might be anything from a few minutes to a few hours testing
or debugging.
But using a good compiler saves a substantial amount of developer time
<snip the rest to save time>
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
On 21/11/2024 15:50, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest
application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
The one I remember most was 'TKB' I think it was, running on ICL 4/72
(360 clone). It took up most of the memory. It was used to link my small
Fortran programs.
Bart <bc@freeuk.com> writes:
On 21/11/2024 15:50, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest
application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
The one I remember most was 'TKB' I think it was, running on ICL 4/72
(360 clone). It took up most of the memory. It was used to link my small
Fortran programs.
So you generalize from your one non-standard experience to the entire ecosystem.
Typical Bart.
On 21/11/2024 16:10, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 15:50, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
For the routine ones that I do 100s of times a day, where test runs
are generally very short, then I don't want to hang about waiting for
a compiler that is taking 30 times longer than necessary for no good
reason.
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
1970s builds, especially on mainframes, were dominated by link times.
Which mainframe do you have experience on?
I spent a decade writing a mainframe operating system (the largest
application we had to compile regularly) and the link time was a
minor fraction of the overall build time.
It was so minor that our build system stored the object files
so that the OS engineers only needed to recompile the object
associated with the source file being modified rather than
the entire OS, they'd share the rest of the object files
with the entire OS team.
The one I remember most was 'TKB' I think it was, running on ICL 4/72
(360 clone). It took up most of the memory. It was used to link my small
Fortran programs.
So you generalize from your one non-standard experience to the entire ecosystem.
Typical Bart.
Typical Scott. Did you post just to do a bit of bart-bashing?
Have you also considered that your experience of building operating
systems might itself be non-standard?
Bart <bc@freeuk.com> wrote:
...or to just always require 'else', with a dummy value if necessary?
Well, frequently it is easier to do a bad job than a good one.
I assume that you consider the simple solution the 'bad' one?
You wrote about _always_ requiring 'else' regardless of whether it is
needed or not. Yes, I consider this bad.
On 20/11/2024 21:17, Bart wrote:
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
And presumably you also advise doing so on a bargain basement
single-core computer from at least 15 years ago?
Sure. That's when you run a production build. I can even do that myself
on some programs (the ones where my C transpiler still works) and pass
them through gcc -O3. Then it might run 30% faster.
int main(void) {
int a;
int* p = 0;
a = *p;
}
Here's what happens with my C compiler when told to interpret it:
c:\cx>cc -i c
Compiling c.c to c.(int)
Error: Null ptr access
Here's what happens with gcc:
c:\cx>gcc c.c
c:\cx>a
<crashes>
Is there some option to insert such a check with gcc? I've no idea; most people don't.
Bart <bc@freeuk.com> wrote:
int main(void) {
int a;
int* p = 0;
a = *p;
}
Here's what happens with my C compiler when told to interpret it:
c:\cx>cc -i c
Compiling c.c to c.(int)
Error: Null ptr access
Here's what happens with gcc:
c:\cx>gcc c.c
c:\cx>a
<crashes>
Is there some option to insert such a check with gcc? I've no idea; most
people don't.
I would do
gcc -g c.c
gdb a.out
run
and gdb would show me the place with the bad access. Things like bounds
checking of array access or overflow checking make a big difference.
Null pointer access is reliably detected by hardware, so no big
deal. Say what your 'cc' will do with the following function:
int
foo(int n) {
    int a[10];
    int i;
    int res = 0;
    for(i = 0; i <= 10; i++) {
        a[i] = n + i;
    }
    for(i = 0; i <= 10; i++) {
        res += a[i];
    }
    return res;
}
Here gcc at compile time says:
foo.c: In function ‘foo’:
foo.c:15:17: warning: iteration 10 invokes undefined behavior [-Waggressive-loop-optimizations]
15 | res += a[i];
| ~^~~
foo.c:14:18: note: within this loop
14 | for(i = 0; i <= 10; i++) {
| ~~^~~~~
Bart <bc@freeuk.com> wrote:
Sure. That's when you run a production build. I can even do that
myself on some programs (the ones where my C transpiler still
works) and pass it through gcc-O3. Then it might run 30% faster.
On fast machine running Dhrystone 2.2a I get:
tcc-0.9.28rc 20000000
gcc-12.2 -O 64184852
gcc-12.2 -O2 83194672
clang-14 -O 83194672
clang-14 -O2 85763288
so with -O2 this is more than 4 times faster. Dhrystone correlates
reasonably with the runtime of tight compute-intensive programs.
Compilers started to cheat on the original Dhrystone, so there are
bigger benchmarks like SPEC INT. But Dhrystone 2 has modifications
to make cheating harder, so I think it is still a reasonable
benchmark. Actually, the difference may be much bigger; for example
in image processing both clang and gcc can use vector instructions,
which may give an additional speedup of order 8-16.
30% above means that you are much better than tcc or your program
is behaving badly (I have programs that make intensive use of
memory; here the effect of optimization would be smaller, but still
of order 2).
On Fri, 22 Nov 2024 12:33:29 -0000 (UTC)
antispam@fricas.org (Waldek Hebisch) wrote:
Bart <bc@freeuk.com> wrote:
Sure. That's when you run a production build. I can even do that
myself on some programs (the ones where my C transpiler still
works) and pass it through gcc-O3. Then it might run 30% faster.
On fast machine running Dhrystone 2.2a I get:
tcc-0.9.28rc 20000000
gcc-12.2 -O 64184852
gcc-12.2 -O2 83194672
clang-14 -O 83194672
clang-14 -O2 85763288
so with -O2 this is more than 4 times faster. Dhrystone correlates
reasonably with the runtime of tight compute-intensive programs.
Compilers started to cheat on the original Dhrystone, so there are
bigger benchmarks like SPEC INT. But Dhrystone 2 has modifications
to make cheating harder, so I think it is still a reasonable
benchmark. Actually, the difference may be much bigger; for example
in image processing both clang and gcc can use vector instructions,
which may give an additional speedup of order 8-16.
30% above means that you are much better than tcc or your program
is behaving badly (I have programs that make intensive use of
memory; here the effect of optimization would be smaller, but still
of order 2).
gcc -O is not what Bart was talking about. It is quite similar to -O1.
Try gcc -O0.
With regard to speedup, I had run only one or two benchmarks with tcc
and my results were close to those of Bart. gcc -O0 is very similar to
tcc in the speed of the exe, but compiles several times slower. The
gcc -O2 exe is about 2.5 times faster.
I'd guess I could construct a case where gcc successfully vectorizes
some floating-point loop calculation and shows a 10x speedup vs tcc on
modern Zen5 hardware. But that would not be typical.
On 21/11/2024 13:00, David Brown wrote:
On 20/11/2024 21:17, Bart wrote:
Your development process sounds bad in so many ways it is hard to know
where to start. I think perhaps the foundation is that you taught
yourself a bit of programming in the 1970's,
I did a CS degree actually. I also spent a year programming, working for
the ARC and SRC (UK research councils).
But since you are being so condescending, I think /your/ problem is in having to use C. I briefly mentioned that a 'better language' can help.
While I don't claim that my language is particularly safe, mine is
somewhat safer than C in its type system, and far less error prone in
its syntax and its overall design (for example, a function's details are always defined in exactly one place, so less maintenance and fewer
things to get wrong).
So, half the options in your C compilers are to help get around those shortcomings.
You also seem proud that in this example:
int F(int n) {
if (n==1) return 10;
if (n==2) return 20;
}
You can use 'unreachable()', a new C feature, to silence compiler
messages about running into the end of the function, something I
considered a complete hack.
My language requires a valid return value from the last statement. In
that it's similar to the Rust example I posted 9 hours ago.
Yet the gaslighting here suggested what I chose to do was completely wrong.
And presumably you also advise doing so on a bargain basement
single-core computer from at least 15 years ago?
Another example of you acknowledging that compilation speed can be a problem. So a brute force approach to speed is what counts for you.
If you found that it took several hours to drive 20 miles from A to B,
your answer would be to buy a car that goes at 300mph, rather than doing endless detours along the way.
Or another option is to think about each journey extremely carefully,
and then only do the trip once a week!
On 22/11/2024 12:33, Waldek Hebisch wrote:
Bart <bc@freeuk.com> wrote:
Sure. That's when you run a production build. I can even do that myself
on some programs (the ones where my C transpiler still works) and pass
it through gcc-O3. Then it might run 30% faster.
On fast machine running Dhrystone 2.2a I get:
tcc-0.9.28rc 20000000
gcc-12.2 -O 64184852
gcc-12.2 -O2 83194672
clang-14 -O 83194672
clang-14 -O2 85763288
so with 02 this is more than 4 times faster. Dhrystone correlated
resonably with runtime of tight compute-intensive programs.
Compiler started to cheat on original Dhrystone, so there are
bigger benchmarks like SPEC INT. But Dhrystone 2 has modifications
to make cheating harder, so I think it is still reasonable
benchmark. Actually, difference may be much bigger, for example
in image processing both clang and gcc can use vector intructions,
with may give additional speedup of order 8-16.
30% above means that you are much better than tcc or your program
is badly behaving (I have programs that make intensive use of
memory, here effect of optimization would be smaller, but still
of order 2).
The 30% applies to my typical programs, not benchmarks. Sure, gcc -O3
can do a lot of aggressive optimisations when everything is contained
within one short module and most runtime is spent in clear bottlenecks.
Real apps, like say my compilers, are different. They tend to use
globals more, program flow is more disseminated. The bottlenecks are
harder to pin down.
But, OK, here's the first sizeable benchmark that I thought of (I can't
find a reliable Dhrystone one; perhaps you can post a link).