the fact that in c a language/compiler sees only functions or variables
that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
If you have something to say about splitting a C translation unit (something I don't think I've ever had a need to do), perhaps because
I don't recall refactoring existing code, primarily because the original programmers used multiple translation units logically
Having said that, I don't remember it ever being a big deal. If some source file needs to be subdivided, you simply subdivide it
you should not care because it makes no sense; those separate files are
then like in parallel dimensions and that is good (they will not come into conflict as they have to have unique names anyway)
the thing that you need to care about is some kind of bad annoying design flaw
like that historic goto abuse or others like that
So the solution is to give at least a compiler extension that would allow you
to change it so that it sees up and down
the fact that this switch is not present is another flaw..so you have
two flaws here
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
On Sun, 17 May 2026 00:03:40 +0200, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
Back in those days, languages that needed multipass compilers (e.g.
Algol 68) were considered complicated and expensive to implement.
That’s why C went for a single-pass language design, like Pascal. And
like Pascal, it has forward declarations to mitigate this somewhat.
You need some kind of use-before-define facility in any realistic
language, if you want to allow recursion, and in particular mutual
recursion.
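As a minimal sketch of that mitigation (the names `is_even` and `is_odd` are illustrative, not from the thread), a forward declaration is what lets two mutually recursive functions coexist under C's declare-before-use rule:

```c
#include <stdbool.h>

/* Forward declaration: tells the compiler that is_odd exists and what
   its type is, before the definition appears further down the file. */
static bool is_odd(unsigned n);

/* is_even calls is_odd, which has only been declared so far. */
static bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

/* The definition matches the earlier declaration. */
static bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}
```

Without the declaration of `is_odd`, the call inside `is_even` is diagnosed under C99 and later, which removed implicit function declarations.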
It’s amusing to think that C++, that behemoth that, in terms of sheer complexity, leaves old-style monsters like Algol 68 or PL/I in the
dust, is still essentially a single-pass language design.
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it was harder work than necessary to split one source file up into two or more.
static T A();
static T B();
static T E();
static T F();
T A(){}
T B(){}
T C(){}
T D(){}
Bart wrote:
static T A();
static T B();
static T E();
static T F();
T A(){}
T B(){}
T C(){}
T D(){}
imo the variables (call them file variables or block-of-code variables)
probably cause more problems than functions
a good way of design is imo to have a few functions and variables/arrays that work together - like in a small c file...in another file you make another set of functions and variables...bad design is to split those variables away from their functions, and today i would need to move the
variables up (enforcing BAD DESIGN) or make extern declarations for visibility (enforcing even more BAD DESIGN imo); it also involves CODE JUMPING which is BAD
solution - add this f**kin switch to the compiler...what i said is
a mathematical (logical) proof that it's bad design
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit
that it can be onerous.
Here is a very simple example [...]
Bart wrote:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
ye i remember we afair both agreed on that point even a few years ago (that
it is bad - the fact you need to bother whether it is up makes a DEPENDENCY;
even worse, it is a CURSED DEPENDENCY)
in this post above however i mentioned a slightly different thing - that not only is this bad (which was talked about already) but that there is a real need
for a compiler extension/switch - i know it would violate standard c, but there is a need for a switch that violates c imo - for the practical reason of
getting rid of this annoyance
Bart <bc@freeuk.com> writes:
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit
that it can be onerous.
It could be onerous. The point is, in actual practice it almost
never is onerous.
Here is a very simple example [...]
The example is not evidence but a strawman argument. It just
doesn't match the experience of actual practice of other C
developers (speaking for myself, and other developers I have
known personally, and the comments of other newsgroup folks who
have participated in the conversation).
It’s amusing to think that C++, that behemoth that, in terms of sheer complexity, leaves old-style monsters like Algol 68 or PL/I in the
dust, is still essentially a single-pass language design.
On Sun, 17 May 2026 06:48:06 -0700, Tim Rentsch wrote:
Bart <bc@freeuk.com> writes:
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
fir, "You are doing it wrong".
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
Bart, in reality, a smart developer almost never has to "split one source file up into two or more". Instead, they /plan/ for isolation and encapsulation
of the functional parts of their code, and /intentionally/ develop
multiple source files from the start. That's the way professionals do it.
You, for instance, might write the parser, calling external functions
to generate output code. For your purposes, those external functions
can be debugging stubs in a separate source or object module.
Fir, OTOH, might write the code generator functions, with a simple
debugging driver as a separate source or object module. Once both of
you have tested your parts to success, you can combine your two works,
with fir supplying the code generator and you the parser, to create a
single compiler program. None of this has to be "onerous". The hard
part is agreeing on the contract between your code and fir's code, and
that's part of what a professional does /before/ (s)he starts coding.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit
that it can be onerous.
It could be onerous. The point is, in actual practice it almost
never is onerous.
Here is a very simple example [...]
The example is not evidence but a strawman argument. It just
doesn't match the experience of actual practice of other C
developers (speaking for myself, and other developers I have
known personally, and the comments of other newsgroup folks who
have participated in the conversation).
Agreed. Bart and fir have a "special" view of coding, which is
at odds with my 30+ years experience in the profession (and
my 20+ years of post-professional (amateur) programming).
Lew Pitcher wrote:
Agreed. Bart and fir have a "special" view of coding, which is
at odds with my 30+ years experience in the profession (and
my 20+ years of post-professional (amateur) programming).
we're talking here about pieces of c code..call them c files for example
say you have N such pieces - when i code my app i get the pieces just
as i said ..one is for example setup_window.c another is timer.c another
is blitter.c and so on
each has a set of functions and "global" variables related to
it..mostly to it but some of them may also be accessed by other pieces/files
if you have a system where the functions in each piece see all other pieces up and down it's the ideal and proper situation because all these
files are like orthogonal to one another and thus separated, only they
see each other by names
adding a rigid constraint that you must keep those files in an artificial
linear order from top to bottom is absolutely crazy because
1) a naturally given order doesn't exist
2) it's absolutely silly and tiresome to manage such a stupid order
it's an absolute FLAW and it's a disaster
bartc simply has a dose of intelligence to see it too (though such things some may see to a different extent)..with time i'm even more angry than
before (when i was writing on this many years ago)
c also has other flaws (also mentioned) - a language should be designed
in such a way as not to force code jumping and unnecessary dependencies which
kill code, making work on it more tiresome and stupid
On 17/05/2026 17:24, fir wrote:
Lew Pitcher wrote:
Agreed. Bart and fir have a "special" view of coding, which is
at odds with my 30+ years experience in the profession (and
my 20+ years of post-professional (amateur) programming).
we're talking here about pieces of c code..call them c files for example
say you have N such pieces - when i code my app i get the pieces just
as i said ..one is for example setup_window.c another is timer.c
another is blitter.c and so on
each has a set of functions and "global" variables related to
it..mostly to it but some of them may also be accessed by other
pieces/files
if you have a system where the functions in each piece see all
other pieces up and down it's the ideal and proper situation because all
these files are like orthogonal to one another and thus separated,
only they
see each other by names
adding a rigid constraint that you must keep those files in an artificial
linear order from top to bottom is absolutely crazy because
1) a naturally given order doesn't exist
2) it's absolutely silly and tiresome to manage such a stupid order
C does allow you to have them in arbitrary order.
But it means writing and maintaining function prototypes at the top of
the file. That is what's tiresome.
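To illustrate the point, here is a small sketch (all function names are hypothetical) of a file whose functions are written top-down. It compiles only because of the block of `static` prototypes at the top:

```c
#include <stddef.h>

/* Prototypes for the file-local helpers, so that the definitions below
   can appear in any order. Each one has to be kept in sync by hand. */
static int add(int a, int b);
static int square(int x);

/* The top-level routine comes first, in "top-down" reading order. */
int sum_of_squares(const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total = add(total, square(v[i]));
    return total;
}

/* The helpers follow, in no particular order. */
static int square(int x) { return x * x; }

static int add(int a, int b) { return a + b; }
```

Adding a helper, renaming one, or changing its signature means editing both the prototype and the definition.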
it's an absolute FLAW and it's a disaster
bartc simply has a dose of intelligence to see it too (though such
things some may see to a different extent)..with time i'm even more angry
than before (when i was writing on this many years ago)
c also has other flaws (also mentioned) - a language should be designed
in such a way as not to force code jumping and unnecessary dependencies which
kill code, making work on it more tiresome and stupid
My personal language allows anything to be defined in any order,
including types, variables, enums and macros, at module scope or inside
a function.
So if you wanted, you could define all local variables at the end of a function rather than at the top; in C syntax:
void GF() {
F(&x, &y, &z);
...
int x, y, z;
}
This doesn't look that useful at first, but in a current project where I
am generating code in that language, it is invaluable, as I don't know exactly how many temporary variables need to be defined until I'm done generating code for the body.
Bart, in reality, a smart developer almost never has to "split one
source file up into two or more". Instead, they /plan/ for isolation
and encapsulation of the functional parts of their code, and
/intentionally/ develop multiple source files from the start. That's
the way professionals do it.
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit that
it can be onerous.
Here is a very simple example of one 'module', which involves two source
files, 'a.h' and 'a.c':
[snip]
So, yes, I believe a decent module scheme means stuff like this is less
work than in C.
On Sun, 17 May 2026 14:43:28 -0000 (UTC), Lew Pitcher wrote:
Bart, in reality, a smart developer almost never has to "split one
source file up into two or more". Instead, they /plan/ for isolation
and encapsulation of the functional parts of their code, and
/intentionally/ develop multiple source files from the start. That's
the way professionals do it.
Meanwhile, back in the real world, programs evolve and need to adapt
to changes in the problems they were written to solve.
What starts out as one source file might later grow to two or more.
For instance, I start out with one program for performing the main
functions of my time-and-billing system, then later discover the need
for additional utilities driven by the same config settings: so this necessitates breaking out some common routines into a new module that
is common to all the executables.
It’s called “refactoring”.
On 17/05/2026 06:50, Lawrence D’Oliveiro wrote:
On Sun, 17 May 2026 00:03:40 +0200, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
Back in those days, languages that needed multipass compilers (e.g.
Algol 68) were considered complicated and expensive to implement.
That’s why C went for a single-pass language design, like Pascal. And
like Pascal, it has forward declarations to mitigate this somewhat.
You need some kind of use-before-define facility in any realistic
language, if you want to allow recursion, and in particular mutual
recursion.
It’s amusing to think that C++, that behemoth that, in terms of sheer
complexity, leaves old-style monsters like Algol 68 or PL/I in the
dust, is still essentially a single-pass language design.
The way it works in Python is peculiar.
This fails for example:
def F():
    G()
F()
def G():
    print("G")
But it works if that F() call is moved to the end, even though G is
defined after F.
This is because 'def' is an executable statement, so executing 'def G' before doing F() is sufficient to get G into the global symbol table.
All languages I develop have out-of-order definitions so this stuff is
never a problem.
C does allow you to have them in arbitrary order.
But it means writing and maintaining function prototypes at the top
of the file. That is what's tiresome.
[...]
Much better, IMHO, is to use a language that lets you mix declarations
and statements as needed.
I see declaring your local variables in a
list at the top of a function - or, far worse, at the bottom - as
archaic style.
On 18/05/2026 01:30, Lawrence D’Oliveiro wrote:
[...]
It’s called “refactoring”.
Yes, I think some people have been overstating their claims that they
rarely or never split a source file into two or more parts. Certainly
with good planning you reduce the likelihood of having to split code
later, but it's rare that you start a project with a clear enough specification and that never changes during the lifetime of the code.
But I also think Bart is wildly overstating his claims of how much of an effort it is. Usually it's just something you do, without needing much extra effort or risk - the thought and planning effort going in to how
you are doing your re-structure is the time-consuming part, not the mechanical copy-and-paste or adding a couple of extern declarations to a
new header.
On 2026-05-18 08:56, David Brown wrote:
[...]
Much better, IMHO, is to use a language that lets you mix declarations
and statements as needed.
Indeed. But not "mixing" as a value per se, but to keep declarations
locally is a good thing, IMO.
I see declaring your local variables in a list at the top of a
function - or, far worse, at the bottom - as archaic style.
Well, "archaic" expresses a time-related qualification. But even in
earlier times we saw, depending on the actual programming language,
both styles existing.
Anyway we need forward declarations or other means (e.g. multi-pass)
to make mutual recursions or circular data structures possible.
[...] But in the C world, pre-C99 code is often written in a style with local variables all declared at the top of the function - after C99, it
is common to declare them when you need them. So as a C programmer, I
see declaring local variables in one clump together as archaic.
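A short sketch of the two styles being contrasted, using a hypothetical `mean` function written both ways:

```c
#include <stddef.h>

/* C89 style: every local is declared at the top of the block. */
double mean_c89(const double *v, size_t n)
{
    double total;
    size_t i;

    if (n == 0)
        return 0.0;
    total = 0.0;
    for (i = 0; i < n; i++)
        total += v[i];
    return total / (double)n;
}

/* C99 and later: each local is declared where it is first needed. */
double mean_c99(const double *v, size_t n)
{
    if (n == 0)
        return 0.0;
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total / (double)n;
}
```

Both versions compute the same result; the difference is only in where `total` and the loop index come into scope.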
[...]
On 2026-05-18 09:02, David Brown wrote:
On 18/05/2026 01:30, Lawrence D’Oliveiro wrote:
[...]
It’s called “refactoring”.
Yes, I think some people have been overstating their claims that they
rarely or never split a source file into two or more parts. Certainly
with good planning you reduce the likelihood of having to split code
later, but it's rare that you start a project with a clear enough
specification and that never changes during the lifetime of the code.
In C++ I had usually written a class first in one source file, tested
its basic function, and then separated the declaration in a header and
the implementation in a separate implementation file. It was just for convenience, and no issue at all, or rather trivial, to separate the
parts. That was no refactoring, though.
In real refactoring projects I've done _complex transformation_ tasks,
not just the mundane and trivial task of separating code across files.
But I also think Bart is wildly overstating his claims of how much of
an effort it is. Usually it's just something you do, without needing
much extra effort or risk - the thought and planning effort going in
to how you are doing your re-structure is the time-consuming part, not
the mechanical copy-and-paste or adding a couple of extern
declarations to a new header.
Indeed. - Actually I'm unsure whether he's making up these statements
because he really has a mental problem and difficulties handling that,
or whether he has some satisfaction in making up peculiar claims just
for the sake of an argument.
In article <10uc81u$1kd2r$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit that
it can be onerous.
Here is a very simple example of one 'module', which involves two source
files, 'a.h' and 'a.c':
[snip]
So, yes, I believe a decent module scheme means stuff like this is less
work than in C.
The example is you have two files, a.h and a.c; a.c has (say) a
bunch of `static`-qualified functions; obviously these are
file-scope, but have internal linkage. They can call each
other.
Now you split the file into two, and these functions need
external linkage and prototypes in a header file. Instead of
just creating a new `.c` file, you've got to change some stuff
in a header file (possibly newly created) as well.
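A rough sketch of what that split involves. The file names `util.h` and `util.c` and the functions `clamp`, `scale` and `a_process` are invented for illustration, and the three files are shown as one listing so that it compiles as a single translation unit:

```c
/* ---- util.h (hypothetical new header) ---- */
#ifndef UTIL_H
#define UTIL_H
/* Previously "static int clamp(...)" inside a.c; it now needs external
   linkage and a prototype that both .c files can see. */
int clamp(int x, int lo, int hi);
#endif

/* ---- util.c (hypothetical new file; would #include "util.h") ---- */
int clamp(int x, int lo, int hi)
{
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* ---- a.c (would #include "util.h") ---- */
static int scale(int x) { return x * 2; }   /* still private to a.c */

int a_process(int x)
{
    return clamp(scale(x), 0, 100);
}
```

The mechanical part is removing `static` from the moved function, adding its prototype to a header, and including that header in both source files.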
I'll admit: you have a valid point.
Yes, this happens. I do it pretty frequently. It's a chore,
though I don't think it's quite as bad as you are making it out
to be; annoying, certainly, but probably not in my list of top
10, 15, or 20 annoyances about C.
But it is an annoyance, nonetheless.
On 17/05/2026 18:56, Bart wrote:
But it means writing and maintaining function prototypes at the top of
the file. That is what's tiresome.
/If/ you want to write your functions in an arbitrary order, then you
need to declare your static functions before you use them. That is certainly true.
I accept that you like it, and I appreciate
that you are not alone in that. But other people have different preferences. I /like/ having a fixed order. I almost never declare static functions, and when programming in a language that allows
arbitrary order (such as Python), I still order my code bottom-up. This means when you look at my code, if you want to know the definition of an identifier, you only need to look in one direction - upwards.
I am not saying that this ordering is somehow universally or objectively better than arbitrary order. But I am saying that arbitrary order is
not universally or objectively better in a programming language.
Flexibility has its downsides, and there's a lot of personal opinion and preferences involved.
But I agree that if you use a language that has a "declare before use"
rule (as many languages do), and you want to write in an arbitrary
order, then it will involve extra effort. Such effort may be annoying,
but it is entirely self-imposed.
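For comparison, a sketch of the bottom-up ordering described above (the function names are hypothetical). Each function is defined before its first use, so the static helpers need no prototypes:

```c
/* Leaf helper first: it depends on nothing else in this file. */
static int abs_int(int x) { return x < 0 ? -x : x; }

/* Uses abs_int(), which is already defined above. */
static int manhattan(int x, int y) { return abs_int(x) + abs_int(y); }

/* Externally visible entry point last; everything it needs is upwards. */
int within_range(int x, int y, int limit)
{
    return manhattan(x, y) <= limit;
}
```

This only works while the call graph has no cycles; mutually recursive helpers would still need one forward declaration.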
So if you wanted, you could define all local variables at the end of a
function rather than at the top; in C syntax:
void GF() {
F(&x, &y, &z);
...
int x, y, z;
}
This doesn't look that useful at first, but in a current project where
I am generating code in that language, it is invaluable, as I don't
know exactly how many temporary variables need to be defined until I'm
done generating code for the body.
It does not just look useless at first sight, it looks horrible. But I write my code myself, for the most part - generated code does not need
to be as easily read and understood. (I can't imagine how this
"feature" is useful for generating code - surely it would be negligible effort to build up your list of variables as you build up the generated statements, and output the whole function in one lump. You are no
longer trying to fit this into a few KB of ram on a Z80.)
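The buffering approach suggested here might look roughly like the following sketch. The names `struct gen`, `emit_add` and `finish` are invented for illustration and are not taken from anyone's actual generator: statements are accumulated in memory while temporaries are counted, and the declarations are written out first once the count is known.

```c
#include <stdarg.h>
#include <stdio.h>

enum { CAP = 1024 };

/* A fixed-size text buffer; must start zero-initialised. */
struct buf { char text[CAP]; size_t len; };

/* Append formatted text; output that would not fit is dropped whole. */
static void append(struct buf *b, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(b->text + b->len, CAP - b->len, fmt, ap);
    va_end(ap);
    if (n > 0 && (size_t)n < CAP - b->len)
        b->len += (size_t)n;
    else
        b->text[b->len] = '\0';   /* keep the buffer at its old length */
}

/* Generator state: the buffered body plus a running temporary count. */
struct gen { struct buf body; int ntemps; };

/* Emit "tN = a + b;" into the buffered body and return N. */
static int emit_add(struct gen *g, const char *a, const char *b)
{
    int t = g->ntemps++;
    append(&g->body, "    t%d = %s + %s;\n", t, a, b);
    return t;
}

/* ntemps is final now: write the declarations, then the body. */
static void finish(const struct gen *g, const char *name, struct buf *out)
{
    append(out, "void %s(void) {\n", name);
    for (int i = 0; i < g->ntemps; i++)
        append(out, "    int t%d;\n", i);
    append(out, "%s}\n", g->body.text);
}
```

Starting from `struct gen g = {0};` and `struct buf out = {0};`, a caller would invoke `emit_add` as many times as needed and then call `finish(&g, "f", &out)` to get a function with its declarations at the top.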
Much better, IMHO, is to use a language that lets you mix declarations
and statements as needed. I see declaring your local variables in a
list at the top of a function - or, far worse, at the bottom - as
archaic style.
On 18/05/2026 02:22, Dan Cross wrote:
In article <10uc81u$1kd2r$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in c a language/compiler sees only functions or
variables that are up in code is a disaster
it is a disaster because it doesn't allow you to split code into N files
each file has related functions and variables and not to care about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit that
it can be onerous.
Here is a very simple example of one 'module', which involves two source
files, 'a.h' and 'a.c':
[snip]
So, yes, I believe a decent module scheme means stuff like this is less
work than in C.
The example is you have two files, a.h and a.c; a.c has (say) a
bunch of `static`-qualified functions; obviously these are
file-scope, but have internal linkage. They can call each
other.
Now you split the file into two, and these functions need
external linkage and prototypes in a header file. Instead of
just creating a new `.c` file, you've got to change some stuff
in a header file (possibly newly created) as well.
I'll admit: you have a valid point.
Yes, this happens. I do it pretty frequently. It's a chore,
though I don't think it's quite as bad as you are making it out
to be; annoying, certainly, but probably not in my list of top
10, 15, or 20 annoyances about C.
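The chore described above can be sketched as follows. This is only an illustration: the file names (helpers.h, helpers.c) and function names (clamp, percent, a_percent) are hypothetical, and the three files are shown as one listing, with comment banners marking where the file boundaries would fall.

```c
/* ---- helpers.h (new header; all names here are hypothetical) ---- */
#ifndef HELPERS_H
#define HELPERS_H
int clamp(int v, int lo, int hi);   /* was: static int clamp(...) inside a.c */
#endif

/* ---- helpers.c (new file): the definition loses 'static' ---- */
/* #include "helpers.h" */
int clamp(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* ---- a.c: keeps its private helpers with internal linkage ---- */
/* #include "helpers.h" */
static int percent(int v)           /* still static: used only in a.c */
{
    return clamp(v, 0, 100);        /* now resolved through the header */
}

int a_percent(int v)                /* exported by a.c, declared in a.h */
{
    return percent(v);
}
```

The extra work is the header and the change of linkage for clamp; the function bodies themselves are untouched.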
So you admit you have a list too! Mine had 100 items.
On 18/05/2026 07:56, David Brown wrote:
On 17/05/2026 18:56, Bart wrote:
But it means writing and maintaining function prototypes at the top
of the file. That is what's tiresome.
/If/ you want to write your functions in an arbitrary order, then you
need to declare your static functions before you use them. That is
certainly true.
I accept that you like it, and I appreciate that you are not alone
in that. But other people have different preferences. I /like/
having a fixed order. I almost never declare static functions, and
when programming in a language that allows arbitrary order (such as
Python), I still order my code bottom-up. This means when you look at
my code, if you want to know the definition of an identifier, you only
need to look in one direction - upwards.
I am not saying that this ordering is somehow universally or
objectively better than arbitrary order. But I am saying that
arbitrary order is not universally or objectively better in a
programming language. Flexibility has its downsides, and there's a lot
of personal opinion and preferences involved.
But I agree that if you use a language that has a "declare before use"
rule (as many languages do), and you want to write in an arbitrary
order, then it will involve extra effort. Such effort may be
annoying, but it is entirely self-imposed.
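A minimal sketch of both situations, using made-up function names: a mutual reference needs one forward declaration whatever the ordering, while plain bottom-up ordering needs none.

```c
#include <stdbool.h>

/* Forward declaration: needed because is_even() calls is_odd()
   before is_odd() has been defined further down the file. */
static bool is_odd(unsigned n);

static bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

static bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}

/* Bottom-up ordering: square() is defined before its only caller,
   so no forward declaration is required. */
static unsigned square(unsigned n)
{
    return n * n;
}

static unsigned sum_of_squares(unsigned a, unsigned b)
{
    return square(a) + square(b);
}
```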
In my C compiler project, there are nearly 3000 top-level functions, variables, types, enumerations, named constants and macros. That is, declared at module scope, and including local and exported names.
(My non-C compiler generates that list which is picked up by my IDE.)
I really don't want the extra hassle of managing a dependency order. In
any case, there are also mutual out of order references.
So it is a necessity. Once you use such a language yourself, C will seem archaic.
So if you wanted, you could define all local variables at the end of
a function rather than at the top; in C syntax:
void GF() {
F(&x, &y, &z);
...
int x, y, z;
}
This doesn't look that useful at first, but in a current project
where I am generating code in that language, it is invaluable, as I
don't know exactly how many temporary variables need to be defined
until I'm done generating code for the body.
It does not just look useless at first sight, it looks horrible. But
I write my code myself, for the most part - generated code does not
need to be as easily read and understood. (I can't imagine how this
"feature" is useful for generating code - surely it would be
negligible effort to build up your list of variables as you build up
the generated statements, and output the whole function in one lump.
You are no longer trying to fit this into a few KB of ram on a Z80.)
It is for simplicity. I don't want an extra pass generating internal
data structures when I can just generate source text as I go. But then
some details are not known until later.
There are ways of injecting new text into an earlier spot, but they are
hairy. Since the target language has this feature, why not use it?
Either it has 'out-of-order' definitions or it hasn't. In this
case it has.
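For comparison, the alternative David Brown describes (collect the temporaries while generating the statements, then output the whole function in one lump) could look roughly like this when the target is C. It is a sketch under assumptions, not either poster's generator: struct gen, new_temp, emit_assign and finish_function are hypothetical names, and every temporary is assumed to be an int.

```c
#include <stdio.h>

enum { BODY_MAX = 512 };

/* Hypothetical generator state for one function being emitted. */
struct gen {
    char body[BODY_MAX]; /* statements generated so far */
    size_t len;          /* bytes used in body, excluding the NUL */
    int ntemps;          /* number of temporaries handed out */
};

/* Hand out a fresh temporary; its declaration is written later. */
static int new_temp(struct gen *g)
{
    return g->ntemps++;
}

/* Append "tN = expr;" to the buffered body; returns 0 on success. */
static int emit_assign(struct gen *g, int temp, const char *expr)
{
    int n = snprintf(g->body + g->len, BODY_MAX - g->len,
                     "    t%d = %s;\n", temp, expr);
    if (n < 0 || (size_t)n >= BODY_MAX - g->len) {
        g->body[g->len] = '\0';   /* drop the partial statement */
        return -1;                /* buffer full: caller must handle it */
    }
    g->len += (size_t)n;
    return 0;
}

/* Write the finished function: header, declarations, then the body. */
static void finish_function(const struct gen *g, const char *name, FILE *out)
{
    fprintf(out, "void %s(void) {\n", name);
    for (int i = 0; i < g->ntemps; i++)
        fprintf(out, "    int t%d;\n", i);
    fputs(g->body, out);
    fputs("}\n", out);
}
```

A caller would start from `struct gen g = {0};`, call new_temp and emit_assign while walking the input, and call finish_function once at the end. The cost is the fixed-size buffer, which is the kind of extra bookkeeping Bart says he wants to avoid.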
Much better, IMHO, is to use a language that lets you mix declarations
and statements as needed. I see declaring your local variables in a
list at the top of a function - or, far worse, at the bottom - as
archaic style.
I allow definitions anywhere, including mixed with executable statements.
But I prefer to keep executable code clean and free of type-related
clutter, and to have a separate summary of locals in one place.
This also allows easier transitioning between my static language, which
needs type annotations, and my dynamic one, which doesn't. Or porting to
other languages where details of my algorithm are relevant, but
language-specific types aren't.
I don't routinely declare locals at the end of a function. Especially if they are initialised, as that assignment takes place at that spot. So
then placement is important.
In article <10uepa5$2dpee$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 18/05/2026 02:22, Dan Cross wrote:
In article <10uc81u$1kd2r$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in C the language/compiler sees only functions or
variables that are further up in the code is a disaster
it is a disaster because it doesn't allow you to split code into N files,
each file having related functions and variables, without caring about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit that
it can be onerous.
Here is a very simple example of one 'module', which involves two source
files, 'a.h' and 'a.c':
[snip]
So, yes, I believe a decent module scheme means stuff like this is less
work than in C.
The example is you have two files, a.h and a.c; a.c has (say) a
bunch of `static`-qualified functions; obviously these are
file-scope, but have internal linkage. They can call each
other.
Now you split the file into two, and these functions need
external linkage and prototypes in a header file. Instead of
just creating a new `.c` file, you've got to change some stuff
in a header file (possibly newly created) as well.
I'll admit: you have a valid point.
Yes, this happens. I do it pretty frequently. It's a chore,
though I don't think it's quite as bad as you are making it out
to be; annoying, certainly, but probably not in my list of top
10, 15, or 20 annoyances about C.
So you admit you have a list too! Mine had 100 items.
Of course. I never said that I did not.
Indeed, no one I've seen engage with you seriously recently has
suggested that they think C is perfect; most have said there are
things they wish were different about C.
Dan Cross wrote:
In article <10uepa5$2dpee$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 18/05/2026 02:22, Dan Cross wrote:
In article <10uc81u$1kd2r$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 16/05/2026 23:03, fir wrote:
the fact that in C the language/compiler sees only functions or
variables that are further up in the code is a disaster
it is a disaster because it doesn't allow you to split code into N files,
each file having related functions and variables, without caring about the
global order of it
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit that
it can be onerous.
Here is a very simple example of one 'module', which involves two source
files, 'a.h' and 'a.c':
[snip]
So, yes, I believe a decent module scheme means stuff like this is less
work than in C.
The example is you have two files, a.h and a.c; a.c has (say) a
bunch of `static`-qualified functions; obviously these are
file-scope, but have internal linkage. They can call each
other.
Now you split the file into two, and these functions need
external linkage and prototypes in a header file. Instead of
just creating a new `.c` file, you've got to change some stuff
in a header file (possibly newly created) as well.
I'll admit: you have a valid point.
Yes, this happens. I do it pretty frequently. It's a chore,
though I don't think it's quite as bad as you are making it out
to be; annoying, certainly, but probably not in my list of top
10, 15, or 20 annoyances about C.
So you admit you have a list too! Mine had 100 items.
Of course. I never said that I did not.
Indeed, no one I've seen engage with you seriously recently has
suggested that they think C is perfect; most have said there are
things they wish were different about C.
in fact this "only see up" is at the top or near the top of that list.. what do you
find worse?
i never made a list but this "see only up" is definitely near the top
another very annoying thing is that long x = 'dhbsf00d' is not standardized
afaik (such 8-char tags would be extremely handy sometimes)
also annoying is that if {} has its own scope
also annoying is that you can't write foo( float x,y,z) {} (this is somewhat
disputable if it's allowed but x, y are ints, but i'm not sure if it works
in today's c)
also annoying is that foo() {} has return type int instead of void
(though this could eventually be disputable)
a->b instead of *a.b THAT is annoying (esp. i'm not sure whether a.*b is not just available?)
annoying are also things that are not in c like "int x, y = foo(2,3);"
but those don't belong on that list
On 18/05/2026 12:57, Bart wrote:
In my C compiler project, there are nearly 3000 top-level functions,
variables, types, enumerations, named constants and macros. That is,
declared at module scope, and including local and exported names.
(My non-C compiler generates that list which is picked up by my IDE.)
I really don't want the extra hassle of managing a dependency order.
In any case, there are also mutual out of order references.
So it is a necessity. Once you use such a language yourself, C will
seem archaic.
You are simply incorrect.
You have a personal style that is somewhat anarchic and you prefer to be able to jumble your declarations around. Okay, I accept that's what you like.
But you are wrong to think it is somehow an objectively better
way to organise code or an objectively good language feature. You are wrong to guess what I or anyone else might think or prefer.
You really need to stop assuming that you have found some kind of
nirvana of programming. All you have found is what you happen to like - nothing more than that.
It is for simplicity. I don't want an extra pass generating internal
data structures when I can just generate source text as I go. But then
some details are not known until later.
I don't care. This is a limitation of your own making, and there is no good technical reason for that limitation. So your "solution" to your
own invented problem does not provide any evidence or justification for
why declaring after usage is a useful feature in a language. Of course
you can make your own language the way you prefer, and use it the way
you want - but don't imagine that it is "better" in any sense, or that others would also like it. Similarly, I don't claim to know that others would /not/ like it - I can only answer for myself. And I can only
point to the strong bias of real-world imperative languages where
"declare before use" is the norm as an indication that people see that
as a better way for a language to work.
fir wrote:
Dan Cross wrote:
i never made a list but this "see only up" is definitely near the top
another very annoying thing is that long x = 'dhbsf00d' is not standardized
afaik (such 8-char tags would be extremely handy sometimes)
also annoying is that if {} has its own scope
also annoying is that you can't write foo( float x,y,z) {} (this is somewhat
disputable if it's allowed but x, y are ints, but i'm not sure if it works
in today's c)
also annoying is that foo() {} has return type int instead of void
(though this could eventually be disputable)
annoying is also that "'" doesn't work more like
if(a) a=3,c=3,print(2), x=10;
AND ye i forgot
a->b instead of *a.b THAT is annoying (esp. i'm not sure whether a.*b is not just available?)
annoying are also things that are not in c like "int x, y = foo(2,3);"
but those don't belong on that list
On 18/05/2026 12:57, David Brown wrote:
On 18/05/2026 12:57, Bart wrote:
In my C compiler project, there are nearly 3000 top-level functions,
variables, types, enumerations, named constants and macros. That is,
declared at module scope, and including local and exported names.
(My non-C compiler generates that list which is picked up by my IDE.)
I really don't want the extra hassle of managing a dependency order.
In any case, there are also mutual out of order references.
So it is a necessity. Once you use such a language yourself, C will
seem archaic.
You are simply incorrect.
You have a personal style that is somewhat anarchic and you prefer to
be able to jumble your declarations around. Okay, I accept that's
what you like.
But you are wrong to think it is somehow an objectively better way
to organise code or an objectively good language feature. You are
wrong to guess what I or anyone else might think or prefer.
You really need to stop assuming that you have found some kind of
nirvana of programming. All you have found is what you happen to like
- nothing more than that.
This is more fundamental. You seem to think all top-level entities, even
within one source file, need to be some sort of ordered set, which
depends on which interactions there may or may not be between them.
Interactions which can change as code is developed, which may require a
change in the order.
Nobody should really need to care about such things; there are plenty of
other matters to deal with.
You only think it is a 'good' thing because C requires it.
If you were using a language where order didn't matter, would you go to
the same trouble?
Maybe some people prefer their functions in alphabetical order!
So it's not a question of liking or not liking, but of doing away with an
irrelevant distraction.
In any case, when I look at open source C code, I do see loads of
forward function declarations, even for non-exported functions.
I guess not everyone is as meticulous as you. So for those people,
needing those declarations is a nuisance.
It is for simplicity. I don't want an extra pass generating internal
data structures when I can just generate source text as I go. But
then some details are not known until later.
I don't care. This is a limitation of your own making, and there is
no good technical reason for that limitation. So your "solution" to
your own invented problem does not provide any evidence or
justification for why declaring after usage is a useful feature in a
language. Of course you can make your own language the way you
prefer, and use it the way you want - but don't imagine that it is
"better" in any sense, or that others would also like it. Similarly,
I don't claim to know that others would /not/ like it - I can only
answer for myself. And I can only point to the strong bias of real-
world imperative languages where "declare before use" is the norm as
an indication that people see that as a better way for a language to
work.
also annoying is that you can't write foo( float x,y,z) {} (this is somewhat
disputable if it's allowed but x, y are ints, but i'm not sure if it
works in today's c)
also annoying is that foo() {} has return type int
instead of void
(though this could eventually be disputable)
Not sure what either of these mean.
AND ye i forgot
a->b instead of *a.b THAT is annoying (esp. i'm not sure whether a.*b is not just available?)
P->m is equivalent to (*P).m. However, (**Q).m cannot be reduced further than (*Q)->m; -> only fixes one level!
A post-fix '*' operator would mean being able to type P*.m and Q**.m.
You wouldn't really need -> here.
(My language uses post-fix '^' for derefs. However, the language allows
you to drop the explicit '^'; it will add in the derefs as needed.
That means those examples can be written as P.m and Q.m. This is very nice.
I was against allowing P->m in C to be written as P.m in C at one time,
but I've changed my mind; the cleaner code is too big an advantage.)
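In today's C the one-level and two-level cases look like this (a small sketch; struct point and the two accessor names are made up for illustration):

```c
struct point { int m; };

/* One level of indirection: P->m is shorthand for (*P).m. */
static int get_one(struct point *P)
{
    return P->m;            /* same as (*P).m */
}

/* Two levels: -> removes only the last dereference,
   so one explicit * (with parentheses) remains. */
static int get_two(struct point **Q)
{
    return (*Q)->m;         /* same as (**Q).m */
}
```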
annoying are also things that are not in c like "int x, y =
foo(2,3);" but those dont belong to that list
Huh? This is legal C.
Bart wrote:
also annoying is that you can't write foo( float x,y,z) {} (this is somewhat
disputable if it's allowed but x, y are ints, but i'm not sure if it
works in today's c)
also annoying is that foo() {} has return type int
instead of void
(though this could eventually be disputable)
Not sure what either of these mean.
i simply mean that repeating that float is annoying
Something(float x1, float y1, float x2, float y2 ) {}
should be
Something(float x1,y1,x2,y2 ) {}
esp. if those last 3 are not ints by default in present c - they were in
old c, but for present c i don't remember
checked, they are not
_WAVE_SAMPLES.C:62:18: error: unknown type name 'y'
void foo(float x,y,z) {}
in old c it would be disputable whether foo(a,b,c) {} having a b c as ints is
good or not, but if it doesn't work then those repeated floats are annoying
one of the two should work
void foo(float x,y,z) {}
either x y z are all floats or x is float and y z are ints..
the present compile error is bad
Huh. This is C:
uint64_t F(uint64_t s, uint64_t t, uint64_t u, uint64_t v) ...
This is my language:
func F(u64 s, t, u, v)u64 ...
That's awful.
Bart wrote:
AND ye i forgot
a->b instead of *a.b THAT is annoying (esp. i'm not sure whether a.*b is not just available?)
P->m is equivalent to (*P).m. However, (**Q).m cannot be reduced further than (*Q)->m; -> only fixes one level!
A post-fix '*' operator would mean being able to type P*.m and Q**.m.
You wouldn't really need -> here.
(My language uses post-fix '^' for derefs. However, the language
allows you to drop the explicit '^'; it will add in the derefs as needed.
That means those examples can be written as P.m and Q.m. This is very
nice.
I was against allowing P->m in C to be written as P.m in C at one
time, but I've changed my mind; the cleaner code is too big an
advantage.)
for the sake of compatibility, only to repair flaws, imo *a.b should work as
(*a).b and a.*b should work as *(a.b) i guess
or to skip pointers altogether
void foo(int*a)
{
a=2; print(a); //working as *a=2; print(*a);
}
i'm maybe not 100% sure but it seems i'm like 75% convinced that those
pointers in fact should be dropped
(i was writing on this back then but i don't fully remember my conclusions)
probably you should use labels a and p instead of the present *a, *p
and only at the place of definition should it be noted as a pointer
float x = 3.4;
float* xf = &x;
xf+=0.1;
the usage of *xf probably doesn't make much sense because, even when you
use pointers, you operate on values much more than on the pointers themselves
i know people often use
char *p = "akjsnjksnk";
while(*p++!=0) something;
but still imo use by value is more common
so it would be more like
while(p!=0) { &p++; something; }
assuming you still use int*p; as the definition of a pointer
and &p as the way to refer to its address value (which is not obvious because
one could eventually use *p to refer to the pointer value - just
the opposite of how it is now):
int *p; - definition
p - pointer value,
*p - value
where it could be
int *p; - definition
p - value
*p - pointer value
this "swap" is probably quite sane
Bart wrote:
annoying are also things that are not in c like "int x, y =
foo(2,3);" but those don't belong on that list
Huh? This is legal C.
yes i know, but it probably doesn't work the way i would want
you can return a structure like int2{int x, y;}
but as this structure is not builtin it commits a SIN/FLAW
of creating an unnecessary dependency (on the structure definition)
it also creates unnecessary names and so on
you know that ** unnecessary dependencies are bad **, and
** unnecessary things (objects) are bad **
(i would like to return two values and assign them to two separate variables)
As for the return type, I think that needs to be specified. As you
showed, it wasn't clear what it is, which is bad.
C is not high enough level to do type inference, but even if it was,
users would need to apply the same algorithms, and do the same analysis,
to figure out what some return type actually was. It's better to just
state it.
float x = 3.4;
float* xf = &x;
xf+=0.1;
Tricky one. If that was 1 on the RHS, then 'xf += 1' might increment the
pointer, or it might increment the float it points to. So ambiguous.
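In C as it stands the two meanings are spelled differently, which is what avoids that ambiguity. A short sketch (the function names are illustrative only):

```c
/* Advances the pointer itself: returns the address of the next float. */
static float *bump_pointer(float *xf)
{
    xf += 1;                /* pointer arithmetic: moves by one float */
    return xf;
}

/* Modifies the float that the pointer refers to. */
static void bump_value(float *xf)
{
    *xf += 1;               /* explicit dereference: changes the value */
}
```

Under fir's proposed swap the same two operations would need the spellings exchanged, so some marker would still be required in one of the two places.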
On 18/05/2026 16:04, fir wrote:
Bart wrote:
annoying are also things that are not in c like "int x, y =
foo(2,3);" but those don't belong on that list
Huh? This is legal C.
yes i know, but it probably doesn't work the way i would want
you can return a structure like int2{int x, y;}
but as this structure is not builtin it commits a SIN/FLAW
of creating an unnecessary dependency (on the structure definition)
it also creates unnecessary names and so on
you know that ** unnecessary dependencies are bad **, and
** unnecessary things (objects) are bad **
(i would like to return two values and assign them to two separate
variables)
OK, I used to have such a feature, but dropped it because it wasn't used enough; it wasn't worth the extra complication. But it worked like this:
func foo:int, int =
return (10, 20)
end
It was used like this:
int x, y
(x, y) := foo()
(You can't combine with the declaration.) It also worked like this:
x := foo() # discard 2nd value
foo() # discard both
The implementation didn't use a struct, more of a tuple, but it simply returned N values (limited to 3/4) in the first N registers.
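For comparison, the two usual ways of getting two results out of a C function are sketched below. The names int2, foo and foo_out follow fir's example loosely and are otherwise hypothetical, as is the choice of sum and product as the two results.

```c
/* Hypothetical pair type: this is the extra name fir objects to. */
struct int2 { int x, y; };

/* Variant 1: return both results packed in a small struct. */
static struct int2 foo(int a, int b)
{
    struct int2 r = { a + b, a * b };
    return r;
}

/* Variant 2: no struct; the second result goes through an out-parameter. */
static int foo_out(int a, int b, int *product)
{
    *product = a * b;
    return a + b;
}
```

Neither variant allows something like `int x, y = foo(2,3);` to initialise both variables; in C that line declares x uninitialised and initialises only y.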
Bart wrote:
float x = 3.4;
float* xf = &x;
xf+=0.1;
Tricky one. If that was 1 on the RHS, then 'xf += 1' might increment
the pointer, or it might increment the float it points to. So ambiguous.
the above may be flawed as it probably should be
float x = 3.4;
float* xf = x;
xf+=0.1;
here float* xf = x; copies the address but it also looks like it copies a value
this swapped * arithmetic is not ambiguous imo - it's just swapped
a means the value behind pointer a and *a means the address/pointer value as an address
this is probably better for all people who do not like to see so many *
in code, as they would be greatly reduced
and when you did see one you would see it in two cases
1) in definitions
2) in places where real address arithmetic is done
now you see it in places where values are used and sometimes you don't
see it when address arithmetic is done - which is sorta swapped relative to reality
it seems (because when you see * it should mark places where pointer arithmetic is really done imo)
in fact you could even get rid of it in definitions by using a reference
like in c++
int& c = a;
c is value
*c would mean pointer
i don't like c++ but it is an option
another option is to drop * and use & in both cases
int& c = a;
c is value
&c would mean pointer
this is maybe enough to say that
I. a reference (if it behaves how i assumed above) is a swapped pointer
II. swapped probably makes more sense than normal
(so eventually they really could add references to c)
David Brown <david.brown@hesbynett.no> wrote:
Sure. But in the C world, pre-C99 code is often written in a style with
local variables all declared at the top of the function - after C99, it
is common to declare them when you need them. So as a C programmer, I
see declaring local variables in one clump together as archaic.
I agree about style. But AFAICS even in very early C there is block
structure and one can put variable declarations in the middle of a
sequence, just at the cost of introducing extra blocks. So C99
looks nicer as there is no need for extra blocks, but pre-C99
one could already keep scopes tight and initialize variables in
declarations.
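A small sketch of the pre-C99 technique described here, with a made-up function: every declaration sits at the start of a block, and a bare block is opened purely so that a variable can be declared and initialised late.

```c
/* C89-compatible: declarations appear only at the start of a block. */
static int scaled_sum(const int *v, int n, int scale)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += v[i];

    {   /* extra block: lets 'result' be declared after the loop */
        int result = total * scale;
        return result;
    }
}
```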
On 18/05/2026 09:22, Janis Papanagnou wrote:
On 2026-05-18 08:56, David Brown wrote:
[...]
Much better, IMHO, is to use a language that lets you mix
declarations and statements as needed.
Indeed. But not "mixing" as a value per se, but to keep declarations
locally is a good thing, IMO.
Yes - it is not the mixing itself that is good, it is what it allows.
You get to keep your scopes small, you don't need to declare variables
until you have an initial value for them (who cares if reading an uninitialised variable is UB if you never have them!), and you can often declare your variables as const. That means it's easy to know what the variable holds because it is only set once, and never changed.
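As an illustration of that style (C99 or later), here is a hypothetical function in which each local is declared at the point where its value is first known, and const where it never changes afterwards:

```c
#include <stddef.h>

/* Mean absolute deviation of n values; returns 0.0 for an empty array. */
static double mean_abs_dev(const double *v, size_t n)
{
    if (n == 0)
        return 0.0;

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)      /* i exists only inside the loop */
        sum += v[i];
    const double mean = sum / (double)n; /* set once, never changed */

    double dev = 0.0;
    for (size_t i = 0; i < n; i++) {
        const double d = v[i] - mean;    /* fresh const on each iteration */
        dev += d < 0 ? -d : d;
    }
    return dev / (double)n;
}
```

No variable here is ever readable before it has a value, which is the point being made about uninitialised reads.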
I see declaring your local variables in a list at the top of a
function - or, far worse, at the bottom - as archaic style.
Well, "archaic" expresses a time-related qualification. But even in
earlier times we saw, depending on the actual programming language,
both styles existing.
Sure. But in the C world, pre-C99 code is often written in a style with local variables all declared at the top of the function - after C99, it
is common to declare them when you need them. So as a C programmer, I
see declaring local variables in one clump together as archaic.
In article <10uffq4$m4se$1@paganini.bofh.team>,
Waldek Hebisch <antispam@fricas.org> wrote:
David Brown <david.brown@hesbynett.no> wrote:
Sure. But in the C world, pre-C99 code is often written in a style with
local variables all declared at the top of the function - after C99, it
is common to declare them when you need them. So as a C programmer, I
see declaring local variables in one clump together as archaic.
I agree about style. But AFAICS even in very early C there is block
structure and one can put variable declarations in the middle of
sequence, just at the cost of introducing extra blocks. [snip]
Maybe. But on projects based on older variants of C, it was
common as a matter of local policy to mandate that locals were
declared at the top of a function; this enabled readers to get
a sense of how much stack space was required at a glance.
On 18/05/2026 09:35, David Brown wrote:
On 18/05/2026 09:22, Janis Papanagnou wrote:
On 2026-05-18 08:56, David Brown wrote:
[...]
Much better, IMHO, is to use a language that lets you mix
declarations and statements as needed.
Indeed. But not "mixing" as a value per se, but to keep declarations
locally is a good thing, IMO.
Yes - it is not the mixing itself that is good, it is what it allows.
You get to keep your scopes small, you don't need to declare variables
until you have an initial value for them (who cares if reading an
uninitialised variable is UB if you never have them!), and you can
often declare your variables as const. That means it's easy to know
what the variable holds because it is only set once, and never changed.
I see declaring your local variables in a list at the top of a
function - or, far worse, at the bottom - as archaic style.
Well, "archaic" expresses a time-related qualification. But even in
earlier times we saw, depending on the actual programming language,
both styles existing.
Sure. But in the C world, pre-C99 code is often written in a style
with local variables all declared at the top of the function - after
C99, it is common to declare them when you need them. So as a C
programmer, I see declaring local variables in one clump together as
archaic.
I hate dealing with code that just declares variables higgledy-piggledy
all over the place; it is so lazy. I'm not into block scopes either.
I /like/ to have them all in one place for easy reference. Then the code itself will have less clutter.
There are several places in C where it is common to declare stuff in one place:
* In parameter lists (at least we always have that, except when
parameter 'x' is used, and you see 'x' in the body, your practice means
I have to scan backwards from that point to see if it is parameter 'x',
or some other 'x' in a nested scope that shadows it).
* #includes
* Imported variables
* Imported functions
* Forward function declarations
Imagine how annoying it would be if any of those last were used inside a block within a function. Well, that's how annoying local declarations
mixed with code are.
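The shadowing case mentioned in the first bullet can be shown in a few lines (shadow_demo is a made-up name). GCC and Clang can report it when asked to with -Wshadow.

```c
/* The inner 'x' shadows the parameter 'x' inside the if-block. */
static int shadow_demo(int x)
{
    int seen = x;           /* refers to the parameter */
    if (x > 0) {
        int x = 100;        /* new object; hides the parameter here */
        seen += x;          /* adds 100, not the argument */
    }
    return seen + x;        /* the parameter again */
}
```

A reader who lands on `seen += x` has to look upwards to find out which x is meant, which is the scan Bart describes.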
On 18/05/2026 19:35, Bart wrote:
I hate dealing with code that just declares variables higgledy-
piggledy all over the place; it is so lazy. I'm not into block scopes
either.
Um, okay. You seem to define a lot of your programming life around
hating things a large proportion of programmers consider good practice.
But you are free to have your preferences.
People who mix declarations and statements in their C programming do so knowing full well that they could put their local variable declarations
all at the top of the function - but they feel that mixing them gives
the best quality code, with the least risk of mistakes and the clearest, most understandable and most maintainable results.
Or perhaps they only do so specifically to annoy you.
I appreciate that /you/ find declarations after statements to be
annoying - yet you sing the praises of your own language because it
allows declarations to occur anywhere mixed with statements, including
after their usage. I suppose it is annoying when someone else does it,
yet innovative and god's gift to the programming world when /you/ do it.
You know, all it takes is to admit that out-of-order definitions can
be a desirable and convenient language feature. It's one that other
languages have, it's not something that only exists in my language
(though I take it a bit further).
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
I /like/ to have them all in one place for easy reference. Then the
code itself will have less clutter.
I think you would have liked my language! That does have char
constants up to 'ABCDEFGH' (When I supported 128-bit types, up to 'ABCDEFGHIJKLMNOP'.)
C will support them up to 'int' width, which means 'ABCD' for 32-bit
int. In theory a C with a 64-bit int would allow longer constants, but
that's not going to happen on any common platforms.
As it is, even anything above 'A' is implementation-defined, unless
C23 has changed that.
[...] also annoying is that if {} has its own scope
annoying are also things that are not in c like "int x, y =
foo(2,3);" but those don't belong to that list
Huh? This is legal C.
On 18/05/2026 01:30, Lawrence D’Oliveiro wrote:
On Sun, 17 May 2026 14:43:28 -0000 (UTC), Lew Pitcher wrote:
Bart, in reality, a smart developer almost never has to "split one
source file up into two or more". Instead, they /plan/ for isolation
and encapsulation of the functional parts of their code, and
/intentionally/ develop multiple source files from the start. That's
the way professionals do it.
Meanwhile, back in the real world, programs evolve and need to adapt
to changes in the problems they were written to solve.
What starts out as one source file might later grow to two or more.
For instance, I start out with one program for performing the main
functions of my time-and-billing system, then later discover the need
for additional utilities driven by the same config settings: so this
necessitates breaking out some common routines into a new module that
is common to all the executables.
It’s called “refactoring”.
Yes, I think some people have been overstating their claims that they
rarely or never split a source file into two or more parts.
Certainly with good planning you reduce the likelihood of having to split code
later, but it's rare that you start a project with a clear enough specification and that never changes during the lifetime of the code.
But I also think Bart is wildly overstating his claims of how much of an effort it is. Usually it's just something you do, without needing much extra effort or risk - the thought and planning effort going in to how
you are doing your re-structure is the time-consuming part, not the mechanical copy-and-paste or adding a couple of extern declarations to a
new header.
I agree about style. But AFAICS even in very early C there is block
structure and one can put variable declarations in the middle of
sequence, just at the cost of introducing extra blocks. So C99
looks nicer as there is no need for extra blocks, but pre C99
already one could keep scopes tight and initialize variables in
declarations.
Certainly you can do that. But it quickly gets ugly and inconvenient if
you have a lot of extra blocks. It's fair enough if you have a block already, from a loop or conditional.
Maybe. But on projects based on older variants of C, it was
common as a matter of local policy to mandate that locals were
declared at the top of a function; this enabled readers to get
a sense of how much stack space was required at a glance.
In article <10ufgg4$2l9ge$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
...
I agree about style. But AFAICS even in very early C there is block
structure and one can put variable declarations in the middle of
sequence, just at the cost of introducing extra blocks. So C99
looks nicer as there is no need for extra blocks, but pre C99
already one could keep scopes tight and initialize variables in
declarations.
Certainly you can do that. But it quickly gets ugly and inconvenient if
you have a lot of extra blocks. It's fair enough if you have a block
already, from a loop or conditional.
I have always liked this feature of C - that you can create an "extra"
block anywhere and then have locals declared there, like this:
...
puts("This is regular code");
{
int i = 10;
...
}
puts("More code here");
...
This is useful if you are modifying code you didn't write and don't understand, but you just want to add some new feature (and declare some variables for your own use), without disturbing (i.e., conflicting with) anything that is already there. And, as noted, this works in all versions
of C, going back ...
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
[...] For some experimental code I take note of the lines in a single
file, take note of the modular setup, and say. okay, this is getting
pretty big. its time to separate these. Iiirc, MISRA has a limit on the number of lines in a function?
On 2026-05-19 01:31, Chris M. Thomasson wrote:
[...] For some experimental code I take note of the lines in a single
file, take note of the modular setup, and say. okay, this is getting
pretty big. its time to separate these. Iiirc, MISRA has a limit on
the number of lines in a function?
(I don't know about MISRA.)
Determining such numbers to get characteristics of source code can
certainly be useful.
Limiting these numbers (and rejecting such code), OTOH, appears to
me to be quite stupid.
On 19/05/2026 09:18, Janis Papanagnou wrote:
On 2026-05-19 01:31, Chris M. Thomasson wrote:
[...] For some experimental code I take note of the lines in a single
file, take note of the modular setup, and say. okay, this is getting
pretty big. its time to separate these. Iiirc, MISRA has a limit on
the number of lines in a function?
(I don't know about MISRA.)
Determining such numbers to get characteristics of source code can
certainly be useful.
Limiting these numbers (and rejecting such code), OTOH, appears to
me to be quite stupid.
I don't recall reading a rule in MISRA that limits the length of a function. But there are a number of different editions of the MISRA
coding standards.
Generally, it is a good idea to keep functions fairly short. But
sometimes it is clearer to have a longer function than to break it into
many small functions. And occasionally you come across something that
is best handled as one very large function, if the structure is regular
and clear (such as a large switch statement). Guidelines about sizes
can be a good idea, but fixed rules rarely are.
(MISRA compliance does allow for many of its rules and guidelines to be broken, if appropriately commented and justified.)
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is the
same one used earlier and/or the one used later.
You need to very carefully trace the declaration for each instance.
And also (this is what I hate), you have a bloody great declaration
right in the middle of what would have been your nice clean code.
There is also the odd way that scopes work:
int a; // a1
.... // a1 in scope
{
print(a); // a1 in scope
double a; // a2
print(a); // a2 in scope
}
.... // a1 in scope
They don't need to be delimited by {...}; here that block has a1 in
first half and a2 in second half, but both are 'a'.
Here:
....
double b = a, a;
....
'b' is initialised from a1, then it declares a2, however since both a1
and a2 are called 'a', it is a touch confusing!
On 19/05/2026 11:55, Bart wrote:
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is the
same one used earlier and/or the one used later.
You need to very carefully trace the declaration for each instance.
And also (this is what I hate), you have a bloody great declaration
right in the middle of what would have been your nice clean code.
There is also the odd way that scopes work:
int a; // a1
.... // a1 in scope
{
print(a); // a1 in scope
double a; // a2
print(a); // a2 in scope
}
.... // a1 in scope
They don't need to be delimited by {...}; here that block has a1 in
first half and a2 in second half, but both are 'a'.
Yes they do. How else does 'a' stop being a2 and revert to a1?
On 19/05/2026 12:15, Richard Harnden wrote:
On 19/05/2026 11:55, Bart wrote:
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is
the same one used earlier and/or the one used later.
You need to very carefully trace the declaration for each instance.
And also (this is what I hate), you have a bloody great declaration
right in the middle of what would have been your nice clean code.
There is also the odd way that scopes work:
int a; // a1
.... // a1 in scope
{
print(a); // a1 in scope
double a; // a2
print(a); // a2 in scope
}
.... // a1 in scope
They don't need to be delimited by {...}; here that block has a1 in
first half and a2 in second half, but both are 'a'.
Yes they do. How else does 'a' stop being a2 and revert to a1?
a1 turned into a2 without crossing '{'.
On 19/05/2026 13:48, Bart wrote:
On 19/05/2026 12:15, Richard Harnden wrote:
On 19/05/2026 11:55, Bart wrote:
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is
the same one used earlier and/or the one used later.
You need to very carefully trace the declaration for each instance.
And also (this is what I hate), you have a bloody great declaration
right in the middle of what would have been your nice clean code.
There is also the odd way that scopes work:
int a; // a1
.... // a1 in scope
{
print(a); // a1 in scope
double a; // a2
print(a); // a2 in scope
}
.... // a1 in scope
They don't need to be delimited by {...}; here that block has a1 in
first half and a2 in second half, but both are 'a'.
Yes they do. How else does 'a' stop being a2 and revert to a1?
a1 turned into a2 without crossing '{'.
It did so at the new declaration of "a". How could that possibly be surprising or "odd" ?
On 19/05/2026 13:23, David Brown wrote:
On 19/05/2026 13:48, Bart wrote:
On 19/05/2026 12:15, Richard Harnden wrote:
On 19/05/2026 11:55, Bart wrote:
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is
the same one used earlier and/or the one used later.
You need to very carefully trace the declaration for each instance.
And also (this is what I hate), you have a bloody great declaration
right in the middle of what would have been your nice clean code.
There is also the odd way that scopes work:
int a; // a1
.... // a1 in scope
{
print(a); // a1 in scope
double a; // a2
print(a); // a2 in scope
}
.... // a1 in scope
They don't need to be delimited by {...}; here that block has a1 in
first half and a2 in second half, but both are 'a'.
Yes they do. How else does 'a' stop being a2 and revert to a1?
a1 turned into a2 without crossing '{'.
It did so at the new declaration of "a". How could that possibly be
surprising or "odd" ?
How could it possibly not be?!
{...} is a block in C, and C has block scopes. Given that, people might expect a scope to be delimited by the enclosing spaces.
But names don't come into existence until declared, which can be in the middle of a block. Until then, the outer version of 'a' is still in scope.
The point at which that happens, in a busy section of code with lots of declarations, can be unclear, with overlaps:
double a;
.... // a2 b1 c1
double b;
.... // a2 b2 c1
double c;
.... // a2 b2 c2
These all shadow a, b, c in an outer scope, but at different times.
The simplest scoping rule in C is for labels: they have function-wide
scope, regardless of block nesting label.
That would almost be as simple as how my own scopes work, except C just
has to still be quirky:
double c; c: goto c;
(Labels of course have their own namespace, for some obscure reason. I'm sure 99% of C programmers don't know that.)
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is the
same one used earlier and/or the one used later.
On 19/05/2026 07:48, Kenny McCormack wrote:
In article <10ufgg4$2l9ge$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
...
I agree about style. But AFAICS even in very early C there is block
structure and one can put variable declarations in the middle of
sequence, just at the cost of introducing extra blocks. So C99
looks nicer as there is no need for extra blocks, but pre C99
already one could keep scopes tight and initialize variables in
declarations.
Certainly you can do that. But it quickly gets ugly and inconvenient if
you have a lot of extra blocks. It's fair enough if you have a block
already, from a loop or conditional.
I have always liked this feature of C - that you can create an "extra"
block anywhere and then have locals declared there, like this:
...
puts("This is regular code");
{
int i = 10;
...
}
puts("More code here");
...
This is useful if you are modifying code you didn't write and don't
understand, but you just want to add some new feature (and declare some
variables for your own use), without disturbing (i.e., conflicting with)
anything that is already there. And, as noted, this works in all versions
of C, going back ...
I agree it can sometimes be useful. It is not often needed - especially
in C99 onwards - but sometimes it can be the neatest way to structure code.
I think probably the most common situation where I create "extra" blocks
would be in switch cases. It lets you pretend that "switch" is more
structured than it really is.
[...]
Can you give me an example of a real-world programming language in which
the scope of a local variable begins at the start of the enclosing
block, rather than at its declaration / definition?
On 19/05/2026 12:55, Bart wrote:
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is the
same one used earlier and/or the one used later.
With small scopes, the definition of "x" is likely to be very close to
where you are using it - so there is much less to search.
No one is suggesting writing huge functions with multiple re-uses of the same variable name in different scopes, with lots of code between declarations and uses. You are imagining problems that don't exist in
most well-written code.
And you are pretending that potential issues with finding the right declaration applies specially when declarations are small and local in
scope - ignoring that without small scope declarations you often have to look further to find declarations, and that variables are not
necessarily declared in the function at all.
On 2026-05-19 15:40, David Brown wrote:
[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
If I'm asking Google it tells me that Python and Javascript (when
using 'var') would be two prominent examples.
Janis
On 19/05/2026 16:41, Janis Papanagnou wrote:
On 2026-05-19 15:40, David Brown wrote:
[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
If I'm asking Google it tells me that Python and Javascript (when
using 'var') would be two prominent examples.
I don't know about Javascript, but it's wrong about Python.
def foo() :
    print(a)
    a = 1
    print(a)
"a" is not in scope until the line "a = 1". Try it and see.
[snip]
On Sun, 17 May 2026 06:48:06 -0700, Tim Rentsch wrote:...
Bart <bc@freeuk.com> writes:
On 17/05/2026 02:21, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
I mentioned something like this a week ago, suggesting that in C it
was harder work than necessary to split one source file up into two or
more.
Bart, in reality, a smart developer almost never has to "split one source file up into two or more". Instead, they /plan/ for isolation and encapsulation
of the functional parts of their code, and /intentionally/ develop
multiple source files from the start. That's the way professionals do it.
And you offered no evidence for your claim, not even telling us
that you had tried it and found it difficult.
Everyone here will know what is involved. But nobody wants to admit
that it can be onerous.
It could be onerous. The point is, in actual practice it almost
never is onerous.
On 2026-05-19 17:07, David Brown wrote:
On 19/05/2026 16:41, Janis Papanagnou wrote:
On 2026-05-19 15:40, David Brown wrote:
[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
If I'm asking Google it tells me that Python and Javascript (when
using 'var') would be two prominent examples.
I don't know about Javascript,
I did JS programming but never tried to put the declaration behind
the point of use of the declared items.
I mean that is the whole point; even if some language (for whatever
reason) would allow that, why should a sane programmer make use of
such obfuscation!
but it's wrong about Python.
I don't know the details of and also don't program in python.
def foo() :
print(a)
a = 1
print(a)
"a" is not in scope until the line "a = 1". Try it and see.
$ python
ksh93: python: not found
I trust your words. :-)
[snip]
(I was just quoting search results. Never mind.)
On 19/05/2026 15:08, Bart wrote:
{...} is a block in C, and C has block scopes. Given that, people
might expect a scope to be delimited by the enclosing spaces.
People expect scopes to begin with a declaration, and end at the end of
the block (or file, for file-scope identifiers). This is the normal handling of scoping in most imperative programming languages. There are some languages (such as Python) where scopes begin when a variable is
first assigned, and continue to the end of the function rather than the block. And declarative programming languages may do things differently
- they do many things differently.
Languages may differ in the details - given "int a = foo(a);", for
example, the scope of the new "a" in C begins after "int a" - in some languages, it may not begin until the end of the semicolon. But that's
a small detail, rarely relevant in real code. (For the record, I'd have preferred if the scope of variables in C did not begin until the end of
the declaration / definition.)
Can you give me an example of a real-world programming language in which
the scope of a local variable begins at the start of the enclosing
block, rather than at its declaration / definition?
Perhaps you are mixing up "scope" and "lifetime" ? These are not the
same things.
But names don't come into existence until declared, which can be in
the middle of a block. Until then, the outer version of 'a' is still
in scope.
Yes. Scope is all about the names.
The point at which that happens, in a busy section of code with lots of
declarations, can be unclear, with overlaps:
double a;
.... // a2 b1 c1
double b;
.... // a2 b2 c1
double c;
.... // a2 b2 c2
These all shadow a, b, c in an outer scope, but at different times.
Yes. (Although this is again a strawman argument - people don't
normally write code that shadows outer scope variables.)
The simplest scoping rule in C is for labels: they have function-wide
scope, regardless of block nesting label.
The rule for C labels exists because you have to be able to jump
backwards and forwards for "goto" to be of much use. In C programming, labels (excluding case labels) are rarely used except sometimes for
error handling. I don't think I've written "goto" in C more than a
couple of times in my programming career.
C labels are unstructured. This is not a good thing in a programming language - it is an unfortunate necessary evil.
On 19/05/2026 17:23, Janis Papanagnou wrote:
$ python
ksh93: python: not found
Way OT - but what kind of OS are you running that does not have Python installed? Even if you don't program in Python, there must surely be
some Python programs on your system. (Of the 3900 files in my /usr/bin, some 211 of them are in Python. I don't program in Perl, but I have
perl on my system for the 169 Perl programs in my /usr/bin.)
Maybe you don't have it as "python", but as "python3" (or "python2") ?
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
I /like/ to have them all in one place for easy reference. Then the
code itself will have less clutter.
On 2026-05-19 17:58, David Brown wrote:
On 19/05/2026 17:23, Janis Papanagnou wrote:
$ python
ksh93: python: not found
Way OT - but what kind of OS are you running that does not have Python
installed? Even if you don't program in Python, there must surely be
some Python programs on your system. (Of the 3900 files in my /usr/
bin, some 211 of them are in Python. I don't program in Perl, but I
have perl on my system for the 169 Perl programs in my /usr/bin.)
Maybe you don't have it as "python", but as "python3" (or "python2") ?
Yes, I saw that 'man python' works, and there's also '/usr/bin/python3' existing.
I'll certainly never use it!
$ python3
Python 3.12.3 (main, Mar 23 2026, 19:04:32) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> help
Type help() for interactive help, or help(object) for help about object.
>>> quit
Use quit() or Ctrl-D (i.e. EOF) to exit
Really?
It first suggests to type "help" and then reports that I should type
"help()" instead?
It prints that I should type "quit()" instead of "quit", or Ctrl-D,
but it will not _just do_ that quit.
Is that contemporary software interface ergonomy? Sensible design?
Useful interactive information? - I mean, we're obviously already
in the 3rd major release, and there's still such quality exhibited.
(I'm really getting angry if I encounter such interfaces.)
Yes, way OT, as you said, and nothing I'd like to dispute about.
Janis
On 19/05/2026 16:41, Janis Papanagnou wrote:
On 2026-05-19 15:40, David Brown wrote:
[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
If I'm asking Google it tells me that Python and Javascript (when
using 'var') would be two prominent examples.
Janis
I don't know about Javascript, but it's wrong about Python.
def foo() :
    print(a)
    a = 1
    print(a)
"a" is not in scope until the line "a = 1". Try it and see.
It is possible that "a" blocks access to a global "a" before it is
declared - since programmers rarely intentionally shadow other
variables, and even more rarely try to access variables before they are defined, it's not a situation that is going to turn up in real code.
Such subtle details are best checked in a Python newsgroup.
Note also that Python doesn't really have a concept of declaring a variable. You initialise a reference to an object, you don't declare a variable as such. (So "a = 1" here creates a constant integer object
with the value 1 on the heap, or increases a reference count to an
existing such object, and creates "a" as an identifier that references
that object. "a" is not a variable in the sense used in compiled imperative languages.)
On 19/05/2026 14:40, David Brown wrote:
On 19/05/2026 15:08, Bart wrote:
{...} is a block in C, and C has block scopes. Given that, people
might expect a scope to be delimited by the enclosing spaces.
People expect scopes to begin with a declaration, and end at the end
of the block (or file, for file-scope identifiers). This is the
normal handling of scoping in most imperative programming languages.
There are some languages (such as Python) where scopes begin when a
variable is first assigned, and continue to the end of the function
rather than the block. And declarative programming languages may do
things differently - they do many things differently.
Languages may differ in the details - given "int a = foo(a);", for
example, the scope of the new "a" in C begins after "int a" - in some
languages, it may not begin until the end of the semicolon. But
that's a small detail, rarely relevant in real code. (For the record,
I'd have preferred if the scope of variables in C did not begin until
the end of the declaration / definition.)
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
In Algol 68:
BEGIN
print((a, newline));
INT a=777;
print((a, newline));
END
This prints 0 then 777. If I comment out the declaration, it says 'a'
has not been declared.
Here, 'a' is a named constant; if I use a:=777 instead, which defines a variable, it says the first 'a' has not been initialised (I guess 'INT
a' creates a reference to a local), a different error from not having
the INT line at all.
Now, you said real-world not mainstream, so all mine currently do that. Anything declared in a function body has function-wide scope, including
to any depth of nested block.
Outside of a function, it has module-wide scope. If also declared
'global', it has sub-program-wide scope (although other modules can
shadow it).
Here, obviously, exactly where in a module a name has been defined, is irrelevant.
In C, the same kind of thing happens when you introduce a forward
declaration of a function: you move its lexical scope to near the top.
In this case, it wouldn't be shadowing anything; there can't be anything
at an outer scope. So there wouldn't be a problem in scope starting from
the top of the file.
Perhaps you are mixing up "scope" and "lifetime" ? These are not the
same things.
I'm talking about static, lexical code, which can be determined at compile-time.
C labels are unstructured. This is not a good thing in a programming
language - it is an unfortunate necessary evil.
A good thing in this case; imagine the same L23 label occurring half a
dozen times in the same function, even if labels had block scopes.
On 19/05/2026 18:31, Janis Papanagnou wrote:
[ python3 interface stuff ]
It does not sound unreasonable to me. It's a programming language.
[...]
On 19/05/2026 16:07, David Brown wrote:
On 19/05/2026 16:41, Janis Papanagnou wrote:
On 2026-05-19 15:40, David Brown wrote:
[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
If I'm asking Google it tells me that Python and Javascript (when
using 'var') would be two prominent examples.
Janis
I don't know about Javascript, but it's wrong about Python.
def foo() :
print(a)
a = 1
print(a)
"a" is not in scope until the line "a = 1". Try it and see.
It is possible that "a" blocks access to a global "a" before it is
declared - since programmers rarely intentionally shadow other
variables, and even more rarely try to access variables before they
are defined, it's not a situation that is going to turn up in real
code. Such subtle details are best checked in a Python newsgroup.
Note also that Python doesn't really have a concept of declaring a
variable. You initialise a reference to an object, you don't declare
a variable as such. (So "a = 1" here creates a constant integer
object with the value 1 on the heap, or increases a reference count to
an existing such object, and creates "a" as an identifier that
references that object. "a" is not a variable in the sense used in
compiled imperative languages.)
In Lua:
function fred()
    print(a)
    local a
    a=1
    print(a)
end
a=999
fred()
print(a)
This outputs 999 1 999. So the global a is visible from that first
'print' line; its scope must start earlier than its initialisation.
However these are both dynamic languages where you might find that that static-looking 'function' is really something executed at runtime. If I
try the same in mine I get Void 1 999.
On 19/05/2026 18:31, Bart wrote:
On 19/05/2026 14:40, David Brown wrote:
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
In Algol 68:
BEGIN
print((a, newline));
INT a=777;
print((a, newline));
END
This prints 0 then 777. If I comment out the declaration, it says 'a'
has not been declared.
Here, 'a' is a named constant; if I use a:=777 instead, which defines
a variable, it says the first 'a' has not been initialised (I guess
'INT a' creates a reference to a local), a different error from not
having the INT line at all.
OK. So there was a real language that worked that way, half a century ago.
Now, you said real-world not mainstream, so all mine currently do
that. Anything declared in a function body has function-wide scope,
including to any depth of nested block.
I don't count your language as "real world". But if you prefer the term "mainstream", I'll go along with that, and then skip your descriptions
of it.
In C, the same kind of thing happens when you introduce a forward
declaration of a function: you move its lexical scope to near the top.
Yes, C is exactly the same but for the minor detail of being completely different.
Perhaps you are mixing up "scope" and "lifetime" ? These are not the
same things.
I'm talking about static, lexical [scope], which can be determined at
compile-time.
Again, I think you are mixing up "scope" and "lifetime".
Can you
demonstrate that you understand the difference? It would be
particularly nice if you could show you know the difference for local variables in C.
Imagine, instead, that most people who program know how to write
sensible code.
Not everyone does, but most people do. Worrying about
how badly people could misuse any given feature is not helpful,
and of
course there are endless ways to write terrible code using your language
or any other language.
If I told you I like the green colour of my car, you'd tell me that your
car is red, and red is the only good colour for a car. After all, if
green aliens hid my car in a field of broccoli and surrounded it with
green frogs it would be hard to find, while your red car would be
clearly visible.
On 19/05/2026 17:48, David Brown wrote:
On 19/05/2026 18:31, Bart wrote:
On 19/05/2026 14:40, David Brown wrote:
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
In Algol 68:
BEGIN
print((a, newline));
INT a=777;
print((a, newline));
END
This prints 0 then 777. If I comment out the declaration, it says 'a'
has not been declared.
Here, 'a' is a named constant; if I use a:=777 instead, which defines
a variable, it says the first 'a' has not been initialised (I guess
'INT a' creates a reference to a local), a different error from not
having the INT line at all.
OK. So there was a real language that worked that way, half a century
ago.
While I don't care for it, a LOT of thought went into its design, by
some very clever people.
Now, you said real-world not mainstream, so all mine currently do
that. Anything declared in a function body has function-wide scope,
including to any depth of nested block.
I don't count your language as "real world". But if you prefer the
term "mainstream", I'll go along with that, and then skip your
descriptions of it.
You would dismiss potentially good ideas for petty reasons?
There is a fuzzy area in programming languages when you declare things in
the middle of a block.
You declare X in the middle, and its scope lasts until the end of the
block. But what happens between the start of the block and its declaration?
If this new X is not visible, then what happens if you try and use 'X'?
Apparently in C, you just get whatever outer X happens to be visible
from an outer scope.
There are other choices, including hiding outer Xs
but not allowing you to access this new X either.
If you want the current behaviour, you can create a new {} block, then
there are fewer surprises.
In C, the same kind of thing happens when you introduce a forward
declaration of a function: you move its lexical scope to near the top.
Yes, C is exactly the same but for the minor detail of being
completely different.
So, having functions, variables, types and macro generally declared near
the top of a source file, has no similarity with declaring local
variables near the top of a function?
OK..
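For reference, a minimal sketch of the forward declaration being argued about, assuming nothing beyond standard C. The names is_even and is_odd are made up for illustration. The prototype puts is_odd in scope from that line to the end of the file, so is_even can call it before its definition appears:

```c
/* Forward declaration: "is_odd" is in scope from here to the end of
   the translation unit, although its body comes later. */
int is_odd(unsigned n);

int is_even(unsigned n)
{
    return n == 0 ? 1 : is_odd(n - 1);   /* legal thanks to the prototype */
}

int is_odd(unsigned n)
{
    return n == 0 ? 0 : is_even(n - 1);
}
```

Note that this changes where the name becomes visible, not where the function lives; a local variable declared mid-block has no equivalent "declare earlier, define later" form.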
Perhaps you are mixing up "scope" and "lifetime" ? These are not
the same things.
I'm talking about static, lexical [scope], which can be determined at
compile-time.
Again, I think you are mixing up "scope" and "lifetime".
I think you are. I said this can be determined at /compile-time/. It is purely to do with visibility from a particular spot in source code.
Can you demonstrate that you understand the difference? It would be
particularly nice if you could show you know the difference for local
variables in C.
Why don't you explain the difference?
Imagine, instead, that most people who program know how to write
sensible code.
I suspect I've seen more reams of nightmare C source code than you have.
Not everyone does, but most people do. Worrying about how badly
people could misuse any given feature is not helpful,
People like to push languages to the limits. They like to show off and
be clever. They will abuse any feature.
and of course there are endless ways to write terrible code using your
language or any other language.
In my language they can't have 64 variations of the same 'abcdef'
identifier IN THE SAME SCOPE by varying case. They can't have unlimited
unrelated instances of even the exact same 'abcdef' identifier IN THE
SAME FUNCTION, thanks to block scope.
They can't have one block sharing two instances of the same 'abcdef'
name with the second being declared part-way through.
There's only one 'abcdef' identifier per function, whatever case mix is used, and that's your lot.
Sure, they can /deliberately/ write bad code in my language, but they
have to work much harder than in C, where you can do it without trying.
If I told you I like the green colour of my car, you'd tell me that
your car is red, and red is the only good colour for a car. After
all, if green aliens hid my car in a field of broccoli and surrounded
it with green frogs it would be hard to find, while your red car would
be clearly visible.
And yet, my now white car (it used to be red), is next to impossible to
spot in a car-park where 50% of cars are white. I have to identify it
from the number-plate.
So if I told you that, I'd have a valid point.
(I still don't know why C has a separate namespace for labels.)
On 19/05/2026 19:47, Bart wrote:
(I still don't know why C has a separate namespace for labels.)
There's a lot you don't know about C.
But it's strange how you wear your ignorance like a merit badge
while simultaneously claiming to be an expert in language design
who has
implemented a C compiler.)
On 2026-05-19 10:00, David Brown wrote:
On 19/05/2026 09:18, Janis Papanagnou wrote:
On 2026-05-19 01:31, Chris M. Thomasson wrote:
[...] For some experimental code I take note of the lines in a
single file, take note of the modular setup, and say. okay, this is
getting pretty big. its time to separate these. Iiirc, MISRA has a
limit on the number of lines in a function?
(I don't know about MISRA.)
Determining such numbers to get characteristics of source code can
certainly be useful.
Limiting these numbers (and rejecting such code), OTOH, appears to
me to be quite stupid.
I don't recall reading a rule in MISRA that limits the length of a
function. But there are a number of different editions of the MISRA
coding standards.
Generally, it is a good idea to keep functions fairly short.
Actually, the first time I've seen such a formulation was from
the context of "C". (Maybe even from K&R ? - Don't recall.)
I'm coming more from the camp saying that plain numbers of lines
are per se not an asset, not considering that a primary criterion
for programming. (Large functions may or may not be an indication
for bad style or bad software design, as short functions may be or
not, likewise.)
There's a lot more dimensions to organize code (classes, modules)
and other factors like how broad an interface is, depth of calls,
structures, function signatures, etc. - you can continue that list
ad nauseam. - There's tools that may provide hints based on such
criteria.
But sometimes it is clearer to have a longer function than to break it
into many small functions. And occasionally you come across something
that is best handled as one very large function, if the structure is
regular and clear (such as a large switch statement). Guidelines
about sizes can be a good idea, but fixed rules rarely are.
(MISRA compliance does allow for many of its rules and guidelines to
be broken, if appropriately commented and justified.)
We've invented such guidelines as well, back then. Making sensible
rules, forcing some, suggesting others, and demand to comment and
justify any deviations. - Yeah, it's nice to see that established
procedures and practices mostly seem to be handled sensibly.
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions there
might be, because you don't know whether the 'x' appearing here is the
same one used earlier and/or the one used later.
You need to very carefully trace the declaration for each instance.
And also (this is what I hate), you have a bloody great declaration
right in the middle of what would have been your nice clean code.
There is also the odd way that scopes work:
int a; // a1
.... // a1 in scope
{
print(a); // a1 in scope
double a; // a2
print(a); // a2 in scope
}
.... // a1 in scope
They don't need to be delimited by {...}; here that block has a1 in
first half and a2 in second half, but both are 'a'.
Here:
....
double b = a, a;
....
'b' is initialised from a1, then it declares a2, however since both a1
and a2 are called 'a', it is a touch confusing!
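A minimal sketch of the two cases above, assuming a C99 or later compiler (declarations after statements). The function name sizes_in_block and its out-parameters are hypothetical; sizeof is used only as a way to observe which 'a' a given line refers to:

```c
#include <stddef.h>

/* Reports which "a" is visible at three points inside one inner block. */
void sizes_in_block(size_t *before, size_t *after, double *b_out)
{
    int a = 7;                  /* a1 */
    {
        *before = sizeof a;     /* a1 still in scope: sizeof(int) */
        double b = a, a = 0.5;  /* b is initialised from a1, then a2 is declared */
        *after = sizeof a;      /* a2 now in scope: sizeof(double) */
        *b_out = b + a;         /* 7.0 (from a1) + 0.5 (a2) */
    }
}
```

So the inner block does see a1 in its first half and a2 in its second half, and b picks up the value of a1.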
On 19/05/2026 15:08, Bart wrote:
On 19/05/2026 13:23, David Brown wrote:
On 19/05/2026 13:48, Bart wrote:
On 19/05/2026 12:15, Richard Harnden wrote:
On 19/05/2026 11:55, Bart wrote:
On 19/05/2026 07:58, Lawrence D’Oliveiro wrote:
On Mon, 18 May 2026 18:35:04 +0100, Bart wrote:
I hate dealing with code that just declares variables
higgledy-piggledy all over the place; it is so lazy. I'm not into
block scopes either.
Tight control over scope reduces the chances of unexpected
interactions across different parts of the code.
At the same time, it makes it harder to see what interactions
there might be, because you don't know whether the 'x' appearing
here is the same one used earlier and/or the one used later.
You need to very carefully trace the declaration for each instance.
And also (this is what I hate), you have a bloody great
declaration right in the middle of what would have been your nice
clean code.
There is also the odd way that scopes work:
int a; // a1
.... // a1 in scope
{
print(a); // a1 in scope
double a; // a2
print(a); // a2 in scope
}
.... // a1 in scope
They don't need to be delimited by {...}; here that block has a1
in first half and a2 in second half, but both are 'a'.
Yes they do. How else does 'a' stop being a2 and revert to a1?
a1 turned into a2 without crossing '{'.
It did so at the new declaration of "a". How could that possibly be
surprising or "odd" ?
How could it possibly not be?!
{...} is a block in C, and C has block scopes. Given that, people
might expect a scope to be delimited by the enclosing braces.
People expect scopes to begin with a declaration, and end at the end of
the block (or file, for file-scope identifiers). This is the normal handling of scoping in most imperative programming languages. There are some languages (such as Python) where scopes begin when a variable is
first assigned, and continue to the end of the function rather than the block. And declarative programming languages may do things differently
- they do many things differently.
Languages may differ in the details - given "int a = foo(a);", for
example, the scope of the new "a" in C begins after "int a" - in some languages, it may not begin until the end of the semicolon. But that's
a small detail, rarely relevant in real code. (For the record, I'd have preferred if the scope of variables in C did not begin until the end of
the declaration / definition.)
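A small sketch of that detail, assuming standard C; the function name inner_size is made up. Because the new "a" is already in scope inside its own initialiser, sizeof a there measures the inner int rather than the outer double. sizeof does not evaluate its operand, so no uninitialised value is read:

```c
#include <stddef.h>

size_t inner_size(void)
{
    double a = 0.0;             /* outer a */
    size_t n;
    {
        /* The scope of the new "a" begins right after its declarator,
           so this "a" is the int being declared, not the outer double. */
        int a = (int)sizeof a;
        n = (size_t)a;
    }
    (void)a;                    /* outer a is visible again here */
    return n;
}
```

Writing "int a = a;" in the same position would compile too, but it would read the new, uninitialised "a" rather than the outer one.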
Can you give me an example of a real-world programming language in which
the scope of a local variable begins at the start of the enclosing
block, rather than at its declaration / definition?
Perhaps you are mixing up "scope" and "lifetime" ? These are not the
same things.
But names don't come into existence until declared, which can be in
the middle of a block. Until then, the outer version of 'a' is still
in scope.
Yes. Scope is all about the names.
The point at which that happens, in a busy section of code with lots of
declarations, can be unclear, with overlaps:
double a;
.... // a2 b1 c1
double b;
.... // a2 b2 c1
double c;
.... // a2 b2 c2
These all shadow a, b, c in an outer scope, but at different times.
Yes. (Although this is again a strawman argument - people don't
normally write code that shadows outer scope variables.)
The simplest scoping rule in C is for labels: they have function-wide
scope, regardless of block nesting level.
The rule for C labels exists because you have to be able to jump
backwards and forwards for "goto" to be of much use. In C programming, labels (excluding case labels) are rarely used except sometimes for
error handling. I don't think I've written "goto" in C more than a
couple of times in my programming career.
C labels are unstructured. This is not a good thing in a programming language - it is an unfortunate necessary evil. Other identifiers have clearer and simpler scoping - scope starts with the declaration, and
ends with the end of the enclosing block.
That would almost be as simple as how my own scopes work, except C
just has to still be quirky:
double c; c: goto c;
(Labels of course have their own namespace, for some obscure reason.
I'm sure 99% of C programmers don't know that.)
Since we are pulling numbers out of the air, 99% of C programmers don't care. If any C programmer writes code where this is important, because they use the same identifier for a variable and a goto label, they
should not be programming.
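For anyone curious what the separate name space permits, here is a terminating variant of the "double c; c: goto c;" snippet. It is a sketch only, with a hypothetical function name count_to, and not a recommended style:

```c
int count_to(int limit)
{
    int again = 0;      /* ordinary identifier "again" */

again:                  /* label "again": separate name space, no clash */
    if (again < limit) {
        again++;        /* used as a variable */
        goto again;     /* after "goto" the name is looked up as a label */
    }
    return again;
}
```

The label is also usable before its definition and from any nested block in the function, which ordinary identifiers are not.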
On 5/19/2026 6:40 AM, David Brown wrote:[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
Perhaps you are mixing up "scope" and "lifetime" ? These are not
the same things.
Well, the far away land of C++, use a scope to define a lifetime?
{
foo x(whatever);
{
froboz m(zing);
x.bar(m);
x.baz(m);
}
x.compute();
}
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 5/19/2026 6:40 AM, David Brown wrote:[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
Perhaps you are mixing up "scope" and "lifetime" ? These are not
the same things.
Well, the far away land of C++, use a scope to define a lifetime?
{
foo x(whatever);
{
froboz m(zing);
x.bar(m);
x.baz(m);
}
x.compute();
}
Chris, what exactly is your point?
The same scope and lifetime rules apply in C as in C++.
A difference
is that an object reaching the end of its lifetime in C++ can have
visible effects, whereas in C it generally just means that its memory
can be deallocated (though the actual deallocation can be delayed).
There are also different lifetime rules for heap-allocated objects,
in both C and C++. (Scope doesn't apply, since heap-allocated
objects don't have names.)
David is entirely correct that scope and lifetime are two different
things. The scope of an identifier is the region of program text
in which it's visible. The lifetime of an object is the time during
program execution when the object logically exists. Did you intend
to express disagreement?
Yes, scope and lifetime are related. For an object defined within
a block, its identifier's scope ends at the "}" of the block,
and its lifetime ends when execution logically reaches that same "}".
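A minimal sketch of the distinction, assuming standard C. All names (saved, bump, scope_vs_lifetime, next_id) are hypothetical. In the first function the object outlives a temporary departure from the identifier's scope; in the second, a static local has function-local scope but a lifetime spanning the whole program:

```c
#include <stddef.h>

static int *saved;              /* pointer used to reach x from outside its scope */

static void bump(void)
{
    /* The identifier "x" is not in scope here, but the object it names
       is still alive because the caller's block has not ended. */
    if (saved != NULL)
        ++*saved;
}

int scope_vs_lifetime(void)
{
    int result;
    {
        int x = 41;
        saved = &x;
        bump();                 /* execution leaves x's scope, not its lifetime */
        result = x;
        saved = NULL;           /* do not keep the pointer past x's lifetime */
    }
    return result;
}

int *next_id(void)
{
    static int id;              /* scope: this function; lifetime: whole program */
    ++id;
    return &id;                 /* valid to use after the function returns */
}
```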
[...]
Chris, you have yet again refused to trim a large amount of quoted
text when posting a followup. I see that you use Thunderbird, which
as far as I know doesn't make it particularly difficult to do that.
Why do you do this?
If you do not start trimming irrelevant quoted text, expect this
to be the last time I interact with you.
On 5/19/2026 1:34 AM, Janis Papanagnou wrote:
On 2026-05-19 10:00, David Brown wrote:
On 19/05/2026 09:18, Janis Papanagnou wrote:
On 2026-05-19 01:31, Chris M. Thomasson wrote:
[...] For some experimental code I take note of the lines in a
single file, take note of the modular setup, and say. okay, this is
getting pretty big. its time to separate these. Iiirc, MISRA has a
limit on the number of lines in a function?
(I don't know about MISRA.)
Determining such numbers to get characteristics of source code can
certainly be useful.
Limiting these numbers (and rejecting such code), OTOH, appears to
me to be quite stupid.
I don't recall reading a rule in MISRA that limits the length of a
function. But there are a number of different editions of the MISRA
coding standards.
Generally, it is a good idea to keep functions fairly short.
Actually, the first time I've seen such a formulation was from
the context of "C". (Maybe even from K&R ? - Don't recall.)
I'm coming more from the camp saying that plain numbers of lines
are per se not an asset, not considering that a primary criterion
for programming. (Large functions may or may not be an indication
for bad style or bad software design, as short functions may be or
not, likewise.)
There's a lot more dimensions to organize code (classes, modules)
and other factors like how broad an interface is, depth of calls,
structures, function signatures, etc. - you can continue that list
ad nauseam. - There's tools that may provide hints based on such
criteria.
But sometimes it is clearer to have a longer function than to break
it into many small functions. And occasionally you come across
something that is best handled as one very large function, if the
structure is regular and clear (such as a large switch statement).
Guidelines about sizes can be a good idea, but fixed rules rarely are.
(MISRA compliance does allow for many of its rules and guidelines to
be broken, if appropriately commented and justified.)
We've invented such guidelines as well, back then. Making sensible
rules, forcing some, suggesting others, and demand to comment and
justify any deviations. - Yeah, it's nice to see that established
procedures and practices mostly seem to be handled sensibly.
it has some rather interesting guidelines:
https://www.stroustrup.com/JSF-AV-rules.pdf
Fwiw, think of a single file getting big, you have your:
[snip some list]
etc...
all in one file! BUT! This file is being separated in and of itself as a
modular thing anyway. So, now it's time to make all of these systems in
their own files. Fair enough?
On 19/05/2026 14:40, David Brown wrote:
[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
In Algol 68:
BEGIN
print((a, newline));
INT a=777;
print((a, newline));
END
This prints 0 then 777. If I comment out the declaration, it says 'a'
has not been declared.
Here, 'a' is a named constant; if I use a:=777 instead, which defines a variable, it says the first 'a' has not been initialised (I guess 'INT
a' creates a reference to a local), a different error from not having
the INT line at all.
[...]
On 19/05/2026 18:31, Bart wrote:
[...]
OK. So there was a real language that worked that way, half a century ago.
On 19/05/2026 20:58, David Brown wrote:
On 19/05/2026 19:47, Bart wrote:
(I still don't know why C has a separate namespace for labels.)
There's a lot you don't know about C.
It sounds like you don't know either.
But it's strange how you wear your ignorance like a merit badge while
simultaneously claiming to be an expert in language design
Implementing C's three namespaces doesn't require knowing why they
exist, only that they do. Plus of course an infinite number of block
scopes in a language which averages 3 local variables per function.
who has implemented a C compiler.)
I implemented a C-subset compiler in 2017 (written in a language I
designed and implemented) which was tested on some half million lines of open source C code, much of it brutal.
That actually taught me quite a lot, but mostly that C was an even worse language than I'd thought. I also got to see a LOT of ugly code.
The code it processes is definitely C.
I'm not sure why you say I'm claiming to have written one; you don't
believe me? OK.
I can see why it might annoy some people here.
On 5/19/2026 3:01 PM, Keith Thompson wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 5/19/2026 6:40 AM, David Brown wrote:[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
Perhaps you are mixing up "scope" and "lifetime" ? These are not
the same things.
Well, the far away land of C++, use a scope to define a lifetime?
{
foo x(whatever);
{
froboz m(zing);
x.bar(m);
x.baz(m);
}
x.compute();
}
Chris, what exactly is your point?
The same scope and lifetime rules apply in C as in C++.
A difference
is that an object reaching the end of its lifetime in C++ can have
visible effects, whereas in C it generally just means that its memory
can be deallocated (though the actual deallocation can be delayed).
There are also different lifetime rules for heap-allocated objects,
in both C and C++. (Scope doesn't apply, since heap-allocated
objects don't have names.)
froboz can create threads, x.bar(m), x.baz(m), use them, then all the
threads are joined in the dtor of m. So, this can be done in C as well, just
using manual function calls before the end of the scope. I have to admit
that I do like C99's way to not have to declare everything up front.
Declare where you need it.
I prefer:
{
foo x(whatever);
{
froboz m(zing);
x.bar(m);
x.baz(m);
}
x.compute();
}
over:
{
foo x(whatever);
froboz m(zing);
x.bar(m);
x.baz(m);
x.compute();
}
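A rough C counterpart of the nested-block version, as a sketch only. The struct froboz here is a hypothetical stand-in that owns a heap buffer instead of threads, and froboz_init, froboz_destroy and compute are invented names. The point is just that the "destructor" becomes an explicit call placed before the closing brace of the inner block:

```c
#include <stdlib.h>

/* Hypothetical resource-owning type standing in for "froboz". */
struct froboz { int *buf; size_t n; };

static int froboz_init(struct froboz *m, size_t n)
{
    m->buf = calloc(n, sizeof *m->buf);
    m->n = m->buf ? n : 0;
    return m->buf ? 0 : -1;
}

static void froboz_destroy(struct froboz *m)
{
    free(m->buf);
    m->buf = NULL;
    m->n = 0;
}

int compute(size_t n)
{
    int total = 0;                      /* plays the role of "x" */
    {
        struct froboz m;
        if (froboz_init(&m, n) != 0)
            return -1;
        for (size_t i = 0; i < m.n; i++) {
            m.buf[i] = (int)i;
            total += m.buf[i];
        }
        froboz_destroy(&m);             /* manual cleanup before the block ends */
    }
    return total;                       /* m is out of scope and already released */
}
```

Unlike the C++ version, nothing forces the cleanup call to happen, so an early return added inside the block later would leak unless it also calls froboz_destroy.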
There is another major difference between C and C++. C++ makes a much stronger distinction between "lifetimes" and "storage durations". In
C, the "lifetime" of an object is just the time for which its
"storage" is guaranteed to exist and be reserved for the object
(subject, as always, to "as if" rules). Thus if a compiler generates
code for a function which sets up space on a stack at entry to a
function, then the storage duration of a local variable might run from function entry to function exit, while the lifetime is only within the
inner block containing it. (Equally, it's fine for the storage space
on the stack to be allocated at the start of the inner block rather
than the start of the function - the lifetime requirements determine
the minimum storage duration.)
It is unusual to see such extra blocks in C, but they turn up in C++
when you are using the lifetime of a local variable actively. I had
such usage yesterday, with a lock-guard variable inside a small
block. It is generally the lifetime that is important in such cases,
rather than the scope (though they will be the same thing in most
cases, barring function calls within the block that leave the scope but
not the lifetime of the variable).
On 2026-05-19 18:31, Bart wrote:
On 19/05/2026 14:40, David Brown wrote:
[...]
Can you give me an example of a real-world programming language in
which the scope of a local variable begins at the start of the
enclosing block, rather than at its declaration / definition?
In Algol 68:
BEGIN
print((a, newline));
INT a=777;
print((a, newline));
END
This prints 0 then 777. If I comment out the declaration, it says 'a'
has not been declared.
1. The original post was talking about "variables" (not about identity declarations).
2. Are you sure that above code is valid (error-free) Algol 68 code?
I'm not sure about that. - My old textbook says that you can use these identifiers *after* you've set them (not before). And skimming through
the standard document, the Revised Report, I could not find anything.
If there is; can you point me to the relevant chapter, please?
It could also just not have been defined in the standard. (The above
code or Genie's behavior would then not tell anything relevant about
the language. - It would just expose yet another example of code that
an experienced programmer would just not write that way.)
You seem to have tested that with the Genie interpreter? - This would
not say anything about what the Algol 68 standard says, mind.
On 19/05/2026 23:16, Bart wrote:
On 19/05/2026 20:58, David Brown wrote:
On 19/05/2026 19:47, Bart wrote:
(I still don't know why C has a separate namespace for labels.)
There's a lot you don't know about C.
It sounds like you don't know either.
I don't know for sure - but I don't care that they are in separate namespaces, and I don't care about why other than for curiosity. I
think any code for which it matters, and code shares the same identifier
for a label and a variable, is hopelessly badly written or extremely
niche.
I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the future.
I'm not sure why you say I'm claiming to have written one; you don't
believe me? OK.
I find it hard to reconcile accurately writing a C compiler with your misconceptions and misunderstandings about the language, and your
reluctance to actually look at how the language is defined.
I believe
you have written a compiler that compiles the language you think C is,
and that is a close enough match to C for it to work (or appear to work) with a fair bit of C code.
I have no doubt that it works fine for the C
code you personally write or generate.
On 20/05/2026 07:59, David Brown wrote:
On 19/05/2026 23:16, Bart wrote:
On 19/05/2026 20:58, David Brown wrote:
On 19/05/2026 19:47, Bart wrote:
(I still don't know why C has a separate namespace for labels.)
There's a lot you don't know about C.
It sounds like you don't know either.
I don't know for sure - but I don't care that they are in separate
namespaces, and I don't care about why other than for curiosity. I
think any code for which it matters, and code shares the same
identifier for a label and a variable, is hopelessly badly written or
extremely niche.
So it /is/ pointless to have that separate namespace.
I have almost never had need of "goto" or labels (excluding switch
case labels, of course), and don't expect ever to do so in the future.
Is this what people here like to call a 'non sequitur'? Whether /you/
use goto is not relevant.
I believe you have written a compiler that compiles the language you
think C is, and that is a close enough match to C for it to work (or
appear to work) with a fair bit of C code.
Yes. Especially C90 applications such as Lua source code. But since 2017
a lot more open source code has started to use C99 features I didn't
support, such as compound literals, designated initialisers, runtime
expressions in {...} initialisers, VLAs and so on.
I'm not planning to support those; many are poorly documented IMO and unintuitive to understand, hard to implement, and may have hidden depths
of complexity.
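For readers who have not met the C99 features listed above, a compact sketch, assuming a compiler that supports VLAs (they became optional in C11). The struct point and the function c99_features are made-up names:

```c
struct point { int x, y, z; };

/* n must be greater than zero because it sizes the VLA. */
int c99_features(int n)
{
    struct point p = { .y = 2, .x = 1 };    /* designated initialisers; z is 0 */
    int scaled[2] = { n * 2, n * 3 };       /* run-time expressions in {...} */
    int vla[n];                             /* variable-length array */
    int *lit = (int[]){ 10, 20, 30 };       /* compound literal */

    for (int i = 0; i < n; i++)
        vla[i] = i;

    return p.x + p.y + p.z + scaled[0] + scaled[1] + vla[n - 1] + lit[2];
}
```

With n equal to 3 this returns 1 + 2 + 0 + 6 + 9 + 2 + 30, that is 50.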
I have no doubt that it works fine for the C code you personally
write or generate.
It works perfectly, except that it targets x64 with Win64 ABI, and
doesn't optimise. It's when I want to target non-Windows OSes, and/or
have optimised code, that I use the C transpiler!
However it is still useful as an instant test of the generated code,
which otherwise takes 30-40 times longer with gcc-O0.
On 19/05/2026 20:58, David Brown wrote:...
On 19/05/2026 19:47, Bart wrote:
Implementing C's three namespaces doesn't require knowing why they
exist, only that they do.
...who has implemented a C compiler.)
I'm not sure why you say I'm claiming to have written one; you don't
believe me? OK.
As far as I can tell, C doesn't have any feature too simple for you to
misunderstand it. It's hard to believe that you have correctly
On 19/05/2026 20:58, David Brown wrote:
On 19/05/2026 19:47, Bart wrote:
(I still don't know why C has a separate namespace for labels.)
There's a lot you don't know about C.
It sounds like you don't know either.
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; the inventors of C
understood that. From the Plan 9 `fortunes` file: "If you want
to go somewhere, goto is the best way to get there. K Thompson"
Breaking out of nested loops without a lot of unnecessary
ceremony is sort of an obvious example; jumping to common error
handling code is another that is often cited (though see the
Apple, "goto fail" bug for an example of how this can go bad).
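Both patterns can be sketched in a few lines of standard C. The functions find_first and copy_and_sum are hypothetical examples written for illustration, not code from any project mentioned here:

```c
#include <stdlib.h>

/* Pattern 1: leave two nested loops at once.
   Returns 1 and stores the position if target is found, else 0. */
int find_first(const int *grid, int rows, int cols, int target,
               int *row, int *col)
{
    int r, c;

    for (r = 0; r < rows; r++)
        for (c = 0; c < cols; c++)
            if (grid[r * cols + c] == target)
                goto found;
    return 0;

found:
    *row = r;
    *col = c;
    return 1;
}

/* Pattern 2: a single cleanup path shared by every exit.
   Returns 0 on success, -1 on bad arguments or allocation failure. */
int copy_and_sum(const int *src, size_t n, long *out)
{
    int rc = -1;
    long total = 0;
    int *tmp = NULL;

    if (src == NULL || out == NULL || n == 0)
        goto done;
    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        goto done;

    for (size_t i = 0; i < n; i++) {
        tmp[i] = src[i];
        total += tmp[i];
    }
    *out = total;
    rc = 0;

done:
    free(tmp);          /* free(NULL) is a no-op */
    return rc;
}
```

Always bracing the body of an if that precedes a goto is one cheap guard against the duplicated-line mistake behind "goto fail".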
It's interesting to me that after Dijkstra's famous March 1968,
"Go To Statement Considered Harmful" letter in CACM vol 11 no 3,
language designers seemed to address the issue by identifying
the primary useful patterns of `goto` use and codifying them as
first-class constructs in new(er) languages: labeled loops (and
corresponding elaborations on `break` etc) and exceptions are
obvious, and Go's `defer` statement is a newer example.
C came a few years after Dijkstra's letter, and I'm sure they
were aware of it. But while it inherited `break` from BCPL, the
only addition in this area is `continue`. DMR seemed to eschew
a more complex language for one that included `goto`, presumably
because it was intended for expert programmers judicious with
its use.
Once it escaped the lab, however....
On 20/05/2026 15:20, Bart wrote:
On 20/05/2026 07:59, David Brown wrote:
On 19/05/2026 23:16, Bart wrote:
On 19/05/2026 20:58, David Brown wrote:
On 19/05/2026 19:47, Bart wrote:
(I still don't know why C has a separate namespace for labels.)
There's a lot you don't know about C.
It sounds like you don't know either.
I don't know for sure - but I don't care that they are in separate
namespaces, and I don't care about why other than for curiosity. I
think any code for which it matters, and code shares the same
identifier for a label and a variable, is hopelessly badly written or
extremely niche.
So it /is/ pointless to have that separate namespace.
I did not say that, and I can't fathom how you would conclude that -
either from your own lack of relevant knowledge, or from what I wrote.
I would imagine that having them as a separate namespace would be more
convenient in a compiler - the scoping and lookup rules are
significantly different from the namespace for variables and functions,
and having separate namespaces means compilers don't have to check for
conflicts. There may be other good reasons for the separation, or it
might be a historical artifact inherited from a predecessor language, or
it might be that the C designers preferred separate namespaces and would
only have combined them if they had good reason to do so.
I'm not planning to support those; many are poorly documented IMO and
unintuitive to understand, hard to implement, and may have hidden
depths of complexity.
You can make up your own mind about how difficult these features are to
implement in your own compiler, though I question the reality of these
concerns - I think you just like to complain that they are hard. And
you can have the opinion that they are poorly documented, but we both
know the reality is that you haven't bothered to try to read the
documentation. The fact that people use these features should indicate
that they are well enough documented and understood for C programmers to
use. So are you trying to claim that you are particularly inept at
reading and understanding about C language features? I doubt that.
I'd be happier with an honest reason - such as you don't like these
features (for some non-technical reason), you don't find them useful
yourself (fair enough), and you therefore can't be bothered implementing
them in your own tools (again, fair enough). You made your C compiler
for fun, no one else uses it, and you have no obligations to anyone
else. It's entirely up to you to pick the features you choose to
support (as long as you don't make any claims to conformity). You don't
have to make up bullshit excuses for choosing not to implement C99
features.
On 20/05/2026 15:22, David Brown wrote:
On 20/05/2026 15:20, Bart wrote:
On 20/05/2026 07:59, David Brown wrote:
On 19/05/2026 23:16, Bart wrote:
On 19/05/2026 20:58, David Brown wrote:
On 19/05/2026 19:47, Bart wrote:
(I still don't know why C has a separate namespace for labels.)
There's a lot you don't know about C.
It sounds like you don't know either.
I don't know for sure - but I don't care that they are in separate
namespaces, and I don't care about why other than for curiosity. I
think any code for which it matters, and code shares the same
identifier for a label and a variable, is hopelessly badly written
or extremely niche.
So it /is/ pointless to have that separate namespace.
I did not say that, and I can't fathom how you would conclude that -
You said that code which depends on it is 'hopelessly badly written' or 'extremely niche'. That implies the vast majority of decent code will
never need that feature. Ergo it is pointless.
Unless you can think of a use-case where it would be essential?
either from your own lack of relevant knowledge, or from what I wrote.
What exactly is lacking from that knowledge? Do you even know yourself?
I would imagine that having them as a separate namespace would be more
convenient in a compiler - the scoping and lookup rules are
significantly different from the namespace for variables and
functions, and having separate namespaces means compilers don't have
to check for conflicts. There may be other good reasons for the
separation, or it might be a historical artifact inherited from a
predecessor language, or it might be that the C designers preferred
separate namespaces and would only have combined them if they had good
reason to do so.
Having such an extra namespace for labels because it makes a compiler simpler does not make that useful for users.
Yes, maybe the namespace trick makes it a little simpler to check for duplicates of labels and locals.
But it also relies on label names only appearing in certain contexts.
That means extensions such as gcc's label pointers need special syntax
such as &&L, whereas function names can become pointers without even one
'&' needed.
And you can have the opinion that they are poorly documented, but we
both know the reality is that you haven't bothered to try to read the
documentation.
I have, that's why I can say they are poorly documented.
On 20/05/2026 17:18, Dan Cross wrote:
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; the inventors of C
understood that. From the Plan 9 `fortunes` file: "If you want
to go somewhere, goto is the best way to get there. K Thompson"
Certainly there are situations where people feel the most clear, simple,
reliable and efficient way to handle a particular bit of code is with a
"goto". Fair enough. Personally, I have almost never felt I am in that
situation. I can think of perhaps four reasons why I might have fewer
goto's than some other people:
1. I stick to optimising compilers in almost all cases - some people may
use goto rather than an extra bool flag or two, for efficiency purposes.
I am in the happy position of letting the compiler generate the goto's.
2. I almost never use dynamic memory, and thus don't have much use for a
typical "goto error" idiom for handling failed malloc.
3. My code is typically not meant to be portable, so I can use things
like gcc's "cleanup" attribute that can replace the need of "goto error"
or other handling when breaking out of inner loops.
4. With C99 "inline" and modern tools that do inlining and other
inter-procedural optimisations, there is often less benefit in nested
loops that might need a multi-level break - you can split the code into
separate functions and use "return" to break out of the middle of loops.
So I don't suggest that nobody has need of "goto", or that all "gotos"
are bad - merely that I very rarely have use of them myself.
Breaking out of nested loops without a lot of unnecessary
ceremony is sort of an obvious example; jumping to common error
handling code is another that is often cited (though see the
Apple, "goto fail" bug for an example of how this can go bad).
It's interesting to me that after Dijkstra's famous March 1968,
"Go To Statement Considered Harmful" letter in CACM vol 11 no 3,
language designers seemed to address the issue by identifying
the primary useful patterns of `goto` use and codifying them as
first-class constructs in new(er) languages: labeled loops (and
corresponding elaborations on `break` etc) and exceptions are
obvious, and Go's `defer` statement is a newer example.
C came a few years after Dijkstra's letter, and I'm sure they
were aware of it. But while it inherited `break` from BCPL, the
only addition in this area is `continue`. DMR seemed to eschew
a more complex language for one that included `goto`, presumably
because it was intended for expert programmers judicious with
its use.
Once it escaped the lab, however....
There's more than one feature of C, and of every other programming
language, that we'd all like to put back into Pandora's box. Of course,
we'd all disagree strongly on which features those are!
On 20/05/2026 17:41, Bart wrote:
First, a niche use is not pointless.
You don't know why the name spaces are separate. I don't know either,
as I have said.
Yes, maybe the namespace trick makes it a little simpler to check for
duplicates of labels and locals.
But it also relies on label names only appearing in certain contexts.
And that's fine. They can only appear in certain circumstances -
preceding a colon to define the label, or as the subject of a "goto".
I have, that's why I can say they are poorly documented.
Really? Is this you finally saying you have read a part of one C
standard version?
There are lots of different qualities of general purpose C compilers, if
you look outside the big 3 (gcc, clang, msvc).
I used to use DMC, Pelles C, lccwin32 extensively, and still use Tiny C.
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; the inventors of C
understood that. From the Plan 9 `fortunes` file: "If you want
to go somewhere, goto is the best way to get there. K Thompson"
On 19/05/2026 23:16, Bart wrote:[...]
On 19/05/2026 20:58, David Brown wrote:...
On 19/05/2026 19:47, Bart wrote:
Implementing C's three namespaces doesn't require knowing why they
exist, only that they do.
"namespaces" are a C++ feature, which means something quite different
from "name spaces". Both C and C++ have name spaces. C++ has only three: labels, macros, and ordinary identifiers. However, C has 4 name spaces
and an unlimited number of members of each of two families of name spaces:
On 20/05/2026 05:24, Janis Papanagnou wrote:
[...]
2. Are you sure that above code is valid (error-free) Algol 68 code?
I'm not sure about that. - My old textbook says that you can use these
identifiers *after* you've set them (not before). And skimming through
the standard document, the Revised Report, I could not find anything.
If there is; can you point me to the relevant chapter, please?
It could also just not have been defined in the standard. (The above
code or Genie's behavior would then not tell anything relevant about
the language. - It would just expose yet another example of code that
an experienced programmer would just not write that way.)
You seem to have tested that with the Genie interpreter? - This would
not say anything about what the Algol 68 standard says, mind.
You seem determined to prove Bart wrong, aren't you?
On 2026-05-19 18:48, David Brown wrote:
On 19/05/2026 18:31, Bart wrote:
[...]
OK. So there was a real language that worked that way, half a century
ago.
It's not that clear, IMO; all I see in his post is the result of a
specific tool he uses, and for a specific case (not for variables).
(See my recent post a few minutes ago for the details.)
[...]
On 2026-05-20 12:55, Bart wrote:
On 20/05/2026 05:24, Janis Papanagnou wrote:
[...]
2. Are you sure that above code is valid (error-free) Algol 68 code?
I'm not sure about that. - My old textbook says that you can use these
identifiers *after* you've set them (not before). And skimming through
the standard document, the Revised Report, I could not find anything.
If there is; can you point me to the relevant chapter, please?
It could also just not have been defined in the standard. (The above
code or Genie's behavior would then not tell anything relevant about
the language. - It would just expose yet another example of code that
an experienced programmer would just not write that way.)
You seem to have tested that with the Genie interpreter? - This would
not say anything about what the Algol 68 standard says, mind.
You seem determined to prove Bart wrong, aren't you?
You made a claim that looked strange, and I had been asking you
whether you can point to the standard
to clarify that question.
It's so simple! - Or it would be so simple if you'd not as usual
wrongly assume malevolent behavior.
And yes, you have been wrong! (As so often, sadly.) - You falsely
assumed some semantics in Algol 68 just because you wanted it to
support your argument.
For that you took an arbitrary Algol 68
interpreter
to "prove" the intention of the Algol 68 language
instead of pointing to the Algol 68 standard where such behavior
would be defined (or not).
As opposed to you I was trying to determine the truth about it,
and, as I wrote - you trimmed that part in the quote of my post
above:
"I've sent a mail to Marcel to clarify that, i.e. the behavior
of Genie, and whether that's standard behavior, undefined, or
an oversight or bug."
You obviously trimmed it to make an open question and doubt of
some facts - and my intention to focus on the fact! - look like
a personal attack. (You exposed malevolence in communication!)
Here's the answer of Marcel concerning the behavior of Genie:
"I could write a long story on this, but I will summarize it -
this is a bug"
On 2026-05-20 06:38, Janis Papanagnou wrote:
On 2026-05-19 18:48, David Brown wrote:
On 19/05/2026 18:31, Bart wrote:
[...]
OK. So there was a real language that worked that way, half a
century ago.
It's not that clear, IMO; all I see in his post is the result of a
specific tool he uses, and for a specific case (not for variables).
(See my recent post a few minutes ago for the details.)
David, the answer of my emailed question arrived; the behavior of
the Genie interpreter in case of identity relations is just a bug!
(Bart wrongly assumed and without evidence from the standard that
the Algol 68 language would support his claim.)
Algol 68 is a well designed language and the scope rules are as we
expect sensibly chosen; entities are valid within the block scope
and their "existence", when you can use them, starts with their
declaration (not before).
Most people don't know Algol 68. I suggest to take Bart's opinions
on that language with the same reservations as his opinions about
the "C" language. He's widely ignorant and unwilling to understand
the facts. Take his statements with a grain of salt - and, if in
doubt, better just ignore him.
On 21/05/2026 01:30, Janis Papanagnou wrote:[...]
You made a claim that looked strange, and I had been asking you
whether you can point to the standard
To the Algol68 standard? I wouldn't have a clue; probably only a
handful of people on the planet do.
On 5/20/2026 8:18 AM, Dan Cross wrote:
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; the inventors of C
understood that. From the Plan 9 `fortunes` file: "If you want
to go somewhere, goto is the best way to get there. K Thompson"
Oh shit. Side note. Did you ever hear about a nasty race condition in
Plan 9 wrt some lock-free thing, or even a mutex acquisition? The Plan9
problem. I remember conversing about it with Alex Terekhov way back in
comp.programming.threads.
[...]
On 20/05/2026 18:51, David Brown wrote:
On 20/05/2026 17:41, Bart wrote:
First, a niche use is not pointless.
A niche use that takes advantage of some accidental quirk in a language?
One that wouldn't exist if the quirk wasn't there; that sort of niche use?
Over 50 years of use, every misfeature of C has been exploited by /somebody/. One reason why the language couldn't properly evolve.
You don't know why the name spaces are separate. I don't know either,
as I have said.
But I can have a good idea of the implications both by there being a separate namespace, or not. You snipped my example where it caused a limitation in the syntax.
Yes, maybe the namespace trick makes it a little simpler to check for
duplicates of labels and locals.
But it also relies on label names only appearing in certain contexts.
And that's fine. They can only appear in certain circumstances -
preceding a colon to define the label, or as the subject of a "goto".
As I said, limitations; 'goto (L)' is not allowed, for example, but
'(F)()' is, as well as 'case (A):'.
I have, that's why I can say they are poorly documented.
Really? Is this you finally saying you have read a part of one C
standard version?
I've read plenty of the standard, especially in 2017. Information for implementing C and providing headers had to be gleaned from multiple sources. There is also testing: the C standard doesn't provide a test suite to verify an implementation.
Since then, no, I don't routinely look inside it.
So what? People can have opinions on languages, compare one to another, speculate on possible new features or modifying existing ones, etc,
based on their long personal experiences as /users/.
Some may also have experience as developers, and some even of developing languages in a similar field.
I also admit my implementation was casual. I had a particular set of
aims, which were largely achieved.
(The first language I implemented, not one of mine, didn't have a formal standard. You just picked it up from examples. But it was also a machine-oriented language, so it was adapted to the target to a certain extent.
That also was the case for other languages I used in the 70s, in that
the implementation for a particular platform became the standard, and
you used reference manuals for that version.)
TLDR: you guys place too much importance on 'the C standard', and mainly
use it to batter me over the head with.
You don't 'own' C. There is no copyright on it. Anyone can use it as casually or as intensely as they like. Anyone can choose to create as professional or as casual a version as they like. Anyone can choose to pontificate on things they like or dislike.
Anyone can choose to fork C and create a language that is slightly different, or a lot different, but they would ideally make that clear.
Hmm, still a bit long! In that case: TLDR: I like to deal with C
casually (like every language I use). If you don't like it, then too bad.
In article <10ukkh6$2gc8$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 20/05/2026 17:18, Dan Cross wrote:
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; the inventors of C
understood that. From the Plan 9 `fortunes` file: "If you want
to go somewhere, goto is the best way to get there. K Thompson"
Certainly there are situations where people feel the most clear, simple,
reliable and efficient way to handle a particular bit of code is with a
"goto". Fair enough. Personally, I have almost never felt I am in that
situation. I can think of perhaps four reasons why I might have fewer
goto's than some other people :
1. I stick to optimising compilers in almost all cases - some people may
use goto rather than an extra bool flag or two, for efficiency purposes.
I am in the happy position of letting the compiler generate the goto's.
2. I almost never use dynamic memory, and thus don't have much use for a
typical "goto error" idiom for handling failed malloc.
3. My code is typically not meant to be portable, so I can use things
like gcc's "cleanup" attribute that can replace the need of "goto error"
or other handling when breaking out of inner loops.
4. With C99 "inline" and modern tools that do inlining and other
inter-procedural optimisations, there is often less benefit in nested
loops that might need a multi-level break - you can split the code into
separate functions and use "return" to break out of the middle of loops.
So I don't suggest that nobody has need of "goto", or that all "gotos"
are bad - merely that I very rarely have use of them myself.
I'm with you on almost all of these, though I find the code that
has to check a boolean or other flag in an outer loop to be hard
to read compared to the `goto` alternative often enough that I
avoid it if I can. Of course, that's subjective. And I might
point out that there are other categories of cleanup one might
want to do, aside from just freeing memory (in the embedded
space, perhaps clearing a flag or setting register state in some
device to reset it...).
I particularly agree with the use of a static inline to avoid
some `goto`s. When the `goto fail` bug was announced, as an
exercise, I rewrote Apple's code to demonstrate how one might
eliminate the problematic pattern entirely:
The original was something like:
int
thing_that_can_fail(void)
{
    int err = FAILURE;
    void *p = malloc(...);
    if (p == NULL)
        goto fail;
    void *q = whatever(...);
    if (q == NULL)
        goto fail;
    err = something(p, q);
    if (err != 0)
        goto fail;
    err = otherthing(p, q);
    if (err != 0)
        goto fail;
        goto fail;
    err = thirdthing(p, q);
    if (err != 0)
        goto fail;
    return 0;
fail:
    if (q != NULL)
        dispose_of_q(q);
    free(p);
    return err;
}
But using a small `static` inline function, we can rewrite this
as two functions that separate resource allocation and whatever
else from other fallible logic:
static inline int
do_thing_that_can_fail(void *p, void *q)
{
    int err = something(p, q);
    if (err != 0)
        return err;
    err = otherthing(p, q);
    if (err != 0)
        return err;
    return thirdthing(p, q);
}

int
thing_that_can_fail(void)
{
    void *p = malloc(...);
    if (p == NULL)
        return FAILURE;
    void *q = whatever(...);
    if (q == NULL) {
        free(p);
        return FAILURE;
    }
    int ret = do_thing_that_can_fail(p, q);
    dispose_of_q(q);
    free(p);
    return ret;
}
The response (this was amongst a bunch of developers at my last
company) was, "yeah, but you can't _always_ do that..." and that
may be true; but you _often_ can.
Regardless, it does not
automatically follow that `goto` is the best option for dealing
with this kind of failure. The Plan 9 kernel, for example, used
something equivalent to setjmp/longjmp and a stack of jump
buffers to provide a primitive exception handling mechanism in
that system's decidedly non-ISO dialect of C.
On 2026-05-20 06:38, Janis Papanagnou wrote:
On 2026-05-19 18:48, David Brown wrote:
On 19/05/2026 18:31, Bart wrote:
[...]
OK. So there was a real language that worked that way, half a
century ago.
It's not that clear, IMO; all I see in his post is the result of a
specific tool he uses, and for a specific case (not for variables).
(See my recent post a few minutes ago for the details.)
David, the answer of my emailed question arrived; the behavior of
the Genie interpreter in case of identity relations is just a bug!
(Bart wrongly assumed and without evidence from the standard that
the Algol 68 language would support his claim.)
Algol 68 is a well designed language and the scope rules are as we
expect sensibly chosen; entities are valid within the block scope
and their "existence", when you can use them, starts with their
declaration (not before).
Most people don't know Algol 68. I suggest to take Bart's opinions
on that language with the same reservations as his opinions about
the "C" language. He's widely ignorant and unwilling to understand
the facts. Take his statements with a grain of salt - and, if in
doubt, better just ignore him.
Janis
Bart <bc@freeuk.com> writes:
On 21/05/2026 01:30, Janis Papanagnou wrote:[...]
You made a claim that looked strange, and I had been asking you
whether you can point to the standard
To the Algol68 standard? I wouldn't have a clue; probably only a
handful of people on the planet do.
Please take this to comp.lang.misc. There's already some discussion
of Algol 68 there.
On 20/05/2026 22:14, Bart wrote:
On 20/05/2026 18:51, David Brown wrote:
On 20/05/2026 17:41, Bart wrote:
First, a niche use is not pointless.
A niche use that takes advantage of some accidental quirk in a
language? One that wouldn't exist if the quirk wasn't there; that sort
of niche use?
Over 50 years of use, every misfeature of C has been exploited by
/somebody/. One reason why the language couldn't properly evolve.
So do you have an example of where code has been written to take
advantage of the separate name spaces?
I don't - I merely can't rule
out niche cases. I could imagine some situations - perhaps from machine
or human translation from other languages, or old implementations with
very short identifier lengths.
You seem obsessed with calling every aspect of C a "misfeature".
But I can have a good idea of the implications both by there being a
separate namespace, or not. You snipped my example where it caused a
limitation in the syntax.
Yes, it was irrelevant.
You can have an opinion from ignorance, as many people have told you.
You can't expect that opinion to be respected.
You don't 'own' C. There is no copyright on it. Anyone can use it as
casually or as intensely as they like. Anyone can choose to create as
professional or as casual a version as they like. Anyone can choose to
pontificate on things they like or dislike.
The C23 draft open on my desktop at the moment says "© ISO 2024 - All rights reserved" on every page. It /is/ copyrighted, and it is a
defined and standardised language.
On 20/05/2026 20:17, Dan Cross wrote:
[snip]
I particularly agree with the use of a static inline to avoid
some `goto`s. When the `goto fail` bug was announced, as an
exercise, I rewrote Apple's code to demonstrate how one might
eliminate the problematic pattern entirely:
(The mixture of tabs and spaces in your post resulted in a bit messed up
indentation in the quotation in my reply. I've tried to fix it up to
match the original indentation, but be warned it might look different
when you view it or re-quote it.)
[snip code examples]
The response (this was amongst a bunch of developers at my last
company) was, "yeah, but you can't _always_ do that..." and that
may be true; but you _often_ can.
Exactly. Some people worry too much about when you can't do something,
or what might go wrong if things had been different. Use the best
technique you can for the code at hand, and if you can't use that
technique in one place, use something else in that one place.
Regardless, it does not
automatically follow that `goto` is the best option for dealing
with this kind of failure. The Plan 9 kernel, for example, used
something equivalent to setjmp/longjmp and a stack of jump
buffers to provide a primitive exception handling mechanism in
that system's decidedly non-ISO dialect of C.
There are other ways to avoid or catch such errors, without changing the
structure or the use of "goto". One is to use static error checking. I
don't know the age of this code, but from gcc 6 onwards "-Wall" will
catch the "misleading indentation" of the double "goto fail". (gcc also
spots that the first "goto fail" skips the initialisation of "q", so it
is tested and possibly disposed-of when uninitialised. But that might
be an artifact of your paraphrasing of the original code.) Other static
error checkers would also no doubt be able to spot the bug.
There is also the question of brace style. That is something that
people have lots of different opinions on, but there's no doubt that if
the author had used a style that required the use of braces even for
single-statement blocks, such as the so-called "One True Brace Style",
then the error could not have occurred.
Personally, I allow a single short statement on the same line as the
"if". But if the statement is long, or I think the code is clearer
having it on a separate line, or if there is an "else" clause, I have it
in braces :
    if (!p) goto fail;

    if (!p) {
        goto fail;
    }
For me, this keeps everything simple, consistent, difficult to misread,
fits well with version control and other line-by-line comparisons, and
has a good balance between compactness and verbosity.
Other people, of course, have other style preferences, with different
pros and cons. OTBS, or my variation, would have made the error in this
code impossible - but it would not have hindered other kinds of bugs.
On 21/05/2026 07:45, David Brown wrote:
On 20/05/2026 22:14, Bart wrote:
On 20/05/2026 18:51, David Brown wrote:
On 20/05/2026 17:41, Bart wrote:
First, a niche use is not pointless.
A niche use that takes advantage of some accidental quirk in a
language? One that wouldn't exist if the quirk wasn't there; that
sort of niche use?
Over 50 years of use, every misfeature of C has been exploited by
/somebody/. One reason why the language couldn't properly evolve.
So do you have an example of where code has been written to take
advantage of the separate name spaces?
No. This is why I claimed it was pointless.
(I did do a survey of my codebases to find examples of label names
shadowing local identifiers, but found no instances. I didn't post the results (afaicr) since the code sample was too small, but it would be a massive job to do a bigger test.)
I don't - I merely can't rule out niche cases. I could imagine some
situations - perhaps from machine or human translation from other
languages, or old implementations with very short identifier lengths.
Yeah, a C implementation that only has 'a' to 'z' variables, but also
has a separate set of 'a' to 'z' labels!
You seem obsessed with calling every aspect of C a "misfeature".
Yes, it is astounding how much there is. Some can be excused due to its
age, but also lots could have been fixed long ago.
Some of it is an actual nuisance, but quite a bit is also aesthetically displeasing.
You didn't like my example of being able to write '(F)(x)' (or 'int
(a);') but not 'goto (L)', but the anomaly is there.
There is also not being able to write 'L:}'; this one bites.
But I can have a good idea of the implications both by there being a
separate namespace, or not. You snipped my example where it caused a
limitation in the syntax.
Yes, it was irrelevant.
And it's irrelevant because? It was a minor consequence due to
restrictions on where labels can appear, in turn due to not sharing the
same namespace as ordinary identifiers.
Irrelevant because it's part of a language extension? Those tend to
become standard.
You can have an opinion from ignorance, as many people have told you.
You can't expect that opinion to be respected.
That's just rubbish, and an attempt to stifle criticism.
Somebody asks why do I have to do X and not Y? Y makes more sense, it is easier etc.
The response here will be because the Standard says so. End of discussion.
Somebody was able to voice that opinion using their own experience and common sense. It may be valid; maybe other languages do do Y instead of X.
However, all anyone here wants to do is suggest that that person is ignorant because they haven't read the standard. Well, they might read the
standard and then find that Y is still better!
You don't 'own' C. There is no copyright on it. Anyone can use it as
casually or as intensely as they like. Anyone can choose to create as
professional or as casual a version as they like. Anyone can choose to
pontificate on things they like or dislike.
The C23 draft open on my desktop at the moment says "© ISO 2024 - All
rights reserved" on every page. It /is/ copyrighted, and it is a
defined and standardised language.
The standards document itself might be copyrighted, not the language.
In article <10uloem$e6bl$1@kst.eternal-september.org>,
Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
Bart <bc@freeuk.com> writes:
On 21/05/2026 01:30, Janis Papanagnou wrote:[...]
You made a claim that looked strange, and I had been asking you
whether you can point to the standard
To the Algol68 standard? I wouldn't have a clue; probably only a
handful of people on the planet do.
Please take this to comp.lang.misc. There's already some discussion
of Algol 68 there.
The Lord and Master has spoken. All must obey.
Yes, even JP.
[...]
Some of it is an actual nuisance, but quite a bit is also aesthetically displeasing.
[...]
There is also not being able to write 'L:}'; this one bites.
[...]
In article <10umdsa$hnml$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 20/05/2026 20:17, Dan Cross wrote:
[snip]
I particularly agree with the use of a static inline to avoid
some `goto`s. When the `goto fail` bug was announced, as an
exercise, I rewrote Apple's code to demonstrate how one might
eliminate the problematic pattern entirely:
(The mixture of tabs and spaces in your post resulted in a bit messed up
indentation in the quotation in my reply. I've tried to fix it up to
match the original indentation, but be warned it might look different
when you view it or re-quote it.)
Apologies for that; I suspect that was my editor trying to be
"helpful": I used spaces in the example, but usually use tabs;
I probably should have just stuck with the latter for
consistency.
This writeup of the original bug is pretty good: https://dwheeler.com/essays/apple-goto-fail.html
He also recommends several of the techniques you do.
In this particular case, the bug _should_ have been caught.
Some combination of rigorous testing, static analysis, and
manual review ought to have prevented it; that the bug made it
into production software anyway is a software engineering
failure.
I think one can level some reasonable criticism at the language,
however. The `goto error;` idiom is used in C because there are
few alternatives for cleanup handling on failure. Modulo what
we discussed before, that code can _often_ be restructured to
avoid it, but sometimes it can't, and frequently it just isn't.
In contrast, newer languages give you more expressive power in
this regard: Go has the `defer` keyword to register a closure
that runs when the enclosing function returns:
```go
f, err := os.Open(...)
if err != nil {
return err
}
defer f.Close()
...
```
The list of deferred closures will be run whenever the function
returns, no matter what path it takes. You can't accidentally
omit the close.
Similarly, the RAII idiom prevalent in C++ and Rust uses object
destructors that are automatically run when something goes out
of scope to do the cleanup. C++ pairs that with
exceptions (arguably worse than goto) while Rust represents
errors with sum types and a little bit of syntactic sugar with
the `?` operator. In either case, the destructors run and do
the cleanup, regardless of whether the return was an error path
or a success path.
Other languages feature "linear types", instances of which must
be used exactly once: failure to clean up on an error path is a
compile time error. This means that you can't forget to
deallocate memory, close a file, or unlock a mutex, for example,
but I don't know that it directly addresses the issue where you
just skip over the actual thing the program is supposed to do.
Still, with more expressive type systems, it is likely one can
much more easily structure the program so that the type of logic
that led to this failure is unnecessary.
I understand that someone has written a paper proposing adding
`defer` to C; that would obviate many of these problems.
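For anyone who has not run into it, here is a minimal sketch of the `goto` cleanup idiom discussed above. It is not Apple's code: the function `read_prefix`, its parameters, and the label `out` are made up for illustration. The point is only that every failure path funnels through one cleanup block.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads up to cap - 1 bytes of `path` into a newly
 * allocated, NUL-terminated buffer. Returns the byte count, or -1 on
 * failure. All exits pass through the single cleanup block at `out`. */
long read_prefix(const char *path, size_t cap, char **out_buf)
{
    long result = -1;   /* pessimistic default: assume failure */
    FILE *f = NULL;
    char *buf = NULL;
    size_t n = 0;

    if (cap == 0)
        goto out;

    f = fopen(path, "rb");
    if (f == NULL)
        goto out;

    buf = malloc(cap);
    if (buf == NULL)
        goto out;

    n = fread(buf, 1, cap - 1, f);
    if (ferror(f))
        goto out;

    buf[n] = '\0';
    *out_buf = buf;
    buf = NULL;         /* ownership has moved to the caller */
    result = (long)n;

out:
    free(buf);          /* free(NULL) is a no-op */
    if (f != NULL)
        fclose(f);
    return result;
}
```

The cost of the idiom is that nothing forces a new failure check to jump to `out`, which is the gap that `defer` or destructors would close.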
- Dan C.
On 2026-05-21 13:56, Bart wrote:
[...]
Some of it is an actual nuisance, but quite a bit is also
aesthetically displeasing.
(I'm not interested in what you find a nuisance, pleasing, aesthetic;
you have a very specific and "sharply limited" and strong opinion it
seems, so no good base for discussions which ask for a somewhat more
open mind.)
[...]
There is also not being able to write 'L:}'; this one bites.
But, being curious, what is that supposed to mean? - A label at the
end of a block? - Seems to work for me (with my GNU C-compiler)...
int main (void)
{
    {
        int a = 42;
        goto end;
        return a;
    end:
    }
    return 0;
}
(although I wouldn't have been astonished if there'd be a requirement
to allow labels at statements only, thus at least requiring something
like an empty statement as in
    ...
    end: ;
    }
But all fine in my book. (Especially since I'm anyway rarely - actually
not at all - using 'goto' labels.)
On 21/05/2026 01:30, Janis Papanagnou wrote:
On 2026-05-20 12:55, Bart wrote:
On 20/05/2026 05:24, Janis Papanagnou wrote:
[...]
2. Are you sure that above code is valid (error-free) Algol 68 code?
I'm not sure about that. - My old textbook says that you can use these
identifiers *after* you've set them (not before). And skimming through
the standard document, the Revised Report, I could not find anything.
If there is; can you point me to the relevant chapter, please?
It could also just not have been defined in the standard. (The above
code or Genie's behavior would then not tell anything relevant about
the language. - It would just expose yet another example of code that
an experienced programmer would just not write that way.)
You seem to have tested that with the Genie interpreter? - This would
not say anything about what the Algol 68 standard says, mind.
You seem determined to prove Bart wrong, aren't you?
You made a claim that looked strange, and I had been asking you
whether you can point to the standard
To the Algol68 standard? I wouldn't have a clue; probably only a handful
of people on the planet do.
For that you took an arbitrary Algol 68
interpreter
So give me links to some others, or to playgrounds.
Bart <bc@freeuk.com> writes:
On 21/05/2026 01:30, Janis Papanagnou wrote:
On 2026-05-20 12:55, Bart wrote:
On 20/05/2026 05:24, Janis Papanagnou wrote:
[...]
2. Are you sure that above code is valid (error-free) Algol 68
code?
I'm not sure about that. - My old textbook says that you can use
these identifiers *after* you've set them (not before). And
skimming through the standard document, the Revised Report, I
could not find anything. If there is; can you point me to the
relevant chapter, please?
It could also just not have been defined in the standard. (The
above code or Genie's behavior would then not tell anything
relevant about the language. - It would just expose yet another
example of code that an experienced programmer would just not
write that way.)
You seem to have tested that with the Genie interpreter? - This
would not say anything about what the Algol 68 standard says,
mind.
You seem determined to prove Bart wrong, aren't you?
You made a claim that looked strange, and I had been asking you
whether you can point to the standard
To the Algol68 standard? I wouldn't have a clue; probably only a
handful of people on the planet do.
A google search for "algol 68 standard document" would give you
a clue.
https://algol68-lang.org/docs/algol68-draft-report.pdf
For that you took an arbitrary Algol 68
interpreter
So give me links to some others, or to playgrounds.
You are not capable of searching the internet yourself?
On 21/05/2026 14:08, Janis Papanagnou wrote:
On 2026-05-21 13:56, Bart wrote:
[...]
Some of it is an actual nuisance, but quite a bit is also
aesthetically displeasing.
(I'm not interested in what you find a nuisance, pleasing,
aesthetic; you have a very specific and "sharply limited" and
strong opinion it seems, so no good base for discussions which ask
for a somewhat more open mind.)
[...]
There is also not being able to write 'L:}'; this one bites.
But, being curious, what is that supposed to mean? - A label at the
end of a block? - Seems to work for me (with my GNU C-compiler)...
It didn't use to be allowed (because a label was defined as being a
prefix to another statement, so you needed at least an empty
statement like ";").
It seems to be relaxed in C23. If you do:
gcc -std=cxx -pedantic
then it will give warnings for xx = 90, 99, 11 and 17. Arbitrary
other C compilers may complain too.
So this is a rare example of something I'd thought was an annoying
quirk, being fixed.
Of course, if I'd brought it up years ago, people would say the same
things criticising me, my knowledge, my 'ignorance' and suggest it
was a non-issue as you can work around it, exactly like you do below.
(If generating code, then you don't always know in advance if a label
will be followed by a } or not; the result is that all labels are
written as "L:;".)
int main (void)
{
    {
        int a = 42;
        goto end;
        return a;
    end:
    }
    return 0;
}
(although I wouldn't have been astonished if there'd be a
requirement to allow labels at statements only,
So you didn't know this? How ignorant of you!
However there is another aspect to this: in the original spec for
labels, defining it as a prefix to a statement meant that its
definition was recursive. That means that a program like this:
L1:
L2:
....
Ln:
would risk overflowing the compiler stack, compared with a
non-recursive version. A minor risk as you'd need tens of thousands
of such labels. It would only show up in stress tests.
thus at least requiring something
like an empty statement as in
...
end: ;
}
But all fine in my book. (Especially since I'm anyway rarely -
actually not at all - using 'goto' labels.)
No, 'end:;' is not ugly at all!
On 21/05/2026 13:56, Bart wrote:
(The separation of the struct/union/enum tag name space from variable namespace /is/ a feature that people use
We have no idea of a user benefit from /not/ having separate name spaces here.
So are you justified in claiming it is pointless? No, you are not - because we haven't ruled out any possibility of user benefits, there is
a definite implementer benefit, and having the opposite choice would be
less likely to be of any benefit to anyone. But I think we can
reasonably say it is a very minor matter that makes little practical difference to anyone.
But labels are inherently simpler - you can only "goto" directly to a defined label. Parentheses could be of no benefit, so /allowing/ them would complicate the grammar.
It is not an anomaly when very different things have different rules.
There is also not being able to write 'L:}'; this one bites.
I don't see why that "bites", unless it is some weird smiley. I would expect most people to have the close brace on a separate line from the label, but it is legitimate for people to want to jump to the end of a block.
No, most language extensions do not become standard. This one has been
in gcc for maybe three decades or more, and is not part of the standard.
On Thu, 21 May 2026 14:31:19 +0100
Bart <bc@freeuk.com> wrote:
On 21/05/2026 14:08, Janis Papanagnou wrote:
It didn't use to be allowed (because a label was defined as being a
prefix to another statement, so you needed at least an empty
statement like ";").
It seems to be relaxed in C23. If you do:
gcc -std=cxx -pedantic
then it will give warnings for xx = 90, 99, 11 and 17. Arbitrary
other C compilers may complain too.
So this is a rare example of something I'd thought was an annoying
quirk, being fixed.
Of course, if I'd brought it up years ago, people would say the same
things criticising me, my knowledge, my 'ignorance' and suggest it
was a non-issue as you can work around it, exactly like you do below.
When I brought it up not many years ago, but before C23 was finalized,
most people that reacted agreed with me that being pedantic in this
case is not a good idea on part of compiler unless it is asked
specifically to be pedantic.
[...]
A google search for "algol 68 standard document" would give you
a clue.
https://algol68-lang.org/docs/algol68-draft-report.pdf
[...]
On 21/05/2026 07:45, David Brown wrote:
You can have an opinion from ignorance, as many people have told you.
You can't expect that opinion to be respected.
That's just rubbish, and an attempt to stifle criticism.
No, only to stifle uninformed criticism.
On 2026-05-21 13:56, Bart wrote:...
There is also not being able to write 'L:}'; this one bites.
But, being curious, what is that supposed to mean? - A label at the
end of a block? - Seems to work for me (with my GNU C-compiler)...
On Thu, 21 May 2026 15:12:16 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
https://algol68-lang.org/docs/algol68-draft-report.pdf
I did not open it, but my guess is that finding a clue from A68 Standard
is a Very Hard Job.
On 2026-05-21 09:08, Janis Papanagnou wrote:
On 2026-05-21 13:56, Bart wrote:...
There is also not being able to write 'L:}'; this one bites.
But, being curious, what is that supposed to mean? - A label at the
end of a block? - Seems to work for me (with my GNU C-compiler)...
The only two places where labels are permitted are in labeled statements (6.8.2p1) and goto statements. Since ';' qualifies as a null statement,
Bart is required to make the extremely painful modification of instead writing 'L:;}'.
On 2026-05-21 18:27, Michael S wrote:
On Thu, 21 May 2026 15:12:16 GMT
scott@slp53.sl.home (Scott Lurndal) wrote:
https://algol68-lang.org/docs/algol68-draft-report.pdf
I did not open it, but my guess is that finding a clue from A68 Standard
is a Very Hard Job.
Well, my experience with (most) standards is that they are typically
not written for the casual user or ordinary programmer. The Algol 68 standard, the Revised Report, specifically seems to be regularly
considered as being "not the easiest" to read document. (Myself I've
never read a standards document to learn a programming language, but
I've studied other international standards, documents of hundreds and thousands of pages, so I'm at least not repelled by such documents beforehand, and if I'm interested and want clarity about something I
look up these documents. - I did that also to look for Bart's claim,
but I could not find anything that would support Bart's statement.)
Though for the given sub-thread we have to state that to clarify the
correct syntax and semantics of a language we will have to look into
the respective defining reference documents; the standards. - A wild
guess, speculation, wish, or opinion, cannot substitute a standard.
On 2026-05-21 17:12, Scott Lurndal wrote:
[...]
A google search for "algol 68 standard document" would give you
a clue.
https://algol68-lang.org/docs/algol68-draft-report.pdf
The Genie system (that I strongly presume Bart is also using)
comes with a 700-pages manual/tutorial/reference that contains
the full Revised Report.
https://jmvdveer.home.xs4all.nl/learning-algol-68-genie.pdf
Yes, it would have been easy to *find* it, but I'd presume if
one is using the Genie he would already *have* the document.
But for some folks it's easier to write a dozen posts full of
complaints
instead of just taking the time to sit back and slow
down, think, and inspect the documentation instead of emitting
wild guesses and declare them as facts.
On 21/05/2026 13:55, David Brown wrote:
On 21/05/2026 13:56, Bart wrote:
[Separate 'namespace' for label names]
(The separation of the struct/union/enum tag name space from variable
namespace /is/ a feature that people use
Urgh, another discussion. Let's not go there. Suffice that it is unique
to C.
We have no idea of a user benefit from /not/ having separate name
spaces here.
So let's have separate namespaces for everything! We just have to keep
it quiet so that people can discover that amazing fact by accident.
So are you justified in claiming it is pointless? No, you are not -
because we haven't ruled out any possibility of user benefits, there
is a definite implementer benefit, and having the opposite choice
would be less likely to be of any benefit to anyone. But I think we
can reasonably say it is a very minor matter that makes little
practical difference to anyone.
You keep saying this but then still disagree it is pointless.
If that extra namespace disappeared overnight, would anyone notice?
If making a new language, would you choose to have the same label name
and non-label name co-exist in the same scope?
If not familiar with C, the concept would be bizarre (it is bizarre
anyway).
But labels are inherently simpler - you can only "goto" directly to a
defined label. Parentheses could be of no benefit, so /allowing/ them
would complicate the grammar.
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
That means that whatever follows 'goto' is just an expression and those
may have parentheses.
However, that example gives a better clue as to why label names are
special in C: that "L1:" looks like a label definition. I doubt such definitions are allowed in the middle of an expression, but it looks problematical.
It is not an anomaly when very different things have different rules.
OK, call it a lack of orthogonality.
There is also not being able to write 'L:}'; this one bites.
I don't see why that "bites", unless it is some weird smiley. I would
expect most people to have a the close brace on a separate line from
the label, but it is legitimate for people to want to jump to the end
of a block.
An intervening newline doesn't help. But this particular restriction has been eased in C23.
No, most language extensions do not become standard. This one has
been in gcc for maybe three decades or more, and is not part of the
standard.
It (label pointers) has been used to help make CPython faster for a long time. But only on Linux. On Windows, CPython has to build with MSVC (as
of a decade ago but maybe still the case), and that doesn't have the feature.
So Python on Windows may be slower because that useful extension was not standardised.
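For readers unfamiliar with the extension being discussed, here is a small sketch of GNU C "labels as values" (`&&label` and `goto *ptr`). It is not ISO C and is only expected to build with gcc or clang. The two-opcode bytecode and the function name `run` are invented for illustration and have nothing to do with CPython's actual interpreter loop.

```c
/* GNU C only: uses the "labels as values" extension, not ISO C. */

/* Hypothetical bytecode: 0 = increment accumulator, 1 = halt. */
int run(const unsigned char *code)
{
    /* Table of label addresses, indexed by opcode. */
    static void *dispatch[] = { &&op_inc, &&op_halt };
    int acc = 0;

    goto *dispatch[*code++];    /* jump to the first instruction */

op_inc:
    acc++;
    goto *dispatch[*code++];    /* dispatch directly to the next opcode */

op_halt:
    return acc;
}
```

In standard C the same dispatch would be written as a `switch` inside a loop, and something like `goto cond ? L1 : L2;` would be spelled `if (cond) goto L1; else goto L2;`.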
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
On 21/05/2026 13:56, Bart wrote:
[Separate 'namespace' for label names]
(The separation of the struct/union/enum tag name space from variable
namespace /is/ a feature that people use
Urgh, another discussion. Let's not go there. Suffice that it is
unique to C.
Really? All other languages keep everything in the one name space, do they? Are you /sure/ about that? You have checked how Lisp, Scala, Ocaml, and other languages work?
On 21/05/2026 07:45, David Brown wrote:[...]
So do you have an example of where code has been written to take
advantage of the separate name spaces?
No. This is why I claimed it was pointless.
You didn't like my example of being able to write '(F)(x)' (or 'int
(a);') but not 'goto (L)', but the anomaly is there.
There is also not being able to write 'L:}'; this one bites.
On 2026-05-21 19:54, James Kuyper wrote:
On 2026-05-21 09:08, Janis Papanagnou wrote:
On 2026-05-21 13:56, Bart wrote:...
There is also not being able to write 'L:}'; this one bites.
But, being curious, what is that supposed to mean? - A label at the
end of a block? - Seems to work for me (with my GNU C-compiler)...
The only two places where labels are permitted are in labeled statements
(6.8.2p1) and goto statements. Since ';' qualifies as a null statement,
Bart is required to make the extremely painful modification of instead
writing 'L:;}'.
Aha, so the GNU C-compiler I'm using was just generous to not require
a semicolon at the block-terminating brace.
Thanks for the information (and the standard reference)!
You're saying Lisp has all this too?
Lisp does have a reputation for being everything (interpreted/compiled; static/dynamic; functional/imperative etc), but this seems a stretch.
Bart <bc@freeuk.com> writes:
On 21/05/2026 07:45, David Brown wrote:[...]
So do you have an example of where code has been written to take
advantage of the separate name spaces?
No. This is why I claimed it was pointless.
This entire discussion is pointless. It is a fact that C labels
are in their own name space. Nobody here has provided a rationale
for that fact. Nobody here other than you particularly cares.
Here's an entirely speculative possible rationale. If I define
a variable "foo" within a function, and then add a label "foo:"
later in the same function, the label name does not interfere with
the variable name. If labels were not in their own name space,
adding the label later in the function would interfere with the
declaration earlier in the function. Labels are the only named
entities whose scope begins before their definition. That *might*
have been in Ritchie's mind when he specified how labels work.
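A small sketch of the situation described above, with a made-up function name. The variable `foo` and the label `foo` coexist in one function, and the `goto` appears before the label is defined, which is legal because labels have function scope.

```c
/* `foo` is both an int variable and a label in the same function. */
int label_and_variable(int skip)
{
    int foo = 1;

    if (skip)
        goto foo;       /* names the label, defined further down */

    foo = foo + 41;     /* names the variable */

foo:
    return foo;         /* label `foo:` followed by the variable `foo` */
}
```

Called with a nonzero argument the function returns 1; with zero it returns 42.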
Putting parentheses around a label name wouldn't even make any sense. Disallowing `goto (L)` is not a quirk. Allowing it would be.
There is also not being able to write 'L:}'; this one bites.
It's always been trivially easy to write `L:;}` rather than `L:}`.
And C23 allows a bare label just before a `}`, so the problem is
now fixed.
I predict that you will be outraged that this minor problem wasn't
solved sooner rather than happy that it was actually solved.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:...
On 2026-05-21 19:54, James Kuyper wrote:
The only two places where labels are permitted are in labeled
statements
(6.8.2p1) and goto statements. Since ';' qualifies as a null statement,
Bart is required to make the extremely painful modification of instead
writing 'L:;}'.
Aha, so the GNU C-compiler I'm using was just generous to not require
a semicolon at the block-terminating brace.
Thanks for the information (and the standard reference)!
gcc by default compiles GNU C, not ISO C. It also fails to emit many language-required diagnostics. GNU C is ISO C with gcc extensions.
Prior to C23, allowing "L:}" is a documented gcc extension.
It's rejected if you ask gcc to conform to ISO C17 or earlier.
Observing what gcc allows by default tells you very little about
the C language.
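As an illustration of the portable form, here is a minimal sketch (the function name is hypothetical) with the label followed by a null statement, which every C standard accepts. Going by the description above, dropping the `;` after `next:` should be diagnosed under something like `gcc -std=c17 -pedantic-errors` and accepted under C23; the exact behaviour depends on the gcc version, so treat that as something to check rather than a guarantee.

```c
/* Sums the non-negative elements of v[0..n-1]. The label at the end of
 * the loop body is followed by a null statement, so it still prefixes a
 * statement as the pre-C23 grammar requires. */
int sum_nonnegative(const int *v, int n)
{
    int total = 0;

    for (int i = 0; i < n; i++) {
        if (v[i] < 0)
            goto next;      /* skip this element */
        total += v[i];
    next: ;                 /* the ';' is the portable workaround */
    }
    return total;
}
```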
This entire discussion is pointless. It is a fact that C labels
are in their own name space. Nobody here has provided a rationale
for that fact.
On 21/05/2026 22:23, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
On 21/05/2026 07:45, David Brown wrote:[...]
So do you have an example of where code has been written to take
advantage of the separate name spaces?
No. This is why I claimed it was pointless.
This entire discussion is pointless. It is a fact that C labels
are in their own name space. Nobody here has provided a rationale
for that fact. Nobody here other than you particularly cares.
This is how labels entered the discussion:
"The simplest scoping rule in C is for labels: they have function-wide
scope, regardless of block nesting label.
That would almost be as simple as how my own scopes work, except C
just has to still be quirky:
double c; c: goto c;
(Labels of course have their own namespace, for some obscure
reason. I'm sure 99% of C programmers don't know that.) "
I introduced label names as an example of a much simpler scoping
scheme than other kinds of names.
Did the earliest C have block scopes? Then scoping rules between
labels and non-labels would have clashed. Perhaps that was a reason.
Putting parentheses around a label name wouldn't even make any sense.
Disallowing `goto (L)` is not a quirk. Allowing it would be.
Allowing labels as expressions has advantages; I gave some
examples. Being able to use parentheses is a consequence, but not
useful in itself. But the special namespace puts paid to that.
There is also not being able to write 'L:}'; this one bites.
It's always been trivially easy to write `L:;}` rather than `L:}`.
And C23 allows a bare label just before a `}`, so the problem is
now fixed.
I predict that you will be outraged that this minor problem wasn't
solved sooner rather than happy that it was actually solved.
Yes; why did it exist in the first place?
(1) I complain about something
(2) Everyone says it is a non-issue and I'm complaining about
trivialities
(3) Despite that, it mysteriously gets fixed anyway
Is it possible that I might have had a valid point in the first place?
On 2026-05-21 17:23, Keith Thompson wrote:
...
This entire discussion is pointless. It is a fact that C labels
are in their own name space. Nobody here has provided a rationale
for that fact.
I did. I pointed out that the standard sets aside a separate name space
for each kind of identifier which is usable only in syntactic contexts
where an ordinary identifier would not be allowed.
The Rationale says "The position adopted in the Standard is to permit as
many separate name spaces as can be distinguished by context, ...".
Apparently the Committee likes separate name spaces; you'll have to ask
them why.
Bart <bc@freeuk.com> writes:
Allowing labels as expressions has advantages; I gave some
examples. Being able to use parentheses is a consequence, but not
useful in itself. But the special namespace puts paid to that.
In C, label names are not expressions. It's that simple. C does
not allow parentheses around a label name, and there is no reason
at all why it should do so. (If it did, you'd probably complain
that it's inconsistent.)
It wasn't seen as necessary. The workaround, adding a semicolon, is
trivial, and has been mentioned at least as far back as K&R1, 1978.
(1) I complain about something
(2) Everyone says it is a non-issue and I'm complaining about
trivialities
(3) Despite that, it mysteriously gets fixed anyway
Is it possible that I might have had a valid point in the first place?
Sure. FYI, here's the proposal that, as far as I can tell, led to C23 allowing a label immediately before a "}". It was written by Martin
Uecker in 2020.
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2496.pdf>
The author described the existing definition as "an unnecessarily and annoying limitation" and proposed a fix, which was accepted.
Two things can be true simultaneously:
1. The existing limitation was no more than a minor annoyance with
an easy workaround. Few C programmers thought that it "bites".
2. The committee decided it was worth fixing. Somebody (notably not
you)
On 21/05/2026 02:45, Janis Papanagnou wrote:
[...]
David, the answer to my emailed question arrived; the behavior of
the Genie interpreter in case of identity relations is just a bug!
[...]
Algol 68 is a well designed language and the scope rules are as we
expect sensibly chosen; entities are valid within the block scope
and their "existence", when you can use them, starts with their
declaration (not before).
[...]
OK. I know very little about Algol 68, and don't have the interest to bother checking such details myself. Your description makes sense to
me, both of the way the language is intended to work, and that there is
an oddity or bug in a particular implementation. "It makes sense to me" is, of course, not a particularly strong argument. But I don't think it
is worth going into more detail here in c.l.c.
Bart does have a tendency to mix up "this is how the language is
designed" and "this is what a particular implementation does". With his own languages, that's fine - there is only one implementation, and only
a rough description of the language, so the two coincide. It is also
how pre-standardisation languages tend to work. But it is not how C
works, nor, as I understand it, how Algol68 works.
On 2026-05-21 10:16, David Brown wrote:
[...]
Short update on that (for those interested); my question I recently
sent to Marcel about Genie's behavior made him not only explain that
it's a bug, but he also immediately fixed it. - The current Genie
release 3.12.2 (published a few hours ago) now reports:
a68g: runtime error: 1: attempt to use an uninitialised INT value,
in [] "SIMPLOUT" collateral-clause starting at "(" in this line.
Janis
On 2026-05-21 17:31, Keith Thompson wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:...
On 2026-05-21 19:54, James Kuyper wrote:
The only two places where labels are permitted are in labeled
statements
(6.8.2p1) and goto statements. Since ';' qualifies as a null statement,
Bart is required to make the extremely painful modification of instead
writing 'L:;}'.
Aha, so the GNU C-compiler I'm using was just generous to not require
a semicolon at the block-terminating brace.
Thanks for the information (and the standard reference)!
gcc by default compiles GNU C, not ISO C. It also fails to emit many
language-required diagnostics. GNU C is ISO C with gcc extensions.
Prior to C23, allowing "L:}" is a documented gcc extension.
It's rejected if you ask gcc to conform to ISO C17 or earlier.
Observing what gcc allows by default tells you very little about
the C language.
He's not relying on gcc, he's relying on my citation of a draft of the standard.
I missed the fact that 6.8.3p1 has been changed to allow a
label with no following statement as a block item in a compound statement.
Bart <bc@freeuk.com> writes:
[...]
(Labels of course have their own namespace, for some obscure
reason. I'm sure 99% of C programmers don't know that.) "
I suggest using the term "name space" rather than "namespace",
which happens to be the name of a C++ feature.
[...]
On 2026-05-22 02:24, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
(Labels of course have their own namespace, for some obscure
reason. I'm sure 99% of C programmers don't know that.) "
I suggest using the term "name space" rather than "namespace",
which happens to be the name of a C++ feature.
The term "namespace" is, as I observe, a regular English word
(it's in my dictionary, at least).
So if you want to talk about a namespace there's nothing wrong
with using the term namespace (without any punctuation or any
arbitrary spacing).
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
C does not have any kind of goto statement with an expression. This is good and appropriate. "goto" is an unstructured concept, and should be used sparingly in well-written code.
[...]
It is not an anomaly when very different things have different rules.
OK, call it a lack of orthogonality.
"A foolish consistency is the hobgoblin of little minds."
[...]
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
On 2026-05-22 02:24, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
(Labels of course have their own namespace, for some obscure
reason. I'm sure 99% of C programmers don't know that.) "
I suggest using the term "name space" rather than "namespace",
which happens to be the name of a C++ feature.
The term "namespace" is, as I observe, a regular English word
(it's in my dictionary, at least).
It's not in dictionary.com or the online Merriam Webster (not
surprisingly, the OED has it), and I don't think it's a common term
outside programming.
So if you want to talk about a namespace there's nothing wrong
with using the term namespace (without any punctuation or any
arbitrary spacing).
The problem is that "namespace" is a C++ keyword with a specific
meaning that doesn't apply to C. And the C standard consistently
uses the phrase "name space", and defines it as a technical term.
Of course C and C++ are two different languages, but they share a lot
in common. C++ has both C-style name spaces and its own namespaces.
Referring to C's name spaces as "namespaces" might not cause much
confusion, but it's likely to induce a reaction from the more
pedantic among us.
On 21/05/2026 20:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
On 21/05/2026 13:56, Bart wrote:
[Separate 'namespace' for label names]
(The separation of the struct/union/enum tag name space from
variable namespace /is/ a feature that people use
Urgh, another discussion. Let's not go there. Suffice that it is
unique to C.
Really? All other languages keep everything in the one name space, do
they? Are you /sure/ about that? You have checked how Lisp, Scala,
Ocaml, and other languages work?
OK, it's something I haven't come across before in other languages.
Some may have naming conventions, others may only have types appearing
in certain contexts.
But I don't recall seeing anything like C's scheme which has both struct
tag names in their own namespace, AND user-defined type names in the
general namespace:
typedef struct Point {int x, y;} Point;
struct Point a;
Point b;
or:
struct Vector {Point p, q;};
struct Vector Vector;
It's messy; Point is both a struct tag and type name in the same scope. Vector is both a struct tag and variable name in the same scope.
You can't have the hat-trick however because the same type name and
variable name can't appear in the same scope; it needs an extra namespace!
At best they can be in the same block:
struct Vector Vector; {....; typedef struct Vector Vector; Vector:;}
Within those {} exist Vector (variable); Vector (struct tag); and Vector (type name); plus Vector (label) for good measure.
You're saying Lisp has all this too?
Lisp does have a reputation for being everything (interpreted/compiled; static/dynamic; functional/imperative etc), but this seems a stretch.
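For reference, here is a minimal sketch of the C name spaces under discussion. The names `point`, `counter`, `done` and `sum_point` are made up for illustration; the point is only that one spelling can legally be a tag, a typedef name, a member, a variable and a label without conflict.

```c
struct point { int x, y; };      /* 'point' in the tag name space */
typedef struct point point;      /* 'point' again, as an ordinary identifier (typedef name) */

struct counter { int done; };    /* 'done' as a member: each struct has its own member name space */

/* Hypothetical helper: returns |x + y|, using 'done' as a variable and as a label. */
int sum_point(point p)
{
    struct counter c = { 0 };
    int done = 0;                /* 'done' as an ordinary identifier (variable) */

    done = p.x + p.y;
    c.done = done;
    if (c.done >= 0) {
        goto done;               /* after 'goto', 'done' is looked up as a label */
    }
    done = -done;
done:                            /* 'done' in the label name space */
    return done;
}
```

What C does not allow is the typedef name and the variable in the same scope, since both are ordinary identifiers.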
--------------------------------
record Point = (int x, y)
record Vector = (Point p, q)
Point a, b
Vector `Vector # best I could do
On 2026-05-21 21:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
What benefit is it to have an alternative construct for existing
clearer 'if' and 'switch' features? - What's the concrete gain? -
To create harder to follow spaghetti-code?
Labels are not first-class objects in "C", and there's no necessity
for them to be (IMO).
C does not have any kind of goto statement with an expression. This
is good and appropriate. "goto" is an unstructured concept, and
should be used sparingly in well-written code.
Well, for someone used to 'goto' the existence of a calculated 'goto'
might be considered worthwhile. (Many older languages do support
such a feature. But I have no overview about its prevalence in newer languages, though. I'd suppose it's primarily a remnant of history.)
[...]
It is not an anomaly when very different things have different rules.
OK, call it a lack of orthogonality.
"A foolish consistency is the hobgoblin of little minds."
Oh, orthogonality is a valuable property of a programming language!
"C" is mostly lacking that property, but "C" is what it is. There's
certainly no point in Bart's continuous complaints-tirades.
On 2026-05-22 02:24, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
[...]
(Labels of course have their own namespace, for some obscure
reason. I'm sure 99% of C programmers don't know that.) "
I suggest using the term "name space" rather than "namespace",
which happens to be the name of a C++ feature.
The term "namespace" is, as I observe, a regular English word
(it's in my dictionary, at least).
So if you want to talk about a namespace there's nothing wrong
with using the term namespace (without any punctuation or any
arbitrary spacing).
I think (instead of a space) it's sensible to apply typographical
conventions to signify "features". I've, for example, generally
used single quotes to identify programming language entities or
keywords, as in 'namespace' (a C++ feature or keyword).
("A namespace in C++ is denoted by the keyword 'namespace'.")
Denoting such things with (for example) single quotes makes the
whole text also better readable, because the typical programming
languages' entities are (also) English words and interfere with
the surrounding texts.
Janis
On 2026-05-21 10:16, David Brown wrote:
On 21/05/2026 02:45, Janis Papanagnou wrote:
[...]David, the answer of my emailed question arrived; the behavior of
the Genie interpreter in case of identity relations is just a bug!
[...]
Algol 68 is a well designed language and the scope rules are as we
expect sensibly chosen; entities are valid within the block scope
and their "existence", when you can use them, starts with their
declaration (not before).
[...]
OK. I know very little about Algol 68, and don't have the interest to
bother checking such details myself. Your description makes sense to
me, both of the way the language is intended to work, and that there
is an oddity or bug in a particular implementation. "It makes sense
to me" is, of course, not a particularly strong argument. But I don't
think it is worth going into more detail here in c.l.c.
Bart does have a tendency to mix up "this is how the language is
designed" and "this is what a particular implementation does". With
his own languages, that's fine - there is only one implementation, and
only a rough description of the language, so the two coincide. It is
also how pre-standardisation languages tend to work. But it is not
how C works, nor, as I understand it, how Algol68 works.
Short update on that (for those interested); my question I recently
sent to Marcel about Genie's behavior made him not only explain that
it's a bug, but he also immediately fixed it. - The current Genie
release 3.12.2 (published a few hours ago) now reports:
a68g: runtime error: 1: attempt to use an uninitialised INT value,
in [] "SIMPLOUT" collateral-clause starting at "(" in this line.
entities are valid within the block scope
and their "existence", when you can use them, starts with their
declaration (not before).
On 2026-05-21 21:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
What benefit is it to have an alternative construct for existing
clearer 'if' and 'switch' features? - What's the concrete gain? -
To create harder to follow spaghetti-code?
Labels are not first-class objects in "C", and there's no necessity
for them to be (IMO).
C does not have any kind of goto statement with an expression. This
is good and appropriate. "goto" is an unstructured concept, and
should be used sparingly in well-written code.
Well, for someone used to 'goto' the existence of a calculated 'goto'
might be considered worthwhile. (Many older languages do support
such a feature. But I have no overview about its prevalence in newer languages, though. I'd suppose it's primarily a remnant of history.)
On 22/05/2026 05:52, Janis Papanagnou wrote:
On 2026-05-21 21:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
What benefit is it to have an alternative construct for existing
clearer 'if' and 'switch' features? - What's the concrete gain? -
To create harder to follow spaghetti-code?
You're asking why '?:' exists when you can use 'if-else'?
Actually you can do pretty much this with gnu C:
goto *(c ? &&L1 : &&L2);
but it still doesn't allow a general expression. It seems that what
comes after 'goto' must be either a regular label, or '*' to dereference
a label pointer.
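A minimal sketch of that GNU C form, for anyone who has not seen it. This relies on the "labels as values" extension (`&&label` and `goto *ptr`) in GCC and Clang, so it is not standard C; the function name `pick` is made up for illustration.

```c
/* GNU C extension (labels as values); not standard C. */
int pick(int c)
{
    /* &&L1 and &&L2 yield values of type void * that 'goto *' can jump to. */
    void *target = c ? &&L1 : &&L2;

    goto *target;
L1:
    return 1;
L2:
    return 2;
}
```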
In my syntax, it can do this:
goto (c | L1, L2, L3 | Lx)
It's sweeter /because/ label names live in the normal name space. I'm
sure A68 can do this too, but I couldn't get it to work.
The point is, other people do think it has a benefit! At least in gnu C
and A68G even if you dismiss my own use of it.
C does not have any kind of goto statement with an expression. This
is good and appropriate. "goto" is an unstructured concept, and
should be used sparingly in well-written code.
Well, for someone used to 'goto' the existence of a calculated 'goto'
might be considered worthwhile. (Many older languages do support
such a feature. But I have no overview about its prevalence in newer
languages, though. I'd suppose it's primarily a remnant of history.)
In gnu C computed goto tends to be used for faster dispatching (in
bytecode interpreters, emulators etc) compared with regular 'switch',
because it uses multiple dispatch points (which help CPU branch
prediction). 'switch' uses only one.
(My language has a dedicated feature for this, so using explicit
computed goto is not needed. That allows it to outperform optimised
/standard/ C that uses plain switch.)
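As a rough sketch of what "multiple dispatch points" means, here is a toy stack machine using the GNU C labels-as-values extension (GCC and Clang only, not standard C). The opcodes and the name `run` are invented for illustration, there is no bounds checking, and no performance claim is implied.

```c
enum { OP_PUSH, OP_ADD, OP_HALT };   /* hypothetical opcodes */

/* Runs a tiny program; assumes well-formed code that ends with OP_HALT. */
int run(const int *code)
{
    /* One entry per opcode, indexed by opcode value. */
    static void *dispatch[] = { &&do_push, &&do_add, &&do_halt };
    int stack[16];
    int sp = 0;
    const int *pc = code;

/* Each handler ends with its own indirect jump, rather than all handlers
   returning to one shared 'switch' at the top of a loop. */
#define NEXT() goto *dispatch[*pc++]

    NEXT();
do_push:
    stack[sp++] = *pc++;             /* operand follows the opcode */
    NEXT();
do_add:
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_halt:
    return stack[sp - 1];

#undef NEXT
}
```

With a plain `switch` in a loop the same handlers would share a single dispatch jump; whether the difference matters on a given machine is something only measurement can show.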
On 22/05/2026 12:41, Bart wrote:
On 22/05/2026 05:52, Janis Papanagnou wrote:
On 2026-05-21 21:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
What benefit is it to have an alternative construct for existing
clearer 'if' and 'switch' features? - What's the concrete gain? -
To create harder to follow spaghetti-code?
You're asking why '?:' exists when you can use 'if-else'?
I don't believe that is at all what he was asking. As I see it, he was asking what benefit you have from writing :
goto cond ? L1 : L2;
compared to :
if (cond) {
goto L1;
} else {
goto L2;
}
Given that - except for niche applications - goto's are rarely used in C
and then almost always with a single label ("goto error", or escaping
from nested loops), it seems absurd to me to complicate gotos and labels
by supporting expressions.
C is mostly a structured programming
language - unstructured constructs like "goto" should be minimal, not encouraged.
On 22/05/2026 12:31, David Brown wrote:
On 22/05/2026 12:41, Bart wrote:
On 22/05/2026 05:52, Janis Papanagnou wrote:
On 2026-05-21 21:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
What benefit is it to have an alternative construct for existing
clearer 'if' and 'switch' features? - What's the concrete gain? -
To create harder to follow spaghetti-code?
You're asking why '?:' exists when you can use 'if-else'?
I don't believe that is at all what he was asking. As I see it, he
was asking what benefit you have from writing :
goto cond ? L1 : L2;
compared to :
if (cond) {
goto L1;
} else {
goto L2;
}
This is the same difference between:
a = cond ? x : y;
and:
if (cond) {
a = x;
} else {
a = y;
}
If ?: is considered a convenient, clearer short-cut for the latter, then
the same applies to the goto.
Given that - except for niche applications - goto's are rarely used in C
You seem to be supporting some kinds of /very/ small niches like those enabled by label name spaces.
But I bet 'goto' in C is a lot more common!
and then almost always with a single label ("goto error", or escaping
from nested loops), it seems absurd to me to complicate gotos and
labels by supporting expressions.
And yet, in the gnu extension, exactly this is enabled, by allowing first-class labels (sort of) and allowing them within expressions.
C is mostly a structured programming language - unstructured
constructs like "goto" should be minimal, not encouraged.
But when you do need to conditionally jump to L1 or L2, then ?: lets you
do that with only one 'goto', not two. (I can do it with zero: (cond |
L1 | L2); the 'goto' can be implied.)
We all know you don't use it yourself, but I can list half a dozen use-cases for 'goto'. Here are two:
(1) For generated C code when transpiling from a higher level language. Without 'goto' this would be near-impossible. (Not impossible, but hard enough that it is easier to use another language.)
(2) For porting code to C that already uses 'goto'.
In article <10upbvq$1dhq4$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 22/05/2026 05:52, Janis Papanagnou wrote:
On 2026-05-21 21:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
What benefit is it to have an alternative construct for existing
clearer 'if' and 'switch' features? - What's the concrete gain? -
To create harder to follow spaghetti-code?
You're asking why '?:' exists when you can use 'if-else'?
Is that meant to be obtuse? No, the question is why anyone
would want to write code that uses `goto` in that fashion.
The ternary operator, `?:`, is expression-oriented. Label names
are just identifiers; identifiers may appear in expressions
(specifically, "primary expressions"), but only if they name an
object, or a function.
A label name can only appear as part of a labeled statement;
thus, it names neither an object nor a function, so it cannot be
used as part of a primary expression, and so therefore, one
cannot use label names in the form mentioned above.
You seem to be saying, "look, you can't do this!" But the
question is, _why would you want to do that in the first place_?
My subjective opinion is that that is not sweet: it's hideous.
Obviously someone thinks, or thought, that this had some kind of
benefit, otherwise they wouldn't have implemented it. It seems extraordinarily rarely used.
Sometimes people have an idea of what they think is going to be
a neat feature, only to find out with some experience that it
didn't pan out the way they thought it might. This happens all
the time;
In gnu C computed goto tends to be used for faster dispatching (in
bytecode interpreters, emulators etc) compared with regular 'switch',
because it uses multiple dispatch points (which help CPU branch
prediction). 'switch' uses only one.
(My language has a dedicated feature for this, so using explicit
computed goto is not needed. That allows it to outperform optimised
/standard/ C that uses plain switch.)
Absent a benchmark, I find that assertion specious. Speculation
about performance on modern machines is just that; speculation.
But machines these days are so incredibly complicated, and so
dynamic during execution, that we simply cannot reason about
them from first principles anymore: you either produce data for
a particular target machine, or you're just guessing.
On 22/05/2026 01:24, Keith Thompson wrote:
Bart <bc@freeuk.com> writes:
Allowing labels as expressions has advantages; I gave some
examples. Being able to use parentheses is a consequence, but not
useful in itself. But the special namespace puts paid to that.
In C, label names are not expressions. It's that simple. C does
not allow parentheses around a label name, and there is no reason
at all why it should do so. (If it did, you'd probably complain
that it's inconsistent.)
You don't understand my point, so I will leave it.
Sure. FYI, here's the proposal that, as far as I can tell, led to C23
allowing a label immediately before a "}". It was written by Martin
Uecker in 2020.
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2496.pdf>
This is very interesting. So perhaps tens of millions of C programmers
encountered the quirk, found it annoying, and moved on.
On 21/05/2026 14:18, Dan Cross wrote:
In article <10umdsa$hnml$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip]
I think one can level some reasonable criticism at the language,
however. The `goto error;` idiom is used in C because there are
few alternatives for cleanup handling on failure. Modulo what
we discussed before, that code can _often_ be restructured to
avoid it, but sometimes it can't, and frequently it just isn't.
I think "isn't" is more common than "can't".
But I also don't think the
"goto error" idiom is necessarily bad in itself - it's just that it is
often used badly. A typical indication of poor usage is when a function
is getting very long, and there are multiple "error" labels.
However, the problem with this code was not the "goto error" idiom, or
the "goto" itself - the problem was the mismatch between indentation and
the statement under control of the "if". It would have been equally bad
if it had been "return" rather than "goto", or if gcc cleanup attributes
had been used to handle cleanup, or if some kind of "defer" mechanism
had been used (as supported by some programming languages, and proposed
for a future C version).
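For readers unfamiliar with the idiom, a minimal sketch of "goto error" with a single cleanup label follows. The function `duplicate_pair` and its parameters are hypothetical; braces are used on every `if` so that indentation cannot disagree with control flow.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: copies two strings. Returns 0 on success, -1 on failure. */
int duplicate_pair(const char *a, const char *b, char **out_a, char **out_b)
{
    char *ca = NULL;
    char *cb = NULL;

    if (a == NULL || b == NULL) {
        goto error;
    }
    ca = malloc(strlen(a) + 1);
    if (ca == NULL) {
        goto error;
    }
    cb = malloc(strlen(b) + 1);
    if (cb == NULL) {
        goto error;
    }
    strcpy(ca, a);
    strcpy(cb, b);
    *out_a = ca;
    *out_b = cb;
    return 0;

error:
    /* Single exit path for failures; free(NULL) is a no-op. */
    free(cb);
    free(ca);
    return -1;
}
```

On success the caller owns both copies and must free them.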
These various cleanup mechanisms can definitely be better than "goto
error", but they would not have prevented this error. (A programming
language that requires braces for statements controlled by "if", on the
other hand, /would/ have prevented the error.)
[snip]
Other languages feature "linear types", instances of which must
be used exactly once: failure to cleanup on an error path is a
compile time error. This means that you can't forget to
deallocate memory, close a file, or unlock a mutex, for example,
but I don't know that it directly addresses the issue where you
just skip over the actual thing the program is supposed to do.
Still, with more expressive type systems, it is likely one can
much more easily structure the program so that the type of logic
that led to this failure is unnecessary.
There is definitely potential for a language's type system to make it
harder to make some kinds of mistakes. But there is a risk in making
the language too restrictive - people end up writing horrible code to
work around restrictions, or use "unsafe" code too much.
I did some work, eons ago, in a language called XC that was specifically
for XMOS microcontrollers. The language and tools had a feature that
made data races impossible by not allowing competing access to shared
variables from different threads. Since threads were part of the
hardware, and the tools analysed the code flow through threads, this
could all be enforced at build time - data had to be passed in messages,
not shared memory. But for some things that involved large buffers,
that was hopelessly inefficient - and these devices were regularly used
with USB, audio, and similar things that needed large and predictable
buffers. So code - even library and example code from the manufacturer
- was full of inline assembly to work around the "smart-arse" language
and tools.
I understand that someone has written a paper proposing adding
`defer` to C; that would obviate many of these problems.
Yes. Jens Gustedt - one of the few members of the C standards committee
who is vocal and public about pushing new ideas into C. (Of course not
everyone will agree with the ideas he comes with.)
<https://gustedt.wordpress.com/2026/02/15/defer-available-in-gcc-and-clang/>
This is the same difference between:
a = cond ? x : y;
and:
if (cond) {
a = x;
} else {
a = y;
}
If ?: is considered a convenient, clearer short-cut for the latter,
then the same applies to the goto.
[snip]
C is mostly a structured programming
language - unstructured constructs like "goto" should be minimal, not
encouraged.
But when you do need to conditionally jump to L1 or L2, then ?: lets you
do that with only one 'goto', not two.
(I can do it with zero: (cond | L1 | L2); the 'goto' can be implied.)
We all know you don't use it yourself, but I can list half a dozen
use-cases for 'goto'. Here are two:
(1) For generated C code when transpiling from a higher level language.
Without 'goto' this would be near-impossible. (Not impossible, but hard
enough that it is easier to use another language.)
(2) For porting code to C that already uses 'goto'.
In article <10un0j7$obiv$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 21/05/2026 14:18, Dan Cross wrote:
In article <10umdsa$hnml$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip]
Other languages feature "linear types", instances of which must
be used exactly once: failure to cleanup on an error path is a
compile time error. This means that you can't forget to
deallocate memory, close a file, or unlock a mutex, for example,
but I don't know that it directly addresses the issue where you
just skip over the actual thing the program is supposed to do.
Still, with more expressive type systems, it is likely one can
much more easily structure the program so that the type of logic
that led to this failure is unnecessary.
There is definitely potential for a language's type system to make it
harder to make some kinds of mistakes. But there is a risk in making
the language too restrictive - people end up writing horrible code to
work around restrictions, or use "unsafe" code too much.
Interesting. I've found it to be somewhat the opposite; using a
richer type system has led to code that is easier to understand
and reason about, and less buggy: type-oriented programming can
make entire categories of errors *unrepresentable*, so it's not
just _harder_ to make certain kinds of mistakes, but
_impossible_. The existence of an object of some type can be
thought of as an existence proof that the invariants the type
represents hold.
And Alexis King wrote the great, "Parse, Don't Validate" essay
some years ago where she talks about "Type-Driven Development": https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
More recently, Yaron Minsky gave a talk and discussed this in an
OCaml context (https://www.youtube.com/watch?v=rUYP4C29yCw; the
most relevant bits start at about 6:35. The talk is about AI
and constraining agents, but the discussion around types is more
general).
I wrote the production OS loader for Oxide compute sleds using
this technique in the virtual memory system (and other places):
the loader uses multiple page sizes, and the rule is that, when
mapping a region of memory, it uses the largest page size that
it can, given size and alignment constraints. But I use the
type system to make it impossible to, say, map a 2MiB "large"
physical page frame to a non-2MiB aligned virtual boundary.
I did some work, eons ago, in a language called XC that was specifically
for XMOS microcontrollers. The language and tools had a feature that
made data races impossible by not allowing competing access to shared
variables from different threads. Since threads were part of the
hardware, and the tools analysed the code flow through threads, this
could all be enforced at build time - data had to be passed in messages,
not shared memory. But for some things that involved large buffers,
that was hopelessly inefficient - and these devices were regularly used
with USB, audio, and similar things that needed large and predictable
buffers. So code - even library and example code from the manufacturer
- was full of inline assembly to work around the "smart-arse" language
and tools.
Hmm. This reads less like an indictment of the idea of stronger
typing, but rather a failure to provide adequate abstractions in
the type system.
I'm going to mention Rust again; apologies. When confined to
the safe subset, it _also_ has data race freedom. The rules
that give this property are:
1. Every object has exactly one owner, and assignments of
non-trivial types change ownership (they are logically a
"move");
2. References to an object may be "borrowed" from the owner,
and mutable references (that is, references that may be used
to write to the object) are distinct from immutable
references (that is, references that may only be used to read
from an object);
3. Mutable and immutable references are temporally mutually
exclusive: a mutable reference may be borrowed from an object
iff that is the only live reference to that object at the
time: that is, it is not permitted to borrow a mut ref to an
object if either another mut ref or any immutable refs to it
are live; any number of immutable references may be taken to
an object concurrently.
If all of these rules are obeyed (and in safe rust, they're
verified at compile time by the borrow checker) then you cannot
have data races.
At first glance it appears that it must suffer from the same
drawbacks as `XC`, which you mentioned above. Except that the
language does provide controlled ways to share data
concurrently.
Using the _unsafe_ subset, there is one place where it is
permitted to have multiple, mutable references to an object: the `UnsafeCell`. This gives Rust interior mutability, which means
that you can build safe abstractions for data sharing, like
`Mutex` types that own the data they protect.
That last bit is important: since the `Mutex` owns whatever it
protects, it controls access to that thing: there's no going
behind the `Mutex`'s back and accessing something outside of
the lock. Instead, the `lock` method returns a `MutexGuard`
object, which one might think of as a handle to the data,
allowing the user to manipulate it, but also having a drop
method (in Rust, that's basically a destructor) that
automatically unlocks the mutex when the guard is destroyed.
Anyway, the point is, both the rigor _and_ the expressiveness of
the type system let you do things like this, whereas you just
cannot in C.
I understand that someone has written a paper proposing adding
`defer` to C; that would obviate many of these problems.
Yes. Jens Gustedt - one of the few members of the C standards committee
who is vocal and public about pushing new ideas into C. (Of course not
everyone will agree with the ideas he comes with.)
<https://gustedt.wordpress.com/2026/02/15/defer-available-in-gcc-and-clang/>
Ah, I didn't realize it was Jens. Very cool.
- Dan C.
Bart <bc@freeuk.com> writes:
On 22/05/2026 01:24, Keith Thompson wrote:
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2496.pdf>
This is very interesting. So perhaps tens of millions of C programmers
encountered the quirk, found it annoying, and moved on.
The vast majority of C programmers have likely never used goto.
On 22/05/2026 15:04, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/05/2026 01:24, Keith Thompson wrote:
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2496.pdf>
This is very interesting. So perhaps tens of millions of C programmers
encountered the quirk, found it annoying, and moved on.
The vast majority of C programmers have likely never used goto.
Source? Oh, 'likely', so it is just a random guess.
In any case the issue (labels at the end of a compound statement)
applied also to case labels and to 'default:'.
Now you're going to say the vast majority of C programmers have never
used 'switch' either! 'Probably...'
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
This is the same difference between:
a = cond ? x : y;
and:
if (cond) {
a = x;
} else {
a = y;
}
If ?: is considered a convenient, clearer short-cut for the latter,
First, it's not, for reasons that have been explained to you but
that you are ignoring because you don't like them.
But beyond that, I don't think that's what it is "considered" to
be, at all. `?:` can be used in an expression; `if` cannot.
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
On 22/05/2026 17:16, Bart wrote:
On 22/05/2026 15:04, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/05/2026 01:24, Keith Thompson wrote:
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2496.pdf>
This is very interesting. So perhaps tens of millions of C programmers
encountered the quirk, found it annoying, and moved on.
The vast majority of C programmers have likely never used goto.
Source? Oh, 'likely', so it is just a random guess.
In any case the issue (labels at the end of a compound statement)
applied also to case labels and to 'default:'.
If a programmer wants a case label or default label at the end of a
switch, doing nothing, then (prior to C23, and excluding compiler extensions) they have to put in an empty statement - "default : ;". (Or they could use "break;" to make code feel more symmetrical.) I can
imagine that happening, and I can imagine that is the most likely
situation where you would want a label right before an end brace. This seems to be the main motivation for the change in C23.
It is much harder to imagine that this "bites" anyone, or even annoys
people in any significant way.
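A small sketch of the pre-C23 workaround described above; the function name `classify` and the values are made up. Before C23 a label had to be followed by a statement, so a trailing `default:` needs at least an empty statement; C23 also accepts the label directly before the closing brace.

```c
/* Hypothetical example: maps 0 and 1 to codes, everything else to 0. */
int classify(int n)
{
    int r = 0;

    switch (n) {
    case 0:
        r = 10;
        break;
    case 1:
        r = 20;
        break;
    default:
        ;       /* empty statement: required before C23, harmless afterwards */
    }
    return r;
}
```

Writing `default: break;` instead is equally valid and is the form many style guides prefer.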
Now you're going to say the vast majority of C programmers have never
used 'switch' either! 'Probably...'
I think you might need to switch out your crystal ball - your
predictions about what people will say or do have rarely come close to
reality. Or, perhaps, you should stop saying such stupid things.
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
This is the same difference between:
a = cond ? x : y;
and:
if (cond) {
a = x;
} else {
a = y;
}
If ?: is considered a convenient, clearer short-cut for the latter,
First, it's not, for reasons that have been explained to you but
that you are ignoring because you don't like them.
But beyond that, I don't think that's what it is "considered" to
be, at all. `?:` can be used in an expression; `if` cannot.
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
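To make the expression/statement distinction concrete, a minimal sketch (the function name `abs_diff` is made up): an initializer needs an expression, so `?:` can go there and `if` cannot.

```c
/* Hypothetical example: absolute difference of two ints (ignores overflow). */
int abs_diff(int a, int b)
{
    /* A const object must get its value in the initializer, which takes an
       expression; an 'if' statement cannot be written in this position. */
    const int d = (a > b) ? a - b : b - a;

    return d;
}
```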
then the same applies to the goto.
You have made it clear that you think that _should_ be true.
However, it is _not_ true. A brief perusal of the standard
would show _how_ it is not true.
(I can do it with zero: (cond | L1 | L2); the 'goto' can be implied.)
Yikes.
On 22/05/2026 13:13, Dan Cross wrote:
In article <10upbvq$1dhq4$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 22/05/2026 05:52, Janis Papanagnou wrote:
On 2026-05-21 21:23, David Brown wrote:
On 21/05/2026 19:26, Bart wrote:
On 21/05/2026 13:55, David Brown wrote:
[...]
Having a label be just another term could be a benefit. For example:
goto cond ? L1 : L2;
What benefit is it to have an alternative construct for existing
clearer 'if' and 'switch' features? - What's the concrete gain? -
To create harder to follow spaghetti-code?
You're asking why '?:' exists when you can use 'if-else'?
Is that meant to be obtuse? No, the question is why anyone
would want to write code that uses `goto` in that fashion.
The ternary operator, `?:`, is expression-oriented. Label names
are just identifiers; identifiers may appear in expressions
(specifically, "primary expressions"), but only if they name an
object, or a function.
And as I said, gnu C allows just that.
You're asking the wrong question anyway. It's usually a mistake to try
and second-guess the needs of a programmer, since it's possible they may
have a genuine use-case you haven't thought of. Or they have a
particular coding style they want to use.
A label name can only appear as part of a labeled statement;
thus, it names neither an object nor a function, so it cannot be
used as part of a primary expression, and so therefore, one
cannot use label names in the form mentioned above.
You seem to be saying, "look, you can't do this!" But the
question is, _why would you want to do that in the first place_?
So what on earth was the point of gnu C's &&L label pointers?
Other than enabling faster interpreters such as with CPython (which
needs all the help it can get). Writing FSMs and so on...
You seem to be downplaying the benefits only because standard C doesn't
allow it.
My subjective opinion is that that is not sweet: it's hideous.
OK. Of course we'd have to see the alternative.
Obviously someone thinks, or thought, that this had some kind of
benefit, otherwise they wouldn't have implemented it. It seems
extraordinarily rarely used.
Sometimes people have an idea of what they think is going to be
a neat feature, only to find out with some experience that it
didn't pan out the way they thought it might. This happens all
the time;
It happens with me too! You should see some of my ideas that were too
naff. There were also better ones but that just didn't get enough usage
to warrant the support needed.
But I believe 'goto' is fundamental at this level of language, and it is
handy to allow first-class handling of labels even if little used: it's
something that if it isn't there when needed, it is harder to get around.
In gnu C computed goto tends to be used for faster dispatching (in
bytecode interpreters, emulators etc) compared with regular 'switch',
because it uses multiple dispatch points (which help CPU branch
prediction). 'switch' uses only one.
(My language has a dedicated feature for this, so using explicit
computed goto is not needed. That allows it to outperform optimised
/standard/ C that uses plain switch.)
Absent a benchmark, I find that assertion specious. Speculation
about performance on modern machines is just that; speculation.
But machines these days are so incredibly complicated, and so
dynamic during execution, that we simply cannot reason about
them from first principles anymore: you either produce data for
a particular target machine, or you're just guessing.
OK, here is a test program which is a toy Pascal interpreter, running
recursive Fibonacci to calculate fib(36).
It was in C, then ported to my language. Someone also upgraded the C
version to use computed goto when available:
gcc -O2 using switch: 1.33 seconds (C version)
gcc -O2 using computed goto: 0.78 seconds
mm using 'doswitch': 1.03 seconds (my languages)
mm using 'doswitchu': 0.74 seconds (with special feature)
Timings are on Windows running on x64.
In my language, 'doswitch' is just a looping version of switch.
'doswitchu' is a version where the compiler uses multiple dispatch
points, one per branch, rather than looping back to the start.
Switching between the two means changing 'doswitch' to 'doswitchu'.
In C switching would require a rewrite, unless you write it in a
contrived manner. But then it still needs that bunch of macros shown
below, and it needs that table of label pointers to be manually maintained.
[snip]
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
This is the same difference between:
a = cond ? x : y;
and:
if (cond) {
a = x;
} else {
a = y;
}
If ?: is considered a convenient, clearer short-cut for the latter,
First, it's not, for reasons that have been explained to you but
that you are ignoring because you don't like them.
But beyond that, I don't think that's what it is "considered" to
be, at all. `?:` can be used in an expression; `if` cannot.
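A small sketch of the point that `?:` is an expression while `if` is a statement (format_flag is a hypothetical helper, loosely modelled on the snprintf usage quoted in this thread). The conditional can initialise a const object directly; the `if` form would need a non-const variable assigned in two places.

```c
#include <stdio.h>

/* Writes "Feature:+" or "Feature:-" into buf and returns snprintf's result. */
int format_flag(char *buf, size_t size, int enabled)
{
    /* '?:' yields a value, so it can appear in an initialiser. */
    const char sign = enabled ? '+' : '-';
    return snprintf(buf, size, "Feature:%c", sign);
}
```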
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
I wouldn't necessarily categorize it as "sparingly"; Searching
through the simulator source base, I find it used rather
frequently in certain scenarios:
diag = snprintf(cp, remaining, "Feature1:%c Feature2:%c", is_feature1()?'+':'-',
is_feature2()?'+':'-');
deliver_interrupt(IntVec_TIMER, is_secure()?FLAGS_SECURE:0u);
In article <10upkur$1fkbh$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
(My language has a dedicated feature for this, so using explicit
computed goto is not needed. That allows it to outperform optimised
/standard/ C that uses plain switch.)
Absent a benchmark, I find that assertion specious. Speculation
about performance on modern machines is just that; speculation.
But machines these days are so incredibly complicated, and so
dynamic during execution, that we simply cannot reason about
them from first principles anymore: you either produce data for
a particular target machine, or you're just guessing.
OK, here is a test program which is a toy Pascal interpreter, running
recursive Fibonacci to calculate fib(36).
It was in C, then ported to my language. Someone also upgraded the C
version to use computed goto when available:
gcc -O2 using switch: 1.33 seconds (C version)
gcc -O2 using computed goto: 0.78 seconds
mm using 'doswitch': 1.03 seconds (my languages)
mm using 'doswitchu': 0.74 seconds (with special feature)
Timings are on Windows running on x64.
In my language, 'doswitch' is just a looping version of switch.
'doswitchu' is a version where the compiler uses multiple dispatch
points, one per branch, rather than looping back to the start.
Where to start.
First, why not post the code? It's hard to know what is _actually_
going on here because one cannot see how the implementations
might differ otherwise, aside from the tiny snippet you
included.
Second, how was it compiled? What optimization settings? What
compiler? Did you test different compilers to see if they did
things differently? You've made a broad general assertion about
the code generated in response to a `switch` statement; my own
experience is that that varies widely based on compiler, target
architecture, and optimization settings.
Finally, your assertion was that the version using your multiple
dispatch is faster due to branch prediction: you've shown that
this is faster, but you haven't shown _why_, let alone that it
is due to branch prediction. Since you're running this on
x86_64, did you try to look at the machine's perf counters to
see what's actually happening?
Switching between the two means changing 'doswitch' to 'doswitchu'.
In C switching would require a rewrite, unless you write it in a
contrived manner. But then it still needs that bunch of macros shown
below, and it needs that table of label pointers to be manually maintained.
[snip]
I'm not sure what you mean here. You've given examples of
timings using computed gotos and `switch` in C; it's unclear
what would "require a rewrite" since you appear to already have
both versions.
On 22/05/2026 16:47, David Brown wrote:
On 22/05/2026 17:16, Bart wrote:
On 22/05/2026 15:04, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/05/2026 01:24, Keith Thompson wrote:
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2496.pdf>
This is very interesting. So perhaps tens of millions of C programmers
encountered the quirk, found it annoying, and moved on.
The vast majority of C programmers have likely never used goto.
Source? Oh, 'likely', so it is just a random guess.
In any case the issue (labels at the end of a compound statement)
applied also to case labels and to 'default:'.
If a programmer wants a case label or default label at the end of a
switch, doing nothing, then (prior to C23, and excluding compiler
extensions) they have to put in an empty statement - "default : ;". (Or
they could use "break;" to make code feel more symmetrical.) I can
imagine that happening, and I can imagine that is the most likely
situation where you would want a label right before an end brace. This
seems to be the main motivation for the change in C23.
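As an illustration of the pre-C23 rule described here, the following sketch (classify and sum_skip_negative are invented names) shows the empty statement needed after a label that sits right before a closing brace. Under C23 the lone ';' may be dropped; with older standards it is required unless a compiler extension is in play.

```c
/* A 'default:' that does nothing still needs a statement before C23. */
int classify(int n)
{
    switch (n) {
    case 0:
        return 0;
    case 1:
        return 10;
    default:
        ;               /* empty statement: "default: }" is invalid pre-C23 */
    }
    return -1;
}

/* The same applies to an ordinary label at the end of a loop body. */
int sum_skip_negative(const int *v, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] < 0)
            goto next;
        sum += v[i];
    next: ;             /* the ';' keeps this valid in C99/C11/C17 */
    }
    return sum;
}
```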
It is much harder to imagine that this "bites" anyone, or even annoys
people in any significant way.
And yet, the link at the top was about somebody proposing to fix this
very thing. And it was accepted.
There was no reason whatsoever to ban:
L:}
L:int x;
other than the grammar happening to define labels in a certain way. That
there are workarounds is not the point.
Now you're going to say the vast majority of C programmers have never
used 'switch' either! 'Probably...'
I think you might need to switch out your crystal ball - your
predictions about what people will say or do have rarely come close to
reality. Or, perhaps, you should stop saying such stupid things.
Scott Lurndal said something stupid and I responded with sarcasm. Yet
you attack me and not him. Biased at all?
This is what I responded to in his post:
* An assertion that the 'vast majority' of C programmers have never used
'goto'.
In article <Xj%PR.313884$yHZ7.10738@fx10.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
This is the same difference between:
a = cond ? x : y;
and:
if (cond) {
a = x;
} else {
a = y;
}
If ?: is considered a convenient, clearer short-cut for the latter,
First, it's not, for reasons that have been explained to you but
that you are ignoring because you don't like them.
But beyond that, I don't think that's what it is "considered" to
be, at all. `?:` can be used in an expression; `if` cannot.
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
I wouldn't necessarily categorize it as "sparingly"; Searching
through the simulator source base, I find it used rather
frequently in certain scenarios:
diag = snprintf(cp, remaining, "Feature1:%c Feature2:%c", is_feature1()?'+':'-',
is_feature2()?'+':'-');
deliver_interrupt(IntVec_TIMER, is_secure()?FLAGS_SECURE:0u);
Those are the kinds of scenarios where it can be useful; what
percentage of code overall uses that versus `if`, though?
- Dan C.
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <Xj%PR.313884$yHZ7.10738@fx10.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
This is the same difference between:
a = cond ? x : y;
and:
if (cond) {
a = x;
} else {
a = y;
}
If ?: is considered a convenient, clearer short-cut for the latter,
First, it's not, for reasons that have been explained to you but
that you are ignoring because you don't like them.
But beyond that, I don't think that's what it is "considered" to
be, at all. `?:` can be used in an expression; `if` cannot.
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
I wouldn't necessarily categorize it as "sparingly"; Searching
through the simulator source base, I find it used rather
frequently in certain scenarios:
diag = snprintf(cp, remaining, "Feature1:%c Feature2:%c", is_feature1()?'+':'-',
is_feature2()?'+':'-');
deliver_interrupt(IntVec_TIMER, is_secure()?FLAGS_SECURE:0u);
Those are the kinds of scenarios where it can be useful; what
percentage of code overall uses that versus `if`, though?
Other than the cases above, 'if' dominates by orders of magnitude.
On 2026-05-22 02:24, Keith Thompson wrote:...
...I suggest using the term "name space" rather than "namespace",
which happens to be the name of a C++ feature.
The term "namespace" is, as I observe, a regular English word
(it's in my dictionary, at least).
("A namespace in C++ is denoted by the keyword 'namespace'.")
Bart <bc@freeuk.com> writes:
This is what I responded to in his post:
* An assertion that the 'vast majority' of C programmers have never used
'goto'.
Think about it. You cite 7 projects that have goto's in them.
Google estimates there are 11 to 13 million C programmers.
[snip for brevity]
gcc by default compiles GNU C, not ISO C. It also fails to emit
many language-required diagnostics. GNU C is ISO C with gcc
extensions. Prior to C23, allowing "L:}" is a documented gcc
extension.
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote: [...]
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
[...]
(I can do it with zero: (cond | L1 | L2); the 'goto' can be implied.)
Yikes.
[...]
On 22/05/2026 15:57, Dan Cross wrote:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:[...]
(I can do it with zero: (cond | L1 | L2); the 'goto' can be implied.)
Yikes.
Blame Algol68, that's where implied goto comes from.
In article <10upbvq$1dhq4$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
[...]
In my syntax, it can do this:
goto (c | L1, L2, L3 | Lx)
It's sweeter /because/ label names live in the normal name space. I'm
sure A68 can do this too, but I couldn't get it to work.
[...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
gcc by default compiles GNU C, not ISO C. It also fails to emit
many language-required diagnostics. GNU C is ISO C with gcc
extensions. Prior to C23, allowing "L:}" is a documented gcc
extension.
I seem to remember that sometime in the past gcc behaved in this
way. And I see that current documentation on the gcc website says
something to that effect. Trying it just now, however, I couldn't
get gcc to accept "L:}" under any circumstances, no matter which
combination of options was used. The unavailability doesn't bother
me; I just thought the observation was interesting and worth
reporting.
I often declare local variables within a compound statement
introduced by a do/while/for/case construct.
In pre or post C99 code, I do not create compound statements
willy-nilly just to hide a prior defined local variable with the
same name (a maintenance nightmare, to be sure) or to locate the
declaration closer to the use in a function.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
gcc by default compiles GNU C, not ISO C. It also fails to emit
many language-required diagnostics. GNU C is ISO C with gcc
extensions. Prior to C23, allowing "L:}" is a documented gcc
extension.
I seem to remember that sometime in the past gcc behaved in this
way. And I see that current documentation on the gcc website says
something to that effect. Trying it just now, however, I couldn't
get gcc to accept "L:}" under any circumstances, no matter which
combination of options was used. The unavailability doesn't bother
me; I just thought the observation was interesting and worth
reporting.
Apparently this extension was introduced in gcc 11, released in 2021.
It also allows labels on declarations.
[...]
and I asked why this works:
fred(9999);
PROC fred = (INT n)VOID: print((n, newline));
print("end")
I speculated it was the same bug. But it still works with 3.12.2. So in
this case, it is possible to call 'fred' even though it has not yet been
defined.
[...]
Apparently this extension was introduced in gcc 11, released in 2021.
It also allows labels on declarations.
On 22/05/2026 16:47, David Brown wrote:
On 22/05/2026 17:16, Bart wrote:
On 22/05/2026 15:04, Scott Lurndal wrote:
Bart <bc@freeuk.com> writes:
On 22/05/2026 01:24, Keith Thompson wrote:
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2496.pdf>
This is very interesting. So perhaps tens of millions of C programmers
encountered the quirk, found it annoying, and moved on.
The vast majority of C programmers have likely never used goto.
Source? Oh, 'likely', so it is just a random guess.
In any case the issue (labels at the end of a compound statement)
applied also to case labels and to 'default:'.
If a programmer wants a case label or default label at the end of a
switch, doing nothing, then (prior to C23, and excluding compiler
extensions) they have to put in an empty statement - "default : ;".
(Or they could use "break;" to make code feel more symmetrical.) I
can imagine that happening, and I can imagine that is the most likely
situation where you would want a label right before an end brace.
This seems to be the main motivation for the change in C23.
It is much harder to imagine that this "bites" anyone, or even annoys
people in any significant way.
And yet, the link at the top was about somebody proposing to fix this
very thing. And it was accepted.
There was no reason whatsoever to ban:
L:}
L:int x;
other than the grammar happening to define labels in a certain way. That there are workarounds is not the point.
Now you're going to say the vast majority of C programmers have never
used 'switch' either! 'Probably...'
I think you might need to switch out your crystal ball - your
predictions about what people will say or do have rarely come close to
reality. Or, perhaps, you should stop saying such stupid things.
Scott Lurndal said something stupid and I responded with sarcasm. Yet
you attack me and not him. Biased at all?
On 2026-05-22 18:53, Bart wrote:
On 22/05/2026 15:57, Dan Cross wrote:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
[...]
(I can do it with zero: (cond | L1 | L2); the 'goto' can be implied.)
Yikes.
Blame Algol68, that's where implied goto comes from.
No; since you said "I can do it ..." it was obviously _your decision_
to borrow it for "your language(s)". So only *you* are to be accounted
for *your* language implementation. - Stand by it, coward!
You seem to like proudly pointing out how your languages work, and if
there's criticism you blame the original source where the ideas stem
from. - Disgusting moves; yikes!
On 22/05/2026 18:35, Bart wrote:
And yet, the link at the top was about somebody proposing to fix this
very thing. And it was accepted.
We all know that "L:;}" is used sometimes in code (typically with case
or default labels). One day, someone who happened to use that
construction regularly, and who was already a highly respected C
developer and already worked with the C standards committee,
There was no reason whatsoever to ban:
L:}
L:int x;
other than the grammar happening to define labels in a certain way.
That there are workarounds is not the point.
For someone who is so proud of having "designed" several languages, you
are remarkably ignorant about how languages are designed. Most of the
time, people decide what they want to /allow/, not what they want to
ban. And most of the time, the aim is to keep things as simple as
possible (but no simpler) to be able to do what you want to do. For C,
up until C23, one form of "statement" is "labelled statement" that
looks like "label : statement". That's simple and easy to write in the
standards, simple and easy to use in practice, and simple and easy to
support in implementations. While I am not privy to the thoughts of
either Ritchie or early C implementers and influencers, I cannot
imagine "L:}" was /banned/ - it was simply not supported in the grammar
of the language, probably because that would have been an unnecessary
complication in the descriptions.
For someone who is so proud of having "designed" several languages,
On 2026-05-23 08:17, Keith Thompson wrote:
[...]
Apparently this extension was introduced in gcc 11, released in 2021.
It also allows labels on declarations.
Also in plain (i.e. uninitialized) declarations?
Is there some application case where that's useful?
(Or just making the formulation of a rule simpler?)
On 23/05/2026 10:46, David Brown wrote:
On 22/05/2026 18:35, Bart wrote:
And yet, the link at the top was about somebody proposing to fix this
very thing. And it was accepted.
We all know that "L:;}" is used sometimes in code (typically with case
or default labels). One day, someone who happened to use that
construction regularly, and who was already a highly respected C
developer and already worked with the C standards committee,
Oh, right. So there would have been no point in my doing it. But Keith
Thompson had a go at me for not doing that anyway:
"Somebody (notably not you) took the time to write a proposal and submit
it to the committee, which accepted it."
There was no reason whatsoever to ban:
L:}
L:int x;
other than the grammar happening to define labels in a certain way.
That there are workarounds is not the point.
For someone who is so proud of having "designed" several languages,
you are remarkably ignorant about how languages are designed. Most of
the time, people decide what they want to /allow/, not what they want
to ban. And most of the time, the aim is to keep things as simple as
possible (but no simpler) to be able to do what you want to do. For
C, up until C23, one form of "statement" is "labelled statement" that
looks like "label : statement". That's simple and easy to write in
the standards, simple and easy to use in practice, and simple and easy
to support in implementations. While I am not privy to the thoughts
of either Ritchie or early C implementers and influencers, I cannot
imagine "L:}" was /banned/ - it was simply not supported in the
grammar of the language, probably because that would have been an
unnecessary complication in the descriptions.
So, you're picking up on the word "ban". How would you have worded it?
... are irrelevant.
For someone who is so proud of having "designed" several languages,
All of which ...
On 23/05/2026 12:31, Bart wrote:
On 23/05/2026 10:46, David Brown wrote:
On 22/05/2026 18:35, Bart wrote:
And yet, the link at the top was about somebody proposing to fix
this very thing. And it was accepted.
We all know that "L:;}" is used sometimes in code (typically with
case or default labels). One day, someone who happened to use that
construction regularly, and who was already a highly respected C
developer and already worked with the C standards committee,
Oh, right. So there would have been no point in my doing it. But Keith
Thompson had a go at me for not doing that anyway:
Do you think Keith, me, ....
"Somebody (notably not you) took the time to write a proposal and
submit it to the committee, which accepted it."
So, you're picking up on the word "ban". How would you have worded it?
If you "ban" something, you actively and explicitly choose not to allow it.
... are irrelevant.
For someone who is so proud of having "designed" several languages,
All of which ...
On 23/05/2026 12:51, David Brown wrote:
On 23/05/2026 12:31, Bart wrote:
On 23/05/2026 10:46, David Brown wrote:
On 22/05/2026 18:35, Bart wrote:
And yet, the link at the top was about somebody proposing to fix
this very thing. And it was accepted.
We all know that "L:;}" is used sometimes in code (typically with
case or default labels). One day, someone who happened to use that
construction regularly, and who was already a highly respected C
developer and already worked with the C standards committee,
Oh, right. So there would have been no point in my doing it. But
Keith Thompson had a go at me for not doing that anyway:
Do you think Keith, me, ....
Irrelevant. That was clearly a snide comment aimed at me.
KT:
"Somebody (notably not you) took the time to write a proposal and
submit it to the committee, which accepted it."
So, you're picking up on the word "ban". How would you have worded it?
If you "ban" something, you actively and explicitly choose not to
allow it.
So you don't like the word. Maybe it wasn't intentional at the start,
and maybe neither was doing nothing about it for decades
(unintentionally of course), but the end result was the same.
... are irrelevant.
For someone who is so proud of having "designed" several languages,
All of which ...
My choices of label placement are irrelevant to the choices made in C in
a paragraph about language design? OK.
But you managed to squeeze in another snarky comment anyway.
ALWAYS with the personal attacks in this group.
Besides, they talk on topic (though I would say only on part of it, as
they say the most interesting part of topics too)
On 23/05/2026 10:46, David Brown wrote:
For someone who is so proud of having "designed" several languages,
All of which allow labels at the end of a block or before declarations,
when the latter could be mixed within executable code.
Actually, I probably allow labels in too many places, including in the middle of expressions, like here (expressed in A68G syntax):
INT a, b:=2, c:=3, d:=4;
a := IF b=c THEN fred: c ELSE d FI;
GOTO fred;
Declaring the label is not an error with A68G, but trying to jump into
the expression doesn't work; presumably the scope of 'fred' is limited
to the block so it just can't see it.
Jumping out of the expression via GOTO is allowed however.
My languages allow both. Jumping into the middle of a block is sometimes done and can be safe. Into the middle of an expression is less so, so
should ideally be detected and blocked, probably in a later compiler pass.
On 22/05/2026 18:32, Dan Cross wrote:
In article <10upkur$1fkbh$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
[snip]
Second, how was it compiled? What optimization settings? What
compiler? Did you test different compilers to see if they did
things differently? You've made a broad general assertion about
the code generated in response to a `switch` statement; my own
experience is that that varies widely based on compiler, target
architecture, and optimization settings.
It will depend on the task and spread of workload. If the program being
run is spending a lot of time in libraries, then bytecode dispatch is
less of the overall overhead.
Finally, your assertion was that the version using your multiple
dispatch is faster due to branch prediction: you've shown that
this is faster, but you haven't shown _why_, let alone that it
is due to branch prediction. Since you're running this on
x86_64, did you try to look at the machine's perf counters to
see what's actually happening?
It's a well-known technique. It was used on CPython (as compiled for
Linux, since on Windows it doesn't use gcc), and may still be. Although
there are some experiments to move towards complex threaded code via TCO
instead.
In any case, it seems to work, so who cares why?
Switching between the two means changing 'doswitch' to 'doswitchu'.
In C switching would require a rewrite, unless you write it in a
contrived manner. But then it still needs that bunch of macros shown
below, and it needs that table of label pointers to be manually maintained.
[snip]
I'm not sure what you mean here. You've given examples of
timings using computed gotos and `switch` in C; it's unclear
what would "require a rewrite" since you appear to already have
both versions.
You know what normal switch looks like:
while (!stopped) {
switch (opcode) {
case ADD:
...
Computed goto would first need a jumptable:
void* jumptable[] = {&&ADDLAB, ....};
Then dispatch at every point:
goto *jumptable[opcode];
ADDLAB:
....
goto *jumptable[opcode];
It is quite different. In Tinypas, it wraps it up in macros so that the
same body of the 'switch' can be used for both choices.
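To show the structural difference in one place, here is a self-contained sketch of a toy accumulator machine dispatched both ways. It is not the Tinypas code: the opcodes, function names and DISPATCH macro are all invented for the example, and run_goto depends on the GNU C "labels as values" extension, so it is guarded by __GNUC__. Neither loop validates much; run_goto in particular assumes every opcode is in range.

```c
enum { OP_INC, OP_DBL, OP_HALT };   /* hypothetical opcodes */

/* Portable version: a single dispatch point at the top of the loop. */
long run_switch(const unsigned char *pc)
{
    long acc = 0;
    for (;;) {
        switch (*pc++) {
        case OP_INC:  acc += 1; break;
        case OP_DBL:  acc *= 2; break;
        case OP_HALT: return acc;
        default:      return -1;    /* unknown opcode */
        }
    }
}

#if defined(__GNUC__)
/* GNU C version: every handler ends with its own indirect jump. */
long run_goto(const unsigned char *pc)
{
    /* Order must match the enum above; this table is maintained by hand. */
    static void *const jumptable[] = { &&do_inc, &&do_dbl, &&do_halt };
    long acc = 0;

#define DISPATCH() goto *jumptable[*pc++]
    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_dbl:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
#endif
```

For the program { OP_INC, OP_DBL, OP_INC, OP_HALT } both functions return 3. Whether the second form is actually faster depends on compiler, target and workload, which is exactly what is being argued about here, so the sketch makes no performance claim.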
In article <10uqamg$1nrsr$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
On 22/05/2026 18:32, Dan Cross wrote:
In article <10upkur$1fkbh$2@dont-email.me>, Bart <bc@freeuk.com> wrote:
[snip]
Well, Bart, you'll be pleased to know that I was able to
replicate your results.
Here is a post from a person who noted a much _smaller_
performance difference between a switch-based loop and computed
gotos in the cpython interpreter, after seeing a regression due
to clang-19: https://blog.nelhage.com/post/cpython-tail-call/
This asserts that modern compilers are capable of generating
code for a `switch` that is essentially the same as that
manually generated with computed gotos. I don't know why this
doesn't seem to be the case with your code.
I still don't understand what you mean by "rewrite": it appeared
you were referring to work that had yet to be done, but the work
was done.
Back in those days, languages that needed multipass compilers (e.g.
Algol 68) were considered complicated and expensive to implement.
That’s why C went for a single-pass language design, like Pascal. And
like Pascal, it has forward declarations to mitigate this somewhat.
You need some kind of use-before-define facility in any realistic
language, if you want to allow recursion, and in particular mutual
recursion.
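A minimal sketch of the forward-declaration point, using the usual is_even/is_odd pair (chosen for illustration only). Without the declaration on the first line, is_even would call a function the compiler has not yet seen.

```c
#include <stdbool.h>

static bool is_odd(unsigned n);     /* forward declaration */

static bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

static bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}
```

The same mechanism, put in a header, is what lets definitions live in separate source files in any order.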
It’s amusing to think that C++, that behemoth that, in terms of sheer complexity, leaves old-style monsters like Algol 68 or PL/I in the
dust, is still essentially a single-pass language design.
On 2026-05-22 16:57, Dan Cross wrote:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
[...]
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
I haven't got that impression.
Some piece of software (it's just a random example, to be clear)
that has
130 C-files
250249 total LOC
shows
3864 lines with ?: appearing
I think this is a lot!
(I haven't inspected their application-types or done any ad hoc
classification; these are just [huge] raw numbers. If interested
in details the sample grep-output is currently available through
http://volatile.gridbug.de/ternaries.txt .)
[snip]
If it's the "'goto' can be implied" you should know that in Algol 68
it's equivalent to say either of
GOTO stop
GO TO stop
stop
These are semantically equivalent forms; you don't need that indecent
'goto' keyword. IF n = 0 THEN stop FI or ( n = 0 | stop ) are
both fairly clear forms (without an explicit GOTO).
Bart <bc@freeuk.com> writes:
On 21/05/2026 07:45, David Brown wrote:
[...]
So do you have an example of where code has been written to take
advantage of the separate name spaces?
No. This is why I claimed it was pointless.
This entire discussion is pointless. [...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Bart <bc@freeuk.com> writes:
On 21/05/2026 07:45, David Brown wrote:
[...]
So do you have an example of where code has been written to take
advantage of the separate name spaces?
No. This is why I claimed it was pointless.
This entire discussion is pointless. [...]
I was hoping you would stop there. Your point would have
been made much more effectively.
On 23/05/2026 12:51, David Brown wrote:[...]
Do you think Keith, me, ....
Irrelevant. That was clearly a snide comment aimed at me.
KT:
"Somebody (notably not you) took the time to write a proposal and
submit it to the committee, which accepted it."
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; [...]
Breaking out of nested loops without a lot of unnecessary
ceremony is sort of an obvious example; [...]
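For readers who want the nested-loop case spelled out, a small sketch follows (find_in_grid and its parameters are made up for the example). A single goto leaves both loops at once; the goto-less version would need a flag tested in each loop condition, or the search moved into a helper that returns.

```c
#include <stddef.h>

/* Searches a rows x cols grid stored row-major.
   Returns 1 and stores the position if target is found, else 0. */
int find_in_grid(const int *grid, size_t rows, size_t cols,
                 int target, size_t *out_r, size_t *out_c)
{
    size_t r, c;
    int found = 0;

    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (grid[r * cols + c] == target) {
                found = 1;
                goto done;          /* leaves both loops in one step */
            }
        }
    }
done:
    if (found) {
        *out_r = r;
        *out_c = c;
    }
    return found;
}
```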
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; [...]
Yes.
Breaking out of nested loops without a lot of unnecessary
ceremony is sort of an obvious example; [...]
Another scenario where using goto can be attractive is when
so-called "loop and a half" processing is needed. Something
like this
goto ENTRY;
do {
....
ENTRY:
....
} while( ..condition.. );
is in many cases more attractive than goto-less alternatives.
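A runnable instance of that pattern, as a sketch only: join is a hypothetical helper that concatenates strings with ", " between them. It assumes n is at least 1 and that buf is large enough, and does no checking of either.

```c
#include <string.h>

/* Joins items[0..n-1] into buf, separated by ", ".
   Assumes n >= 1 and that buf has room for the result. */
void join(char *buf, const char *const *items, size_t n)
{
    size_t i = 0;

    buf[0] = '\0';
    goto ENTRY;                     /* first pass skips the separator */
    do {
        strcat(buf, ", ");
ENTRY:
        strcat(buf, items[i++]);
    } while (i < n);
}
```

With items "a", "b", "c" the buffer ends up holding "a, b, c". The separator code runs n-1 times and the item code n times, which is the "half" iteration the name refers to.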
In article <10ure81$1s756$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2026-05-22 16:57, Dan Cross wrote:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
[...]
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
I haven't got that impression.
Some piece of software (it's just a random example, to be clear)
that has
130 C-files
250249 total LOC
shows
3864 lines with ?: appearing
I think this is a lot!
That's about 1.5% of source lines. I'd say that's pretty
sparingly used.
(I haven't inspected their application-types or done any ad hoc
classification; these are just [huge] raw numbers. If interested
in details the sample grep-output is currently available through
http://volatile.gridbug.de/ternaries.txt .)
I'm not terribly interested. I never said it wasn't used; just used
sparingly. That's not a precise unit of measure, unfortunately;
perhaps a better way to put it would have been that, compared to
`if` or even `if (foo) bar = 1; else bar = 2;` the ternary
operator is used much less frequently.
[snip]
If it's the "'goto' can be implied" you should know that in Algol 68
it's equivalent to say either of
GOTO stop
GO TO stop
stop
These are semantically equivalent forms; you don't need that indecent
'goto' keyword. IF n = 0 THEN stop FI or ( n = 0 | stop ) are
both fairly clear forms (without an explicit GOTO).
Careful or you're going to make me say "yikes" again. I think
these are awful design decisions. Wirth had strong opinions
about Algol 68; I can see why.
- Dan C.
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <10ujm3r$3pnbb$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip] I have almost never had need of "goto" or labels (excluding
switch case labels, of course), and don't expect ever to do so in the
future.
While I generally try to avoid it, there are times (in C in
particular) when it really is the right tool; [...]
Yes.
Breaking out of nested loops without a lot of unnecessary
ceremony is sort of an obvious example; [...]
Another scenario where using goto can be attractive is when
so-called "loop and a half" processing is needed. Something
like this
goto ENTRY;
do {
....
ENTRY:
....
} while( ..condition.. );
is in many cases more attractive than goto-less alternatives.
left = q = malloc( sizeof *q );goto A5;
right = q = malloc( sizeof *q );goto A5;
b = a;return;
b = 0;return;
right = p->left, p->left = r;} else {
left = p->right, p->right = s;
left = p->right, p->right = r;}
right = p->left, p->left = s;
In article <10ure81$1s756$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2026-05-22 16:57, Dan Cross wrote:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com> wrote:
[...]
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
I haven't got that impression.
Some piece of software (it's just a random example, to be clear)
that has
130 C-files
250249 total LOC
shows
3864 lines with ?: appearing
I think this is a lot!
That's about 1.5% of source lines. I'd say that's pretty
sparingly used.
On 24/05/2026 03:48, Dan Cross wrote:
In article <10ure81$1s756$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2026-05-22 16:57, Dan Cross wrote:
In article <10upivp$1fkbh$1@dont-email.me>, Bart <bc@freeuk.com>
wrote:
[...]
In fact, I find the ternary operator to be used rather
sparingly, even in the kind of code you posted above.
I haven't got that impression.
Some piece of software (it's just a random example, to be clear)
that has
130 C-files
250249 total LOC
shows
3864 lines with ?: appearing
I think this is a lot!
That's about 1.5% of source lines. I'd say that's pretty
sparingly used.
I think if you take just about any feature, you will find it is only
used in a small percentage of lines.
But don't just take my opinion, I put it to the test.
I took sqlite3.c sources from 2018 (sqlite3.c + shell.c + header, which
are combined into a 235Kloc source file).
This was preprocessed down to an 84Kloc file, which would increase the
percentages and work against my argument. Even so, these were the figures
I got:
goto      0.75 %   (no. of instances as percentage of line count)
for       1.00
while     0.36
funcs     2.31     (no. of function definitions)
+         2.38
- minus   1.66
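For anyone wanting to repeat this kind of measurement, here is a rough sketch of one way to compute "instances as a percentage of line count". It is an assumption about the method, not necessarily how the figures above were produced. The function `keyword_percent` is hypothetical, and it makes no attempt to skip comments or string literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static int is_ident_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Occurrences of kw as a whole identifier, per 100 lines of text. */
static double keyword_percent(const char *text, const char *kw)
{
    size_t klen = strlen(kw), hits = 0, lines = 0;
    const char *p;

    assert(klen > 0);

    for (p = text; *p != '\0'; p++) {
        if (*p == '\n')
            lines++;
        if (strncmp(p, kw, klen) == 0 &&
            (p == text || !is_ident_char(p[-1])) &&
            !is_ident_char(p[klen]))
            hits++;                 /* not part of a longer identifier */
    }
    if (p != text && p[-1] != '\n')
        lines++;                    /* final line without a newline */

    return lines ? 100.0 * (double)hits / (double)lines : 0.0;
}
```

Counting operators such as `+` or `?:` would need a real tokenizer, since a plain text scan cannot tell a unary minus from a binary one.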
On 22/05/2026 16:44, Dan Cross wrote:
In article <10un0j7$obiv$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip]
There is definitely potential for a language's type system to make it
harder to make some kinds of mistakes. But there is a risk in making
the language too restrictive - people end up writing horrible code to
work around restrictions, or use "unsafe" code too much.
Interesting. I've found it to be somewhat the opposite; using a
richer type system has led to code that is easier to understand
and reason about, and less buggy: type-oriented programming can
make entire categories of errors *unrepresentable*, so it's not
just _harder_ to make certain kinds of mistakes, but
_impossible_. The existence of an object of some type can be
thought of as an existence proof that the invariants the type
represents hold.
In general, I agree - I prefer strongly typed languages, and it's always
best if an error is identified by the code being uncompilable rather
than waiting for run-time testing (or testing by evil hackers). But if
people want to do something with the language that is hard to achieve
efficiently within the norms of the language, and there are escape
hatches ("unsafe" code, inline assembly, calling external C functions,
etc.) then people will use them.
People do sometimes write C code that
knowingly depends on undefined behaviour, because it makes their results
more efficient and "it worked when I tested it".
Type safety - like any kind of safety - can sometimes get in the way of
"getting things done". A balance always needs to be found to ensure
that people can do what they need to do without breaking rules or using
"escapes" more than absolutely necessary. If people find that much of
their Rust code is "unsafe", then most of the point of using Rust is
lost. (Of course this also means that different languages are suited to
different tasks.)
And Alexis King wrote the great, "Parse, Don't Validate" essay
some years ago where she talks about "Type-Driven Development":
https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
More recently, Yaron Minsky gave a talk and discussed this in an
OCaml context (https://www.youtube.com/watch?v=rUYP4C29yCw; the
most relevant bits start at about 6:35. The talk is about AI
and constraining agents, but the discussion around types is more
general).
I wrote the production OS loader for Oxide compute sleds using
this technique in the virtual memory system (and other places):
the loader uses multiple page sizes, and the rule is that, when
mapping a region of memory, it uses the largest page size that
it can, given size and alignment constraints. But I use the
type system to make it impossible to, say, map a 2MiB "large"
physical page frame to a non-2MiB aligned virtual boundary.
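Since this is c.l.c, here is a rough C approximation of the idea, as a sketch only. The loader described above is not written this way; the names `va2m`, `pa2m`, `va2m_parse`, `pa2m_parse` and `map_2m` are all invented for illustration. The alignment check happens once, when the wrapper value is created, and the mapping routine accepts only the wrapper types. Unlike in Rust, nothing stops C code from filling in the struct by hand, so this is a convention rather than a guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIZE_2M ((uintptr_t)1 << 21)    /* 2 MiB */

/* Addresses that have already been checked for 2 MiB alignment. */
struct va2m { uintptr_t addr; };        /* virtual address */
struct pa2m { uintptr_t addr; };        /* physical frame address */

static bool va2m_parse(uintptr_t raw, struct va2m *out)
{
    if (raw % SIZE_2M != 0)
        return false;                   /* *out is left untouched */
    out->addr = raw;
    return true;
}

static bool pa2m_parse(uintptr_t raw, struct pa2m *out)
{
    if (raw % SIZE_2M != 0)
        return false;
    out->addr = raw;
    return true;
}

struct mapping2m { struct va2m va; struct pa2m pa; };

/* Stand-in for a real mapping routine: it only records the pair.
   Passing a raw uintptr_t here is a compile-time type error. */
static struct mapping2m map_2m(struct va2m va, struct pa2m pa)
{
    struct mapping2m m = { va, pa };
    assert(va.addr % SIZE_2M == 0 && pa.addr % SIZE_2M == 0);
    return m;
}
```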
I did some work, eons ago, in a language called XC that was specifically
for XMOS microcontrollers. The language and tools had a feature that
made data races impossible by not allowing competing access to shared
variables from different threads. Since threads were part of the
hardware, and the tools analysed the code flow through threads, this
could all be enforced at build time - data had to be passed in messages,
not shared memory. But for some things that involved large buffers,
that was hopelessly inefficient - and these devices were regularly used
with USB, audio, and similar things that needed large and predictable
buffers. So code - even library and example code from the manufacturer
- was full of inline assembly to work around the "smart-arse" language
and tools.
Hmm. This reads less like an indictment of the idea of stronger
typing, but rather a failure to provide adequate abstractions in
the type system.
It is not an indication of problems with stronger typing, but it is an
indication that the restrictions inherent in the language and tools made
them somewhat unsuitable for a lot of use-cases that fitted the hardware
well. (Modern XMOS tools have changed significantly since those days,
perhaps partly because of such issues.)
I'm going to mention Rust again; apologies. When confined to
the safe subset, it _also_ has data race freedom. The rules
that give this property are:
1. Every object has exactly one owner, and assignments of
non-trivial types change ownership (they are logically a
"move");
2. References to an object may be "borrowed" from the owner,
and mutable references (that is, references that may be used
to write to the object) are distinct from immutable
references (that is, references that may only be used to read
from an object);
3. Mutable and immutable references are temporally mutually
exclusive: a mutable reference may be borrowed from an object
iff that is the only live reference to that object at the
time: that is, it is not permitted to borrow a mut ref to an
object if either another mut ref or any immutable refs to it
are live; any number of immutable references may be taken to
an object concurrently.
If all of these rules are obeyed (and in safe rust, they're
verified at compile time by the borrow checker) then you cannot
have data races.
That's all good, by the sound of it. But if people find this gets in
the way of efficiency - perhaps because they know that it is safe to
hold two separate mutable references to an object for reasons the borrow
checker can't see - the temptation to use "unsafe" or other workarounds
comes quickly.
That does not mean that a language should not try to have such rules,
and enforce them at compile time - far from it. The aim should be that
as much code as possible is correct, or at least immune to particular
classes of bugs, checked by the compiler and tools.
At first glance it appears that it must suffer from the same
drawbacks as `XC`, which you mentioned above. Except that the
language does provide controlled ways to share data
concurrently.
Using the _unsafe_ subset, there is one place where it is
permitted to have multiple, mutable references to an object: the
`UnsafeCell`. This gives Rust interior mutability, which means
that you can build safe abstractions for data sharing, like
`Mutex` types that own the data they protect.
If "unsafe" gives enough, but is rarely needed in most code, then it
sounds like a good balance.
On 5/24/26 13:30, Bart wrote:
I took sqlite3.c sources from 2018 (sqlite3.c + shell.c + header,
which are combined into a 235Kloc source file).
I don't understand your obsession with “combining” different
sources into a single file.
I'm no C expert, but I've been
coding for 40 years, so let's take an example (simplified here)
from my real-life experience as a programmer:
/* here the various classic includes */
#include <stdio.h>
static unsigned count;
void count_increment(void)
{
    count++;
}
void count_display(void)
{
    printf("count is %u\n", count); /* %u matches unsigned; %ld did not */
}
/* just a little compil unit :) */
If you "combine" that _independent_ code into a huge
single file, and more than one compilation
unit uses a static variable named count, two things
can happen:
a) your friend the compiler is barfing.
b) nasal daemons are not accurate.
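A sketch of what can happen when two units that each declare `static unsigned count;` are pasted into one file. In C, repeated file-scope declarations without an initializer are tentative definitions of one and the same object, so the result compiles quietly and the two former units end up sharing a counter. The section comments and function names below are hypothetical. If the types differed, or if both declarations had initializers, the compiler would have to issue a diagnostic instead.

```c
/* --- contents of a hypothetical first unit --- */
static unsigned count;              /* tentative definition */

static void a_increment(void)
{
    count++;
}

/* --- contents of a hypothetical second unit, pasted below --- */
static unsigned count;              /* same object as above, not a new one */

static void b_increment(void)
{
    count++;
}

static unsigned b_value(void)
{
    return count;                   /* also sees increments made by a_increment */
}
```

After one call to each increment function, `b_value()` returns 2, although the second unit only ever incremented "its" counter once. Amalgamations therefore need internal names to be unique across all the files being merged.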
On 24/05/2026 21:09, tTh wrote:
On 5/24/26 13:30, Bart wrote:
I took sqlite3.c sources from 2018 (sqlite3.c + shell.c + header,
which are combined into a 235Kloc source file).
I don't understand your obsession with “combining” different
sources into a single file.
It's not me doing it. sqlite3.c is an amalgamation of over 100 C files
created by the SQLite developers. It's done to simplify embedding within
your own applications. That file itself was already 220Kloc.
On 5/24/26 13:30, Bart wrote:
I took sqlite3.c sources from 2018 (sqlite3.c + shell.c + header,
which are combined into a 235Kloc source file).
I don't understand your obsession with “combining” different
sources into a single file. I'm no C expert, but I've been
coding for 40 years, so let's take an example (simplified here)
from my real-life experience as a programmer:
/* here the various classic includes */
#include <stdio.h>
static unsigned count;
void count_increment(void)
{
    count++;
}
void count_display(void)
{
    printf("count is %u\n", count); /* %u matches unsigned; %ld did not */
}
/* just a little compil unit :) */
If you "combine" that _independent_ code into a huge
single file, and more than one compilation
unit uses a static variable named count, two things
can happen:
a) your friend the compiler is barfing.
b) nasal daemons are not accurate.
Choose your toxicity. I'm very curious...
In article <10uprpn$1hfpm$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 22/05/2026 16:44, Dan Cross wrote:
In article <10un0j7$obiv$2@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip]
There is definitely potential for a language's type system to make it
harder to make some kinds of mistakes. But there is a risk in making
the language too restrictive - people end up writing horrible code to
work around restrictions, or use "unsafe" code too much.
Interesting. I've found it to be somewhat the opposite; using a
richer type system has led to code that is easier to understand
and reason about, and less buggy: type-oriented programming can
make entire categories of errors *unrepresentable*, so it's not
just _harder_ to make certain kinds of mistakes, but
_impossible_. The existence of an object of some type can be
thought of as an existence proof that the invariants the type
represents hold.
In general, I agree - I prefer strongly typed languages, and it's always
best if an error is identified by the code being uncompilable rather
than waiting for run-time testing (or testing by evil hackers). But if
people want to do something with the language that is hard to achieve
efficiently within the norms of the language, and there are escape
hatches ("unsafe" code, inline assembly, calling external C functions,
etc.) then people will use them.
Sure; that seems pretty clear, given extensive available
evidence.
People do sometimes write C code that
knowingly depends on undefined behaviour, because it makes their results
more efficient and "it worked when I tested it".
Yes, or they force some knob on their compiler into the required
position to give them guaranteed results. Linux does this, for
example; last time I worked in it, Google's massive (2BLOC) code
base similarly.
Type safety - like any kind of safety - can sometimes get in the way of
"getting things done". A balance always needs to be found to ensure
that people can do what they need to do without breaking rules or using
"escapes" more than absolutely necessary. If people find that much of
their Rust code is "unsafe", then most of the point of using Rust is
lost. (Of course this also means that different languages are suited to
different tasks.)
It _can_, but I wouldn't take it as a given that it _does_, and
what I'm trying to say is that contrary to often getting in the
way, it can be used effectively to make programs better and
safer, with no runtime downside. Indeed, a well-typed program
can be faster than the alternative, since a) the compiler can
prove that some properties hold, and b) it can be more
aggressively optimized at a higher level. An example here is
non-nullable references; there's no need to check whether they
are NULL or not before indirecting through them, since by
definition they cannot be NULL.
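C has no non-nullable reference type, but the closest standard spelling may be worth a sketch. A parameter declared `p[static 1]` tells the compiler that the caller must pass a pointer to at least one element, so the callee can skip the check. Passing a null pointer is then undefined behaviour rather than a compile-time error, which is exactly the gap a richer type system closes. Both function names are hypothetical.

```c
#include <stddef.h>

/* The caller promises a pointer to at least one int: no NULL check here. */
static int first_element(const int p[static 1])
{
    return p[0];
}

/* With a plain pointer, the callee has to defend itself at run time. */
static int first_or_default(const int *p, int fallback)
{
    return p != NULL ? p[0] : fallback;
}
```

Some compilers warn when a literal null pointer is passed to the first form, but the language does not require them to.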
Hmm. This reads less like an indictment of the idea of stronger
typing, but rather a failure to provide adequate abstractions in
the type system.
It is not an indication of problems with stronger typing, but it is an
indication that the restrictions inherent in the language and tools made
them somewhat unsuitable for a lot of use-cases that fitted the hardware
well. (Modern XMOS tools have changed significantly since those days,
perhaps partly because of such issues.)
Sure, but that's a failure of that system to provide good
abstractions: the system was not rich enough to provide the
functionality you needed or wanted.
I think the central tension here is that the idea of a strong
type system that is also semantically rich is being conflated
with one that's highly restrictive. What I'm suggesting is that
the opposite is true.
I'm going to mention Rust again; apologies. When confined to
the safe subset, it _also_ has data race freedom. The rules
that give this property are:
1. Every object has exactly one owner, and assignments of
non-trivial types change ownership (they are logically a
"move");
2. References to an object may be "borrowed" from the owner,
and mutable references (that is, references that may be used
to write to the object) are distinct from immutable
references (that is, references that may only be used to read
from an object);
3. Mutable and immutable references are temporally mutually
exclusive: a mutable reference may be borrowed from an object
iff that is the only live reference to that object at the
time: that is, it is not permitted to borrow a mut ref to an
object if either another mut ref or any immutable refs to it
are live; any number of immutable references may be taken to
an object concurrently.
If all of these rules are obeyed (and in safe rust, they're
verified at compile time by the borrow checker) then you cannot
have data races.
That's all good, by the sound of it. But if people find this gets in
the way of efficiency - perhaps because they know that it is safe to
hold two separate mutable references to an object for reasons the borrow
checker can't see - the temptation to use "unsafe" or other workarounds
comes quickly.
Not really. The rule is that borrowing a mutable reference to
some object is mutually exclusive with borrowing any other kind
of reference to that same object at the same time. `unsafe`
doesn't change that rule, or let the programmer off the hook
for violating it: `unsafe` code isn't allowed to mix mut ref
with other references any more than _safe_ code. All `unsafe`
means is that the burden of upholding the rules of the language
falls to the programmer, without compiler assistance, because
the programmer knows something that the compiler does not (and
probably cannot) for some reason.
However, if it really _is_ the case that multiple simultaneous
writes are ok, the ability to create new types and assign them
semantics and behavior means that a programmer can build a _new_
abstraction that lets them do something similar. Atomics are
an interesting case in point: one can store to them using a
non-mutable reference: while this seems counter-intuitive at
first, it sort of makes sense for the same reason that a `Mutex`
is accessed via an immutable reference: one cannot observe an
atomic in an intermediate state due to an update, and `load` and
`store` are explicit operations (in practice, the compiler
lowers those to single load/store operations and whatever
barriers are appropriate for the target architecture).
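For comparison, a sketch of the C11 counterpart using `<stdatomic.h>`. C has no distinction between shared and mutable references, so this only illustrates the second half of the point: loads and stores on an atomic object are explicit operations, and no caller can observe a half-updated value. The counter and the function names are hypothetical, and relaxed ordering is chosen here only because the counter guards no other data.

```c
#include <stdatomic.h>

static atomic_uint hits;    /* static storage, so it starts at zero */

static void reset_hits(void)
{
    atomic_store_explicit(&hits, 0u, memory_order_relaxed);
}

/* Safe to call from several threads at once: the add is one atomic step. */
static void record_hit(void)
{
    atomic_fetch_add_explicit(&hits, 1u, memory_order_relaxed);
}

static unsigned hits_so_far(void)
{
    return atomic_load_explicit(&hits, memory_order_relaxed);
}
```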
A hardware device modeling a framebuffer might be another case;
those are output-only devices, and there's no reason that two
threads cannot write to different parts of the buffer at the
same time, but given a suitable interface, the interface doesn't
need to expose unsafety. And if you've got a rich type system,
you have the rules to build that interface.
That does not mean that a language should not try to have such rules,
and enforce them at compile time - far from it. The aim should be that
as much code as possible is correct, or at least immune to particular
classes of bugs, checked by the compiler and tools.
I think you're saying something different than what I'm saying;
my point is that the safe interface shouldn't be seen as a
burden and, if well-executed, it shouldn't add a runtime tax in
terms of reduced performance. You seem to be taking it as a
given that the rules of the language will make both of those
things true, however.
At first glance it appears that it must suffer from the same
drawbacks as `XC`, which you mentioned above. Except that the
language does provide controlled ways to share data
concurrently.
Using the _unsafe_ subset, there is one place where it is
permitted to have multiple, mutable references to an object: the
`UnsafeCell`. This gives Rust interior mutability, which means
that you can build safe abstractions for data sharing, like
`Mutex` types that own the data they protect.
If "unsafe" gives enough, but is rarely needed in most code, then it
sounds like a good balance.
Yes. And moreover, one can hide that unsafety behind a safe
interface.
cross@spitfire.i.gajendra.net (Dan Cross) writes:
In article <Xj%PR.313884$yHZ7.10738@fx10.iad>,
Scott Lurndal <slp53@pacbell.net> wrote:
cross@spitfire.i.gajendra.net (Dan Cross) writes:
[example of conditional operator: a = cond ? x : y; ]
In fact, I find the ternary operator to be used rather
sparingly, even in the [example shown above].
I wouldn't necessarily categorize it as "sparingly"; Searching
through the simulator source base, I find it used rather
frequently in certain scenarios:
diag = snprintf(cp, remaining, "Feature1:%c Feature2:%c",
is_feature1()?'+':'-', is_feature2()?'+':'-');
deliver_interrupt(IntVec_TIMER, is_secure()?FLAGS_SECURE:0u);
Those are the kinds of scenarios where it can be useful; what
percentage of code overall uses that versus `if`, though?
Other than the cases above, 'if' dominates by orders of magnitude.
In article <10ure81$1s756$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 2026-05-22 16:57, Dan Cross wrote:
In fact, I find the ternary operator to be used rather
sparingly, even in the [same example as before].
I haven't got that impression.
Some piece of software (it's just a random example, to be clear)
that has
130 C-files
250249 total LOC
shows
3864 lines with ?: appearing
I think this is a lot!
That's about 1.5% of source lines. I'd say that's pretty
sparingly used.
On 24/05/2026 21:08, Dan Cross wrote:
In article <10uprpn$1hfpm$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip]
People do sometimes write C code that
knowingly depends on undefined behaviour, because it makes their results
more efficient and "it worked when I tested it".
Yes, or they force some knob on their compiler into the required
position to give them guaranteed results. Linux does this, for
example; last time I worked in it, Google's massive (2BLOC) code
base similarly.
What you then get is code that is UB in C, but not UB in
C-with-no-strict-aliasing, or whatever "augmented" language you pick.
(I have a strict policy of putting any such knobs in "GCC optimize"
pragmas, so that the code correctness does not depend on the flags
picked in a makefile. Of course such code remains equally limited in
its portability.)
[snip]
Hmm. This reads less like an indictment of the idea of stronger
typing, but rather a failure to provide adequate abstractions in
the type system.
It is not an indication of problems with stronger typing, but it is an
indication that the restrictions inherent in the language and tools made
them somewhat unsuitable for a lot of use-cases that fitted the hardware
well. (Modern XMOS tools have changed significantly since those days,
perhaps partly because of such issues.)
Sure, but that's a failure of that system to provide good
abstractions: the system was not rich enough to provide the
functionality you needed or wanted.
I think the central tension here is that the idea of a strong
type system that is also semantically rich is being conflated
with one that's highly restrictive. What I'm suggesting is that
the opposite is true.
To be clear - the restrictions here were not part of the type system.
The XC language did have some differences in typing from C, if I
remember the details correctly - rather than C pointers it had
references which could not be null. And that's fine.
The restrictions were enforced by whole-program analysis. I think (and
you are the Rust expert here, not me) that the Rust borrow checker is
similar. The rules of the language specify aspects of what can and
cannot be done with access to data, and these are enforced at a higher
level than the main compilation. They are language rules, but not part
of the type system.
[snip]
Not really. The rule is that borrowing a mutable reference to
some object is mutually exclusive with borrowing any other kind
of reference to that same object at the same time. `unsafe`
doesn't change that rule, or let the programmer off the hook
for violating it: `unsafe` code isn't allowed to mix mut ref
with other references any more than _safe_ code. All `unsafe`
means is that the burden of upholding the rules of the language
falls to the programmer, without compiler assistance, because
the programmer knows something that the compiler does not (and
probably cannot) for some reason.
That is roughly what I was trying to express, although there is perhaps
a grey area between "I am following the rules even though the tools
can't see it" and "I am breaking the rules but I know it is safe to do
so here without compromising the reason for those rules".
In C (trying desperately to bring us back to c.l.c. :-) ), you might use
casts to do something that looks wrong to the compiler, but which you
know is safe and correct.
However, if it really _is_ the case that multiple simultaneous
writes are ok, the ability to create new types and assign them
semantics and behavior means that a programmer can build a _new_
abstraction that lets them do something similar. Atomics are
an interesting case in point: one can store to them using a
non-mutable reference: while this seems counter-intuitive at
first, it sort of makes sense for the same reason that a `Mutex`
is accessed via an immutable reference: one cannot observe an
atomic in an intermediate state due to an update, and `load` and
`store` are explicit operations (in practice, the compiler
lowers those to single load/store operations and whatever
barriers are appropriate for the target architecture).
OK. Though that is probably a level of detail that would be clearer to
me if I read the Rust documentation first!
A hardware device modeling a framebuffer might be another case;
those are output-only devices, and there's no reason that two
threads cannot write to different parts of the buffer at the
same time, but given a suitable interface, the interface doesn't
need to expose unsafety. And if you've got a rich type system,
you have the rules to build that interface.
That is the kind of thing that was difficult in XC. Perhaps it would be
fair to say that its type system was not powerful enough to work
conveniently with its data race and thread safety checking systems.
That does not mean that a language should not try to have such rules,
and enforce them at compile time - far from it. The aim should be that
as much code as possible is correct, or at least immune to particular
classes of bugs, checked by the compiler and tools.
I think you're saying something different than what I'm saying;
my point is that the safe interface shouldn't be seen as a
burden and, if well-executed, it shouldn't add a runtime tax in
terms of reduced performance. You seem to be taking it as a
given that the rules of the language will make both of those
things true, however.
I am saying that a safe interface /can/ sometimes be a burden - not that
it has to be.
An interface defines what can be done, and also what cannot be done.
Its power as an interface comes from both of these - letting you do
useful things, and preventing you from doing harmful things. Sometimes,
however, there may be things you want to do that you know are useful and
not harmful, but that the interface disallows. If the language or
interface is well designed, and a good fit for the tasks people want to
do with the language, then such situations will be minimal. But I think
it is quite easy to fall into a trap when designing an interface where
you exclude too many useful cases. If that happens, then programmers
have to use riskier or less efficient workarounds, or find alternative
interfaces (or languages).
As a concrete example, I recently wanted to use a std::variant<> in my
C++ code. A std::variant<> is, approximately, a struct like:
    struct {
        enum { ... } type_tag;
        union { ... } contents;
    };
In C++, this is all type-safe - you don't have direct access to these
fields, but instead they are kept consistent automatically so that
objects of different types can be stored in the union and have their
constructors and destructors called correctly. But in my case, I wanted
to access the fields independently - I wanted to fill the "contents"
with data retrieved over a network, and set the "type_tag" according to
other data in the network packet. I knew this was safe (since no
constructors or destructors were needed). But there was no way to
handle this within the interface. My options included a risky,
non-portable and difficult workaround (digging through the <variant>
header and directly "hacking" the variant object using unsigned char*
pointers), serious run-time inefficiencies (with extra copies of the
data), or finding a different interface. I opted for the last one,
making my own simple variation of std::variant that gave me the access I
needed.
This, of course, does not necessarily mean the interface of
std::variant<> is bad. If my needs were highly unusual, then a standard
library does not need to support them. But it is an example where the
restrictions imposed by a safe, powerful interface limited how I could
work with what is essentially the same kind of object.
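In plain C the same kind of object is just a tagged union with both fields exposed, which is roughly the access pattern described above. The sketch below is not the code from that project. The wire format (one tag byte followed by a four-byte payload in host byte order) and all names are assumptions made up for illustration.

```c
#include <stdint.h>
#include <string.h>

enum msg_type { MSG_INT = 1, MSG_FLOAT = 2 };

struct msg {
    enum msg_type type_tag;
    union {
        int32_t i;
        float f;
        unsigned char raw[4];
    } contents;
};

/* Fills m from a packet: tag and payload are set independently.
   Returns 0 on success, -1 if the packet is short or the tag unknown. */
static int msg_from_packet(struct msg *m, const unsigned char *pkt, size_t len)
{
    if (len < 5 || (pkt[0] != MSG_INT && pkt[0] != MSG_FLOAT))
        return -1;
    m->type_tag = (enum msg_type)pkt[0];
    memcpy(m->contents.raw, pkt + 1, sizeof m->contents.raw);
    return 0;
}
```

Nothing here keeps the tag and the contents consistent, of course; that is the safety std::variant<> provides and the price of the direct access.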
At first glance it appears that it must suffer from the same
drawbacks as `XC`, which you mentioned above. Except that the
language does provide controlled ways to share data
concurrently.
Using the _unsafe_ subset, there is one place where it is
permitted to have multiple, mutable references to an object: the
`UnsafeCell`. This gives Rust interior mutability, which means
that you can build safe abstractions for data sharing, like
`Mutex` types that own the data they protect.
If "unsafe" gives enough, but is rarely needed in most code, then it
sounds like a good balance.
Yes. And moreover, one can hide that unsafety behind a safe
interface.
I do like that the unsafety is marked explicitly and clearly.
In article <10v17h4$18mp3$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 24/05/2026 21:08, Dan Cross wrote:
In article <10uprpn$1hfpm$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
[snip]
People do sometimes write C code that
knowingly depends on undefined behaviour, because it makes their results
more efficient and "it worked when I tested it".
Yes, or they force some knob on their compiler into the required
position to give them guaranteed results. Linux does this, for
example; last time I worked in it, Google's massive (2BLOC) code
base similarly.
What you then get is code that is UB in C, but not UB in
C-with-no-strict-aliasing, or whatever "augmented" language you pick.
Yes. An observation pointed out to me at my last gig, and that
I now point out occasionally myself, is that Linux (for example)
is not so much written in C but rather in Linux C, which is the
dialect of the language defined by the compilers they use and
the specific behaviors they force using flags, pragmas, etc, for
those compilers. At any rate, it's certainly not strictly
conforming ISO C: is _any_ large program strictly conforming at
this point? I suspect many projects strive for such, but few
actually attain it.
(I have a strict policy of putting any such knobs in "GCC optimize"
pragmas, so that the code correctness does not depend on the flags
picked in a makefile. Of course such code remains equally limited in
its portability.)
Good idea, though I like the idea of the compiler failing with a
usage error if an option is no longer available (or someone is
trying to build using an old compiler or what have you).
[snip]
Hmm. This reads less like an indictment of the idea of stronger
typing, but rather a failure to provide adequate abstractions in
the type system.
It is not an indication of problems with stronger typing, but it is an
indication that the restrictions inherent in the language and tools made
them somewhat unsuitable for a lot of use-cases that fitted the hardware
well. (Modern XMOS tools have changed significantly since those days,
perhaps partly because of such issues.)
Sure, but that's a failure of that system to provide good
abstractions: the system was not rich enough to provide the
functionality you needed or wanted.
I think the central tension here is that the idea of a strong
type system that is also semantically rich is being conflated
with one that's highly restrictive. What I'm suggesting is that
the opposite is true.
To be clear - the restrictions here were not part of the type system.
The XC language did have some differences in typing from C, if I
remember the details correctly - rather than C pointers it had
references which could not be null. And that's fine.
The restrictions were enforced by whole-program analysis. I think (and
you are the Rust expert here, not me) that the Rust borrow checker is
similar. The rules of the language specify aspects of what can and
cannot be done with access to data, and these are enforced at a higher
level than the main compilation. They are language rules, but not part
of the type system.
Sorry, let me be clear here: the _language_ is not expressive
enough to provide the kinds of abstractions that would be useful
for the types of applications the hardware seems well-suited
for.
But the original context was types; my point was that a richer
language can provide the building blocks so that one can build a
safe abstraction around those kinds of behaviors, even if one
uses `unsafe` in the implementation of those abstractions.
This was meant in response to your earlier statement, that e.g.
performance might cause one to want to reach for `unsafe` to
work _around_ the language; my point is that the language gives
you tools to work _with_ it. That is, the initial desire to
reach for `unsafe` is _reduced_ by the ability to effectively
hide it.
Implicit in this is a distinction between building what one
might call infrastructure within one's program, and using that
infrastructure: the former might involve some controlled (and
hopefully limited) use of `unsafe`, while the latter ideally
does not. I find this is the opposite of what many people
assume when coming from other language backgrounds, where the
desire is to push `unsafe` to the point of use; often this is
because in other languages, building up those abstractions
requires a lot of machinery with high runtime costs.
[snip]
Not really. The rule is that borrowing a mutable reference to
some object is mutually exclusive with borrowing any other kind
of reference to that same object at the same time. `unsafe`
doesn't change that rule, or let the programmer off the hook
for violating it: `unsafe` code isn't allowed to mix mut ref
with other references any more than _safe_ code. All `unsafe`
means is that the burden of upholding the rules of the language
falls to the programmer, without compiler assistance, because
the programmer knows something that the compiler does not (and
probably cannot) for some reason.
That is roughly what I was trying to express, although there is perhaps
a grey area between "I am following the rules even though the tools
can't see it" and "I am breaking the rules but I know it is safe to do
so here without compromising the reason for those rules".
That's what I'm saying; a program really _doesn't_ get to break
the rules and retain any guarantee of behavior, unsafe or not.
In that sense, writing `unsafe` Rust code is _harder_ than
writing C or another unsafe language. There is no grey area:
either the programmer makes sure the program follows the rules,
or all bets are off.
In C (trying desperately to bring us back to c.l.c. :-) ), you might use
casts to do something that looks wrong to the compiler, but which you
know is safe and correct.
That's qualitatively different, though. If, say, I manually
validate that a pointer points to valid memory, and is properly
aligned, and so forth, then I can cast that pointer to some
type. You can absolutely do that in Rust, too. But if,
instead, you say, "I'm going to intentionally invoke UB here..."
then both languages will bite you.
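A minimal sketch of the "validate first, then cast" case in Rust, for illustration only. The function name `read_u32_checked` is made up for this example; the point is that the `unsafe` block is justified by checks the programmer did by hand (length and alignment), not that the rules were suspended.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: reads a native-endian `u32` from the start of `bytes`,
/// but only after checking the conditions the pointer cast relies on.
fn read_u32_checked(bytes: &[u8]) -> Option<u32> {
    let ptr = bytes.as_ptr();
    // Check 1: there are enough bytes to hold a u32.
    if bytes.len() < size_of::<u32>() {
        return None;
    }
    // Check 2: the address is suitably aligned for a u32.
    if (ptr as usize) % align_of::<u32>() != 0 {
        return None;
    }
    // SAFETY: the pointer comes from a slice (so it is non-null), it is
    // aligned, it covers at least 4 initialized bytes, and every bit
    // pattern is a valid u32.
    Some(unsafe { *(ptr as *const u32) })
}

fn main() {
    let words = [0x1234_5678u32, 0];
    // View the u32 array as bytes; the start is aligned by construction.
    let bytes: &[u8] =
        unsafe { std::slice::from_raw_parts(words.as_ptr() as *const u8, 8) };

    assert_eq!(read_u32_checked(bytes), Some(0x1234_5678));
    // One byte in, the address is misaligned, so the helper refuses.
    assert_eq!(read_u32_checked(&bytes[1..]), None);
    // Too short to hold a u32.
    assert_eq!(read_u32_checked(&bytes[..2]), None);
}
```

Skipping either check and casting anyway would be the "intentionally invoke UB" case, and the compiler is entitled to assume it never happens.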
However, if it really _is_ the case that multiple simultaneous
writes are ok, the ability to create new types and assign them
semantics and behavior means that a programmer can build a _new_
abstraction that lets them do something similar. Atomics are
an interesting case in point: one can store to them using a
non-mutable reference: while this seems counter-intuitive at
first, it sort of makes sense for the same reason that a `Mutex`
is accessed via an immutable reference: one cannot observe an
atomic in an intermediate state due to an update, and `load` and
`store` are explicit operations (in practice, the compiler
lowers those to single load/store operations and whatever
barriers are appropriate for the target architecture).
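To make the atomics point concrete, here is a small sketch using only `std`. The helper names `bump` and `count_in_parallel` are invented for the example. Note that the counter is never declared `mut`, and every update goes through a shared `&AtomicU32`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Adds 1 to `counter` `n` times through a shared (`&`) reference.
fn bump(counter: &AtomicU32, n: u32) {
    for _ in 0..n {
        counter.fetch_add(1, Ordering::Relaxed);
    }
}

/// Spawns `threads` scoped threads that all update the same atomic.
fn count_in_parallel(threads: u32, per_thread: u32) -> u32 {
    let counter = AtomicU32::new(0); // not `mut`
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| bump(&counter, per_thread));
        }
    }); // all threads are joined when the scope ends
    counter.load(Ordering::Relaxed)
}

fn main() {
    let flag = AtomicU32::new(0);
    let shared: &AtomicU32 = &flag;
    // `store` takes `&self`, so an immutable reference is enough.
    shared.store(7, Ordering::Release);
    assert_eq!(flag.load(Ordering::Acquire), 7);

    assert_eq!(count_in_parallel(4, 1000), 4000);
}
```

The same program with a plain `u32` and `&mut` would not compile, since four threads cannot each hold a mutable borrow of one object.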
OK. Though that is probably a level of detail that would be clearer to
me if I read the Rust documentation first!
A hardware device modeling a framebuffer might be another case;
those are output-only devices, and there's no reason that two
threads cannot write to different parts of the buffer at the
same time, but given a suitable interface, the interface doesn't
need to expose unsafety. And if you've got a rich type system,
you have the rules to build that interface.
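A rough sketch of what such an interface could look like, with an ordinary in-memory `Vec` standing in for the device (a real framebuffer would be memory-mapped and need volatile writes, which this does not model). `FrameBuffer` and `fill_halves` are hypothetical names. The caller sees no `unsafe`; the disjointness argument lives inside the standard library's `split_at_mut`.

```rust
use std::thread;

/// Hypothetical framebuffer: a flat array of pixels owned by the wrapper.
struct FrameBuffer {
    pixels: Vec<u32>,
    width: usize,
}

impl FrameBuffer {
    fn new(width: usize, height: usize) -> Self {
        assert!(width > 0, "width must be non-zero");
        FrameBuffer { pixels: vec![0; width * height], width }
    }

    /// Fills the top and bottom halves concurrently from two threads.
    fn fill_halves(&mut self, top: u32, bottom: u32) {
        let rows = self.pixels.len() / self.width;
        let mid = (rows / 2) * self.width;
        // Two non-overlapping mutable views of the same buffer.
        let (upper, lower) = self.pixels.split_at_mut(mid);
        thread::scope(|s| {
            s.spawn(move || upper.fill(top));
            s.spawn(move || lower.fill(bottom));
        });
    }
}

fn main() {
    let mut fb = FrameBuffer::new(4, 4);
    fb.fill_halves(1, 2);
    assert!(fb.pixels[..8].iter().all(|&p| p == 1));
    assert!(fb.pixels[8..].iter().all(|&p| p == 2));
}
```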
That is the kind of thing that was difficult in XC. Perhaps it would be
fair to say that its type system was not powerful enough to work
conveniently with its data race and thread safety checking systems.
Yes. It also doesn't give you any means to build an abstraction
that lets you do the thing you want to do.
That does not mean that a language should not try to have such rules,
and enforce them at compile time - far from it. The aim should be that
as much code as possible is correct, or at least immune to particular
classes of bugs, checked by the compiler and tools.
I think you're saying something different than what I'm saying;
my point is that the safe interface shouldn't be seen as a
burden and, if well-executed, it shouldn't add a runtime tax in
terms of reduced performance. You seem to be taking it as a
given that the rules of the language will make both of those
things true, however.
I am saying that a safe interface /can/ sometimes be a burden - not that
it has to be.
Well yes, of course; I thought that was a given in the context
of this discussion.
An interface defines what can be done, and also what cannot be done.
Its power as an interface comes from both of these - letting you do
useful things, and preventing you from doing harmful things. Sometimes,
however, there may be things you want to do that you know are useful and
not harmful, but that the interface disallows. If the language or
interface is well designed, and a good fit for the tasks people want to
do with the language, then such situations will be minimal. But I think
it is quite easy to fall into a trap when designing an interface where
you exclude too many useful cases. If that happens, then programmers
have to use riskier or less efficient workarounds, or find alternative
interfaces (or languages).
Yes. And what I'm saying is that when working with a language
that allows you to create useful, robust types, you can often
bridge the two worlds and build a new, useful abstraction. C is
not a language where one can easily do that; you've got structs,
unions, and functions...and that's about it. But you cannot
restrict the behavior of a struct in the way that King described
in the "Parse, Don't Validate" piece I linked earlier.
An instance of a struct is not a proof that some property holds in C,
but it _may_ be in Rust or Haskell or OCaml, SML, etc. I
suspect one cannot easily do that in C++, because it's not
managed and doesn't make ownership a first-class property,
though one can usefully write a number of classes and so forth
that give much the same effect, if not actual guarantees.
Interestingly (maybe?), I find Ada lacking in this area.
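For what "an instance is a proof" can mean in practice, here is a minimal sketch in Rust. The `NonEmpty` type and its methods are invented for this example. Because the field is private to the module, the only way to obtain a `NonEmpty` is through `parse`, so holding one is evidence that the check already happened.

```rust
mod non_empty {
    /// A vector known to contain at least one element.
    /// The inner field is private, so code outside this module
    /// cannot construct a `NonEmpty` without going through `parse`.
    pub struct NonEmpty<T>(Vec<T>);

    impl<T> NonEmpty<T> {
        /// The single place where the property is checked.
        pub fn parse(items: Vec<T>) -> Option<Self> {
            if items.is_empty() {
                None
            } else {
                Some(NonEmpty(items))
            }
        }

        /// No `Option` in the return type: the type itself rules out
        /// the empty case, so indexing element 0 cannot fail.
        pub fn first(&self) -> &T {
            &self.0[0]
        }
    }
}

use non_empty::NonEmpty;

fn main() {
    let xs = NonEmpty::parse(vec![3, 1, 2]).expect("input was non-empty");
    assert_eq!(*xs.first(), 3);

    // An empty vector never becomes a `NonEmpty`.
    assert!(NonEmpty::<i32>::parse(Vec::new()).is_none());
}
```

A C struct with the same layout offers no equivalent, since any translation unit that can see the definition can fill in the fields directly.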
As a concrete example, I recently wanted to use a std::variant<> in my
C++ code. A std::variant<> is, approximately, a struct like :
    struct {
        enum { ... } type_tag;
        union { ... } contents;
    };
In C++, this is all type-safe - you don't have direct access to these
fields, but instead they are kept consistent automatically so that
objects of different types can be stored in the union and have their
constructors and destructors called correctly. But in my case, I wanted
to access the fields independently - I wanted to fill the "contents"
with data retrieved over a network, and set the "type_tag" according to
other data in the network packet. I knew this was safe (since no
constructors or destructors were needed). But there was no way to
handle this within the interface. My options included a risky,
non-portable and difficult workaround (digging through the <variant>
header and directly "hacking" the variant object using unsigned char*
pointers), serious run-time inefficiencies (with extra copies of the
data), or finding a different interface. I opted for the last one,
making my own simple variation of std::variant that gave me the access I
needed.
This, of course, does not necessarily mean the interface of
std::variant<> is bad. If my needs were highly unusual, then a standard
library does not need to support them. But it is an example where the
restrictions imposed by a safe, powerful interface limited how I could
work with what is essentially the same kind of object.
Well, sure, that's the danger of general libraries: you're
getting something that is a decent combination of functionality
and expressiveness contrasted with safety for many needs, but
(almost by definition) not for all needs. As you point out,
that is useful for many cases, though.
At first glance it appears that it must suffer from the same
drawbacks as `XC`, which you mentioned above. Except that the
language does provide controlled ways to share data
concurrently.
Using the _unsafe_ subset, there is one place where it is
permitted to have multiple, mutable references to an object: the
`UnsafeCell`. This gives Rust interior mutability, which means
that you can build safe abstractions for data sharing, like
`Mutex` types that own the data they protect.
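As an illustration of the idea (not how `std::sync::Mutex` is actually implemented), here is a toy spin lock built on `UnsafeCell`. `SpinLock`, `with`, and `sum_with_lock` are hypothetical names. Strictly speaking, what `UnsafeCell` permits is mutation of data reached through a shared reference; it is still up to the implementation to ensure only one `&mut T` is live at a time, which the lock flag does here.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Toy spin lock that owns the data it protects.
pub struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: all access to `value` is serialized by the `locked` flag.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    pub fn new(value: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Runs `f` with exclusive access to the data. Callers never write
    /// `unsafe`. (If `f` panics the lock stays held; a real lock would
    /// release it with a guard type.)
    pub fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            std::hint::spin_loop();
        }
        // SAFETY: we hold the lock, so no other `&mut T` exists right now.
        let result = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Increments a shared counter from several threads using only safe code.
fn sum_with_lock(threads: u32, per_thread: u32) -> u64 {
    let lock = SpinLock::new(0u64);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    lock.with(|n| *n += 1);
                }
            });
        }
    });
    lock.with(|n| *n)
}

fn main() {
    assert_eq!(sum_with_lock(4, 1000), 4000);
}
```

All of the `unsafe` sits in about a dozen lines that can be audited on their own; `sum_with_lock` and any other caller stay within the safe subset.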
If "unsafe" gives enough, but is rarely needed in most code, then it
sounds like a good balance.
Yes. And moreover, one can hide that unsafety behind a safe
interface.
I do like that the unsafety is marked explicitly and clearly.
It goes beyond that, though: _if_ the programmer is successful
in providing a safe abstraction around the unsafety, then it is
impossible to compromise the memory safety of the program by
using that abstraction. Naturally, "if" is doing a lot of work
here: as I said, the burden for writing `unsafe` code is higher
in Rust than it is in C, and it requires a lot of care. But the
resulting guarantees are much stronger.
- Dan C.
Multipass is for when you have only kilobits of core memory, where
the only way to realistically compile your code is to dump
intermediate results (in the assembler case, memory address of
labels) on a magnetic tape.