I wonder if it would make sense to add @ to the basic character set. Virtually everyone is using it in comments and strings already anyway
(for email addresses), and I don't see anything preventing
implementations from supporting it, as it is available in both ASCII and common EBCDIC code pages:
http://www.colecovision.eu/stuff/proposal-basic-@.html
On 12/5/20 2:58 AM, Philipp Klaus Krause wrote:
> I wonder if it would make sense to add @ to the basic character set.
> [...]
'@' is not in the ISO/IEC 646 invariant subset; in the Danish, Dutch,
French, French Canadian, German, Italian, Spanish, Swedish, and Swiss national variants, that code point is assigned to some other character.
With UTF-8 (on Unix-like systems) and UTF-16 (on Windows systems)
becoming so commonplace, that is less of a concern than it used to be,
but it is still something the committee is likely to pay attention to.
There are other characters that already are part of the C basic
character set that aren't in the invariant subset: "# [ ] { } \ | ~ ^". However, all of those characters played an important role in C syntax
long before ISO/IEC 646; that's not the case for '@'. Trigraphs were
invented to allow those characters to be used on systems that didn't
support them natively.
@ is used in existing C implementations as an extension feature. In particular, a number of embedded C compilers allow a syntax like
"uint8_t reg @ 0x1234;" to mean "reg is a uint8_t object located at
absolute address 0x1234". If people were consistent about using spaces,
this could easily be solved, even if @ were made a letter, by simply
making @ on its own a keyword. But if those compilers accept "uint8_t
reg@0x1234", then that fails.
Another cause for concern: if the symbol is used in identifiers, it
could cause trouble for assemblers and/or linkers on some systems.
(This applies to the common extension of allowing $ as a "letter" in C
- the gcc manual notes that this is not supported on some targets due to
the meaning of $ in assembly on those targets.)
The standards committee are always reluctant to make changes that could interfere with known implementations and existing code, even if the
conflict is with an implementation-specific extension to the language.
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
> On 12/5/20 2:58 AM, Philipp Klaus Krause wrote:
>> [...]
> [...]
> There are other characters that already are part of the C basic
> character set that aren't in the invariant subset: "# [ ] { } \ | ~ ^".
> However, all of those characters played an important role in C syntax
> long before ISO/IEC 646; that's not the case for '@'. Trigraphs were
> invented to allow those characters to be used on systems that didn't
> support them natively.
Apparently the C++ committee felt that it was of so little concern that
they removed trigraphs in C++17. I don't know of any plans to do the
same in C.
There are three printable ASCII characters that aren't in C's basic
character set: '$', '`', and '@'. A guarantee that all three can be
used in string literals, character constants, and comments could be
useful. (Most programmers probably already assume they can be.)
On 05/12/2020 22:17, Keith Thompson wrote:
> [...]
> Apparently the C++ committee felt that it was of so little concern that
> they removed trigraphs in C++17. I don't know of any plans to do the
> same in C.
>
> There are three printable ASCII characters that aren't in C's basic
> character set: '$', '`', and '@'. A guarantee that all three can be
> used in string literals, character constants, and comments could be
> useful. (Most programmers probably already assume they can be.)
1) Trigraphs were proving to be a road-block for C++. In addition they
are so rarely used (certainly in C++) that many (probably most)
programmers fail to recognise them. WG14 appears reluctant to remove
things even when they have no practical use in modern code. The argument
that they are needed for legacy systems is, I think, very weak;
compilers will continue to support them where necessary by providing
legacy code switches.
2) As one design feature of C is portability, it is time that the three
characters you mention were added to the basic character set. I do
not see how that would have a negative effect on implementations that
already use them for extensions. Those uses do not (or should not) rely
on them not being part of the basic character set.
3) Instead of speculating that their inclusion would cause problems for
some programmers, we need evidence that that is the case. Considering
that it would be hard to use a modern computer system without having
both @ and $ available (think mobile and portable computer technology),
I would be surprised if it were a serious problem for anyone.
Just my 2c/p/d
Francis
On 06/12/2020 13:25, Francis Glassborow wrote:
> On 05/12/2020 22:17, Keith Thompson wrote:
>> [...]
>> There are three printable ASCII characters that aren't in C's basic
>> character set: '$', '`', and '@'. A guarantee that all three can be
>> used in string literals, character constants, and comments could be
>> useful. (Most programmers probably already assume they can be.)
Agreed.
> 1) Trigraphs were proving to be a road-block for C++. In addition they
> are so rarely used (certainly in C++) that many (probably most)
> programmers fail to recognise them. WG14 appears reluctant to remove
> things even when they have no practical use in modern code. The argument
> that they are needed for legacy systems is, I think, very weak;
> compilers will continue to support them where necessary by providing
> legacy code switches.
There is also the difference that C is used on a much wider range of
systems than C++, especially older systems. C++ is able to drop support
for odder systems (such as those with more limited character sets, or stranger integer representation) simply because it has not been used on
such systems.
> 2) As one design feature of C is portability, it is time that the three
> characters you mention were added to the basic character set. I do
> not see how that would have a negative effect on implementations that
> already use them for extensions. Those uses do not (or should not) rely
> on them not being part of the basic character set.
As long as they are only available (by standard) for using in strings
and comments, not identifiers, there should be no conflict unless they
can't be represented in the source (for comments) or execution (for
string literals) character set of the system. But if these characters
are supported by the relevant character sets, then in any real-world
compiler (such as any that support ASCII), they will already be
supported as extended characters.
In other words, there is nothing significantly useful to be gained by
putting these characters in the basic character set. Equally, there is
no real risk in doing so. It is purely a hypothetical issue, AFAICS. And
the C standards committee are not known for spending extra effort on
something that makes no difference in reality.
The issue with making them part of the basic character set is that it
makes any system that can't do this, because it uses a strange character
set, non-conforming. Since systems ARE allowed to add any characters
they want to the source or execution character set, those that currently support them can do so. Forcing them to be included drops some system
from being able to have a conforming implementation, and the committee
has traditionally avoided gratuitously making systems non-conforming.
The only case that can be made for including them is that programs that
use those characters might then be able to become strictly conforming
programs instead of just conforming programs, but strict conformance
isn't really that big of a deal in practice, as virtually all real
programs are going to fail strict conformance because they depend on
some aspect of the environment (like how I/O actually works).
Richard Damon <Richard@Damon-Family.org> writes:
[...]
> The issue with making them part of the basic character set is that it
> makes any system that can't do this, because it uses a strange character
> set, non-conforming. [...]
(Context: The ASCII characters '@', '$', and '`'.)
I'd be interested in seeing an implementation for which this would
be relevant. Such an implementation (a) would be unable to (easily)
represent those three characters in source code and/or during
execution *and* (b) would otherwise conform to the hypothetical
edition of the C standard that would add them to the basic character
set if it were not for this change.
Implementations that can't support those characters are likely to be
for tiny exotic target systems, and very likely won't be conforming
anyway, and so could simply ignore the addition of those characters
to the basic character set.
> The only case that can be made to make them part, is that then programs
> that use those characters might be able to become strictly conforming
> programs instead of just being conforming programs, but strict
> conformance isn't really that big of a deal in practicality, as
> virtually all real programs are going to fail strict conformance because
> they are going to depend on some aspect of the environment (Like how I/O
> actually works)
I suppose I agree that it's not that big a deal. Code that uses
those characters is *practically* 100% portable already, and I haven't
found a way to coax either gcc or clang to warn about puts("$@`").
The benefit would be minor, and the cost would be very close to zero
(unless an implementation as I've described above actually exists).
It would be one less thing to think about when writing code that's
intended to be as portable as possible.
On 12/6/20 5:07 PM, Keith Thompson wrote:
> [...]
> I'd be interested in seeing an implementation for which this would
> be relevant. Such an implementation (a) would be unable to (easily)
> represent those three characters in source code and/or during
> execution *and* (b) would otherwise conform to the hypothetical
> edition of the C standard that would add them to the basic character
> set if it were not for this change.
As was mentioned, all that you need is to want to support ISO/IEC 646
with a national character set that doesn't define code point 64 as @.
This includes Canadian, French, German, Irish, and a number of others.
See https://en.wikipedia.org/wiki/ISO/IEC_646 for a chart of these.
> There are three printable ASCII characters that aren't in C's basic
> character set: '$', '`', and '@'. A guarantee that all three can be
> used in string literals, character constants, and comments could be
> useful. (Most programmers probably already assume they can be.)
> Implementations that can't support […] are likely to be
> for tiny exotic target systems,

I made that mistake before, with N2576. Spoiler: ctype.h would be hard
Richard Damon <Richard@Damon-Family.org> writes:
> [...]
> As was mentioned, all that you need is to want to support ISO/IEC 646
> with a national character set that doesn't define code point 64 as @.
> This includes Canadian, French, German, Irish, and a number of others.
> See https://en.wikipedia.org/wiki/ISO/IEC_646 for a chart of these.
What C implementations support those character sets (and are likely to attempt to conform to a future C standard that adds '@' to the basic character set)?
The following characters are also not part of the invariant character
set: # [ \ ] ^ { | } ~ (We have trigraphs for those. I *do not*
suggest adding trigraphs for @ $ `.)
C++ has already dropped trigraphs because support for the old 7-bit
national character sets was considered unimportant. (But C++17
did not add @ $ ` to its basic character set.) I understand that
C has different issues than C++, but in my opinion adding @ $
` to C's basic character set would cause no actual harm.
On 12/6/20 6:49 PM, Keith Thompson wrote:
> [...]
> What C implementations support those character sets (and are likely to
> attempt to conform to a future C standard that adds '@' to the basic
> character set)?
gcc (and many others) with the right choice of file encoding options.
The key point here is that this change would be telling a number of
national bodies that their whole national character set (and thus in
some respects their language) will no longer be supported.
Philipp Klaus Krause wrote:
> I wonder if it would make sense to add @ to the basic character set.
> [...]

Just to add to the "used as an extension" list of compilers; the Dignus
compilers (and the SAS/C compilers) for the mainframe use @ to be
similar to &, except that it can accept an rvalue. If an rvalue is
present after a @, then the address of a copy is generated. The copy is
declared within the inner-most scope.

This is helpful in some situations on the mainframe where
pass-by-reference is the norm, as in:

FOO(@1, @2);

(where FOO is defined in some other language, e.g. PL/I, where the
parameters are pass-by-reference.)
Richard Damon <Richard@Damon-Family.org> writes:
> [...]
> gcc (and many others) with the right choice of file encoding options.
>
> The key point here is that this change would be telling a number of
> national bodies that their whole national character set (and thus in
> some respects their language) will no longer be supported.
OK. Can you explain precisely how to invoke gcc with the right choice
of file encoding options? I've found this option in the gcc manual:
'-finput-charset=CHARSET'
Set the input character set, used for translation from the
character set of the input file to the source character set used by
GCC. If the locale does not specify, or GCC cannot get this
information from the locale, the default is UTF-8. This can be
overridden by either the locale or this command-line option.
Currently the command-line option takes precedence if there's a
conflict. CHARSET can be any encoding supported by the system's
'iconv' library routine.
but I had never used it.
I just used "iconv -l" to get what I presume is a list of valid CHARSET values (there are over 1000 of them), which led me to this:
gcc -std=c11 -pedantic-errors -finput-charset=ISO646-FR -c c.c
With this source file:
#include <stdio.h>
int main(void) {
    puts("$@`");
}
it produced a cascade of errors, starting with:
In file included from <command-line>:31:
/usr/include/stdc-predef.h:18:1: error: stray ‘\302’ in program
18 | #ifndef _STDC_PREDEF_H
| ^
It looks like something translated the # character to \302 (0xc2).
I have no idea why. (And it didn't complain about "$@`".)
If there's a way to invoke gcc telling it to use a character set that
doesn't include those characters, that would be a good refutation
to my point. If doing so is actually useful in some contexts,
it would be an even better refutation. So far I'm not convinced,
but I'm prepared to be.
My impression is that the old 7-bit national character sets are
no longer relevant, and that dropping support for them in the
C standard (more precisely, updating the C standard in a manner
that's inconsistent with those character sets) would be very nearly
harmless. I'm looking for evidence that that's not the case.
[...]
On 12/7/20 3:16 PM, Keith Thompson wrote:
> [...]
> gcc -std=c11 -pedantic-errors -finput-charset=ISO646-FR -c c.c
>
> With this source file:
>
> #include <stdio.h>
> int main(void) {
>     puts("$@`");
> }
>
> [...]
> It looks like something translated the # character to \302 (0xc2).
> I have no idea why. (And it didn't complain about "$@`".)
One problem is that the file is NOT compatible with ISO646-FR: in that
encoding, the code point ASCII uses for '#' (the number sign) is instead
the character £, which is illegal in C source. It is one of the
encodings that NEEDS trigraphs or digraphs in source files to use C.
> I just used "iconv -l" to get what I presume is a list of valid CHARSET
> values (there are over 1000 of them), which led me to this:
>
> gcc -std=c11 -pedantic-errors -finput-charset=ISO646-FR -c c.c
> [...]
> it produced a cascade of errors, starting with:
>
> In file included from <command-line>:31:
> /usr/include/stdc-predef.h:18:1: error: stray ‘\302’ in program
> [...]
The first file it complains about, /usr/include/stdc-predef.h,
is part of the implementation (specifically part of glibc).
Either the implementation doesn't support ISO646-FR, or there's
some configuration I would need to perform to make it support it.
On Dez 07 2020, Keith Thompson wrote:
> The first file it complains about, /usr/include/stdc-predef.h,
> is part of the implementation (specifically part of glibc).
> Either the implementation doesn't support ISO646-FR, or there's
> some configuration I would need to perform to make it support it.

The system files are encoded in UTF-8, so if you want to use them in an
ISO646-FR context, you have to convert them first.
Andreas.
Andreas Schwab <schwab@linux-m68k.org> writes:
> [...]
> The system files are encoded in UTF-8, so if you want to use them in an
> ISO646-FR context, you have to convert them first.
I suppose that would work (and would break the implementation for my
normal use).
That's not a reasonable thing to expect a user to do. If that's the
simplest way to get the implementation to support ISO646-FR, then I'd
say the implementation doesn't support ISO646-FR.
On 12/7/20 6:27 PM, Keith Thompson wrote:
Andreas Schwab <schwab@linux-m68k.org> writes:
On Dez 07 2020, Keith Thompson wrote:
The first file it complains about, /usr/include/stdc-predef.h,
is part of the implementation (specifically part of glibc).
Either the implementation doesn't support ISO646-FR, or there's
some configuration I would need to perform to make it support it.
The system files are encoded in UTF-8, so if you want to use them in an
ISO646-FR context, you have to convert them first.
I suppose that would work (and would break the implementation for my
normal use).
That's not a reasonable thing to expect a user to do. If that's the
simplest way to get the implementation to support ISO646-FR, then I'd
say the implementation doesn't support ISO646-FR.
Actually, unless the files use characters outside the basic set, all
that is required is encoding the problematic characters as trigraphs or
digraphs, which will work for all users.
Thomas David Rivers <rivers@dignus.com> writes:
Philipp Klaus Krause wrote:
I wonder if it would make sense to add @ to the basic character set.
Virtually everyone is using it in comments and strings already anyway
(for email addresses), and I don't see anything preventing
implementations from supporting it, as it is available in both ASCII and
common EBCDIC code pages:
http://www.colecovision.eu/stuff/proposal-basic-@.html
Just to add to the "used as an extension" list of compilers; the Dignus
compilers (and the SAS/C compilers) for the mainframe use @ to be similar
to &, except that it can accept an rvalue. If an rvalue is present
after a @, then the address of a copy is generated. The copy is
declared within the inner-most scope.
This is helpful in some situations on the mainframe where
pass-by-reference is the norm, as in:
FOO(@1, @2);
(where FOO is defined in some other language, e.g. PL/I, where the
parameters are pass-by-reference.)
You can do the same thing with a compound literal starting in C99:
#include <stdio.h>

void FOO(int *a, int *b) {
    printf("%d %d\n", *a, *b);
}

int main(void) {
    /* &(int){1} takes the address of an unnamed, writable int object */
    FOO(&(int){1}, &(int){2});
}
I suspect the extension predates compound literals.
On 05.12.20 at 08:58, Philipp Klaus Krause wrote:
I wonder if it would make sense to add @ to the basic character set.
Virtually everyone is using it in comments and strings already anyway
(for email addresses), and I don't see anything preventing
implementations from supporting it, as it is available in both ASCII and
common EBCDIC code pages:
http://www.colecovision.eu/stuff/proposal-basic-@.html
After some discussion and thought, IMO, the way forward is to add @ to
the source and execution character sets, but not the basic source
character set:
http://www.colecovision.eu/stuff/proposal-@.html
Do you think this proposal makes sense as is? If yes, do you have a
preference for adding them as single bytes vs. not specifying if they
are single bytes? If yes, why?
Philipp Klaus Krause <pkk@spth.de> writes:
On 05.12.20 at 08:58, Philipp Klaus Krause wrote:
I wonder if it would make sense to add @ to the basic character set.
Virtually everyone is using it in comments and strings already anyway
(for email addresses), and I don't see anything preventing
implementations from supporting it, as it is available in both ASCII and
common EBCDIC code pages:
http://www.colecovision.eu/stuff/proposal-basic-@.html
After some discussion and thought, IMO, the way forward is to add @ to
the source and execution character sets, but not the basic source
character set:
http://www.colecovision.eu/stuff/proposal-@.html
Do you think this proposal makes sense as is? If yes, do you have a
preference for adding them as single bytes vs. not specifying if they
are single bytes? If yes, why?
It's not *necessary*, but I wouldn't object to it.
If this change is going to be made, I'd advocate also adding $
(mentioned in the proposal) and ` (not mentioned). None of @,
$, and ` are required for any C tokens, but many implementations
allow $ in identifiers. @, $, and ` are the only ASCII characters
that are not part of the C basic character sets. All are commonly
used in character constants and string literals. (`, backtick,
is used in Markdown and some other languages.)
The *basic* characters are those that are required for all
implementations. The set of *extended* characters is
implementation-defined, and may be empty. The @, $, and ` characters
are extended characters in most or all current implementations.
If @, $, and ` are going to be required, I think they should be in the
basic character set. That's the point of the distinction between basic
and extended characters.