Folks,
is it possible in Tcl9 to get the old (8.x) behaviour back,
so that Tcl files are read with the system encoding instead of
utf-8?
Is there e.g. an environment variable or a configure
switch to change this?
I found:
--with-encoding encoding for configuration values (default: utf-8)
but what is meant by "configuration values"?
I have the problem that almost all of my sources contain some
umlauts and are, for legacy reasons, in iso8859-1. If it's
not possible to get the old behavior back, I have to
edit every pkgIndex.tcl and every tclsh <script> call
before I'm able to migrate.
Thanks in advance
Uwe
On 13.12.2024 at 16:02, Uwe Schmitz wrote:
Hi Uwe,
source -encoding iso8859-1 $file
I put those in the package index files which source the package files.
Note that msgcat message files are always in utf-8.
The great Tcl9 migration helper by Ashok tries to find the related files.
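For instance, a pkgIndex.tcl entry along these lines keeps the encoding concern inside the index file (a sketch; the package name and file are placeholders):

```tcl
# pkgIndex.tcl -- hypothetical package "mypkg 1.0"; $dir is supplied
# by Tcl's package loader when the index script is evaluated.
package ifneeded mypkg 1.0 \
    [list source -encoding iso8859-1 [file join $dir mypkg.tcl]]
```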
Take care,
Harald
Sorry that I have to come back to this issue.
As stated before, we use iso8859-1 as system encoding.
With Tcl9 we now get errors reading source files with e.g. umlauts,
because Tcl9 interprets all sources as utf-8 by
default. That means we have to add "-encoding iso8859-1"
to ALL source and ALL tclsh calls in ALL scripts.
So far, so good (or bad?).
What initially seemed quite doable looks more and more scary
to me. First, if we ever switch the encoding to utf-8, we
have to alter all those lines again: either we switch them to utf-8, or
we remove the -encoding and go back
to the state before Tcl9.
Another point: we have MANY scripts only for development needs.
Coded quickNdirty for code generation, documentation, packaging, etc.
Most of them are called by "tclsh helperScript.tcl ..." (they have no
shebang or whatever). They now have to be called by
"tclsh -encoding iso8859-1 helperScript.tcl ..."
That's a lot more typing.
Some of them have a usage message like:
usage: tclsh helperScript.tcl arg1 arg2
...
Do we now have to change it to:
usage: tclsh -encoding iso8859-1 helperScript.tcl arg1 arg2
...
?
Side note: The open command, which opens a file with the
system encoding by default, has thankfully not changed
in the same manner as source and tclsh :-).
Now my suggestion:
Wouldn't it be convenient for Tcl9 to have a global switch
(e.g. Environment variable) to get back the Tcl8 encoding
behaviour?
Or, wouldn't it be best to keep the old encoding behaviour in Tcl9?
I see no advantage in the new behaviour. Even if you have
all sources in utf-8, you might just as well have chosen utf-8 as
the system encoding.
Do we have (or can we have) a magic comment or something else
with which we can choose the encoding of a source file in
the file itself?
Python e.g. has https://peps.python.org/pep-0263/.
Maybe I’m missing something crucial...
Thanks in advance
Uwe
They now have to be called by
"tclsh -encoding iso8859-1 helperScript.tcl ..."
That's a lot more typing.
Thanks, Uwe.
Sorry for the inconvenience.
For me, this is a big step forward.
With Tcl 8.6, I always have to type:
source -encoding utf-8 script.tcl
as I don't know the system encoding.
It is not settable for me.
So, this change is a big advantage, as now I can type:
source script.tcl.
On Tue, 7 Jan 2025 18:00:15 +0100, Uwe Schmitz wrote:
They now have to be called by
"tclsh -encoding iso8859-1 helperScript.tcl ..."
That's a lot more typing.
The "lot more typing" problem can be solved with a shell alias.
In Tcl, you can create an alias for 'source' too.
Not exactly what you wanted, but readily available here and now.
Harald,
Thanks, Uwe.
Sorry for the inconvenience.
For me, this is a big step forward.
With Tcl 8.6, I always have to type:
source -encoding utf-8 script.tcl
as I don't know the system encoding.
It is not settable for me.
So, this change is a big advantage, as now I can type:
source script.tcl.
On all the systems I've worked on so far, I've been able
to set the system encoding, even as a normal user.
Users of our in-house software stack (similar to BAWT, but only
for Linux) are advised to set iso8859-1 encoding before running
any programs.
Anyhow, what we should at least have is a magic comment as described
in my other post. This would give you the option of placing
the encoding where it really belongs, and it would avoid
having to include the encoding with every source/tclsh call.
If you ever change the encoding, you have to find all these places
and correct them. Good luck finding them all...
To summarize, I am more and more coming to the opinion
that Tcl9 forces developers to encode their source code in utf-8.
Otherwise you end up in an encoding nightmare.
Best regards,
Uwe
On 08.01.2025 at 11:35, Uwe Schmitz wrote:
Sorry, I am MS-Windows only.
I can only set the system encoding system-wide (well, the answer is more complicated: it depends on the application manifest and the system-wide system encoding).
As I distribute my software worldwide, I am not in control of the system encoding.
Sorry, different use-case, different answer.
If you want this feature, please file a bug report at the bug tracker.
Take care,
Harald
However, before I consider any of the above options,
I try to solve problems as close to the actual cause as possible.
I can get my encoding in the ::env array on Linux. Can you on Windows?
If so, maybe you can use that to set the encoding in the beginning
of all scripts and never have to change it, because it will adjust
automatically.
And you may change the behaviour in the automatically sourced startup file in your home folder or wherever; Linux folks may help:
rename source source2
proc source args {
    source2 -encoding iso8859-1 {*}$args
}
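A sketch combining the two ideas, the locale from ::env and the [source] wrapper (the LANG parsing and the mapping table are assumptions; locale variable formats vary by system):

```tcl
# Pick the -encoding from the locale (assumes a LANG value such as
# "de_DE.ISO-8859-1"); default to iso8859-1 for this legacy code base.
set enc iso8859-1
if {[info exists ::env(LANG)]} {
    switch -glob -- $::env(LANG) {
        *UTF-8* - *utf8*      { set enc utf-8 }
        *ISO-8859-1* - *8859* { set enc iso8859-1 }
    }
}
# Wrap [source] so every script read later uses that encoding.
rename source source2
proc source args "source2 -encoding $enc {*}\$args"
```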
On Wed, 8 Jan 2025 11:35:19 -0300, Luc wrote:
I can get my encoding in the ::env array on Linux. Can you on Windows?
If so, maybe you can use that to set the encoding in the beginning
of all scripts and never have to change it because it will adjust
automatically.
Another idea: force all scripts to source a set_encoding.tcl file
stored somewhere. If you ever have to change, you change the one file
and move on. You could even make it blank if convenient or necessary.
Another idea: force all scripts to source a set_encoding.tcl file
stored somewhere. If you ever have to change, you change the one file
and move on. You could even make it blank if convenient or necessary.
Nice try, but I don't think it's possible to set the encoding within the file. And that for one simple reason: the file has already been read.
On Wed, 8 Jan 2025 16:04:26 +0100, Uwe Schmitz wrote:
Another idea: force all scripts to source a set_encoding.tcl file
stored somewhere. If you ever have to change, you change the one file
and move on. You could even make it blank if convenient or necessary.
Nice try, but I don't think it's possible to set the encoding within the file. And that for one simple reason: the file has already been read.
That doesn't sound quite true to me. Why is there an 'encoding' command
then? Is it useless because whenever you use it it's too late because
the file has already been read? Unlikely.
Source the set_encoding.tcl file before anything else, before you even
try to read anything. If you can set the encoding on the command line,
you can set it on the first line of the script that command line is
supposed to run.
situation. If the script is iso-8859 encoded, but Tcl's default
parsing reads it as UTF-8, then all of the iso-8859 characters
inside are already corrupted *before* even the first command in the
script is executed. So there's no way to "source" a
"set_encoding.tcl" /in the main script itself/, that would adjust
the encoding before the main script is parsed using the wrong
encoding.
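The corruption can be seen directly with the [encoding] command (a sketch; 0xFC is "ü" in iso8859-1 but not a valid byte sequence in UTF-8):

```tcl
# "Grüße" as iso8859-1 contains the raw bytes 0xFC and 0xDF, which do
# not form valid UTF-8 sequences.
set bytes [encoding convertto iso8859-1 "Grüße"]
# Decoding with the matching encoding round-trips cleanly:
puts [encoding convertfrom iso8859-1 $bytes]   ;# prints Grüße
# Decoding the same bytes as UTF-8 mangles them (or, under Tcl 9's
# strict profile, raises an error) -- and [source] does its decoding
# before any command in the file can run.
```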
On Wed, 8 Jan 2025 17:04:26 -0000 (UTC), Rich wrote:
situation. If the script is iso-8859 encoded, but Tcl's default
parsing reads it as UTF-8, then all of the iso-8859 characters
inside are already corrupted *before* even the first command in the
script is executed. So there's no way to "source" a
"set_encoding.tcl" /in the main script itself/, that would adjust
the encoding before the main script is parsed using the wrong
encoding.
I see.
How about trading places?
Instead of main.tcl sourcing set_encoding.tcl, starter.tcl runs some 'encoding' command then sources main.tcl. Basically, a wrapper.
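Such a wrapper can be tiny (a sketch; the file names are placeholders):

```tcl
# starter.tcl -- sources the real entry point with an explicit encoding.
set dir [file dirname [info script]]
source -encoding iso8859-1 [file join $dir main.tcl]
```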
Another option is to run 'iconv' recursively on all those source files.
I did something like that some 15 years ago. But my case involved a migration. I had a ton of legacy iso-8859 files on a system-wide
utf-8 Linux system. That caused me problems too, but iconv fixed it.
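The same conversion can also be done from Tcl itself, which may help where iconv is not available (a sketch; it assumes the file really is iso8859-1 and rewrites it in place, so try it on copies first):

```tcl
# Re-encode one script from iso8859-1 to utf-8.
proc toUtf8 {path} {
    set f [open $path r]
    fconfigure $f -encoding iso8859-1   ;# decode the legacy bytes
    set text [read $f]
    close $f
    set f [open $path w]
    fconfigure $f -encoding utf-8       ;# re-encode on write
    puts -nonewline $f $text
    close $f
}
```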
Instead of main.tcl sourcing set_encoding.tcl, starter.tcl runs some
'encoding' command then sources main.tcl. Basically, a wrapper.
Yes, that works. But then Uwe has to go and "wrapperize" all the
various scripts, on all the various client systems. So he's back in
the same boat of "major modifications need be made now" as changing all
the launching instances to launch with "-encoding iso-8859".
I've resisted pointing this one out, but long term, yes, updating all
the scripts to be utf-8 encoded is the right, long term, answer. But
that belies all the current, short term effort, involved in doing so.
Instead of main.tcl sourcing set_encoding.tcl, starter.tcl runs some
'encoding' command then sources main.tcl. Basically, a wrapper.
Yes, that works. But then Uwe has to go and "wrapperize" all the
various scripts, on all the various client systems. So he's back in
the same boat of "major modifications need be made now" as changing all
the launching instances to launch with "-encoding iso-8859".
True, but he has considered that kind of effort. His words:
"That means we have to add "-encoding iso8859-1"
to ALL source and ALL tclsh calls in ALL scripts.
So far, so good(or bad?)."
"What initially seems quite doable, looks more and more scary
to me. First, if we ever may switch encoding to utf-8 we
have to alter all those lines again."
So in my mind, the "customer" accepts (though grudgingly) making
large scale changes, but is concerned with possible new changes
in the future. A wrapper can handle the future quite gracefully.
I've resisted pointing this one out, but long term, yes, updating all
the scripts to be utf-8 encoded is the right, long term, answer. But
that belies all the current, short term effort, involved in doing so.
Actually, when I mentioned my migration case, I was also thinking that
I could afford to do it because I was migrating to Linux and utf-8 was
not even the future anymore, it was pretty much the present. But maybe
running iconv wouldn't be acceptable because Uwe is (I assume) on
Windows. Does a Windows user want to convert his files to utf-8?
Won't that cause problems if the system is iso-8859-1? Windows still
uses iso-8859-1, right?
So yes, I guess Tcl9 causes trouble to 8859-1 users. Yes, sounds like
it needs some fixing.
More suggestions: how about not using Tcl9 just yet? I'm still on 8.6
and the water is fine. Early adopters tend to pay a price. In my case,
absent packages.
I have my own special case, I use Debian 9 which only ships 8.6.6 so
I had to build 8.6.15 from source because I really need Unicode.
But for some time I used Freewrap as a single-file batteries included
Tcl/Tk interpreter. So maybe Uwe should just use a different interpreter,
likely just a slightly older version of Tcl/Tk, and embrace Tcl9 later.
I wonder if one can hack the encoding issue on the Tcl9 source and
rebuild it.
--
Luc
I did something like that some 15 years ago. But my case involved a
migration. I had a ton of legacy iso-8859 files on a system-wide
utf-8 Linux system. That caused me problems too, but iconv fixed it.
In my case, I used the \uxxxx escapes for anything that was not plain
ASCII, so all my scripts are both "basic 8859" and "utf-8" at the same
time, and having Tcl 9 source them as utf-8 won't cause an issue. But
it sounds like Uwe directly entered the extended 8859 characters into
the scripts. Which very well may have made perfect sense if he had
more than one or two of them per script.
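For example, a script written this way stays pure ASCII on disk, so an iso8859-1 reader and a UTF-8 reader decode it identically:

```tcl
# \u escapes keep the file 7-bit ASCII.
set degree   "25\u00b0C"        ;# 25°C
set greeting "Gr\u00fc\u00dfe"  ;# Grüße
```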
On 1/8/2025 2:32 PM, Rich wrote:
Interesting thread.
Is there a way to check a script file for such incompatibilities ahead
of time?
Would this work as a solution? You build your own Tcl/Tk and add or duplicate the source command from an earlier version that you are happy with. Then you start up your app as you do currently, and once it is
loaded, you switch the source command to the new version and change the system encoding.
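On the checking question, a sketch of a pre-flight test (assuming Tcl 9, whose [encoding convertfrom] accepts a -profile option; strict decoding fails on bytes that are not valid UTF-8):

```tcl
# Report scripts whose bytes would trip Tcl 9's default UTF-8 [source].
proc isValidUtf8 {path} {
    set f [open $path rb]       ;# binary mode: raw bytes
    set bytes [read $f]
    close $f
    expr {![catch {encoding convertfrom -profile strict utf-8 $bytes}]}
}
foreach path [glob -nocomplain *.tcl] {
    if {![isValidUtf8 $path]} {
        puts "needs converting: $path"
    }
}
```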
Won't that cause problems if the system is iso-8859-1?
Only if windows tries to interpret the UTF-8 data as iso-8859
characters. But as far as the Tcl scripts go, once the scripts are
UTF-8, and [source] is using UTF-8 to read them, the fact that the
windows system might be iso-8859 is irrelevant.
8.6.6 handled Unicode fine. In fact, 8.5 handled Unicode (so long as
one stuck to the BMP) just fine.
On Wed, 8 Jan 2025 22:53:40 -0000 (UTC), Rich wrote:
Won't that cause problems if the system is iso-8859-1?
Only if windows tries to interpret the UTF-8 data as iso-8859
characters. But as far as the Tcl scripts go, once the scripts are
UTF-8, and [source] is using UTF-8 to read them, the fact that the
windows system might be iso-8859 is irrelevant.
I was thinking that if the Windows user edits the file on Windows,
maybe Windows will write it as iso-8859. I honestly don't know.
8.6.6 handled Unicode fine. In fact, 8.5 handled Unicode (so long as
one stuck to the BMP) just fine.
I am positive that 8.6.6 only partially supports Unicode.
I found many characters that would not display correctly in a text
widget, and would be saved as garbled content if captured in the
widget and written to a file.
I even had problems with glob and other commands when applied to
some file names. For example, some html page I had downloaded from
somewhere had something to do with countries and the page title had
Unicode flags in the title, so the title and the flags carried over
to the file name when I saved it.
The complete implementation of Unicode begins in 8.6.10 or 8.6.13, I
can't remember which, I think it's 8.6.13.
Display depends upon whether the font being used has a glyph for the
codepoint - no glyph in the font, no display in the text widget
That also depends upon what your system encoding was set to, and
That is probably when support for the extended Unicode characters
(planes beyond the BMP) started to be added.
On Thu, 9 Jan 2025 03:57:14 -0000 (UTC), Rich wrote:
Display depends upon whether your font being used had a glyph for the
codepoint - no glyph in the font, no display in the text widget
That also depends upon what your system encoding was set to, and
That is probably when support for the extended Unicode characters
(planes beyond the BMP) started to be added.
Nothing to do with fonts or encoding. The problem vanished as soon as
I used 8.6.13, later 8.6.15. It was extended Unicode characters.
You can see my discussion here, at the end of the page:
https://wiki.tcl-lang.org/page/Unicode
Luc <luc@sep.invalid> wrote:
Uwe's reality is likely that at some point a "mass migration" may very
well have to be done. There's at least two possibilities:
1) Tcl9 remains as it is today, loading all scripts as UTF-8 unless
told otherwise by a user-provided option. All iso-8859 scripts then
have to be either:
1a) UTF-8 encoded;
1b) modified to pass the -encoding parameter to [source]; or
1c) have a wrapper deployed that 'adjusts' things such that the main
script, and all sourced scripts, use -encoding to source as iso-8859
1c) a wrapper deployed that 'adjusts' things such that the main
script, and all sourced scripts use -encoding to source as iso-8859
All appear to be substantial work based on Uwe's statements so far, and
all have a risk of overlooking one or more that should have been
modified.
2) Tcl9 patch X reverts to using "system encoding" (and the users of
these scripts are on systems where "system encoding" presently
returns iso-8859). So things work again, with no changes, for the
moment. But then Windows version 1Y.Z changes things such that it now
uses a system encoding of UTF-8. Suddenly, the same problem from 1
returns, unless the users have the ability to adjust their system
encoding back (and if 'system encoding' is an "administrator
controlled" setting for these users, then this option is not available).
So my two cents, for what it is worth, given that I suspect this change
will eventually 'force itself' no matter what Tcl9 patch level X might
do, would be to begin the process of migrating all of these scripts to
UTF-8 encoding. It will be hard, but once done, it likely will be
stable again for the future.
I've resisted pointing this one out, but long term, yes, updating all
the scripts to be utf-8 encoded is the right, long term, answer. But
that belies all the current, short term effort, involved in doing so.
Actually, when I mentioned my migration case, I was also thinking that
I could afford to do it because I was migrating to Linux and utf-8 was
not even the future anymore, it was pretty much the present. But maybe
running iconv wouldn't be acceptable because Uwe is (I assume) on
Windows.
From his posts on this thread, we can assume that his scripts are being
used on windows systems. That does not imply much about where Uwe
develops those same scripts. I have lots of my own scripts that I use
on $work's windows machine, but all of them are written on Linux.
Does a Windows user want to convert his files to utf-8?
The average/median windows user does not even know what UTF-8 means nor
why it is significant. They just expect that when they launch "icon X",
the expected program X appears, and that the text inside is as
expected. So it is much more likely the work/effort of "convert to
utf-8" will fall on Uwe, as it is very likely the windows users know
nothing of any of this (or if they 'know' anything, it is something
simple for them, such as: "set this selection box in this windows
config pane to say Y" and that ends their knowledge).
Windows still uses iso-8859-1, right?
Honestly I have no idea. The *only* windows machine I use is $work's
windows machine, and the 'administrator' controls most of it so I can
only adjust things in a very narrow band (very irritating at times, but
their machine, their rules).
So yes, I guess Tcl9 causes trouble to 8859-1 users.
Only if they directly entered any codepoints that were beyond plain
ASCII. Code points 0 through 127 are identical between 8859 and UTF-8.
If the files used plain ASCII, and the \uXXXX escapes, there would be
no trouble at all. Of course if one is using a lot of non-English
characters for non-English languages, seeing the actual characters in
the scripts vs. walls of \u00b0 \u00a0 \u2324 everywhere makes for an
easier development effort.
Yes, sounds like it needs some fixing.
Agreed. Uwe may be able to put off the fixing for some more time, but
this change is going to arrive one day. He will likely have to make it
at some point.
But for some time I used Freewrap as a single-file batteries included
Tcl/Tk interpreter. So maybe Uwe should just use a different interpreter,
likely just a slightly older version of Tcl/Tk and embrace Tcl9 later.
That is another option, a custom build that defaults to iso-8859.
I wonder if one can hack the encoding issue on the Tcl9 source and
rebuild it.
The answer is likely "yes", but I've not looked at the code to know
that for sure. It just feels like a "one line change" followed
by a recompile. But then one has to also deliver that custom runtime
as well as the scripts that go with it.
Nevertheless, this point should be noted under "Important
Incompatibilities in Tcl 9.0"
on the Tcl9 page:
https://www.tcl.tk/software/tcltk/9.0.html
Rich,
at first, thank you very much for explaining my situation very well.
I couldn't have argued better ;-)
Let me add a note on why characters outside the 7-bit ASCII range
cannot always be replaced by the \uXXXX notation:
Comments.
If you like to write comments in your native language, it
is not very readable to code e.g. German umlauts as \uXXXX.
Especially if you extract the program documentation from
the source code in a kind of "literate programming" (which I often
do), the use of \u notation is very cumbersome.
If you develop on Linux (or have a Linux machine available) you may
wish to begin experimenting with using iconv to convert some scripts
to UTF-8 encoding.
If things work properly, it might be best to start
that conversion (even if you do it slowly over time) sooner rather
than later. It will be work, but it is work that you are likely going
to have to perform at some point anyway.
On Thu, 9 Jan 2025 15:37:22 -0000 (UTC), Rich wrote:
If things work properly, it might be best to start that conversion
(even if you do it slowly over time) sooner rather than later. It
will be work, but it is work that you are likely going to have to
perform at some point anyway.
Yes, but now I think that Tcl9 is wrong. Blanket imposition of any
encoding is unfair.
2) Use the "system encoding" (which is still 'imposing', just
'imposing' whatever the OS itself imposes).
On Fri, 10 Jan 2025 00:12:48 -0000 (UTC), Rich wrote:
2) Use the "system encoding" (which is still 'imposing', just
'imposing' whatever the OS itself imposes).
Is the OS really imposing though? I honestly don't know about Windows,
but Linux lets me choose the system-wide encoding. And whatever I
chose, I must've chosen it for some reason. It's not Tcl's place
to challenge my decision.
And if the poor sorry Windows user really can't choose his encoding,
then why should Tcl make the user's life even more difficult?
The 8.6 way is wiser.
Uwe Schmitz <schmitzu@mail.de> wrote:
This was my suspicion. In my case, the non-ASCII characters are not
part of the language (English in my case) of the script; they are
extras (such as arrows/lines or the degree symbol, etc.), and so the
script is 99.9% readable, with a few \uXXXX sometimes occurring.
But writing a string out where every third character is \uXXXX makes
for a very human-unreadable string (be it a comment, or a string for
the code to use).
On 09.01.2025 at 10:12, Uwe Schmitz wrote:
Nevertheless, this point should be noted under "Important Incompatibilities in Tcl 9.0"
on the Tcl9 page:
https://www.tcl.tk/software/tcltk/9.0.html
Hi Uwe,
thanks for all your contributions.
Here is the wiki page for TCL script migration:
https://core.tcl-lang.org/tcl/wiki?name=Migrating+scripts+to+Tcl+9&p
Please look to section "Default encoding for scripts is UTF-8".
The migration tools by Ashok mentioned above also check for the codepage issue. You may consider using those tools to detect other incompatible changes as well.
https://github.com/apnadkarni/tcl9-migrate
I am happy to include any missing information to this page.
Thank you and take care,
Harald
Another thing that hurts me and is off-topic here (sorry):
The changed variable name resolution also affects itcl::class
definitions. The following leads to an error:
::itcl::class A {
    public common tclVersion $tcl_version
}
Because the ::itcl::class command opens a namespace, the resolution
of the global variable tcl_version doesn't succeed. You
have to use the fully qualified name $::tcl_version.
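For completeness, the qualified form that works in Tcl9 (same hypothetical class as above):

```tcl
::itcl::class A {
    public common tclVersion $::tcl_version  ;# fully qualified
}
```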
The 8.6 way is wiser.
Not sure... your choice of encoding may not be the one of your
application users.
In this case, your code may fail to load in the user's encoding choice.
On Fri, 10 Jan 2025 06:38:02 +0000, eric wrote:
The 8.6 way is wiser.
Not sure... your choice of encoding may not be the one of your
application users.
In this case, your code may fail to load in the user's encoding choice.
The user's choice is and always will be a point of uncertainty.
Tcl9 introduces an additional uncertainty with the developer.