• Remove new line char

    From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Wed Mar 19 11:47:55 2025
    From Newsgroup: comp.lang.tcl

    I'm accessing the Windows clipboard using twapi.
    When the text in the clipboard was copied from an MS Excel cell, then
    text contains some new line char that I could not identify and thus
    unable to remove it using

    regsub {[\r\n]+$} $text ""

    What other way are there besides "string trim" to remove new line chars?

    Many thanks
    Alexandru

    --
    --- Synchronet 3.20c-Linux NewsLink 1.2
  • From Ralf Fassel@ralfixx@gmx.de to comp.lang.tcl on Wed Mar 19 14:53:23 2025

    * alexandru.dadalau@meshparts.de (alexandru)
    | I'm accessing the Windows clipboard using twapi.
    | When the text in the clipboard was copied from an MS Excel cell, then
    | text contains some new line char that I could not identify and thus
    | unable to remove it using

    | regsub {[\r\n]+$} $text ""

    | What other way are there besides "string trim" to remove new line chars?

    First step would be to identify the offending char by printing it in hex
    or octal. Depending on the result, regexp pattern "whitespace" might be
    of help

    regsub {[[:space:]]+$} $text ""
    regsub {[\s]+$} $text ""

    (cf https://www.tcl-lang.org/man/tcl/TclCmd/re_syntax.htm)

    HTH
    R'
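
    A minimal sketch of the "print it in hex" step suggested above. The proc
    name charCodes is made up for illustration; it only relies on split, scan
    and format from core Tcl.

```tcl
# Hypothetical helper: list the code point of every character in hex,
# so that invisible trailing characters (CR, LF, NUL, ...) become visible.
proc charCodes {text} {
    set codes {}
    foreach ch [split $text ""] {
        # scan ... %c yields the integer code point of the character
        lappend codes [format %02x [scan $ch %c]]
    }
    return $codes
}

# Example with a made-up string ending in CR, LF and NUL
puts [charCodes "A\r\n\x00"]
```

    Looking at the whole string rather than only the last character shows
    whether more than one unwanted character is sitting at the end.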
  • From Arjen@user153@newsgrouper.org.invalid to comp.lang.tcl on Wed Mar 19 15:20:09 2025


    alexandru.dadalau@meshparts.de (alexandru) posted:

    > I'm accessing the Windows clipboard using twapi.
    > When the text in the clipboard was copied from an MS Excel cell, then
    > text contains some new line char that I could not identify and thus
    > unable to remove it using
    >
    > regsub {[\r\n]+$} $text ""
    >
    > What other way are there besides "string trim" to remove new line chars?
    >
    > Many thanks
    > Alexandru
    >
    > --

    These applications may insert all manner of invisible characters. Yesterday,
    a colleague copied some source code from a PowerPoint presentation and that
    turned out to be riddled with invisible characters that in an editor showed
    up as "Â".

    So it goes, to quote Kurt Vonnegut.
  • From Harald Oehlmann@wortkarg3@yahoo.com to comp.lang.tcl on Wed Mar 19 16:32:25 2025

    On 19.03.2025 at 12:47, alexandru wrote:
    > I'm accessing the Windows clipboard using twapi.
    > When the text in the clipboard was copied from an MS Excel cell, then
    > text contains some new line char that I could not identify and thus
    > unable to remove it using
    >
    > regsub {[\r\n]+$} $text ""
    >
    > What other way are there besides "string trim" to remove new line chars?
    >
    > Many thanks
    > Alexandru
    >
    > --

    what about:
    string map "\n {} \r {}" $text

    Harald
  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Wed Mar 19 17:44:39 2025

    Thanks Ralf.

    I tried:

    format %x [scan [string index $text end] %c]

    but it only identifies 0 as last char.

    both

    set text [regsub {[[:space:]]+$} $text ""]

    and

    set text [regsub {[\s]+$} $text ""]

    made no difference...

    --
  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Wed Mar 19 17:47:39 2025

    Above answer goes to Harald.

    --
  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Wed Mar 19 17:46:31 2025

    Well, that's what my original regexp does. No effect.

    --
  • From Rich@rich@example.invalid to comp.lang.tcl on Wed Mar 19 20:47:00 2025

    alexandru <alexandru.dadalau@meshparts.de> wrote:
    > Well, that's what my original regexp does. No effect.

    No, Harald's is not identical to your regexp.

    Your original (since you did not quote anything):

    regsub {[\r\n]+$} $text ""

    If this is exactly what you are doing, this replaces a run of \r or \n
    characters that also occur adjacent to the end of the string.

    Harald suggested:

    string map "\n {} \r {}" $text

    This deletes every occurrence of \n and every occurrence of \r,
    anywhere in the string.

    Which is very different.
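
    One detail worth checking when trying the string map variant: inside
    double quotes, \n and \r are substituted before string map parses its
    mapping as a list, and both characters are list separators. As far as I
    can tell, the mapping then consists of two empty elements and nothing is
    replaced. A sketch that builds the mapping with list instead (the sample
    string is made up):

```tcl
set sample "cell text\r\n"

# Mapping written in double quotes: the LF and CR turn into list
# whitespace, so the mapping is just {} {} and the string stays unchanged.
set quoted [string map "\n {} \r {}" $sample]

# Mapping built with [list]: LF and CR survive as real keys and are deleted.
set listed [string map [list "\n" {} "\r" {}] $sample]

puts [string length $quoted]
puts [string length $listed]
```

    Note that this still deletes every \n and \r anywhere in the string,
    not only a trailing run.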
  • From Michael Soyka@mssr953@gmail.com to comp.lang.tcl on Wed Mar 19 16:54:20 2025

    Hi.

    I'm sorry I missed what came before but ...

    On 03/19/2025 1:44 PM, alexandru wrote:
    > Thanks Ralf.
    >
    > I tried:
    >
    > format %x [scan [string index $text end] %c]
    >
    > but it only identifies 0 as last char.

    Just to be clear, are you saying that scan returns the integer value
    zero or does it return 48, the ascii code for zero? If it's the integer
    zero, that says to me that "text" contains a NUL-terminated string. If
    so, does not
    regsub {\x00$} $text {} text_sans_nul
    do what you want?


    > both
    >
    >  set text [regsub {[[:space:]]+$} $text ""]
    >
    > and
    >
    >  set text [regsub {[\s]+$} $text ""]
    >
    > made no difference...

    --
    -mike
  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Thu Mar 20 09:25:40 2025

    On Wed, 19 Mar 2025 20:47:00 +0000, Rich wrote:

    > Harald suggested:
    >
    > string map "\n {} \r {}" $text
    >
    > This deletes every occurrence of \n and every occurrence of \r,
    > anywhere in the string.
    >
    > Which is very different.

    Of course, in general.
    In my case it does the same, since there is only one new line char at
    the end.

    --
  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Thu Mar 20 09:24:14 2025

    On Wed, 19 Mar 2025 20:54:20 +0000, Michael Soyka wrote:

    > so, does not
    > regsub {\x00$} $text {} text_sans_nul
    > do what you want?
    >
    > -mike

    Hi Mike,

    Thanks, I tried your regex but it makes no difference.

    --
  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Thu Mar 20 09:34:40 2025

    Here is the code.
    Can anyone test on Windows with Excel?
    The test procedure is:

    Open an Excel table with data and copy one single cell.
    In the Tcl/Tk console execute the code below.
    See how the copied text is output to the window.
    In my case there is an additional line under the copied text, meaning a
    new line was still created.


    package require twapi
    proc ::ClipboardGetText {} {
        ::twapi::open_clipboard
        set text [::twapi::read_clipboard 1]
        ::twapi::close_clipboard
        return $text
    }
    proc ::ClipboardSetText {text} {
        # Write clipboard content
        ::twapi::open_clipboard
        ::twapi::empty_clipboard
        ::twapi::write_clipboard_text $text
        ::twapi::close_clipboard
    }

    ## Remove line breaks at the end of string (mostly to avoid issues when
    ## copying from Excel)
    proc ::test {} {
        set text [::ClipboardGetText]
        puts [format %x [scan [string index $text end] %c]]
        set text [regsub {[\r\n]+$} $text ""]
        set text [regsub {[[:space:]]+$} $text ""]
        set text [regsub {[\s]+$} $text ""]
        set text [regsub {\x0$} $text ""]
        # Do not use "trim" since we want to preserve trailing white spaces
        # but remove new line char
        puts $text
    }
    ::test

    --
  • From Simon Geard@simon@whiteowl.co.uk to comp.lang.tcl on Thu Mar 20 10:07:35 2025

    On 19/03/2025 11:47, alexandru wrote:
    > I'm accessing the Windows clipboard using twapi.
    > When the text in the clipboard was copied from an MS Excel cell, then
    > text contains some new line char that I could not identify and thus
    > unable to remove it using
    >
    > regsub {[\r\n]+$} $text ""
    >
    > What other way are there besides "string trim" to remove new line chars?
    >
    > Many thanks
    > Alexandru
    >
    > --
    Could this be an encoding issue? I remember that the default output from
    a Powershell is UTF-16 LE so maybe that's true of Excel (which I don't
    have so can't test).
    Another thought is to create a list split on the unwanted newline
    character then reassemble it.

    Simon
  • From Harald Oehlmann@wortkarg3@yahoo.com to comp.lang.tcl on Thu Mar 20 11:22:54 2025

    Hi Alexandru,
    please allow me to share this wish 8.6.14 32 bit session:

    Excel 2013 with a cell with the content "AÄ".



    % clipboard get


    % lmap i [split $c ""] {scan $i %c}
    65 196 10
    % string trimright $c "\x0a"

    %

    So, "string trimright \x0a" does the job for me.

    Take care,
    Harald
  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Thu Mar 20 11:56:50 2025

    On Thu, 20 Mar 2025 10:22:54 +0000, Harald Oehlmann wrote:

    > Hi Alexandru,
    > please allow me to share this wish 8.6.14 32 bit session:
    >
    > Excel 2013 with a cell with the content "AÄ".
    >
    >
    >
    > % clipboard get
    >
    >
    > % lmap i [split $c ""] {scan $i %c}
    > 65 196 10
    > % string trimright $c "\x0a"
    >
    > %
    >
    > So, "string trimright \x0a" does the job for me.
    >
    > Take care,
    > Harald

    Harald, I think you indirectly showed me the cause of the issue. It's:

    ::twapi::read_clipboard 1

    Somehow it does different things than "clipboard get".

    In the twapi manual it states:

    "The content is an exact copy of the contents of the clipboard in binary
    format. Callers will need to use Tcl commands such as binary and
    encoding to parse the data."

    I tried

    set text [encoding convertfrom cp1252 $text]

    but no success.

    --
  • From Michael Soyka@mssr953@gmail.com to comp.lang.tcl on Thu Mar 20 11:12:08 2025

    On 03/20/2025 5:34 AM, alexandru wrote:
    > Here is the code.
    > Can anyone test on Windows with Excel?
    > The test procedure is:
    >
    > Open an Excel table with data and copy one single cell.
    > In the Tcl/Tk console execute the code below.
    > See how the copied text is output to the window.
    > In my case there is an additional line under the copied text, meaning a
    > new line was still created.
    >
    >
    > package require twapi
    > proc ::ClipboardGetText {} {
    >  ::twapi::open_clipboard
    >  set text [::twapi::read_clipboard 1]
    >  ::twapi::close_clipboard
    >  return $text
    > }
    > proc ::ClipboardSetText {text} {
    >  # Write clipboard content
    >  ::twapi::open_clipboard
    >  ::twapi::empty_clipboard
    >  ::twapi::write_clipboard_text $text
    >  ::twapi::close_clipboard
    > }
    >
    > ## Remove line breaks at the end of string (mostly to avoid issues when copying from Excel)
    > proc ::test {} {
    >  set text [::ClipboardGetText]
    >  puts [format %x [scan [string index $text end] %c]]
    >  set text [regsub {[\r\n]+$} $text ""]
    >  set text [regsub {[[:space:]]+$} $text ""]
    >  set text [regsub {[\s]+$} $text ""]
    >  set text [regsub {\x0$} $text ""]
    >  # Do not use "trim" since we want to preserve trailing white spaces
    >  # but remove new line char
    >  puts $text
    > }
    > ::test
    >
    > --

    I'm running Windows 10 with tcl 9 in the USA. Following your suggestion
    above and using your code, I did this:

    set text [::ClipboardGetText]
    foreach c [split $text {}] {puts [scan $c %c]}

    and I get the desired text plus 3 trailing characters: \r\n\x00.

    So, would this regular expression: {\r\n\x00} do what you want?

    -mike
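
    Given the three trailing characters reported above (\r\n\x00), one
    possible reason the earlier attempts showed no effect is that a pattern
    anchored with $ cannot match the line ending while a NUL is still the
    last character, and in the posted test proc the NUL is only removed
    afterwards. A sketch that drops the NUL first and then one line ending,
    keeping trailing spaces; the proc name stripClipboardTail and the sample
    string are made up:

```tcl
# Hypothetical helper: remove a trailing NUL terminator and one line ending,
# but keep any trailing spaces or tabs that belong to the cell text.
proc stripClipboardTail {raw} {
    # Step 1: drop trailing NUL character(s) so the end anchor can see CR/LF
    set s [string trimright $raw "\x00"]
    # Step 2: drop a single CRLF or LF at the very end of the string
    regsub {\r?\n$} $s {} s
    return $s
}

# Made-up sample: text with two trailing spaces, then CR LF NUL
set cleaned [stripClipboardTail "abc  \r\n\x00"]
puts "<$cleaned>"
```

    The order of the two steps is the point here; the same regsub run before
    the NUL is gone leaves the string untouched.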
  • From Rich@rich@example.invalid to comp.lang.tcl on Thu Mar 20 18:03:24 2025

    alexandru <alexandru.dadalau@meshparts.de> wrote:
    > I tried
    >
    > set text [encoding convertfrom cp1252 $text]
    >
    > but no success.

    Are you sure the clipboard is cp1252?

    If it is utf-8 then converting as if it is cp1252 will create strange
    results.
  • From Harald Oehlmann@wortkarg3@yahoo.com to comp.lang.tcl on Fri Mar 21 08:32:39 2025

    On 20.03.2025 at 19:03, Rich wrote:
    > alexandru <alexandru.dadalau@meshparts.de> wrote:
    > > I tried
    > >
    > > set text [encoding convertfrom cp1252 $text]
    > >
    > > but no success.
    >
    > Are you sure the clipboard is cp1252?
    >
    > If it is utf-8 then converting as if it is cp1252 will create strange
    > results.

    Dear Rich,
    the Windows clipboard is more complicated/sophisticated.

    Each entry has a type. This is nowadays mostly CF_UNICODETEXT, which is
    16 bit unicode.
    A sending application can put an entry into the clipboard in multiple
    formats and the receiving application can pick one or more.

    You see that if you paste a Word snippet into LibreOffice. You have
    multiple options: paste normal (which is Rich Text Format), paste as
    text (which is a unicode string).

    And types may be image, file, folder,...

    https://learn.microsoft.com/en-us/windows/win32/dataxchg/clipboard-formats

    We have often "enhanced" this code in TCL to adapt it to modern usage.
    There should not be any issues with the encoding, except when the
    sending application does anything strange.

    Take care,
    Harald
  • From Rich@rich@example.invalid to comp.lang.tcl on Fri Mar 21 12:59:44 2025

    Harald Oehlmann <wortkarg3@yahoo.com> wrote:
    > On 20.03.2025 at 19:03, Rich wrote:
    > > alexandru <alexandru.dadalau@meshparts.de> wrote:
    > > > I tried
    > > >
    > > > set text [encoding convertfrom cp1252 $text]
    > > >
    > > > but no success.
    > >
    > > Are you sure the clipboard is cp1252?
    > >
    > > If it is utf-8 then converting as if it is cp1252 will create strange
    > > results.
    >
    > Dear Rich,
    > the Windows clipboard is more complicated/sophisticated.

    Yes, that I am well aware.

    > Each entry has a type. This is nowadays mostly CF_UNICODETEXT, which is
    > 16 bit unicode.
    > A sending application can put an entry into the clipboard in multiple
    > formats and the receiving application can pick one or more.

    Yes, but alexandru's example he posted, he's not picking a format from
    the clipboard, he's assuming it was cp1252 and forcefully decoding it
    as if it was cp1252. If he left out telling us he requested cp1252
    from the clipboard then that's on him.

    > We have often "enhanced" this code in TCL to adapt it to modern
    > usage. There should not be any issues with the encoding, except when
    > the sending application does anything strange.

    This is good, because handling the clipboard on windows is an ugly
    beast. My comment was to what alexandru posted showing simply forcing
    a decode as if what he got was cp1252, without showing he asked the
    clipboard to supply cp1252 data.

  • From alexandru.dadalau@alexandru.dadalau@meshparts.de (alexandru) to comp.lang.tcl on Fri Mar 21 20:15:46 2025

    I wasn't able to respond until now.
    Currently I'm happy with the workaround using "clipboard get" instead of
    twapi equivalent command.
    Still, one might suspect a bug in twapi. I would expect that twapi can
    best handle the Windows clipboard.

    --
  • From Harald Oehlmann@wortkarg3@yahoo.com to comp.lang.tcl on Sun Mar 23 13:52:33 2025

    On 21.03.2025 at 21:15, alexandru wrote:
    > I wasn't able to respond until now.
    > Currently I'm happy with the workaround using "clipboard get" instead of
    > twapi equivalent command.
    > Still, one might suspect a bug in twapi. I would expect that twapi can
    > best handle the Windows clipboard.

    TWAPI gives you more control.
    But you should know what you do.

    In your use case (text transmission), the native clipboard code is IMHO
    best suited.

    If you want to do enhanced things (transmit images, whatever), use TWAPI.

    Take care,
    Harald
  • From Harald Oehlmann@wortkarg3@yahoo.com to comp.lang.tcl on Sun Mar 23 13:54:56 2025

    On 21.03.2025 at 13:59, Rich wrote:
    > Harald Oehlmann <wortkarg3@yahoo.com> wrote:
    > > On 20.03.2025 at 19:03, Rich wrote:
    > > > alexandru <alexandru.dadalau@meshparts.de> wrote:
    > > > > I tried
    > > > >
    > > > > set text [encoding convertfrom cp1252 $text]
    > > > >
    > > > > but no success.
    > > >
    > > > Are you sure the clipboard is cp1252?
    > > >
    > > > If it is utf-8 then converting as if it is cp1252 will create strange
    > > > results.
    > >
    > > Dear Rich,
    > > the Windows clipboard is more complicated/sophisticated.
    >
    > Yes, that I am well aware.
    >
    > > Each entry has a type. This is nowadays mostly CF_UNICODETEXT, which is
    > > 16 bit unicode.
    > > A sending application can put an entry into the clipboard in multiple
    > > formats and the receiving application can pick one or more.
    >
    > Yes, but alexandru's example he posted, he's not picking a format from
    > the clipboard, he's assuming it was cp1252 and forcefully decoding it
    > as if it was cp1252. If he left out telling us he requested cp1252
    > from the clipboard then that's on him.
    >
    > > We have often "enhanced" this code in TCL to adapt it to modern
    > > usage. There should not be any issues with the encoding, except when
    > > the sending application does anything strange.
    >
    > This is good, because handling the clipboard on windows is an ugly
    > beast. My comment was to what alexandru posted showing simply forcing
    > a decode as if what he got was cp1252, without showing he asked the
    > clipboard to supply cp1252 data.


    Thanks Rich, great.
    You may look to the clipboard code in Tk and its history.
    We often enhanced that.

    I have answered Alexandru when to use what. I think that is good advice.

    With TWAPI, you get raw data and you have to care (e.g. remove trailing
    zero, translate \r\n to \n). Tk native does that for you, but you have
    less control.

    Take care,
    Harald
  • From Ashok@apnmbx-public@yahoo.com to comp.lang.tcl on Mon Mar 24 08:31:48 2025

    To get *text* from the clipboard using twapi, use
    twapi::read_clipboard_text, not twapi::read_clipboard.

    As has been pointed out by others, Alexandru's issue is he is using
    [read_clipboard 1] which returns raw binary data in a specific encoding.
    The encoding itself is stored as CF_LOCALE clipboard format in the
    clipboard. To work correctly, the code would need to

    - retrieve the encoded text using [read_clipboard 1]
    - retrieve CF_LOCALE
    - Map CF_LOCALE to the appropriate Tcl encoding
    - Use [encoding] to decode the data into text

    And even that is not guaranteed to work because the default CF_LOCALE
    encoding setting may not support all the Unicode characters pasted into
    the clipboard if the paster had used the CF_UNICODETEXT (13) format.

    So either use Tcl's "clipboard get" or twapi's "read_clipboard_text". If
    insisting on using "read_clipboard" for whatever reason, use
    "read_clipboard 13" to read in UTF-16 encoding and then [encoding
    convertfrom unicode $clipdata].

    Hope that clarifies

    /Ashok
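
    The last suggestion could look roughly like this. It is a sketch under
    assumptions, not tested against a real clipboard: readClipboardUtf16 and
    decodeClipboardUtf16 are made-up names, the raw data is assumed to be
    little-endian UTF-16 ending in a NUL, and the encoding name is picked at
    run time because "utf-16le" may not be available in every Tcl version.
    Only twapi commands already mentioned in this thread are used.

```tcl
# Decode raw CF_UNICODETEXT bytes into a Tcl string (pure Tcl, no twapi).
proc decodeClipboardUtf16 {raw} {
    # Prefer the explicit name where it exists, else fall back to "unicode"
    set enc [expr {"utf-16le" in [encoding names] ? "utf-16le" : "unicode"}]
    set text [encoding convertfrom $enc $raw]
    # Drop the NUL terminator, then translate CRLF to LF
    set text [string trimright $text "\x00"]
    return [string map [list "\r\n" "\n"] $text]
}

# Windows only: read clipboard format 13 (CF_UNICODETEXT) through twapi.
proc readClipboardUtf16 {} {
    package require twapi
    twapi::open_clipboard
    try {
        set raw [twapi::read_clipboard 13]
    } finally {
        # Always release the clipboard, even if reading failed
        twapi::close_clipboard
    }
    return [decodeClipboardUtf16 $raw]
}

# Stand-in for clipboard data: "A", A-umlaut, CR, LF, NUL as 16-bit
# little-endian values (the same cell content as in the wish session above)
set fake [binary format s* {65 196 13 10 0}]
set decoded [decodeClipboardUtf16 $fake]
puts [string length $decoded]
```

    For plain text, "clipboard get" or twapi::read_clipboard_text avoids all
    of this manual decoding.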

    On 3/23/2025 6:22 PM, Harald Oehlmann wrote:
    > On 21.03.2025 at 21:15, alexandru wrote:
    > > I wasn't able to respond until now.
    > > Currently I'm happy with the workaround using "clipboard get" instead of
    > > twapi equivalent command.
    > > Still, one might suspect a bug in twapi. I would expect that twapi can
    > > best handle the Windows clipboard.
    >
    > TWAPI gives you more control.
    > But you should know what you do.
    >
    > In your use case (text transmission), the native clipboard code is IMHO
    > best suited.
    >
    > If you want to do enhanced things (transmit images, whatever), use TWAPI.
    >
    > Take care,
    > Harald
