• Re: awk not outputting in scientific notation despite %e

    From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.awk on Sun May 7 18:29:03 2023
    From Newsgroup: comp.lang.awk

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I've submitted a bug report to the bug-gawk mailing list. I'll post the
    URL when the archive updates.

    Here's the bug report:

    https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00010.html
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.awk on Sun May 7 19:59:20 2023
    From Newsgroup: comp.lang.awk

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I've submitted a bug report to the bug-gawk mailing list. I'll post the
    URL when the archive updates.

    Here's the bug report:

    https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00010.html

    And thanks to a reply on the mailing list from Andrew J. Schorr: https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00011.html
    I no longer believe this is a bug.

    What I should have noticed is that the values for which setting
    "OFMT=%.15e" *doesn't* produce output in scientific notation are
    precisely the values that are mathematically integers. Sufficiently
    large floating-point values are always equal to integers.

    Here's what POSIX says: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html

    A numeric value that is exactly equal to the value of an integer
    (see Concepts Derived from the ISO C Standard) shall be converted to
    a string by the equivalent of a call to the sprintf function (see
    String Functions) with the string "%d" as the fmt argument and the
    numeric value being converted as the first and only expr argument.

    And here's a demonstration showing what's happening:
    ```
    $ cat foo.awk
    #!/usr/bin/awk -f

    BEGIN {
        OFMT = "%.16e"
        for (i = 50; i <= 55; i++) {
            x = 2 ** i - 0.5
            printf("2**%d - 0.5 = %.3f = ", i, x)
            print(x)
        }
    }
    $ ./foo.awk
    2**50 - 0.5 = 1125899906842623.500 = 1.1258999068426235e+15
    2**51 - 0.5 = 2251799813685247.500 = 2.2517998136852475e+15
    2**52 - 0.5 = 4503599627370495.500 = 4.5035996273704955e+15
    2**53 - 0.5 = 9007199254740992.000 = 9007199254740992
    2**54 - 0.5 = 18014398509481984.000 = 18014398509481984
    2**55 - 0.5 = 36028797018963968.000 = 36028797018963968
    $
    ```

    The value in the parent article was 4.483923595133619e+29 / 1000, which
    is an exact integer value (whether evaluated mathematically or in double precision floating point).

    Using printf rather than setting OFMT is the *solution* to the original problem, not a workaround for a bug.
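
    A minimal sketch of that difference, assuming a POSIX-conforming awk on the PATH (the values 1000 and 1234.5 and the shell variable "out" are illustrative, not taken from the original report). print bypasses OFMT for an integer-valued number, while printf applies whatever format it is given:

    ```sh
    # Capture the awk output so it can be inspected or compared afterwards.
    out=$(awk 'BEGIN {
        OFMT = "%.2e"
        print 1000              # integer-valued: converted as if by "%d", OFMT is bypassed
        print 1234.5            # non-integer: OFMT applies
        printf "%.2e\n", 1000   # printf always uses its own format string
    }')
    printf '%s\n' "$out"
    ```

    This should print 1000, then 1.23e+03, then 1.00e+03, so only the explicit printf gives scientific notation for the integer value.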
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Ross@chaudhry.ross@gmail.com to comp.lang.awk on Sun May 7 20:02:25 2023
    From Newsgroup: comp.lang.awk

    On Sunday, 7 May 2023 at 19:30:47 UTC-6, Keith Thompson wrote:
    Keith Thompson <Keith.S.T...@gmail.com> writes:
    [...]
    I've submitted a bug report to the bug-gawk mailing list. I'll post the URL when the archive updates.
    Here's the bug report:

    https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00010.html
    --
    Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */

    Ok, thanks for the quick and comprehensive response, everybody. I'll use printf as a workaround and keep an eye on the bug report.
  • From Ross@chaudhry.ross@gmail.com to comp.lang.awk on Sun May 7 20:06:27 2023
    From Newsgroup: comp.lang.awk

    On Sunday, 7 May 2023 at 21:02:27 UTC-6, Ross wrote:
    On Sunday, 7 May 2023 at 19:30:47 UTC-6, Keith Thompson wrote:
    Keith Thompson <Keith.S.T...@gmail.com> writes:
    [...]
    I've submitted a bug report to the bug-gawk mailing list. I'll post the URL when the archive updates.
    Here's the bug report:

    https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00010.html
    --
    Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */
    Ok, thanks for the quick and comprehensive response, everybody. I'll use printf as a workaround and keep an eye on the bug report.

    Ah, that response from Andrew makes sense and I appreciate the clarification.
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.awk on Sun May 7 20:10:35 2023
    From Newsgroup: comp.lang.awk

    Ross <chaudhry.ross@gmail.com> writes:
    On Sunday, 7 May 2023 at 19:30:47 UTC-6, Keith Thompson wrote:
    Keith Thompson <Keith.S.T...@gmail.com> writes:
    [...]
    I've submitted a bug report to the bug-gawk mailing list. I'll post the
    URL when the archive updates.
    Here's the bug report:

    https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00010.html

    Ok, thanks for the quick and comprehensive response, everybody. I'll
    use printf as a workaround and keep an eye on the bug report.

    See my latest followup. It's not a bug, and using printf is the correct solution, not just a workaround. (One could argue that it's a
    misfeature in the awk language, but it's hard to avoid in a language
    that distinguishes integers from non-integers by value rather than by
    type.)
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for XCOM Labs
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.lang.awk on Mon May 8 05:14:06 2023
    From Newsgroup: comp.lang.awk

    On 2023-05-08, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    I've submitted a bug report to the bug-gawk mailing list. I'll post the
    URL when the archive updates.

    Here's the bug report:

    https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00010.html

    And thanks to a reply on the mailing list from Andrew J. Schorr: https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00011.html
    I no longer believe this is a bug.

    What I should have noticed is that the values for which setting
    "OFMT=%.15e" *doesn't* produce output in scientific notation are
    precisely the values that are mathematically integers.

    That doesn't mean formatting should break.

    I don't see the requirement in POSIX that OFMT should bypass the format
    for values without a fractional part.

    In TXR Lisp, I made sure that even bignums work with ~e:

    (typeof (expt 2 300))
    bignum
    (fmt "~e" (expt 2 300))
    "2.037e90"

    There are limitations: it works by conversion to floating-point.

    So no, you cannot use it on arbitrarily large integers:

    (fmt "~e" (expt 2 1000))
    "1.072e301"
    (fmt "~e" (expt 2 2000))
    ** out-of-range floating-point result
    ** during evaluation at expr-2:1 of form (fmt "~e" (expt 2 2000))

    Better than nothing.

    Sufficiently large floating-point values are always equal to integers.

    Here's what POSIX says: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html

    A numeric value that is exactly equal to the value of an integer
    (see Concepts Derived from the ISO C Standard) shall be converted to
    a string by the equivalent of a call to the sprintf function (see
    String Functions) with the string "%d" as the fmt argument and the
    numeric value being converted as the first and only expr argument.

    I think that means, shall be converted to a string *when needed to be
    one*. Not that integers should be considered to be strings, so that
    OFMT is then bypassed.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.lang.awk on Mon May 8 07:36:54 2023
    From Newsgroup: comp.lang.awk

    On 2023-05-08, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Ross <chaudhry.ross@gmail.com> writes:
    On Sunday, 7 May 2023 at 19:30:47 UTC-6, Keith Thompson wrote:
    Keith Thompson <Keith.S.T...@gmail.com> writes:
    [...]
    I've submitted a bug report to the bug-gawk mailing list. I'll post the
    URL when the archive updates.
    Here's the bug report:

    https://lists.gnu.org/archive/html/bug-gawk/2023-05/msg00010.html

    Ok, thanks for the quick and comprehensive response, everybody. I'll
    use printf as a workaround and keep an eye on the bug report.

    See my latest followup. It's not a bug, and using printf is the correct solution, not just a workaround. (One could argue that it's a
    misfeature in the awk language, but it's hard to avoid in a language
    that distinguishes integers from non-integers by value rather than by
    type.)

    But printf is in the language, and printf("%e", expr) handles it fine.

    The OFMT feature just has to treat numeric-valued expressions
    using whatever logic that is already making printf("%e", expr) work.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Andrew Schorr@aschorr@telemetry-investments.com to comp.lang.awk on Mon May 8 08:08:20 2023
    From Newsgroup: comp.lang.awk

    Hi,
    On Monday, May 8, 2023 at 3:36:57 AM UTC-4, Kaz Kylheku wrote:
    But printf is in the language, and printf("%e", expr) handles it fine.

    The OFMT feature just has to treat numeric-valued expressions
    using whatever logic that is already making printf("%e", expr) work.
    I thought Keith clarified this issue above. The POSIX spec says that integer values
    should be converted with an implicit "%d" instead of using CONVFMT.
    It is also discussed here in the gawk manual: https://www.gnu.org/software/gawk/manual/html_node/Strings-And-Numbers.html

    "As a special case, if a number is an integer, then the result of converting it to a string is always an integer, no matter what the value of CONVFMT may be."

    You may ask why OFMT and CONVFMT are handled the same way. I can think of 3 reasons:

    1. It's the same code path inside gawk.
    2. Historically, there was no distinction between OFMT and CONVFMT; the distinction was added to address
       some corner cases. You can find a discussion of this in the POSIX awk spec.
    3. The POSIX awk spec simply codifies how Unix awk has always worked, and this is how it works.

    If you don't like it, please use printf to request explicitly what you want. The OFMT and CONVFMT logic
    is intended to do the right thing, and it does so in the vast majority of use cases.

    Regards,
    Andy
  • From Kaz Kylheku@864-117-4973@kylheku.com to comp.lang.awk on Mon May 8 23:31:00 2023
    From Newsgroup: comp.lang.awk

    On 2023-05-08, Andrew Schorr <aschorr@telemetry-investments.com> wrote:
    Hi,

    On Monday, May 8, 2023 at 3:36:57 AM UTC-4, Kaz Kylheku wrote:
    But printf is in the language, and printf("%e", expr) handles it fine.

    The OFMT feature just has to treat numeric-valued expressions
    using whatever logic that is already making printf("%e", expr) work.

    I thought Keith clarified this issue above. The POSIX spec says that integer values
    should be converted with an implicit "%d" instead of using CONVFMT.

    The requirement applies to CONVFMT; it isn't written that it applies to
    OFMT.

    It is written that CONVFMT is newer: OFMT came first and then CONVFMT
    was derived from it as a kind of fork to separate the messy semantics
    of field conversion from printing arbitrary arguments (or something
    like that).

    It is not clear that requirements applying to the derivative CONVFMT
    should flow backwards into OFMT. Generally speaking, if a document
    introduces some Y that is a new entity, similar to and inheriting
    requirements from an existing X, and then also gives requirements only
    about Y, those Y requirements do not propagate back to X; they
    are one of the attributes of Y that distinguish it from X.

    Under CONVFMT, the results of conversion are not going to the display;
    they loop back into the program. CONVFMT controls how numbers convert to strings. If the floating-point format applies to integer values, it can
    mess up associative array keys involving integers, which is quite
    common.

    Therefore, it's easy to see why a hack like that would be in CONVFMT,
    but not required in OFMT.
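
    A small sketch of the array-key concern, assuming a POSIX-conforming awk (the names "seen", "k" and "out" are made up for illustration). Because integer values are always converted as if by "%d", an integer subscript stays stable even when CONVFMT is set to a scientific format:

    ```sh
    # Capture the awk output so it can be inspected or compared afterwards.
    out=$(awk 'BEGIN {
        CONVFMT = "%.2e"
        seen[10] = "ten"       # the subscript 10 is stored as the string "10"
        k = 5 + 5              # a computed integer value
        print ((k in seen) ? "found" : "missing")
        print (10 "")          # integer to string: always as if by "%d"
        print (1234.5 "")      # non-integer to string: uses CONVFMT
    }')
    printf '%s\n' "$out"
    ```

    This should print found, then 10, then 1.23e+03. Without the integer special case, the subscript would become "1.00e+01" under this CONVFMT.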

    Upon reading the rationale, one might have the impression that
    this is one of those corner cases that originally existed in OFMT
    that were separated out into CONVFMT.

    Therefore, historic knowledge that OFMT implementations before CONVFMT
    existed had this hack behavior doesn't amount to anything. POSIX
    changed the behavior by introducing an entirely new variable,
    such that OFMT no longer controlled conversions, breaking programs
    depending on that. At that time, OFMT should have been considered
    to not have the %d requirement any more, in the same stroke.
  • From Ben Bacarisse@ben.usenet@bsb.me.uk to comp.lang.awk on Tue May 9 01:59:42 2023
    From Newsgroup: comp.lang.awk

    Kaz Kylheku <864-117-4973@kylheku.com> writes:

    On 2023-05-08, Andrew Schorr <aschorr@telemetry-investments.com> wrote:
    Hi,

    On Monday, May 8, 2023 at 3:36:57 AM UTC-4, Kaz Kylheku wrote:
    But printf is in the language, and printf("%e", expr) handles it fine.

    The OFMT feature just has to treat numeric-valued expressions
    using whatever logic that is already making printf("%e", expr) work.

    I thought Keith clarified this issue above. The POSIX spec says that
    integer values should be converted with an implicit "%d" instead of
    using CONVFMT.

    The requirement applies to CONVFMT; it isn't written that it applies to
    OFMT.

    I think it is. Where OFMT applies (in print) the text refers to the
    conversion that otherwise uses CONVFMT:

    "All expression arguments shall be taken as strings, being converted if
    necessary; this conversion shall be as described in Expressions in
    awk, with the exception that the printf format in OFMT shall be used
    instead of the value in CONVFMT."

    and it's the referenced text that has the integer exception with CONVFMT
    used for other values. Using OFMT in place of CONVFMT in that text does
    not remove the exception.

    Of course, if the integer exception was added to the "Expressions in awk" section at some later stage, the effect on the use of OFMT in print
    statements might have been unintentional.
    --
    Ben.
  • From Andrew Schorr@aschorr@telemetry-investments.com to comp.lang.awk on Tue May 9 06:03:49 2023
    From Newsgroup: comp.lang.awk

    On Monday, May 8, 2023 at 8:59:47 PM UTC-4, Ben Bacarisse wrote:
    I think it is. Where OFMT applies (in print) the text refers to the conversion that otherwise uses CONVFMT:

    "All expression arguments shall be taken as strings, being converted if necessary; this conversion shall be as described in Expressions in
    awk, with the exception that the printf format in OFMT shall be used
    instead of the value in CONVFMT."

    and it's the referenced text that has the integer exception with CONVFMT used for other values. Using OFMT in place of CONVFMT in that text does
    not remove the exception.
    I think that's right. It also says this:

    "The intent has been to specify historical practice in almost all cases."

    And I think the history is that all implementations of awk have always converted
    integral values using the equivalent of "%d", regardless of the CONVFMT or OFMT setting.

    Regards,
    Andy
  • From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Fri Oct 27 15:13:58 2023
    From Newsgroup: comp.lang.awk

    On Tuesday, May 9, 2023 at 9:03:51 AM UTC-4, Andrew Schorr wrote:
    On Monday, May 8, 2023 at 8:59:47 PM UTC-4, Ben Bacarisse wrote:
    I think it is. Where OFMT applies (in print) the text refers to the conversion that otherwise uses CONVFMT:

    "All expression arguments shall be taken as strings, being converted if necessary; this conversion shall be as described in Expressions in
    awk, with the exception that the printf format in OFMT shall be used instead of the value in CONVFMT."

    and it's the referenced text that has the integer exception with CONVFMT used for other values. Using OFMT in place of CONVFMT in that text does not remove the exception.
    I think that's right. It also says this:

    "The intent has been to specify historical practice in almost all cases."

    And I think the history is that all implementations of awk have always converted
    integral values using the equivalent of "%d", regardless of the CONVFMT or OFMT setting.

    Regards,
    Andy

    That would *nearly* be true if not for gawk + GMP's lovely behavior when parsing the decimal dot:

    gawk -Mbe 'BEGIN { OFS = RS;
        print x = 10.^10 * 5^12 * 3, --x, "--------------",
              y = 10^10 * 5^12 * 3, --y, "-----------------", 2^63-1 }'
    7324218750000000000
    7324218750000000000
    --------------
    7324218750000000000
    7324218749999999999
    -----------------
    9223372036854775807

    "x" and "y" are mathematically identical, and both are ***supposed*** to be integers,
    but the mere presence of an extra dot (".") after the first "10" ("10." instead of "10") causes gawk + GMP to treat the whole expression as a double-precision floating-point value instead of one of GMP's arbitrary-precision integers.

    This can be confirmed by using the -d- flag to dump variables to /dev/stdout:

    x: 7.32421875e+18
    y: 7324218749999999999

    The same undesirable effect also occurs when the exponent is written as "10^10." instead of "10^10". Interestingly enough, adding

        x = int(x)

    before

        --x

    circumvents this annoyance and returns a properly decremented integer value instead.
    — The 4Chan Teller