Forum: War Ensemble BBS

Display Isograms Using Awk and Inverse ANSI

From porkchop@porkchop@invalid.foo (Mike Sanders) to comp.lang.awk on Wed Oct 11 00:51:07 2023

From Newsgroup: comp.lang.awk

# Michael Sanders 2023
# https://busybox.neocities.org/notes/isogram.txt
#
# awk script that displays isograms using inverse ANSI
# escapes (meaning fore & background colors are swapped)
# requires an ANSI capable terminal, rename this file
# and invoke script as:
#
# awk -f isogram.awk file
#
# isogram test block...
#
# aberration lucrative concurrent espouse obfuscate
# garrulous promenade epiphany requiem juxtapose
# languid ephemeral abscond extricate circumvent
# obstinate vivacious corroborate attenuate paragon
# penchant serendipity superfluous immutable mitigate
# aplomb concatenate ethereal diaphanous demagogue
# cogitate pervasive anathema juxtaposition memento
# disparate oscillate ennui perfunctory parabola
# mellifluous recumbent ephemeral sycophant timorous
# voracious quixotic serenade conundrum vicarious
# insipid ornate camaraderie cogent introspection
# sanguine deleterious impeccable extraneous loquacious

BEGIN { print "\nisograms...\n" }

function hilite(str) { return "\033[7m" str "\033[0m" }

function isogram(str, c, x, y) {
    y = length(str)
    for (x = 1; x <= y; x++) {
        c = substr(str, x, 1)
        if (index(substr(str, x + 1), c) > 0) return 0 # !isogram
    }
    return 1 # isogram
}

{
    word = ""
    line = ""
    for (x = 1; x <= length($0); x++) {
        c = substr($0, x, 1)
        if (c ~ /[[:space:]]/ || x == length($0)) {
            if (x == length($0) && c !~ /[[:space:]]/) word = word c
            line = (isogram(word) ? line hilite(word) : line word)
            if (c ~ /[[:space:]]/) line = line c
            word = ""
        } else {
            word = word c
        }
    }
    print line
}

# eof
--
:wq
Mike Sanders

--- Synchronet 3.20a-Linux NewsLink 1.114

From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.awk on Wed Oct 11 10:15:37 2023

From Newsgroup: comp.lang.awk

A quick glimpse at your code gives the impression that you
are parsing the line character-wise to identify "words".
In Awk it is usually better to use the inherent splitting
procedure and operate on $1, $2, etc. Even for cases where
punctuation and other characters may get into your way you
can just define the FS regular expression so that it fits
your needs. That should make your program much simpler and
also easier to understand and maintain.

Janis
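
A minimal sketch of the field-based approach suggested above, wrapped in a shell function so it can be pasted into a terminal. The name hilite_fields and its optional ON/OFF marker arguments are assumptions made for illustration; isogram() is the predicate from the original post. Be aware that assigning to a field makes awk rebuild $0 with OFS, so the original spacing is not kept.

```sh
# hilite_fields [ON OFF]: hypothetical helper, not part of the thread.
# ON and OFF default to the inverse-video and reset sequences.
hilite_fields() {
    awk -v on="${1-}" -v off="${2-}" '
    BEGIN { if (on == "") { on = "\033[7m"; off = "\033[0m" } }

    # predicate from the original post: 1 when no character repeats
    function isogram(str,    c, x, y) {
        y = length(str)
        for (x = 1; x <= y; x++) {
            c = substr(str, x, 1)
            if (index(substr(str, x + 1), c) > 0) return 0
        }
        return 1
    }

    {
        # awk has already split the record into $1..$NF
        for (i = 1; i <= NF; i++)
            if (isogram($i)) $i = on $i off
        print   # assigning to a field rebuilt $0 using OFS
    }'
}

printf '%s\n' '  banana   word' | hilite_fields '[' ']'
```

With visible markers instead of escape codes, the call above prints "banana [word]": the word is marked, but the leading blanks and the run of three spaces have collapsed.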


From porkchop@porkchop@invalid.foo (Mike Sanders) to comp.lang.awk on Wed Oct 11 10:12:13 2023

From Newsgroup: comp.lang.awk

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> A quick glimpse at your code gives the impression that you
> are parsing the line character-wise to identify "words".
> In Awk it is usually better to use the inherent splitting
> procedure and operate on $1, $2, etc. Even for cases where
> punctuation and other characters may get into your way you
> can just define the FS regular expression so that it fits
> your needs. That should make your program much simpler and
> also easier to understand and maintain.

Hi Janis.

Sure enough, you're 100% correct on this, to my thinking.
In fact, I'm working now on a variant that does use $1, $2,
etc... One issue I'm groping to understand is how *not* to
destroy the layout of a given file upon output. In other
words, I want the output equal to the input, with the only
difference being that isograms are in inverse color. The only
way I've worked through, so far at least, is to not assume
any file structure other than words...

It's an interesting problem to think about =)
--
:wq
Mike Sanders


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.awk on Wed Oct 11 18:41:51 2023

From Newsgroup: comp.lang.awk

On 11.10.2023 12:12, Mike Sanders wrote:

> Hi Janis.
>
> Sure enough, you're 100% correct on this, to my thinking.
> In fact, I'm working now on a variant that does use $1, $2,
> etc... One issue I'm groping to understand is how *not* to
> destroy the layout of a given file upon output. In other
> words, I want the output equal to the input, with the only
> difference being that isograms are in inverse color. The only
> way I've worked through, so far at least, is to not assume
> any file structure other than words...
>
> It's an interesting problem to think about =)

Yes. A solution may also depend on the Awk version you are
allowed to use. With GNU Awk you can preserve the formatting
by using its newer features (array of separators!).
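
A sketch of what the "array of separators" could look like, assuming GNU Awk 4.0 or later is installed as gawk: its split() accepts an optional fourth argument that receives the separator strings. The wrapper name hilite_gawk and the ON/OFF arguments are illustrative assumptions, and isogram() is the predicate from the original post.

```sh
# hilite_gawk [ON OFF]: hypothetical helper; needs GNU awk (gawk).
hilite_gawk() {
    gawk -v on="${1-}" -v off="${2-}" '
    BEGIN { if (on == "") { on = "\033[7m"; off = "\033[0m" } }

    function isogram(str,    c, x, y) {
        y = length(str)
        for (x = 1; x <= y; x++) {
            c = substr(str, x, 1)
            if (index(substr(str, x + 1), c) > 0) return 0
        }
        return 1
    }

    {
        # gawk extension: sep[i] receives the text between w[i] and
        # w[i+1]; with " " as the separator, sep[0] and sep[n] receive
        # any leading and trailing whitespace
        n = split($0, w, " ", sep)
        out = sep[0]
        for (i = 1; i <= n; i++)
            out = out (isogram(w[i]) ? on w[i] off : w[i]) sep[i]
        print out
    }'
}
```

It can be tried with, for example, printf '%s\n' '  banana   word' | hilite_gawk '[' ']' and the spacing of the input should come back unchanged around the marked word.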

And with standard Awk you can work on patterns and preserve
formatting e.g. with a frame like

function predicate (s) { ...here's your isogram function... }

function escape (s) { return predicate(s) ? "E" s "E" : s }
# replace the two "E" by your ANSI escape code strings

{
    out = ""
    for (line=$0; match(line, /[[:alpha:]]+/); line=substr(line,RSTART+RLENGTH)) {
        out = out substr(line,1,RSTART-1) escape(substr(line,RSTART,RLENGTH))
    }
    out = out line
    print out
}

This code specifies the 'alpha' words as entities to consider;
change as desired. (I saw that your code also highlights a '#'
for example; not sure this is intended, though.)

Janis
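
For reference, a complete runnable sketch of the match()-based frame with the isogram function from the original post filled in as the predicate. The wrapper name hilite_words and the ON/OFF arguments are assumptions added for illustration; everything outside the matched [[:alpha:]]+ runs is copied to the output untouched, which is what keeps the layout intact.

```sh
# hilite_words [ON OFF]: hypothetical helper, not part of the thread.
hilite_words() {
    awk -v on="${1-}" -v off="${2-}" '
    BEGIN { if (on == "") { on = "\033[7m"; off = "\033[0m" } }

    # stands in for predicate() in the frame
    function isogram(str,    c, x, y) {
        y = length(str)
        for (x = 1; x <= y; x++) {
            c = substr(str, x, 1)
            if (index(substr(str, x + 1), c) > 0) return 0
        }
        return 1
    }

    function escape(s) { return isogram(s) ? on s off : s }

    {
        out = ""
        # each pass consumes the text before the next alpha word plus
        # the word itself; RSTART and RLENGTH are set by match()
        for (line = $0; match(line, /[[:alpha:]]+/); line = substr(line, RSTART + RLENGTH))
            out = out substr(line, 1, RSTART - 1) escape(substr(line, RSTART, RLENGTH))
        print out line   # append whatever follows the last word
    }'
}

printf '%s\n' '  the quick  apple, #' | hilite_words '[' ']'
```

With the visible markers, the call above prints "  [the] [quick]  apple, #": spacing and punctuation survive, and the '#' is never tested because it is not an alpha word.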


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.awk on Wed Oct 11 18:47:17 2023

From Newsgroup: comp.lang.awk

On 11.10.2023 18:41, Janis Papanagnou wrote:

[...]

> {
> out = ""
> for (line=$0; match(line, /[[:alpha:]]+/); line=substr(line,RSTART+RLENGTH)) {
> out = out substr(line,1,RSTART-1)
> escape(substr(line,RSTART,RLENGTH))
> }
> out = out line
> print out
> }

[...]

(Sorry, my newsreader split two of the long lines.)
In the code above, the fragments that start at column 1
belong on the preceding line, so that each of these
statements is on one line:

for (line=$0; match(...); line=substr(...)) {

out = out substr(...) escape(substr(...))

Janis


From porkchop@porkchop@invalid.foo (Mike Sanders) to comp.lang.awk on Thu Oct 12 18:54:30 2023

From Newsgroup: comp.lang.awk

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> Yes. A solution may also depend on the Awk version you are
> allowed to use. With GNU Awk you can preserve the formatting
> by using its newer features (array of separators!).

Very much so for me; I can't always install the things I'd like to,
but it's okay, I'll work it out =)

> function predicate (s) { ...here's your isogram function... }

Excellent name for a function.

> function escape (s) { return predicate(s) ? "E" s "E" : s }
> # replace the two "E" by your ANSI escape code strings
>
> {
>     out = ""
>     for (line=$0; match(line, /[[:alpha:]]+/); line=substr(line,RSTART+RLENGTH)) {
>         out = out substr(line,1,RSTART-1) escape(substr(line,RSTART,RLENGTH))
>     }
>     out = out line
>     print out
> }
>
> This code specifies the 'alpha' words as entities to consider;
> change as desired. (I saw that your code also highlights a '#'
> for example; not sure this is intended, though.)

A single char... isogram or not? Probably not really, and then
there's the case of 'mixed' strings as your snippet deals with,
'abc-321'.

But back to the single character issue I'm going with:

function isogram(str, c, x, y) {
    y = length(str)
    if (y < 2) return 0 # !isogram <-- added this

    for (x = 1; x <= y; x++) {
        c = substr(str, x, 1)
        if (index(substr(str, x + 1), c) > 0) return 0 # !isogram
    }
    return 1 # isogram
}

Thanks for your input Janis.
--
:wq
Mike Sanders
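
A small harness, sketched under the assumption that one word per line is fed to the revised predicate, makes the effect of the added length check visible. The wrapper name isogram_report is hypothetical; the isogram() body is the version from the post above.

```sh
# isogram_report WORD...: hypothetical helper that labels each argument.
isogram_report() {
    printf '%s\n' "$@" | awk '
    function isogram(str,    c, x, y) {
        y = length(str)
        if (y < 2) return 0          # single characters are rejected
        for (x = 1; x <= y; x++) {
            c = substr(str, x, 1)
            if (index(substr(str, x + 1), c) > 0) return 0
        }
        return 1
    }

    { print $0 ": " (isogram($0) ? "isogram" : "not an isogram") }'
}

isogram_report a abc banana abc-321
```

The call reports "a" and "banana" as not being isograms, and "abc" and "abc-321" as isograms. The mixed string passes because the predicate only looks for repeated characters; restricting the test to letters is the job of the pattern that selects the words.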


From porkchop@porkchop@invalid.foo (Mike Sanders) to comp.lang.awk on Thu Oct 12 22:33:25 2023

From Newsgroup: comp.lang.awk

Mike Sanders <porkchop@invalid.foo> wrote:

> # requires an ANSI capable terminal...

for your notes:

tags: ANSI, escapes, colors, code

invert fore/background color: "\033[7m" str "\033[0m"

clear screen: "\033[H\033[2J"

hide cursor: "\033[?25l"

show cursor: "\033[?25h"

output to row & column: "\033[<ROW>;<COLUMN>H"

set titlebar (for terminals that support it): "\033]0;Your Title Here\007"

reset colors: "\033[0m"

foreground colors...

"\033[30mBlack"
"\033[31mRed"
"\033[32mGreen"
"\033[33mYellow"
"\033[34mBlue"
"\033[35mMagenta"
"\033[36mCyan"
"\033[37mWhite"

background colors...

"\033[40mBlack"
"\033[41mRed"
"\033[42mGreen"
"\033[43mYellow"
"\033[44mBlue"
"\033[45mMagenta"
"\033[46mCyan"
"\033[47mWhite"
--
:wq
Mike Sanders
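
As a rough illustration of how the sequences in these notes might be used from awk, the sketch below wraps a string in an attribute code and resets afterwards. The names sgr_demo and sgr() are assumptions for this example only; the codes 30-37, 40-47, 7 and 0 are the ones listed in the notes.

```sh
# sgr_demo: hypothetical helper printing each colour name in its own
# foreground and background colour, then one inverse-video sample.
sgr_demo() {
    awk '
    # sgr(code, str): wrap str in ESC [ code m and reset afterwards
    function sgr(code, str) { return "\033[" code "m" str "\033[0m" }

    BEGIN {
        n = split("Black Red Green Yellow Blue Magenta Cyan White", name, " ")
        for (i = 1; i <= n; i++)
            # foreground codes run 30..37, background codes 40..47
            printf "%s %s\n", sgr(29 + i, name[i]), sgr(39 + i, name[i])
        print sgr(7, "inverse")
    }'
}

sgr_demo
```

Resetting after every string keeps an attribute from leaking into the text that follows.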


From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.awk on Fri Oct 13 09:22:53 2023

From Newsgroup: comp.lang.awk

On 12.10.2023 20:54, Mike Sanders wrote:

> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>
>> function predicate (s) { ...here's your isogram function... }
>
> Excellent name for a function.

It's meant as a generic name for the code pattern I wanted to show.

[...]

>> This code specifies the 'alpha' words as entities to consider;
>> change as desired. (I saw that your code also highlights a '#'
>> for example; not sure this is intended, though.)
>
> A single char... isogram or not? Probably not really, and then
> there's the case of 'mixed' strings as your snippet deals with,
> 'abc-321'.

Oh, my question was more whether a non-alpha character should be
considered a possible isogram.

> But back to the single character issue I'm going with:
>
> function isogram(str, c, x, y) {
>     y = length(str)
>     if (y < 2) return 0 # !isogram <-- added this
> [...]

That's why I also think a pattern based approach has advantages.

Janis


From porkchop@porkchop@invalid.foo (Mike Sanders) to comp.lang.awk on Sat Oct 14 00:34:34 2023

From Newsgroup: comp.lang.awk

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> That's why I also think a pattern based approach has advantages.

Yes, anything the human mind can conceive is valid.
--
:wq
Mike Sanders


From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Fri Oct 27 16:14:03 2023

From Newsgroup: comp.lang.awk

if you want an ultra quick ANSI color chart :
jot - 16 231 | mawk ' BEGIN { print (ORS = _)
                              _ *= __ = RS RS
} $NF = sprintf("\33[38;5;%dm%3d%.*s%.*s",
                $_, $_, NR % 6^2 == _, RS,
                        NR % 6^3 == _, __)'
the result is a VERY wide table spanning 6 rows, but that's the only way I could get the colors to properly line up with each other without complicated math.
— The 4Chan Teller
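
For readers who find the one-liner hard to follow, here is a plainer sketch that should produce a similar chart without jot. It assumes a terminal that understands the 256-colour sequence ESC[38;5;Nm; the function name color_chart is made up for this example, and it adds a reset at the end of each row, which the one-liner does not do.

```sh
# color_chart: hypothetical helper printing colour numbers 16..231,
# each in its own colour, 36 per row.
color_chart() {
    awk 'BEGIN {
        for (c = 16; c <= 231; c++) {
            # select foreground colour c, then print its number
            printf "\033[38;5;%dm%3d", c, c
            # 216 colours at 36 per row gives 6 rows
            if ((c - 15) % 36 == 0) printf "\033[0m\n"
        }
    }'
}

color_chart
```

Each row is 108 columns wide, so the chart still needs a wide terminal, as noted above.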