run as...
awk -f uniqueChars.awk
output...
Input string: Mary had a little lamb who's fleece was white as snow...
Unique chars: Mary hdlitembwo'sfcn.
script...
BEGIN {
    a = "Mary had a little lamb who's fleece was white as snow..."
    b = uniqueChars(a)
    print "Input string: " a
    print "Unique chars: " b
}
function uniqueChars(str, x, y, c, tmp, uniqueStr) {
    y = length(str)
    uniqueStr = ""
    delete tmp # clear array for each new string
    while(++x <= y) {
        c = substr(str, x, 1)
        if (!(c in tmp)) {
            uniqueStr = uniqueStr c
            tmp[c]
        }
    }
    return uniqueStr
}

You don't need to do that `delete` - just having "tmp" listed in the
args list will re-init it every time the function is called. Removing
that statement will also make your script portable to awks that don't
support `delete array` (but most, possibly all, modern awks do support
that even though it's technically still undefined behavior).
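
If you want to convince yourself of that, here's a minimal sketch (the `unique_chars` shell wrapper and the two test strings are just made up for illustration): it's your function with only the `delete` removed, called twice in the same awk run. If "tmp" weren't re-initialized per call, the second result would be missing the characters already seen in the first.

```shell
# Hypothetical wrapper: prints the unique characters of each argument,
# one result per line, all within a single awk invocation.
unique_chars() {
    awk '
    function uniqueChars(str, x, y, c, tmp, uniqueStr) {
        y = length(str)
        uniqueStr = ""
        # No delete here: tmp is a local array, so it starts empty on every call.
        while(++x <= y) {
            c = substr(str, x, 1)
            if (!(c in tmp)) {
                uniqueStr = uniqueStr c
                tmp[c]
            }
        }
        return uniqueStr
    }
    BEGIN {
        # A BEGIN-only program reads no input, so ARGV is free to carry strings.
        for ( i=1; i<ARGC; i++ ) {
            print uniqueChars(ARGV[i])
        }
    }
    ' "$@"
}

unique_chars "aab" "abc"
```

That prints `ab` and then `abc`, so nothing from the first call leaked into the second.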

    while(++x <= y) {

Using a `while` instead of a `for` loop for that makes your code a bit
less clear, a bit more fragile (what if `x` gets set above?), and a bit
harder to maintain (what if in future you need to increment x by 2
every iteration?). It's not worth saving the few characters over the
traditional `for ( x=1; x<=y; x++ )`.

        c = substr(str, x, 1)
        if (!(c in tmp)) {

Idiomatically that'd be implemented as

        if ( !tmp[c]++ ) {

and then you'd remove the `tmp[c]` below, but the array in that case is
almost always named `seen[]` rather than `tmp[]`.
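
Putting the suggestions so far together (no `delete`, a `for` loop, and the `!seen[c]++` test), the function could look something like this. It's only a sketch: the `unique_chars` shell wrapper and passing the string via `ARGV[1]` are my own scaffolding so it can be run as-is, not part of your script.

```shell
# Hypothetical wrapper: prints the unique characters of its first argument
# in order of first appearance.
unique_chars() {
    awk '
    function uniqueChars(str, x, y, c, seen, uniqueStr) {
        y = length(str)
        for ( x=1; x<=y; x++ ) {
            c = substr(str, x, 1)
            # seen[c]++ is 0 (false) only the first time c turns up.
            if ( !seen[c]++ ) {
                uniqueStr = uniqueStr c
            }
        }
        return uniqueStr
    }
    BEGIN { print uniqueChars(ARGV[1]) }
    ' "$1"
}

unique_chars "Mary had a little lamb who's fleece was white as snow..."
```

For your sample string that gives the same `Mary hdlitembwo'sfcn.` as your original script.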

            uniqueStr = uniqueStr c
            tmp[c]
        }
    }
    return uniqueStr
}

Alternatively, if the order of the characters returned doesn't matter,
you could do:

function uniqueChars(str, x, y, c, tmp, uniqueStr) {
    y = length(str)
    uniqueStr = ""
    for ( x=1; x<=y; x++ ) {
        tmp[substr(str,x,1)]
    }
    for ( c in tmp ) {
        uniqueStr = uniqueStr c
    }
    return uniqueStr
}

I don't expect that to be any faster or anything, it's just different,
but if you have GNU awk then it can be tweaked to:

function uniqueChars(str, x, y, c, tmp, uniqueStr) {
    y = length(str)
    uniqueStr = ""
    for ( x=1; x<=y; x++ ) {
        tmp[substr(str,x,1)]
    }
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for ( c in tmp ) {
        uniqueStr = uniqueStr c
    }
    return uniqueStr
}

and then it'll return the unique characters sorted in alphabetic order
which may be useful.
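
One thing to watch with the GNU awk version: `PROCINFO["sorted_in"]` is global, so once the function sets it, every later `for ( c in arr )` loop in the script is traversed in that order too. If that matters to you, you can save and restore the setting inside the function. A rough sketch, assuming gawk is installed (the `unique_chars_sorted` wrapper and the `hadOrder`/`oldOrder` locals are names I've made up for illustration):

```shell
# Hypothetical wrapper: prints the unique characters of its first argument
# sorted by string comparison. Requires GNU awk.
unique_chars_sorted() {
    gawk '
    function uniqueChars(str, x, y, c, tmp, uniqueStr, hadOrder, oldOrder) {
        y = length(str)
        for ( x=1; x<=y; x++ ) {
            tmp[substr(str,x,1)]
        }
        # Remember any traversal order the caller had already chosen.
        hadOrder = ("sorted_in" in PROCINFO)
        if ( hadOrder ) {
            oldOrder = PROCINFO["sorted_in"]
        }
        PROCINFO["sorted_in"] = "@ind_str_asc"
        for ( c in tmp ) {
            uniqueStr = uniqueStr c
        }
        # Put things back the way they were.
        if ( hadOrder ) {
            PROCINFO["sorted_in"] = oldOrder
        }
        else {
            delete PROCINFO["sorted_in"]
        }
        return uniqueStr
    }
    BEGIN { print uniqueChars(ARGV[1]) }
    ' "$1"
}

# Only run the demo when gawk is actually available.
if command -v gawk >/dev/null 2>&1; then
    unique_chars_sorted "banana"
fi
```

With gawk that prints `abn`. Whether the save/restore is worth the extra lines depends on whether anything else in your script loops over arrays with `in`.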