author Denys Vlasenko 2023-12-31 15:49:54 +0100
committer Denys Vlasenko 2023-12-31 15:49:54 +0100
commit 789ccac7d9d1a9e433570ac9628992a01f946643
tree 1208e688dae9191740b57b0d9dbdb36e008b0a6a /testsuite
parent 5e0e54827fb0fa80d2c894eb67e8696921095935
awk: fix handling of empty fields
Patch by M Rubon <rubonmtz@gmail.com>:

Busybox awk handles references to empty (not provided in the input)
fields differently during the first line of input, as compared to
subsequent lines.

    $ (echo a ; echo b) | awk '$2 != 0'    #wrong
    b

No field $2 value is provided in the input. When awk references field
$2 for the "a" line, it is seen to have a different behaviour than
when it is referenced for the "b" line.

Problem in BusyBox v1.36.1 embedded in OpenWrt 23.05.0
Same problem also in 21.02 versions of OpenWrt
Same problem in BusyBox v1.37.0.git

I get the correct expected output from Ubuntu gawk and Debian mawk,
and from my fix.

    will@dev:~$ (echo a ; echo b) | awk '$2 != 0'    #correct
    a
    b

    will@dev:~/busybox$ (echo a ; echo b ) | ./busybox awk '$2 != 0'    #fixed
    a
    b

I built and poked into the source code at editors/awk.c. The function
fsrealloc(int size) is core to allocating, initializing, reallocating,
and reinitializing fields, both real input line fields and imaginary
fields that the script references but do not exist in the input.

When fsrealloc() needs more field space than it has previously
allocated, it initializes those new fields differently than how they
are later reinitialized for the next input line. This works fine for
fields defined in the input, like $1, but does not work the first time
when there is no input for that field (e.g. field $99).

My one-line fix simply makes the initialization and clrvar()
reinitialization use the same value for .type. I am not sure if there
are regression tests to run, but I have not done those.

I'm not sure if I understand why clrvar() is not setting .type to a
default constant value, but in any case I have left that untouched.

function                                             old     new   delta
------------------------------------------------------------------------------
(add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0)                 Total: 0 bytes

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
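The reproduction in the commit message can be wrapped in a small script so that the same check runs against any awk binary. This is only a sketch: the `AWK` variable and the `empty_field_filter` function are hypothetical names introduced here, not part of BusyBox or its testsuite. The expected result (both lines printed) is the one the commit message reports for gawk, mawk, and the fixed BusyBox awk.

```shell
#!/bin/sh
# AWK is a hypothetical override for this sketch, e.g. AWK="./busybox awk".
# It is left unquoted below on purpose so "./busybox awk" splits into two words.
AWK=${AWK:-awk}

# Feed two single-field lines to awk and keep lines where $2 != 0.
# Neither line provides a field $2, so both lines reference an empty field.
empty_field_filter() {
    printf 'a\nb\n' | $AWK '$2 != 0'
}

actual=$(empty_field_filter)
expected=$(printf 'a\nb')

# An awk with the bug described above drops the first line and prints only "b".
if [ "$actual" = "$expected" ]; then
    echo "ok: both lines printed"
else
    echo "FAIL: got '$actual'"
fi
```

Running it as `AWK="./busybox awk" sh script.sh` from a BusyBox build directory would exercise the patched binary instead of the system awk; the script name here is likewise just a placeholder.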
Diffstat (limited to 'testsuite')
-rwxr-xr-x testsuite/awk.tests 7
1 files changed, 7 insertions, 0 deletions
diff --git a/testsuite/awk.tests b/testsuite/awk.tests
index 5a792c2..063084a 100755
--- a/testsuite/awk.tests
+++ b/testsuite/awk.tests
@@ -592,6 +592,13 @@ testing 'awk gensub backslashes \\0' \
\\0|\\0
' '' ''
+# References to empty (not provided in the input) fields in first versus subsequent lines
+testing 'awk references to empty fields' \
+ 'awk '$sq'$2 != 0'$sq \
+ 'a
+b
+' '' 'a\nb\n'
+
# The "b" in "abc" should not match <b* pattern.
# Currently we use REG_STARTEND ("This flag is a BSD extension, not present in POSIX")
# to implement the code to handle this correctly, but if your libc has no REG_STARTEND,