
AWK - Associative Arrays

I have just gone through arrays in AWK. These are not the regular arrays that we see in many programming languages. Here are some points that describe them briefly.
Arrays in awk are different: they are associative. This means that each array is a collection of pairs: an index, and its corresponding array element value:
Element 4     Value 30
Element 2     Value "foo"
Element 1     Value 8
Element 3     Value ""
Any number, or even a string, can be an index. For example, here is an array that translates words from English into French:
Element "dog" Value "chien"
Element "cat" Value "chat"
Element "one" Value "un"
Element 1     Value "un"
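As a rough sketch, the translation table above could be built in a BEGIN rule like this. The array name french is just my own pick for illustration.

```awk
BEGIN {
  # String subscripts map English words to French words.
  french["dog"] = "chien"
  french["cat"] = "chat"
  french["one"] = "un"
  # A numeric subscript can sit in the same array as the string ones.
  french[1] = "un"

  print french["dog"]   # prints: chien
  print french[1]       # prints: un
}
```

Since the program has only a BEGIN rule, it needs no input file and can be run as awk -f followed by whatever file name you save it under.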
The principal way of using an array is to refer to one of its elements. An array reference is an expression that looks like this:
array[index]
For example, foo[4.3] is an expression for the element of array foo at index 4.3.
If you refer to an array element that has no recorded value, the value of the reference is "", the null string.
You can find out if an element exists in an array at a certain index with the expression:
index in array
The expression has the value 1 (true) if array[index] exists, and 0 (false) if it does not exist.
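Here is a small sketch of the difference between testing with in and simply reading an element. The array contents are made up for illustration. One thing to watch: reading an element that does not exist creates it (with the null string as its value), so in is the safer test.

```awk
BEGIN {
  arr[2] = "foo"
  arr[3] = ""            # exists, but holds the null string

  if (2 in arr) print "2 is present"
  if (3 in arr) print "3 is present, value is the null string"
  if (!(7 in arr)) print "7 is absent"

  # Comparing arr[7] with "" is a reference, so it creates arr[7].
  if (arr[7] == "") print "arr[7] reads as the null string"
  if (7 in arr) print "7 is present now"
}
```

All five messages are printed, the last one only because the comparison on the line before it brought arr[7] into existence.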
Example:
This program sorts the lines by making an array using the line numbers as subscripts. It then prints out the lines in sorted order of their numbers.
{
  if ($1 > max)
    max = $1
  arr[$1] = $0
}

END {
  for (x = 1; x <= max; x++)
    print arr[x]
}
The first rule keeps track of the largest line number seen so far; it also stores each line into the array arr, at an index that is the line's number.
The second rule runs after all the input has been read, to print out all the lines.
When this program is run with the following input:
5  I am the Five man
2  Who are you?  The new number two!
4  . . . And four on the floor
1  Who is number one?
3  I three you.
its output is this:
1  Who is number one?
2  Who are you?  The new number two!
3  I three you.
4  . . . And four on the floor
5  I am the Five man
If a line number is repeated, the last line with a given number overrides the others.
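The overriding is nothing more than a second assignment to the same subscript. A tiny sketch with made-up values:

```awk
BEGIN {
  arr[2] = "2  first version"
  arr[2] = "2  second version"   # same subscript, so the old value is replaced
  print arr[2]                   # prints: 2  second version
}
```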
Gaps in the line numbers can be handled with an easy improvement to the program's END rule:
END {
  for (x = 1; x <= max; x++)
    if (x in arr)
      print arr[x]
}

For more information on associative arrays, here is the link:

http://www.cs.uu.nl/docs/vakken/st/nawk/nawk_80.html

Thanks
Karteek
