11.3.5 Array processing in assembly language
In section 8.9 of Chapter 8, an array was defined as a collection of data elements, all of the same type, that are stored in contiguous memory locations. To access an item stored in an array, high-level languages require both the name of the array and the relative position, or “offset,” of the item of interest. Hence, A[0] is a reference to the first element of the array A, because its offset is zero. Likewise, A[1] is a reference to the second element, because it is offset from the start of the array by one element.
Variables, as well as constants, may be used for subscripts. Hence, the statement:
A[i] = -1;
places the value –1 into the element at index position i of array A. Similarly, the statement:
X = A[j];
places a copy of element j of A into the variable X.
This ability to access and modify array elements using variable subscripts is what makes it practical to process every element of an array. Generally, an operation on all of the elements is described in terms of some generic element, such as the element at index position i, and that operation is then embedded within a repetition construct that steps i through each index position.
As an example, perhaps one wants to initialize all of the elements of an array to some initial value. The following high-level Watson JavaScript program initializes each of the elements of a 30-element numeric array, named myArray, to –1.
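A minimal sketch of such a program in plain JavaScript is given here. It is an illustration rather than the book's own listing, and the constant name SIZE is an assumption introduced for readability.

```javascript
// Sketch only: plain JavaScript, not the original Watson JavaScript listing.
const SIZE = 30;                  // number of elements, as stated in the text
const myArray = new Array(SIZE);  // the 30-element numeric array

// Visit each index position i in turn and store -1 there.
for (let i = 0; i < SIZE; i++) {
  myArray[i] = -1;
}

console.log(myArray.length, myArray[0], myArray[29]); // prints: 30 -1 -1
```

The loop body is written once, in terms of the generic element myArray[i], and the repetition construct supplies every value of i from 0 through 29.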
How can such a program be translated into assembly language?
Declaring an array is simple: just reserve the appropriate amount of space via the .BLOCK command. For a 30-element array of integers named myArray, the command:
MYARRAY .BLOCK 30
does the job. Accessing the elements of this array via variable subscripts is somewhat more challenging to implement. To use arrays effectively, we need a way to calculate the memory address of each element of the array, and we need assembly language instructions that can load data from, and store data to, calculated addresses.
We will look at the calculation of addresses in a moment. But first, let’s examine how a value can be retrieved from a memory location via the “load indirect” instruction, assuming the memory address calculation has already been done for us.
The form of the load indirect instruction is:
LOADIND destination register, source location register
A statement such as:
LOADIND REGA, REGB
will place a copy of the value held in the memory address indicated by register B into register A.
Consider a high-level assignment statement such as:
x = A[5];
which places the value of the element of A located at index position 5 into the variable x. Assuming register B held the address of A[5], such a statement could be implemented by the following code segment:
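The effect of a load indirect can be simulated in plain JavaScript. This is a sketch of a toy machine, not Watson assembly: the memory array, the registers object, the base address 10, the sample data, and the helper loadInd are all assumptions made for illustration.

```javascript
// Sketch only: a toy model of memory and registers, not the Watson machine itself.
const memory = new Array(64).fill(0);  // word-addressed memory (size is an assumption)
const registers = { A: 0, B: 0 };

const BASE = 10;                       // assumed base address of array A
for (let k = 0; k < 8; k++) {
  memory[BASE + k] = 100 + k;          // sample data: A[k] = 100 + k
}

// Hypothetical helper mirroring "LOADIND dest, src":
// copy the word at the address held in register src into register dest.
function loadInd(dest, src) {
  registers[dest] = memory[registers[src]];
}

registers.B = BASE + 5;  // assume register B already holds the address of A[5]
loadInd("A", "B");       // corresponds to LOADIND REGA, REGB
const x = registers.A;   // x = A[5]

console.log(x);          // prints: 105
```

Register B is never read as data here; it only says where in memory to look, which is the sense in which the load is indirect.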
Placing a value into an array works in a similar manner, except that the “store indirect” instruction is used rather than the “load indirect” instruction. The form of the store indirect instruction is:
STOREIND source register, destination location register
Hence, the statement:
STOREIND REGE, REGF
will store the value held in register E into the memory address indicated by register F.
These instructions, LOADIND and STOREIND, are referred to as “indirect” instructions because they do not directly specify which memory address is to be accessed. Instead, they specify the target address indirectly, by indicating where that address may be found: an indirect instruction names a register that is consulted for the actual target address.
Indirect instructions add a lot of power to an assembly language by allowing programs written in that language to compute “on the fly” the memory addresses of objects that are to be accessed or modified. Without some form of indirection, array processing with variable subscripts would be impossible, since the memory location specified by an expression such as A[i] depends on the value of the subscript i.
It should now be clear that if we had the address of a particular element of an array, we could both access that element (via LOADIND) and modify it (via STOREIND). This raises the question: “How can the address of an array element be computed?”
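A store indirect can be simulated in the same spirit. The following plain JavaScript sketch uses a toy memory and register model. The names mem, regs, and storeInd, and the addresses 12 and 20, are assumptions chosen for illustration. It shows one unchanged instruction writing to two different locations because only the register contents change.

```javascript
// Sketch only: toy memory/register model; all names here are hypothetical.
const mem = new Array(32).fill(0);
const regs = { E: 0, F: 0 };

// Hypothetical helper mirroring "STOREIND src, dest":
// copy register src into the memory word whose address is held in register dest.
function storeInd(src, dest) {
  mem[regs[dest]] = regs[src];
}

regs.E = -1;         // the value to store
regs.F = 12;         // target address, chosen at run time
storeInd("E", "F");  // corresponds to STOREIND REGE, REGF; writes mem[12]

regs.F = 20;         // same instruction text, different target
storeInd("E", "F");  // now writes mem[20]

console.log(mem[12], mem[20], mem[13]); // prints: -1 -1 0
```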
The address of the element at index position i of an array can be computed from the expression:
base_address_of_array + ( i * size_of_each_element )
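where:
base_address_of_array is the memory address of the first element of the array,
i is the index position of the element of interest, and
size_of_each_element is the number of memory words occupied by each array element.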
Since we are limiting ourselves to the study of integers (each of which requires a single word of memory), size_of_each_element is one. Multiplying by one changes nothing, so the term can be safely dropped, leading to the simplified form of the expression:
base_address_of_array + i
The term i is often referred to as the “offset” into the array, since it specifies how far past the base address the element of interest is located.
Here is an example that illustrates the process of memory address computation for array elements. If we assume that the array A begins at memory location 10, then A[3] (the 4th element) will be stored at location 13.
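The arithmetic can be checked with a small plain JavaScript function. The function name elementAddress and its parameter names are assumptions introduced for this sketch. The default element size of one word follows the simplification made above.

```javascript
// Hypothetical helper: address of the element at index position i.
// sizeOfEachElement defaults to 1 because each integer occupies one word.
function elementAddress(baseAddress, i, sizeOfEachElement = 1) {
  return baseAddress + i * sizeOfEachElement;
}

console.log(elementAddress(10, 3));    // prints: 13 (the A[3] example)
console.log(elementAddress(10, 0));    // prints: 10 (A[0] sits at the base address)
console.log(elementAddress(10, 3, 2)); // prints: 16 (if each element took two words)
```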
Figure 11.11: Initializing all 30 elements of an array to -1
The above algorithm for calculating the location of index position i of an array can be translated into assembly language as follows:
At the end of this process registerz will contain the address of the array element at index position i.
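As a rough illustration of that process, the following plain JavaScript sketch carries out the computation one register at a time. It does not reproduce the assembly listing: the three-step breakdown, the register names registerX, registerY, and registerZ, and the memory layout (array at address 10, variable i stored at address 50) are all assumptions.

```javascript
// Sketch only: register names and memory layout are illustrative assumptions.
const ram = new Array(64).fill(0);
const BASE_ADDRESS = 10;  // assumed base address of the array
const I_ADDRESS = 50;     // assumed memory location of the variable i
ram[I_ADDRESS] = 3;       // i = 3

const reg = { registerX: 0, registerY: 0, registerZ: 0 };

reg.registerX = BASE_ADDRESS;                   // step 1: place the base address in a register
reg.registerY = ram[I_ADDRESS];                 // step 2: load the current value of i
reg.registerZ = reg.registerX + reg.registerY;  // step 3: base address + i

console.log(reg.registerZ); // prints: 13
```

With the address in a register, a load indirect or store indirect naming that register reaches the element at index position i.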
Figure 11.11 presents a complete assembly language program for initializing all of the elements of a 30-element array to –1. The figure also contains two high-level Watson JavaScript programs for accomplishing the same task – one using a “for” loop, the other a “while” loop.
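To tie the pieces together, the following plain JavaScript sketch simulates the assembly-level approach: on each pass it computes base address plus index, then performs a store indirect through the computed address. It is not a transcription of Figure 11.11. The base address 10, the 64-word memory, the register names, and the helper storeIndirect are assumptions.

```javascript
// Sketch only: simulates the assembly-level strategy; not the figure's program.
const SIZE = 30;                      // number of array elements
const BASE = 10;                      // assumed base address of the array
const words = new Array(64).fill(0);  // toy word-addressed memory
const r = { value: -1, index: 0, address: 0 };  // hypothetical registers

// Hypothetical helper: store register src at the address held in register dest.
function storeIndirect(src, dest) {
  words[r[dest]] = r[src];
}

while (r.index < SIZE) {
  r.address = BASE + r.index;         // compute the address of element i
  storeIndirect("value", "address");  // write -1 through the computed address
  r.index = r.index + 1;              // move on to the next index position
}

const allSet = words.slice(BASE, BASE + SIZE).every(w => w === -1);
console.log(allSet);                              // prints: true
console.log(words[BASE - 1], words[BASE + SIZE]); // prints: 0 0
```

The last line checks the words just outside the array, which should remain untouched if the loop bounds are right.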
Exercises for Section 11.3.5