Aggregate Data

Arrays

Design Issues for Arrays

Array Notation

Declaring Arrays

Array Size

Determining the Lower- and Upper-Bound of an Array

Array Organization

Array Starting Positions

Where Do Arrays Start?

Where Do Arrays End?

How Many Elements Does 10 Get Us?

Initializing Arrays

Perl Arrays

FORTRAN 90 Arrays

Questions

Array Traversing

Array Operations

Using foreach with Arrays

Array Cross Sections

Array Slices and Sections

Design Issues for Array Slices

Negative Array Indexes

Perl Arrays

Python Arrays

PL/I Label Arrays

Array of Size Zero

Perl Shift and Push for Arrays

Ragged Arrays

JavaScript Arrays

Ruby Arrays

Questions

 

 

Creative Commons License
This work is licensed under a Creative Commons Attribution-No Derivative Works 3.0 United States License.

Aggregate Data

 

Simple variables are called elementary data items. They are manipulated as a unit. For example, here are some elementary or simple variables:

 

   int count = 0;

   float rate = 15.50;

 

But often we want groups or aggregations of elementary data objects, such as arrays or records. Objects such as arrays or records are called data structures; each is an aggregate of other data objects. In this chapter, I go over arrays. There are so many ways to handle arrays in the different programming languages that we could devote a whole book to just arrays. But this chapter provides a fairly good review of the many programming techniques used with arrays, and it points readers who want more than is provided here to languages they can study.

 

Arrays

Arrays allow us to represent many values with one variable name. In some languages (e.g. COBOL) arrays are called tables. An individual value in an array is called an element. A subscript or index is used to indicate which one of the array elements we want to use.

 

Design Issues for Arrays

While arrays work about the same in most languages there are some interesting differences. Here are some design issues for arrays.

 

 

For each of the above choices, there is some language whose arrays work that way.

 

Most descriptions of arrays say that an array is a group of homogeneous elements, that is, elements of the same type (like all integer or all double). But then scripting languages such as JavaScript, Perl, and Ruby showed up with arrays that can have elements of any type, including other arrays!

 

Some languages do great things with arrays. If you are interested in what a language can do with arrays, you might look at APL, which has an extensive set of array operations. But APL needed many special characters for operations and thus needed special input and output devices, which probably doomed the language. Other languages that use regular keyboards but still do wonderful things with arrays are FORTRAN 77 (and later versions), PL/I, and Ada. Modern scripting languages have very interesting array operations.

 

Array Notation

One interesting challenge is how to tell arrays from other things, such as functions. This is sometimes called the name space for arrays. ALGOL 60 contributed the square brackets [ ] for enclosing array subscripts. Since FORTRAN was already available, the people designing ALGOL had probably run into the problem of using round parentheses for both functions “sqrt(85.0)” and arrays “taxrate(4)”. For example, if we have the following FORTRAN statement:

 

     z = 2 * baz(z)

 

Is baz an array or a function? In FORTRAN baz could be either an array or a function since both use round parentheses. This makes syntax checking for the FORTRAN compiler difficult and error messages often misleading. This problem is covered in more detail in the Parentheses section of chapter 1.

 

The second question with using arrays is how we tell arrays from simple variables. Can we have an array name and a variable name that are the same in the program? That is, can we have an array named rates[] and a variable named rates? Each language has its own rules about this, but just reading a programming textbook will often give few hints. Try using identical array and variable names in the same program and see what happens. You can make some sophisticated guesses about what will happen. For example, languages that have array operations make it impossible to have identical names for arrays and variables. Some languages (e.g., PL/I) allow this pseudocode for the array rates:

 

   rates = 0

 

In these languages all elements of the rates array are set to zero. But this means we cannot also have a simple variable named rates. Array operations are covered later in this chapter.

 

The third array notation question is how to indicate multiple-dimensioned arrays. In FORTRAN and other languages, we use:

 

     dimension scores(5,4)

 

where array dimensions are listed in one set of parentheses. But the C family of languages use:

 

     int scores[5][4];

 

where each array dimension is in a separate set of brackets. Table x.1 illustrates some two dimensional array examples by language.

 

Notation       Languages
rates(2, 3)    FORTRAN, BASIC, COBOL, Ada
rates[2][3]    Java, C++, C#, Modula-2
rates[2, 3]    C#, Pascal, Modula-2

Table x.1: Array Notation by Language

 

As you can see, most older languages use parentheses for their arrays (the early input devices did not have brackets), but newer languages use brackets. Thus the newer languages can clearly indicate that zap(x) is a function and zap[x] is an array. C# and Pascal use only one set of brackets (e.g., rates[2, 3]) for a multidimensional array, while Java and C++ use a set of brackets for each dimension of the array (e.g., rates[2][3]). The situation where C# uses two sets of brackets is a special case, which is covered later in this chapter when ragged arrays are discussed. Modula-2 allows either [i, k] or [i][k] for array indexes.

Declaring Arrays

The declaring of arrays is similar in many languages but there are a few variations. To illustrate the differences between languages we will declare the same arrays in several languages. First, we declare a 1-dimensional integer array maxy with 10 elements. Next, we declare a floating point 2-dimensional array total with 3 rows and 5 columns. Finally, if possible in that language, we declare a floating point array tides with indexes from –5 to 5. Here are some examples by language.

 

C/C++:

     int maxy[10];         // 10 elements; subscripts 0 to 9.

     float total[3][5];    // 3 rows and 5 columns.

 

Java:

     int[] maxy = new int[10];

     float[][] total = new float[3][5];

 

Ada:

     maxy : array (1..10) of integer;

     total : array (1..3, 1..5) of float;

     tides : array (-5..5) of float;

 

ALGOL:

     integer array maxy [1:10];

     real array total [1:3, 1:5];

     real array tides [-5:5];

 

C#:

     int[] maxy = new int[10];     // 10 elements; subscripts 0 to 9.

     float[,] total = new float[3,5];

 

COBOL:

     05  MAXY    PICTURE 9999 OCCURS 10 TIMES.

     05  ROWS-TOTAL  OCCURS 3 TIMES.

         10  TOTAL  PICTURE 9999V99  OCCURS 5 TIMES.

 

FORTRAN:

     DIMENSION MAXY(10), TOTAL(3,5)   ! FORTRAN IV: arrays start at one.

     REAL TIDES(-5:5)                 ! FORTRAN 77 and later.

 

Pascal (Modula-3 is similar):

     var maxy : array [1..10] of integer;

         total : array [1..3, 1..5] of real;

         tides : array [-5..5] of real;

 

As you can see, the array declarations differ slightly by language. The numbers used for the array bounds change depending on whether the array starts at zero or one. Some of the languages can handle arrays that use indexes of any starting value, as used for the array tides. “Modern” languages like ALGOL-60 (joke) can do tricky things like that. It is interesting that few modern programming languages can start an array anywhere except zero.

Array Size

How array size is indicated has several variations. First, can we use variables or just constants for array size?

 

     dim taxrate(100)   or dim taxrate(size)

 

Some languages, such as early FORTRAN, limit array declarative sizes to just integer literals. But even in FORTRAN we can pass a variable to a function or subroutine and then use the variable for the array size:

 

     FUNCTION SUM(TAXRATE, SIZE)

     INTEGER SIZE

     DIMENSION TAXRATE(SIZE)

 

While array bounds are often restricted to an integer or integer expression, a few languages are more generous. For example, in BASIC, we can use:

 

     dim tax(11.7)

 

and the array size 11.7 will be rounded to 12. I am not clear why we need this but I am sure many people can tell me. Other languages allow real numbers to be used but truncate the value to an integer.

 

The C family allows variables to be used if the variable has a value before execution of that function. In C we could use #define to do what we need.

 

   #define SIZE 100

   int main(void) {

     int taxrate[SIZE];

     int otherrates[SIZE * 2];

     /* ... */

   }

 

Not only can we use the integer constant SIZE, but we can also use integer expressions (SIZE * 2) to indicate the size of the array. In C++ we can change the #define line to:

 

     const int SIZE=100;

 

and the rest would be the same.

 

Determining the Lower- and Upper-Bound of an Array

 

Later versions of BASIC allowed any starting and ending value, so we could have arrays like the following:

 

     dim years (1998 to 2008)

     dim tide.level(-5 to +5)

 

Many “modern” languages do not allow this freedom with declaring arrays, instead making the programmers deal with array subscript offsets (often unsuccessfully) to take care of a very simple situation.

 

Since BASIC allows programmers to indicate both the lower and upper array bound [ DIM TIDE (-3 to 3) ] and to change the array bounds during execution (variables can be used instead of the constant 3), it is useful to be able to find out what the present bounds of the array are. QBASIC has the functions lbound(array-name, n) and ubound(array-name, n), where n is the nth dimension, to determine these array bounds. Thus for the array

 

   DIM TIDE (-5 to 3)

 

Lbound(TIDE, 1) returns –5, and Ubound(TIDE, 1) returns 3. Many languages including C#, Perl, and Python have methods or functions to determine the length of the array.
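In a language that offers only zero-based arrays, the subscript offsets mentioned above can be hidden inside a small wrapper. The following Python sketch is only an illustration of that idea. The class name OffsetArray and its lbound and ubound methods are hypothetical names modeled on the QBASIC functions; they are not part of Python or of any library.

```python
class OffsetArray:
    """One-dimensional array whose subscripts run from lower to upper, inclusive."""

    def __init__(self, lower, upper, fill=0):
        if upper < lower:
            raise ValueError("upper bound must not be below lower bound")
        self._lower = lower
        self._upper = upper
        self._items = [fill] * (upper - lower + 1)

    def _position(self, index):
        # Translate a user subscript into a zero-based list position.
        if not self._lower <= index <= self._upper:
            raise IndexError(f"subscript {index} out of bounds")
        return index - self._lower

    def __getitem__(self, index):
        return self._items[self._position(index)]

    def __setitem__(self, index, value):
        self._items[self._position(index)] = value

    def lbound(self):
        return self._lower

    def ubound(self):
        return self._upper

    def __len__(self):
        return len(self._items)


tide = OffsetArray(-5, 3)      # plays the role of DIM TIDE (-5 to 3)
tide[-5] = 1.25
tide[3] = 0.75
print(tide.lbound(), tide.ubound(), len(tide))
```

The bounds check in _position also shows the kind of subscript checking that a language can do on the programmer's behalf.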

Array Organization

 

FORTRAN was one of the first high-level languages to have arrays. Since FORTRAN is for scientific programmers, the arrays were organized a little differently. For example, FORTRAN arrays are organized by column (column major order) instead of by row. This means the first subscript increases the most rapidly, followed by the second subscript, and so on. For example:

 

     DIMENSION  K(4, 3)

 

declares an array of 12 elements, and the order in memory will be K(1,1), K(2,1), K(3,1), K(4,1), K(1,2), ....

 

For many situations it may not matter how the array is stored in memory, but it does matter for operations that transmit entire arrays. One example is input and output, which FORTRAN can handle without specifying the elements:

 

            READ *, K

 

The above will read the next 12 items of data into the array K. If we assume the input data is the first 12 integers then the array will look like this when full:

 

     1  5   9

     2  6  10

     3  7  11

     4  8  12

 

There are other operations, such as using a DATA statement to fill an array, where similar results will occur. FORTRAN was one of the few languages to store arrays by column.

 

The next historical language, COBOL, stores arrays by row (row major order), as do most other languages. So if we read in the first 12 integers, the array would look like this in most other languages:

 

      1   2   3

      4   5   6

      7   8   9

     10  11  12

 

Today most languages fill and process arrays by row instead of by column. But for the early mathematical programmers, multi-dimension arrays were just multiple vectors, and that arrangement seemed correct. FORTRAN DO loops can be used to change the order of array processing. Business applications treat arrays like records, and thus favor row major order. But scientists often need columns of numbers, and then column major order is more useful. It is interesting that most modern languages have followed the COBOL order instead of the FORTRAN order.
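The two storage orders can be reproduced with a short Python sketch. It is only an illustration: the function names fill_column_major and fill_row_major are made up for this example, and a list of row lists stands in for the two-dimensional array K(4, 3).

```python
def fill_column_major(values, rows, cols):
    """Place values so the first (row) subscript varies fastest, as in FORTRAN."""
    grid = [[None] * cols for _ in range(rows)]
    for position, value in enumerate(values):
        col, row = divmod(position, rows)
        grid[row][col] = value
    return grid


def fill_row_major(values, rows, cols):
    """Place values so the last (column) subscript varies fastest, as in COBOL or C."""
    grid = [[None] * cols for _ in range(rows)]
    for position, value in enumerate(values):
        row, col = divmod(position, cols)
        grid[row][col] = value
    return grid


data = list(range(1, 13))              # the first 12 integers
by_column = fill_column_major(data, 4, 3)
by_row = fill_row_major(data, 4, 3)

for line in by_column:
    print(line)
print()
for line in by_row:
    print(line)
```

The first grid printed matches the FORTRAN layout shown earlier, and the second matches the row major layout.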

 

One unanswered or at least not agreed on characteristic for arrays is where to start them. Should arrays start with element zero or element one? If you ask programmers this question you can probably figure out what language they started programming in. Next, if we start arrays at zero and ask for 5 elements, do we get elements 0-5 or 0-4? The next sections will survey how different languages have answered these questions.

 

Array Starting Positions

In FORTRAN II and IV arrays automatically start with element 1, and cannot start with a lower or higher value. FORTRAN 77 changed this, allowing negative or positive array starting positions. Thus in FORTRAN 77 and later versions we can declare arrays like the following:

 

  

   integer years(1998:2008)

   real tides (-3:5)

   integer gamepoints (-5:5, 2:5)

 

So the subscripts of the array years go from 1998 to 2008, the subscripts of the array tides go from -3 to 5 (-3, -2, -1, 0, 1, 2, 3, 4, 5), and gamepoints is a two-dimensional array with negative values on the first index.

 

Ada allows similar arrays. The above arrays would look as follows:

 

   years: array (1998..2008) of integer;

 

Visual Basic grants the same wide variety in declaring arrays.

 

Modula-2 and its related languages allow array indexes to start at any value (positive or negative) and allow other ordinal types for subscripts. Here are a few examples:

 

   TYPE

     TideLevel = ARRAY [-10..10] OF REAL;

     vector = ARRAY [1..50] OF REAL;

     lcaseCount = ARRAY ["a".."z"] OF INTEGER;

     TruthType = ARRAY BOOLEAN OF INTEGER;

 

Most of these are understandable, except maybe the last line that uses FALSE and TRUE for the subscript.

Where Do Arrays Start?

As you can see from the above, there is a common question of where arrays start: zero or one. ALGOL got around that problem by using a bound pair for declaring the arrays (see above), and thus the beginning array subscript is indicated.

 

Why do C arrays start at zero instead of one? The name of the array gives the address of the beginning of the array, so it points to the first (zero) element. Thus for the array rates, all of the following refer to the zero element (in C) of the array:

 

   *rates

   *(rates+0)

   rates[0]

 

After C started its arrays this way, most modern languages adopted the same system of starting all arrays at the zero element.

 

In early BASIC, arrays of ten or fewer elements did not have to be declared. The subscripts went from 1 to 10. In the second edition of BASIC, the zero subscript was allowed, so now we could use A(0) or B(0,0). The other interesting item is that if you declare an array

 

            DIM X(20)

 

you actually have elements X(0) to X(20) available (21 elements). Likewise, an array of B(4,5) goes from B(0,0) to B(4,5), which gets you 30 elements instead of the 20 you get with Java or C. In C, when we ask for int x[20], we start at element 0 but only get to element 19, which I have never liked. Too bad Kernighan and Ritchie didn’t read BASIC manuals more, or perhaps they did and rejected my suggestion. They are a lot smarter than me, so I assume they did the right thing. One can easily argue for either approach. That is, a declaration of size 20 will get either elements 0-19 or 0-20.

 

Since BASIC was designed for amateur programmers, the interpreter checked for array bound errors. Not many early languages do this for us, probably because this error checking would be a lot more difficult for a compiler than an interpreter. Nowadays computers are very fast so you see C# and VB checking array indexes.

 

There are some problems here, if the language does array processing. For example, if we have a command that prints arrays do we print the zero element (row) or not?

 

QBASIC and Visual BASIC 6.0 have solved this problem by letting you set the default to be zero or 1. But then all the arrays in that program start at zero or one in that language. To do this in these versions of BASIC we use the statement:

 

     option base 0

 

The only choices for option base are zero or one. Since arrays can have a starting and ending value in newer versions of BASIC, individual arrays can override this use of option base.

 

Where Do Arrays End?

Since we cannot agree on where arrays start (zero, one, or elsewhere), it is not surprising that the end of the array is also not agreed on. In FORTRAN and BASIC an array declared with size 10 ends with subscript 10, while in the C family of languages an array declared with size 10 ends with subscript 9.

 

How Many Elements Does 10 Get Us?

In FORTRAN (elements 1-10) and C (elements 0-9), we get 10 elements, while in BASIC (elements 0-10) we get 11 elements. I have always thought the BASIC solution was best, since I do not have to use the zero element if I am not interested in it, and I do not have to ask for 11 elements (as in C) if I want the element with a subscript of 10. It is interesting how many people feel strongly about either solution.

 

Thus when we declare an array of 10 elements, we get a different starting subscript (0 or 1), and a different number of elements, depending on the language. The following table illustrates this in different languages:

 

Language          Element range    # of elements
FORTRAN, PL/I     1-10             10
C++/Java/C#       0-9              10
VB .Net/QBasic    0-10             11

Table x.x: Array Elements by Language

 

The above table shows some of the variations and there are many languages in each category.
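The three rows of the table can be written out as a small Python sketch. It is an illustration only: the function name subscripts and the convention labels are hypothetical names chosen for this example.

```python
def subscripts(declared, convention):
    """Return the valid subscripts for an array declared with the number `declared`."""
    if convention == "one_to_n":             # FORTRAN, PL/I
        return range(1, declared + 1)
    if convention == "zero_to_n_minus_1":    # C++, Java, C#
        return range(0, declared)
    if convention == "zero_to_n":            # VB .Net, QBasic
        return range(0, declared + 1)
    raise ValueError(f"unknown convention: {convention}")


for name in ("one_to_n", "zero_to_n_minus_1", "zero_to_n"):
    valid = subscripts(10, name)
    # first subscript, last subscript, number of elements
    print(name, valid[0], valid[-1], len(valid))
```

The same declared number, 10, gives three different sets of subscripts, which is exactly the source of the off-by-one mistakes discussed in this section.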

 

Initializing Arrays

Once we have arrays, we need some way to initialize the array elements. C/C++ automatically initializes external and static array elements to zero, but does not initialize local arrays, much to the dismay of many programming students. C# initializes all arrays when they are created. Most languages do not initialize array elements to any value and have no easy way to initialize a large number of elements to the same value.

 

For example, suppose we have an array KZAP of 9 elements and want to initialize the first three elements to zero, the next four to 9, and the last two elements to 3. In FORTRAN we would do the following:

   DIMENSION KZAP(9)

   DATA KZAP /3*0, 4*9, 2*3/

 

Many a time have I wanted some way to do something similar in C++. In C or C++ we would do the following:

 

   int kzap[9] = {0,0,0,9,9,9,9,3,3};

 

which is not too bad as long as I do not need 100 elements instead of 9 elements.

 

PL/I improved on the FORTRAN method as follows:

 

   DECLARE KZAP(9) FLOAT DECIMAL

           INITIAL((3)0, (4)9, (2)3);

 

which will initialize the array to the same values as the previous two languages (3 zeros, 4 nines, and 2 threes). This is an improvement because the array declaration and initialization are done in one statement rather than two. FORTRAN needs both a DIMENSION and a DATA statement. Both PL/I and FORTRAN have ways to repeat values in other declarations. This seems like a very nice method of initializing array elements.

 

Python allows us to repeat values when initializing arrays (which Python calls lists), but its syntax is a little different. If we want an array named numbers with the first eight integers we could do this in Python:

 

   numbers = [1,2,3,4,5,6,7,8]

 

But if we wanted eight zeros instead we could do this:

 

   numbers = [0]*8

 

Likewise, we obtain an array with values similar to the previous PL/I example as follows:

 

   numbers = [0]*3 + [9]*4 + [3]*2

 

Thus all these languages have easy ways to initialize elements in an array to multiple copies of the same value.

 

In some languages we can initialize only part of the array by leaving positions in the list empty. JavaScript, for example, accepts the following:

 

   var kzap = [0,,2,,4,,6,,8];

 

The above line would initialize the even-numbered elements, but leave the odd-numbered elements uninitialized. (C and C++ reject empty positions in an initializer list; there, only elements at the end of the list can be omitted, and they are set to zero.) While skipping elements may be the intention, the more likely occurrence is an extra comma typed by mistake, which leaves one element out as follows:

 

   var kzap = [0,1,2,3,4,,6,7,8,9];

 

In this code, element five was not initialized because of the extra comma. Ada, in its great wisdom, has the rule that if any element is initialized, all elements need to be initialized. Do you like this rule?

 

Ada has many ways to initialize an array. Suppose, we have an Ada array named codes that has five elements. Here are some ways to initialize that array in Ada:

 

   codes : array (1..5) of integer;

 

   codes := (4, 1, 6, 3, 99);           -- positional association.

   codes := (1..5 => 4);                -- sets all elements to 4.

   codes := (1|3|5 => 9, others => 7);  -- uses an or.

   codes := (1..3 => x, others => z);   -- uses variables.

   codes := (4=>99, 2=>7, 3=>88, 1=>72, 5=>78);

          -- above uses subscripts to indicate each value.

 

The composite symbol => associates subscripts with a value. For the few students not programming in Ada (joke), (1..5 => 4) assigns the value 4 to elements 1 to 5. Few "modern" languages can match this Ada flexibility of array initialization. Critics would say few modern languages would want to.

 

Trying to keep compatible with previous versions of BASIC hampers Visual Basic. It takes two steps to declare an array with an explicit size and to initialize its elements to non-zero values. For example, we do this in Visual Basic .NET:

 

   Dim intSums(5) As Integer   ' declares 6 elements and

                               ' initializes them to zero.

   Dim intAges() As Integer = {0, 5, 12, 18, 45, 49}

 

In Visual Basic .NET we cannot declare the upper bound of the array and initialize the values in the same statement. Thus the parentheses in intAges() are left empty. Some languages allow both in one statement, but then the number of elements asked for can conflict with the number of values supplied.

Perl Arrays

In Perl the @ symbol indicates we are using an array. A list of values can be used to initialize an array. For example:

 

   @numbs = (1, 2, 3, 4, 5)

 

will set the first five elements of array @numbs to the digits 1 to 5. But we can use the range operator to do it a little easier, as follows:

 

   @numbs = (1..5)

 

This range operator is available in some other languages, but in Perl, the range operator can be used with simple or complicated strings too. For example:

 

   @letters = ('d', 'e', 'f', 'g', 'h');

 

can be changed to

 

   @letters = ('d' .. 'h');

 

Ranges do not have to be as simple as above, since we can do things like

 

   'part01' .. 'part05'

 

which gets us part01, part02, part03, part04, part05.

 

Or even things like 'aa' .. 'cz', which I will let you figure out.[1]

 

When initializing Perl arrays, expressions can be used. Here is an example:

 

   @x = (4, 2+9, 3*4);

 

But the expressions will be evaluated before the array is initialized, as follows:

 

   @x = (4, 11, 12);

 

 

FORTRAN 90 Arrays

Anyone interested in how arrays can be handled in a programming language needs to look at FORTRAN 90. Older versions of FORTRAN also did many amazing things, but the newer version added even more. First, there are several different ways to declare an array. Suppose we want an integer array of 10 elements.

 

FORTRAN IV

 

   DIMENSION X(10), MINX(10)

 

Since early FORTRAN used default types for variables depending on the first letter of the variable name, X would be a real (floating point) array and MINX would be an integer array. Variables and arrays that start with the letters I-N default to integer and the rest default to real. If we did not like this, we could do something similar to the following:

 

   INTEGER X

   REAL MINX

   DIMENSION X(10), MINX(10)

 

This changes the name X so it now is an integer and the name MINX so it is real.

 

FORTRAN 90 added several interesting ways to assign values to arrays. Here are a few of them:

 

   DIMENSION A(10), B(10), C(12)

   A = 0     ! SETS ALL ELEMENTS IN THE ARRAY TO ZERO.

   B = 2 * X   ! SETS ELEMENTS IN THE ARRAY TO EXPRESSION.

 

This is the simplest way to set all elements to an expression. The value is broadcast to all the array elements. Thus the syntax is:

 

   array-variable = expression

 

A list of values can also be used to assign values to an array. For the above array A we can do this:

 

   A = (/ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 /)

 

which assigns 2 to the first element, 4 to the second element, etc. This can be done more easily using an implied-do constructor as follows:

 

   A = (/ (2*I, I = 1, 10) /)

 

Or lists and implied do constructors can be used together like this:

 

   A = (/ 2, 4, (I, I = 6, 18, 2), 20 /)

 

A standard counter-controlled loop could be used, but that is less fun and offers less variation. We could do the above in the usual way:

 

    DO K = 1, 10

       A(K) = 2 *K

    END DO

 

But then most any language can do this.
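For comparison, here is a rough Python sketch of the same three ideas: a broadcast value, an implied-do constructor, and a list mixed with an implied-do. Python lists stand in for the FORTRAN arrays, and the variable names a, b, and c are chosen only for this example. Note that Python's range excludes its upper bound, while the FORTRAN implied do includes it.

```python
# A = (/ (2*I, I = 1, 10) /)
a = [2 * i for i in range(1, 11)]

# A = (/ 2, 4, (I, I = 6, 18, 2), 20 /)
b = [2, 4, *range(6, 19, 2), 20]

# A = 0 : one value repeated for every element
c = [0] * 10

print(a)
print(b)
print(a == b, len(c))
```

Both constructors build the same ten even numbers, so the final line reports that a and b are equal.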

Questions:

1. How would you design arrays for the OPL we are designing? Would you automatically start arrays at zero or 1? If you start arrays at zero, would you provide N elements (0 to N-1) or would you provide N+1 elements? Note that mathematicians need arrays that start with zero. What is the ground floor of a hotel (0 or 1) in the U.S., Europe, Asia, or South America? If your date told you to meet on the first floor in each of these places, which floor should you go to in order to avoid standing up your date?

 

2. Would you allow array subscripts to start at any value in OPL? For example, say you want an array for a ten-year period (1996-2005) as follows: DIM AMOUNT (1996::2005). Would this be a good idea to allow? How about an array for ocean tides (which can be negative or positive), DIM TIDE (-5::5), that has subscripts from -5 to 5?

 

3. How would you initialize arrays in OPL? Would you have some shorthand method like used in the FORTRAN DATA statement? Look at what Ada and FORTRAN allow for array initialization.

 

4. In several different languages determine exactly what can be used for the size of the array when it is declared. In a simple array a constant is often used

 

   int rate[4];

 

But we may be able to use a constant variable

 

   const int SIZE = 4;

   int rate[SIZE];

 

What exactly can SIZE be? Can we use variables, expressions, floating point values, or what? Your answer will vary by language. You will need to write a program to test out your theories.

 

5. Many languages do not allow subscripted variables to be a control variable in a loop. Here is an example:

 

   for (rates[n] = 0; rates[n] <= 5; rates[n]++)

 

There are a couple of possible problems here, since the array (and subscript) can be changing inside the loop. In a couple of different languages that you know see if you can find out if the above is allowed or forbidden. Refer to your language documentation to see if they tell you. Then write some code to test out what actually is allowed. Check to see if an array element can be used for the increment value or ending value of the loop.

 

Array Traversing

All languages have some favorite counter-controlled way to go through or traverse the array. But the problem with a counter-controlled loop is that it is easy to make a mistake. For example, assume an array where the size of the array is 10. Then what should the pseudocode be:

 

   for (k = 0; k < 10; k++)    or

   for (k = 1; k < 11; k++)    or what?

 

Even assuming we do not make a mistake, the correct loop bounds change by language or even by dialect of a language!

 

Visual Basic, C#, Perl, Java, and other newer languages have taken the problem out of our hands by providing a for each statement that traverses the array automatically without using subscripts. Here is the general form in VB:

 

   for each elementName in arrayName

 

The above code will process the complete array for us. Here is a more concrete example:

 

   Dim decRates(10) as decimal

   . . .

   Dim decOutRate as decimal

   For each decOutRate in decRates

      debug.WriteLine(decOutRate)

   Next decOutRate

 

Assuming the decRates array has some values, the For each statement will write out all the elements of the array. The programmer does not have to worry about getting the starting and ending values correct. This command must prevent many errors (especially among programming students), but it only works when you need to use all the elements of the array.
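The same contrast can be sketched in Python, whose for statement is a for-each loop. This is only an illustration; the list rates and its values are made up for the example.

```python
rates = [4.5, 5.0, 5.25, 6.0]

# Counter-controlled traversal: the programmer supplies the bounds.
total_by_index = 0.0
for k in range(len(rates)):
    total_by_index += rates[k]

# For-each traversal: no subscripts, so no bounds to get wrong.
total_by_element = 0.0
for rate in rates:
    total_by_element += rate

# When the position is also needed, enumerate supplies it.
for position, rate in enumerate(rates):
    print(position, rate)

print(total_by_index == total_by_element)
```

The last form is a common middle ground: the loop still visits every element automatically, but the subscript is available when the body needs it.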

Array Operations

PL/I has a wide variety of array operations and not all will be covered here. If you are interested in how a programming language can manipulate arrays, you might want to find some PL/I documentation. Here are a few operations, briefly explained:

 

  /* declares two arrays. */

   DECLARE A(4,3) FLOAT DECIMAL(6);

   DECLARE B(4,3) FLOAT DECIMAL(6);

 

  /* Sets all elements in array A to 0. */

   A = 0;

 

  /* Array assignment of all elements. */

   B = A;

 

  /* Multiplies all elements in B by 5 and

        assigns them to elements of array A */

   A = B * 5;

 

FORTRAN 90 can do many of the same things.
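A language without built-in array operations has to spell them out. The following Python sketch imitates the three PL/I statements above with plain lists of rows. The helper names make_array, copy_array, and scale are hypothetical and exist only for this example.

```python
def make_array(rows, cols, value=0.0):
    """Build a rows x cols array with every element set to value (like A = 0)."""
    return [[value] * cols for _ in range(rows)]


def copy_array(array):
    """Return an element-by-element copy (like B = A)."""
    return [row[:] for row in array]


def scale(array, factor):
    """Return a new array with every element multiplied by factor (like A = B * 5)."""
    return [[element * factor for element in row] for row in array]


a = make_array(4, 3)          # A = 0
b = copy_array(a)             # B = A
b[0][0] = 2.0                 # changing B leaves A untouched
a = scale(b, 5)               # A = B * 5
print(a[0][0], a[3][2], b[0][0])
```

Each helper hides a loop over every element, which is the work PL/I and FORTRAN 90 do for us when we write a whole-array statement.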

 

Array Cross Sections

Cross sections (or slices) of arrays are available in some languages. Thus A(2, *) refers to all elements in row 2, and A(*, 3) refers to all elements in column 3. At a simple level we could do something similar to this:

 

   A(2,*) = 0;

 

which will assign zero to all the elements in row two.

 

A more interesting example is the following processing of insurance rates. We have three levels (rows) of service and four classes (columns) of customers, ranging from very poor to very good. For example, college teachers would be in the very good column, and students would be in a lower column, which would help college teachers pay less. We use a percentage table to figure out rates:

 

 

             Very poor    Weak    Average    Very good
Level 1      1.70         1.24    1.00       0.85
Level 2      1.60         1.30    1.05       0.87
Level 3      1.65         1.20    1.00       0.85

Rate Percentage Surcharge

 

425    510    620

Rate

 

Now we can declare our tables and do the necessary arithmetic in PL/I as follows:

 

   DECLARE SURCHARGE(3,4), RATE(3), CUSTOMER_RATE(3);

 

   CUSTOMER_RATE(*) = RATE(*) * SURCHARGE(*,3);

 

This will take the RATE array and multiply it by the third column of the SURCHARGE array, and place the results in the new array CUSTOMER_RATE.
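The same arithmetic can be sketched in Python using the table values above. Python lists have no cross-section notation, so the helper function column (a hypothetical name for this example) picks out one column, using 1-based column numbers to match the PL/I statement.

```python
surcharge = [
    [1.70, 1.24, 1.00, 0.85],   # level 1
    [1.60, 1.30, 1.05, 0.87],   # level 2
    [1.65, 1.20, 1.00, 0.85],   # level 3
]
rate = [425, 510, 620]


def column(table, col):
    """Return column `col` (1-based, as in PL/I) of a list of rows."""
    return [row[col - 1] for row in table]


average = column(surcharge, 3)                          # SURCHARGE(*,3)
customer_rate = [r * s for r, s in zip(rate, average)]  # RATE(*) * SURCHARGE(*,3)
print(customer_rate)
```

Each element of rate is multiplied by the matching element of the third (Average) column, giving one customer rate per service level.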

 

PL/I will do necessary input and output for entire arrays.

 

   GET LIST(SURCHARGE);   /* Reads whole array. */

   PUT LIST(SURCHARGE);   /* Prints entire array. */

 

PL/I uses row major order, which means the right-most subscript varies most rapidly. So in these statements, the order of output would be:

 

    SURCHARGE(1,1), SURCHARGE(1,2), SURCHARGE(1,3),

    SURCHARGE(1,4),SURCHARGE(2,1), etc.

 

FORTRAN would use column major order.

 

Like FORTRAN, PL/I has a DO loop for input or output of arrays. For example, if we wanted to read the above array by column instead of the default of by row, we could do the following:

 

   GET LIST(((SURCHARGE(K,L) DO K=1 TO 3) DO L=1 TO 4));

 

Since the left-most subscript (K) varies the fastest, the array will be filled by column (column-major order) now.

Array Slices and Sections

A slice (or section) is a subset of an array, and may contain one or more elements. For example, a column or row in a multi-dimension array is a simple slice. But elements 4-8 of an array are also an example of an array slice.

Design Issues for Array Slices

There is a wide variety of how array slices are implemented in various programming languages. Here are some of the interesting questions.

 

 

As usual, different languages answer these questions differently, and the result may be called either a slice or a section.

 

FORTRAN 90 added array slices. These allow us to construct a new array by selecting elements from a parent array. The new arrays are called array sections or subarrays in FORTRAN. The syntax for these is as follows:

 

   array-name(subscript-triplet)

 

The array subscript triplet has the form

     lower: upper: stride

and any one of the parts can be left out. This is probably easiest to explain by example. So we first set up a parent array.

 

   DIMENSION A(10)

 

Then a range or section of an array can be indicated as follows:

 

   A(2: 7)  ! elements 2 thru 7.

   A( :4)   ! elements 1 thru 4.

   A(6: )    ! elements 6 thru 10.

 

In some languages a slice must be a contiguous section, but in Fortran 90 we can select non-contiguous elements. The stride is used to skip elements as follows:

 

   A(1: 10: 2)  ! odd elements 1, 3, 5, etc.

   A(1: 10: 3)  ! elements 1, 4, 7, 10.

 

These can be used any place an array operation is allowed.

 

   A(1: 10: 2) = 7  ! assigns 7 to odd number elements.

   A(2: 10: 2) = 5  ! assigns 5 to even number elements.
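For comparison, here is a rough Python sketch of the same sections. Python's slice notation also accepts a stride as a third part, but indexes start at zero and the upper bound is excluded, so each Fortran triplet has to be shifted. The list a is an assumption for the example; it holds the values 1 to 10 so that each value matches its Fortran element number.

```python
# Elements numbered 1..10, so Python index i holds element i + 1.
a = list(range(1, 11))

print(a[1:7])     # Fortran A(2:7):    [2, 3, 4, 5, 6, 7]
print(a[:4])      # Fortran A(:4):     [1, 2, 3, 4]
print(a[5:])      # Fortran A(6:):     [6, 7, 8, 9, 10]
print(a[0:10:2])  # Fortran A(1:10:2): [1, 3, 5, 7, 9]
print(a[0:10:3])  # Fortran A(1:10:3): [1, 4, 7, 10]

# Assigning through a strided slice needs one value per selected element.
a[0:10:2] = [7] * 5   # like A(1:10:2) = 7
a[1:10:2] = [5] * 5   # like A(2:10:2) = 5
print(a)              # [7, 5, 7, 5, 7, 5, 7, 5, 7, 5]
```

Unlike Fortran, a plain Python list will not spread a single scalar across a strided slice, which is why the sketch builds a list of five copies.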

 

Besides the above there are some other very interesting things you can do with array slices. One thing to notice is that the resulting array does not have to occupy contiguous storage. For example, assume the following arrays:

 

   DIMENSION Z(10,10), A(10), C(5)

 

   A = Z(2, :)   ! assigns second row of Z to A.

   A = Z(:, 2)   ! assigns second column of Z to A.

   C = A(2:10:2) ! assigns even elements of A to C.
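A comparable sketch in Python, using a list of lists for the two-dimensional array, might look like the following. The contents of z are invented so that each value shows its own row and column; Python has no column slice for nested lists, so the column is gathered with a loop.

```python
# A 10 x 10 table whose entry at row r, column c (1-based) is 10*r + c.
z = [[10 * r + c for c in range(1, 11)] for r in range(1, 11)]

second_row = list(z[1])                 # like A = Z(2, :)
second_column = [row[1] for row in z]   # like A = Z(:, 2)
even_elements = second_row[1:10:2]      # like C = A(2:10:2)

print(second_row)     # [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
print(second_column)  # [12, 22, 32, 42, 52, 62, 72, 82, 92, 102]
print(even_elements)  # [22, 24, 26, 28, 30]
```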

 

Some interesting problems can be created when an array slice is passed as an argument to a function, because the elements of the slice may not be adjacent in storage the way the function expects. For example, look at the following two statements:

 

   RESULT = STAXK(A)  ! passes array A to function STAXK.

   RESULT = STAXK(A(2:10:2))  ! passes an array slice.
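One such problem is whether the function works on the original elements or on a copy of them. The Python sketch below shows one possible answer: a Python list slice is always a new list, so changes made inside the function never reach the parent array. The function double_in_place is hypothetical and stands in for any function that modifies its argument; other languages may make a different choice here.

```python
def double_in_place(values):
    """Hypothetical function that modifies the list it is given."""
    for i in range(len(values)):
        values[i] *= 2


a = list(range(1, 11))

double_in_place(a)           # the whole list is passed by reference
after_first_call = list(a)
print(a)                     # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

double_in_place(a[1:10:2])   # the slice is a new, temporary list
print(a == after_first_call) # True: the parent list was not changed
```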

 

Negative strides can be used in some languages. In FORTRAN 90, for example:

   A(10:1:-2)   ! elements 10, 8, 6, 4, 2.
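Python also accepts a negative stride. In this small sketch the list a is assumed to hold the values 1 to 10 so that each value matches its element number.

```python
a = list(range(1, 11))   # elements 1..10 stored at indexes 0..9

# Start at index 9 (element 10) and step back by 2, stopping before index 0.
backwards = a[9:0:-2]
print(backwards)         # [10, 8, 6, 4, 2]
```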

 

The syntax for array slices varies by language. For example, if we had an array with the following contents:

   { 0.0, 1.1, 2.2, 3.3, 4.4 }

and wanted to get values 1.1, 2.2, 3.3 the syntax by language would be:

 

Python      x[1:4]

Perl        @x[1..3]

F90         x(2:4)

 

This will be enough to confuse most cross-language programmers. In both Python and Perl arrays start at the lower bound zero, but in F90 arrays start at one by default. In Python the slice goes up to but does not include the listed upper bound, while Perl includes the upper bound. Since F90 arrays start at one and include the upper bound, the range 2 to 4 gets us the values we want.
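The three conventions can be compared side by side in a short Python sketch. The helper functions inclusive_slice and one_based_slice are hypothetical; they only imitate the Perl and F90 bounds on top of Python's own slicing.

```python
x = [0.0, 1.1, 2.2, 3.3, 4.4]

# Python: zero-based start, upper bound excluded.
print(x[1:4])                     # [1.1, 2.2, 3.3]


def inclusive_slice(values, first, last):
    """Hypothetical helper: zero-based, upper bound included (Perl style)."""
    return values[first:last + 1]


def one_based_slice(values, first, last):
    """Hypothetical helper: one-based, upper bound included (F90 style)."""
    return values[first - 1:last]


print(inclusive_slice(x, 1, 3))   # [1.1, 2.2, 3.3]
print(one_based_slice(x, 2, 4))   # [1.1, 2.2, 3.3]
```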

Negative Array Indexes

Several languages allow negative array indexes, which count array elements from the end of the array. For instance, an array index value of -1 will access the last element in the array. Both Perl and Python allow negative indexes. For example, in Perl we could have the following:

 

   @list = (0, 11, 21, 31, 41);

   print $list[-1];   # prints 41, the last value.

   print $list[-3];   # prints 21, third value from end.

 

Python works in a similar way. Both Python and Perl allow negative values for slices too, but the syntax and results are slightly different. A loop could be used with negative subscripts to process an array by iterating backwards.
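A Python sketch of the same idea, including a backwards loop driven by negative subscripts, could look like this. The variable names are chosen for the example.

```python
values = [0, 11, 21, 31, 41]

print(values[-1])    # 41, the last value
print(values[-3])    # 21, third value from the end
print(values[-3:])   # [21, 31, 41], a slice counted from the end

# Walk the array backwards using the subscripts -1, -2, ..., -len(values).
backwards = [values[-i] for i in range(1, len(values) + 1)]
print(backwards)     # [41, 31, 21, 11, 0]
```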

 

Perl Arrays

Perl does some amazing things with arrays besides having negative indexes. We can access several elements at the same time by listing the desired indexes.

 

   @list = (0, 11, 21, 31, 41);

   print @list[2, -1, 3];   # prints 21, 41, 31.

   print @list[1..3];   # prints slice, values 11, 21, 31.

   print @list[-3..-1];   # prints slice 21, 31, 41.

 

The two dots are used to indicate a slice. That is, 1..3 indicates elements 1 to 3. We can also take slices counted from the end with something like @list[-3..-1].

 

Also, in Perl the bracket is an array operator that can be applied to any list. Thus (10,43,25) is a Perl list, and (10,43,25)[1] will extract value 43 (element 1) from the list.

Python Tuples

Python tuples are similar to Perl arrays, but we cannot select multiple elements by listing the indexes (e.g. [2, -1, 3]). Also, Python (and JavaScript) slices do not include the last value. A slice extracts up to, but not including, the 'end' element (if no 'end' is specified, the slice runs through the very last element). Here are a couple of examples:

 

   values = (0, 11, 22, 33, 44)   # named values so the built-in list is not hidden

   print(values[2])     # prints 22

   print(values[-2])    # prints 33

   print(values[1:3])   # prints slice values 11, 22, not 33

   n, m = 1, 3

   print(values[n:m])   # variables can be used.

 

Notice the print statement with the slice [1:3] prints elements 1 and 2, but not element 3. This is different from Perl.
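Although the subscript itself will not take a list of indexes, a similar selection can be sketched with the standard library function operator.itemgetter or with a short loop. This is only one way to imitate the Perl slice @list[2, -1, 3]; the variable names are chosen for the example.

```python
from operator import itemgetter

values = (0, 11, 21, 31, 41)

# itemgetter with several indexes returns a tuple of the selected elements.
picked = itemgetter(2, -1, 3)(values)
print(picked)         # (21, 41, 31)

# The same selection written as an explicit loop over the indexes.
picked_again = tuple(values[i] for i in (2, -1, 3))
print(picked_again)   # (21, 41, 31)
```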

PL/I Label Arrays

Most languages allow arrays of almost any type, including simple types such as integers, floating point, and booleans, and even complex types such as structures, records, and objects. The one item commonly missing is label arrays. PL/I, however, does have arrays of labels. These are discussed in the Labels chapter.

 

Array of Size Zero

Few languages allow arrays of length zero. You may ask why anyone would want an array of length zero. I guess the answer is why do we need zero in other situations? In a language that is able to change the array size during execution of the program, we may wish to start the array with zero elements. But a better reason is if we use a named constant or a parameter passed into a function to set the size of the array, then these may be of value zero, and it would be better if the program did not abend (abnormal end) for this minor problem. For example, suppose this is our FORTRAN array declaration:

 

   dimension rates(SIZE)

 

So the array rates is declared in this wonderful function we are writing, but the variable SIZE is passed from another function. The person writing that other function may be just mean or stupid enough to pass us the variable SIZE with a value of zero. FORTRAN 90 allows arrays of length zero, so the declaration is legal, and the program then needs to do whatever is appropriate in this situation. Does your favorite language allow you to declare arrays of size zero?
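As an illustration of doing "whatever is appropriate", here is a small Python sketch in which the size arrives as zero. The function average_rate and its choice of returning None for an empty array are assumptions made for the example, not a rule from any language.

```python
def average_rate(rates):
    """Return the mean of rates, or None when the array has no elements."""
    if len(rates) == 0:
        return None          # nothing to average; avoid dividing by zero
    return sum(rates) / len(rates)


size = 0                     # imagine this value was passed in by a caller
rates = [0.0] * size         # legal in Python: a list of length zero

print(len(rates))                   # 0
print(average_rate(rates))          # None
print(average_rate([15.5, 16.5]))   # 16.0
```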

 

In Ada, the length of the array is listed in the declaration of a constrained array. An unconstrained array uses <> to indicate that the array length can be anything; the range of the index is then specified at the time of the actual declaration.

 

In some languages we declare arrays with no size indicated and then dimension the array later. For example, Visual Basic .NET has a ReDim statement for this purpose.

 

Perl Shift and Push for Arrays

Perl has done similar things and even more. In Perl programmers can delete or add elements to either end of an array during execution.

 

In both JavaScript and Perl, the size of the arrays can change during execution of the program. Here is an array in Perl:

 

  @days = ("Sunday", "Monday", "Tuesday", "Wednesday");

 

The array has four string elements with days of the week. Taking elements off the left side of the array is often called shifting elements. Adding elements to the right side of the array is called pushing elements. Now we will add two elements at the end of the array:

 

   push (@days,("Thursday", "Friday"));

 

Now our array has six days in it from Sunday to Friday. So our array looks as follows:

 

  @days = ("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday");

 

If we want to remove the first element, we do it as follows:

 

   $weekend = shift @days;

 

Now the variable $weekend has “Sunday” in it, and our array @days goes from Monday to Friday and looks like this:

 

   @days = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday");

 

Perl arrays start at element zero. For the array @days, $#days will provide us with the subscript of the last element, and $#days + 1 will give us the number of elements. JavaScript provides us with similar array methods, and more.
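The same push and shift steps can be sketched in Python. A deque from the standard collections module is used here because it removes elements from the left end efficiently; a plain list with pop(0) would also work for small arrays.

```python
from collections import deque

days = deque(["Sunday", "Monday", "Tuesday", "Wednesday"])

days.extend(["Thursday", "Friday"])   # like Perl's push: add at the right end
weekend = days.popleft()              # like Perl's shift: remove from the left end

print(weekend)         # Sunday
print(list(days))      # ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
print(len(days) - 1)   # 4, the subscript of the last element
print(len(days))       # 5, the number of elements
```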

 

Ragged Arrays

When I was young we only had rectangular arrays, but now some languages have the ability to create a two-dimensional array that is not rectangular. This means the number of elements in each row does not need to be the same. These non-rectangular arrays are called ragged arrays. Languages that can do this create a one-dimensional array where each element is also an array. All ragged arrays seem able to have only ragged rows; no language I know about can have ragged columns. Here is your chance to be famous. Invent arrays of jagged columns and how to use them.

 

C# calls its ragged arrays jagged arrays. Here is an example in C#:

 

   int[][] triangularData = new int [3][];

   triangularData[0] = new int [1] {1};

   triangularData[1] = new int [2] {21, 22};

   triangularData[2] = new int [3] {31, 32, 33};

 

We now have an array where the first row has one element, the second row has two elements, and the third row has three elements and all elements are initialized to a value.
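For comparison, the same triangular data can be sketched in Python as a list whose elements are lists of different lengths. The name triangular_data mirrors the C# example; the traversal loop is an addition to show that each row carries its own length.

```python
# A one-dimensional list where each element is itself a list (a ragged array).
triangular_data = [
    [1],
    [21, 22],
    [31, 32, 33],
]

row_lengths = [len(row) for row in triangular_data]
print(row_lengths)   # [1, 2, 3]

# Each inner loop runs only as far as that row's own length.
for row in triangular_data:
    for value in row:
        print(value, end=" ")
    print()
```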

JavaScript Arrays

 

Ruby Arrays

The scripting languages do some amazing things with arrays: the elements of one array can have different types (including other arrays), no declaration is needed (although one is allowed), and hash arrays are available. So we can set up a Ruby array as follows:

 

a = [ 5, "Hello", 3.14159, true ]

Questions:

  1. Several languages handle array slices, but the syntax is different in each language. For example, some use a colon operator, some use two periods, and still others use a comma. Compare the syntax for array slices and pick the best solution, or come up with your own new solution for specifying an array slice.
  2. Exactly how a slice is produced varies by language. Perl includes the value indicated by the last index, but Python does not. Look at those two languages and some other languages that handle slices and list the differences. Then make a recommendation on how you think array slices should work.

 

 

 

3. Several languages have array operators. One interesting question is exactly how the operations work. For example, suppose we have the following pseudocode:

   int x[5] = {1,1,1,1,1};

   int z = 2;

   x = x + z;

So what happens in this example is that the value of z (which is 2) is added to each element in the array. Thus when done, array x has the value 3 in all the elements.

array x    1 1 1 1 1

add 2      2 2 2 2 2

          ----------

result     3 3 3 3 3

So far so good. But what happens when we change the assignment to this:

   x = x + x[2];

What is the problem here? If you do not see it, find that smart kid in the class and ask her.

What happens when x[2] is modified? Do we then keep using the old value of x[2] to modify the rest of the array, or the new value?

array x   1 1 1 1 1

x[2]      1 1 ? ? ? do we use the old or new value of x[2]

          ---------

result    2 2 ? ? ?

PL/I has the rule that it starts using the new value of the element with the next addition. If I remember right, Ada fixes the value used and does not change it. What do you think the rule should be and why?

4. Ragged arrays are only irregular in length by row in the languages that allow them. Can you design array declarations so that arrays can be ragged by column for OPL? Would this be useful?

 




Exercise: Namespaces for arrays and variables are worth discussion and research. For example, can an array and a variable have the same name? Likewise, can an array and a function have the same name? The answers to these questions vary by language. See if you can find the answers by reading documentation. You may need to write a program to determine the answer.

 

This file is from hhh.gavilan.edu/dvantassel/history/arrays.html

Date last revised February 17, 2008.

Copyright Dennie Van Tassel, 2004.

Send comments or suggestions to dvantassel@gavilan.edu

I am especially interested in errors or omissions and I have other chapters on History of Programming Languages.

 


 


The Ada name for arrays is composite types, since they are composed of elementary items. Ada has two types of arrays. The first type is a normal array where the index is specified (constrained), so we know the size of the array. Then Ada has unconstrained arrays, where the size of the array is not indicated at declaration time.

 



[1]  aaaz, babz, ca – cz.