Design Issues for Arithmetic Operations
Arithmetic Operators by Language
Result of Arithmetic Operations
COBOL
Problems with Minus Symbol
Modulus or Remainder Operation
Operand Evaluation vs. Precedence
Operator Precedence for Several Languages
Coarseness of Operator Precedence
Fixed Decimal Arithmetic (not near done)
NULL and Arithmetic (not near done)
This work is licensed under a
Creative Commons Attribution-No Derivative Works 3.0 United States License.
Copyright: Dennie Van Tassel 2008.
Please send suggestions and comments to dvantassel@gavilan.edu
At first glance most people would assume that all languages agree on the arithmetic operations. The operations subtraction, addition, and multiplication are universally agreed on. After going beyond those three operators there is less agreement. The languages differ on what they do, the operators used, and the results obtained. In this chapter I will discuss some of these interesting differences. In early languages some operations were not clearly defined. Modern languages much more clearly indicate what they do, even if the languages do not agree on what is done.
But before we go into this interesting topic, we need a few definitions so we agree, at least in this chapter, what some terms or words mean. Here is a typical arithmetic expression:
cost + profit
In this expression we have the arithmetic operator plus (+). Operators are used to indicate what type of arithmetic operation is needed, such as subtraction, multiplication, etc. The four most common operators are: +, -, *, and /. The operands in this expression are the variables cost and profit. Arithmetic operands are often variables, but operands can be constants, expressions, parenthesized expressions, and many other things.
One might assume that arithmetic is done the same in most languages, which is partially true. But there are a lot of differences and here are some design issues:
As you will see as you read through this chapter, there are quite a few differences between language families.
There are three categories of arithmetic operators, and they are classified according to the number of operands. The number of operands that are required is called the arity or adicity.
The first category of operators is the unary operators and they have an arity of one. Here are some unary operations:
-B
+Z
abs(z)
count++
Unary or monadic operators have one operand, that is, only a variable (or operand) on one side of the operator. Thus their arity is one. For example, in the first line, which indicates negation, there is an operand (B) after the minus sign, but no operand before the minus sign. In the last line above, there is only an operand (count) before the increment (++) operator. These examples also illustrate that some operators are a single character, but other operators require more than one symbol to express the operation. Functions such as abs are a special type of “operator.”
The second category of operators is binary operators and they have an arity of two. Here are some binary operations:
a + b
d - e
z * b
1 / 2
Binary or dyadic operators have two operands, that is, variables (or expressions) on both sides of the operator. Thus in the above set of examples, on the first line the plus (+) symbol has a variable before it and after it. Likewise, in the last example, there is a 1 before the division (/) operator and a 2 after it. Another term for this type of operator is an infix operator, since the operator sits between its two operands.
The third category of operator is the ternary (3 parts) operator, which is rare outside the C family. The conditional expression (or ternary operator) has an arity of three. Here are examples of ternary statements from the C family:
average = (count != 0) ? sum/count : 0;
(color) ? color = false : color = true;
The conditional expression requires two operators (question mark and colon) and three (ternary) operands, loosely defined as the condition, true part, and false part. All three operands are required. If you compare this statement to the standard if-then-else statement, then the question mark indicates the start of the then part, and the colon indicates the start of the else part. In the first example above the condition is (count != 0). The ternary operation is discussed in detail in the Conditional chapter.
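For comparison, here is how the first C example might look in Python, which spells the conditional expression with the keywords if and else instead of ? and :. This is only a sketch; the function name safe_average is made up for the illustration.

```python
def safe_average(total, count):
    """Return total/count, or 0 when count is zero (mirrors the C example)."""
    # Python's conditional expression: <true part> if <condition> else <false part>
    return total / count if count != 0 else 0

print(safe_average(10, 4))   # 2.5
print(safe_average(10, 0))   # 0
```

Notice that the three operands are still there, just in a different order: the true part comes first, then the condition, then the false part.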
We might say that named constants PI, MAX are zeroadic (zero operands) operations. At least it was a sneaky way to use that interesting word. Also, enumerations may be zeroadic.
There is one more item to point out and that is the placement or the fixity of the operator. We have already seen the infix operator, where the operator is between two operands. Prefix operators are placed before the operand. Here are some examples:
-B ++x
In contrast postfix operators are placed after the operand as these examples show
z++ count--
There are a few prefix operators in our regular life. Two are the plus (+7) and minus (–45) symbols. Mathematics has a few postfix operators, such as 5! for 5 factorial, 4’ for four feet, and 40° for 40 degrees. Can you think of any other prefix and postfix operators outside of programming languages? Finally, I think we have finished defining the terms needed for this chapter. You may find out later in this chapter you need to refer back to this section.
Table x.1 lists arithmetic operations in several languages. As you can see, the languages are in agreement on the operators used for addition, subtraction, multiplication, and real division. For integer division, modulus, and exponentiation there is much less agreement. For example, Pascal and VBScript each have one operator for floating point division and a different operator for integer division.
Operation | Java/C | Pascal | BASIC | FORTRAN | VBScript
Addition | + | + | + | + | +
Subtraction | - | - | - | - | -
Multiplication | * | * | * | * | *
Real division | / | / | / | / | /
Integer division | / | div | / | / | \
Modulus (remainder division) | % | mod | mod | function used | mod
Exponentiation | function used | function used | ^ | ** | ^
Table x.1
Since all languages agreed on the symbol and result for addition, subtraction, and multiplication we will start our discussion where things start to disagree. When we look at division, modulus, and exponentiation we will see they have different rules by language and they even provide different results.
Division is the first operation where the languages differ. Division is more complicated than the other binary operators since we can have integer division (13/3, no decimal points), floating-point division (4.32/7.25, both have a decimal point), and mixed mode division (2/3.5, only one has a decimal point). Each language has its own rules for these divisions.
FORTRAN and the C family will use truncation to calculate:
5/2 => 2
since both operands are integers, integer arithmetic is done. In contrast, BASIC, JavaScript, and Perl will do the following:
5/2 => 2.5
These last two languages do not see integers or reals, but just numbers. Still other languages (Pascal, VBScript) have a separate operator for reals (floating-point) division and a different operator for integer division.
Pascal has two division operators using the slash for real division and div for integer division. Thus 4/3 will result in 1.333333 but 4 div 3 will result in 1. Likewise, VBScript uses different operators for real (/ slash) and integer (\ backslash) division.
Python has added floor division, which “rounds down.” This works with both integer and floating point values. The operator for floor division is //. So the expression
4.0 // 2.3
produces the value 1.0, since 2.3 will divide into 4.0 only once. Floor division solves the problem that integer division and float division produce different results with the same values. For example, in languages that have integer division, 1/2 will produce 0, while 1.0/2.0 will produce 0.5. With floor division, both 1//2 and 1.0//2.0 will produce the same value, zero. Floor division rounds the quotient down to the nearest whole number to its left on the number line. With signed numbers, we get the following:
1 // 2 # result is 0
-1 // 2 # result is -1
In the last line above, -1 is the whole number to the left of -0.5. Floor division has the advantage of being numeric-type-independent.
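The examples above can be tried directly in Python 3. The following is a small sketch that prints floor division next to ordinary division for the same operands; the list name examples and the result list floor_results are just names chosen for this illustration.

```python
# Floor division (//) rounds toward minus infinity for ints and floats alike.
examples = [(1, 2), (1.0, 2.0), (-1, 2), (4.0, 2.3)]

for a, b in examples:
    print(a, "//", b, "=", a // b, "   ", a, "/", b, "=", a / b)

# Collect the floor-division results so they can be inspected afterwards.
floor_results = [a // b for a, b in examples]
```

The integer operands give an integer result and the float operands give a float result, but the numeric value is the same in both cases.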
Integer division is not as simple as it looks. What are the rules for integer truncation? For example, does 43/10 get us 4 with integer division? Put this on a ruler as follows:
---4 ----------5 ----
      4.3
So what we did is choose the integer left of 4.3. So now we can come up with this rule for integer division: For integer division, divide and then pick the integer to the left (towards minus infinity) and throw away any fractional part. So we divide 43 by 10 and obtain 4.3. We throw away .3 and end up with 4.
But what happens when one or both of the operands are negative? Will we handle –43/10 the same way? Put this on a ruler too.
---(-5) ----------(-4) ----
             -4.3
If we choose the integer to the left, that will be –5, not –4! If we always go towards zero, the result is –4. So we could come up with another rule: Always pick the integer towards zero. With this rule, we obtain –4 instead of –5.
Which rule is correct? How about 43/-10 or –43/-10? The answer is not clear and a good argument can be made for either direction. Even Kernighan and Ritchie do not have a clear answer for the question. Here is what they say in the ANSI C edition of their book: “The direction of truncation for / and ... are machine-dependent for negative operands .....”[1] Thus we have two choices for integer truncation and both choices look reasonable.
The choices for integer truncation are as follows:
1. Always pick the integer closest to zero. Thus 4/3 will truncate to 1 and -4/3 will truncate to -1.
2. Always pick the integer towards minus infinity. With this rule 4/3 will truncate to 1 and -4/3 will truncate to -2.
Early FORTRAN used the first rule and most languages followed along to be compatible. Their choice is not necessarily the correct choice. The C family and BASIC have methods to give us both choices. In both language groups, regular integer division truncates towards zero, and thus matches rule 1. The C language floor function and the BASIC int function return the largest whole number that is less than or equal to the argument, and thus match rule 2.
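To see the two rules side by side, here is a short Python sketch. The helper names divide_toward_zero and divide_toward_minus_infinity are invented for this illustration, and the sketch assumes small operands so that the intermediate floating point quotient is exact enough.

```python
import math

def divide_toward_zero(a, b):
    """Rule 1: drop the fraction, moving toward zero."""
    return math.trunc(a / b)

def divide_toward_minus_infinity(a, b):
    """Rule 2: pick the integer to the left on the number line."""
    return math.floor(a / b)

for a, b in [(4, 3), (-4, 3), (4, -3), (-4, -3), (-43, 10)]:
    print(f"{a}/{b}: rule 1 -> {divide_toward_zero(a, b)}, "
          f"rule 2 -> {divide_toward_minus_infinity(a, b)}")
```

The two rules agree whenever the exact quotient is positive and differ by one whenever it is negative and not a whole number.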
The desired sign for the result with integer division is also a question. For example, what sign do you want in these four situations: 4/3, -4/3, 4/-3, and -4/-3? At least the rules of mathematics provide us with a consistent sign, but I am not sure that is the sign we want in all cases.
Exponentiation is the next arithmetic operation that often varies. First, some languages, such as C++ and Java, do not have it as an operator, but require a function. Languages that do have an exponentiation operator use either ** (two asterisks) or ^ (caret). When it is available, it usually looks like:
x**y or x^y
FORTRAN and Perl use two asterisks, while the BASIC family uses the caret. Hordes of Pascal, C++, and Java programmers would gladly have accepted either, but these languages use a function. C++ uses pow(2, 3) and Java uses Math.pow(2, 3). The rumor is that this was done to warn programmers of the expense of doing exponentiation! Two operands are needed whether a function or an operator is used.
Exponentiation is a very complicated arithmetic calculation. Here are some of the possible variations:
2**3 repeated multiplication: 2*2*2
2**0.5 square root of an integer.
2**-0.5 uses inverse, then square root: 1 / 2**0.5, but it cannot be done with an integer result.
(-2.0)**0.5 cannot be done in real numbers.
Besides the different methods of providing exponentiation (function vs. operator), there are major differences on what values are allowed for the operands. If we use this for our discussion:
base^exponent or base**exponent or pow(base,exponent)
then what types of values can be used for the base and for the exponent? Some of the choices are integers, negative values, and real values. Early BASIC restricted the base to positive values and the exponent to integers. Some examples of possible combinations are:
3^2 result 9
-3^2 either 9 or -9, depending on operator precedence.
3^-2 result 1/9
3^0.5 square root of 3.
3^-0.5 1 over square root of 3.
2.3^2 float base, integer exponent
2.3^0.5 float base, float exponent
If I have not made any mistakes all of these are at least well defined mathematically. The first one squares 3, the second one squares negative 3 (maybe, depending on operator precedence). Because of the negative exponent, the third one (3^-2) squares the inverse of 3, so we obtain 1/9.
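As one data point on the precedence question, here is how Python 3 treats a few of these combinations. Python uses ** rather than ^ for exponentiation, and the variable names are only for this sketch. Other languages may well give a different answer for the first line.

```python
# In Python, ** binds tighter than a unary minus on its left.
minus_applied_last = -3 ** 2      # parsed as -(3 ** 2)
minus_applied_first = (-3) ** 2   # parentheses force the negation first
negative_exponent = 3 ** -2       # 1/9, computed in floating point
fractional_exponent = 3 ** 0.5    # square root of 3

print(minus_applied_last)    # -9
print(minus_applied_first)   # 9
print(negative_exponent)
print(fractional_exponent)
```

If you want negative 3 squared, the safe habit in any language is to write the parentheses.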
One question for exponentiation is what type should the result be: integer or floating point. For positive exponents, the operation corresponds to repeated multiplication. For example:
3**4 = 3 * 3 * 3 * 3 = 81
3.0**4 = 3.0 * 3.0 * 3.0 * 3.0 = 81.0
Thus for positive integer exponents the result could be the same type as the first operand (in this example 81 or 81.0).
But when the exponent is a negative integer we can have some problems with the above rule. Exactly what answer we get in that case varies by language. For example, consider the following:
2**-3 => 1/8
What answer would you get if the arithmetic is done with floating point values, and what answer would you get with integer arithmetic? Careful! If we do the arithmetic using floating point we will get the obvious value 0.125, but if we use integers we will obtain zero in some languages, because of integer truncation. That is:
2**-3 => 1/8 => 0.125 // floating point arithmetic
2**-3 => 1/8 => 0 // integer truncation
Try this on some compilers in different languages and see what happens. Thus two reasonable solutions are possible: do the arithmetic in floating point, or do it in integer arithmetic.
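Python 3 is one language you can try this on. The sketch below shows the result type following the operands for positive integer exponents, and Python switching to floating point for a negative exponent. The last line only imitates what an integer-only language would do by computing 1/8 with integer division; that line is my assumption about such a language, not something Python does on its own.

```python
int_power = 3 ** 4        # int base, non-negative int exponent: stays an int
float_power = 3.0 ** 4    # float base: the result is a float
negative_exp = 2 ** -3    # Python switches to floating point here

# Imitation of integer-only arithmetic: 1/8 with integer division.
integer_only = 1 // (2 ** 3)

print(int_power, float_power, negative_exp, integer_only)
```

Running this prints 81, 81.0, 0.125, and 0, which matches the two solutions described above.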
For the mathematically challenged (I looked it up before writing this) some operations are not possible. We cannot mathematically do an operation like this in the real number system:
(-3)^0.5 // can't take square root of negative value.
By contrast, 0^0.5 is fine (the square root of zero is zero), but zero raised to a negative power such as 0^-0.5 is not, since it means dividing by zero. There are probably other combinations. I will leave it as an exercise for you to find out what is mathematically possible.
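If you want a head start on that exercise, here is a small Python 3 sketch that uses the standard library function math.pow, which works only in real numbers and raises ValueError when the result is not a real number. The wrapper name try_pow is made up for this illustration.

```python
import math

def try_pow(base, exponent):
    """Return math.pow(base, exponent), or the error message if it fails."""
    try:
        return math.pow(base, exponent)
    except ValueError as err:
        return f"error: {err}"

print(try_pow(-3, 0.5))   # negative base, fractional exponent: an error
print(try_pow(0, 0.5))    # 0.0, the square root of zero is zero
print(try_pow(0, -0.5))   # zero base, negative exponent: an error
```

Other languages, and even Python's own ** operator, may handle the failing cases differently, so treat this as one experiment rather than a general rule.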
A common rule is that a negative quantity cannot be raised to a real power. This brings us to the odd situation that (-4.0)**2.0 (real exponent) is undefined, but (-4.0)**2 (integer exponent) is defined. Exactly what happens varies by language. Some languages produce a rude error and stop executing, while other languages produce
The use of the minus symbol is the next operation causing problems. The first problem is for the lan