Vector Operations

Overview:

Adds a new concept, the vector, along with its associated operations. The vector concept is that operations on collections of objects may be specified directly, without the use of looping constructs. A similar concept in other languages is mapping in functional languages.

Proposed by:

Jeff Walker (language designer)

Experts to Contact:
jwalker@cs.oberlin.edu

Status: Being Considered

Status Rationale:

Reason:

Within the confines of object-oriented programming, recursion is currently the only repetition method. Object X has the procedural looping constructs only for compatibility with procedural approaches. Vector operations would allow such actions to be expressed in a declarative style more compatible with the spirit of object-oriented programming.

Description:

The vector concept is that operations may be carried out on collections of objects directly. This is achieved by describing those operations in terms of a vector. A vector is the result of using one of the four forms of the vector operator. A vector is not a collection type, and it is not possible to create a property to hold a vector. Instead, a vector is a generalization of an ordered collection of entities of the same type. Vectors have an element type and a length. The following additions would be made to the language to support this concept.

Vector Operator:
The vector operator is the dollar sign ('$') and comes in four forms, covering every combination of an asterisk on either side: '$', '*$', '$*' and '*$*'. The vector operator may be used only inside the subscripting operator ('[ ]'), where it selects a range of subscripts whose values become the elements of the vector produced. The value to the left of the operator is the start and the value to the right is the end; both are inclusive. So 'array[2$9]' would select elements 2 through 9 inclusive, producing a vector of length 8. Variables may be used to specify the bounds, but such bounds cannot be checked at compile time. A value may be omitted from either or both sides, in which case the start or end index of the collection, respectively, is used. This requires that there be some way for the system to determine the length of any collection. This proposal doesn't cover the details, but it is suggested that the appropriate methods be defined and required.

When the vector operator is used with an asterisk on one side or the other, the range should attempt to match that of another vector in the expression. Namely, if it is used to produce a vector that will be assigned to, the length of that vector should equal the length of the vector being assigned to it. It is suggested that the compiler make a best effort to determine this and produce an error when it cannot. This form will not attempt to use indices that are out of the range of the collection. In addition, if multiple vectors in an expression have inferred lengths, the inference may not be possible.
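
The plain '$' form described above can be modelled outside Object X. The following Python sketch is only an illustration: the function name select is hypothetical, and zero-based indexing is an assumption, since the proposal does not say where a collection's indices start.

```python
def select(collection, start=None, end=None):
    """Hypothetical model of collection[start$end] with inclusive bounds."""
    # An omitted bound falls back to the first or last index of the collection.
    first = 0 if start is None else start
    last = len(collection) - 1 if end is None else end
    # Bounds held in variables can only be checked at run time.
    if first < 0 or last >= len(collection) or first > last:
        raise IndexError(f"range {first}${last} is outside the collection")
    return list(collection[first:last + 1])


array = list(range(100, 112))   # indices 0..11 (zero-based by assumption)
vector = select(array, 2, 9)    # models array[2$9]
print(len(vector))              # 8
```

Calling select(array) with both bounds omitted models 'array[$]' and yields every element of the collection.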

In addition it has been proposed that a value could be used with the asterisk form. In this case it would have the meaning of starting from the first instance of the value on the left and running to the last instance of the value on the right. This is the only instance in which the double asterisk form may be used. This would require much more work to specify accurately.
Scalar operations:
When a vector is used in a situation where a scalar of the same type as its elements would be expected (for instance in an addition), the operation is performed on every element of the vector and the expression evaluates to the resulting vector. Evaluation is from beginning to end. Open question: are scalar expressions evaluated only once in this case?
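
As a rough model of this rule, the Python sketch below applies a binary operation between each element of a vector and a single scalar. The name scalar_op is hypothetical, and the sketch assumes the scalar operand is evaluated once, which the proposal leaves open.

```python
import operator


def scalar_op(op, vector, scalar):
    """Apply op(element, scalar) to every element, from beginning to end."""
    # The scalar argument was evaluated once by the caller (an assumption here).
    return [op(element, scalar) for element in vector]


print(scalar_op(operator.add, [1, 2, 3], 10))   # [11, 12, 13]
```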
Member-wise operations:
When a binary scalar operation is applied to two vectors, they must be of the same size. The binary operation is then performed in member-wise fashion from beginning to end. Note that assignment may be used this way if the left-hand side is the result of what would otherwise be an lvalue. This allows one to assign the results of a vector calculation into a collection easily.
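
A minimal Python sketch of the member-wise rule and of vector assignment follows. The names memberwise and assign_range are hypothetical, zero-based inclusive indices are assumed, and raising an error on a size mismatch is one possible reading of "must be of the same size".

```python
import operator


def memberwise(op, left, right):
    """Apply op pairwise, from beginning to end, to two equal-length vectors."""
    if len(left) != len(right):
        raise ValueError("vectors must be of the same size")
    return [op(a, b) for a, b in zip(left, right)]


def assign_range(collection, start, end, vector):
    """Model of collection[start$end] = vector with inclusive bounds."""
    if end - start + 1 != len(vector):
        raise ValueError("vector length does not match the selected range")
    collection[start:end + 1] = vector


a = [1, 2, 3, 4, 5]
b = [10, 20, 30, 40, 50]
# Models a[1$3] = a[1$3] + b[1$3]
assign_range(a, 1, 3, memberwise(operator.add, a[1:4], b[1:4]))
print(a)   # [1, 22, 33, 44, 5]
```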
Vector to scalar operations:
Some operations take a vector value and convert it to a scalar value. These operations are all scalar binary operations performed between the members of the vector. They are written '(x)', where x is some scalar binary operator. For instance, '(+)' would produce the sum of the elements in the vector. These operators are unary postfix operators. Assignment, member selection, pattern invocation and any unary operators cannot be used in this way. These operators cannot be overloaded; instead, their meaning is derived from that of the scalar binary operator. Note that the associativity of the scalar operator determines the order of operation in such uses.
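
The effect of associativity on such a reduction can be sketched in Python. The names fold_left and fold_right are hypothetical; they stand in for a '(x)' operator derived from a left-associative or right-associative scalar operator, respectively.

```python
from functools import reduce
import operator


def fold_left(op, vector):
    """Reduce with a left-associative operator: ((v0 op v1) op v2) ..."""
    return reduce(op, vector)


def fold_right(op, vector):
    """Reduce with a right-associative operator: v0 op (v1 op (v2 ...))."""
    return reduce(lambda acc, x: op(x, acc), reversed(vector))


print(fold_left(operator.add, [1, 2, 3, 4]))   # 10, models vector(+)
print(fold_left(operator.sub, [10, 3, 2]))     # 5, i.e. (10 - 3) - 2
print(fold_right(operator.pow, [2, 3, 2]))     # 512, i.e. 2 ** (3 ** 2)
```

For operators such as addition the grouping does not change the result, but for subtraction or exponentiation it does, which is why the rule about associativity matters.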
Vector operations:
It may be useful to define certain special operators as operating on two vectors. Examples are '[+]' (union), '[*]' (intersection) and '[-]' (difference). Alternatively, they could be spelled '[and]' (intersection), '[or]' (union) and '[xor]' (not intersection). Whether these operators may be overloaded is an open question.
Scalar to vector operation:
With the introduction of vectors it would be useful to have some basic operators for turning scalar values into vector values. The '+-' operator could be used to produce a vector of two elements containing the sum and difference of the two operands respectively.
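
As a small worked illustration of the '+-' idea, the Python sketch below returns the two-element result as a list. The function name plus_minus is hypothetical.

```python
def plus_minus(a, b):
    """Model of a +- b: a two-element vector holding the sum, then the difference."""
    return [a + b, a - b]


print(plus_minus(5, 3))   # [8, 2]
```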

Other useful features for vectors would include producing a vector containing the sequence of values in a range, for example with 'A'$'Z'. Operators such as element-of and subset would also be useful. When two vectors are compared for equality, vectors of different lengths should perhaps simply be unequal. It may also be useful to provide ways to prevent vector operators from working in some cases.

Example:
