Yes, I have not given streams any thought. The reason for this is that while my example accessed the members of the array linearly, in practice access is random, and not all elements are accessed.
The design is significantly more complex than it may seem by just looking at the example above. The actual code is open source, and you’re welcome to have a look. However, it may not be immediately obvious what’s going on.
@lamba92 was asking what the code looks like, so I will share it here with a brief explanation. I don’t actually expect anyone to spend their personal time understanding my code base, so please don’t assume that I want people to solve my problems for me.
At its core, there is the type `APLValue`. Instances of this interface represent scalar values or arrays. The property `dimensions` contains the size of the array (if it's empty, the value is a scalar). If an `APLValue` instance is an array, it's either an actual array of values (implemented by `APLArrayImpl`) or a composite value, for example the deferred result of a computation. In other words, if the user wishes to add two arrays by typing `a+b`, the system will not immediately add these arrays, but instead construct an instance of `ArraySum2Args` which contains references to both arguments as well as the function that is to be applied to their members.

The method `valueAt(p: Int)` returns the value for a given slot in an array, and the implementation of this method in `ArraySum2Args` is what actually performs the computation. In other words, results are not computed until they are needed.
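To make the idea concrete, here is a heavily simplified sketch of that design. The names `APLValue`, `dimensions`, `valueAt`, `APLArrayImpl` and `ArraySum2Args` follow the post, but the signatures and the `APLNumber` helper are my guesses, not the actual code:

```kotlin
// Simplified stand-in for the interface described above.
interface APLValue {
    val dimensions: IntArray          // empty array means the value is a scalar
    fun valueAt(p: Int): APLValue     // value at a given slot
}

// A scalar number (hypothetical helper, not from the post)
class APLNumber(val value: Long) : APLValue {
    override val dimensions = IntArray(0)
    override fun valueAt(p: Int) = this
}

// An actual array of values
class APLArrayImpl(
    override val dimensions: IntArray,
    private val values: Array<APLValue>
) : APLValue {
    override fun valueAt(p: Int) = values[p]
}

// Deferred elementwise combination of two arrays: nothing is computed
// until valueAt is called for a given slot.
class ArraySum2Args(
    private val fn: (APLValue, APLValue) -> APLValue,
    private val a: APLValue,
    private val b: APLValue
) : APLValue {
    override val dimensions get() = a.dimensions
    override fun valueAt(p: Int) = fn(a.valueAt(p), b.valueAt(p))
}
```

The key property is that constructing `ArraySum2Args` is O(1) regardless of array size; the cost is only paid per slot, and only for the slots that are actually read.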
These deferred computations can of course be stacked, such as when a user types a more complex expression.
Now, a common operation is to actually realise the underlying values of an array; in other words, to force the computation of all the values. This is implemented by the method `collapse`, which recursively descends the stack of deferred values and returns an array of all the values.
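A sketch of what such a `collapse` might look like, again using simplified stand-in types rather than the actual KAP code (the `DeferredMap` class is invented here purely to have something to force):

```kotlin
// Simplified stand-ins for the types described above.
interface APLValue {
    val dimensions: IntArray              // empty means scalar
    fun valueAt(p: Int): APLValue
}

class APLNumber(val value: Long) : APLValue {
    override val dimensions = IntArray(0)
    override fun valueAt(p: Int) = this
}

class APLArrayImpl(
    override val dimensions: IntArray,
    private val values: Array<APLValue>
) : APLValue {
    override fun valueAt(p: Int) = values[p]
}

// Hypothetical deferred elementwise computation over one argument.
class DeferredMap(private val fn: (APLValue) -> APLValue, private val arg: APLValue) : APLValue {
    override val dimensions get() = arg.dimensions
    override fun valueAt(p: Int) = fn(arg.valueAt(p))
}

// collapse forces every slot, recursively descending through stacked
// deferred values, and returns a fully materialised array.
fun collapse(v: APLValue): APLValue =
    if (v.dimensions.isEmpty()) v
    else {
        val size = v.dimensions.fold(1) { acc, d -> acc * d }
        APLArrayImpl(v.dimensions, Array(size) { i -> collapse(v.valueAt(i)) })
    }
```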
The plan all along has been to implement parallel evaluation of the `collapse` function. But before doing so, I found myself implementing a specific operation that could benefit from parallelism, so I posted a question here not long ago asking for design advice around coroutines. Here's the discussion: Coroutines design advice
As a result, I ended up with the implementation shown below.
This implementation worked fine until I decided to implement parallelisation for `collapse`. This code has not been committed to Github yet. Its implementation looks similar to the outer loop in the example I posted in the initial post. Once this was implemented, everything worked fine until the test case that exercises the "scan" operator on large arrays hit a deadlock, because the two implementations are not compatible with each other.
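For illustration, here is a guess at the general shape of such a chunked parallel materialisation. The actual code uses Kotlin coroutines; this sketch uses a plain `ExecutorService` only to stay dependency-free, and every name in it is invented. The deadlock risk it hints at is the classic one: if the per-slot computation itself blocks waiting on work scheduled on the same saturated pool, no thread can make progress.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical chunked parallel loop: split the index space into chunks,
// compute each chunk on a worker, then wait for all chunks to finish.
fun parallelMaterialise(size: Int, chunkSize: Int, compute: (Int) -> Long): LongArray {
    val result = LongArray(size)
    val pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors())
    try {
        val futures = (0 until size step chunkSize).map { start ->
            pool.submit(Callable {
                val end = minOf(start + chunkSize, size)
                for (i in start until end) result[i] = compute(i)
            })
        }
        // Future.get establishes happens-before, so the writes to result
        // made by the workers are visible after this loop.
        futures.forEach { it.get() }
    } finally {
        pool.shutdown()
    }
    return result
}
```

If `compute` here were itself to dispatch onto the same pool and block on the answer, as a nested `collapse` might, the chunks holding all the worker threads could end up waiting on work that can never be scheduled.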
Here are the implementations of the functions mentioned above: