Sum of vectors

Hi,

I would like to calculate the sum of geometric vectors stored in an array. Aside from hand coding it, I am unable to figure out a way to do this in a reasonable way. I expected to find a library function that took a lambda. I assume I’m missing something?

For example:

    val vectorArray = arrayOf(Point(0, 1), Point(2, 3), Point(4, 5), )

    val vector: Point = // == Point(6, 9)

The result should be of type Point since adding vectors results in another vector.

Thank you.

Well, we can do this in many ways, for example:

val vector = Point(vectorArray.sumOf { it.x }, vectorArray.sumOf { it.y })
// or: sumOf(Point::x), sumOf(Point::y)

Above is possible only because this is a rather specific case. Generally, this kind of transformations is called reducing and is usually solved as below:

val vector = vectorArray.reduce { v1, v2 -> Point(v1.x + v2.x, v1.y + v2.y) }

Even better, we can create plus operator for vectors:

operator fun Point.plus(other: Point): Point = Point(x + other.x, y + other.y)

val vector = vectorArray.reduce { v1, v2 -> v1 + v2 }
// or:
val vector = vectorArray.reduce(Point::plus)
4 Likes

I’ll give that a try. I did see reduce, but the name reduce didn’t make me think it was related to what I wanted. Reduce doesn’t strike me as related to sum, accumulate, combine,… or the like.

I was thinking of the first option, but it seemed like 2 rounds of iteration were not efficient and I figured there was something better anyway.

BTW, I’d like to complain how the docs show the description at the end of a long list of function parameter listings. Reduce is a perfect example. I can’t imagine why the docs don’t show a short description first. When I don’t know what I’m seeking, I’m less likely to scroll down a list to see if I have the right function when the name isn’t close to what I would expect. At best that’s annoying even when I think it might be what I want.

Rant over… :slight_smile: Thanks.

KMath-geometry defines arithmetic operations on 2D and 3D vectors as well as bulk operations. But it is in a prototype stage right now. If you want some features added, please create an issue.

1 Like

This is a very common terminology. Such operation is called reduce in I think every language I tried, e.g.: Java, C# (edit: actually, not true - in C# it is Aggregate), Python, Javascript. In the big data we also usually talk about map-reduce algorithms, although this is a little different.

2 Likes

Well, I don’t know Java, C#, Python, or Javascript, so I can’t speak to what other languages do. I still maintain the name “reduce” doesn’t clearly imply “sum” and in some ways the opposite because to reduce something is to make it less.

I’m guessing the reduction is with regard to the number of elements, but that is not helpful to a user looking to sum elements. The name “reduce” in this case is how it performs the action, but that exposes implementation in the name and does not focus on what it does. A name is the first and foremost communication to the user as to what it does and not how it does it. Implementation should never be exposed to an interface.

I have used other languages, but C++ is my best known and there it is accumulate for what that’s worth:

template <class InputIterator, class T>
T accumulate (InputIterator first, InputIterator last, T init);

The Wikipedia pages you provided are with regard to parallel processing or recursive processing, which this case has nothing to do with what I’m seeking, so I fail to see the connection. I don’t care about the steps and techniques used to compute a sum more than I care that it computes a sum.

All these arguments (including mine) are not helpful since Kotlin isn’t another language. If you want to argue a Java programmer would expect the name reduce, there is some weight to that argument since Kotlin and Java are closely associated languages. If you want to make this argument, then why does Kotlin not use the keyword “switch” instead of “when”? The word “when” in English is about “time” and Kotlin’s “when” is not run on a interrupt nor is it temporal in any way. So in this case Kotlin used a different name and it is worse.

The only argument one should make on the name of a function is - is it clear to common use cases. My argument is and has been, adding together objects that are not numbers (vectors, UDTs) is a common use case. These docs are poor in two ways.

First, the name isn’t clear. An opinion I should think others share, but an opinion none the less.

Second, and most importantly, the docs have a long list of function signatures before describing what the function does and when it finally does, the docs say, and I quote

" Accumulates value starting with the first element and applying operation from left to right to current accumulator value and each element."

So I ask you, if the name reduce is clear, then why is “accumulate” the first thing in the description when you finally scroll past the signatures? When the description of a function uses a word to describe it, it seems that word better describes what it does.

Regardless of all of this, the docs should not first have a long list of function signatures before telling the reader what the function does.

This reply make me dislike Kotlin more because it makes me think the engineers at Kotlin don’t care about their users. I sincerely hope you are wrong, but I suspect you are right.

Well, it seems this is a matter of taste and/or personal experience. For me “reduce” is the first name that comes to my mind. Not because Java used that name, but because… well, my impression is that the whole world uses “reduce” as the first choice. If I use an entirely new technology, I need to perform this kind of operation, then I look for “reduce” and in 95% cases it will be there, doing exactly what I would expect. I just checked some random framework that I don’t know very well - ReactiveX. Of course it has the “reduce” operator and it does exactly this. So for me “reduce” is like “for”, “while” or “if” - it is a de facto standard nowadays.

Of course, there are synonyms: fold, aggregate, accumulate - I would check them next if I wouldn’t find “reduce”. But from my experience “reduce” is usually the first choice. And regarding “sum” - it doesn’t make any sense at least to me :smiley:

I guess the main reason this operation is called “accumulate” in C++ is because it was added 20 years ago. Today this name is kind of obsolete, but it doesn’t make sense to rename it just to keep it more modern. BTW, you probably already know that C++17 added a very similar function to good old “accumulate()” and of course they named it “reduce()”. So I don’t understand why you are so surprised about that name.

3 Likes

As far as I can see std::accumulate in C++ has not been obsoleted. Where did you hear that?

Unless I desired parallel processing in C++ I would still use std::accumulate. In my case I do not want the data processed in parallel. Please explain your obsolete comment.

Are you saying reduce in kotlin is using parallel processing? I’m not quite reading that in the docs, but maybe I’m missing something?

As for my confusion in the name… Maybe if you read the reduce docs and saw how many times they used the word accumulate to describe what this function does, you’d think differently about the clarity of the name?

Or… maybe the English definition would help you:

Reduce: make smaller or less in amount, degree, or size.

Accumulate: gradually gather or acquire (a resulting whole).

Now, step away from any code or programming language and stick to math and English for a second. Now ask yourself in that context which one better describes a sum of values?

I don’t know how I can be any clearer.

I didn’t say “accumulate” function in C++ is obsolete. I said “accumulate” name for this kind of operation is kind of obsolete in the industry, because everyone nowadays seem to use “reduce” or “fold”. New languages or frameworks adapt these names, while old languages and frameworks may still use other names. At least this is my observation.

But why sum of values? What if I need to calculate a median, create a histogram or find a set of unique person names? What if I need to reduce a collection of people into a single value of something - should I say in English that I “sum people”? “Reducing” a collection fits all such cases well. “Sum” makes sense only in very specific cases where we actually sum something.

No, it isn’t. From the very beginning I say that reduce, fold and accumulate are pretty much synonyms. Some of them are more common than others - that’s it.

2 Likes
  • But why sum of values? What if I need to calculate a median, create a histogram or find a set of unique person names?

I’m adding 3 vectors. That’s the question I asked.

  • should I say in English that I “sum people”?

No, but should I say in English that I “reduce people”?

  • From the very beginning I say that reduce, fold and accumulate are pretty much synonyms. Some of them are more common than others

In C++ reduce is for parallel processing and is not a synonym for accumulate. Parallel processing isn’t used in common circumstances in my experience. The Wikipedia page you sent says: “…that is commonly used in parallel programming…” I had to ask if Kotlin’s reduce method is run in parallel.

Sorry, this discussion became too long and a little too absurd to me :slight_smile:

In the first Wikipedia article we have a list of languages and other technologies that support this kind of operation. For 46 languages maybe 40 use either “reduce” or “fold” name. Some use “inject”, some have an entirely custom syntax . Ahh, yes and there is also one “accumulate” in the list :wink:

For me this discussion is like you asked why classes are called “classes” in Kotlin. Then you argue this is not a good name and we should use a different one.

2 Likes

@bessermt: I guess there is a small misconception about the concept at hand:

Definition: A reduction is an operation that reduces an array to one scalar (a value of the element type of the array).
A sum is one example of a reduction, but there are other reductions like the product or the length.
A sum is a reduction, but not every reduction is a sum.

Therefore, “reduce” is an adequate name for this method, whereas “sum” would be wrong in the same way as naming a super-class for all animals “Dog” would be wrong.

So, @broot gave you the information, that there isn’t an exact library function for “sum”, but the library function “reduce” can be used to compute the sum.

This should clear up most of the discrepancies in this thread, e.g.:

(Note: I shorted the quotes from bessermt, please see his original posts for the full length statements.)

Which is a good thing like described above. It shouldn’t imply sum, since it is not necessarily a sum.
If you do a lot of sum operations, it makes sense to define and use a dedicated sum method:

operator fun Point.plus(other: Point): Point = Point(x + other.x, y + other.y) // see broot above
fun Array<Point>.sum(): Point = reduce(Point::plus) // definition
val vector = vectorArray.sum() // call

No. The name “reduce” in itself does not expose the implementation, it only exposes that it will reduce an array to one scalar and uses the given operator to do so. Similarly, the name “sum” for the self-defined method above would expose to the user that it will “reduce an array to one scalar by summing the elements up”.
In both cases, implementations can differ: it can be done in parallel or using a sequential loop or recursion or other means.

I guess the person that wrote the documentation followed the rule “don’t describe a word by using the same word”. If a user understands the name “reduce”, she probably doesn’t have to look into the documentation. If the user does not understand the name “reduce”, he will look into the documentation and probably would think it does not help him to just get the method name thrown at him again.
As you pointed out, accumulate is another common word for that operation, so if one does not know “reduce”, there is a chance that using “accumulate” in the documentation helps in understanding what the method does.
Another possibility would have been, to add synonyms at the end of the description, but then you would only get to them at the end. It has advantages and disadvantages.

Now, that this is hopefully settled, I think there are only two points left:
1.

For the complexity, only nesting of iterations is really relevant. Doing two loops after each other doesn’t really make much of a difference. So, it makes sense to prefer readability in these cases.

I guess that’s the point you should focus on. This is probably the same problem for nearly all functions that have hardcoded additional overloads for all primitive types. Do you have an idea how to improve that?

5 Likes

Sorry, it didn’t clear it up.

  • I guess the person that wrote the documentation followed the rule “don’t describe a word by using the same word”. …

That rule is for descriptions of English words using English words. It is this recursive nature that is the problem the rule solves. There it makes sense. In the case of describing function names, the docs are using English words to describe a function, so there is no such problem. Therefore the word that most succinctly describes what the function will do can be used in both the name and description.

  • it only exposes that it will reduce an array to one scalar …

Minor point. Summing vectors doesn’t create a scalar by definition. I think you mean “one value”.

  • … and uses the given operator to do so…

Sum is a result that gives the same result no matter what method you use. Reduce and fold describe how a given operation is accomplished. I think reduce gives non-deterministic results for non commutative operations like subtraction from what I read. This is implementation from sum’s level of abstraction. Sure, it’s descriptive from it’s own level of abstraction, but that a tautology so not helpful to this discussion. My point on the name is that it is at the wrong level of abstraction for computing a summation, hence the reason I didn’t find it.

Maybe this will explain better. You said:

  • If you do a lot of sum operations, it makes sense to define and use a dedicated sum method:

I expected this to already exist. Just like C++

template< class InputIt, class T >
T accumulate( InputIt first, InputIt last, T init );

Where operator + is the default operator.

I did find

And that led me to think there would be a level of abstraction to fit my requirement to sum an array of vectors. I was wrong. This entire problem of the name is one of description at the desired level of abstraction.

  • Do you have an idea how to improve that?

Yes, put a short description at the head of the docs and put the method signatures at the end. A reader is probably needing a quick read to find out if the function/feature is relevant to their needs so they don’t have to scroll down though numerous options. Put yourself in a programmer’s position. They want to achieve a goal but not sure if something exists to help them. They will try to look though lists of possible candidates and then read docs for those that look promising.

The signatures are helpful, but only after a short description and examples.

The problem with broot is the nature of their posts.

  • the whole world uses “reduce” as the first choice
  • I said “accumulate” name for this kind of operation is kind of obsolete in the industry, because everyone nowadays seem to use “reduce” or “fold”

Using words like “everyone” and “whole world” is extreme language often used as a way to win an argument by shutting down differing opinions that don’t match. It is an unfair fighting tactic. Saying “accumulate is kind of obsolete” and “nowadays” is simply an attempt to insult.

I understand what broot is doing. I don’t think you are clearing up anything because there is nothing to clear up. I’m not looking for an argument and I’m definitely not looking for insults. I got my answer and I don’t like it. If I’m not clear on my opinion or I’ve misunderstood something, I’m listening. If I’m going to be told I’m wrong for my opinion to justify another opinion, I’m not interested.

I not only used words like “everyone”. I provided specific examples and numbers that wasn’t prepared by me or by people of similar experience to mine, so I believe these numbers shouldn’t be biased in any way.

Also, this is not that kind of discussion where I say eating of meat is good, because “everyone do this”. When choosing names for newly created language, one of the most important factors is how others named same things. This is to use a common language across the whole industry. So “because everyone …” is a valid argument here.

And finally: you asked a question, why is it called “reduce” in Kotlin. I didn’t take any part in designing Kotlin, but I genuinely believe the main reason was because this is the most common name for this transformation. I basically answered your question according to my knowledge and understanding. I have no idea what’s wrong about that.

3 Likes

Currently, it would be hard to do the same in Kotlin, because we don’t have an interface for “summable” items. Operators in Kotlin have no representation in the type system, so I believe we can’t create a function, that receives a collection of arbitrary items that can be summed with each other. This is one of things that I miss in Kotlin. Existing sum() functions in stdlib are overloaded for each type that is known to be summable.

2 Likes

Disclaimer: I am a fellow software developer - just a user of Kotlin. I’m not involved in the language design process.

I described above in detail why it can be helpful in documentation, too. Documentation that just reiterates the function name is mostly not helpful. But we can disagree on this point, that’s fine.

No. “one value” does not describe what type it is, so it can be basically anything. But the reduce operation returns a scalar of the array.
Here, in your example, we deal with an array of vectors of numbers - a collection of collection of number. Therefore, I can see that the term scalar can be confusing. That’s why I gave the definition above at the first position where I used the term scalar:

So, for an array of vectors, a scalar is a vector.
For a vector the scalar is a number, but in all instances of my post I made it very clear that it is the scalar of the array.

No, reduce is deterministic. The documentation describes the return value of the method in a technical way like one would do in mathematics, but it is still open for various implementation possibilities.
But because of the possibility of non-commutative operators, the documentation needs to specify in which order the applications of the operator are happening - otherwise the user wouldn’t know what the function returns.

And my point is that it is not a summation at all (in general). So, independently from the degree of abstraction, the term “sum” is wrong for it.

Think about a developer that needs to implement the product. He is in the same situation as you with the sum. His desired method does not exist and he has to use the reduce method for it.

===

I understand that you would prefer if there was a method “sum” in the library for an array of vectors. That is a design decision that makes sense if this particular method is used often enough and the possibility to define it yourself with the simple lines given above aren’t sufficient.

However, currently there is no such method. Instead @broot showed you how to easily compute the sum using the reduce method. And I explained to you why naming the existing reduce method “sum” - as you suggested - would not make any sense.

So, that’s two very different things. I see why you want the first one and I clarified why the second does not make any sense.

I hope you agree on this second point (“sum is not an accurate name for reduce”), otherwise I would leave the discussion, because it would mean that I cannot help you in any way.

5 Likes