StringBuilder.indices not updated after char removal

giacomotommaso.petrucci · June 27, 2020, 7:49pm

Hello everyone.
I’m new to Kotlin and I’m still playing with it to learn it. As an exercise, I was writing a function to check if two strings are an anagram of each other. To implement this my idea was to go through the first string and, for each character, delete the first occurrence of that character from the second string. I also delete spaces since I want to ignore them. Then if the second string is empty the two strings are anagrams, otherwise not. This is my function:

fun anagram(string1: String, string2: String): Boolean{
if (strlenWithoutSpaces(string1)!=strlenWithoutSpaces(string2)) return false
var stringBuilder1= StringBuilder(string1)
var stringBuilder2=StringBuilder(string2)

for(index1 in stringBuilder1.indices){
    if (stringBuilder1.get(index1).equals(' ')) continue
    for (index2 in stringBuilder2.indices){
        if (stringBuilder1.get(index1).equals(stringBuilder2.get(index2))) stringBuilder2.deleteCharAt(index2)
        else if (stringBuilder2.get(index2).equals(' ')) stringBuilder2.deleteCharAt(index2)
    }
}

if(stringBuilder2.isEmpty()) return true   //string2 can be empty and not an anagram of string1 only if they differ in size, but I've checked for it at the beginning
return false

}

The problem is that this seems to work only if the second string is the first string reversed, otherwise if it contains the same characters but not in reversed order I get a StringIndexOutOfBoundsException in the nested for loop! I also tried some variations using for (index2 in 0…stringBuilder2.lenght-1) and for(index2 in 0 until stringBuilder2.lenght) but with the same result. So seems that the length of a StringBuilder get chaced when is used in a for statement or it isn’t updated when I remove a charachter, but why is that? Shouldn’t the indices and the lenght properties be used exactly to avoid out of bounds exceptions? Am I doing something wrong? If you wish to play with it, here’s a complete example: pastebin

Thanks for your help.

naxymatt · June 28, 2020, 2:02am

The problem is when you delete a character, the indices change. Consider the string “ABC”. When you are at index 0, if you delete “A”, then “B” is now in position zero; but your loop moves on to position 1, so you skip “B”.

An easier way is just to convert each string to a character array, filter out any characters you don’t want to consider (for example, spaces), and sort the arrays:

fun String.isAnagramOf(other: String) = 
    toCharArray().filter{ it != ' '}.sorted() == 
        other.toCharArray().filter{ it != ' '}.sorted()

Or if you prefer to avoid repetition…

private fun String.toNormalizedCharArray() = toCharArray().filter{ it != ' '}.sorted()
fun String.isAnagramOf(other: String) = toNormalizedCharArray() == other.toNormalizedCharArray()

giacomotommaso.petrucci · June 28, 2020, 8:06am

Thanks for your explanation. That wasn’t the right way to check if two strings are anagrams, but I still don’t understand why I was getting the IndexOutOfBoundException. In my understanding when I write

for (index2 in stringBuilder2.indices)

for each iteration first it checks if index2 is a valid index for stringBuilder2 and only if it is valid proceeds to execute the loop’s body. But in my case it gets executed even if index2 isn’t a valid index. Am I missing something?

Wasabi375 · June 28, 2020, 8:16am

Noo it doesn’t. There is nothing special about .indices . It’s just a simple IntRange. This means what your code does is somethig more like this java code

IntRange indices = stringBuilder2.indices;
for(int i =  0; i < indices.length; i++) {
    int index2 = indices.get(index2);
    // your code
}

Note that the for loop in this case doesn’t use a simple int to iterate over the indices array and instead uses an Itertator object to iterate the indices. So this example code is not an exact translation of kotlin to java but just an explanation of how it works.

giacomotommaso.petrucci · June 28, 2020, 10:12am

Thanks. I knew that StringBuilder.indices returns an IntRange, but I was expecting it to be called before each iteration. I still feel that this shouldn’t be the behavior inside a for statement, but now I understand what’s going on.

Wasabi375 · June 28, 2020, 10:52am

Even if it was called for each iteration how would kotlin know which entry was removed? You could avoid an IndexOverflowException but there would be no way to guarantee that you wouldn’t skip any element. Iterating over lists that are changed at the same time is hard and there is no default way to do it since you could add/remove elements anywhere.
If you want to use a loop instead of @naxymatt’s version you could use while(stringBuilder2.isNotEmpty) and then always remove the first character. Also you can stop the loop early if you can’t remove a char, because you already know at that point that the string isn’t an anagarm.

Also you don’t need to use a.equals(b) in kotlin. Kotlin doesn’t treat == like java does.
If you compare java and kotlin equals you get the following

Kotlin	Java
a == b	a.equals(b)
a === b	a == b

The short is that `a == b` is true equality (based on the equals implementation) while `a===b` is reference equality.

giacomotommaso.petrucci · June 28, 2020, 6:12pm

Thanks for your suggestions. Of course Kotlin can’t guarantee that I wouldn’t skip elements, my implementation was wrong. But I think that the point of having an indices property is to avoid out of bounds indices. By not calling it at every iteration Kotlin is greatly reducing its power without gaining much: it’s faster, but if you want to avoid computing the range at each iteration you could easily write something like

val indexRange=stringBuilder.indices
for(index in indexRange){
...
}

and still having the benefit of not risking to go out of bound in the general case. Of course it’s only my opinion, but I think that not accessing the indices property at each iteration should be seen as an optimization to be done only if we’re dealing with immutable objects, like when we’re iterating over a String, or when the compiler can infer that the object being accessed isn’t being modified. But I also get that iterating and modifying a list at the same time isn’t a good practice, so this could be a reason to optimize the check away, but a reason I don’t completely agree on.

ebrowne72 · June 29, 2020, 6:41pm

If what you want is that for (foo in bar) evaluates bar on every iteration of the loop instead of just once at the beginning of the loop, that’s a huge change. Evaluating bar may have side effects, and a change to the language like that would break a lot of code out there.

giacomotommaso.petrucci · July 8, 2020, 5:15pm

Yes, I know that once it’s decided to do it in one way or another there’s no going back. That was the behavior I was expecting, but I’m not stating that they have to change it. Now I know how it works and I can deal with it.

Topic		Replies	Views
StringBuilder does not compile in JS JavaScript	1	960	May 17, 2018
Reverse String Support	7	3793	March 30, 2021
Is it possible to replace all loops while with a loops for in this code? Support	2	1080	November 28, 2021
StringBuilder.replaceRange() Language Design	3	1252	March 18, 2019
`for` loop with dynamic condition	12	34000	August 12, 2022

StringBuilder.indices not updated after char removal

Related topics