kotlin icon indicating copy to clipboard operation
kotlin copied to clipboard

Add startsWithAny and endsWithAny

Open ByteZ1337 opened this issue 3 years ago • 26 comments

Just some functions I would find useful.

ByteZ1337 avatar Jul 18 '20 10:07 ByteZ1337

Why not just use any(string::startsWith) or with arguments? I don't think this would be helpful if it can be replaced just with one call.

Also, the name "startsWithAny" is confusing, in my opinion the semantics could be better if it were Iterable<CharSequence>.containsPrefixOf(str:CharSequence).

fee1-dead avatar Jul 22 '20 08:07 fee1-dead

Why not just use any(string::startsWith) or with arguments? I don't think this would be helpful if it can be replaced just with one call.

The same reason you use Collection#addAll instead adding every element with forEach. Methods like these are just way more convenient to use and make the code way more readable than using obscure lambdas in if statements.

Also, the name "startsWithAny" is confusing

How so? I think it's pretty easy to understand. I don't really see an issue with the name since there are other methods called "any" in the Standard Library.

Iterable<CharSequence>.containsPrefixOf(str:CharSequence)

I think the receiver should stay CharSequence since the actual string is checked. Again I use Collection#addAll not Collection#addAllTo

ByteZ1337 avatar Jul 22 '20 15:07 ByteZ1337

Methods like Collection#addAll are way more convenient to use and make the code way more readable

Nope, that's not the case at all. having that method helps with increasing performance, for example if it were an ArrayList it checks the capacity of its backing array and everytime the method add is called it increases the capacity by copying the array if necessary. Using addAll will make the ArrayList perform just one copy action. The method was there for performance, not convenience.

Also, I don't see why this function should be in the standard library. All the functions from kotlin.collections.Collections.kt / kotlin.text.Strings.kt contain their own implementations, not one-liners. I also don't understand why this function should be in the standard library, because you can just create these function in your own program.

fee1-dead avatar Jul 22 '20 16:07 fee1-dead

Sill makes it way more readable than obscure lambda that you don't instantly understand.

contain their own implementations, not one-liners.

What about the empty and null checks? Those could also easily be implemented but still got into the standard library. And I also wouldn't call it its own function. It's just an extension of startsWith. With your logic methods like substringAfter would also be one liner functions.

I also don't understand why this function should be in the standard library, because you can just create these function in your own program.

Well obviously you can implement everything yourself in Strings.kt. Iirc these functions exist so you don't have to implement them every time you need them. I really don't understand why you're fighting this so much? They are just 2 functions that are useful and don't make any major changes to the standard library.

ByteZ1337 avatar Jul 22 '20 16:07 ByteZ1337

Why are you fighting this so much?

I am not fighting this, I am asking why this should be in the standard library. If you could explain your use case it would be great. I tried to ask you twice, but you do not seem to understand this at all. Why would users look for such function that does this in the standard library?

Also, looking at contributing.md, you did not:

  • Talk to the jetbrains team on the kotlin slack
  • Create an issue on YouTrack
  • Include a sample using @sample macro (example)

What about the empty and null checks?

Null checks are helpful because when using the value from elvis/if expression you need to use parenthesis.

Sill makes it way more readable than obscure lambda

How is it more readable? How is string.startsWithAny(listStrings) more readable than listStrings.any(string::startsWith)?

I think the names are pretty easy to understand

The names are confusing, you think they are easy to understand because you know what they do. By just looking at the function names from the title, I had no idea what these functions are for.

fee1-dead avatar Jul 23 '20 03:07 fee1-dead

I think the function names are easy to understand and improve readability greatly. Just having string.startsWithAny instantly tells you that you check if your string starts with any (of the following) strings. If you instead would use listStrings.any you have no idea what you exactly check for. I remember starting in kotlin I was confused by this method and still think that the naming is very poor.

If you just skim through a program the method string.startsWithAny tells you faster and more clearly what is does compared to listStrings.any(string::startsWith)

flimosch avatar Jul 23 '20 12:07 flimosch

First of all, if you're neutral and just curious about why this should be implemented why did you dislike my post? Second of all, I'm having an open discussion with you right now, aren't I? You're giving me your opinion and I'm telling you mine and respectfully disagree.

For my use cases :

  • Checking file extensions,
  • Parsing a file that has comments but has multiple options to start a comment.

I tried to ask you twice, but you do not seem to understand this at all.

Sorry if I missed it but I never saw you explicitly asking for these?

I didn't open an issue or talk to the employees because I thought 2 small functions would just waste their time. So I just opened this pull request and see if it gets merged. If this was wrong I would like to apologize. I did however not see the part with sample code which I will probably add to this pull request tomorrow.

And I don't know what your "educated guess" is supposed to mean? (Getting a bit personal there huh) I did build it by following the instructions in the readme. But I don't understand why I should build my own forked standard library. Since I mostly do open source stuff this wouldn't make sense. Also, as you mentioned in your second comment I can just implement it myself which would make building the whole standard library a bit overkill.

Also, if you find these names confusing you should probably open an issue to rename CharSequence#indexOfAny.

ByteZ1337 avatar Jul 23 '20 15:07 ByteZ1337

Checking file extensions:

val extensions = listOf("txt","exe")
val file = File(...)
file.extension in extensions
val fileName = "foo.bar"
fileName.substringAfterLast('.', "") in extensions

Parsing a file: I do not think that is the average use case.

fee1-dead avatar Jul 24 '20 03:07 fee1-dead

Or

path.endsWithAny("exe", "txt", ignoreCase = true)

Which is way nicer to look at and supports ignoreCase.

ByteZ1337 avatar Jul 24 '20 09:07 ByteZ1337

nicer to look: O(n) contains in hashset, toLowerCase for ignorecase: O(1)

fee1-dead avatar Jul 24 '20 11:07 fee1-dead

So I ran

println("HashSet check: " + measureTimeMillis {
      val extensions = hashSetOf("exe", "json", "xml", "bin", "kt", "jar", "txt")
      val file = File("test.txt")
      file.extension.toLowerCase() in extensions
})

and

println("endsWithAny: " + measureTimeMillis { "test.txt".endsWithAny("exe", "json", "xml", "bin", "kt", "jar", "txt", ignoreCase = true) })

independently from each other and got these results (average of 10 runs):

HashSet check: 31
endsWithAny: 15

I ran this code on play.kotlinlang.org to make sure it's not because of my hardware or some other stuff.

When adding more extensions to the list both functions take longer at a pretty equal rate. So the time complexity seems to be O(n) for both methods.

Here is the link if you want to try it yourself. Just uncomment the version you want to try.

ByteZ1337 avatar Jul 24 '20 12:07 ByteZ1337

A quick benchmark suggests otherwise.

fee1-dead avatar Jul 24 '20 15:07 fee1-dead

benchmark is rigged

x4e avatar Jul 24 '20 15:07 x4e

benchmark is rigged

@cookiedragon234 how so?

fee1-dead avatar Jul 24 '20 15:07 fee1-dead

Well mine shows the actual time it takes to run, not some benchmark

ByteZ1337 avatar Jul 24 '20 15:07 ByteZ1337

Also javadoc has tons of spelling mistakes you might wanna fix

x4e avatar Jul 24 '20 15:07 x4e

Something other than the missing plural? I copied the doc from the other methods and just modified it a bit.

ByteZ1337 avatar Jul 24 '20 15:07 ByteZ1337

Kotlin uses this benchmark plugin, ASM uses JMH, which is what kotlinx.benchmark(the plugin i'm using) under the hood, doubts?

fee1-dead avatar Jul 24 '20 15:07 fee1-dead

Still, why shouldn't mine be perfectly ok? It's the time it takes to execute the tasks. Also if its rigged then it means nothing either way (Don't have time to check rn).

ByteZ1337 avatar Jul 24 '20 16:07 ByteZ1337

ok, sorry that I wasted your time when my benchmark is rigged and your tests are perfectly fine. I know your intentions now. I will not fight this when you just wanted to get your functions merged. I'll just do what I usually do, to enhance my own definition of confirmation bias and bigotry.

fee1-dead avatar Jul 24 '20 16:07 fee1-dead

Looking and running your benchmarks they seem fine. (You use startsWith instead of ends if but adjusting the string length to account for that doesn't change the speeds too much). I still would like to have that functions, maybe just with more efficient methods. (This will be hard to implement because your methods can only be used for checking file extension and not general string where you don't have a delimiter)

flimosch avatar Jul 24 '20 16:07 flimosch

How am I not respecting your opinion? I'm just saying that I don't see anything wrong with my tests and that someone said that yours might be rigged? I never accused you of anything. It just doesn't make sense why these benchmarks would show different results.

ByteZ1337 avatar Jul 24 '20 16:07 ByteZ1337

What? I didn't accuse you of not respecting my opinion, where are you getting this from?

Now since you asked why they show different results, initializing a file takes time. constructing a hashset takes time. I don't know why you didn't consider this.

fee1-dead avatar Jul 24 '20 16:07 fee1-dead

You use startsWith instead of ends

I have fixed it but the result should be the same, starts or ends 🤷‍♂️

can only be used for checking file extension and not general string where you don't have a delimiter

I asked for a use case and checking file extension is the only one that made sense, so I said why not just use my method?

fee1-dead avatar Jul 24 '20 16:07 fee1-dead

Isn't that what "bigotry" means? English isn't my native language sorry. And I think when an actual user would check file extension he would probably also initialize a new HashSet and File. Since it's something you probably only do a few times in a project they would probably not make a function for it. And in my personal case, I'm reading a lot of files, and initializing a file for every file would impact the performance a lot. I guess it just depends on the actual context of your project which version is better. But as I said before a simple function that can be called in an if-statement is still nicer than implementing an own function. But that's just my opinion. If you see that, I respect that. It's just that these functions probably won't even bother you when they get implemented but they would help someone like me. So I think arguing if they should be implemented isn't really great. However, I'm always open to tips and suggestions to improve performance.

ByteZ1337 avatar Jul 24 '20 16:07 ByteZ1337

YouTrack: https://youtrack.jetbrains.com/issue/KT-40891

fee1-dead avatar Sep 05 '20 14:09 fee1-dead