Always use IO thread for file operations?

Hey,

I’m wondering whether we should always switch the coroutine context to Dispatchers.IO when doing file operations.

withContext(Dispatchers.IO) {
    Files.writeString(...)
}

I’ve noticed that the linter shows a warning if the function has the suspend keyword, but there is no warning if the function doesn’t have it. Does that mean it’s not necessary to switch the context in that case? What’s the right thing to do here?

/** suspend */ fun smth() {
    Files.writeString(...)
}

Thank you!

I’m not sure I understand your question correctly. We need to use Dispatchers.IO whenever we invoke blocking code from a coroutine. It could be I/O, it could be Thread.sleep(), it could be locks, etc. On the other hand, if we do non-blocking I/O, we don’t need Dispatchers.IO.

There are no restrictions on running blocking code in a regular, non-suspending function. However, if your smth function internally calls blocking code, then smth is itself a blocking function. If we call smth from a coroutine, we need to use Dispatchers.IO.
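A minimal sketch of that distinction, assuming kotlinx.coroutines is on the classpath and Java 11+ for Files.writeString. The names writeBlocking and writeSuspending are made up for illustration:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.nio.file.Files
import java.nio.file.Path

// Regular function: blocks the calling thread while the write is in progress.
fun writeBlocking(path: Path, text: String) {
    Files.writeString(path, text)
}

// Suspending wrapper: runs the blocking call on Dispatchers.IO,
// so the caller's dispatcher thread is not held up.
suspend fun writeSuspending(path: Path, text: String) = withContext(Dispatchers.IO) {
    writeBlocking(path, text)
}

fun main() = runBlocking {
    val tmp = Files.createTempFile("demo", ".txt")
    writeSuspending(tmp, "hello")
    println(Files.readString(tmp)) // blocking read; acceptable in a throwaway main
    Files.delete(tmp)
}
```

Calling writeBlocking directly from a coroutine on, say, Dispatchers.Default would still work, it would just tie up one of that dispatcher’s threads for the duration of the write.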

What is the warning?

My question was about the Java I/O classes (File, Path, etc.), which appear to be non-blocking. From that perspective, I’m wondering whether I should always switch to the IO dispatcher when calling them.

The warning is: “Possibly blocking call in non-blocking context could lead to thread starvation”

Why do they appear non-blocking to you? I would assume all methods in Files are blocking and they should be called using Dispatchers.IO if we are within coroutines.

I’m confused about this, because I know Java has a nio package which is meant to be for non-blocking IO, but I don’t know how to actually use it in a non-blocking way, i.e. with coroutines or whatever. So I just assume all file operations are blocking.

Take everything I say with a grain of salt as I/O isn’t exactly my area of expertise, but:

  1. If you mean the letter “N” in “NIO”, then I believe it does not stand for “non-blocking”, but for “new”. java.nio is simply a newer set of APIs, designed to better fit the architectures of modern operating systems and allow better performance.
  2. Part of the confusion comes from the fact that “blocking/non-blocking” means something a little different in I/O nomenclature than in coroutines nomenclature. If we develop an application in Kotlin and look for an optimal way of doing I/O with coroutines, I think we shouldn’t look for “non-blocking I/O”, but for “asynchronous I/O”.
  3. For async I/O, Java provides AsynchronousChannel and its subtypes. All operations can be performed by providing a CompletionHandler, so they could be easily translated to suspending functions.
  4. Please be aware that AsynchronousChannel may do pretty much the same thing underneath as Kotlin does with Dispatchers.IO: keep a thread pool and block its threads on I/O. It could potentially use some optimizations that Kotlin is not able to do, it could use fewer threads by utilizing non-blocking I/O, etc., but I don’t know if it does this in practice. At the end of the day we may be in the same place as when using Dispatchers.IO. I’m not sure whether we should prefer one way over the other, or which one.
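To illustrate point 3, here is a rough sketch of turning a CompletionHandler-based read into a suspending call. It assumes kotlinx.coroutines is available; readSuspending is a hypothetical helper, not part of any library, and it does not try to abort the read if the coroutine is cancelled:

```kotlin
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.suspendCancellableCoroutine
import java.nio.ByteBuffer
import java.nio.channels.AsynchronousFileChannel
import java.nio.channels.CompletionHandler
import java.nio.file.Files
import java.nio.file.StandardOpenOption
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical helper: suspends until one read completes.
// Returns the number of bytes read, or -1 at end of file.
suspend fun AsynchronousFileChannel.readSuspending(buffer: ByteBuffer, position: Long): Int =
    suspendCancellableCoroutine { cont ->
        read(buffer, position, Unit, object : CompletionHandler<Int, Unit> {
            override fun completed(result: Int, attachment: Unit) = cont.resume(result)
            override fun failed(exc: Throwable, attachment: Unit) = cont.resumeWithException(exc)
        })
    }

fun main() = runBlocking {
    val tmp = Files.createTempFile("demo", ".txt")
    Files.writeString(tmp, "hello")
    AsynchronousFileChannel.open(tmp, StandardOpenOption.READ).use { channel ->
        val buffer = ByteBuffer.allocate(16)
        val n = channel.readSuspending(buffer, 0)
        println(String(buffer.array(), 0, n))
    }
    Files.delete(tmp)
}
```

As point 4 says, this only changes the shape of the API. Whether the JDK blocks a pooled thread behind the scenes is a separate question that this sketch does not answer.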

I wrote a small lib some time ago

Well apparently it’s both, according to Wikipedia anyway… Non-blocking I/O (Java) - Wikipedia

Nice! I’m having a look now. 🙂

According to the very first sentence in the Wikipedia entry: “NIO stands for New Input/Output”. I suppose someone named the article incorrectly, and then it became hard to change. But you won’t find any mention in the article that “NIO” may mean something else. JSR 51 was named “New I/O APIs for the Java™ Platform”. Then JSR 203 was “More New I/O APIs for the Java™ Platform (“NIO.2”)”; it even capitalizes “New”, as the name of the feature. And as you discovered yourself, 95% of APIs in java.nio are blocking. Even SelectableChannel, which is one of the main components for writing non-blocking I/O, is blocking by default and needs to be reconfigured.
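The SelectableChannel default is easy to check with the JDK alone. A small sketch (blockingModes is just an illustrative name; SocketChannel is one SelectableChannel subtype, and opening it does not connect anywhere):

```kotlin
import java.nio.channels.SocketChannel

// Returns the blocking mode right after creation and after reconfiguring.
fun blockingModes(): Pair<Boolean, Boolean> =
    SocketChannel.open().use { channel ->
        val initial = channel.isBlocking   // a new selectable channel starts in blocking mode
        channel.configureBlocking(false)   // non-blocking mode has to be requested explicitly
        initial to channel.isBlocking
    }

fun main() {
    println(blockingModes()) // prints (true, false)
}
```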

I mean, this is just a name, and each person can have their own interpretation of “NIO”. But if someone finds it confusing that java.nio provides blocking APIs, then this is clearly their misinterpretation of the API.

That’s why I’m confused; the heading says that it’s Non-blocking IO. 😛 Maybe it is just an incorrectly named article, idk.

NIO was released at a time when there was a lot of hype around “non-blocking I/O”. And it did in fact bring modern non-blocking I/O to the language; this was one of the features of NIO. This probably caused a lot of confusion.

I’m not surprised when someone in a discussion about Java calls “NIO” a “non-blocking API”. But if they actually think java.io is for blocking I/O and java.nio is for non-blocking I/O, then this is entirely incorrect.

That’s certainly what I thought from stuff I’ve read online. 🙁 I guess that’s my takeaway from this thread: despite what Google or other search engines may tell me, Java’s NIO package is NOT non-blocking IO.

Your discussion sums it up pretty well, guys. That’s exactly what I’m confused about: what blocks and what doesn’t. And also when it’s worth switching to Dispatchers.IO and when it’s not. I wish there were a better API for file operations.

I think that if the documentation doesn’t say otherwise, you can safely assume I/O operations are blocking. This includes not only “heavy” operations like reading/writing, but also “light” operations like creating a folder, checking if a file exists, getting the file size, etc. It is safer to wrap all such cases in Dispatchers.IO.
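For the “light” operations, the wrapping looks the same as for reads and writes. A small sketch, assuming kotlinx.coroutines; sizeIfExists is a made-up helper name:

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical helper: returns the file size in bytes, or null if the file does not exist.
// Both Files.exists and Files.size touch the file system, so both run on Dispatchers.IO.
suspend fun sizeIfExists(path: Path): Long? = withContext(Dispatchers.IO) {
    if (Files.exists(path)) Files.size(path) else null
}

fun main() = runBlocking {
    val tmp = Files.createTempFile("demo", ".txt")
    Files.writeString(tmp, "hello")
    println(sizeIfExists(tmp))
    Files.delete(tmp)
    println(sizeIfExists(tmp))
}
```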

If you use a library which clearly says it is asynchronous, which uses futures or callbacks, or whose method docs at least say it returns “immediately”, then you can probably call it safely from coroutines without switching the dispatcher.
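As a sketch of the future-based case: kotlinx.coroutines provides an await() extension for CompletionStage (in kotlinx.coroutines.future; depending on the version it comes from the core artifact or from kotlinx-coroutines-jdk8). Here loadAsync is a stand-in for some library call and is purely hypothetical:

```kotlin
import kotlinx.coroutines.future.await
import kotlinx.coroutines.runBlocking
import java.util.concurrent.CompletableFuture

// Stand-in for a library method that returns immediately with a future (hypothetical).
fun loadAsync(): CompletableFuture<String> =
    CompletableFuture.supplyAsync { "payload" }

// Suspends until the future completes; the calling thread is not blocked,
// so no withContext(Dispatchers.IO) is used here.
suspend fun load(): String = loadAsync().await()

fun main() = runBlocking {
    println(load()) // prints payload
}
```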

It is much more tricky with just any arbitrary method, not related to an I/O library. There is no reliable way to know if a random method may block or not. If it throws IOException, then this is a good indicator it probably may block. But this is not a reliable approach. And there are other reasons why the code may block: Thread.sleep, Thread.join, synchronized blocks, locks or other synchronization utils, etc.

In many cases our code doesn’t have to be 100% blocking-safe. If we accidentally call a synchronized method inside a coroutine, it probably won’t hurt us too much. But we should at least be mindful of the problem and try to avoid such cases. Whenever we do obvious I/O, we should use Dispatchers.IO. This should cover most cases.

Regarding the linter, if the function is marked as suspend then the linter will do some extra, coroutine-specific checks. One of them looks to see if the function contains a call that might block but doesn’t use Dispatchers.IO, and shows that warning if it does.

If the function is not marked as suspend then it doesn’t run the extra checks.