Kotlin's default visibility should be internal


#1

Why the default should be internal

Visibility should be minimized

I can’t say it better than Josh Bloch, and I doubt this is controversial here.

  • Effective Java, Item 13: Minimize the accessibility of classes and members
  • How to Design a Good API and Why it Matters (PDF), slides 14 and 16 (note that public classes and members constitute an API).

Internal is a safe default

If a class or member initially has the wrong visibility, it’s much easier to increase visibility than reduce it. That is, changing an internal class or member to public requires no effort since there are no external callers – doing the reverse after external callers have been written could have large ripple effects. Therefore, making internal the default can save substantial work as code bases evolve.

Why the default is public

The default visibility was changed from internal to public in M13, on pragmatic grounds. The blog post announcing M13 included this reasoning for the change:

In real Java code bases (where public/private decisions are taken explicitly), public occurs a lot more often than private (2.5 to 5 times more often in the code bases that we examined, including Kotlin compiler and IntelliJ IDEA). This means that we’d make people write public all over the place to implement their designs, that would make Kotlin a lot more ceremonial, and we’d lose some of the precious ground won from Java in terms of brevity. In our experience explicit public breaks the flow of many DSLs and very often — of primary constructors. So we decided to use it by default to keep our code clean.

Why the reasoning for public-by-default is flawed

Kotlin is a deeply pragmatic language, designed with great care and feet planted firmly on the ground. That spirit clearly motivates the above argument. While the motivation is definitely to be applauded, the reasoning misses the mark; it conflates brevity with cleanliness and ends up sweeping a major problem under the rug.

Kotlin tries to encourage good practice

Broadly, Kotlin does an excellent job of correcting flaws in Java that make it easier to write bad code than good.

A similar analysis of Java code would likely have revealed that most classes are not final. However, we know that most classes should be closed (Effective Java Item 17: Design and document for inheritance or else prohibit it). Consequently, Kotlin (correctly) makes classes closed by default despite the preponderance of bad practice in Java code bases.

Closed-by-default is only one example of Kotlin encouraging good practice with language constructs; nullable types and vals are others.

The analysis uses flawed data

Because Java provides inadequate options for visibility, developers are forced to choose the lesser of two evils. More experienced developers tend to address this problem with naming conventions and documentation; less experienced developers tend to throw up their hands and just make things public without even that.

Consequently, most Java code bases have many more public classes and members than their authors need or want. We cannot simply look at the use of Java visibility modifiers in an average code base and assume that it reflects its author’s wishes, let alone best practice!

For instance, let’s look at okhttp, a code base written by experienced Java developers who strive to minimize visibility despite the limitations of Java.

Here are its public packages, the ones intended to comprise its API:

           Methods     Classes
------------------------------
Public    :      618        46
Package   :      283        13
Protected :        2         0
Private   :      237         0
------------------------------
                1140        59

And here are its “internal” packages, which would ideally only be visible within the module.

             Methods  Classes
-----------------------------
Public    :      559       49
Package   :      398       24
Protected :       13        0
Private   :      423        0
-----------------------------
                1393       73

A simple count of public entities would show 46% public methods and 71% public classes. This is already much better than the average code base, which is the direction we should encourage.

But the internal packages should not be public at all! About half of the methods and classes which are public in okhttp are public only because Java requires them to be. If Java had Kotlin’s visibility modifiers, we should expect closer to 24% public methods and 35% public classes. Further, 48% of methods and 65% of classes would be internal!

The potential of internal visibility is squandered

In Java, there’s no choice but to make module-internal entities public and use convention and documentation to discourage their use. Kotlin’s internal visibility fixes this flaw in Java, but the choice of public for default visibility ignores that important correction.

Public-by-default squanders the potential of Kotlin’s internal visibility. It’s a rare (lone?) example of an unsafe default in Kotlin. It uncharacteristically encourages a bad practice that Java actually discourages, and in so doing is big step backward from Java when Kotlin has the means to take a big step forward.

It’s not too late!

Changing the default from public to internal will result in compile-time errors which are easily found and fixed in IntelliJ; it should even be possible to for the plugin to offer a quick-fix, finding entities with default visibility which are referenced outside the module and adding the public modifier, while leaving unreferenced entities as (correctly) internal by default.

Kotlin is fast approaching 1.0, but we’re not there quite yet. The adoption so far has been by an enthusiastic community of well-wishers who have followed Kotlin through a series of breaking changes in the milestones and betas. This is a passing but still present opportunity to make an important change to default visibility.


A bit about picking defaults
#2

Public vs. private stats for Ceylon SDK and IDE (plus some opinions): https://twitter.com/1ovthafew/status/672803401990848513

For application code, public vs. internal doesn’t really matter. For library code, defaulting to public seems odd.

It would be good to have a well thought out visibility story for Kotlin + JDK9/Jigsaw (which defaults packages to internal).


#3

It’s definitely not too late. The current behavior has existed for less than 4 months.

I’m hoping to use Kotlin for a decade or more, and need the language to support large teams building large projects. Encapsulation is one of the few tools we have to manage complexity, and visibility plays into that.


#4

Not true in general. If you don’t control the library, you are dead in the water if the author of the library made the visibility too restrictive.

“apparently” is a bit mean here. JB documented their approach with plenty of data.

We don’t know it. It’s a suggestion which, while it sounds good on paper, ends up going against the productivity goals that Kotlin serves.

In my experience, I have been inconvenienced a lot more by libraries making everything final and private than by the hypothetic breakages that overriding a member that was not supposed to be is alleged to cause. These breakages are extremely rare in my opinion and making these members public by default allows to compensate the lack of foresight that most library authors have when they design their libraries (not a knock against them, it’s just plain impossible to predict how users will use your code).


#5

I certainly didn’t intend to be mean. Removed.


#6

This is absolutely true in general.

You never had access in the first place so the effect on your code of something increasing in visibility is zero. However, if a public member is reduced in visibility it does affect consumer’s code since a non-zero amount of them will have been using the method.

The issue of something being not being available to you from a library (not being public) is completely orthogonal to discussions about the default.


#7

If you take a look at the entirety of what comprises your code base, I would argue that you don’t have access to a majority of the code you depend on, so “absolutely true in general” sounds exaggerated to me.

That’s fair enough, but we know how to handle that (publishing a library that’s backward incompatible).

It’s at the very heart of the argument Logan is making in this proposal. Whether it’s orthogonal to a discussion about the default is a separate matter: we’re discussing the letter of the proposal, not its spirit.


#8

I don’t think there’s much of a difference between internal and public in most applications, except for libraries of course.

I agree with the argument though that keeping the visibility small by default is a good idea. Thus, I’d propose a private by default approach, except for classes of courses, since Kotlin lacks a package local visibility modifier.


#9

You’re right. I don’t have access to a majority of things. This is fantastic! It means all these libraries can change implementations details, naming, structure, and add and remove internal dependencies all without affecting me. Java correctly chose a reasonable default which doesn’t expose these implementation details.

So every omitted visibility reduction now requires a major version compared to promotion of visibility which requires a patch version? Looking forward to Guava v98.0.0 and OkHttp v146.0.1.

This sounds like truly awful world to live in. You punish consumers of libraries for the ignorance of their developers in forgetting a single visibility reduction modifier?


#10

You’re making a huge deal of something that’s completely benign.

If public remains the default, I bet that in one or two years from now, hardly anyone will have been inconvenienced by the catastrophic scenario you’re describing.


#11

By the way,

Java’s default (package protected) is the worst of all the options.


#12

Encapsulation is not about access modifiers. It’s about module boundaries. You can define those using interfaces.


#13

Let’s talk about libraries and applications, each developed by different teams. I’ll assume that library authors intend to be deliberate about which functions are exported: everything that is supposed to be public is public, and everything that is supposed to be private is private.

Wherever the visibility is wrong, it’s a bug. Furthermore, (because it’s the only interesting case), I’ll also assume that an app developer wants to call this incorrectly-visible function.

Case 1: a function is accidentally public.

The app developer uses the function.

The next version of the library fixes the bug by making the function private. This is not backwards-compatible, so library authors do a major version bump, and possibly also a deprecation phase. Or neither, semantic versioning is a hassle. All of the library’s users need to be careful when receiving this backwards-incompatible update!

Next, possibly years later, that app’s developer gets to discover that the function is now private. Maybe this isn’t the same person that authored the code in the first place. Upgrading the library is now blocked, and the app now depends on this function that they can’t have. Remember when Guava took away closeQuietly() ?

Perhaps the developer can convince the library author that the function should be exposed after all.

Case 2: a function is accidentally hidden.

The app developer looks for the function and cannot find it in the public API. Either the developer gets at the desired behavior through the actually public APIs, or she requests that the library author expose the function. Pull requests work wonders!

@cedricbeust
Not true in general. If you don’t control the library, you are dead in the water if the author of the library made the visibility too restrictive.

This is sad! But you’re in similar trouble if you can’t upgrade a library because a required function went missing. Only with fewer options.

It’s hard to put the genie back in the bottle.

Note that this doesn’t mean that most functions should be private. Public functions can be great as they give power to application developers.

But when I want to call a public function, I want to be sure that it’s there on purpose and that it’s there to stay. An explicit public modifier gives me that confidence.


#14

Wikipedia: Access modifiers

Access modifiers are a specific part of programming language syntax used to facilitate the encapsulation of components.


#15

I think these debates come back to the fact that visibility is one of the least well designed aspects of the entire OO paradigm. I posted some suggestion for a rethink of visibility a few weeks ago. When you try and figure out what the specific goals are and how JVM visibility solves those goals, it’s hard to be satisfied.

The “best practices” from Effective Java were written by people whose primary experience of Java was developing the standard libraries. Not surprisingly, they think a lot about making minimal APIs that can last a long time and which can operate in a sandboxed environment, and thus minimising accessibility is going to be one of their recommendations. That doesn’t mean their experience generalises to everyone.

Many Java developers are not writing long term APIs. They’re simply writing apps that use them. For these people, the entire concept of visibility is a big net loss:

  • Makes it harder to monkey-patch libraries when they almost but not quite do what is needed.
  • Makes it harder to write white box unit tests
  • Often results in useful code in a library that the developer didn’t quite want to support for the long term being non-reusable, resulting in the worst case in copy/paste coding with all the well known downsides.

I have been on both sides of this, both designing and maintaining a largeish Java API for many years, and also being a user of other people’s big APIs.

When I first made the library I maintained (bitcoinj), I tried to design for a long term stable API that could have a tight and supportable design. So a lot of stuff was private or package private. Some things were final. I pretty quickly noticed that the bulk of the “contributions” I was getting to the library were in fact just raising the visibility of things so they could be reused or overridden. Often these were classes that I was pretty sure I wanted to change, but the developers submitting the patches were adamant that they wanted to use the functionality anyway. I didn’t want to pick fights with my own users, so I tended to accept such patches, with the caveat that the API wasn’t stable yet and so they used these things at their own risk.

Eventually it became clear that actually freezing the API was a fools errand and would never happen, because developers overwhelmingly cared more about features and performance than having a stable API. They wanted an API that was stable enough which did not imply completely frozen, and they were absolutely happy to override/patch/reuse stuff at will even if they knew that in future they’d have to change things because of it.

What’s more, sometimes I found that APIs I had felt sure I’d change ended up not really changing, or APIs that I was sure were done ended up needing to be altered. Making a Bitcoin library, especially back then, was not like making an HTTP library or whatever where the problem is well understood and doesn’t change. BitcoinJ was basically the first big OO Bitcoin API so it was hard to know what it should look like.

So virtually everything ended up being public. And whenever I gave into temptation and attempted to use the “best practice” of making something private or final, without fail this would result in a bunch of annoyed users sending me patches to expose it.

The JDK team have gone through the same process. The sun.misc.Unsafe class was meant to be private to the JDK but ended up being widely used because the functionality it gave was available in no other way. It’s now unofficially public API. But they never learn! Now they’re making a new Unsafe class and saying that this time it’s gonna be really really private with super-duper encapsulation via Jigsaw. Except, oh, there’ll be a flag to disable the encapsulation for when you really really need it. Guess which flag many big apps will end up requiring over the next ten years?

The concept of encapsulation has over the years become something close to religious dogma. It creates massive timesinks where developers waste hours and hours fucking about with reflection, uploading patched libraries to Maven repositories, debugging why their super-duper web framework is silently ignoring database writes because some bytecode weaving expected open classes but found a final class, etc.

If I was designing my own OO language, I’d probably chuck the entire concept of visibility down a black hole. I’d just burn it to the ground. Instead I’d have three things:

  • Annotations or modifiers reflecting API stability: stable, beta, unsupported, deprecated etc
  • Annotations or modifiers labelling methods or properties as “impl” (implementation detail) so they don’t show up in JavaDocs or code completion. But they’d still be usable if you typed out the name manually.
  • Finally, a (rare) annotation called @Secure that makes “impl” methods/properties completely inaccessible from sandboxed code. This would correspond to private today, but impervious even to reflection (from within the sandbox).

Then if you’re a developer and you end up using code marked as “impl”, you upgrade your code and it breaks … ok, sucks to be you. Can’t say you weren’t warned. Ditto for if you were using beta/unsupported API. You want seamless upgrades in perpetuity? Gotta stick to the stable stuff.


#16

That is a very nice idea. It is the best of both worlds and I hope that it is not too late to integrate it into the language.
You actually convinced me that going private or internal by default is not a great idea.

Do you have any examples where this was approach was taken before? From a quick Google search, I have found that Guava has some methods annotated as @Beta.


#17

Guava has @Beta and @VisibleForTesting which is another artifact of visibility not matching how people write modern software (OOP concepts of public/private date back a long way and unit testing has become far more important since they were invented).

Otherwise, no, I don’t. Nobody has redesigned visibility for a long time because it’s so hard-coded into the languages and VMs we use.

But it’d be easy to experiment with. All we need to do is:

  1. Define a library that contains the annotations
  2. Fork the Kotlin compiler on github and make it emit warnings if accessing or calling @impl stuff. I guess the same code paths that are used for @Deprecated can be used in almost identical ways.
  3. Fork the IntelliJ Kotlin plugin on github and make it render accesses to @impl stuff with a red/yellow background to indicate “danger”. That should be suppressable with @Suppress. Also to remove them from autocompletion. And maybe, write a build tool to generate reports on access to APIs of non-@stable levels so you can see what your exposure is to API churn.

The fact that Kotlin makes everything final by default would still be an issue of course. Yole mentioned the possibility of an “allopen” modifier to ease usage with frameworks that expect to be able to subclass arbitrary classes on the fly (like Guice). Adding an IDE warning to flag non-open methods would then be easy.

Of course all this flies in the face of conventional wisdom, but hey, what’s the point of conventional wisdom if not to be challenged from time to time? :slightly_smiling:

(edit: thinking about it, it’d be nice if Kotlin supported a @HighlightWith(r,g,b) meta-annotation that could be attached to other annotations. Then usage of something annotated that way would be rendered with the given default colours in the IDE. Of course there are details around letting the user customise the colours etc).


#18

Stability annotations are completely orthogonal to the issue at hand. They are interesting, can be used to great effect, and putting them in the standard library may be worth exploring, but as a separate proposal. They have nothing to do with controlling visibility, and even if you are proposing the removal of visibility as a concept, that too deserves a separate conversation.


#19

@mikehearn:

  • Makes it harder to monkey-patch libraries when they almost but not quite do what is needed.
  • Makes it harder to write white box unit tests

Monkey-patching and white box testing are great for short-term productivity, and awful for long-term maintainability.


#20

Yes, of course it is better if you can work with the library developers to support your use case appropriately. But that can often take years, and deadlines don’t wait. So in practice developers often have to do both. Look at how widely used sun.misc.Unsafe and bytecode rewriting are.