Costs of Kotlin's Delegation
What is Kotlin's delegation compiled into, and what are the runtime implications?
If there is one thing I really, really like about Kotlin, it's the first-class citizenship of delegation throughout the whole language. It provides a simple and concise replacement for inheritance. The nice thing is that abstract properties can be declared in an interface while the state remains in the implementing class, with very little boilerplate, magic and chance of bullshit happening. Or an interface can be implemented by handing in an instance to delegate to, and that's it. Here's a short snippet to show what I am talking about (reduced example):
class Delegation(val myImpl: MyInterfaceImpl) : MyInterface by myImpl
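For comparison, here is a minimal sketch of what that one-liner spares you from writing by hand in Java. The member `getString()`, the returned value and the `DelegationExample` class are assumptions for illustration, since the reduced snippet does not show what `MyInterface` contains.

```java
// Hypothetical interface: the reduced snippet does not show its members.
interface MyInterface {
    String getString();
}

// Hypothetical implementation that actually holds the state.
class MyInterfaceImpl implements MyInterface {
    private String value = "hello"; // assumed value, for illustration only

    @Override
    public String getString() {
        return value;
    }
}

// Hand-written counterpart of
//   class Delegation(val myImpl: MyInterfaceImpl) : MyInterface by myImpl
// Every interface method has to be forwarded to the delegate manually.
class Delegation implements MyInterface {
    private final MyInterfaceImpl myImpl;

    Delegation(MyInterfaceImpl myImpl) {
        this.myImpl = myImpl;
    }

    // Counterpart of the getter Kotlin generates for `val myImpl`.
    public MyInterfaceImpl getMyImpl() {
        return myImpl;
    }

    @Override
    public String getString() {
        return myImpl.getString(); // plain forwarding call
    }
}

class DelegationExample {
    public static void main(String[] args) {
        MyInterface delegation = new Delegation(new MyInterfaceImpl());
        System.out.println(delegation.getString());
    }
}
```

With a single method this is harmless, but each additional interface member needs another forwarding method that has to be written and kept in sync.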
Since this can be done in Java as well, with default methods in interfaces and IDE code generation, what Kotlin really brings is the reduced amount of boilerplate needed to achieve it. It's only worth it if it can be expressed in readable, efficient code.
And the last one is a very important criterion: efficiency. I don't want to make my JVM code slower than it already is. We're talking about games - if another language is slower, then the code I can produce with Java is the better compromise for me. If I imagine using delegation over inheritance with Kotlin throughout my codebase, in every hierarchy... will it slow down my engine noticeably? I ran benchmarks of a directly implemented field, an inherited field and a delegated field, with Java and Kotlin. I expected a field to be faster than a property in general, and a delegated field to be slower altogether. I used JMH and a Blackhole, so the setup should be sound, but I got these unexpected results:
Benchmark Mode Cnt Score Error Units
getStringJavaDelegation thrpt 200 220181331,464 ± 2144028,358 ops/s
getStringJavaImplementation thrpt 200 171078263,764 ± 889605,110 ops/s
getStringJavaInheritance thrpt 200 170878616,220 ± 818848,070 ops/s
getStringKotlinDelegation thrpt 200 225753956,507 ± 1740352,057 ops/s
getStringKotlinImplementation thrpt 200 168879795,813 ± 2728455,723 ops/s
getStringKotlinInheritance thrpt 200 170414757,249 ± 1515476,325 ops/s
Turns out the delegation to a delegate field is faster than the other two versions... Okay, I had the instantiation included in the benchmarked code. Even though I expected that to make delegation look even slower, I removed it from the benchmark, so now only a single getString() call is measured. Results:
Benchmark Mode Cnt Score Error Units
getStringJavaDelegation thrpt 200 301713586,642 ± 8160921,344 ops/s
getStringJavaImplementation thrpt 200 225820433,449 ± 3676854,362 ops/s
getStringJavaInheritance thrpt 200 234833613,665 ± 561919,892 ops/s
getStringKotlinDelegation thrpt 200 320742908,021 ± 1406189,583 ops/s
getStringKotlinImplementation thrpt 200 230377534,877 ± 3347435,643 ops/s
getStringKotlinInheritance thrpt 200 230821924,187 ± 1159446,814 ops/s
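To make the difference between the two runs concrete, here is a simplified sketch of the three Java variants and of the two shapes of measured code. The names `DelegationJava`, `ImplementationJava`, `MyInterfaceImplementation`, `impl` and `myStringField` are taken from the bytecode in this post. `BaseJava`, `InheritanceJava`, `MeasuredCode` and the string value are assumptions. The real benchmarks were JMH methods, not plain static methods like these.

```java
interface MyInterface {
    String getString();
}

// Holds the state that the delegating variant forwards to.
class MyInterfaceImplementation implements MyInterface {
    private String myStringField = "value"; // assumed value

    @Override
    public String getString() { return myStringField; }
}

// Variant 1: the field lives directly in the class.
class ImplementationJava implements MyInterface {
    private String myStringField = "value";

    @Override
    public String getString() { return myStringField; }
}

// Variant 2: field and getter are inherited (class names are assumptions).
abstract class BaseJava implements MyInterface {
    private String myStringField = "value";

    @Override
    public String getString() { return myStringField; }
}

class InheritanceJava extends BaseJava { }

// Variant 3: the call is forwarded to a delegate field.
class DelegationJava implements MyInterface {
    private final MyInterfaceImplementation impl;

    DelegationJava(MyInterfaceImplementation impl) { this.impl = impl; }

    @Override
    public String getString() { return impl.getString(); }
}

class MeasuredCode {
    // First run: object creation is part of what gets measured.
    static String withInstantiation() {
        return new DelegationJava(new MyInterfaceImplementation()).getString();
    }

    // Second run: the instance is created once up front, so only the call
    // itself is measured. In JMH this would be a field of a @State class.
    private static final MyInterface PREBUILT =
            new DelegationJava(new MyInterfaceImplementation());

    static String callOnly() {
        return PREBUILT.getString();
    }

    public static void main(String[] args) {
        System.out.println(withInstantiation());
        System.out.println(callOnly());
        System.out.println(new ImplementationJava().getString());
        System.out.println(new InheritanceJava().getString());
    }
}
```

All variants return the same string. The only thing that changes between the runs is whether the constructor calls sit inside the measured method.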
No luck, same picture. On top of that, Kotlin's delegation seems to be even faster than the hand-written implementation in Java. I decided to take a closer look at the bytecode.
DelegationJava Bytecode
public getString()Ljava/lang/String;
L0
LINENUMBER 14 L0
ALOAD 0
GETFIELD de/hanno/playground/DelegationJava.impl : Lde/hanno/playground/MyInterfaceImplementation;
INVOKEVIRTUAL de/hanno/playground/MyInterfaceImplementation.getString ()Ljava/lang/String;
ARETURN
L1
LOCALVARIABLE this Lde/hanno/playground/DelegationJava; L0 L1 0
MAXSTACK = 1
MAXLOCALS = 1
DelegationKotlin Bytecode
public getString()Ljava/lang/String;
L0
ALOAD 0
GETFIELD de/hanno/playground/DelegationKotlin.myInterfaceImplementation : Lde/hanno/playground/MyInterfaceImplementation;
INVOKEVIRTUAL de/hanno/playground/MyInterfaceImplementation.getString ()Ljava/lang/String;
ARETURN
L1
LOCALVARIABLE this Lde/hanno/playground/DelegationKotlin; L0 L1 0
MAXSTACK = 1
MAXLOCALS = 1
ImplementationJava Bytecode
public getString()Ljava/lang/String;
L0
LINENUMBER 14 L0
ALOAD 0
GETFIELD de/hanno/playground/ImplementationJava.myStringField : Ljava/lang/String;
ARETURN
L1
LOCALVARIABLE this Lde/hanno/playground/ImplementationJava; L0 L1 0
MAXSTACK = 1
MAXLOCALS = 1
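Not much difference between the Java and the Kotlin delegation here, just a LINENUMBER entry. The direct implementation even does less at the bytecode level: a single GETFIELD, without the additional INVOKEVIRTUAL. So I have to admit I'm very satisfied with this result, even though I can't explain it. Of course I know that bytecode is not the whole story, since the JIT compiler has the final say on what actually runs ... but I don't feel like investing more time here because I have more interesting things to implement :) If anybody has further ideas, I would like to hear them.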