zstd-jni icon indicating copy to clipboard operation
zstd-jni copied to clipboard

finalize() should not be overridden

Open leventov opened this issue 5 years ago • 52 comments

Overriding Object.finalize() method is very bad for GC. Instead, sun.misc.Cleaner, or intermediate that automatically detects the presence of java.lang.ref.Cleaner and uses it if present (something like this: https://github.com/OpenHFT/Chronicle-Core/blob/067e963a0b79775d9dee33cb13b5fa5b7e49683e/src/main/java/net/openhft/chronicle/core/cleaner/impl/reflect/ReflectionBasedByteBufferCleanerService.java, for example) should be used.

leventov avatar Nov 28 '18 18:11 leventov

Looks like a good improvement to have. I don't have much experience with the Cleaner-s. Any help here will be very welcome.

luben avatar Nov 28 '18 19:11 luben

Are cleaners needed in this scenario? The finalize method calls close(). What's the use case? Can we just leave finalize out? Was there an actual bug? The user should close things themselves so finalize is just extra work that the GC has to do.

re-thc avatar Feb 22 '19 23:02 re-thc

I have done what's Gzip streams are doing - if you forgot to close the stream the underlying file descriptor will be closed on the next GC cycle when the steam object is collected. I had a bug (https://github.com/luben/zstd-jni/issues/26) that leaked memory and FDs by not closing the streams automatically and users were expecting it.

luben avatar Feb 23 '19 01:02 luben

BTW, calling .close() is recommended - I have noticed an edge case that may lead to leaked file descriptors: if the file was meanwhile deleted, the fd is not released. It's not specific to the library and can be observed without any compression on top if InputStream

luben avatar Feb 23 '19 04:02 luben

Gzip stream use Cleaners as mentioned earlier in this post. Finalizers are deprecated. Maybe it'd just be similar to add a shutdown hook to close the resources on closure or similar.

re-thc avatar Feb 23 '19 05:02 re-thc

Yes, it makes sense. May be we were looking before the Inflater was ported to Cleaners.

luben avatar Feb 23 '19 05:02 luben

Just looked, Inflater and Deflater are still using finalize(): http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/zip/Inflater.java#l382 http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/zip/Deflater.java#l548

and Zip/Gzip streams rely on them for closing.

BTW, I am open to PR moving us to java.lang.ref.Cleaner but how do we keep compatibility with JDKs < 9?

luben avatar Feb 23 '19 05:02 luben

The deprecation of finalizers and cleaners happened after JDK8.

re-thc avatar Feb 23 '19 05:02 re-thc

Yes, but we still support back to JDK6. Is there a way to implement bi-modal behavior depending on which JDK we are loaded in?

luben avatar Feb 23 '19 05:02 luben

If you look at the Java 9 source code for cleaner it's a Weak/Soft/Phantom list of objects + a background thread or similar that executes the action. You can create a simple daemon thread that does something similar. Make it run on shutdown and/or after quite a long time (so it doesn't cause performance impact).

re-thc avatar Feb 23 '19 06:02 re-thc

I think creating daemon threads and/or using shutdown hooks is wrong in a "stateless" library like zstd-jni.

@luben link in the first message of this issue leads to an example of bi-modal code.

leventov avatar Feb 23 '19 19:02 leventov

Is it possible too use the fork/join pool to do the cleanup? Probably not, WeakReferenceQueue blocks.

There are some other ways to do the bi-modal behavior, you can have both implementations in their own private static inner classes that hold static members of an interface you need to use to clean, and try to touch each of these values from static context in turn until one of them works, catching the appropriate class/method/security error.

scottcarey avatar Mar 26 '19 22:03 scottcarey

@scottcarey this is pretty much what does the code linked in the first message in this thread do. It just probes "sun.misc.Cleaner" and "jdk.internal.ref.Cleaner" directly. "java.lang.ref.Cleaner" can be added to the mix too.

leventov avatar Mar 26 '19 22:03 leventov

@leventov Agree with you. I have encountered this problem with zstd-jni.

Finalizers are inherently problematic and their use can lead to performance issues, deadlocks, hangs, and other problematic behavior.

Furthermore the timing of finalization is unpredictable with no guarantee that a finalizer will be called.

See https://bugs.openjdk.java.net/browse/JDK-8165641

Use of finalizers is yet another source of potential memory leak issues. Whenever a class’ finalize() method is overridden, then objects of that class aren’t instantly garbage collected. Instead, the GC queues them for finalization, which occurs at a later point in time.

Additionally, if the code written in finalize() method is not optimal and if the finalizer queue cannot keep up with the Java garbage collector, then sooner or later, our application is destined to meet an OutOfMemoryError.

mr3 avatar Jul 11 '19 14:07 mr3

Was hit by this issue in production, so it's not a theoretic problem.

We use zstd as a compression format for a large batch of data files using Apache Flink (version 1.9 running on adoptopenjdk 8u222). Sometimes on a job restart (due to crash somewhere in our java code) we experienced JDK crashes with the following backtrace:

Stack: [0x00007f53a4e7d000,0x00007f53a4f7e000],  sp=0x00007f53a4f7c560,  free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x408f0]  abort+0x230
C  [libc.so.6+0x9090a]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  com.github.luben.zstd.ZstdInputStream.freeDStream(J)I+0
j  com.github.luben.zstd.ZstdInputStream.close()V+12
j  com.github.luben.zstd.ZstdInputStream.finalize()V+1
J 5784 C2 java.lang.ref.Finalizer.access$100(Ljava/lang/ref/Finalizer;Lsun/misc/JavaLangAccess;)V (6 bytes) @ 0x00007f5391919838 [0x00007f5391919660+0x1d8]
j  java.lang.ref.Finalizer$FinalizerThread.run()V+45
v  ~StubRoutines::call_stub

Flink creates a separate ClassLoader for each job on a separate thread. When job is done, the classloader is closed and all classes (and JNI libraries) are unloaded. The issue is that (I guess) on jvm8 there is no guarantee that finalizers will be run before the JNI library unload happening.

Reproducer code (sorry for scala), which tries to do something similar to Flink:

import java.io.{ByteArrayInputStream,ByteArrayOutputStream,File,FileInputStream,InputStream}
import java.net.URLClassLoader

import org.apache.commons.io.IOUtils

object Main {
  val url = new File("/home/shutty/Downloads/zstd-jni-1.4.3-1.jar").toURL
  val in = IOUtils.toByteArray(new FileInputStream(new File("/home/shutty/Downloads/demo.zst")))

  def main(args: Array[String]): Unit = {
      (0 to 100000).toList.par.foreach(openclose)
  }

  def openclose(id: Int) = {
    var cl = new URLClassLoader(Array(url))
    val result = read(cl)
    cl = null
    System.gc()
  }

  def read(cl: ClassLoader): String = {
    val zstdClass = cl.loadClass("com.github.luben.zstd.ZstdInputStream")
    val constructor = zstdClass.getConstructor(classOf[InputStream])
    val zstd = constructor.newInstance(new ByteArrayInputStream(in)).asInstanceOf[InputStream]
    val buf2 = new String(IOUtils.toByteArray(zstd))
    // yes, we did not close the ZstdInputStream here intentionally
    buf2
  }
}

On jvm11 everything works fine. The hack we did to solve this issue is to wrap the ZstdInputStream and override its finalizer with empty method.

shuttie avatar Sep 04 '19 13:09 shuttie

Scala is fine.

@shuttie, what if we add constructor param (or setter) to make the finalizer noop, something similar with the wrapper you created?

In your production code, are you closing the ZstdInputStream manually? If not it may leak the memory allocated in the zstd native code, if the finalizer is disabled.

luben avatar Sep 04 '19 15:09 luben

@shuttie , in v1.4.4-3 I have added options to all classes no make finalizers noop, e.g. object.setFinalize(false) that may be simple for you than using wrapper.

luben avatar Nov 22 '19 03:11 luben

If out.write(dst, 0, (int) dstPos); will throw an exception due to underlying OutputStream which happens frequently , /* Opaque pointer to Zstd context object */ private final long stream; is still left unclosed and creates Memory Leaks in Big Data Systems and Long Running application like Spark / Hadoop etc, where the underlying filesystem switchover / Bad Data Nodes / Lease Expired etc will give exception on underlying OutputStream Frequently.

dilipkasana avatar May 06 '20 17:05 dilipkasana

Finalizers (and other GC magics like phantom refs) still have GC overhead even for users that don't depend on them for cleanup, i.e users that always close their streams.

An alternative proposal: factor out the existing streams behavior minus finalize() into new base classes, and leave the finalize in the existing streams classes which extend the new ones. This lets users who always close and want to avoid finalizer overhead opt-in, and doesn't break existing users relying on GC magics for cleanup.

raifo avatar Oct 24 '20 04:10 raifo

@raifo , that seems sane approach. I can put it in place shortly.

luben avatar Oct 24 '20 18:10 luben

@raifo , I have added NoFinalizer base classes for the Input/Output streams in: https://github.com/luben/zstd-jni/commit/be9be47fae23b5510344b67f9c0f7b258c2f5890 . Any feedback is welcome. I intend to release it soon with the RecyclingBufferPool optimizations in the later commits. Any recommendation there will be also appreciated.

luben avatar Feb 07 '21 19:02 luben

Just pushed v1.4.8-4 with the new classes without finalizers.

luben avatar Feb 09 '21 10:02 luben

FWIW, by having ZstdOutputStream extend ZstdOutputStreamNoFinalizer a number of breaking changes were introduced. Code that compiled on 1.4.8-3 no longer compiles with 1.4.8-4.

e.g.

ZstdOutputStream wrapper = new ZstdOutputStream(o).setCloseFrameOnFlush(true);

Results in this compiler error:

... error: incompatible types: ZstdOutputStreamNoFinalizer cannot be converted to ZstdOutputStream

This means some runtime issues could be introduced as well. If new code is running with 1.4.8-4 any dependencies compiled with 1.4.8-3 will likely break.

Logic-32 avatar Feb 19 '21 22:02 Logic-32

@Logic-32 , that's a good point. There is pro/cons of keeping the API stability vs tying us to upstream versions. Regarding the runtime issue, most likely they would no break but will not run finalizers.

I will try some F-bounded generics magic this weekend to see if it applies and solves the problem, but if you have any other suggestion how to remove the finalizers in backward compatible way, please chime.

luben avatar Feb 19 '21 22:02 luben

In part, it'd be nice to at least see a major version bump if you pursue the current change. That way, consumers know to expect some breakage.

Semantic versioning aside, a good solution would depend on how backwards compatible you want to be. If you don't care about JRE's < 9 then you could remove the finalize() override, add the Cleaner, and then have ZstdOutputStreamNoFinalizer extend ZstdOutputStream instead.

That way, all existing implementations continue to work (thanks to proper inheritance) and then the constructor for ZstdOutputStreamNoFinalizer can just skip the Cleaner call.

Logic-32 avatar Feb 19 '21 22:02 Logic-32

If you don't care about JRE's < 9 then you could remove the finalize() override, add the Cleaner

This would be my preferred solution. It will not even need 2 classes. But we still care about JVM8 and even compile and work on JVM6.

Regarding major version bump - it will stop corresponding to the upstream Zstd version and I am a little but hesitant - I would prefer backwards compatible solution.

luben avatar Feb 19 '21 22:02 luben

Actually looking in the runtime case - it will not be a problem: the vtable is attached to the instance and we don't change it when we modify it with the set* methods, so it should continue working if we solve the return type stuff that may cause the problem. So we have to fix the compilation but I think I have an idea.

luben avatar Feb 19 '21 22:02 luben

@Logic-32 , I pushed a commit that I thing should fix the problem. Can you try it by building it locally and compiling your software against it?

luben avatar Feb 19 '21 23:02 luben

Thanks for the quick turn around! Just reviewing the latest commit I believe you still need to add generics to ZstdOutputStream before that will work. Otherwise, T will still be a ZstdOutputStreamNoFinalizer.

But, I wouldn't do that yet as it makes working with ZstdOutputStreamNoFinalizer a bit awkward. A good static-code analyzer (or even IDE) would recommend declaring any instance of ZstdOutputStreamNoFinalizer as ZstdOutputStreamNoFinalizer<ZstdOutputStreamNoFinalizer> while ZstdOutputStream would remain generic-less. Which just makes working with it a bit unpleasant.

However, I think you can get away with what you're going for, and still be backwards compatible, by overriding each setter in ZstdOutputStream and changing the return type to ZstdOutputStream without relying on the generics. It's not as pleasant to maintain but better from a consumer standpoint.

e.g.

    @Override
    public synchronized ZstdOutputStream setCloseFrameOnFlush(boolean closeOnFlush) throws IOException {
        return (ZstdOutputStream) super.setCloseFrameOnFlush(closeOnFlush);
    }

Also, while we're on the topic of better for consumers, here's something to debate: how would you feel about moving ZstdOutputStreamNoFinalizer to a different package than ZstdOutputStream and keeping the name ZstdOutputStream? Deprecating the current ZstdOutputStream in favor of the new one should be considered important as relying on the GC to close resources is strongly discouraged since you never know when GC will occur. Since most developers are lazy however, they'll be more inclined to use ZstdOutputStream over ZstdOutputStreamNoFinalizer simply because the name is shorter. The only problem with this approach is that you not only have to come up with a name for the new package but also hope that no one complains about requiring a package name to disambiguate the two. Though, why you'd ever import both in any case is beyond me.

I'll try and check in over the weekend but will otherwise probably not be available until next week. Let me know what your thoughts are!

Logic-32 avatar Feb 20 '21 01:02 Logic-32

Hmm, that's interesting idea to use the packages for disambiguation. Though I could not come with better name than zstd.

Yes, working with the NoFinaliser classes becomes a little bit more cumbersome. Now thinking, we can introduce an empty subclass for the NoFinalizer classes that just fill the generics - another wart on the implementation side.

Edit: now reflecting on why I did caught this issue, there are a few circumstances:

  • the test are written is Scala that will happily infere the right types, so I have not written them.
  • a lot of people that tested in also used Scala. The Java users usually get to it though intermediary libraries that update slower.
  • the workplace forces everybody to be on the latest version (like monorepo but not exactly). And I didn't caught the issue there as everything compiled fine.

luben avatar Feb 20 '21 02:02 luben

I went ahead with creating *Base classes that have all the logic and 2 facades - with or without the finalizer that fill the generic param - this should be compatible with both 1.4.8-3 and previous, and with 1.4.8-4.

I also made the Base classes package-private so we can remove them if we find better strategy of we migrate fully to Java-9.

luben avatar Feb 20 '21 02:02 luben

Thanks for the updates @luben. I think those should work out nicely for us.

FWIW, we're a java shop that likes to be on the latest version of things.

Logic-32 avatar Feb 22 '21 17:02 Logic-32

@Logic-32 , I have pushed v1.4.8-5 with the latest changes.

luben avatar Feb 22 '21 17:02 luben

I will keep the issue open to track adding Cleaners and further changes.

luben avatar Feb 22 '21 17:02 luben

@Logic-32, as you suspected there are also runtime API compatibility problems with v1.4.8-4 and 5. The Avro/Parquet/Spark ecosystem noticed when trying to update dependencies (#161). I have released v1.4.8-6 that achieves the same result but keeping the runtime API compatibility - now the ZstdInputStream/ZstdOutputStream compose/delegate to the NoFinalizer classes: removing the inheritance restored the runtime contract.

Edit: There is also v1.4.8-7 as I forgot to make the NoFinalizer classes public. This will not affect the previos uses of the library, but may break users of sub-version 4 and 5 that started using them.

luben avatar Mar 01 '21 11:03 luben

Hi @luben, we saw some finalizer issue from ZstdDecompressCtx. I noticed there's "setFinalize" to disable finalizer for AutoCloseBase. But from our code path

  [ 0] java.lang.ref.Finalizer
  [ 1] java.lang.ref.Finalizer.register
  [ 2] com.github.luben.zstd.AutoCloseBase.<init>
  [ 3] com.github.luben.zstd.ZstdDecompressCtx.<init>
  [ 4] com.github.luben.zstd.Zstd.decompressByteArray

there seems no way to set it without touching https://github.com/luben/zstd-jni/blob/3df7edc8e7cce6d0fd195dbb7cdfe3edbf23fb58/src/main/java/com/github/luben/zstd/Zstd.java#L407 Is there any workaround we can try? Thanks!

lic9 avatar Mar 18 '21 20:03 lic9

Yes, setFinalize(false) will just make the finalizer noop bit will not remove it completely. Recently I started adding a new hierarchy for the streams classes that don't override the finalizer. May be I can continue with the Compress/Decompress contexts.

luben avatar Mar 18 '21 21:03 luben

Thanks @luben! BTW, I think our issue occurred since (https://github.com/luben/zstd-jni/commit/5209351fc8e071084a3e6df13c0f60bd13f916f6), which led to more 'Finalizer' in heap and longer GC pause. Just curious what is the purpose of https://github.com/luben/zstd-jni/commit/5209351fc8e071084a3e6df13c0f60bd13f916f6? Couldn't find any comments on that commit.

lic9 avatar Mar 18 '21 21:03 lic9

The idea of the Compress/Decompress Ctx is that you can create an instance and use them to compress/decompress multiple independent messages. This lowers the overhead. The particular commit forwards the Zstd methods to the new classes as we don't want to maintain duplicate code-paths.

luben avatar Mar 18 '21 21:03 luben

Gotcha! Thanks for the explanation @luben!

lic9 avatar Mar 18 '21 21:03 lic9

BTW, is it feasible to add an option like "decompressByteArrayWithoutCtx" in Zstd class if we want to avoid Ctx? Thanks! @luben

lic9 avatar Mar 18 '21 23:03 lic9

Internally it will create and discard the context - either we do it on the java side, or it will happen on the C side.

luben avatar Mar 18 '21 23:03 luben

i was thinking more of having an extra java api to directly access the old C code

JNIEXPORT jlong JNICALL Java_com_github_luben_zstd_Zstd_decompressByteArray
  (JNIEnv *env, jclass obj, jbyteArray dst, jint dst_offset, jint dst_size, jbyteArray src, jint src_offset, jint src_size) {
    size_t size = (size_t)(0-ZSTD_error_memory_allocation);
    if (dst_offset + dst_size > (*env)->GetArrayLength(env, dst)) return ERROR(dstSize_tooSmall);
    if (src_offset + src_size > (*env)->GetArrayLength(env, src)) return ERROR(srcSize_wrong);
    void *dst_buff = (*env)->GetPrimitiveArrayCritical(env, dst, NULL);
    if (dst_buff == NULL) goto E1;
    void *src_buff = (*env)->GetPrimitiveArrayCritical(env, src, NULL);
    if (src_buff == NULL) goto E2;
    size = ZSTD_decompress(dst_buff + dst_offset, (size_t) dst_size, src_buff + src_offset, (size_t) src_size);
    (*env)->ReleasePrimitiveArrayCritical(env, src, src_buff, JNI_ABORT);
E2: (*env)->ReleasePrimitiveArrayCritical(env, dst, dst_buff, 0);
E1: return size;

lic9 avatar Mar 18 '21 23:03 lic9

Do we have any estimation on perf penalty for not using ZSTD_decompressDCtx?

lic9 avatar Mar 18 '21 23:03 lic9

I think they will perform the same. The penalty is maintaining duplicate code.

luben avatar Mar 19 '21 11:03 luben

I think they will perform the same. The penalty is maintaining duplicate code.

I see. It might be cleaner to just add Ctx without finalizer then. I took a look at the code, and finalizer in Ctx was from AutoCloseBase. Maybe we can extract "void finalize()" from AutoCloseBase to its subclasses, and then make two versions of Ctx like ZstdInputStream/ZstdOutputStream (the one with finalizer wraps non-finalizer one)?

BTW, the issue we have is on presto (https://github.com/prestodb/presto/pull/15028).

lic9 avatar Mar 19 '21 16:03 lic9

Yes, I am thinking we may not even need the finalizer. I just have to make sure we manually close the contexts everywhere we use them.

luben avatar Mar 19 '21 17:03 luben

Yes, I am thinking we may not even need the finalizer. I just have to make sure we manually close the contexts everywhere we use them.

Thanks! Let me know if something I could help!

lic9 avatar Mar 19 '21 17:03 lic9

@lic9, try v1.4.9-2 : it does not have finalizers in the Base class

luben avatar Mar 31 '21 02:03 luben

@lic9, try v1.4.9-2 : it does not have finalizers in the Base class

Thanks! Just tested it and the issue seemed gone. Will run more tests.

lic9 avatar Mar 31 '21 05:03 lic9

@luben we have upgraded to v1.4.9-2 in presto: https://github.com/prestodb/presto/pull/15966 thanks for the fix!

lic9 avatar Apr 20 '21 03:04 lic9

Just do let you know, finalization will likely be deprecated for removal, see JEP 421.

marschall avatar Dec 05 '21 15:12 marschall