Memory Allocation Optimization for String to Path Conversion on non-JVM Platforms

Open: blastmann opened this issue 10 months ago • 2 comments

Issue Description

When converting strings to Path objects using String.toPath() on non-JVM platforms, there's a potential memory allocation inefficiency. Unlike the JVM implementation that utilizes a SegmentPool for recycling Segment objects, the non-JVM implementation creates new Segment objects for each conversion without reusing them.

Technical Analysis

The current implementation in String.commonToPath() creates a new Buffer for each conversion:

internal fun String.commonToPath(normalize: Boolean): Path {
  return Buffer().writeUtf8(this).toPath(normalize)
}

On JVM platforms, Segment objects are pooled through the SegmentPool:

// JVM implementation
internal actual object SegmentPool {
  actual val MAX_SIZE = 64 * 1024 // 64 KiB
  // Segment recycling logic...
}

On non-JVM platforms, by contrast, the SegmentPool is effectively a no-op:

// Non-JVM implementation
internal actual object SegmentPool {
  actual val MAX_SIZE: Int = 0
  actual val byteCount: Int = 0
  actual fun take(): Segment = Segment()
  actual fun recycle(segment: Segment) {
  }
}

This means that each toPath() call on non-JVM platforms:

  1. Allocates a new Buffer
  2. Allocates one or more new Segment objects
  3. Runs the path normalization logic
  4. Reads the result into a ByteString
  5. Creates a new Path object

For applications that perform frequent path operations, this can lead to excessive memory allocations and increased GC pressure.
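To make the allocation pattern concrete, here is a minimal toy model of the no-op pool behaviour described above. It does not use Okio's real classes: Segment, NoOpSegmentPool, and simulateConversion are hypothetical stand-ins, and the 8192-byte array only models a fixed-size segment.

```kotlin
// Toy model only: these are NOT Okio's classes, just stand-ins with similar names.
class Segment {
    val data = ByteArray(8192) // models a fixed-size segment buffer
}

// Mirrors the no-op pool shown above: take() always allocates, recycle() drops.
object NoOpSegmentPool {
    var allocations = 0
        private set

    fun take(): Segment {
        allocations++
        return Segment()
    }

    fun recycle(segment: Segment) {
        // Intentionally empty: the segment simply becomes garbage.
    }
}

// Hypothetical stand-in for one String.toPath() call:
// borrow a segment, copy the string's bytes into it, hand it back.
fun simulateConversion(text: String) {
    val segment = NoOpSegmentPool.take()
    val bytes = text.encodeToByteArray()
    bytes.copyInto(segment.data, endIndex = minOf(bytes.size, segment.data.size))
    NoOpSegmentPool.recycle(segment)
}

fun main() {
    repeat(10_000) { i -> simulateConversion("user/documents/file$i.txt") }
    println("Segments allocated: ${NoOpSegmentPool.allocations}") // prints 10000
}
```

In this model every conversion costs one fresh segment, because recycle() never retains anything for take() to hand out again.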

Steps to Reproduce

Benchmark code that demonstrates the issue:

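import kotlin.time.measureTime
import okio.Path.Companion.toPath

// Create many paths from strings in a loop and time the whole run.
// measureTime is used because it is available in common Kotlin code.
fun benchmarkPathCreation() {
    val elapsed = measureTime {
        for (i in 1..10000) {
            val path = "user/documents/file$i.txt".toPath()
            // Use path...
        }
    }
    println("Time: ${elapsed.inWholeMilliseconds}ms")
}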

On non-JVM platforms, this creates 10,000+ Buffer and Segment objects that must be garbage collected.

Proposed Solutions

  1. Implement a Segment pooling mechanism for non-JVM platforms: similar to the JVM implementation but adapted for non-JVM environments.

  2. Add a caching mechanism for Path objects: introduce a Path cache for commonly used paths, potentially as an opt-in feature.

  3. Optimize the Path creation process: for simple paths that don't require normalization, consider a more direct creation path.

  4. Provide developer guidance: document this behavior and provide best practices for efficient Path handling on non-JVM platforms.
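As a rough illustration of the first proposal, the sketch below shows one way a bounded segment pool could work. It is not Okio's implementation: Segment and SimpleSegmentPool are hypothetical names, the 64 KiB cap is borrowed from the JVM MAX_SIZE quoted earlier, and the pool is single-threaded, so a real Kotlin/Native version would need atomics or locking.

```kotlin
// Hypothetical sketch, not Okio's implementation. Not thread-safe.
class Segment {
    val data = ByteArray(SIZE)
    var pos = 0              // read position within data
    var limit = 0            // write position within data
    var next: Segment? = null

    companion object {
        const val SIZE = 8192
    }
}

class SimpleSegmentPool(private val maxSize: Int = 64 * 1024) {
    private var head: Segment? = null // singly linked free list of idle segments

    var byteCount = 0                 // bytes currently held by the pool
        private set
    var allocations = 0               // how many segments were freshly allocated
        private set

    // Reuse a pooled segment when one is available, otherwise allocate.
    fun take(): Segment {
        val pooled = head
        if (pooled == null) {
            allocations++
            return Segment()
        }
        head = pooled.next
        pooled.next = null
        byteCount -= Segment.SIZE
        return pooled
    }

    // Keep the segment for reuse unless the pool is already at capacity.
    fun recycle(segment: Segment) {
        if (byteCount + Segment.SIZE > maxSize) return // pool full: leave it to the GC
        segment.pos = 0
        segment.limit = 0
        segment.next = head
        head = segment
        byteCount += Segment.SIZE
    }
}

fun main() {
    val pool = SimpleSegmentPool()
    repeat(10_000) {
        val segment = pool.take()
        // ... write the path bytes into segment.data here ...
        pool.recycle(segment)
    }
    println("Segments allocated: ${pool.allocations}") // prints 1
}
```

The size cap is the main design choice here: it bounds how much idle memory the pool may hold, at the cost of falling back to plain allocation during bursts.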

Environment

  • Okio version: 3.9.7
  • Platforms affected: Non-JVM platforms (Kotlin Native)
  • Priority: Medium (performance optimization)

blastmann, May 27 '25 12:05

Yeah, good call. I think implementing segment pooling on non-JVM platforms is an awesome idea.

swankjesse, May 28 '25 21:05

I reimplemented SegmentPool on native platform using kotlinx-atomicfu, which significantly reduced memory pressure. :)

blastmann, May 29 '25 10:05