
Export via `iree-opt-parameter-archive-export-file` bloats file size more than expected

Open ScottTodd opened this issue 1 year ago • 0 comments

What happened?

I'm trying to extract large constants from a .mlirbc file into a .irpa file using the iree-opt-parameter-archive-export-file pass. The original .mlirbc file is 417MB, but the extracted .irpa file is 1.05GB. I'd expect the .irpa file to be close to the input size, not roughly 2.5x as large.

Steps to reproduce your issue

I'm specifically testing with model_BertForMaskedLMTF.timestamp_1683504734j.mlirbc from https://gist.github.com/ScottTodd/db7daa932e50ea9c10d3621d586d0e6a. There are smaller files there that could also be tested.

  1. Download file: gcloud storage cp gs://iree-github-actions-postsubmit-artifacts/7878012964/1/e2e-test-artifacts/model_BertForMaskedLMTF.timestamp_1683504734j.mlirbc D:\dev\projects\iree-data\models\2024_02_13

  2. Run the compiler using that pass:

    ..\iree-build\tools\iree-compile \
      D:\dev\projects\iree-data\models\2024_02_13\model_BertForMaskedLMTF.timestamp_1683504734j.mlirbc \
      --iree-opt-parameter-archive-export-file=D:\dev\projects\iree-data\models\2024_02_13\parameters.irpa \
      --iree-opt-parameter-archive-export-scope=compile \
      --compile-to=flow \
      -o D:\dev\projects\iree-data\models\2024_02_13\model_BertForMaskedLMTF.timestamp_1683504734j_output.mlir
    
  3. Observe the file size of parameters.irpa compared to the .mlirbc
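
For step 3, a small script can make the comparison repeatable. This is only a sketch: `size_ratio` and `report` are hypothetical helper names, not part of IREE, and the two paths are passed on the command line.

```python
import os
import sys


def size_ratio(input_bytes: int, output_bytes: int) -> float:
    """Return how many times larger the output is than the input."""
    return output_bytes / input_bytes


def report(input_path: str, output_path: str) -> str:
    """Build a one-line summary comparing two files on disk."""
    input_bytes = os.path.getsize(input_path)
    output_bytes = os.path.getsize(output_path)
    ratio = size_ratio(input_bytes, output_bytes)
    return (
        f"input: {input_bytes:,} bytes, "
        f"output: {output_bytes:,} bytes, "
        f"ratio: {ratio:.2f}x"
    )


if __name__ == "__main__":
    # Usage: python compare_sizes.py <input.mlirbc> <parameters.irpa>
    if len(sys.argv) == 3:
        print(report(sys.argv[1], sys.argv[2]))
    else:
        print("usage: compare_sizes.py <input.mlirbc> <parameters.irpa>")
```

With the sizes reported above (417MB and 1.05GB, assuming decimal units), the ratio comes out at about 2.5x.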

What component(s) does this issue relate to?

Compiler

Version information

IREE source build from https://github.com/openxla/iree/commit/246edee03074f6f8126aa3f6ae7d815f32f245d9

Additional context

The output .mlir after running the pass has contents like this:

module {
  util.global private @constant_hoisted = #stream.parameter.named<"compile"::"constant_hoisted"> : tensor<512x768xf32>
  util.global private @constant_hoisted_0 = #stream.parameter.named<"compile"::"constant_hoisted_0"> : tensor<2x768xf32>
  util.global private @constant_hoisted_1 = #stream.parameter.named<"compile"::"constant_hoisted_1"> : tensor<30522x768xf32>
  util.global private @constant_hoisted_2 = #stream.parameter.named<"compile"::"constant_hoisted_2"> : tensor<768xf32>
  util.global private @constant_hoisted_3 = #stream.parameter.named<"compile"::"constant_hoisted_3"> : tensor<768xf32>
  util.global private @constant_hoisted_4 = #stream.parameter.named<"compile"::"constant_hoisted_4"> : tensor<768xf32>
  util.global private @constant_hoisted_5 = #stream.parameter.named<"compile"::"constant_hoisted_5"> : tensor<768x768xf32>
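
To sanity check the archive size, the raw byte size of each exported parameter can be estimated from its tensor type. The sketch below is not an IREE tool: the regular expression, the element-size table, and the helper names are assumptions for illustration, and it only handles static shapes with the common element types listed.

```python
import re
from math import prod

# Byte widths for a few common element types (assumed; extend as needed).
ELEMENT_BYTES = {
    "f64": 8, "f32": 4, "f16": 2, "bf16": 2,
    "i64": 8, "i32": 4, "i16": 2, "i8": 1,
}

# Matches static tensor types such as tensor<512x768xf32> or tensor<f32>.
TENSOR_TYPE = re.compile(r"tensor<((?:\d+x)*)([a-z]+\d+)>")


def tensor_bytes(type_text):
    """Return the raw byte size of the first tensor type in type_text, or None."""
    match = TENSOR_TYPE.search(type_text)
    if match is None:
        return None
    dims_text, element = match.groups()
    if element not in ELEMENT_BYTES:
        return None
    dims = [int(d) for d in dims_text.split("x") if d]
    # prod([]) is 1, so a rank-0 tensor counts as a single element.
    return prod(dims) * ELEMENT_BYTES[element]


def parameter_bytes(mlir_text):
    """Sum raw tensor sizes over lines that reference a named parameter."""
    total = 0
    for line in mlir_text.splitlines():
        if "#stream.parameter.named" not in line:
            continue
        size = tensor_bytes(line)
        if size is not None:
            total += size
    return total


# The seven globals shown in the excerpt above.
SAMPLE = '''
  util.global private @constant_hoisted = #stream.parameter.named<"compile"::"constant_hoisted"> : tensor<512x768xf32>
  util.global private @constant_hoisted_0 = #stream.parameter.named<"compile"::"constant_hoisted_0"> : tensor<2x768xf32>
  util.global private @constant_hoisted_1 = #stream.parameter.named<"compile"::"constant_hoisted_1"> : tensor<30522x768xf32>
  util.global private @constant_hoisted_2 = #stream.parameter.named<"compile"::"constant_hoisted_2"> : tensor<768xf32>
  util.global private @constant_hoisted_3 = #stream.parameter.named<"compile"::"constant_hoisted_3"> : tensor<768xf32>
  util.global private @constant_hoisted_4 = #stream.parameter.named<"compile"::"constant_hoisted_4"> : tensor<768xf32>
  util.global private @constant_hoisted_5 = #stream.parameter.named<"compile"::"constant_hoisted_5"> : tensor<768x768xf32>
'''

if __name__ == "__main__":
    print(f"{parameter_bytes(SAMPLE):,} bytes across the sampled globals")
```

Running the same sum over the full output .mlir and comparing it with the size of parameters.irpa would show how much of the archive is tensor data and how much is overhead such as padding or duplication.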

The .mlirbc was converted to .mlir with:

    python -m iree.compiler.tools.ir_tool copy D:\dev\projects\iree-data\models\2024_02_13\model_BertForMaskedLMTF.timestamp_1683504734j.mlirbc -o D:\dev\projects\iree-data\models\2024_02_13\model_BertForMaskedLMTF.timestamp_1683504734j.mlir

The input .mlir looks like this:

module {
  ml_program.global public @vars.__sm_node18__m.bert.embeddings.embeddings(dense<"0x73678F3C1FF ... 170A9FBD"> : tensor<512x768xf32>) : tensor<512x768xf32>
  ml_program.global public @vars.__sm_node17__m.bert.embeddings.token_type_embeddings(dense<"0x1A4EE2391608343C51C572 ...

ScottTodd, Feb 13 '24 20:02