
Half-mode LSTM MatMuls failing to optimize with "No matrix multiplier for F16xF32 to F16"

Open: VariantXYZ opened this issue 2 years ago • 1 comment

Hi again!

Tract's performance has been fantastic so far, and I've had a lot of good luck running F16 on different models on hardware that supports it. However, I ran into an issue when experimenting with an ONNX-converted version of this model: the half-precision translation succeeds, but the optimization pass then fails with the following output.

Model translated!
Error: running pass codegen

Caused by:
    0: codegen node #46 "MatMul;functional_3_lstm_4_lstm_cell_4_StatefulPartitionedCall_MatMul1_Gemm__13.ab.split-over-1.384..512" MatMulUnary
    1: No matrix multiplier for F16xF32 to F16
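
For readers unfamiliar with the message: it names the types of the two MatMul operands and of the result, so "F16xF32 to F16" reads as an F16 operand multiplied by an F32 operand to produce F16. A minimal toy sketch of how such an error can arise is below. It is not tract's code; `Dt`, `kernels` and `select_kernel` are hypothetical names, and the assumption that only same-type kernels are registered is made purely for illustration.

```rust
use std::collections::HashSet;

// Hypothetical stand-in for a datum type; not tract's DatumType.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Dt {
    F16,
    F32,
}

// Hypothetical kernel table: assume only same-type kernels are registered.
fn kernels() -> HashSet<(Dt, Dt, Dt)> {
    HashSet::from([
        (Dt::F16, Dt::F16, Dt::F16),
        (Dt::F32, Dt::F32, Dt::F32),
    ])
}

// Look up a kernel for operands `a` and `b` producing `c`.
// A mixed-precision combination has no entry, so the lookup fails.
fn select_kernel(a: Dt, b: Dt, c: Dt) -> Result<(Dt, Dt, Dt), String> {
    if kernels().contains(&(a, b, c)) {
        Ok((a, b, c))
    } else {
        Err(format!("No matrix multiplier for {:?}x{:?} to {:?}", a, b, c))
    }
}

fn main() {
    // Both operands in half precision: a kernel is found.
    println!("{:?}", select_kernel(Dt::F16, Dt::F16, Dt::F16));
    // One operand still in F32: no kernel is registered for the mix.
    println!("{:?}", select_kernel(Dt::F16, Dt::F32, Dt::F16));
}
```

Under this reading, the failure suggests that one input of node #46 was still F32 after the half-precision translation. That is an inference from the message alone, not something confirmed in this thread.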

The model isn't anything particularly secret (I just used tf2onnx to quickly convert and test it): 1.zip

use std::error::Error;
use std::io::Cursor;
use tract_core::prelude::*;
use tract_onnx::prelude::*;
use tract_core::model::translator::Translate;

fn main() -> Result<(), Box<dyn Error>> {
    let tract = tract_onnx::onnx();

    // Embed the ONNX file in the binary and read it from memory.
    let mut c = Cursor::new(include_bytes!("./1.onnx") as &[u8]);

    // Load the model and run shape and type inference.
    let mut m = tract.model_for_read(&mut c)?;
    m.analyse(true)?;

    // Convert to a typed model and simplify the graph.
    let mut m = m.into_typed()?;
    m.declutter()?;

    // Translate the model to half precision (F16).
    m = tract_core::half::HalfTranslator.translate_model(&m)?;

    println!("Model translated!");

    m = m.into_optimized()?; // Will fail here

    println!("Model optimized!");

    let _ = m.into_runnable()?;

    println!("Model runnable!");

    Ok(())
}

VariantXYZ, Mar 11 '23 02:03

Could you give the current main branch a shot? I think this is solved.

kali, May 15 '23 07:05