
Use Google Magika for file type detection instead of Apache Tika

Open vlsi opened this issue 1 year ago • 2 comments

Use case

Currently, JMeter uses Apache Tika to detect file types.

Possible solution

We could replace Tika with Magika: https://google.github.io/magika/

On the other hand, Magika depends on TensorFlow, which might be a non-trivial dependency.

Possible workarounds

No response

JMeter Version

5.6.3

Java Version

No response

OS Version

No response

vlsi, Feb 16 '24 11:02

Do we have a real problem with using Tika? From reading the Magika repo, I gather it is a Python (and JavaScript?) solution. Is it easy to add to our dependencies? Does it work locally (without internet access)?

FSchumacher, Feb 16 '24 18:02

I thought Tika consumed significant space dependency-wise. The Magika model is ~1 MiB. However, Magika requires TensorFlow, so it would probably pull in a lot of dependencies :( Yes, it works without internet access.

vlsi, Feb 16 '24 18:02