
What is inside llama-2-70b consolidated.00.pth file and how do I read it?

Open imvetri opened this issue 1 year ago • 8 comments

I tried to print out the contents of the consolidated.00.pth file using the lines of code below.

import torch

model = torch.load('consolidated.00.pth')
print(model)

What struck me was the content after printing like that: the exported text is around 317 KB, whereas the consolidated.00.pth file is close to 17.25 GB. weights.txt

Are there other contents in the file? How do I see them? I've attached the exported content for reference.
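(Editor's note: the size difference has a simple explanation. The .pth file holds a state dict mapping parameter names to tensors, and `print()` abbreviates large tensors with `...`, so the printed text is tiny even though the tensors are huge. A minimal sketch of how to inspect a checkpoint properly, using a small hypothetical stand-in state dict instead of the real 17 GB file; for the real checkpoint you would start from `torch.load('consolidated.00.pth', map_location='cpu')`.)

```python
import torch

# Hypothetical stand-in for the real state dict loaded from consolidated.00.pth.
# The key name mimics the llama naming scheme but the shape is made up.
state_dict = {"layers.0.attention.wq.weight": torch.zeros(8, 8, dtype=torch.float16)}

# List every parameter with its shape and dtype instead of printing raw tensors.
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape), tensor.dtype)

# Summing numel * element_size over all tensors recovers the true on-disk scale.
total_bytes = sum(t.numel() * t.element_size() for t in state_dict.values())
print(f"total parameter bytes: {total_bytes}")
```

Iterating the real state dict the same way shows gigabytes of fp16 weights, which accounts for the file size.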

imvetri avatar Jul 25 '23 06:07 imvetri

My Thoughts before this question.

Input + function gives an output: that is the conventional method of computation. Fundamentally, AI/ML/DL instead computes the function with the help of inputs and known outputs.

Ideally, a 100% learnt AI should result in a function, but the community seems to be reaching for consciousness, or singularity, or a self-aware system, which was tried before with conventional programs and is being tried again with modern programs that are a flipped version of the previous paradigm.

In other words, mathematicians tricked human eyes with theories, symbols, and meanings that only they get right. In my words, job security for mathematicians.

imvetri avatar Jul 25 '23 19:07 imvetri

Input + function gives an output: that is the conventional method of computation. Fundamentally, AI/ML/DL instead computes the function with the help of inputs and known outputs.

Good idea. Inspired by your thoughts, I think maybe the next paradigm for DL is training a parameter-adjustable model that fits different functions.

XitaoLi avatar Jul 26 '23 05:07 XitaoLi

@XitaoLi it will result in Mandelbrot-like recursion. I mean, whenever recursion is involved, it's a good idea to snap out of that design and apply some other technique.

imvetri avatar Jul 26 '23 20:07 imvetri

How can I load the model ('consolidated.00.pth') and test a prompt? I can do it via GPU or CPU, but I have no idea how to run the whole thing.

hteller22 avatar Aug 10 '23 01:08 hteller22
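(Editor's note: one way to run a prompt, assuming you are using Meta's llama repository with the downloaded weights and tokenizer.model; the directory names below are assumptions. The 70B checkpoint is sharded across 8 consolidated.*.pth files, so the repo's example script is launched with 8 processes.)

```shell
# Sketch, run from a clone of github.com/facebookresearch/llama
# with llama-2-70b/ and tokenizer.model in place.
torchrun --nproc_per_node 8 example_text_completion.py \
    --ckpt_dir llama-2-70b/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 128 --max_batch_size 4
```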

The .pth files are basically zip files. They should be converted via a convert.py script. See this thread from llama.cpp:

https://github.com/ggerganov/llama.cpp/issues/707
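(Editor's note: the zip claim is easy to verify yourself. Since PyTorch 1.6, `torch.save` writes a zip archive by default, so a .pth checkpoint opens with the standard `zipfile` module. A small sketch using a tiny stand-in file rather than the real consolidated.00.pth:)

```python
import zipfile
import torch

# Save a tiny checkpoint as a stand-in for consolidated.00.pth.
torch.save({"w": torch.ones(2)}, "tiny.pth")

# The default torch.save format is a zip archive...
print(zipfile.is_zipfile("tiny.pth"))

# ...containing a pickled object graph (data.pkl) plus raw tensor data blobs.
with zipfile.ZipFile("tiny.pth") as z:
    print(any(name.endswith("data.pkl") for name in z.namelist()))
```

Both lines print `True`; conversion scripts like llama.cpp's rely on exactly this layout to pull the tensors out.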

jboero avatar Aug 20 '23 15:08 jboero

Would love to know the difference between these model formats now, such as bin, pth, pt, onnx, hf, etc., and how to convert and view them. In the process of using them, I found that many models used in llama.cpp cannot be used in the llama2.c project.

KangkangStu avatar Oct 12 '23 00:10 KangkangStu

Same here. The LLM revolution brings a set of new ways/techs to store and use AI models that I didn't necessarily know before. If someone has some resources to read, please share!

aelyoussfi avatar Jan 10 '24 10:01 aelyoussfi