Comparing Python and Go gives different embeddings

Open ThomasAlxDmy opened this issue 1 year ago • 4 comments

Running the same local model through PyTorch and spago gives different embeddings for the same input. Let me know if I'm missing something.

Python code:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

query = "test"
embeddings = model.encode(query)
print(embeddings)

Go code:

package main

import (
	"context"
	"fmt"

	. "github.com/nlpodyssey/cybertron/examples"
	"github.com/nlpodyssey/cybertron/pkg/models/bert"
	"github.com/nlpodyssey/cybertron/pkg/tasks"
	"github.com/nlpodyssey/cybertron/pkg/tasks/textencoding"
	"github.com/rs/zerolog"
	"github.com/rs/zerolog/log"
)

func main() {
	zerolog.SetGlobalLevel(zerolog.DebugLevel)
	LoadDotenv()

	modelsDir := HasEnvVar("CYBERTRON_MODELS_DIR")
	modelName := HasEnvVar("CYBERTRON_MODEL") // sentence-transformers_all-MiniLM-L6-v2

	m, err := tasks.Load[textencoding.Interface](&tasks.Config{ModelsDir: modelsDir, ModelName: modelName})
	if err != nil {
		log.Fatal().Err(err).Send()
	}
	defer tasks.Finalize(m)

	fn := func(text string, model int) error {
		result, err := m.Encode(context.Background(), text, model) //int(bert.MeanPooling)
		if err != nil {
			return err
		}
		fmt.Printf("%#v - %d\n\n", result.Vector.Data(), result.Vector.Size())
		return nil
	}

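	// Report an encoding failure instead of silently dropping the error.
	if err := fn("test", int(bert.MeanPooling)); err != nil {
		log.Fatal().Err(err).Send()
	}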
}

Python results:

[ 1.15734097e-02  2.51362156e-02 -3.67017724e-02  5.93248047e-02
 -7.14903139e-03 -4.11940068e-02  7.70873725e-02  3.74424718e-02
  1.24490950e-02 -6.11765310e-03  1.70342419e-02 -7.70153478e-02
 -3.94236675e-04  2.79090423e-02 -1.59890689e-02 -6.82753772e-02
  8.88470281e-03 -2.02807318e-02 -8.03598985e-02 -1.30741457e-02
 -4.10998389e-02 -2.58980840e-02 -2.65385620e-02  3.30523401e-02
 -2.20791362e-02  2.10462175e-02 -5.79220355e-02  3.29487510e-02
  2.97074020e-02 -6.22483529e-02  3.87880951e-02  3.19906622e-02
  1.53306685e-02  4.53070812e-02  5.31493574e-02  1.33605571e-02
  4.12249155e-02  2.81428993e-02  1.93984509e-02 -3.25234700e-03
 -3.61238397e-03 -1.42860323e-01  3.80711891e-02 -1.09161483e-02
  2.60939933e-02  4.13699113e-02 -1.60157923e-02  5.35601340e-02
 -5.68594560e-02  1.22467233e-02 -3.49965207e-02 -3.97541597e-02
 -4.61429730e-02 -3.91123071e-02 -1.80036388e-02  2.16342341e-02
 -6.46130042e-03 -2.65695453e-02  4.87287715e-02  4.34006192e-02
  4.61999923e-02 -3.44789997e-02 -2.42192708e-02  5.56490794e-02
  2.46245228e-02  1.94855593e-02  1.13528064e-02 -2.57623438e-02
 -3.22445184e-02 -3.26521583e-02 -1.45958448e-02  1.46902548e-02
  1.03045451e-02  7.40797743e-02  8.00616965e-02 -4.14367346e-03
  2.71425070e-03 -7.67467245e-02  5.88153452e-02 -4.83907200e-03
 -8.73632878e-02 -4.64620627e-02 -3.96254547e-02  5.24941050e-02
  2.48015728e-02  8.08015168e-02  1.19813651e-01  3.17714810e-02
 -1.13972493e-01  1.06738340e-02  1.97238214e-02  3.87183167e-02
 -3.45423818e-02 -6.24317629e-03 -4.25508097e-02  2.26286557e-02
  8.29358026e-03 -1.94437262e-02 -2.32510027e-02  2.45337605e-01
  4.94420677e-02  2.91474573e-02 -5.75371133e-03 -2.04195566e-02
 -7.76114091e-02 -4.29791398e-02  4.30365093e-04 -6.69260472e-02
  7.07604215e-02  6.50699669e-03 -5.84452003e-02 -5.05727017e-03
  3.09437048e-02  2.75796149e-02  2.71111522e-02 -6.81862310e-02
 -6.57688156e-02  5.90776876e-02 -3.18941548e-02  4.06060964e-02
  5.71018159e-02  6.90348493e-03  6.20758208e-03 -1.32715832e-02
  3.12405266e-02 -4.02969234e-02  7.08522648e-02 -4.84628072e-33
  2.19342969e-02 -1.02704786e-01  5.62132895e-02  9.75837857e-02
 -5.26849367e-02  1.90976840e-02 -1.05075100e-02  7.03573003e-02
 -9.12332814e-03  5.95706254e-02  9.33859590e-03 -1.55230304e-02
 -2.32740566e-02  2.39768121e-02  1.02088429e-01  9.32035223e-02
 -3.31360325e-02  1.17215114e-02 -6.56005815e-02  3.14648002e-02
 -2.38079075e-02 -4.87391874e-02  1.22579187e-02 -4.00142260e-02
 -7.65081719e-02 -3.55533287e-02 -1.36226986e-03 -1.64291803e-02
  1.68824978e-02 -3.74853582e-04 -5.27248904e-02  2.98809353e-02
 -7.29111806e-02  6.91225529e-02 -1.76078174e-02 -5.51019795e-03
  1.29546402e-02 -2.27018893e-02  2.67412830e-02 -2.58457605e-02
 -4.02021147e-02 -1.34951025e-02  7.99826521e-04  2.86012031e-02
  3.19424532e-02 -3.16071324e-02 -2.95498148e-02 -2.02512965e-02
  4.81779017e-02 -1.30770076e-03 -1.37471985e-02  2.00302619e-02
 -6.87055364e-02 -2.19608527e-02 -3.13321128e-02  4.92803864e-02
  1.20947342e-02 -5.88681251e-02 -2.64657382e-02  5.98899983e-02
  6.76487088e-02  3.40156741e-02 -5.28843887e-02  5.97194210e-02
 -2.54826006e-02 -2.07300168e-02 -5.38255349e-02 -9.74107385e-02
  4.79282700e-02  5.23997098e-02 -2.32595764e-02 -6.90713674e-02
  1.66568179e-02  2.84764096e-02 -2.92048529e-02 -3.54866721e-02
 -1.26442248e-02  7.33398721e-02 -1.94349699e-02 -6.32789806e-02
  9.60664675e-02 -7.74386898e-02  1.59395430e-02 -4.48008589e-02
  1.63027123e-02 -7.48103543e-04 -8.70272145e-03 -9.88140479e-02
  5.74282184e-03 -7.19235390e-02 -3.26777287e-02  1.98944993e-02
  3.85973416e-03 -2.55510844e-02  8.23836997e-02  4.08626816e-33
 -2.94882022e-02  2.55514625e-02 -5.10609001e-02  1.55312389e-01
  5.23114502e-02 -3.45481671e-02  1.33146763e-01 -1.92097016e-02
 -5.97670749e-02  1.22896463e-01  1.02029452e-02 -4.96731438e-02
  5.84671088e-02  1.27327591e-02 -1.65863857e-02  1.27954753e-02
  4.57581691e-02 -6.98214546e-02 -4.85241972e-02 -4.96347249e-03
 -9.04365331e-02  6.99212551e-02  9.38911736e-03 -6.74472703e-03
 -1.06092438e-01  3.10949180e-02  4.94258851e-02 -4.48710956e-02
 -7.37172598e-03 -3.35610174e-02  7.60582983e-02  7.23909773e-03
 -4.22013626e-02  7.07915574e-02  4.74720746e-02  2.07829997e-02
  1.53310582e-01 -8.39406624e-03 -2.58807223e-02  6.07991256e-02
  6.68156371e-02  6.47229999e-02  4.98201810e-02  8.87849852e-02
 -3.29411589e-02  7.03574717e-02  1.71954595e-02 -3.01853232e-02
  3.85437869e-02  4.84696999e-02 -6.05099984e-02  3.05323098e-02
  1.56038767e-02 -3.04213203e-02 -9.44044720e-03 -4.10514362e-02
 -6.78978413e-02  1.01995897e-02 -2.56566592e-02  2.17156895e-02
 -6.99786693e-02  9.24746767e-02 -3.57198082e-02  7.01379925e-02
 -6.34204075e-02 -3.29403989e-02 -4.61963229e-02  5.41399866e-02
  5.17233536e-02  4.29222398e-02  1.34750502e-02  1.65966749e-02
 -4.41072211e-02 -1.97202079e-02  3.62014920e-02 -1.96613856e-02
 -1.15679063e-01  5.95492311e-03  4.56692092e-03 -4.49428000e-02
 -6.84021264e-02 -8.53045881e-02 -7.09521770e-02  8.03839117e-02
 -5.79829253e-02  5.78272790e-02  5.02265505e-02  5.94230704e-02
 -3.65563557e-02  9.26963147e-03  5.25237620e-02  2.79896203e-02
 -3.33691314e-02 -5.07849492e-02 -1.28647508e-02 -1.42978536e-08
 -4.05251831e-02 -8.57909322e-02  4.51683141e-02  2.16769502e-02
 -2.23385114e-02  1.22076636e-02 -3.24891433e-02 -1.69530045e-02
 -2.71710195e-02  6.00284012e-03  4.02760841e-02  2.69626696e-02
 -3.56246680e-02  7.40884915e-02  3.23740840e-02 -9.05680060e-02
 -3.17415819e-02  4.09252234e-02 -9.95599013e-03  3.06883864e-02
 -7.69139305e-02  4.15846035e-02  1.96060937e-04  6.27766177e-02
 -3.60905975e-02  4.88440134e-02  5.42269610e-02  1.26619861e-01
 -3.84874595e-03  8.29435594e-04  6.96140304e-02  4.40050326e-02
 -3.20810415e-02 -8.52382705e-02  1.37698781e-02  2.28018332e-02
 -2.84722401e-03 -6.78517343e-03  3.75879258e-02  3.52769010e-02
 -6.67841882e-02  2.15264577e-02  3.75266112e-02 -4.54255491e-02
 -5.10316379e-02 -6.79947212e-02 -3.08671556e-02 -3.63903157e-02
 -1.48750050e-02 -9.36829150e-02 -3.15773897e-02  1.02417255e-02
  1.50771607e-02 -2.38304259e-03  2.41354425e-02 -1.32853491e-02
  6.58379262e-03  2.44415142e-02 -1.37135938e-01  6.39142469e-02
  1.96718305e-01 -6.02960400e-03  5.31940609e-02 -5.52259758e-02]

Go results:

{0.073092446, 0.15883854, -0.23150429, 0.3737195, -0.045419022, -0.25985754, 0.48668596, 0.23607449, 0.078990184, -0.03873565, 0.107460454, -0.48563772, -0.002660518, 0.17542723, -0.100574106, -0.43053365, 0.056375634, -0.12829961, -0.5070323, -0.082665384, -0.259095, -0.16259491, -0.16784763, 0.20849757, -0.1396307, 0.13265425, -0.36529058, 0.20809135, 0.18775468, -0.392495, 0.24458688, 0.20150869, 0.09667532, 0.28612027, 0.33481535, 0.08408837, 0.25986904, 0.17728475, 0.122626334, -0.02040849, -0.022894314, -0.90110195, 0.24004027, -0.06925271, 0.1646482, 0.2612434, -0.101491354, 0.33788753, -0.358114, 0.07691091, -0.2211344, -0.25070798, -0.2909025, -0.24663922, -0.11360111, 0.1363139, -0.041110944, -0.1674774, 0.30678973, 0.27371323, 0.29163897, -0.21735017, -0.15233241, 0.35115469, 0.15551436, 0.122688055, 0.071150586, -0.16229197, -0.20346418, -0.20626289, -0.09228819, 0.09243818, 0.06517689, 0.4678039, 0.50458044, -0.026733657, 0.017010693, -0.48380655, 0.3709105, -0.030437863, -0.551056, -0.29292658, -0.24986246, 0.33100322, 0.15610716, 0.50885403, 0.75587225, 0.20066194, -0.71863174, 0.0674511, 0.12467655, 0.24413729, -0.21850702, -0.038685925, -0.26844263, 0.14311884, 0.052773923, -0.122992516, -0.14688128, 1.5471901, 0.31191814, 0.18406391, -0.036392987, -0.12890016, -0.48959765, -0.27115762, 0.0027654967, -0.42180437, 0.44626626, 0.040984243, -0.3687184, -0.031970464, 0.19484352, 0.17355542, 0.17123955, -0.43002805, -0.41466662, 0.3728804, -0.20093672, 0.25623977, 0.35958108, 0.04410128, 0.039169278, -0.083394825, 0.19674765, -0.25434807, 0.44673866, -3.0581147e-32, 0.13811417, -0.647762, 0.35453412, 0.6162006, -0.33254588, 0.12036189, -0.06596503, 0.4433381, -0.05786411, 0.3761898, 0.059025865, -0.09769136, -0.14660655, 0.15117505, 0.6438716, 0.58792484, -0.20915872, 0.07392529, -0.4138157, 0.19814381, -0.14994372, -0.30714953, 0.07749607, -0.25284076, -0.4823181, -0.22429717, -0.008661747, -0.103971146, 0.10623525, -0.0025554697, -0.33243817, 
0.18872425, -0.45963675, 0.43581793, -0.11105046, -0.03485974, 0.081400424, -0.14338344, 0.16870368, -0.1625949, -0.2536515, -0.08509379, 0.0048967004, 0.17998235, 0.20139122, -0.19899516, -0.18618342, -0.12741189, 0.30301422, -0.008304736, -0.086622104, 0.12645222, -0.43342936, -0.13815638, -0.19722173, 0.31083858, 0.07622314, -0.37107897, -0.16687179, 0.37728214, 0.4269638, 0.21448648, -0.3340206, 0.37735975, -0.16075447, -0.13019495, -0.33931202, -0.61420673, 0.30258822, 0.33024263, -0.14714494, -0.4357546, 0.10523506, 0.18014097, -0.18404293, -0.22402923, -0.07950021, 0.4622601, -0.1229818, -0.39823717, 0.60578024, -0.48788673, 0.100366816, -0.2827474, 0.103066936, -0.004805535, -0.054816663, -0.6233817, 0.036420044, -0.45356828, -0.20612428, 0.12596451, 0.024198275, -0.16129492, 0.5197323, 2.576408e-32, -0.18639922, 0.16119629, -0.32215318, 0.97975993, 0.3301013, -0.21803677, 0.8395257, -0.121200025, -0.37718356, 0.77559584, 0.06405789, -0.31360692, 0.36841178, 0.08033751, -0.10495376, 0.08034587, 0.2883264, -0.4405567, -0.3062572, -0.031433396, -0.5700438, 0.4409865, 0.059707426, -0.04280872, -0.66914815, 0.19560885, 0.3120206, -0.28288692, -0.047189973, -0.21215516, 0.47993192, 0.04499831, -0.26636782, 0.4467379, 0.29939848, 0.13133389, 0.96724695, -0.053090006, -0.16352084, 0.3831719, 0.42112204, 0.40811715, 0.31404072, 0.5594851, -0.20792592, 0.44344038, 0.10866323, -0.1901406, 0.24283993, 0.30536872, -0.38104373, 0.19260943, 0.09832611, -0.19200361, -0.05989046, -0.25943702, -0.4287165, 0.06438775, -0.1618985, 0.13687581, -0.44088525, 0.58273256, -0.22531043, 0.44281122, -0.40057015, -0.20713249, -0.29109177, 0.3413024, 0.32587957, 0.27112812, 0.08499702, 0.10473879, -0.27884433, -0.12430772, 0.22811446, -0.12421482, -0.7296945, 0.037679397, 0.028749615, -0.28372627, -0.4318576, -0.53830934, -0.44768983, 0.50719196, -0.36583027, 0.3647774, 0.31663495, 0.37479824, -0.23002678, 0.05914798, 0.33172256, 0.17711763, -0.2097901, -0.32009152, -0.081330135, 
-9.016733e-08, -0.25536966, -0.5410277, 0.284962, 0.1363995, -0.14117293, 0.07706172, -0.20459078, -0.10649018, -0.17104845, 0.037345253, 0.25411493, 0.1701188, -0.22485965, 0.46689534, 0.20448816, -0.57127875, -0.19971165, 0.2577933, -0.06263276, 0.19327301, -0.4846681, 0.262111, 0.0014361069, 0.39600477, -0.22753182, 0.30760753, 0.34258533, 0.7984235, -0.024403652, 0.0053556664, 0.43905813, 0.2772461, -0.20236914, -0.53738344, 0.086288534, 0.14369787, -0.017453229, -0.042886734, 0.23677924, 0.22266392, -0.42127606, 0.13600984, 0.23676494, -0.28648943, -0.32214728, -0.42859158, -0.19471586, -0.22912739, -0.0936538, -0.5910691, -0.1996514, 0.06473716, 0.09553926, -0.0146507155, 0.15216777, -0.08418982, 0.04153511, 0.15437761, -0.86496675, 0.40314007, 1.240751, -0.03796559, 0.33547962, -0.3481118}

ThomasAlxDmy avatar Mar 29 '23 04:03 ThomasAlxDmy

Hey, Thomas.

Just a thought that may help: try applying the L2 norm to the Go embeddings.

I ran:

a = np.array([0.073092446, 0.15883854, -0.23150429, ..])
a / np.sqrt(np.sum(np.square(a)))

Output:

array([ 1.15892480e-02, 2.51848081e-02, -3.67064008e-02, ..])

According to np.allclose(), this matches the Python embeddings.

Best, Fedor
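
The suggestion above can be sketched with the Python standard library alone. The first three components are the ones quoted in this thread; the function name `l2_normalize`, the toy vector, and the 1% tolerance are assumptions made for illustration. The quoted values agree to roughly three significant digits, so the comparison uses a tolerance looser than typical library defaults.

```python
import math


def l2_normalize(vec):
    """Return vec scaled to unit Euclidean (L2) length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)  # leave the zero vector unchanged
    return [x / norm for x in vec]


# First three components quoted in this thread.
python_head = [1.15734097e-02, 2.51362156e-02, -3.67017724e-02]
normalized_go_head = [1.15892480e-02, 2.51848081e-02, -3.67064008e-02]

# Compare with an explicit, assumed relative tolerance of 1%.
heads_match = all(
    math.isclose(p, g, rel_tol=1e-2)
    for p, g in zip(python_head, normalized_go_head)
)
print(heads_match)

# Hypothetical toy vector: scaling it does not change the normalized result,
# which is why two outputs that differ only by a constant factor can agree
# after normalization.
toy = [3.0, 4.0]
scaled = [x * 6.3 for x in toy]
print(l2_normalize(toy))
print(l2_normalize(scaled))
```

If the two implementations differ only by a scale factor, normalizing both sides before comparing should remove the discrepancy, up to floating-point noise.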

fkrasnov avatar Apr 24 '23 16:04 fkrasnov

What is the equivalent for Go?

I tried:

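// l2norm returns a copy of a scaled to unit Euclidean (L2) length.
func l2norm(a []float64) []float64 {
	sum := 0.0
	for _, v := range a {
		sum += v * v
	}
	norm := math.Sqrt(sum) // requires import "math"
	result := make([]float64, len(a))
	if norm == 0 {
		return result // zero vector: nothing to scale
	}
	for i, v := range a {
		result[i] = v / norm
	}
	return result
}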

but it gives different results.

Is it important?
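
Whether the difference matters depends on how the embeddings are compared. Cosine similarity depends only on direction, so it is the same with or without L2 normalization, while raw dot products and Euclidean distances change with vector length. A minimal sketch with hypothetical toy vectors (not model output); the factor 6.3 is only an illustrative scale, roughly the ratio between the first Go and Python components quoted above:

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two L2 norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical toy embeddings, chosen only for easy arithmetic.
query = [1.0, 2.0, 2.0]
doc = [2.0, 1.0, 2.0]
scaled_query = [x * 6.3 for x in query]  # same direction, different length

cos_before = cosine_similarity(query, doc)
cos_after = cosine_similarity(scaled_query, doc)
dot_before = dot(query, doc)
dot_after = dot(scaled_query, doc)

print(cos_before, cos_after)  # equal up to floating-point rounding
print(dot_before, dot_after)  # the dot product grows with the scale
```

So if downstream code ranks by cosine similarity, the unnormalized Go vectors should rank the same way; if it uses dot product or Euclidean distance, normalizing first keeps the two implementations comparable.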

acheong08 avatar Apr 30 '23 15:04 acheong08

Sorry for my late reply, I’ll get back to you shortly.

matteo-grella avatar May 21 '23 14:05 matteo-grella

This gives the same result as in Python:

fmt.Printf("%#v - %d\n\n", result.Vector.Normalize2(), result.Vector.Size())

corani avatar Dec 08 '23 03:12 corani