FaceVerification

Washed CASIA-webface data set identities

Open shimen opened this issue 8 years ago • 10 comments

Hi, is there a list of identities for the Washed CASIA-webface data set? Each identity is labeled only with a number. I would like to use several databases for training and would like to remove identities that appear in more than one database.

shimen avatar May 29 '16 09:05 shimen
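If identity names were available for two databases, the overlap could be found with standard tools. The following is only a minimal sketch under assumptions not confirmed anywhere in this thread: each database has a plain-text list with one identity name per line, and names match after lowercasing. The helper name `shared_identities` and the file names in the usage note are hypothetical.

```shell
# shared_identities is a hypothetical helper; both inputs are assumed to be
# plain-text files with one identity name per line.
shared_identities() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)
    # Lowercase and de-duplicate so that comm sees comparable, sorted input
    tr '[:upper:]' '[:lower:]' < "$1" | LC_ALL=C sort -u > "$a_sorted"
    tr '[:upper:]' '[:lower:]' < "$2" | LC_ALL=C sort -u > "$b_sorted"
    # -12 suppresses lines unique to either file, leaving only the overlap
    LC_ALL=C comm -12 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Calling `shared_identities casia_names.txt other_names.txt` would print the names present in both lists. Exact string matching will miss spelling variants of the same person, so the result is a lower bound on the real overlap.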

Any update on the labels for the CASIA-webface data set?

shimen avatar Jun 05 '16 07:06 shimen

@shimen how did you manage to unpack the dataset? I've got five files of 734 MB each with the extensions .z01 to .z05 and one 340 MB .zip file that does not look like a zip file. If I concatenate all these files, I get a lot of errors like:

file #435315:  bad zipfile offset (lseek):  4403961856
file #435316:  bad zipfile offset (lseek):  4403970048
file #435316:  bad zipfile offset (lseek):  4403970048
file #435317:  bad zipfile offset (lseek):  4403986432

any idea how to correctly extract all the files?

thanks in advance!

lazydroid avatar Jun 10 '16 00:06 lazydroid

Regarding the identities: I spoke with the author of the original dataset. He said they cannot release the identities at this time, but told me to check their web site later. Not sure what that's supposed to mean =)

lazydroid avatar Jun 10 '16 00:06 lazydroid

that's what I thought. could you please try:

$ unzip -t combined.zip

to see if there are any errors in the archive? in my case, there are plenty, the archive seems damaged.

On 06/16/2016 03:46 AM, kihyuks wrote:

@lazydroid https://github.com/lazydroid just in case, you can concatenate them and unzip in ubuntu:

$ cat CASIA-maxpy-clean.z01 CASIA-maxpy-clean.z02 CASIA-maxpy-clean.z03 CASIA-maxpy-clean.z04 CASIA-maxpy-clean.z05 CASIA-maxpy-clean.zip > combined.zip
$ unzip combined.zip

lazydroid avatar Jun 16 '16 00:06 lazydroid

@lazydroid Actually, the approach I suggested before only unzips 1/5 of the archive. You can try this instead:

$ zip -F CASIA-maxpy-clean.zip --out CASIA-maxpy-clean_fix.zip
$ unzip CASIA-maxpy-clean_fix.zip

This gives me around 450K images.

kihyuks avatar Jun 16 '16 01:06 kihyuks

@shimen @lazydroid @kihyuks Could you please provide a link to the washed CASIA dataset?

sidgan avatar Feb 28 '17 20:02 sidgan

@sidgan try this: http://www.down20.com/f-170364248744426

I did not make it, I have just googled the link.

lazydroid avatar Mar 02 '17 01:03 lazydroid

@lazydroid @kihyuks @sidgan I downloaded the dataset, but got only 439,532 images, so some images are missing. I unpacked the dataset with these commands:

$ zip -F CASIA-maxpy-clean.zip --out CASIA-maxpy-clean_fix.zip
$ unzip CASIA-maxpy-clean_fix.zip

Is there any advice?

yao5461 avatar Oct 31 '17 08:10 yao5461

How many images and classes should the washed CASIA-WebFace data set contain?

Gerkam avatar Apr 21 '18 10:04 Gerkam
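The thread does not state the expected totals, but the numbers for a given extraction can be measured directly. This is a rough sketch that assumes each identity is a direct subdirectory of the extraction root and that the images are `.jpg` files; neither assumption is confirmed in this thread. The helper name `count_dataset` is hypothetical.

```shell
# count_dataset is a hypothetical helper that reports what an extraction produced.
count_dataset() {
    root=$1
    # Assumption: each identity is a direct subdirectory of $root
    classes=$(find "$root" -mindepth 1 -maxdepth 1 -type d | wc -l)
    # Assumption: images are .jpg files (matched case-insensitively)
    images=$(find "$root" -type f -iname '*.jpg' | wc -l)
    # Arithmetic expansion strips the padding some wc implementations add
    echo "classes=$((classes)) images=$((images))"
}
```

Running `count_dataset` on the extracted directory after each unpacking method makes it easy to compare results such as the roughly 450K and 439,532 image counts reported above.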

You should use these commands to unzip multi-part zip files (source):

zip -s- CASIA-maxpy-clean.zip -O combined.zip
unzip combined.zip

t1t0n avatar Aug 31 '20 09:08 t1t0n
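The join-then-verify steps discussed in this thread can be combined into one small helper. This is a sketch, not a tested recipe for this particular data set: it assumes Info-ZIP `zip` 3.0 or later and `unzip` are installed, uses the `-s 0` form from the zip manual to turn splitting off, and the function name `join_split_zip` is hypothetical.

```shell
# join_split_zip is a hypothetical helper, not part of zip or unzip.
join_split_zip() {
    split_zip=$1      # the final .zip part; the .z01, .z02, ... parts must sit next to it
    joined_zip=$2     # output path for the single-file archive

    # -s 0 turns splitting off, --out writes the result to a new archive
    zip -q -s 0 "$split_zip" --out "$joined_zip" || return 1

    # -t tests every entry without extracting, -q keeps the report short
    unzip -tq "$joined_zip"
}

# Example (file names taken from the thread):
# join_split_zip CASIA-maxpy-clean.zip combined.zip && unzip -q combined.zip
```

Testing the joined archive before extracting separates a bad download from a bad join: if `unzip -t` already reports errors, re-downloading the parts is more likely to help than trying another extraction command.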