
[BUG] Issue with ImageNet training set.

Open g12bftd opened this issue 2 years ago • 2 comments

🐛🐛 Bug Report

⚗️ Current Behavior

I tried using ImageNet-1k directly from Activeloop. When I evaluate PyTorch's pre-trained ResNet-18 on the validation set, I get 82% accuracy, which is implausibly high for this model.

Input Code

  • REPL or Repo link if applicable:
import deeplake
import torch
import torchvision
from torchvision import transforms
from tqdm import tqdm

# Connect to the training and validation datasets
ds_train = deeplake.load("hub://activeloop/imagenet-train", token="my token")
ds_test = deeplake.load("hub://activeloop/imagenet-val", token="my token")

def convert_to_rgb(image):
    if image.mode != 'RGB':
        image = image.convert('RGB')
    return image


mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)
tform = transforms.Compose(
    [
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.Lambda(convert_to_rgb),
        transforms.ToTensor(),
        transforms.Normalize(mean, std),
    ]
)

batch_size = 128

# torchvision transforms expect PIL images, so we use the 'pil' decode_method for the
# 'images' tensor. This is much faster than running ToPILImage inside the transform.
train_loader = ds_train.pytorch(
    num_workers=0,
    shuffle=True,
    transform={"images": tform, "labels": None},
    batch_size=batch_size,
    decode_method={"images": "pil"},
)
test_loader = ds_test.pytorch(
    num_workers=0,
    transform={"images": tform, "labels": None},
    batch_size=batch_size,
    decode_method={"images": "pil"},
)

model = torchvision.models.resnet18(weights="DEFAULT")
device = torch.device("cuda")  # Needs CUDA, don't bother on CPUs
model = model.to(device).eval()
correct = 0
total = 0
with torch.no_grad():
    for x, y in tqdm(test_loader):
        y_pred = model(x.cuda())
        correct += (y_pred.argmax(axis=1) == y.cuda()).sum().item()
        total += len(y)
print(correct / total)

Expected behavior/code

The pre-trained ResNet-18 model is taken directly from the PyTorch hub. The expected validation accuracy is 69.76%, and I verified this figure using the Kaggle version of ImageNet. See the PyTorch documentation for reference: https://pytorch.org/vision/main/models/generated/torchvision.models.resnet18.html.

Note: My transforms include a "convert_to_rgb" step because some of the images in the training and validation sets from the Activeloop hub are grayscale.
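One thing that may be worth ruling out is the shape of the label batch. This is an assumption, not a confirmed diagnosis. If the loader yields labels shaped (B, 1) rather than (B,), then comparing them with `argmax` output of shape (B,) broadcasts to a (B, B) matrix, and `.sum()` counts matching pairs instead of matching samples. The sketch below reproduces that effect on synthetic tensors only; the helper name `top1_accuracy_counts` is hypothetical, and the (B, 1) label shape is assumed rather than verified against the dataset.

```python
import torch


def top1_accuracy_counts(logits, labels):
    """Return (correct, total) for one batch; accepts labels shaped (B,) or (B, 1)."""
    preds = logits.argmax(dim=1)   # shape (B,)
    labels = labels.reshape(-1)    # flatten (B, 1) to (B,) so the comparison is elementwise
    assert preds.shape == labels.shape
    correct = (preds == labels).sum().item()
    return correct, labels.numel()


# Synthetic batch: 4 samples, 3 classes. Predicted classes are [0, 1, 0, 2].
logits = torch.tensor([[2.0, 0.1, 0.1],
                       [0.1, 2.0, 0.1],
                       [2.0, 0.1, 0.1],
                       [0.1, 0.1, 2.0]])
labels_2d = torch.tensor([[0], [0], [0], [2]])  # assumed loader output shape: (B, 1)

# Naive comparison: (B,) == (B, 1) broadcasts to (B, B) and counts pairs.
naive_correct = (logits.argmax(dim=1) == labels_2d).sum().item()

# Elementwise comparison after flattening the labels.
correct, total = top1_accuracy_counts(logits, labels_2d)

print("naive count:", naive_correct, "flattened count:", correct, "total:", total)
```

Printing `y.shape` for one batch from `test_loader` would show whether this applies here. If the labels already have shape (B,), the original comparison is elementwise and the cause lies elsewhere.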

g12bftd · Jun 11 '23 14:06

Hey @g12bftd, I want to work on this issue.

pranith7 · Aug 10 '23 18:08

> Hey @g12bftd, I want to work on this issue.

Hey @pranith7, thank you! Please do try to replicate my code and results. Let me know if you find a solution, or whether there was a mistake on my end.

g12bftd · Aug 11 '23 10:08