Downloading and Using the ImageNet Dataset with PyTorch

Train your image classification models with the most popular research dataset

Photo by Ion Fet on Unsplash

ImageNet is the most popular dataset in computer vision research. It contains images for a wide range of categories drawn from the WordNet hierarchy. The 168 GB dataset holds 1.3 million images split into 1,000 classes at varying levels of label granularity. For example, it contains classes for planes and dogs, but also classes for individual dog breeds that are hard to distinguish even for humans. ImageNet can be used for classification and object detection tasks and provides train, validation, and test splits by default.

You may have heard the terms ImageNet, ImageNet1k, ImNet, ILSVRC2012, ILSVRC12, etc. They all refer to the same dataset, which was introduced for the ILSVRC 2012 competition. However, I should mention that it is only a subset of the full ImageNet, which goes by the name “ImageNet21k”. ImageNet21k is occasionally used to pre-train models.

Originally, ImageNet was hosted at www.image-net.org. Then the dataset went private, the website went into maintenance, and it finally became public again, but the download is now only available on request. I must have applied a dozen times in the last few years and never got access. It seems like downloading ImageNet is quite an odyssey.

More recently, the organizers hosted a Kaggle challenge based on the original dataset with additional labels for object detection. As such, the dataset is semi-publicly available: https://www.kaggle.com/competitions/imagenet-object-localization-challenge/

To download the dataset you need to register a Kaggle account and join the challenge. Please note that by doing so, you agree to abide by the competition rules. In particular, you are only allowed to use the dataset for non-commercial research and educational purposes.

Then, install the Kaggle CLI:

pip install kaggle

Now you need to set up your credentials. This step is essential; otherwise, you won’t be able to start the download. Please follow the official guideline:

To use the Kaggle API, sign up for a Kaggle account at https://www.kaggle.com. Then go to the ‘Account’ tab of your user profile and select ‘Create API Token’. This will trigger the download of kaggle.json, a file containing your API credentials. Place this file in the location ~/.kaggle/kaggle.json (on Windows in the location C:\Users\<Windows-username>\.kaggle\kaggle.json - you can check the exact location, sans drive, with echo %HOMEPATH%). You can define a shell environment variable KAGGLE_CONFIG_DIR to change this location to $KAGGLE_CONFIG_DIR/kaggle.json (on Windows it will be %KAGGLE_CONFIG_DIR%\kaggle.json).
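
If you prefer to do this step from Python, the following sketch moves the downloaded token into place on a Unix system. The source path (~/Downloads/kaggle.json) is an assumption, so adjust it to wherever your browser saved the file.

import os
import shutil

# Assumed download location -- change this to wherever kaggle.json ended up
src = os.path.expanduser("~/Downloads/kaggle.json")
dst_dir = os.path.expanduser("~/.kaggle")
os.makedirs(dst_dir, exist_ok=True)
dst = os.path.join(dst_dir, "kaggle.json")
shutil.move(src, dst)
os.chmod(dst, 0o600)  # keep the credentials private, as the Kaggle CLI recommends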

Once done, you can start the download. Please be aware that this file is very large (168 GB) and the download will take anywhere from minutes to days depending on your network connection.

kaggle competitions download -c imagenet-object-localization-challenge
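
If you would rather stay in Python, the kaggle package exposes the same download through its API. This is just a sketch; it assumes the credentials from the previous step are in place and uses the same <YOUR_FOLDER> placeholder as the commands below.

import kaggle  # authenticates against ~/.kaggle/kaggle.json on import

kaggle.api.competition_download_files(
    "imagenet-object-localization-challenge",
    path="<YOUR_FOLDER>",  # placeholder for your target folder
    quiet=False,           # show download progress
)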

After the download is complete, extract the archive. On Unix, you can simply use unzip. Note that this will also take a while.

unzip imagenet-object-localization-challenge.zip -d <YOUR_FOLDER>
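
If unzip is not available (e.g., on Windows), a short standard-library sketch does the same thing:

import zipfile

# Extract the archive into the target folder (this also takes a while)
with zipfile.ZipFile("imagenet-object-localization-challenge.zip") as archive:
    archive.extractall("<YOUR_FOLDER>")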

There are two small additional helper files that we need. You could rewrite the following code to be independent of them, but simply using the files is faster and simpler. Just download them into the ImageNet root folder (the one that contains the ILSVRC folder). On a Unix system, you can use wget:

cd <YOUR_FOLDER>
wget https://raw.githubusercontent.com/raghakot/keras-vis/master/resources/imagenet_class_index.json
wget https://gist.githubusercontent.com/paulgavrikov/3af1efe6f3dff63f47d48b91bb1bca6b/raw/00bad6903b5e4f84c7796b982b72e2e617e5fde1/ILSVRC2012_val_labels.json
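
Without wget, the two files can also be fetched with Python’s standard library (run this from the ImageNet root folder):

import urllib.request

urls = [
    "https://raw.githubusercontent.com/raghakot/keras-vis/master/resources/imagenet_class_index.json",
    "https://gist.githubusercontent.com/paulgavrikov/3af1efe6f3dff63f47d48b91bb1bca6b/raw/00bad6903b5e4f84c7796b982b72e2e617e5fde1/ILSVRC2012_val_labels.json",
]
for url in urls:
    # Save each file under its original name in the current directory
    urllib.request.urlretrieve(url, url.split("/")[-1])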

All we need to do now is write a Dataset class for PyTorch. The code itself is fairly standard loading logic, so I’ll not go into details.

import json
import os

from PIL import Image
from torch.utils.data import Dataset


class ImageNetKaggle(Dataset):
    def __init__(self, root, split, transform=None):
        self.samples = []
        self.targets = []
        self.transform = transform
        self.syn_to_class = {}
        # Map WordNet synset ids (e.g. "n01440764") to integer class ids
        with open(os.path.join(root, "imagenet_class_index.json"), "rb") as f:
            json_file = json.load(f)
            for class_id, v in json_file.items():
                self.syn_to_class[v[0]] = int(class_id)
        # Map validation file names to their synset ids
        with open(os.path.join(root, "ILSVRC2012_val_labels.json"), "rb") as f:
            self.val_to_syn = json.load(f)
        samples_dir = os.path.join(root, "ILSVRC/Data/CLS-LOC", split)
        for entry in os.listdir(samples_dir):
            if split == "train":
                # Training images are grouped into one folder per synset
                syn_id = entry
                target = self.syn_to_class[syn_id]
                syn_folder = os.path.join(samples_dir, syn_id)
                for sample in os.listdir(syn_folder):
                    sample_path = os.path.join(syn_folder, sample)
                    self.samples.append(sample_path)
                    self.targets.append(target)
            elif split == "val":
                # Validation images sit in one flat folder; labels come from the JSON file
                syn_id = self.val_to_syn[entry]
                target = self.syn_to_class[syn_id]
                sample_path = os.path.join(samples_dir, entry)
                self.samples.append(sample_path)
                self.targets.append(target)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        x = Image.open(self.samples[idx]).convert("RGB")
        if self.transform:
            x = self.transform(x)
        return x, self.targets[idx]
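
As a quick sanity check, we can instantiate the validation split without any transform; the root path is the same <YOUR_FOLDER> placeholder as in the shell commands above.

dataset = ImageNetKaggle("<YOUR_FOLDER>", "val")
print(len(dataset))      # 50,000 validation images
img, label = dataset[0]  # a PIL image and its integer class id
print(img.size, label)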

We can now test it by running a validation epoch for a pre-trained ResNet-50 model.

import torch
import torchvision
from torch.utils.data import DataLoader
from torchvision import transforms
from tqdm import tqdm

model = torchvision.models.resnet50(weights="DEFAULT")
model.eval().cuda()  # Needs CUDA, don't bother on CPUs

# Standard ImageNet preprocessing for validation
mean = (0.485, 0.456, 0.406)
std = (0.229, 0.224, 0.225)
val_transform = transforms.Compose(
    [
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean, std),
    ]
)

dataset = ImageNetKaggle("<YOUR_FOLDER>", "val", val_transform)  # ImageNet root folder
dataloader = DataLoader(
    dataset,
    batch_size=64,   # may need to reduce this depending on your GPU
    num_workers=8,   # may need to reduce this depending on your num of CPUs and RAM
    shuffle=False,
    drop_last=False,
    pin_memory=True,
)

correct = 0
total = 0
with torch.no_grad():
    for x, y in tqdm(dataloader):
        y_pred = model(x.cuda())
        # Compare the top-1 prediction against the ground-truth label
        correct += (y_pred.argmax(axis=1) == y.cuda()).sum().item()
        total += len(y)

print(correct / total)

This should print 0.80342, which is the top-1 accuracy (80.342%) of the model.
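
ImageNet results are often also reported as top-5 accuracy, and the loop above only needs a small change for that. Here is a sketch reusing the same model and dataloader.

top5_correct = 0
total = 0
with torch.no_grad():
    for x, y in tqdm(dataloader):
        logits = model(x.cuda())
        # Count a sample as correct if the true label is among the 5 highest-scoring classes
        top5 = logits.topk(5, dim=1).indices
        top5_correct += (top5 == y.cuda().unsqueeze(1)).any(dim=1).sum().item()
        total += len(y)

print(top5_correct / total)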

Alternatives to ImageNet

Training on ImageNet is still too expensive for most people. However, there are numerous alternative datasets based on ImageNet with reduced resolution and/or fewer samples and classes. These datasets can be used for training at a fraction of the cost. Some examples are ImageNette, Tiny ImageNet, ImageNet100, and CINIC-10.
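
For example, ImageNette comes in an ImageFolder-compatible layout, so once you have downloaded and extracted it from the fastai repository (https://github.com/fastai/imagenette) it loads without a custom Dataset class. This is just a sketch; the folder name below assumes the full-size “imagenette2” archive.

from torchvision import datasets, transforms

imagenette_transform = transforms.Compose([
    transforms.Resize(160),
    transforms.CenterCrop(128),
    transforms.ToTensor(),
])
# One folder per class under train/ and val/, so ImageFolder can infer the labels
train_set = datasets.ImageFolder("imagenette2/train", transform=imagenette_transform)
val_set = datasets.ImageFolder("imagenette2/val", transform=imagenette_transform)
print(len(train_set.classes))  # 10 classes, a small subset of ImageNet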

Thank you for reading this article! If you enjoyed it, please consider subscribing to my updates. If you have any questions, feel free to leave them in the comments.

