dress-pattern-recognition-using-CNN

Issues

Open kalashjindal opened this issue 2 years ago • 1 comments

The data CSV lists around 15k images, but the notebook uses only about 10k in total. The model was trained as a binary classifier, but the real problem is a multi-class one. The create dataset step only creates the dataset_category folder, so where does the dataset_category_test folder used in the notebooks come from? Reporting an accuracy of over 95% without any other metrics to back it up statistically is not convincing.
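To illustrate the point about metrics, here is a minimal sketch of a per-class report for a multi-class classifier, written in plain Python so it does not depend on the notebook. The function names (`confusion_matrix`, `per_class_report`, `macro_f1`) and the label names `floral`, `striped`, and `plain` are hypothetical and are not taken from this repository; the toy predictions are made up purely to show how accuracy can hide poor performance on minority classes.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_report(y_true, y_pred, labels):
    """Precision, recall, F1 and support for each label."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fp = sum(matrix[r][i] for r in range(len(labels))) - tp
        fn = sum(matrix[i]) - tp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": tp + fn,
        }
    return report


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(r["f1"] for r in report.values()) / len(report)


# Hypothetical, made-up data: an imbalanced test set where the model
# predicts the majority class for every image.
labels = ["floral", "striped", "plain"]
y_true = ["floral"] * 8 + ["striped"] + ["plain"]
y_pred = ["floral"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = per_class_report(y_true, y_pred, labels)

print("accuracy:", round(accuracy, 2))
print("macro F1:", round(macro_f1(report), 2))
for label in labels:
    print(label, report[label])
```

In this toy case the accuracy is 0.8 while the macro F1 is only about 0.30, because two of the three classes are never predicted. That gap is the kind of thing a single accuracy figure cannot show.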

kalashjindal avatar Mar 02 '23 10:03 kalashjindal

Added multithreading to download the images much faster:

import numpy as np
import pandas as pd
import requests
import os
import threading

dress_patterns_df = pd.read_csv('dress_patterns.csv')
dress_patterns = dress_patterns_df.values

# category
categories = set(dress_patterns_df['category'])
print(categories)

# create a folder dataset and nested folder of category
print(os.listdir())
os.makedirs('dataset_category', exist_ok=True)  # no error if the folder already exists

for cat in categories:
    print(cat)
    os.makedirs('dataset_category/' + str(cat), exist_ok=True)

print(os.listdir('dataset_category'))

def download_image(url, category, unit_id, i):
    try:
        r = requests.get(url, allow_redirects=True, timeout=30)
        r.raise_for_status()  # do not save an HTTP error page as a .jpg
        # the with block closes the file even if the write fails
        with open('dataset_category/' + category + '/' + str(unit_id) + '.jpg', 'wb') as f:
            f.write(r.content)
    except Exception as e:
        print('ERROR at: ', i, e)

# save image in respective category folder.
threads = []
for i in range(len(dress_patterns)):
    if i % 5 == 0:
        print(i, '/', len(dress_patterns))
    pattern = dress_patterns[i]
    url = pattern[3]
    unit_id = pattern[0]
    category = pattern[1]
    thread = threading.Thread(target=download_image, args=(url, category, unit_id, i))
    threads.append(thread)
    thread.start()

    # limit the number of threads to 5
    if len(threads) == 5:
        for thread in threads:
            thread.join()
        threads = []

# wait for any remaining threads to complete
for thread in threads:
    thread.join()
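Joining threads in batches of five means each batch waits for its slowest download. A bounded worker pool is one possible alternative. The following is only a sketch under stated assumptions, not code from this repository: `save_image`, `download_all`, and `fetch_with_requests` are hypothetical names, rows are assumed to be `(unit_id, category, url)` tuples, and the fetch function is passed in as a parameter so the pool logic can be tried without network access.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_with_requests(url):
    """Fetch raw bytes over HTTP (imported lazily so the sketch runs without requests)."""
    import requests
    r = requests.get(url, allow_redirects=True, timeout=30)
    r.raise_for_status()
    return r.content


def save_image(fetch, url, category, unit_id, root="dataset_category"):
    """Download one image and write it to <root>/<category>/<unit_id>.jpg."""
    content = fetch(url)  # fetch first, so a failure leaves no empty file behind
    folder = os.path.join(root, str(category))
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, f"{unit_id}.jpg")
    with open(path, "wb") as f:
        f.write(content)
    return path


def download_all(rows, fetch, root="dataset_category", max_workers=5):
    """Run downloads on at most max_workers threads; return (saved paths, failures)."""
    saved, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(save_image, fetch, url, category, unit_id, root): unit_id
            for unit_id, category, url in rows
        }
        for future in as_completed(futures):
            unit_id = futures[future]
            try:
                saved.append(future.result())
            except Exception as exc:
                # keep the failure instead of only printing it
                failed.append((unit_id, exc))
    return saved, failed
```

With the CSV layout used above, the rows could be built as `[(p[0], p[1], p[3]) for p in dress_patterns]` and passed as `download_all(rows, fetch_with_requests)`. Returning the failed ids also makes it easier to explain any gap between the number of rows in the CSV and the number of images on disk.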

kalashjindal avatar Mar 02 '23 10:03 kalashjindal