The competition is hosted on Kaggle at this link.
Each competitor will participate individually. Please log in using your student email (@stud.acs.upb.ro).
We provide starter code that demonstrates how to read the data, train a network, and make a submission. You are encouraged to start your work from this notebook.
Beating the baseline on the private leaderboard earns 1p; the top 3 on the private leaderboard will have their final exam grade set to 10 (4p).
Traditional image classification models heavily rely on accurately labeled data for training, but in real-world scenarios, acquiring large quantities of labeled images can be costly and time-consuming.
Additionally, available datasets exhibit a notable class imbalance, with benign cases significantly outnumbering malignant ones. This imbalance is consistent with the epidemiological reality that the majority of individuals undergoing screening are found to be cancer-free, as malignancies occur in only a small subset of the tested population.
In this challenge, we provide you with a dataset that poses both obstacles: a significant portion of the training data remains unlabeled and the labeled data is heavily imbalanced.
Your task is to develop innovative deep learning algorithms and techniques to overcome these challenges and build a robust image classification model.
To succeed in this competition, participants are encouraged to explore semi-supervised/unsupervised learning methods that leverage the unlabeled data to improve the model's performance. Developing strategies to mitigate the impact of the class imbalance and enhance the model's ability to generalize effectively will be crucial. We encourage creative ideas.
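One common way to use the unlabeled data is pseudo-labeling: a model trained on the labeled set predicts on the unlabeled images, and only the confident predictions are kept as extra training labels. The sketch below shows just the selection step in plain Python. The function name `select_pseudo_labels`, the 0.95 threshold, and the example probabilities are all illustrative assumptions, not part of the starter code.

```python
def select_pseudo_labels(probs, threshold=0.95):
    """Keep only confident predictions as (index, pseudo_label) pairs.

    probs: predicted probability of the malignant class (label 1),
           one value per unlabeled image. The threshold is an assumed
           hyperparameter that should be tuned on validation data.
    """
    selected = []
    for idx, p in enumerate(probs):
        if p >= threshold:
            # Confidently malignant.
            selected.append((idx, 1))
        elif p <= 1.0 - threshold:
            # Confidently benign.
            selected.append((idx, 0))
        # Anything in between stays unlabeled.
    return selected


# Hypothetical model outputs for five unlabeled images.
example_probs = [0.99, 0.50, 0.02, 0.97, 0.30]
pseudo = select_pseudo_labels(example_probs, threshold=0.95)
print(pseudo)
```

Because benign cases dominate, a symmetric threshold will likely select far more benign than malignant pseudo-labels, so it may be worth using a separate threshold per class or capping the number of benign samples added.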
train_labeled.csv - paths to the labeled training set, with their corresponding labels
train_unlabeled.csv - paths to the unlabeled training set
test.csv - paths for the test set, for which you will need to make predictions
ID - the path to the file
label - 0 for benign samples, 1 for malignant samples
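As a rough illustration of working with these files, the sketch below parses a labeled CSV and derives inverse-frequency class weights, which could later be passed to a weighted loss. It assumes a comma-separated file with a header row `ID,label`; the sample rows, the image paths, and the helper names `read_labeled` and `inverse_frequency_weights` are made up for the example.

```python
import csv
import io
from collections import Counter

# Made-up stand-in for train_labeled.csv; real paths and sizes will differ.
sample_csv = """ID,label
images/a.png,0
images/b.png,0
images/c.png,0
images/d.png,1
"""


def read_labeled(handle):
    """Return a list of (path, label) tuples from an open CSV handle."""
    reader = csv.DictReader(handle)
    return [(row["ID"], int(row["label"])) for row in reader]


def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}


samples = read_labeled(io.StringIO(sample_csv))
weights = inverse_frequency_weights([label for _, label in samples])
print(weights)
```

With this weighting the rare malignant class receives a proportionally larger weight, so misclassifying it costs more during training. For the real file, replace `io.StringIO(sample_csv)` with an open file handle.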
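import os
import random

import numpy as np
import torch


def seed_everything(seed=42):
    # Seed Python's and NumPy's random number generators.
    random.seed(seed)
    # Only affects hashing in subprocesses started after this call.
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    # Seed PyTorch on the CPU and on all GPUs.
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Use deterministic cuDNN kernels and disable autotuning.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False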