Project

  • Team: 2 members.
  • Project Selection:
    • Option 1: Choose from the list of pre-defined project ideas provided below.
    • Option 2: Propose your own project idea, which requires approval from the course team.

Project Workflow

Each team is required to:

  • Implement their chosen project idea, building a model or system that addresses a specific medical data science problem.
  • Evaluate their approach using appropriate metrics (accuracy, precision, recall, etc.), and compare results to existing state-of-the-art methods.
  • Document their progress and findings in both a formal report and a presentation, in English, using the IEEE format available here - IMPORTANT.

Milestones and Deliverables

1. M1 - Related Work (1p)
  • Objective: Define the research context by reviewing and summarising related work in the area you are addressing.
    • Action Items:
      • Conduct a thorough literature review of papers, articles, and studies relevant to your project.
      • Summarise the current state-of-the-art methods in the field.
      • Identify gaps in the research or areas for potential improvement.
    • Documentation: Create a report section (2 pages excluding references) detailing your findings, including citations of key papers and a discussion of how your project will build upon or differ from existing work, in English.
    • Upload Documentation: Upload (Must contain title and authors)
    • Grading
      • References: Include at least 10 references - 0.3
      • Research Questions: Formulate at least 2 meaningful research questions that you plan to answer, or that you believe should be answered in future research (https://atlasti.com/guides/qualitative-research-guide-part-1/research-question). - 0.2
      • Content: Capture the current landscape of the topic you're aiming to tackle (You can take inspiration from qualitative surveys - check tips). - 0.5. Answer:
        • What datasets are used and why? 0.1
        • What benchmarks / methods for evaluating the systems are used? Any limitations? 0.1
        • What are the current shortcomings? 0.1
        • How do the works presented relate to each other (E.g: one is trying to address limitation of another, multiple are built on same assumptions or techniques)? 0.1
        • What architectures? Any interesting training techniques? 0.1
    • Presentation:
      • 5 minute presentation (slides) in which you present the review.
      • State the topic / problem, the context, and the direction in which you intend to move (research questions).
      • Graded from 0 to 1, used to scale the M1 score.
  • Tips:
    • How to efficiently read research papers: The 3 pass method.
    • Proposed workflow:
      • Read recent literature reviews (surveys of scholarly sources) for the tasks you're interested in (Example: https://arxiv.org/abs/2405.12833).
      • Single out papers from the reviews and read them.
      • Document the papers you read, underlining their *contributions* (papers should state them explicitly).
      • Conferences: Each conference has a rank: https://portal.core.edu.au/conf-ranks.
        • Check high ranking conferences (NeurIPS, EMNLP, ECCV, CVPR etc.) and scout the accepted papers in the current year (you can search for them on arxiv). Example for NeurIPS: https://neurips.cc/Downloads/2025 .
      • You can also check the number of citations (recent work will naturally have few) as well as the authors.
    • Research topics you REALLY have an interest or curiosity for. It will make the experience much more FUN.
2. M2 (18.11.24) - Dataset Collection and Baseline Results (1p)
  • Objective: Obtain the datasets required for your project and implement a baseline model for comparison.
    • Action Items:
      • Dataset Collection:
        • Obtain a relevant dataset, either from the provided resources or other public sources (e.g., Kaggle, UCI, Papers with Code).
        • Preprocess the data (e.g., cleaning, normalization, dealing with missing values).
      • Baseline Model:
        • Implement at least one baseline method (e.g., logistic regression, support vector machines, a simple neural network, pretrained model).
        • Obtain preliminary results to compare against future improvements.
      • Evaluation Metrics: Choose appropriate metrics (e.g., accuracy, F1-score, ROC-AUC) and document initial performance.
    • Documentation (IEEE conference paper format): Submit a report section (2 pages excluding references) describing the dataset, preprocessing steps, baseline model, and results.
    • Upload Documentation: Upload
    • Grading
      • Description of selected datasets: purpose, number of records, quality, method of collection, size, feature description - 0.5.
      • Description of implemented baseline: - 0.1
      • Report initial results: - 0.3
      • Result analysis - 0.1
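The baseline workflow above (preprocess, fit a simple model, report the suggested metrics) can be sketched as follows. This is a minimal illustration, not a required implementation: the synthetic data is a hypothetical stand-in for whichever medical dataset you collect.

```python
# Baseline sketch: logistic regression on a stand-in dataset,
# evaluated with accuracy, F1, and ROC-AUC as suggested above.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Synthetic placeholder for a real dataset (e.g. from Kaggle / UCI).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Preprocessing: normalization, fitted on the training split only
# to avoid leaking test statistics into the model.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = baseline.predict(X_test)
prob = baseline.predict_proba(X_test)[:, 1]

print(f"accuracy: {accuracy_score(y_test, pred):.3f}")
print(f"F1:       {f1_score(y_test, pred):.3f}")
print(f"ROC-AUC:  {roc_auc_score(y_test, prob):.3f}")
```

Record these preliminary numbers in the report: they are the reference point every later improvement (M3) is measured against.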
3. M3 (18.12.24) - Own Contribution (1p)
  • Objective: Implement your novel contribution to the field, either by solving a new problem or improving an existing method.
    • Types of Contributions:
      • Address a New Problem: Tackle a medical data science issue that has not been addressed by related work.
      • Improve Existing Methods:
        • Improve Results: Enhance the performance of an existing solution by optimising models or experimenting with different techniques.
        • Extensive Experiments: Conduct extensive experiments to assess your model’s robustness, including testing with different datasets or under varying conditions.
        • New Approach: Introduce a new method or architecture (e.g., switching from traditional CNNs to transformers), even if it does not outperform state-of-the-art methods, as long as it provides a novel perspective.
          • Examples:
            • Use a different deep learning architecture (e.g., ResNet vs. EfficientNet).
            • Apply a novel training strategy, such as self-supervised learning or data augmentation techniques.
            • Propose a hybrid model that combines multiple approaches (e.g., combining CNNs with decision trees).
    • Documentation (IEEE conference paper format): Write a report section (2 pages excluding references) justifying your chosen approach, detailing your contribution, how it differs from existing work, and comparing your experimental results to the baseline and state of the art.
    • Upload Documentation and code: Upload
    • Grading:
      • Description of proposed contribution: 0.3
      • Implementation and results on selected dataset: 0.5
      • Result analysis, comparison with baseline: 0.2
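For the result-analysis item above, reporting a confidence interval alongside the raw metric makes the baseline comparison more convincing than a single number. A minimal bootstrap sketch, with hypothetical per-example correctness arrays standing in for your real predictions:

```python
# Bootstrap confidence interval for the accuracy difference between
# a baseline and a proposed model, evaluated on the same test set.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-example correctness (1 = correct) for 200 test cases.
baseline_correct = rng.random(200) < 0.80   # ~80% accurate baseline
model_correct = rng.random(200) < 0.85      # ~85% accurate contribution

diffs = []
for _ in range(2000):
    idx = rng.integers(0, 200, size=200)    # resample test set with replacement
    diffs.append(model_correct[idx].mean() - baseline_correct[idx].mean())

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"accuracy gain: {np.mean(diffs):+.3f} (95% CI [{lo:+.3f}, {hi:+.3f}])")
```

If the interval excludes zero, the improvement is unlikely to be a test-set fluke; if it straddles zero, say so honestly in the analysis.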
4. M4 (08.01.25) - Final paper + Presentation (1p)
  • Objective: Compile your project into a well-organised academic report.
    • Action Items:
      • Write a research-style report following the IEEE conference paper format.
      • Structure:
        • Abstract: Briefly summarize your project, contributions, and key findings.
        • Introduction: Explain the problem you are addressing, motivation, and background.
        • Related Work: Include the summary from M1.
        • Methodology: Detail your approach, including algorithms, models, and techniques used.
        • Experiments: Describe the datasets, baseline methods, and results from M2.
        • Own Contribution: Document your original contribution, as outlined in M3.
        • Results and Discussion: Present detailed results with visualizations (graphs, tables) and discuss their implications.
        • Conclusion: Summarize the outcomes, limitations, and future work.
    • Documentation: Submit a polished, formal academic report in IEEE format (8 pages excluding references).
    • Upload Documentation: Upload
    • Presentation: Prepare a 6 minute presentation covering:
      • The problem you addressed and its relevance.
      • Key steps of your methodology.
      • Experimental results and key contributions.
      • Conclusions and potential areas for future research.
    • Create well-polished slides with clear visuals, including figures, graphs, and performance metrics.
    • Evaluation: Your presentation will be graded based on clarity, depth of explanation, the quality of results and the Q&A section. The final project grade will be weighted by your presentation quality.
    • Upload Presentation: Upload

Grading System

  • Total Points = M1 + M2 + M3 + M4
    • M1 and M4 will have their grade (G) scaled by the score of the presentation (P): TOTAL = G * P
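Read as a formula, with hypothetical scores, the scaling works like this (a sketch of one reasonable reading: M1 and M4 each scaled by their own presentation score, M2 and M3 counted as-is):

```python
# Total-points sketch: M1 and M4 are scaled by their presentation
# scores p1 and p4 (each in [0, 1]); M2 and M3 count unscaled.
def total_points(m1, m2, m3, m4, p1, p4):
    return m1 * p1 + m2 + m3 + m4 * p4

# Example: full milestone scores, presentations graded 0.9 and 1.0.
print(total_points(1.0, 1.0, 1.0, 1.0, 0.9, 1.0))  # 3.9
```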

Examples of Project Ideas

1. Bad Posture Detection

  • Objective: Detect posture abnormalities from videos or images and suggest exercises to correct them.
  • Relevant work: Posture Detection.

2. Smoker Detection

  • Objective: Identify whether a person is a smoker based on lung capacity, voice analysis, or X-ray images.
  • Dataset: Gather data from publicly available voice or medical image datasets.
  • Note: Each of the chosen modalities (audio, video, image), or a combination of them, may result in a distinct project with little overlap.

3. Retinal Lesion Detection

  • Objective: Detect retinal lesions from medical images, aiding early diagnosis of conditions like diabetic retinopathy.
  • Proposed Dataset: Retinal Lesions Dataset.

4. Fracture Detection in X-rays

  • Objective: Develop a model that identifies fractures in X-ray images, which could help radiologists in making faster diagnoses.
  • Proposed Datasets: MURA, RSNA etc.

5. Cancer Detection from Histopathology Images

  • Objective: Detect and classify cancerous tissue in histopathology images.
  • Proposed Dataset: Choose an appropriate one from maduc7/Histopathology-Datasets

6. Alzheimer’s Disease Progression Prediction

  • Objective: Predict the progression of Alzheimer’s disease using imaging (e.g., MRI) or genetic data.
  • Proposed dataset: OASIS Alzheimer's Detection.

7. Interpretation of Knee MRI

  • Objective: Develop models for automated interpretation of knee MRIs.
  • Proposed dataset: MRNet - Kaggle. / MRNet

8. Your own project

We encourage you to choose and define your own project.

Potential contributions

  • NEW - The term “New” refers to something that was not tried for YOUR task, so you can try to adapt techniques from other tasks and they will count as contributions.

Depending on the task and recent work, contributions may be:

  • New pipelines: Some solutions are implemented using a pipeline of models. You can tackle some parts of it and try to improve them.
  • Different architecture: You can modify the structure of a well established model, BUT the modification should be based on sound reasons, even though in the end it may not give better results. Random “Mutations” of known models won't count as a contribution.
  • Augmentation techniques: Check if you can augment the data in a new way. Maybe synthetically generated data can help, or not.
  • A new benchmark or new evaluation metrics: If you feel the tests in the literature are not robust to some cases, you can design a new set of qualitative tests. This should translate into at least a couple of hundred new tests / examples.
  • Explainability: If you feel that the works you reviewed do not provide much insight into the decisions that are being made, well, you can work on that: evaluate existing explainability tools under your task conditions.
  • Cross-task adaptation: Explore whether techniques that worked in related domains can be adapted to your task.
  • Robustness to noise/adversaries: Investigate how the system performs under noisy, adversarial, or out-of-distribution inputs, and propose methods to improve robustness.
  • Human-in-the-loop integration: Design hybrid workflows where humans assist the model (or vice versa) to achieve better results than either alone.
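As an illustration of the augmentation idea above, a numpy-only sketch of flips plus Gaussian noise for a batch of grayscale images (in practice you would likely use a library such as torchvision or albumentations; the batch here is hypothetical):

```python
# Minimal augmentation sketch: random horizontal flip and Gaussian
# noise applied to a batch of grayscale images (e.g. X-ray crops).
import numpy as np

rng = np.random.default_rng(0)

def augment(batch, flip_p=0.5, noise_std=0.02):
    """batch: (N, H, W) float array with values in [0, 1]."""
    out = batch.copy()
    for i in range(len(out)):
        if rng.random() < flip_p:
            out[i] = out[i][:, ::-1].copy()   # horizontal flip
        out[i] = out[i] + rng.normal(0.0, noise_std, out[i].shape)
    return np.clip(out, 0.0, 1.0)             # keep pixel range valid

# Hypothetical batch of 8 images of size 64x64.
images = rng.random((8, 64, 64))
aug = augment(images)
print(aug.shape)
```

Whether flips are valid is task-dependent (e.g. laterality matters in some radiographs), so justify each augmentation from the data, as the contribution description requires.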

Tips for contributions:

  • Check limitations and future work: Most papers will have discussions around their limitations and propose future work items. Sometimes it is just that the authors did not focus on that aspect.
  • Error analysis on the baseline: You can analyse the errors made by your baseline and try to propose targeted solutions. In this case, the baseline should be a well performing model from the related work, not a simple finetuned architecture.
dsm/assignments/01.txt · Last modified: 2025/09/29 17:15 by radu.chivereanu
CC Attribution-Share Alike 3.0 Unported