Dr. Luca Giudice | Bioinformatician

Hi, I'm Luca Giudice.

Postdoctoral researcher with expertise in bioinformatics, data science, network science, and AI. Passionate about solving real-world biological problems. Multiple internships in important labs around the world, combined with partnerships involving clinicians, biologists, and medical geneticists

About

I am a Postdoctoral Researcher with expertise in bioinformatics, data science, network science, and artificial intelligence. As the first student in Italy to complete an entire academic career in Bioinformatics/Eng, I have experience solving real-world biological problems, collaborating with experts in various scientific fields, and self-learning mathematical and biomedical concepts. My passion lies in the patient similarity network paradigm and the concept of determining a patient's disease risk based on their genetic similarities with previously known cases. In my work, I am accountable as a project developer, supervisor, and team player.

Experience

Tarja Malm Group

Bioinformatician, Engineer

Independent Software Engineer: I have proven myself capable of independently researching, designing, and developing novel machine learning methods for biological applications. My programming skills encompass Python, R, NetworkX, igraph, PyTorch, DALEX, and more.
Bioinformatician: I have actively investigated the role of coding and non-coding RNAs in brain diseases such as Alzheimer's, Ischemia, and Stroke. For example, I contributed to determining the cirs-7, miR-7, and Cyrano network, a finding published in Cell Reports. As a skilled data scientist and bioinformatics expert, I have successfully analyzed and integrated countless multi-omics sequencing data and multi-modal clinical data.
Project Manager: I excel at communicating with students, colleagues, and collaborators. I am a proactive supervisor with a strategic mind, keeping both myself and my colleagues focused on the priorities essential for project success. I am very well organized and capable of exploiting managing software like Notion and Obsidian.
Teacher: Over the 5 years in my lab, I did multiple presentations to introduce and teach about artificial intelligence methods that can be applied to health data. I successfully presented fundamental computational concepts both at simple and advanced levels, such as principal component analysis, clustering algorithms (e.g., k-means), random forests, and neural networks.

2020 - Current | Kuopio, Finland

Aldo Scarpa Group

Data Scientist

Goal: Unravelling the molecular pathogenesis of combined neuroendocrine cancers by analyzing sequencing data of patients with Combined Large Cell Neuroendocrine Carcinomas (CoLCNEC).
Data Science: My role was to discover biological markers that can differentiate and classify the 7 histological groups of patients. I encountered the challenge of combining data from various sources while maintaining the true biological signals of patients. To overcome this challenge, I utilized techniques such as batch correction, data normalization, unsupervised machine learning and pathway-based patient classification.
Award: I was awarded the grant AdR3826/21 named "Neuroendocrine Carcinomas: towards a molecular classification of the disease subtypes for precision medicine".

2020 - 2021 | Verona, Italia

Infomics Group

Bioinformatician

My PhD goal was to develop a software for the patient classification that could integrate multiple biological omics data and reveal disease characteristics. I was responsible for designing the project from its feasibility study to its publication as a scientific article. I mastered advanced computer science topics as reproducibility, real-world validation and interpretability of machine learning systems, dockerization, chunking and parallel computing in R and Python.

2018 - 2021 | Verona, Italia

Daniel Rico Group

Software Engineer

Goal: The in-vitro protocol for identifying the involvement of cell-type specific non-coding regulatory elements in gene expression and deregulation is costly and can only detect direct associations.
Computer Science: I developed a classifier combining a network-based propagation algorithm and a random forest algorithm that can predict the activity of both direct and indirect non-coding regions based on the gene expression of a cell. I learned that time management is a critical factor in the development of a project, and tasks can vary in their importance depending on when they are performed. A project requires two individuals: one who is optimistic and pushes forward, and one who is realistic and focuses on completion.

2018 - 2021 | Newcastle, UK

Luciano Cascione Group

Junior Developer

Computer Science: I developed a workflow to uncover novel lncRNAs about canine B cell lymphomas. The workflow allowed us to characterize the molecular mechanisms and promote potential prognostic biomarkers about DLBCL, MZL and FL lymphoma.

2018 - 2019 | Bellinzona, Switzerland

Luca Aresu Group

Junior Bioinformatician

Data Analysis: I did an exhaustive characterization of two indolent B-cell lymphomas based on a multi-omics analysis and integration. I used both standard bioinformatic tools and machine learning algorithms.

2018 - 2019 | Turin, Italy

Gary Bader Group

Junior Bioinformatician

Machine Learning: I developed a software module of a patient classifier (netDx) to integrate somatic mutation information to other biological omics (e.g., gene expression, methylation) using graph theory.

2017 - 2019 | Toronto, Canada

Projects

                
StellarPath patient classifier
                  A deep learning patient classifier based on the patient similarity network paradigm
                
Accomplishments
                  Leverages state-of-the-art bioinformatics techniques to select biologically meaningful molecular and pathway features, 
                  the pathway space to integrate different omics and increase the software interpretability, 
                  the PSN paradigm to enable analysis and stratification of patients based on their pathway activity, 
                  and an artificial neural network to account for the similarity information and classify.
                
Esearch3D enhancer predictor
                  Esearch3D is an unsupervised algorithm to predict enhancers. 
                
Accomplishments
                  It reverses engineering the flow of information and identifies intergenic regulatory enhancers using solely gene expression and 3D genomic data. 
                  It models chromosome conformation capture (3C) data as chromatin interaction network (CIN) and then exploits graph-theory algorithms to integrate RNA-seq data 
                  to calculate an imputed activity score (IAS) for intergenic regions.
                
netDx patient classifier
                  A novel workflow for pathway-based patient classification from sparse genetic data.
                
Accomplishments
                  netDx is for biomedical researchers who want to integrate multi-modal patient data to predict outcome or patient subtype. 
                  netDx builds interpretable machine-learning patient classifiers. Unlike standard machine-learning tools, netDx allows modeling of 
                  user-defined biological groups as input features; examples include pathways and co-regulated elements. In addition to patient classification, 
                  top-scoring features provide mechanistic insight, helping drive hypothesis generation for downstream experiments. netDx currently provides native support 
                  for pathway-level features but can be generalized to any user-defined data type and grouping.
                
Simpati patient classifier
                  Pathway-based patient classifier uses network science for multi-omics integration and class prediction.
                
Accomplishments
                  We propose Simpati, an innovative and interpretable patient classifier based on pathway-specific patient similarity networks. 
                  The first classifier to adopt ad-hoc novel algorithms for such graph type. It standardizes the biological high-throughput dataset of patient's 
                  profiles with a propagation algorithm that considers the interconnected nature of the cell’s molecules for inferring a new activity score. 
                  This allows Simpati to classify with dense, sparse, and non-homogenous omic data. Simpati organizes patient’s molecules in pathways 
                  represented by patient similarity networks for being interpretable, handling missing data and preserving the patient privacy. 
                  A network represents patients as nodes and a novel similarity measure determines how much every pair act co-ordinately in a pathway. 
                  Simpati detects signature biological processes based on how much the topological properties of the related networks separate the patient classes.
                
LErNet lncRNA predictor
                  LErNet predicts the function of IncRNAs.
                
Accomplishments
                  LErNet is a method to in silico define and predict the roles of IncRNAs. The core of the approach is a network expansion algorithm which enriches the genomic context of IncRNAs. 
                  The context is built by integrating the genes encoding proteins that are found next to the non-coding elements both at genomic and system level. 
                  The pipeline is particularly useful in situations where the functions of discovered IncRNAs are not yet known. The results show both the outperformance of LErNet compared 
                  to enrichment approaches in literature and its robustness in case of partially missing context information.
                
de-novo transcriptome assembly
                  R workflow performing de-novo transcriptome assembly to uncover novel lncRNAs
                
Accomplishments
                  We developed a customized R pipeline performing a transcriptome assembly by multiple algorithms to uncover novel lncRNAs, 
                  and delineate genome-wide expression of unannotated and annotated lncRNAs. Our pipeline also included a new package for high performance system biology analysis, 
                  which detects high-scoring network biological neighborhoods to identify functional modules.

Skills

Languages and Databases

Python

R

Shell Scripting

Libraries

NumPy

Pandas

scikit-learn

matplotlib

Frameworks

Keras

TensorFlow

PyTorch

Bioconductor

Other

Git

AWS

Docker

Activities

                
Bioinformatics Course 2024
                  I provided a four-day bioinformatics course invited by Dr. Shen Li at Beijing Shijitan Hospital, Capital Medical University, in China
                
Accomplishments
                  In October 2024, I conducted a four-day (32-hour) bioinformatics course in Capital Medical University (CMU) - Beijing, a culmination of my experience in both bioinformatics research and education.
                  A surprising observation is the limited understanding of computer science fundamentals underlying many methods applied to health data. This knowledge gap often leads scientists to misapply methods and misuse terminology. For example, Principal Component Analysis (PCA) is frequently used as a visualization technique instead of Multi-dimensional Scaling (MDS). Similarly, terms like correlation, integration, and concatenation are often confused and misused.
                  Based on my understanding of the critical need for advanced training in the analysis of biological data, I designed this course for both bioinformaticians and computer scientists.
                
IWBBIO Conference 2024
                  I presented my works on the patient similarity network paradigm and patient classification
                
Accomplishments
                
NCB Week 2024
                  I was delighted to present my review about multi-omics data integration methods at the Finnish Nordic Computational Biology Conference 2024
                
Accomplishments
                  I was delighted to present at the Finnish Nordic Computational Biology Conference 2024 on the patient similarity network paradigm for patient classification. My talk covered the paradigm itself, along with existing methods like netDx (Shraddha Pai et al.) and SUPREME (Ziynet Nesibe Kesimoglu et al.), and introduced my own novel approach, StellarPath. I hope this paradigm gains wider adoption for omics data analysis. My sincere thanks to Ahmed Mohamed and the Nordic Computational Biology group for organizing the conference and giving me the opportunity to speak. An extra thanks also go to my PI Tarja Malm who gives me the chance to grow as researcher and partecipate in these amazing events.
                
NCB Week 2023
                  I presented and organized the Finnish Nordic Computational Biology Conference 2023 in Kuopio
                
Accomplishments
                  The Finnish Symposium on Computational Biology 2023 is organised by Luca Giudice under Regional Student Group (RSG) under the International Society for Computational Biology (ISCB) for Finland.
                  The symposium is tailored for Bioinformaticians and Computational Biologists. It's also an excellent opportunity for researchers or students working with biological data who are keen on exploring new bioinformatics methods, workflows, or challenges.

Education

Verona University

Verona, Italy

Degree: PhD in Computer Science
Thesis: Artificial Intelligence Techniques Integrate Biological Omics into Graphs for the Prediction and Pathway Analysis of Patient’s Disease
Date: May 2022

University of Verona, Verona, Italy

Degree: Master's degree in Medical Bioinformatics
GPA: 4.0/4.0
Date: August 2018

Relevant Courseworks:

Data Structures and Algorithms
Database Management Systems
Operating Systems
Machine Learning
Bioinformatics

Contact

luca.giudice@uef.fi

github.com/LucaGiudice

linkedin.com/in/luca-giudice-8982751a4/

researchgate.net/profile/Luca-Giudice

scholar.google.com/citations?user=fRbyTT8AAAAJ&hl=en