Enhancing node classification in Graph Neural Networks through Curriculum Learning - BC-919
Project type: Research
Desired discipline(s): Engineering - computer / electrical, Engineering, Computer science, Mathematical sciences, Mathematics
Company: Mastercard
Project duration: 6 months to 1 year
Desired start date: As soon as possible
Language requirement: English
Location(s): Vancouver, BC, Canada
Number of positions: 1
Desired education level: Undergraduate/Bachelor's, Master's, PhD, Postdoctoral fellow, Recent graduate
Open to applicants registered at an institution outside of Canada: No
About the company:
We work to connect and power an inclusive, digital economy that benefits everyone, everywhere by making transactions safe, simple, smart and accessible. Using secure data and networks, partnerships and passion, our innovations and solutions help individuals, financial institutions, governments and businesses realize their greatest potential. Our decency quotient, or DQ, drives our culture and everything we do inside and outside of our company. We cultivate a culture of inclusion for all employees that respects their individual strengths, views, and experiences. We believe that our differences enable us to be a better team – one that makes better decisions, drives innovation and delivers better business results. At AI Garage, we use state-of-the-art AI techniques to solve some of the most important problems in the financial world.
Describe the project:
Node classification is a fundamental graph-based task that aims to predict the classes of unlabeled nodes, and Graph Neural Networks (GNNs) are the state-of-the-art methods for it. Current GNNs assume that all nodes in the training set contribute equally during training. However, the quality of training nodes varies greatly, and two types of low-quality training nodes can harm GNN performance:
(1) Inter-class nodes, which sit near class boundaries and lack the typical characteristics of their classes. Because GNNs are data-driven, training on these nodes can degrade accuracy.
(2) Mislabeled nodes. Nodes in real-world graphs are often mislabeled, which can significantly degrade the robustness of GNNs.

To address this, we want to explore curriculum learning, a training strategy that first trains a machine learning model on an easier subset of the training data and then gradually introduces more difficult samples. By excluding low-quality, difficult samples during the initial training phase, curriculum learning mitigates overfitting to noise in the data and thus improves the model's accuracy and robustness.

Objectives:
1. Investigate the effectiveness of curriculum learning in improving node classification performance in GNNs.
2. Develop curriculum strategies tailored to the characteristics of graph-structured data, including imbalanced node distributions and varying degrees of noise.
3. Assess the impact of different curriculum schedules on model convergence, generalization, and robustness.
4. Evaluate the proposed approach on benchmark datasets across diverse domains to demonstrate its effectiveness and versatility.

Related Papers:
- Wei, Xiaowen, Xiuwen Gong, Yibing Zhan, Bo Du, Yong Luo, and Wenbin Hu. "CLNode: Curriculum learning for node classification." In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, pp. 670-678. 2023.
- Zhang, Zheng, Junxiang Wang, and Liang Zhao. "Curriculum learning for graph neural networks: Which edges should we learn first." Advances in Neural Information Processing Systems 36 (2024).
- Li, Haoyang, Xin Wang, and Wenwu Zhu. "Curriculum graph machine learning: A survey." arXiv preprint arXiv:2302.02926 (2023).
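
For illustration only, the easy-to-hard idea in the project description can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method of any of the papers listed: the difficulty score (the share of a node's neighbors that carry a different label) and the function names `neighbor_label_disagreement`, `pacing_fraction`, and `curriculum_subset` are hypothetical choices made for this example, and the toy graph is invented.

```python
import math


def neighbor_label_disagreement(adjacency, labels):
    """Hypothetical difficulty score: share of a node's neighbors with a different label.

    adjacency[i] lists the neighbor indices of node i; labels[i] is its class.
    Nodes near class boundaries (or mislabeled nodes) tend to score higher.
    """
    scores = []
    for node, neighbors in enumerate(adjacency):
        if not neighbors:
            scores.append(0.0)  # isolated node: no evidence of disagreement
            continue
        differing = sum(1 for n in neighbors if labels[n] != labels[node])
        scores.append(differing / len(neighbors))
    return scores


def pacing_fraction(epoch, total_epochs, start_fraction=0.5, schedule="linear"):
    """Fraction of the training nodes (easiest first) made available at this epoch."""
    if total_epochs <= 0:
        raise ValueError("total_epochs must be positive")
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    if schedule == "linear":
        growth = progress
    elif schedule == "root":
        growth = math.sqrt(progress)  # adds hard nodes faster early on
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    return min(1.0, start_fraction + (1.0 - start_fraction) * growth)


def curriculum_subset(difficulty, epoch, total_epochs, **pacing_kwargs):
    """Indices of the easiest nodes to train on at this epoch."""
    order = sorted(range(len(difficulty)), key=lambda i: difficulty[i])
    fraction = pacing_fraction(epoch, total_epochs, **pacing_kwargs)
    count = max(1, math.ceil(fraction * len(order)))
    return order[:count]


# Toy graph (assumed): a path 3-4-5 attached to a triangle 0-1-2, two classes.
adjacency = [[1, 2], [0, 2], [0, 1, 3], [2, 4], [3, 5], [4]]
labels = [0, 0, 0, 1, 1, 1]
difficulty = neighbor_label_disagreement(adjacency, labels)
for epoch in (0, 5, 10):
    print(epoch, curriculum_subset(difficulty, epoch, total_epochs=10))
```

In an actual GNN training loop, the selected indices would typically restrict which nodes contribute to the loss at each epoch, while message passing still uses the full graph. Comparing the "linear" and "root" schedules is one small instance of the schedule comparison named in objective 3.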
Required expertise/skills:
- Good theoretical and practical familiarity with deep learning models.
- Decent understanding of Graph Neural Network formulations.
- Good understanding of Machine Learning theory.
- Decent understanding of probability theory and statistics.
- Experience in Knowledge Graphs is a plus.
- Good experience with packages such as PyTorch and TensorFlow.