Manifold Learning - BC-888
Project type: Research
Desired discipline(s): Engineering - computer / electrical, Engineering, Computer science, Mathematical Sciences, Mathematics
Company: Mastercard AI Garage
Project Length: 6 months to 1 year
Preferred start date: 05/01/2024
Language requirement: English
Location(s): BC, Canada
No. of positions: 1
Desired education level: College, Undergraduate/Bachelor, Master's, PhD, Postdoctoral fellow, Recent graduate
Open to applicants registered at an institution outside of Canada: Yes
About the company:
We work to connect and power an inclusive, digital economy that benefits everyone, everywhere by making transactions safe, simple, smart and accessible. Using secure data and networks, partnerships and passion, our innovations and solutions help individuals, financial institutions, governments and businesses realize their greatest potential. Our decency quotient, or DQ, drives our culture and everything we do inside and outside of our company. We cultivate a culture of inclusion for all employees that respects their individual strengths, views, and experiences. We believe that our differences enable us to be a better team – one that makes better decisions, drives innovation and delivers better business results. At AI Garage, we use state-of-the-art AI techniques to solve some of the most important problems in the financial world.
Describe the project:
Manifold learning is a subfield of machine learning that focuses on nonlinear dimensionality reduction, with the goal of capturing the underlying structure of high-dimensional data in lower-dimensional representations. The idea is that even though data may live in a high-dimensional space, it often lies on or near a lower-dimensional manifold within that space. t-SNE, Isomap, and locally linear embedding (LLE) are popular techniques in this category.
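To make the manifold idea concrete, here is a minimal sketch, assuming NumPy only. The points sit in 3D but lie on a 1-D curve (a helix), and a bare-bones Isomap recovers a single coordinate that tracks position along the curve. The `isomap` function, the helix data, and the parameter values are illustrative assumptions and are not part of the project specification.

```python
import numpy as np

def isomap(X, n_neighbors=6, n_components=1):
    """Minimal Isomap: kNN graph -> shortest paths -> classical MDS.
    Assumes the kNN graph is connected (true for the toy data below)."""
    n = X.shape[0]
    # Pairwise Euclidean distances in the original space.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    # Keep only edges to each point's nearest neighbours (column 0 is the point itself).
    G = np.full((n, n), np.inf)
    nn = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, nn.ravel()] = D[rows, nn.ravel()]
    G = np.minimum(G, G.T)          # make the neighbourhood graph undirected
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall: geodesic (along-the-manifold) distances.
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # Classical MDS on the geodesic distances.
    J = np.eye(n) - 1.0 / n         # centering matrix
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Toy data: a helix, i.e. a 1-D manifold embedded in 3-D space.
t = np.linspace(0, 4 * np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t), 0.1 * t])

Y = isomap(X)                                   # 3-D -> 1-D
corr = abs(np.corrcoef(t, Y[:, 0])[0, 1])       # how well Y tracks the curve parameter
print(X.shape, "->", Y.shape, "abs corr with t:", round(corr, 3))
```

In practice a library implementation (for example the manifold module of scikit-learn) would normally replace the hand-written function; it is spelled out here only to show the three steps.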
A transaction graph is a representation where nodes represent entities (like accounts, users, or devices) and edges represent transactions or interactions between these entities. In the context of transaction data, manifold learning can be beneficial for several purposes:
- Visualization: A major use case is visualizing high-dimensional data. By reducing the dimensionality of the transaction graph, you can visualize it in 2D or 3D space. This can help in identifying clusters or patterns in the data which might be indicative of groups of users with similar transaction behaviors.
- Fraud Detection: Anomalous patterns are hard to detect in high dimensions due to the curse of dimensionality. By embedding the transaction graph into a lower-dimensional space, anomalies or unusual patterns (possibly indicative of fraud) might become more apparent.
- Clustering & Segmentation: Once the transaction graph is represented in a lower-dimensional space, traditional clustering algorithms can be applied more effectively. This can be used to segment users into different groups based on their transaction behaviors.
- Feature Engineering: The reduced dimensional representation can serve as additional features for other machine learning tasks. For instance, if you're predicting user churn or lifetime value, the manifold coordinates can be supplementary input features.
- Noise Reduction: Manifold learning techniques can help filter out noise from the original data by preserving only the most important structural information.
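As one illustration of the fraud detection use case above, the sketch below scores each point in a low-dimensional embedding by its mean distance to its nearest neighbours and reports the most isolated one. The 2-D coordinates are synthetic (a hypothetical cluster plus one planted outlier), and the `knn_score` helper and the choice of `k=5` are assumptions made for the example, not a prescribed method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D embedding: 50 "ordinary" accounts plus one planted outlier (index 50).
Z = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)), [[8.0, 8.0]]])

def knn_score(Z, k=5):
    """Mean distance to the k nearest neighbours; larger means more isolated."""
    D = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1))
    # Column 0 of each sorted row is the point's zero distance to itself.
    return np.sort(D, axis=1)[:, 1:k + 1].mean(axis=1)

scores = knn_score(Z)
suspect = int(np.argmax(scores))    # index of the most isolated point
print("most isolated point:", suspect)
```

A high score only marks a point as unusual relative to its neighbours; whether it corresponds to fraud would need separate validation.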
To use manifold learning on a transaction graph:
- Graph Embeddings: Begin by representing the graph in high-dimensional space. Node2Vec, DeepWalk, and GraphSAGE are some popular methods to embed graphs into continuous vector spaces.
- Apply Manifold Learning: Once you have the high-dimensional embeddings, apply your chosen manifold learning technique to reduce their dimensionality.
- Use the Transformed Data: With the lower-dimensional data, perform any of the tasks mentioned above: visualization, clustering, fraud detection, etc.
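The three steps above can be sketched end to end on a toy graph, assuming NumPy only. Several simplifications are assumptions of this sketch rather than part of the project: the transaction graph is synthetic, the high-dimensional node vectors come from short random-walk visit probabilities (a crude stand-in for Node2Vec or DeepWalk, not an implementation of either), and the manifold learning step is a Laplacian eigenmaps style spectral embedding with a Gaussian affinity, chosen here because it fits in a few lines.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per, n = 20, 40
community = np.repeat([0, 1], n_per)    # ground-truth group of each account

# Toy transaction graph (synthetic): accounts transact often within their
# own community and rarely across communities.
same = community[:, None] == community[None, :]
A = np.triu(rng.random((n, n)) < np.where(same, 0.5, 0.02), k=1).astype(float)
for c in (0, 1):                        # a ring inside each community avoids isolated accounts
    for i in range(n_per):
        A[c * n_per + i, c * n_per + (i + 1) % n_per] = 1.0
A[0, n_per] = 1.0                       # one bridge edge keeps the graph connected
A = np.maximum(A, A.T)                  # undirected adjacency matrix

# Step 1. Graph embedding (stand-in): 1- to 3-step random-walk visit
# probabilities give every node a 40-dimensional vector.
P = A / A.sum(axis=1, keepdims=True)
F = P + P @ P + P @ P @ P

# Step 2. Manifold learning: Laplacian eigenmaps style spectral embedding.
def laplacian_eigenmaps(X, n_components=2):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
    sigma2 = np.median(sq[sq > 0])                         # kernel width heuristic
    W = np.exp(-sq / (2 * sigma2))                         # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    L = np.eye(len(X)) - W / np.sqrt(d[:, None] * d[None, :])  # normalized Laplacian
    vals, vecs = np.linalg.eigh(L)                         # eigenvalues in ascending order
    # Skip the trivial first eigenvector, then undo the degree scaling.
    return vecs[:, 1:n_components + 1] / np.sqrt(d)[:, None]

Y = laplacian_eigenmaps(F, n_components=2)                 # 40-D -> 2-D

# Step 3. Use the transformed data: segment accounts by the sign of the
# first coordinate and compare with the known communities.
segment = (Y[:, 0] > 0).astype(int)
agreement = max(np.mean(segment == community), np.mean(segment != community))
print(Y.shape, "agreement with true communities:", agreement)
```

On real transaction data, Step 1 would use a learned embedding such as Node2Vec, DeepWalk, or GraphSAGE, and Step 2 could be swapped for t-SNE, Isomap, or LLE depending on whether the goal is visualization or downstream modelling.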
Required expertise/skills:
- Good theoretical and practical familiarity with deep learning models
- Decent understanding of Graph Neural Network formulations
- Good understanding of Machine Learning theory
- Decent understanding of probability theory and statistics
- Experience with knowledge graphs is a plus
- Good experience with packages such as PyTorch and TensorFlow