Vision-language agent for automated UI/UX testing using Playwright - BC-968

Project type: Innovation
Desired discipline(s): Engineering - computer / electrical, Engineering, Computer science, Mathematical Sciences, Mathematics
Company: Farpoint Technologies Inc.
Project Length: 6 months to 1 year
Preferred start date: 09/01/2025
Language requirement: English
Location(s): Vancouver, BC, Canada
No. of positions: 2
Desired education level: Master's, PhD
Open to applicants registered at an institution outside of Canada: No

About the company: 

Farpoint Technologies is a leading AI digital transformation consulting company that empowers top-tier organizations to build the AI-assisted workforce of the future. We specialize in providing consulting services to large public, private, and government entities, helping them create AI-accelerated workflows and innovative solutions. Our expertise spans LLMs, diffusion models, and multimodal models, as well as the execution of special projects.

Describe the project:

This project introduces a novel vision-language AI framework for automating visual and usability testing in software development. Unlike conventional code-focused AI systems, it integrates visual perception through Vision-Language Models (VLLMs) to identify UX issues directly from rendered interfaces. The framework automatically suggests actionable design and usability improvements, letting front-end QA processes address visual regressions and accessibility issues before human review, significantly improving software quality and user-experience consistency.
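
As a rough illustration of the intended pipeline (a sketch under assumptions, not the project's implementation), the snippet below screenshots a rendered page with Playwright and asks a hosted vision model for a critique; the OpenAI-compatible client, the gpt-4o model name, and the prompt are all placeholders:

```python
# Minimal sketch, not project code: screenshot a page with Playwright and ask
# a hosted vision model for a UX critique. Assumes an OpenAI-compatible
# endpoint; the model name and prompt are illustrative placeholders.
import base64
from openai import OpenAI
from playwright.sync_api import sync_playwright

def critique_page(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1280, "height": 800})
        page.goto(url)
        png = page.screenshot(full_page=True)  # rendered UI as PNG bytes
        browser.close()

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    data_url = "data:image/png;base64," + base64.b64encode(png).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any vision-capable model works
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Review this UI against WCAG 2.2 and common design "
                         "standards. List concrete, actionable fixes."},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
    )
    return response.choices[0].message.content

print(critique_page("http://localhost:3000"))  # hypothetical dev server
```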

Main tasks for the candidate:
• Extend the current AI coding software to launch applications and to capture screen and local system audio for a computer-use agentic system, leveraging the Playwright codebase and integrating large parts of it into Slate.
• Build on existing open-source computer-use codebases, improving them with a slow outer loop driven by a large model and a fast-model inner loop that captures software outputs and user inputs at >1 Hz (see the sketch after this list).
• Assess and benchmark VLLMs that can perform this task locally, as well as models hosted online.
• Prompt-engineer a system that generates clear, actionable UX diagnostic reports referencing accessibility and design standards, which the AI coding assistant can then use to improve the codebase.
• Develop a mechanism to automatically create and apply front-end code patches.
• Implement validation loops that confirm UX improvements post-refactoring, and track progress over time.
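
The outer/inner-loop task above can be pictured as a cheap change detector running at the capture rate that escalates to an expensive model only when the interface changes. A minimal asyncio sketch, with a hash-based frame diff standing in for the fast model and a stubbed slow_diagnose for the large one (both stand-ins are assumptions, not project code):

```python
# Minimal sketch of the slow-outer / fast-inner agent loop.
import asyncio
import hashlib
import time

CAPTURE_HZ = 2  # inner-loop target rate (>1 Hz per the task description)
_last_digest: bytes | None = None

def frame_changed(frame: bytes) -> bool:
    """Fast inner-loop triage: cheap digest diff against the previous frame."""
    global _last_digest
    digest = hashlib.sha256(frame).digest()
    changed = digest != _last_digest
    _last_digest = digest
    return changed

async def slow_diagnose(frame: bytes) -> str:
    """Placeholder for the expensive large-model diagnostic call."""
    return "UX diagnostic report (stub)"

async def agent_loop(capture_frame) -> None:
    while True:
        tick = time.monotonic()
        frame = capture_frame()                # screenshot bytes, audio, etc.
        if frame_changed(frame):               # fast path runs every tick
            print(await slow_diagnose(frame))  # slow path runs only on change
        elapsed = time.monotonic() - tick
        await asyncio.sleep(max(0.0, 1 / CAPTURE_HZ - elapsed))

# e.g. asyncio.run(agent_loop(lambda: page.screenshot())) with a Playwright page
```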

Methodology / techniques:
• Automated UI testing using Playwright.
• Testing and validating VLLMs with domain-specific UI/UX prompts and reference designs.
• Accessibility diagnostics (WCAG 2.2 guidelines), contrast checks, and usability metrics; a contrast-check sketch follows this list.
• Automated CSS and component property patch generation (e.g., Tailwind classes); a minimal patch sketch also follows below.
• Dashboard implementation (e.g., React, Streamlit) visualizing UX improvements and test outcomes.
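
For the accessibility item above, the WCAG contrast check is fully specified by the standard's relative-luminance formula, so it can be reproduced directly; the Level AA thresholds are 4.5:1 for normal text and 3:1 for large text:

```python
# WCAG 2.x contrast-ratio check, computed from the spec's relative-luminance
# formula. Level AA thresholds: 4.5:1 for normal text, 3:1 for large text.
def _linear(c8: int) -> float:
    # sRGB channel (0-255) to linear light, per the WCAG definition
    c = c8 / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb: tuple[int, int, int]) -> float:
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

print(contrast_ratio((0, 0, 0), (255, 255, 255)))        # 21.0, the maximum
print(contrast_ratio((119, 119, 119), (255, 255, 255)))  # ~4.48, fails AA
```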
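
The patch-generation item can start as a narrow codemod keyed off a diagnostic. In the hypothetical sketch below, the rewrite rule, Tailwind class names, and file path are all illustrative; a real pipeline would derive them from the VLLM's report:

```python
# Hypothetical patch step: bump a Tailwind text shade in response to a
# contrast failure and write the file back. Rule, classes, and path are
# illustrative assumptions, not project specifics.
import re
from pathlib import Path

def darken_text_class(source: str) -> str:
    # One example rewrite rule: text-gray-400 -> text-gray-600
    return re.sub(r"\btext-gray-400\b", "text-gray-600", source)

def patch_component(path: str) -> None:
    f = Path(path)
    f.write_text(darken_text_class(f.read_text()))

patch_component("src/components/Card.tsx")  # hypothetical component path
```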

Required expertise/skills:
• Familiarity with automated UI-testing frameworks (Playwright or Cypress).
• Experience or interest in Vision-Language Models (VLLMs) and fine-tuning pipelines using Hugging Face.
• Solid understanding of front-end technologies (React, Next.js, Tailwind CSS, CSS-in-JS).
• Knowledge of accessibility standards and UX metrics (WCAG 2.2, color contrast analysis).
• Experience developing automated code modification tools or scripts (Node.js-based tooling, ESLint, Stylelint).
• Basic skills in creating data visualization dashboards (Streamlit or React-based solutions).

Assets (optional):
• Experience integrating AI models into continuous integration pipelines.
• Familiarity with design tools like Figma and associated APIs.
• Understanding of computer vision concepts (SSIM, visual differencing).