This repository contains a draft of my final project for the CMSC426 Computer Vision course. The assignment is to build a project and write an accompanying paper on a computer vision topic.
This final project presents a real-time wrist gesture control framework that connects pose estimation from consumer-grade cameras to third-party software through a WebSocket-based communication protocol. We built a prototype system that captures wrist movements using a deep learning pose detector (e.g., YOLOv11), converts them into interpretable gestures, and relays commands to external applications via low-latency WebSocket messaging. The framework demonstrates acceptable performance on low-cost hardware, achieving a system latency of 100 ms and reliable gesture identification in real-world settings. Because it does not depend on specialized hardware, this work provides a scalable basis for vision-based human-computer interaction, with potential extensions to collaborative and accessibility-focused applications.
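As a rough illustration of the middle of that pipeline, the sketch below turns a short track of wrist positions into a gesture label and packs it into a JSON message that could be sent over a WebSocket. It is a minimal sketch under stated assumptions, not the project's actual code: the function names (`classify_swipe`, `build_command`), the swipe labels, the 80-pixel threshold, and the message fields are all hypothetical, and the pose detector and the WebSocket transport are left out.

```python
import json
import time
from typing import List, Optional, Tuple

# Hypothetical representation: (x, y) wrist position in image pixels,
# as a pose detector might report it for each frame.
Point = Tuple[float, float]


def classify_swipe(track: List[Point], min_travel: float = 80.0) -> Optional[str]:
    """Label a short wrist track as a swipe, or return None if it barely moved.

    The 80-pixel threshold is an illustrative assumption, not a tuned value.
    """
    if len(track) < 2:
        return None
    # Net displacement between the first and last observed positions.
    dx = track[-1][0] - track[0][0]
    dy = track[-1][1] - track[0][1]
    if max(abs(dx), abs(dy)) < min_travel:
        return None
    if abs(dx) >= abs(dy):
        return "swipe_right" if dx > 0 else "swipe_left"
    # Image coordinates grow downward, so positive dy is a downward swipe.
    return "swipe_down" if dy > 0 else "swipe_up"


def build_command(gesture: str, timestamp: Optional[float] = None) -> str:
    """Serialize a gesture as a JSON text message (hypothetical schema)."""
    payload = {
        "type": "gesture",
        "name": gesture,
        "ts": time.time() if timestamp is None else timestamp,
    }
    return json.dumps(payload)


if __name__ == "__main__":
    wrist_track = [(100.0, 200.0), (150.0, 205.0), (220.0, 210.0)]
    label = classify_swipe(wrist_track)
    if label is not None:
        print(build_command(label))
```

In a full system, the string returned by `build_command` would be handed to whatever WebSocket client or server the receiving application expects. If the camera image is mirrored, the left and right labels would need to be swapped.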
© 2025 • All content within this project is strictly the property of Eric Xu and is not for public use without permission.