Tech Lead/PM | Justin Lin & Nathan Reilly
GitHub | https://github.com/uwrealitylabs/universal-text-unity
Scrum Board | Link
Expected Delivery | EOT W25

Changes to Spec:

Change Date | Change Author | Change Reason
Aug. 17, 2024 | Justin Lin | Initial Author
Aug. 31, 2024 | Nathan Reilly & Justin Lin | Technical revisions for Text Label composition. Added Introduction.
Sep. 27, 2024 | Nathan Reilly | Large-scale revision of the implementation

Point Persons:

Role | Name | Contact Info
Sedra Lead | Peter Teertstra | [email protected]
Team Lead | Justin Lin, Nathan Reilly | [email protected], [email protected]
UW Reality Labs Leads | Vincent Xie, Kenny Na, Justin Lin | [email protected], [email protected], [email protected]

Introduction

When you prompt a virtual assistant (for example, Meta AI on Ray-Ban glasses) with “What am I looking at?”, what happens? Currently, the pipeline seems rather simplistic: the cameras on the glasses take a picture, that picture is passed through a model that assigns text labels to images, and finally the text label describing the whole image is passed into an LLM. This process is often inaccurate, especially at the step where a model must describe everything in an image using words.

What if we could build a system that…

If we created this, we could use it for…