Tech Lead/PM: Justin Lin & Nathan Reilly
GitHub: https://github.com/uwrealitylabs/universal-text-unity
Scrum Board: https://www.notion.so/uwrl/1f9bc072402f8056b481d64fa56b4ef5?v=1f9bc072402f81b7a7ca000c4e8d9e22&pvs=4
Expected Delivery: July 2025

Changes to Spec:

Change Date | Change Author | Change Reason
Aug. 17, 2024 | Justin Lin | Initial Author
Aug. 31, 2024 | Nathan Reilly & Justin Lin | Technical revisions for Text Label composition. Added Introduction
Sep. 27, 2024 | Nathan Reilly | Large-scale revision of the implementation
Jan. 20, 2025 | Nathan Reilly | Revision of the UTT and UTS implementation & other updates for W25

Point Persons:

Role | Name | Contact Info
Sedra Lead | Peter Teertstra | [email protected]
Team Leads | Justin Lin, Nathan Reilly | [email protected], [email protected]
UW Reality Labs Leads | Vincent Xie, Kenny Na, Justin Lin | [email protected], [email protected], [email protected]

A Google Docs version of this tech spec is available here.

Introduction

When you prompt a virtual assistant (for example, Meta AI on Ray-Ban Meta smart glasses) with "What am I looking at?", what actually happens? The current pipeline appears fairly simplistic: the cameras on the glasses capture a picture, that picture is passed through a model that assigns a text label to the image, and that single label describing the whole image is then passed to an LLM. This process, particularly the step in which one model must describe everything in an image using words, is often inaccurate.
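For illustration, here is a minimal sketch of that naive pipeline as described above. The interface and class names (ICaptionModel, ILanguageModel, NaiveAssistantPipeline) are hypothetical and not part of this project; the point is only that the entire scene gets compressed into one caption before the LLM ever sees the user's question.

```csharp
// Illustrative sketch of the naive "caption the whole image, then ask the LLM"
// pipeline described above. All type names here are hypothetical.
using System.Threading.Tasks;

public interface ICaptionModel
{
    // Produces a single text description of the entire image.
    Task<string> DescribeAsync(byte[] imageBytes);
}

public interface ILanguageModel
{
    // Returns the LLM's answer to a text prompt.
    Task<string> CompleteAsync(string prompt);
}

public class NaiveAssistantPipeline
{
    private readonly ICaptionModel _captioner;
    private readonly ILanguageModel _llm;

    public NaiveAssistantPipeline(ICaptionModel captioner, ILanguageModel llm)
    {
        _captioner = captioner;
        _llm = llm;
    }

    // Camera frame -> one caption for the whole scene -> LLM answer.
    // Everything the LLM knows about the scene is whatever survives
    // this single caption, which is where the inaccuracy creeps in.
    public async Task<string> AnswerAsync(byte[] cameraFrame, string userQuestion)
    {
        string sceneCaption = await _captioner.DescribeAsync(cameraFrame);
        string prompt = $"Scene description: {sceneCaption}\nUser question: {userQuestion}";
        return await _llm.CompleteAsync(prompt);
    }
}
```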

What if we could build a system that…

If we created this, we could use it for…