Our Team
We’re research scientists who’ve spent years advancing AI avatar and audio-visual generation — publishing at top conferences and shipping ultra-low-latency ML products to millions. We combine frontier research with the ruthless engineering needed for consumer-grade, real-time systems.
Fangchang Ma
CEO
MIT PhD in Robotics and ML. Previously an Engineering Manager at Apple. Published in NeurIPS, ICLR, CVPR, ECCV, ICCV, ICRA, and IROS, with 2400+ citations.
Edward Zhang
CTO
UW PhD in Computer Graphics. Previously a Senior Research Scientist at Apple and tech lead on neural rendering research. Published in CVPR and SIGGRAPH, with 400+ citations.
Karren Yang
Founding Research Scientist
MIT PhD in Audio-Visual Synthesis. Previously a Senior Research Scientist at Apple, with experience at Niantic Labs, Meta Reality Labs, Bosch Center for AI, and Adobe Research. Published in top AI conferences with 1800+ citations.
Claudia Vanea
Founding Research Scientist
Oxford PhD in AI for Health. Published in Nature Communications and NeurIPS. Previously founder at South Park Commons.
Nicole Zhen
Chief of Staff
Yale Economics. Previously management consulting at McKinsey & Company, and enterprise strategy and growth at Nordstrom.
Rajath Kumar
Founding Research Scientist
Columbia Master's in Electrical Engineering. Previously a research scientist at Amazon AGI.
Elliott Bartsch
Founding Engineer
Harvard Statistics. Previously a Staff ML Engineer and Senior Manager at Discord, with experience at Day Zero Diagnostics and Firecracker.
Soyong Shin
Founding Research Scientist
Carnegie Mellon PhD in Mechanical Engineering, with experience at Meta Fundamental AI Research (FAIR).
Yaser Sheikh
Advisor
Former VP of Codec Avatars at Meta and Professor at Carnegie Mellon.
Our Approach
Auto-regressive transformers can be trained to model people.
Consider the overwhelming success of today’s LLMs: by simply predicting one token at a time, they learn the semantic complexities of human language. Similarly, we believe that predicting how humans will act, one visual and audio frame at a time, will produce a deeper and more nuanced understanding of being human.
We are building a unified system that can both appropriately express and empathetically interpret the subtle and meaningful landscape of human communication, in real time.
What differentiates us
Radical transparency
We communicate openly so everyone can make informed decisions.
Relentless speed
We bias towards action, iterate fast, and learn quickly.
Doing right by people
Integrity and respect are not negotiable.
In-person collaboration
Being together fuels our energy and accelerates our problem-solving.