OmniJARVIS

Creator

Seonglae Cho

Created

2024 Oct 21 21:31

Editor

Seonglae Cho

Edited

2024 Oct 21 21:39

Refs

Decision Transformer

A Vision-Language Model (VLM) augmented with additional behavior tokens

Chain-of-Thought (CoT)

Decoder as Policy

Every 128 steps, OmniJARVIS is forced to reason again and produce new behavior tokens with the latest observation.
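The periodic re-reasoning described above can be sketched as a simple control loop. This is a minimal illustration, not the paper's implementation: `reason`, `decode_action`, and `step_env` are hypothetical callables standing in for the VLM's reasoning step, the decoder policy, and the environment, and observations and actions are plain integers for simplicity. Only the 128-step interval comes from the note.

```python
from typing import Callable, List, Tuple

# From the note: reasoning is forced again every 128 steps.
REPLAN_INTERVAL = 128


def run_episode(
    reason: Callable[[int], str],              # hypothetical: observation -> behavior token
    decode_action: Callable[[str, int], int],  # hypothetical: (behavior token, observation) -> action
    step_env: Callable[[int], int],            # hypothetical: action -> next observation
    first_obs: int,
    total_steps: int,
) -> List[Tuple[int, str]]:
    """Run the loop and return (step, behavior_token) for every re-reasoning step."""
    replans: List[Tuple[int, str]] = []
    obs = first_obs
    behavior = ""
    for t in range(total_steps):
        # Every REPLAN_INTERVAL steps, reason again on the latest observation
        # and replace the current behavior token.
        if t % REPLAN_INTERVAL == 0:
            behavior = reason(obs)
            replans.append((t, behavior))
        # Between re-reasoning steps, the decoder acts as the policy,
        # conditioned on the current behavior token.
        action = decode_action(behavior, obs)
        obs = step_env(action)
    return replans
```

In this sketch the behavior token is held fixed between re-reasoning steps, so the decoder alone determines the low-level actions within each 128-step window.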

https://arxiv.org/pdf/2407.00114
