[Incomplete] A handbook with concepts and terms for Language conditioned RL


Category : Notes


Recurring Models and Papers:

MCIL: Multi-context imitation learning

Multi-context imitation learning is a framework for generalizing across heterogeneous task and goal descriptions. In a nutshell: collect trajectories (called "play" data in the paper). For each trajectory, take the image of its final (goal) state or a language description of the task it accomplishes. These images or texts are called contexts. A context-conditioned policy $\pi_{\theta}(a_t \mid s_t, z)$, where $z$ is the context vector, is then trained with the MCIL objective function. A separate encoder is used for each context type (e.g. image context or text context) to map the context to $z$.
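The recipe above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's implementation: the class names (`ImageGoalEncoder`, `TextEncoder`, `Policy`), the linear encoders, the unit-variance Gaussian policy, and the per-step averaging inside `mcil_loss` are all hypothetical choices made to keep the example small. The one idea it tries to show is that each context type gets its own encoder into a shared latent space $z$, a single policy is conditioned on $z$, and the imitation loss is averaged over the context types.

```python
import math
import random

LATENT_DIM = 4  # size of the shared context vector z (assumed for illustration)


def init_matrix(rows, cols, rng):
    """Small random row-major weight matrix."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class ImageGoalEncoder:
    """Hypothetical encoder: flattened goal image -> context vector z."""

    def __init__(self, image_dim, rng):
        self.w = init_matrix(LATENT_DIM, image_dim, rng)

    def __call__(self, image):
        return linear(self.w, image)


class TextEncoder:
    """Hypothetical encoder: mean of per-token embeddings -> context vector z."""

    def __init__(self, vocab, rng):
        self.emb = {tok: [rng.uniform(-0.1, 0.1) for _ in range(LATENT_DIM)]
                    for tok in vocab}

    def __call__(self, instruction):
        tokens = [t for t in instruction.lower().split() if t in self.emb]
        if not tokens:
            return [0.0] * LATENT_DIM
        return [sum(self.emb[t][i] for t in tokens) / len(tokens)
                for i in range(LATENT_DIM)]


class Policy:
    """pi(a | s, z): unit-variance Gaussian whose mean is linear in [s; z]."""

    def __init__(self, state_dim, action_dim, rng):
        self.w = init_matrix(action_dim, state_dim + LATENT_DIM, rng)

    def log_prob(self, action, state, z):
        mean = linear(self.w, state + z)  # list concatenation of s and z
        sq_err = sum((a - m) ** 2 for a, m in zip(action, mean))
        return -0.5 * sq_err - 0.5 * len(action) * math.log(2 * math.pi)


def mcil_loss(policy, datasets):
    """Average the imitation loss (negative log-likelihood) over context types.

    datasets: list of (encoder, examples) pairs, one per context type.
    Each example is (trajectory, context); a trajectory is a list of
    (state, action) pairs.
    """
    total = 0.0
    for encoder, examples in datasets:
        nll, steps = 0.0, 0
        for trajectory, context in examples:
            z = encoder(context)  # the context-specific encoder produces z
            for state, action in trajectory:
                nll -= policy.log_prob(action, state, z)
                steps += 1
        total += nll / steps  # per-step mean: an assumption of this sketch
    return total / len(datasets)


rng = random.Random(0)
image_encoder = ImageGoalEncoder(image_dim=6, rng=rng)
text_encoder = TextEncoder(vocab=["open", "the", "drawer"], rng=rng)
policy = Policy(state_dim=3, action_dim=2, rng=rng)

# One toy trajectory, labeled once with a goal image and once with text.
trajectory = [([0.1, 0.2, 0.3], [0.0, 1.0]), ([0.2, 0.1, 0.0], [1.0, 0.0])]
datasets = [
    (image_encoder, [(trajectory, [0.5] * 6)]),
    (text_encoder, [(trajectory, "open the drawer")]),
]
loss = mcil_loss(policy, datasets)
```

In a real system the encoders and the policy would be neural networks trained jointly by gradient descent on this loss; the sketch only evaluates the loss once to show how the pieces connect.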

HULC: Hierarchical Universal Language Conditioned Policies

Some terms

  • Natural language conditioned policy $\pi_{\theta}(a_t \mid s_t, l)$: outputs action $a_t \in \mathcal{A}$ conditioned on the current state $s_t \in \mathcal{S}$ and a free-form language instruction $l \in \mathcal{L}$.
About Vihaan Akshaay

I recently completed my M.S. in Computer Science at the University of California, Santa Barbara, where I was mentored by Lei Li. My research focused on blending intuitive concepts with machine learning to tackle real-world challenges across diverse scales. Inspired by how humans approach solving the Rubik's Cube, I developed a novel algorithm under the guidance of Yu-Xiang Wang, introducing a bi-directional framework for goal conditioning in state-space search problems. Additionally, I proposed an edge-attention-based U-Net, drawing on how edges are used to annotate shorelines. In collaboration with Gen Li, I curated a large-scale landslide detection dataset by leveraging 40 years of Landsat imagery, contributing to AI for Earth and advancing the use of machine learning for environmental applications.

I earned my B.Tech in Mechanical Engineering and M.Tech in Robotics from IIT Madras, where my Master’s thesis on unsupervised behavior recognition in mice was guided by B. Ravindran and Dr. Vivek Kumar (The Jackson Laboratory). At IIT Madras, I led the iBot Robotics Club and worked on the ARTEMIS Railroad Crack Detection Robot, which won the International James Dyson Award. I also completed research internships, including analyzing the stability of Deep Q-Networks with Siva Theja Maguluri (Georgia Tech) and working on kernelized eDRVFLs, a type of deep randomized neural model, with P. N. Suganthan (NTU Singapore).
