Microsoft and Peking University Unlock AI's Potential in OS Navigation

Written by Aaron S., Editor-In-Chief

Last Updated: February 14, 2024

Key Takeaways

Scientists from Microsoft Research and Peking University are trying to enable ChatGPT to operate autonomously within an OS.
The researchers found that large language models struggle with operating system tasks because they lack understanding, reasoning, exploration, and reflection capabilities.
To address the challenges, AndroidArena was developed, simulating the Android OS environment for more effective AI learning.

Microsoft Research and Peking University scientists are trying to enable ChatGPT, powered by GPT-4, to autonomously navigate and operate within an operating system.

The work addresses a task that has proven difficult: equipping large language models (LLMs) to manipulate and interact with digital environments, going beyond their established text generation capabilities.

AI models have shown proficiency in simulated environments such as video games, largely through reinforcement learning. Operating systems, by contrast, present a multifaceted challenge.

This complexity stems from the need for AI to engage with various components, applications, and programs, coupled with the risk of data loss through errors.

The researchers identified three primary obstacles for LLMs: the vast and dynamic action space, the need for cooperation between applications, and alignment with user constraints such as security and preferences.

They also added:

"We identify a lack of four key capabilities, i.e., understanding, reasoning, exploration, and reflection, as primary reasons for the failure of LLM agents."

To overcome these challenges, the team introduced AndroidArena, a novel training platform mimicking the Android operating system and allowing LLMs to learn through exploration.

This environment was created to facilitate the learning and exploration of LLMs in a more representative setting, focusing on the unique requirements of operating system manipulation.

The team also fed automated feedback about the model's previous attempts and actions back into its prompts, giving it a form of "reflection" or memory. This approach improved model accuracy by 27%.
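The idea of feeding feedback on earlier attempts back into the prompt can be illustrated with a minimal Python sketch. This is not the researchers' code: the names ReflectionMemory, Attempt, and build_prompt, the prompt wording, and the example actions are all hypothetical, and no real model or Android environment is called.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    """One earlier try at the task and the automated feedback it received."""
    action: str
    feedback: str


@dataclass
class ReflectionMemory:
    """Hypothetical memory that keeps only the most recent attempts."""
    max_items: int = 3
    attempts: list = field(default_factory=list)

    def record(self, action: str, feedback: str) -> None:
        self.attempts.append(Attempt(action, feedback))
        # Drop the oldest entries so the prompt stays short.
        self.attempts = self.attempts[-self.max_items:]


def build_prompt(task: str, memory: ReflectionMemory) -> str:
    """Combine the task with feedback on earlier attempts into one prompt."""
    lines = [f"Task: {task}"]
    if memory.attempts:
        lines.append("Previous attempts and feedback:")
        for i, attempt in enumerate(memory.attempts, start=1):
            lines.append(
                f"{i}. Action: {attempt.action} | Feedback: {attempt.feedback}"
            )
    lines.append("Next action:")
    return "\n".join(lines)


# Illustrative usage with made-up actions and feedback strings.
memory = ReflectionMemory(max_items=2)
memory.record("tap('Settings')", "Wrong app: the task needs the Clock app.")
memory.record("open_app('Clock')", "Clock opened; alarm tab not selected yet.")
prompt = build_prompt("Set an alarm for 7:00", memory)
print(prompt)
```

In a real agent loop, the resulting prompt would be sent to the model, and the environment's response to the chosen action would be recorded as the next piece of feedback.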

This research not only sheds light on the challenges of AI interaction with operating systems but also paves the way for the development of more sophisticated and capable AI assistants.

In November 2023, Microsoft announced that it would hire former OpenAI CEO Sam Altman and president Greg Brockman to lead a new advanced AI research team, although both returned to OpenAI days later.