Publisher: Maaal International Media Company
License: 465734
Microsoft has announced a new AI model, the Large Action Model (LAM), that can operate Windows programs and carry out tasks on its own. The model marks a qualitative leap toward artificial intelligence that actually executes commands.
Unlike traditional language models such as GPT-4o, which are limited to processing and generating text, Microsoft’s LAM can turn user requests into real actions, whether that means running programs or controlling devices. The idea is not new, but LAM is the first model trained specifically to work with Microsoft Office products and other Windows applications.
For example, when a user shops online, a traditional model can provide written instructions for making a purchase, while LAM can complete the purchase itself by navigating the site's interface.
According to Microsoft, developing the model involves four main stages: training it to plan tasks and break them down into logical steps; learning from advanced models such as GPT-4o how to turn plans into actions; self-exploration, in which the model searches for new solutions and overcomes obstacles that other models fail to handle; and reward-based training to improve execution accuracy.
The researchers evaluated LAM in a test environment built around the word processor Word, where it completed tasks with a success rate of 71%, outperforming GPT-4o, which achieved 63% without visual information. LAM was also faster, taking only 30 seconds to complete a task compared with 86 seconds for GPT-4o. However, when GPT-4o was given visual information, its success rate rose to 75.5%.
The Microsoft team relied on thousands of training examples drawn from Microsoft documentation, wikiHow articles, and Bing searches, and then used GPT-4o to develop these tasks into more complex ones.
The researchers see LAM as a major advance in artificial intelligence, noting that it could pave the way for the development of artificial general intelligence (AGI). Instead of systems limited to understanding and producing text, companies may soon offer digital assistants that actually help carry out everyday tasks.