MISTRAL LARGE 2 - LATEST AI MODEL. HUGE COMPETITION for GPT-4o, Claude 3.5 Sonnet, and Llama 3.1
Introduction
The release of Mistral AI's latest model, Mistral Large 2 (ML2), coincides with Meta's unveiling of its massive 405 billion parameter Llama 3.1 model. Both ML2 and Llama 3.1 offer a 128,000 token context window, enhancing their ability to retain and utilize information across extensive text inputs. This capability is crucial for applications requiring long-term memory and complex language understanding, such as chatbots and other conversational AI systems.
Mistral AI has a longstanding track record of providing tools for developers and businesses around the globe, capable of addressing a wide array of linguistic and coding tasks. According to Mistral's internal tests, ML2 holds its own against formidable models like OpenAI's GPT-4o, Claude 3.5 Sonnet from Anthropic, and Meta's Llama 3.1 405B. On the Massive Multitask Language Understanding (MMLU) benchmark, which spans a range of language, coding, and mathematics tests, ML2 achieved a score of 84%. Although this is slightly lower than its competitors (GPT-4o at 88.7%, Claude 3.5 Sonnet at 88.3%, and Llama 3.1 405B at 88.6%), it is noteworthy that human experts typically score around 89.8% on the same test, underscoring ML2's impressive capabilities.
One of the key advantages of ML2 is its efficiency. With 123 billion parameters, it is less than a third the size of Meta's largest model and reportedly about one-fourth the size of GPT-4. This smaller size translates into significant benefits for deployment and commercial applications. At full 16-bit precision, ML2 requires approximately 246 GB of memory for its weights. While this is still substantial, it is manageable on a server with 4 to 8 GPUs without needing quantization, a feat that larger models like GPT-4 or Llama 3.1 405B struggle to achieve.
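The memory figures above follow from simple arithmetic: at 16-bit precision each parameter takes 2 bytes. A minimal sketch, assuming an 80 GB GPU (e.g. an H100) purely for illustration, and ignoring activation and KV-cache overhead, which add real memory cost in practice:

```python
# Rough VRAM estimate for hosting an LLM's weights at a given precision.
# Parameter counts are from the article; the 80 GB per-GPU capacity is an
# illustrative assumption, and activations / KV-cache are not counted.

def weight_memory_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

for name, params in [("Mistral Large 2", 123), ("Llama 3.1 405B", 405)]:
    gb = weight_memory_gb(params)      # 16-bit precision = 2 bytes/param
    gpus = -(-gb // 80)                # ceiling divide: 80 GB GPUs needed
    print(f"{name}: ~{gb:.0f} GB of weights, at least {gpus:.0f} x 80 GB GPUs")
```

For ML2 this yields the article's 246 GB figure and a minimum of four 80 GB GPUs, while the 405B model needs roughly 810 GB of weight memory before any quantization.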
The efficiency of ML2 also means higher throughput, as performance in large language models (LLMs) is often limited by memory bandwidth. In practical terms, this allows ML2 to generate responses faster than its larger counterparts when operating on the same hardware. This speed advantage is crucial for applications requiring real-time interactions, such as customer service bots and interactive learning systems.
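The bandwidth argument can be made concrete: during autoregressive decoding at small batch sizes, every generated token requires streaming essentially all of the weights from memory, so token throughput is roughly aggregate memory bandwidth divided by weight size. A back-of-the-envelope sketch, assuming a hypothetical 8-GPU node at 3.35 TB/s per GPU (an H100-class figure used only for illustration):

```python
# Back-of-the-envelope decode speed for a memory-bandwidth-bound LLM.
# At batch size 1, each output token reads all weights once, so
# tokens/s ~= total bandwidth / weight bytes. The bandwidth figure is an
# illustrative assumption, not a measured number.

def decode_tokens_per_sec(params_billion: float,
                          bytes_per_param: int,
                          bandwidth_tb_s: float,
                          num_gpus: int) -> float:
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return (bandwidth_tb_s * 1e12 * num_gpus) / weight_bytes

# Same hardware, 16-bit weights: the smaller model is proportionally faster.
ml2 = decode_tokens_per_sec(123, 2, 3.35, 8)
llama = decode_tokens_per_sec(405, 2, 3.35, 8)
print(f"ML2 ~{ml2:.0f} tok/s vs Llama 3.1 405B ~{llama:.0f} tok/s")
```

On identical hardware the speedup tracks the parameter ratio (405/123, about 3.3x), which is why a smaller model translates directly into lower latency for real-time use.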
Mistral AI has also focused on addressing common challenges in AI models, particularly the issue of hallucinations, where models generate plausible but incorrect information. ML2 has been fine-tuned to be more cautious and discerning in its responses, better recognizing when it lacks sufficient information to answer a query. This refinement enhances the reliability and trustworthiness of the model, making it more suitable for applications where accuracy is paramount.
Additionally, ML2 excels at following complex instructions, especially in extended conversations. This improvement in prompt-following capabilities increases the model's versatility and user-friendliness across a variety of use cases. For example, in customer support scenarios, the ability to accurately interpret and respond to nuanced queries over long interactions can significantly enhance user satisfaction.
Mistral AI has also optimized ML2 to generate concise responses where appropriate. While verbose outputs can boost benchmark scores, they often result in increased compute time and operational costs. By producing more succinct answers, ML2 can help businesses manage these costs more effectively, making it an attractive option for commercial deployment. This focus on efficiency and cost-effectiveness without sacrificing performance positions ML2 as a practical choice for a wide range of business applications.
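Because API and serving costs scale linearly with output tokens, the savings from concise answers are easy to quantify. A minimal sketch with entirely hypothetical numbers (the request volume, response lengths, and $3 per million output tokens are illustrative assumptions, not Mistral's pricing):

```python
# Illustrative serving-cost comparison for verbose vs. concise responses.
# All figures below are hypothetical; the point is that output cost
# scales linearly with response length.

def monthly_output_cost(requests_per_day: int,
                        tokens_per_response: int,
                        usd_per_million_tokens: float) -> float:
    tokens = requests_per_day * 30 * tokens_per_response
    return tokens / 1e6 * usd_per_million_tokens

verbose = monthly_output_cost(100_000, 400, 3.0)   # long-winded answers
concise = monthly_output_cost(100_000, 150, 3.0)   # trimmed answers
print(f"verbose: ${verbose:,.0f}/mo  concise: ${concise:,.0f}/mo")
```

Under these assumptions, cutting the average response from 400 to 150 tokens cuts the output bill by the same 62.5%, before counting the latency and compute-time benefits.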
In terms of accessibility, ML2 is freely available on popular AI repositories like Hugging Face. However, its licensing terms are more restrictive than some of Mistral's previous offerings: unlike the open-source Apache 2.0 license used for the Mistral NeMo 12B model, ML2 is released under the Mistral Research License, which permits research and non-commercial use, with commercial deployment requiring a separate license from Mistral. Even so, its blend of high performance with modest resource requirements stands out. Its smaller size and efficient design mean that it can be deployed more easily and cost-effectively than larger models, opening up opportunities for a wider range of applications, from small businesses to large enterprises, and from academic research to commercial product development.
Moreover, ML2's strong performance in diverse language and coding tasks, combined with its efficiency and practical features, positions it as a highly attractive option for developers and businesses. Its ability to handle multiple languages and coding frameworks makes it particularly valuable in a globalized and technologically diverse world.