Meta’s latest AI models, Llama 4 Scout and Llama 4 Maverick, are now available in Amazon SageMaker JumpStart, with serverless access coming soon in Amazon Bedrock. Both models handle text and image data, and their design activates only the ‘expert’ components each request needs, which improves computational efficiency.
In collaboration with Meta, AWS has released these Llama 4 models on its platform, making them available through Amazon SageMaker JumpStart. Soon, you’ll be able to access them as serverless options in Amazon Bedrock. The Llama 4 Scout 17B and Llama 4 Maverick 17B models boast advanced multimodal capabilities and substantial context windows, enhancing their performance and efficiency compared to previous versions.
This release broadens AWS’s range of foundation models, making it easier for you to build, deploy, and scale applications. AWS continues to add models from leading AI developers like Meta, backed by secure and scalable infrastructure for enterprise generative AI, and offers more than 135 AI/ML training courses for builders at every expertise level.
The Llama 4 integration underscores AWS’s commitment to model choice. Llama 4 Scout 17B supports a context window of up to 10 million tokens, a significant leap from the 128,000-token limit of earlier Llama models. This enables applications that must reason over large bodies of data at once, such as summarizing extensive document sets or assessing complex codebases.
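To get a feel for what a 10-million-token window means in practice, the sketch below checks whether a corpus fits in a given context window. The 4-characters-per-token heuristic is a rough assumption for English text; a real application should count tokens with the model's own tokenizer.

```python
SCOUT_CONTEXT = 10_000_000   # Llama 4 Scout context window (tokens)
EARLIER_LIMIT = 128_000      # previous 128K-token limit, for comparison

def rough_token_count(text: str) -> int:
    # Very rough heuristic: ~4 characters per token for English text.
    # Use the model's actual tokenizer for real token budgeting.
    return max(1, len(text) // 4)

def fits_in_context(docs: list[str], window: int, reserve: int = 4096) -> bool:
    """Check whether the documents, plus a reserve for the prompt and the
    model's reply, fit inside the given context window."""
    total = sum(rough_token_count(d) for d in docs) + reserve
    return total <= window

corpus = ["x" * 4_000_000]  # roughly one million tokens of source material
print(fits_in_context(corpus, EARLIER_LIMIT))  # False: far beyond 128K
print(fits_in_context(corpus, SCOUT_CONTEXT))  # True: fits easily in 10M
```

A corpus that previously had to be chunked, retrieved, and summarized in stages can now fit in a single request, which is what enables whole-codebase or multi-document analysis.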
Llama 4 Maverick 17B excels at multilingual and multimodal tasks, making it well suited to sophisticated AI applications. Both models are natively multimodal, integrating text and image understanding in a single model rather than handling the two inputs separately as earlier models did. Their mixture of experts (MoE) architecture, a first for Meta, delivers high performance cost-effectively by activating only the components each task requires.
Llama 4 Scout 17B, much like a detail-oriented assistant, can recall vast amounts of information and provide meaningful context, while Llama 4 Maverick 17B, like a creative director, excels at multilingual image and text tasks. Both run 17 billion active parameters per request; Scout pairs that with an industry-leading context window of up to 10 million tokens, while Maverick draws its 400 billion total parameters from 128 experts, engaging only the ones a given task needs.
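The gap between active parameters (17 billion) and total parameters (400 billion) comes from the router deciding, per token, which experts to run. The sketch below is an illustrative top-k gating routine, not Meta's actual routing implementation; the gating function and the choice of `TOP_K` are assumptions made for clarity.

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 128  # Llama 4 Maverick uses 128 experts
TOP_K = 1          # illustrative assumption: one routed expert per token

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(router_logits: list[float]) -> list[tuple[int, float]]:
    """Pick the TOP_K experts with the highest gate values for one token.
    Only those experts run; the rest of the network stays idle, which is
    why active parameters are a small fraction of total parameters."""
    gates = softmax(router_logits)
    ranked = sorted(range(NUM_EXPERTS), key=lambda i: gates[i], reverse=True)
    return [(i, gates[i]) for i in ranked[:TOP_K]]

logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
active = route(logits)
print(f"{len(active)} of {NUM_EXPERTS} experts activated for this token")
```

Because only the routed experts execute, compute per token scales with the active parameter count rather than the total, which is the source of the MoE cost advantage the article describes.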
The MoE architecture works like a team of specialists: a router activates only the experts relevant to each query, much as a clinic directs a patient to the right specialist, which keeps inference efficient and accessible.
Developers can use these capabilities to build applications that handle extensive data and support multilingual, multimodal processing. Looking ahead, AWS plans to keep expanding its AI model offerings, helping you get the most from generative AI.