Meta Launches Llama 4: Now Available on AWS with Major Advances in Multimodal AI

Last update: 07/04/2025
Author: Isaac
  • Meta has launched its new generation of AI models, Llama 4, in collaboration with Amazon Web Services.
  • Llama 4 Scout and Maverick are multimodal models, capable of processing text and images simultaneously.
  • Their Mixture of Experts architecture enables greater computational efficiency and optimized performance.
  • They are available on SageMaker JumpStart and coming soon to Amazon Bedrock.

Llama 4 artificial intelligence model on AWS

Meta has taken an important step in its expansion strategy in Artificial Intelligence with the official deployment of Llama 4, its next generation of large-scale language models (LLM), which are now available on Amazon Web Services (AWS).

These models, including Scout and Maverick, have been designed not only to improve text processing but also to incorporate multimodal capabilities, that is, to process text and images simultaneously with a high level of efficiency.

Scout and Maverick models: design and purpose

The first two models in the series, Llama 4 Scout and Llama 4 Maverick, are already integrated with AWS via SageMaker JumpStart, allowing developers and organizations to begin testing them or integrating them into their solutions directly from the platform.

Scout is characterized by its ability to maintain a context window of up to 10 million tokens, something unprecedented among publicly accessible models. This makes it an ideal choice for analyzing large volumes of information, such as lengthy documents or code files.

For its part, Maverick offers superior performance in logical reasoning, programming, and text comprehension tasks. Although it has 400 billion parameters in total, it activates only 17 billion per token thanks to its efficient architecture, known as "Mixture of Experts" (MoE).

Both models share a distinguishing characteristic: they are natively multimodal. This means they can understand and generate responses that combine text and images coherently, a feature that is especially useful in contexts such as visual analysis, content description, or intelligent assistants that require richer contextual understanding.


Mixture of Experts Architecture: Efficiency as a Priority

One of the most innovative elements of Llama 4's architecture is the use of specialized experts, where different parts of the model are activated depending on the nature of the task to be performed.

This segmentation means only a small portion of the model is activated on each inference, which significantly reduces the consumption of computing resources without harming the results. While Scout has 16 experts, Maverick employs 128; in both cases, however, only a few experts run simultaneously, which further optimizes performance. This structure also allows the model to fit on smaller deployments, such as a single NVIDIA H100 GPU.
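The routing idea can be illustrated with a toy sketch. This is a deliberately simplified, self-contained example of top-k expert routing, not Llama 4's actual implementation: the scalar "experts", the linear gate, and the top-1 selection are all assumptions for demonstration.

```python
import math
import random

def moe_forward(x, experts, gate, top_k=1):
    """Toy Mixture-of-Experts step: run only the top_k highest-scoring experts."""
    scores = [g(x) for g in gate]                    # one gating score per expert
    top = sorted(range(len(experts)), key=lambda i: scores[i])[-top_k:]
    exp_s = [math.exp(scores[i]) for i in top]
    weights = [e / sum(exp_s) for e in exp_s]        # softmax over selected experts
    # Only the selected experts actually run; the rest stay idle --
    # that is the efficiency win the article describes.
    return sum(w * experts[i](x) for w, i in zip(weights, top)), top

random.seed(0)
n_experts = 16                                       # 16 experts, as in Scout
experts = [(lambda a: (lambda v: a * v))(random.uniform(-1, 1))
           for _ in range(n_experts)]                # each toy "expert" scales its input
gate = [(lambda b: (lambda v: b * v))(random.uniform(-1, 1))
        for _ in range(n_experts)]                   # toy linear gate (assumption)
y, active = moe_forward(0.5, experts, gate, top_k=1)
print(f"{len(active)} of {n_experts} experts ran")   # 1 of 16 experts ran
```

The key design point is that the cost of a forward pass scales with the number of *active* experts, not the total parameter count, which is how Maverick can hold 400 billion parameters while activating only 17 billion at a time.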

Thanks to this efficiency, Llama 4 presents itself as a viable option not only for large companies with advanced infrastructure, but also for developers with more limited resources looking to incorporate quality AI into their projects.

Availability on AWS: SageMaker and Amazon Bedrock

Llama 4 is now available through SageMaker JumpStart, an AWS tool designed to make it easier to test and deploy pre-trained models without having to build complex environments from scratch.
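A minimal sketch of what a JumpStart deployment can look like with the SageMaker Python SDK. This assumes AWS credentials and SageMaker permissions are configured; the model ID and instance type below are placeholders, not verified identifiers, so check the JumpStart catalog for the actual Llama 4 entries.

```python
# Sketch only: requires AWS credentials, SageMaker permissions, and the
# `sagemaker` Python SDK. The model ID below is a placeholder, not verified.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-llama-4-scout-placeholder")  # hypothetical ID
predictor = model.deploy(instance_type="ml.p5.48xlarge")           # example instance

response = predictor.predict({
    "inputs": "Summarize the advantages of a Mixture of Experts architecture.",
    "parameters": {"max_new_tokens": 256},
})
print(response)

predictor.delete_endpoint()  # clean up to avoid idle-endpoint charges
</imports>
```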

Additionally, it has been announced that in the coming weeks, Llama 4 will be available as a serverless model on Amazon Bedrock, which will allow its use on demand, without the need to manage or scale servers manually.

Amazon Bedrock has positioned itself as an ideal option for integrating generative AI into applications without worrying about the underlying infrastructure. With this addition, Meta and AWS strengthen their collaboration to bring advanced solutions to more users in a flexible and secure way.
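Once the announced serverless availability lands, invoking the model through Bedrock could look like the following sketch using boto3's Bedrock Runtime `converse` API. The model ID is a placeholder (the real Llama 4 identifiers were not yet published at the time of writing), and AWS credentials with Bedrock access are assumed.

```python
# Sketch only: Bedrock serverless access for Llama 4 was announced as upcoming.
# Requires AWS credentials and boto3; the modelId below is a placeholder.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
response = client.converse(
    modelId="meta.llama4-maverick-placeholder",   # hypothetical ID
    messages=[{"role": "user",
               "content": [{"text": "Explain Mixture of Experts in one sentence."}]}],
    inferenceConfig={"maxTokens": 128},
)
print(response["output"]["message"]["content"][0]["text"])
```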

Integration, capabilities and limitations

In addition to being available on AWS, Llama 4 models can also be downloaded from platforms such as Hugging Face or from the official Meta website, allowing developers to adapt them to their own infrastructure.
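For the self-hosted route, a download from Hugging Face can be sketched as follows. This assumes the `huggingface_hub` package and an access token approved for Meta's gated Llama repositories; the repo ID is a placeholder, not the verified Llama 4 repository name.

```python
# Sketch only: requires `huggingface_hub` and a Hugging Face access token
# that has been granted access to the gated Llama repos.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Llama-4-Scout-placeholder",  # hypothetical repo ID
    token="hf_...",                                   # your access token
)
print(f"Model weights downloaded to {local_dir}")
```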

In more consumer-facing deployments, such as the Meta AI assistant integrated into WhatsApp, Instagram, or Messenger, Scout and Maverick are already serving end users. However, in some regions, such as the European Union, regulatory restrictions are limiting their full deployment.


Depending on the environment, the models are offered in quantized versions optimized for different types of hardware. This allows them to run both in powerful data centers and on more conventional devices.

In terms of security, Llama 4 includes tools such as Prompt Guard and Llama Guard, developed by Meta to prevent inappropriate responses and strengthen protection against malicious use.

Upcoming developments: Behemoth and LlamaCon

Meta has confirmed the existence of two other models in development: Llama 4 Reasoning and Llama 4 Behemoth. The latter, with nearly two trillion parameters, has served as a teacher model for training Scout and Maverick through co-distillation techniques.

Behemoth has shown outstanding performance on benchmarks such as GPQA Diamond and MATH-500, even surpassing advanced models such as GPT-4.5 and Claude 3.7 Sonnet.

The LlamaCon event, scheduled for April 29, will serve as a platform to announce additional developments in the Llama 4 family roadmap, including potential new versions and collaboration opportunities within the open source ecosystem.

Meta has also strengthened its open-source focus, allowing the technical community to access, test, and contribute to the advancement of these models, in collaboration with NVIDIA, AMD, AWS, Microsoft Azure, and other key industry players.

With this strategy, Llama 4 not only seeks to position itself as an advanced technical solution, but also as an open, flexible, and secure platform for developing AI in multiple contexts of use.

With integration into AWS, the inclusion of multimodal capabilities, an efficient architecture, and the support of a growing community, Llama 4 is emerging as one of the most comprehensive and accessible offerings in the current artificial intelligence landscape. The collaboration between Meta and Amazon marks a milestone that promises to simplify access to advanced models without the need for complex infrastructure or disproportionate investments.