Articles
The article addresses a common obstacle to data access: non-technical business users lack SQL proficiency, which delays insights. It proposes a solution that uses Amazon Bedrock Agents, powered by Amazon Nova Lite, to create a natural language interface for Amazon Athena queries. The solution's architecture integrates several AWS services, including Amazon Cognito for secure authentication, AWS Amplify for the frontend, AWS Lambda for action groups, and AWS Glue for data cataloging. The article details a step-by-step deployment process using AWS CloudFormation templates and provides a practical example using AWS Cost and Usage Reports (CUR). Users ask questions in plain English; the agent translates them into precise SQL, Athena executes the queries, and the results are presented in a user-friendly format. The article also outlines how to adapt the solution to other Amazon S3-backed Athena databases, emphasizing its versatility in enabling self-service analytics and broader data accessibility.
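At the heart of such a design is a Lambda action group that runs the agent-generated SQL through Athena and polls for results. The following is a minimal sketch of that pattern; the database name, output location, and function name are illustrative assumptions, not values from the article.

```python
import time
import boto3

athena = boto3.client("athena")

def run_athena_query(sql: str) -> list:
    """Run a SQL statement in Athena and return the result rows.

    The database and output location are placeholders; the article's
    CloudFormation templates wire in the real CUR database and bucket.
    """
    execution = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "cur_database"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    results = athena.get_query_results(QueryExecutionId=query_id)
    return results["ResultSet"]["Rows"]
```

The agent then formats these rows into natural language for the user, completing the plain-English-to-SQL round trip.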
This article introduces Amazon Bedrock AgentCore Identity, a new service specifically designed to manage identity and access for AI agents. It highlights the unique security challenges posed by agentic AI systems, such as inbound and outbound authentication, enterprise integration, and compliance requirements, which traditional security models struggle with. AgentCore Identity addresses these by providing a centralized agent identity directory, an agent authorizer for inbound requests, a resource credential provider, and a secure token vault for outbound access. The service supports a dual authentication model (inbound/outbound), integrates with existing identity providers (e.g., Amazon Cognito, Okta), and offers seamless SDK integration. A practical example demonstrates how to deploy a secure developer productivity agent using AgentCore Identity, showcasing its ability to handle complex OAuth flows and secure token management automatically. The article concludes with essential security best practices for implementing the service.
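AWS's published samples express the outbound half of this model as a decorator that fetches a token from the vault before a tool runs. The sketch below follows that pattern; the module path, provider name, scopes, and auth flow value are assumptions based on those samples rather than a definitive API reference.

```python
from bedrock_agentcore.identity.auth import requires_access_token

# provider_name must match a resource credential provider registered
# with AgentCore Identity (e.g., backed by Okta or Amazon Cognito);
# the name and scope here are illustrative assumptions.
@requires_access_token(
    provider_name="github-provider",
    scopes=["repo"],
    auth_flow="USER_FEDERATION",  # 3-legged OAuth on the user's behalf
)
async def list_pull_requests(*, access_token: str) -> dict:
    # AgentCore Identity injects a vaulted token at call time, so the
    # agent code never touches client secrets or refresh tokens.
    return {"Authorization": f"Bearer {access_token}"}
```

On first invocation the service drives the OAuth consent flow; afterward, the token vault serves cached credentials transparently.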
The article introduces Amazon Bedrock AgentCore Runtime, a new serverless hosting solution designed to move AI agent prototypes from 'proof of concept purgatory' to production. It tackles common enterprise challenges, including the need for framework/model agnosticism, complex security due to agents' stochastic nature, identity and access control, handling large multimodal payloads, unpredictable compute resource needs, and infrastructure management overhead. AgentCore Runtime provides a secure environment with features like microVM-based session isolation, ensuring complete compartmentalization of agent state and credentials. It simplifies deployment with minimal code changes and supports embedded identity using IAM SigV4 and OAuth for secure access to external services. Furthermore, it manages state persistence through ephemeral session state and integrates with AgentCore Memory for durable long-term context. The service aims to free developers from infrastructure complexities, allowing them to focus on building intelligent agent functionalities.
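The "minimal code changes" claim rests on a thin wrapper interface: an existing agent function is exposed through an entrypoint decorator instead of being rewritten for the platform. A sketch, assuming the bedrock_agentcore SDK's BedrockAgentCoreApp interface matches AWS's published examples:

```python
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke(payload: dict) -> dict:
    """Handle one request inside an isolated microVM-backed session.

    Any framework or model can run here; the Runtime only cares about
    the payload coming in and the response going out.
    """
    prompt = payload.get("prompt", "")
    # ... call the existing agent logic here ...
    return {"result": f"echo: {prompt}"}

if __name__ == "__main__":
    app.run()  # serves the Runtime's expected HTTP contract
```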
The article addresses the significant challenges of legacy system modernization, such as time-consuming architecture reviews and complex migrations, which can lead to lost market opportunities and increased operational costs. It introduces an innovative solution that leverages Amazon Q Developer, Amazon Bedrock Data Automation (BDA), and Anthropic's Model Context Protocol (MCP). This integration allows developers to transform initial ideas from whiteboard sketches and team discussions into fully deployed, secure, and scalable AWS CloudFormation templates in minutes. The piece elaborates on MCP as an open standard for secure, two-way connections between AI models and various data sources, and explains how BDA complements it by automating the extraction, transformation, and loading of unstructured multimodal enterprise data (e.g., from meeting recordings, diagrams) into AI workflows. A practical, step-by-step guide is provided for setting up and utilizing the Bedrock Data Automation MCP server with Amazon Q CLI, demonstrating its application in generating and provisioning cloud infrastructure.
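Behind the Q CLI integration, BDA extraction jobs of this kind are started asynchronously against files in Amazon S3. The sketch below uses the boto3 bedrock-data-automation-runtime client; the ARNs and S3 URIs are placeholders, and the exact parameter set is an assumption to verify against the current API.

```python
import boto3

bda = boto3.client("bedrock-data-automation-runtime")

# Start asynchronous extraction of, e.g., a whiteboard architecture
# sketch; all ARNs and S3 locations below are placeholders.
response = bda.invoke_data_automation_async(
    inputConfiguration={"s3Uri": "s3://my-bucket/whiteboard-sketch.png"},
    outputConfiguration={"s3Uri": "s3://my-bucket/bda-output/"},
    dataAutomationConfiguration={
        "dataAutomationProjectArn": (
            "arn:aws:bedrock:us-east-1:123456789012:"
            "data-automation-project/my-project"
        ),
        "stage": "LIVE",
    },
    dataAutomationProfileArn=(
        "arn:aws:bedrock:us-east-1:123456789012:"
        "data-automation-profile/us.data-automation-v1"
    ),
)

# The invocation ARN is used to poll get_data_automation_status for
# the structured output that feeds the downstream AI workflow.
print(response["invocationArn"])
```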
This article announces Amazon SageMaker HyperPod support for P6e-GB200 UltraServers, powered by NVIDIA GB200 NVL72, which enable developing and deploying AI models at trillion-parameter scale on HyperPod's resilient, scalable infrastructure. It details the UltraServer's technical specifications: 36 NVIDIA Grace CPUs and 72 Blackwell GPUs in a single NVLink domain, delivering 360 petaflops of FP8 compute and 1.4 exaflops of FP4 compute. The post highlights performance benefits such as industry-leading GPU power, high-performance networking (130 TBps NVLink bandwidth, 28.8 Tbps EFAv4), and substantial local NVMe SSD storage (405 TB) with Amazon FSx for Lustre integration. Key use cases include efficient training of trillion-parameter models and 30x faster real-time inference for frontier LLMs. The article also explains how to acquire UltraServer capacity through flexible training plans and integrate it into SageMaker HyperPod clusters, emphasizing topology-aware scheduling for optimized performance. The overall aim is to accelerate the AI lifecycle while reducing operational complexity and cost.
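Capacity purchased through a training plan is attached when the cluster's instance group is created. A sketch with the boto3 SageMaker client follows; the instance type string, instance count, ARNs, and lifecycle script locations are placeholder assumptions.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Create a HyperPod cluster backed by UltraServer capacity from a
# training plan; every name, count, and ARN below is a placeholder.
sagemaker.create_cluster(
    ClusterName="frontier-training",
    InstanceGroups=[{
        "InstanceGroupName": "gb200-ultraservers",
        "InstanceType": "ml.p6e-gb200.36xlarge",
        "InstanceCount": 18,
        "ExecutionRole": "arn:aws:iam::123456789012:role/HyperPodExecRole",
        "LifeCycleConfig": {
            "SourceS3Uri": "s3://my-bucket/lifecycle/",
            "OnCreate": "on_create.sh",
        },
        "TrainingPlanArn": (
            "arn:aws:sagemaker:us-east-1:123456789012:"
            "training-plan/my-ultraserver-plan"
        ),
    }],
)
```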
This article details how Lexbe, a leader in legal document review software, leveraged Amazon Bedrock and other AWS services to overcome the challenges of managing and analyzing vast legal datasets. By integrating Amazon Bedrock Knowledge Bases with an AI-powered Q&A assistant (Lexbe Pilot), legal teams can instantly query entire case document sets and extract grounded insights, making it far easier to identify critical 'smoking gun' documents than with traditional keyword search. The architecture, built on AWS services such as Amazon OpenSearch Service, AWS Fargate, and Amazon S3, ensures scalability and high performance. The article highlights an eight-month collaboration between Lexbe and Amazon that iteratively raised the system's recall rate to 90% through parameter tuning and the introduction of reranker technology. The integration enables broad, human-style reporting and deep, automated inference, significantly streamlining eDiscovery while maintaining cost efficiency and robust security.
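The Q&A pattern described here corresponds to the Bedrock Knowledge Bases RetrieveAndGenerate API, which returns an answer grounded in retrieved passages along with their citations. A minimal sketch; the knowledge base ID and model ARN are placeholders, not Lexbe's actual configuration.

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

response = agent_runtime.retrieve_and_generate(
    input={"text": "Which emails discuss the disputed contract termination?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KBID123456",  # placeholder
            "modelArn": (
                "arn:aws:bedrock:us-east-1::foundation-model/"
                "anthropic.claude-3-5-sonnet-20240620-v1:0"
            ),
        },
    },
)

# Grounded answer plus the source chunks that support it.
print(response["output"]["text"])
for citation in response["citations"]:
    for ref in citation["retrievedReferences"]:
        print(ref["location"])
```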
This article, the second part of a series, focuses on the technical implementation of an enterprise-grade AIOps architecture using Amazon SageMaker Unified Studio. It outlines a comprehensive workflow from project initialization to production deployment, addressing the distinct needs of administrators, data scientists, and ML engineers. The solution uses Amazon EventBridge, AWS Lambda, and AWS Step Functions together with GitHub Actions and the AWS CDK to automate project setup, model development, and deployment. Key architectural components include project-specific model-build and model-deploy repositories, a shared services layer for CI/CD and event-driven automation, and a development environment centered on SageMaker Unified Studio. Security and governance are enforced through IAM, version control, and automated checks, yielding a scalable, secure, and efficient foundation for ML model development and deployment.
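As an illustration of the event-driven layer, the sketch below wires an EventBridge rule to a Lambda target in AWS CDK (Python); the event pattern and function are simplified assumptions rather than the article's exact implementation.

```python
from aws_cdk import (
    Stack,
    aws_events as events,
    aws_events_targets as targets,
    aws_lambda as lambda_,
)
from constructs import Construct

class ProjectAutomationStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Lambda that bootstraps repositories and CI/CD for a new project.
        setup_fn = lambda_.Function(
            self,
            "ProjectSetup",
            runtime=lambda_.Runtime.PYTHON_3_12,
            handler="index.handler",
            code=lambda_.Code.from_asset("lambda/project_setup"),
        )

        # React to project-creation events; this event pattern is a
        # simplified assumption for illustration.
        rule = events.Rule(
            self,
            "OnProjectCreated",
            event_pattern=events.EventPattern(
                source=["aws.sagemaker"],
                detail_type=["SageMaker Project Creation"],
            ),
        )
        rule.add_target(targets.LambdaFunction(setup_fn))
```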
The article introduces Amazon Bedrock AgentCore Memory, a fully managed service designed to address the inherent statelessness of large language models (LLMs) and the resulting challenges developers face in building context-aware AI agents. It details how developers traditionally struggle with context window constraints, complex state management, and the need for intelligent memory recall. AgentCore Memory solves these issues by abstracting underlying infrastructure, offering secure storage, maintaining conversational continuity, and providing hierarchical data organization through namespaces. The service distinguishes between short-term memory (raw interaction events) and long-term memory (extracted insights via intelligent strategies like semantic, summary, and user preferences). Furthermore, it includes advanced features such as branching for managing alternative conversation paths and checkpointing for resuming multi-session tasks. By integrating seamlessly with other Bedrock AgentCore services, it enables developers to create personalized, continuous AI agent experiences without the overhead of complex memory infrastructure management.
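AWS samples for AgentCore Memory surface these concepts through a MemoryClient that records raw short-term events and retrieves long-term records by namespace. The method names and namespace format below follow those samples and should be treated as assumptions.

```python
from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-east-1")

# Short-term memory: persist a raw conversational turn as an event.
client.create_event(
    memory_id="memory-abc123",  # placeholder
    actor_id="user-42",
    session_id="session-1",
    messages=[("I prefer window seats on morning flights.", "USER")],
)

# Long-term memory: query insights that a configured strategy (e.g.,
# user preferences) has extracted into the actor's namespace.
records = client.retrieve_memories(
    memory_id="memory-abc123",
    namespace="/users/user-42/preferences",
    query="seating preferences",
)
```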
Large language models (LLMs) often suffer from "hallucinations," which undermine user trust. This article addresses the problem by advocating for source citations, which improve factual accuracy, build transparency, support ethical practices, and enhance usability. It introduces Amazon Nova, a new generation of foundation models available on Amazon Bedrock, designed for frontier intelligence and price performance. The core of the article demonstrates how to engineer prompts for Amazon Nova understanding models (such as Nova Pro) so they generate responses with embedded citations drawn from a provided context, using Amazon shareholder letters as an example. It then details a scalable method for evaluating the accuracy and faithfulness of these citations with the "LLM-as-a-judge" technique in Amazon Bedrock model evaluations, using models such as Anthropic's Claude 3.5 Sonnet v1 as the judge. The evaluation shows high scores for Nova Pro's responses across the metrics, indicating its capability to produce coherent, helpful, and accurately cited content. The post concludes that prompt-engineered citations offer a practical path to more reliable AI interactions.
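A condensed version of that prompt pattern, expressed with the Bedrock Converse API: the model ID is Nova Pro's cross-Region inference profile, and the context snippet and bracket citation format are illustrative assumptions, not the article's exact prompt.

```python
import boto3

bedrock = boto3.client("bedrock-runtime")

# Context passages tagged with source IDs the model can cite.
context = (
    '[doc1] 1997 shareholder letter: "It\'s all about the long term..."\n'
    '[doc2] 2020 shareholder letter: "...create more than you consume."'
)

response = bedrock.converse(
    modelId="us.amazon.nova-pro-v1:0",
    system=[{
        "text": (
            "Answer only from the provided context. After each claim, "
            "cite the supporting source in brackets, e.g., [doc1]. If "
            "the context does not contain the answer, say so."
        )
    }],
    messages=[{
        "role": "user",
        "content": [{
            "text": f"Context:\n{context}\n\nQuestion: What does Amazon "
                    "say about long-term thinking?"
        }],
    }],
)
print(response["output"]["message"]["content"][0]["text"])
```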
The article introduces Intelligent Document Processing (IDP) as a technology for automating information extraction from documents, now significantly enhanced by generative AI for advanced understanding and classification. It highlights the new Amazon Bedrock Data Automation service, which streamlines IDP solutions with confidence scores and bounding box data for explainability, pre-built blueprints for rapid development and customization, automatic document classification, and data normalization, transformation, and validation capabilities. The authors present a fully serverless architecture that pairs Amazon Bedrock Data Automation with AWS Step Functions and Amazon Augmented AI (A2I) for cost-effective, scalable document processing with human-in-the-loop validation. Practical examples, such as processing child support enrollment forms, illustrate how these features ensure data accuracy, compliance, and seamless integration with existing systems. The solution supports diverse document types and includes a GitHub repository for deployment, reducing development time and improving data quality.
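The human-in-the-loop step typically keys off those confidence scores: extractions below a threshold are routed to an Amazon A2I human loop. A sketch using the boto3 sagemaker-a2i-runtime client; the threshold, flow definition ARN, and the simplified result shape are assumptions.

```python
import json
import uuid
import boto3

a2i = boto3.client("sagemaker-a2i-runtime")

CONFIDENCE_THRESHOLD = 0.85  # tune per field and compliance requirements

def review_if_uncertain(extracted_fields: dict) -> None:
    """Route low-confidence extractions to human review via A2I.

    extracted_fields maps field names to {"value", "confidence"} dicts,
    an assumed simplification of the Bedrock Data Automation output.
    """
    low_confidence = {
        name: field
        for name, field in extracted_fields.items()
        if field["confidence"] < CONFIDENCE_THRESHOLD
    }
    if low_confidence:
        a2i.start_human_loop(
            HumanLoopName=f"idp-review-{uuid.uuid4().hex[:8]}",
            FlowDefinitionArn=(
                "arn:aws:sagemaker:us-east-1:123456789012:"
                "flow-definition/idp-review"
            ),
            HumanLoopInput={"InputContent": json.dumps(low_confidence)},
        )
```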