Project 05: Agentic RAG Assistant on Azure
Purpose
Build an agentic retrieval-augmented assistant so you practice AI application delivery, retrieval, safety controls, and platform operations on Azure.
Scenario
Assume an internal team wants an assistant that can answer questions from approved content, possibly use tools, and remain safe enough for real organizational use. A model endpoint alone is not enough. The system needs retrieval, runtime identity, safety controls, secret management, telemetry, and a controlled application surface.
This project is useful because it forces you to treat AI as a governed Azure workload rather than a standalone model demo.
Architecture
User request
-> Azure Container Apps or API Management
-> Azure OpenAI and Foundry Agent Service
-> Azure AI Search
-> Azure AI Content Safety
-> Application Insights, Key Vault, and managed identities
What You Will Build
- A user-facing application endpoint.
- A retrieval-backed assistant over approved content.
- Safety, identity, and telemetry controls around the AI path.
- A clear explanation of how retrieval, safety, and runtime identity fit together.
Why This Architecture Works
Azure OpenAI provides model access, Foundry Agent Service supports orchestration, Azure AI Search gives retrieval over approved content, and Content Safety provides a managed moderation layer. Container Apps or API Management defines the application boundary, while Key Vault, managed identities, and Application Insights complete the security and observability model.
Services Used
- Azure Container Apps
- Azure API Management
- Azure OpenAI
- Foundry Agent Service
- Azure AI Search
- Azure AI Content Safety
- Azure Key Vault
- Managed Identities
- Application Insights
- Microsoft Foundry
Skills Practiced
- AI application integration
- Retrieval design
- Agent workflow planning
- AI safety and observability
- Explaining AI as a governed Azure workload
Implementation Steps
- Define the assistant's scope, user type, source content, and evaluation criteria.
- Choose the user-facing runtime and API boundary for the application.
- Configure model access and retrieval over the approved content corpus.
- Add agent behavior only where it adds clear value over a simpler workflow.
- Apply managed identities, Key Vault, Content Safety, and telemetry before calling the system ready.
- Document how prompts, retrieval, safety controls, and operational ownership fit together.
Security and Operations Considerations
Review prompt injection, document governance, runtime identities, secret storage, and content safety checks as part of the initial implementation. The workload becomes credible when you can explain what happens if retrieval is poor, the model fails, or safety checks block or flag a result.
Cost Considerations
Model usage, search, safety checks, and container runtime costs can increase quickly, so keep testing scope controlled and monitor usage patterns early.
How to Extend This Project
- Add user authentication and role-based access.
- Add evaluation and feedback capture.
- Add CI/CD and usage dashboards.
- Add more explicit tool-calling or workflow automation with narrow permissions.
Portfolio Value
This project shows that you can frame AI work as a cloud engineering system with security, identity, observability, and deployment concerns rather than as a simple prompt demo.