Architecture Overview
The Retail AI system uses an agent-based architecture: a router agent inspects each query and sends it to the specialized agent best suited to the request. This enables domain-specific handling behind a single, unified interface.
Core Components
Message Routing and Processing
- Message Validation: Validates incoming requests against required configuration parameters
- Router Agent: Analyzes user queries and routes them to the appropriate specialized agent
- Factuality Check: Checks responses for factual accuracy and refines them iteratively when they fall short
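The validation step above can be illustrated with a minimal, dependency-free sketch. The required keys (`user_id`, `store_num`, `thread_id`) and the `configurable` wrapper are assumptions made for illustration; the source does not list the actual required parameters, and the project may well use Pydantic for this instead.

```python
from dataclasses import dataclass, field

# Hypothetical parameter names: the source does not say which keys are required.
REQUIRED_CONFIG_KEYS = ("user_id", "store_num", "thread_id")


@dataclass
class ValidationResult:
    ok: bool
    missing: list[str] = field(default_factory=list)


def validate_message(request: dict) -> ValidationResult:
    """Check that a request carries every required configuration parameter."""
    # Assumed layout: parameters live under a "configurable" key.
    config = request.get("configurable", {})
    # A key counts as missing when it is absent, None, or an empty string.
    missing = [key for key in REQUIRED_CONFIG_KEYS if config.get(key) in (None, "")]
    return ValidationResult(ok=not missing, missing=missing)
```

A request that fails validation can be rejected before it reaches the router, with `missing` reported back to the caller.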
Specialized Agents
The system includes seven specialized agents, each optimized for specific retail operations:
- General Agent: Handles general inquiries about store policies and basic information
- Product Agent: Provides detailed product specifications, availability, and compatibility
- Inventory Agent: Offers real-time inventory checks and stock availability across locations
- Recommendation Agent: Suggests products based on user preferences and purchase history
- Orders Agent: Manages order status inquiries, tracking, and order history
- Comparison Agent: Compares different products to help customers make informed decisions
- DIY Agent: Offers project advice, tutorials, and step-by-step instructions for DIY projects
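To make the routing idea concrete, here is a small sketch that maps a query to one of the seven agents listed above. It uses keyword matching purely as a stand-in: the actual Router Agent presumably relies on an LLM, and the keywords and the `route` function are illustrative assumptions rather than the project's code.

```python
# Keyword rules stand in for the LLM-based router; the keywords are illustrative guesses.
# Order matters: the first agent with a matching keyword wins.
ROUTES = {
    "inventory": ("in stock", "stock", "inventory"),
    "orders": ("order", "tracking", "delivery"),
    "comparison": ("compare", "versus", " vs "),
    "recommendation": ("recommend", "suggest"),
    "diy": ("how do i", "how to", "tutorial", "project"),
    "product": ("spec", "compatible", "dimensions"),
}
DEFAULT_ROUTE = "general"  # fallback when nothing matches


def route(query: str) -> str:
    """Return the name of the agent that should handle the query."""
    text = f" {query.lower()} "  # pad so that " vs " can match at the edges
    for agent, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return agent
    return DEFAULT_ROUTE


if __name__ == "__main__":
    for q in ("Is the cordless drill in stock?", "What are your opening hours?"):
        print(q, "->", route(q))
```

Falling back to the General Agent keeps the interface unified: every query gets an owner even when no specialized agent claims it.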
Guardrails and Quality Control
- Factuality Judge: Evaluates responses for factual accuracy and triggers refinement when needed
- Configuration Validation: Ensures all required parameters are provided before processing
- Retry Mechanism: Regenerates a response when it does not meet the quality threshold
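The interplay between the Factuality Judge and the retry mechanism can be sketched as a bounded loop. Everything here is an assumption for illustration: the `generate` and `judge` callables, the 0.8 threshold, and the limit of three attempts are hypothetical, since the source does not specify how scores or retry limits are configured.

```python
from typing import Callable


def answer_with_retries(
    generate: Callable[[str, str], str],   # (query, feedback) -> response
    judge: Callable[[str, str], float],    # (query, response) -> score in [0, 1]
    query: str,
    threshold: float = 0.8,                # hypothetical quality threshold
    max_attempts: int = 3,                 # hypothetical retry limit
) -> tuple[str, int]:
    """Regenerate a response until the judge score reaches the threshold.

    Returns the last response and the number of attempts used. If no attempt
    passes, the final response is returned anyway after max_attempts.
    """
    feedback = ""
    response = ""
    for attempt in range(1, max_attempts + 1):
        response = generate(query, feedback)
        score = judge(query, response)
        if score >= threshold:
            return response, attempt
        # Pass the judge's verdict back so the next attempt can be refined.
        feedback = f"Previous answer scored {score:.2f}; revise unsupported claims."
    return response, max_attempts
```

Capping the number of attempts matters in practice: without it, a query the judge can never be satisfied with would loop indefinitely.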
Technical Implementation
The system is implemented using:
- LangGraph: For workflow orchestration and state management
- LangChain: For LLM interactions and chain composition
- MLflow: For model deployment and serving
- Databricks LLM APIs: As the foundation models for natural language processing
Architecture Flow
The architecture follows a graph-based state machine pattern:
- User Input: Messages enter through validation
- Routing: Messages are routed by the router agent
- Processing: Specialized agents process domain-specific requests
- Quality Check: Responses undergo factuality checking
- Refinement: If needed, responses are refined until they meet quality thresholds
graph TD
    A[User Input] --> B[Message Validation]
    B --> C[Router Agent]
    C --> D[General Agent]
    C --> E[Product Agent]
    C --> F[Inventory Agent]
    C --> G[Recommendation Agent]
    C --> H[Orders Agent]
    C --> I[Comparison Agent]
    C --> J[DIY Agent]
    D --> K[Factuality Judge]
    E --> K
    F --> K
    G --> K
    H --> K
    I --> K
    J --> K
    K --> L{Quality Check}
    L -->|Pass| M[Response]
    L -->|Fail| N[Refinement]
    N --> K
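The flow in the diagram can be approximated with a tiny state machine in plain Python. This is a sketch of the pattern only, not the project's LangGraph definition: the node functions, the `run_graph` helper, and the state keys are hypothetical. Only two of the seven agents are included for brevity, and the Factuality Judge and Quality Check are merged into one node.

```python
from typing import Callable

Node = Callable[[dict], str]  # a node mutates the shared state and names the next node


def run_graph(state: dict, nodes: dict[str, Node], start: str, max_steps: int = 20) -> dict:
    """Run nodes from `start` until one returns "END", recording the path taken."""
    current, steps = start, 0
    while current != "END":
        if steps >= max_steps:
            raise RuntimeError("graph did not terminate")
        state.setdefault("trace", []).append(current)
        current = nodes[current](state)
        steps += 1
    return state


def validation(state: dict) -> str:
    return "router"


def router(state: dict) -> str:
    # Stand-in for the Router Agent's decision.
    return "inventory_agent" if "stock" in state["query"].lower() else "general_agent"


def general_agent(state: dict) -> str:
    state["response"] = "general answer"
    return "factuality_judge"


def inventory_agent(state: dict) -> str:
    state["response"] = "inventory answer"
    return "factuality_judge"


def factuality_judge(state: dict) -> str:
    # Toy rule: a response passes only after it has been refined once.
    return "END" if state.get("refined") else "refinement"


def refinement(state: dict) -> str:
    state["response"] += " (refined)"
    state["refined"] = True
    return "factuality_judge"


NODES: dict[str, Node] = {
    "validation": validation,
    "router": router,
    "general_agent": general_agent,
    "inventory_agent": inventory_agent,
    "factuality_judge": factuality_judge,
    "refinement": refinement,
}

if __name__ == "__main__":
    final = run_graph({"query": "Is the drill in stock?"}, NODES, "validation")
    print(final["trace"], final["response"])
```

Each node returns the name of its successor, which plays the role of the conditional edges in the diagram; the refinement node loops back to the judge just as `N --> K` does.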
Project Structure
retail_ai/
├── agents.py          # Agent implementations
├── catalog.py         # Unity Catalog integration
├── graph.py           # LangGraph workflow definition
├── models.py          # MLflow model integration
├── nodes.py           # Agent node definitions
├── tools.py           # Tool definitions
└── vector_search.py   # Vector search utilities
notebooks/
├── 05_agent_as_code_driver.py   # Model logging & registration
├── 06_evaluate_agent.py         # Model evaluation
└── 07_deploy_agent.py           # Model deployment & permissions
streamlit_store_app/   # Store management interface
├── app.py             # Main Streamlit application
├── components/        # Reusable UI components
├── pages/             # Application pages
└── utils/             # Utility functions
Development Workflow
The development workflow is organized into focused notebooks:
- Data Setup: 01_ingest-and-transform.py, 02_provision-vector-search.py
- Model Development: 05_agent_as_code_driver.py - Model development, logging, and registration
- Evaluation: 06_evaluate_agent.py - Formal MLflow evaluation and performance metrics
- Deployment: 07_deploy_agent.py - Model alias management, endpoint deployment, and permissions
Technology Stack
Core Technologies
- Python 3.12+: Primary development language
- LangGraph: Workflow orchestration and state management
- LangChain: LLM interactions and tool composition
- MLflow: Model lifecycle management and serving
- Pydantic: Data validation and serialization
Databricks Platform
- Unity Catalog: Data governance and function management
- Vector Search: Semantic search capabilities
- Model Serving: LLM endpoint hosting
- Genie: Natural language to SQL conversion
- SQL Warehouse: Query execution engine
Frontend & Interface
- Streamlit: Store management interface
- REST APIs: Model serving endpoints
- WebSocket: Real-time chat functionality
Security & Governance
Data Security
- Unity Catalog: Centralized data governance
- Row-level Security: Fine-grained access control
- Audit Logging: Complete activity tracking
- Encryption: Data at rest and in transit
Model Governance
- MLflow Model Registry: Version control and lineage
- Model Validation: Automated quality checks
- A/B Testing: Safe model deployment
- Performance Monitoring: Real-time metrics
Monitoring & Observability
Application Monitoring
- MLflow Tracing: End-to-end request tracking
- Custom Metrics: Business-specific KPIs
- Error Tracking: Comprehensive error logging
- Performance Metrics: Latency and throughput monitoring
Data Quality
- Data Validation: Input/output schema validation
- Drift Detection: Model performance monitoring
- Quality Metrics: Accuracy and relevance scoring
- Alerting: Automated issue detection
This architecture provides a robust, scalable foundation for retail AI operations while maintaining flexibility for future enhancements and integrations.