Alin Balan

Building AI Agents: A Practical Guide for Developers

January 6, 2025 Technology Learning

Building an AI agent might seem like a complex undertaking, but with the right approach and modern tools, it’s more accessible than ever. In this guide, we’ll walk through the process of creating a functional AI agent from scratch, covering everything from architecture decisions to implementation details.

Understanding Agent Architecture

Before diving into code, it’s important to understand the core components of an AI agent:

1. The Brain: Language Model

The language model serves as the agent’s reasoning engine. It processes instructions, understands context, and decides what actions to take. Popular choices include:

  • OpenAI GPT-4: Powerful and widely available
  • Anthropic Claude: Excellent reasoning capabilities
  • Open Source Models: Llama, Mistral, or local models for privacy

2. The Tools: Function Calling

Agents need to interact with the world through tools. These can be:

  • APIs: REST, GraphQL, or gRPC endpoints
  • File Systems: Reading and writing files
  • Databases: Querying and updating data
  • Web Browsers: Navigating and extracting information
  • Code Execution: Running scripts and commands

3. The Memory: Context Management

Agents need to remember:

  • Conversation History: What was discussed
  • Tool Results: What actions were taken and their outcomes
  • User Preferences: Customization and settings
  • Long-term Knowledge: Persistent information

4. The Orchestrator: Agent Loop

The agent loop coordinates everything:

  1. Receive user input or task
  2. Process with language model
  3. Decide on actions (tool calls)
  4. Execute tools
  5. Process results
  6. Update memory
  7. Repeat until task complete
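
The steps above can be sketched without calling any API. This is only an illustration: `fake_model` is a hypothetical stand-in for the language model, and `lookup` is a hypothetical tool. Neither comes from a library.

```python
from typing import Any, Callable, Dict, List

def lookup(term: str) -> str:
    """Hypothetical tool: returns a canned definition."""
    facts = {"agent": "a program that uses a model to choose actions"}
    return facts.get(term, "unknown")

TOOLS: Dict[str, Callable[..., str]] = {"lookup": lookup}

def fake_model(memory: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Stand-in for an LLM: requests one tool call, then answers."""
    tool_results = [m for m in memory if m["role"] == "tool"]
    if not tool_results:
        return {"action": "lookup", "args": {"term": "agent"}}
    return {"answer": f"An agent is {tool_results[-1]['content']}."}

def run_loop(task: str, max_iterations: int = 5) -> str:
    memory = [{"role": "user", "content": task}]                 # 1. receive input
    for _ in range(max_iterations):                               # 7. repeat
        decision = fake_model(memory)                             # 2-3. process, decide
        if "answer" in decision:
            return decision["answer"]
        result = TOOLS[decision["action"]](**decision["args"])    # 4. execute tool
        memory.append({"role": "tool", "content": result})        # 5-6. record result
    return "stopped: iteration limit reached"

print(run_loop("What is an agent?"))
```

The iteration cap matters even in a toy version: if the model never produces an answer, the loop still terminates.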

Building Your First Agent

Let’s build a simple research agent that can search the web, summarize information, and answer questions.

Step 1: Set Up Your Environment

# requirements.txt
openai>=1.0.0
requests>=2.31.0
python-dotenv>=1.0.0

# .env
OPENAI_API_KEY=your_api_key_here

Step 2: Define Your Tools

import requests  # used once a real search API replaces the stub below
from typing import Dict, Any
from urllib.parse import quote

class WebSearchTool:
    """Tool for searching the web"""
    
    def __init__(self):
        self.name = "web_search"
        self.description = "Search the web for information on a given query"
    
    def execute(self, query: str) -> Dict[str, Any]:
        # In production, use a real search API like Google, Bing, or Serper
        # This is a simplified example that returns a canned result
        results = {
            "query": query,
            "results": [
                {
                    "title": f"Result about {query}",
                    "snippet": f"Information related to {query}...",
                    # Percent-encode the query so spaces and symbols give a valid URL
                    "url": f"https://example.com/{quote(query)}"
                }
            ]
        }
        return results

class SummarizeTool:
    """Tool for summarizing text"""
    
    def __init__(self):
        self.name = "summarize"
        self.description = "Summarize a given text into key points"
    
    def execute(self, text: str) -> Dict[str, Any]:
        # Placeholder: counts words and returns the first 50 of them
        # (in production, ask the LLM for a real summary)
        words = text.split()
        summary = {
            "word_count": len(words),
            "key_points": words[:50]  # Simplified: not real key points
        }
        return summary

Step 3: Create the Agent Core

from openai import OpenAI
import json

class SimpleAgent:
    def __init__(self, api_key: str):
        self.client = OpenAI(api_key=api_key)
        self.tools = {
            "web_search": WebSearchTool(),
            "summarize": SummarizeTool()
        }
        self.memory = []
    
    def get_tool_definitions(self):
        """Convert tools to OpenAI function format"""
        return [
            {
                "type": "function",
                "function": {
                    "name": "web_search",
                    "description": "Search the web for information",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The search query"
                            }
                        },
                        "required": ["query"]
                    }
                }
            },
            {
                "type": "function",
                "function": {
                    "name": "summarize",
                    "description": "Summarize text content",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "text": {
                                "type": "string",
                                "description": "The text to summarize"
                            }
                        },
                        "required": ["text"]
                    }
                }
            }
        ]
    
    def run(self, user_query: str, max_iterations: int = 5):
        """Main agent loop"""
        messages = [
            {
                "role": "system",
                "content": "You are a helpful research assistant. Use tools to gather information and provide comprehensive answers."
            },
            {
                "role": "user",
                "content": user_query
            }
        ]
        
        for iteration in range(max_iterations):
            # Call the language model
            response = self.client.chat.completions.create(
                model="gpt-4",
                messages=messages,
                tools=self.get_tool_definitions(),
                tool_choice="auto"
            )
            
            message = response.choices[0].message
            messages.append(message)
            
            # Check if the model wants to call a tool
            if message.tool_calls:
                for tool_call in message.tool_calls:
                    tool_name = tool_call.function.name
                    tool_args = json.loads(tool_call.function.arguments)
                    
                    # Execute the tool
                    tool = self.tools[tool_name]
                    result = tool.execute(**tool_args)
                    
                    # Add tool result to messages
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "content": json.dumps(result)
                    })
            else:
                # Agent has finished
                return message.content
        
        return "Agent reached maximum iterations"
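
The `get_tool_definitions` method above repeats by hand the name and description that each tool object already stores. One option is to let every tool carry its own parameter schema and build the list from that. The sketch below assumes a `parameters` attribute, which the tools in Step 2 do not have; `EchoTool` and `build_tool_definitions` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, List

class EchoTool:
    """Hypothetical tool that carries its own parameter schema."""
    name = "echo"
    description = "Return the given text unchanged"
    # Assumed attribute: a JSON Schema object describing the arguments
    parameters = {
        "type": "object",
        "properties": {"text": {"type": "string", "description": "Text to echo"}},
        "required": ["text"],
    }

    def execute(self, text: str) -> Dict[str, Any]:
        return {"text": text}

def build_tool_definitions(tools: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build the function-calling `tools` list from tool objects."""
    return [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description,
                "parameters": tool.parameters,
            },
        }
        for tool in tools.values()
    ]

definitions = build_tool_definitions({"echo": EchoTool()})
print(definitions[0]["function"]["name"])
```

With this layout, adding a tool means writing one class rather than editing both the class and a separate schema list.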

Step 4: Use Your Agent

from dotenv import load_dotenv
import os

load_dotenv()

agent = SimpleAgent(api_key=os.getenv("OPENAI_API_KEY"))
result = agent.run("What are the latest developments in quantum computing?")
print(result)

Advanced Patterns

Multi-Step Planning

For complex tasks, agents can create and execute plans:

class PlanningAgent(SimpleAgent):
    def create_plan(self, goal: str):
        """Create a step-by-step plan"""
        plan_prompt = f"""
        Create a detailed plan to achieve this goal: {goal}
        Break it down into specific, actionable steps.
        """
        
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": plan_prompt}]
        )
        
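        plan = response.choices[0].message.content
        return self.parse_plan(plan)
    
    def parse_plan(self, plan: str) -> list:
        """Split the model's reply into steps, one per non-empty line.

        A minimal assumption about the reply format; adjust it to your prompt.
        """
        steps = []
        for line in plan.splitlines():
            # Drop leading list markers such as "1.", "2)", "-", or "*"
            step = line.strip().lstrip("0123456789.)-* ").strip()
            if step:
                steps.append(step)
        return steps
    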
    def execute_plan(self, plan: list):
        """Execute each step of the plan"""
        results = []
        for step in plan:
            result = self.run(step)
            results.append(result)
        return results

Error Handling and Retries

Robust agents handle failures gracefully:

import time

def execute_with_retry(self, tool_name: str, args: dict, max_retries: int = 3):
    """Execute a tool with retry logic (intended as a SimpleAgent method)"""
    for attempt in range(max_retries):
        try:
            tool = self.tools[tool_name]
            return tool.execute(**args)
        except Exception as e:
            if attempt == max_retries - 1:
                return {"error": str(e)}
            # Wait before retry
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...
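
To see the backoff behave, it helps to run it against something that fails on purpose. The sketch below is a standalone variant of the same idea, not the method above: `call_with_retry` and `FlakyTool` are hypothetical names, and the base delay is shortened so the demo finishes quickly.

```python
import time
from typing import Any, Callable, Dict

def call_with_retry(func: Callable[[], Any], max_retries: int = 3,
                    base_delay: float = 0.01) -> Dict[str, Any]:
    """Call func, retrying on any exception with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return {"ok": True, "value": func(), "attempts": attempt + 1}
        except Exception as exc:
            if attempt == max_retries - 1:
                return {"ok": False, "error": str(exc), "attempts": attempt + 1}
            time.sleep(base_delay * 2 ** attempt)  # delay doubles each attempt
    return {"ok": False, "error": "max_retries must be at least 1", "attempts": 0}

class FlakyTool:
    """Hypothetical tool that fails a fixed number of times, then succeeds."""
    def __init__(self, failures: int):
        self.remaining = failures

    def execute(self) -> str:
        if self.remaining > 0:
            self.remaining -= 1
            raise ConnectionError("temporary failure")
        return "done"

outcome = call_with_retry(FlakyTool(failures=2).execute)
print(outcome)
```

Catching every `Exception` keeps the example short. In practice you would probably retry only errors that are likely to be transient, such as timeouts, and let programming errors surface immediately.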

Memory Management

For long conversations, implement memory management:

class MemoryAgent(SimpleAgent):
    def __init__(self, *args, max_memory_size: int = 10, **kwargs):
        super().__init__(*args, **kwargs)
        self.max_memory_size = max_memory_size
        self.important_memories = []
    
    def add_memory(self, content: str, importance: float = 0.5):
        """Store a memory only if importance exceeds 0.7; evict the oldest when full"""
        if importance > 0.7:
            self.important_memories.append(content)
            if len(self.important_memories) > self.max_memory_size:
                self.important_memories.pop(0)
    
    def get_relevant_memories(self, query: str):
        """Retrieve memories relevant to the current query"""
        # In production, use embeddings for semantic search
        return self.important_memories[-5:]  # Simplified

Best Practices

1. Start Simple

Begin with a single-purpose agent before building complex multi-agent systems. Each agent should have a clear, focused responsibility.

2. Validate Tool Inputs

Always validate inputs before executing tools:

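def validate_input(self, tool_name: str, args: dict) -> bool:
    """Validate tool inputs"""
    tool = self.tools.get(tool_name)
    if tool is None:
        return False  # Unknown tool
    # Assumes each tool declares a `required_params` list; the tools
    # defined earlier do not, so fall back to an empty list
    required_params = getattr(tool, "required_params", [])
    return all(param in args for param in required_params)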
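
Checking that required parameters are present does not catch wrong types or unexpected arguments. The sketch below validates arguments against the same JSON Schema shape used in `get_tool_definitions`. It covers only a small subset of JSON Schema, and `validate_args` is a hypothetical helper; a dedicated schema validation library would be the sturdier choice.

```python
from typing import Any, Dict, List

# Map JSON Schema type names to Python types (small subset, for illustration)
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate_args(schema: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the args look valid."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    for name, value in args.items():
        if name not in properties:
            problems.append(f"unexpected parameter: {name}")
            continue
        expected = JSON_TYPES.get(properties[name].get("type"))
        if expected is None:
            continue  # type not covered by this sketch
        # bool is a subclass of int in Python, so reject it for numeric types
        if isinstance(value, bool) and expected is not bool:
            problems.append(f"wrong type for {name}")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {name}")
    return problems

search_schema = {
    "type": "object",
    "properties": {"query": {"type": "string", "description": "The search query"}},
    "required": ["query"],
}
print(validate_args(search_schema, {"query": 5}))
```

Returning the list of problems, rather than a bare boolean, lets the agent pass the message back to the model so it can correct its own call.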

3. Implement Logging

Comprehensive logging helps debug and improve agents:

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def run(self, user_query: str):
    logger.info("Agent received query: %s", user_query)
    result = None  # ... agent logic assigns the final answer here
    logger.info("Agent completed with result: %s", result)
    return result
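
Logging at the tool level is often more useful than logging only the start and end of a run, because tool calls are where most failures and delays occur. One way to do this, sketched below, is a decorator that records arguments, duration, and exceptions. `logged` and `word_count` are hypothetical names chosen for this example.

```python
import functools
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("agent.tools")

def logged(func: Callable[..., Any]) -> Callable[..., Any]:
    """Log arguments, duration, and failures of a tool call."""
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        logger.info("calling %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback, then we re-raise
            logger.exception("%s failed", func.__name__)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s finished in %.1f ms", func.__name__, elapsed_ms)
        return result
    return wrapper

@logged
def word_count(text: str) -> int:
    """Hypothetical tool used only to demonstrate the decorator."""
    return len(text.split())

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(word_count("agents call tools"))
```

Be careful about what ends up in the log: tool arguments can contain user data or credentials, so you may want to redact them before logging.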

4. Set Clear Boundaries

Define what your agent can and cannot do:

ALLOWED_TOOLS = ["web_search", "summarize"]
BLOCKED_ACTIONS = ["delete_files", "modify_system"]

def is_action_allowed(self, action: str) -> bool:
    return action in ALLOWED_TOOLS and action not in BLOCKED_ACTIONS
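
An allowlist only helps if the dispatch step enforces it. The sketch below shows one way to do that: refuse any tool name that is not allowed or not registered, and return the refusal as a JSON string so it can be sent back to the model as a tool result instead of raising. `dispatch` and the stand-in tools are hypothetical.

```python
import json
from typing import Any, Callable, Dict

ALLOWED_TOOLS = {"web_search", "summarize"}

def dispatch(tools: Dict[str, Callable[..., Any]], name: str, raw_args: str) -> str:
    """Run an allowed tool and return a JSON string for the tool message."""
    if name not in ALLOWED_TOOLS or name not in tools:
        return json.dumps({"error": f"tool not available: {name}"})
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError:
        return json.dumps({"error": "arguments were not valid JSON"})
    if not isinstance(args, dict):
        return json.dumps({"error": "arguments must be a JSON object"})
    return json.dumps(tools[name](**args))

# Hypothetical stand-in tools for demonstration
tools = {
    "summarize": lambda text: {"word_count": len(text.split())},
    "delete_files": lambda path: {"deleted": path},
}

print(dispatch(tools, "summarize", '{"text": "one two three"}'))
print(dispatch(tools, "delete_files", '{"path": "/tmp/x"}'))
```

Here `delete_files` is registered but not on the allowlist, so the call is refused even though the function exists.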

5. Test Thoroughly

Create test cases for your agent:

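from unittest.mock import MagicMock

def test_agent():
    agent = SimpleAgent(api_key="test_key")
    
    # Stub the OpenAI client so the test makes no network call
    fake_message = MagicMock(content="Python is a programming language.", tool_calls=None)
    agent.client = MagicMock()
    agent.client.chat.completions.create.return_value = MagicMock(
        choices=[MagicMock(message=fake_message)]
    )
    
    # Test basic functionality
    result = agent.run("What is Python?")
    assert result == "Python is a programming language."
    
    # Test tool usage
    assert "web_search" in agent.tools
    
    # Test error handling
    # ... more tests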

Common Pitfalls to Avoid

  1. Infinite Loops: Always set maximum iterations
  2. Token Limits: Monitor context window usage
  3. Cost Management: Track API call costs
  4. Security: Never expose API keys or sensitive data
  5. Hallucination: Verify agent outputs, especially for critical tasks
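
For the token-limit pitfall, one simple approach is to trim old messages before each model call. The sketch below keeps the system message and as many of the most recent messages as fit a budget. The four-characters-per-token estimate is a rough assumption, not a real tokenizer, and `trim_history` is a hypothetical helper.

```python
from typing import Dict, List

def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token. Not a real tokenizer."""
    return max(1, len(text) // 4)

def trim_history(messages: List[Dict[str, str]], budget: int) -> List[Dict[str, str]]:
    """Keep system messages plus the most recent messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    used = sum(estimate_tokens(m["content"]) for m in system)
    kept: List[Dict[str, str]] = []
    for message in reversed(others):              # walk from newest to oldest
        cost = estimate_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return system + list(reversed(kept))          # restore chronological order

history = [
    {"role": "system", "content": "You are a helpful research assistant."},
    {"role": "user", "content": "Tell me about qubits. " * 20},
    {"role": "assistant", "content": "Qubits are the basic unit of quantum information."},
    {"role": "user", "content": "And error correction?"},
]
trimmed = trim_history(history, budget=40)
print([m["role"] for m in trimmed])
```

If the history contains tool calls, trimming needs more care than this: a tool result should stay together with the assistant message that requested it, so drop them as a pair.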

Next Steps

Once you’ve built a basic agent, consider:

  • Adding More Tools: Expand capabilities with new integrations
  • Improving Memory: Implement vector databases for semantic memory
  • Multi-Agent Systems: Coordinate multiple specialized agents
  • Fine-tuning: Customize models for your specific use case
  • Deployment: Package and deploy your agent as a service

Conclusion

Building AI agents is an exciting journey that combines language models, tool integration, and thoughtful architecture. Start simple, iterate, and gradually add complexity as you learn. The agent you build today might be the foundation for something much more sophisticated tomorrow.

Remember: the best agents are those that solve real problems. Focus on creating value, and the technical sophistication will follow naturally. Happy building!
