Building a Portfolio Management Agent with CrewAI
Priya Patel
Product manager at an AI startup. Explores how agents reshape workflows.
Building a Multi-Agent Portfolio Management System with CrewAI
Why Multi-Agent Systems for Portfolio Management
Single-LLM approaches to financial analysis hit a wall fast. You can ask GPT-4 to analyze a stock, but it can't simultaneously monitor risk exposure, scan for opportunities, and execute rebalancing decisions — at least not well. The context window fills up, the prompt becomes an incoherent mess of competing objectives, and the model starts hallucinating ticker symbols.
Multi-agent architectures solve this by decomposing the problem into specialized roles, each with focused context and tools. CrewAI makes this tractable without building your own orchestration layer from scratch.
This tutorial walks through building a functional four-agent portfolio management system. We'll use real market data, implement actual risk calculations, and be honest about where this system breaks down.
What we're building:
┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│  Researcher  │───▶│   Analyst    │───▶│    Trader    │───▶│ Risk Manager │
│              │    │              │    │              │    │              │
│ • News scan  │    │ • Valuation  │    │ • Signals    │    │ • Drawdown   │
│ • Earnings   │    │ • Technical  │    │ • Sizing     │    │ • Correlation│
│ • Macro data │    │ • Sentiment  │    │ • Execution  │    │ • Limits     │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘
Environment Setup
pip install crewai crewai-tools yfinance pandas numpy python-dotenv schedule
You'll need an OpenAI API key (or swap in another provider CrewAI supports):
# .env
OPENAI_API_KEY=sk-your-key-here
I'm using gpt-4o-mini throughout this tutorial for cost reasons. You can upgrade to gpt-4o for better financial reasoning, but expect 5-10x the cost per crew run.
Defining Custom Tools
CrewAI agents are only as useful as their tools. The LLM itself doesn't have market data — we need to give it functions that fetch real information.
# tools/market_tools.py
import yfinance as yf
import pandas as pd
import numpy as np
from crewai.tools import tool


@tool("Get Stock Price Data")
def get_stock_data(ticker: str, period: str = "3mo") -> str:
    """
    Fetch historical price data for a stock ticker.
    Period options: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y
    Returns OHLCV data summary.
    """
    try:
        stock = yf.Ticker(ticker)
        hist = stock.history(period=period)
        if hist.empty:
            return f"No data found for ticker {ticker}"
        current_price = hist['Close'].iloc[-1]
        period_return = ((hist['Close'].iloc[-1] / hist['Close'].iloc[0]) - 1) * 100
        # Annualize the daily standard deviation, assuming 252 trading days per year
        volatility = hist['Close'].pct_change().std() * np.sqrt(252) * 100
        avg_volume = hist['Volume'].mean()
        return (
            f"Ticker: {ticker}\n"
            f"Current Price: ${current_price:.2f}\n"
            f"Period Return ({period}): {period_return:.2f}%\n"
            f"Annualized Volatility: {volatility:.1f}%\n"
            f"Average Daily Volume: {avg_volume:,.0f}\n"
            # High and low cover only the requested period, so they are labeled that way
            f"Period High ({period}): ${hist['High'].max():.2f}\n"
            f"Period Low ({period}): ${hist['Low'].min():.2f}\n"
            f"Last Updated: {hist.index[-1].strftime('%Y-%m-%d')}"
        )
    except Exception as e:
        return f"Error fetching data for {ticker}: {str(e)}"

@tool("Get Company Fundamentals")
def get_fundamentals(ticker: str) -> str:
    """
    Retrieve fundamental financial data for a company.
    Includes P/E, market cap, revenue growth, profit margins.
    """
    try:
        stock = yf.Ticker(ticker)
        info = stock.info
        fundamentals = {
            "Company": info.get("longName", ticker),
            "Sector": info.get("sector", "N/A"),
            "Market Cap": info.get("marketCap", "N/A"),
            "P/E Ratio": info.get("trailingPE", "N/A"),
            "Forward P/E": info.get("forwardPE", "N/A"),
            "PEG Ratio": info.get("pegRatio", "N/A"),
            "Price/Book": info.get("priceToBook", "N/A"),
            "Revenue (TTM)": info.get("totalRevenue", "N/A"),
            "Revenue Growth": info.get("revenueGrowth", "N/A"),
            "Profit Margin": info.get("profitMargins", "N/A"),
            "Operating Margin": info.get("operatingMargins", "N/A"),
            "ROE": info.get("returnOnEquity", "N/A"),
            "Debt/Equity": info.get("debtToEquity", "N/A"),
            "Dividend Yield": info.get("dividendYield", "N/A"),
            "Beta": info.get("beta", "N/A"),
            "50D MA": info.get("fiftyDayAverage", "N/A"),
            "200D MA": info.get("twoHundredDayAverage", "N/A"),
        }
        # Numbers get thousands separators; missing values stay as "N/A"
        formatted = "\n".join(
            f"{k}: {v}" if not isinstance(v, (int, float))
            else f"{k}: {v:,.2f}"
            for k, v in fundamentals.items()
        )
        return formatted
    except Exception as e:
        return f"Error fetching fundamentals for {ticker}: {str(e)}"

@tool("Calculate Technical Indicators")
def get_technical_indicators(ticker: str) -> str:
    """
    Calculate RSI, MACD, Bollinger Bands, and moving average signals.
    Returns technical analysis summary.
    """
    try:
        stock = yf.Ticker(ticker)
        # One year of history so the 200-day average has enough data points
        hist = stock.history(period="1y")
        if len(hist) < 50:
            return f"Insufficient data for technical analysis of {ticker}"
        close = hist['Close']
        # RSI (14-day), simple-moving-average variant rather than Wilder's smoothing
        delta = close.diff()
        gain = (delta.where(delta > 0, 0)).rolling(window=14).mean()
        loss = (-delta.where(delta < 0, 0)).rolling(window=14).mean()
        rs = gain / loss
        rsi = 100 - (100 / (1 + rs))
        # MACD
        ema12 = close.ewm(span=12).mean()
        ema26 = close.ewm(span=26).mean()
        macd = ema12 - ema26
        signal_line = macd.ewm(span=9).mean()
        macd_histogram = macd - signal_line
        # Bollinger Bands
        sma20 = close.rolling(window=20).mean()
        std20 = close.rolling(window=20).std()
        upper_bb = sma20 + (std20 * 2)
        lower_bb = sma20 - (std20 * 2)
        # Moving Averages
        sma50 = close.rolling(window=50).mean()
        long_window = min(200, len(close))  # shorter window if history is limited
        sma200 = close.rolling(window=long_window).mean()
        current = close.iloc[-1]
        current_rsi = rsi.iloc[-1]
        current_macd = macd.iloc[-1]
        current_signal = signal_line.iloc[-1]
        # Generate signals
        rsi_signal = "OVERBOUGHT" if current_rsi > 70 else "OVERSOLD" if current_rsi < 30 else "NEUTRAL"
        macd_signal = "BULLISH" if current_macd > current_signal else "BEARISH"
        bb_position = "ABOVE UPPER" if current > upper_bb.iloc[-1] else "BELOW LOWER" if current < lower_bb.iloc[-1] else "WITHIN BANDS"
        ma_signal = "ABOVE" if current > sma50.iloc[-1] else "BELOW"
        return (
            f"Technical Indicators for {ticker}:\n"
            f"RSI (14): {current_rsi:.1f} - {rsi_signal}\n"
            f"MACD: {current_macd:.3f} vs Signal: {current_signal:.3f} - {macd_signal}\n"
            f"Bollinger Bands: Price is {bb_position}\n"
            f" Upper: ${upper_bb.iloc[-1]:.2f}, Middle: ${sma20.iloc[-1]:.2f}, Lower: ${lower_bb.iloc[-1]:.2f}\n"
            f"50-Day SMA: ${sma50.iloc[-1]:.2f} - Price is {ma_signal}\n"
            # The label reports the window actually used
            f"{long_window}-Day SMA: ${sma200.iloc[-1]:.2f}\n"
            f"Current Price: ${current:.2f}"
        )
    except Exception as e:
        return f"Error calculating indicators for {ticker}: {str(e)}"

@tool("Portfolio Risk Analysis")
def analyze_portfolio_risk(holdings_json: str) -> str:
    """
    Analyze portfolio-level risk metrics.
    Input: JSON string of holdings, e.g. '{"AAPL": 100, "MSFT": 50, "GOOGL": 30}'
    Holdings values are number of shares.
    """
    import json
    try:
        holdings = json.loads(holdings_json)
        tickers = list(holdings.keys())
        # Fetch price data for correlation matrix
        data = yf.download(tickers, period="6mo")['Close']
        # A single ticker can come back as a Series; normalize to a DataFrame
        if isinstance(data, pd.Series):
            data = data.to_frame(name=tickers[0])
        if data.empty:
            return "Could not fetch price data for portfolio tickers"
        returns = data.pct_change().dropna()
        # Portfolio value calculation
        latest_prices = data.iloc[-1]
        position_values = {t: holdings[t] * latest_prices[t] for t in tickers}
        total_value = sum(position_values.values())
        weights = {t: position_values[t] / total_value for t in tickers}
        # Portfolio metrics (weights held constant at today's values)
        portfolio_returns = sum(returns[t] * weights[t] for t in tickers)
        portfolio_vol = portfolio_returns.std() * np.sqrt(252) * 100
        # Sharpe ratio with the risk-free rate assumed to be zero
        sharpe = (portfolio_returns.mean() * 252) / (portfolio_returns.std() * np.sqrt(252)) if portfolio_returns.std() > 0 else 0
        # Max drawdown
        cumulative = (1 + portfolio_returns).cumprod()
        rolling_max = cumulative.cummax()
        drawdown = (cumulative - rolling_max) / rolling_max
        max_drawdown = drawdown.min() * 100
        # Correlation matrix
        corr_matrix = returns.corr()
        # Concentration risk
        max_weight = max(weights.values())
        max_position = max(weights, key=weights.get)
        # One-day historical Value at Risk (95%)
        var_95 = np.percentile(portfolio_returns, 5) * total_value
        report = (
            f"Portfolio Risk Report\n"
            f"{'='*40}\n"
            f"Total Value: ${total_value:,.2f}\n\n"
            f"Position Weights:\n"
        )
        for t in sorted(weights, key=weights.get, reverse=True):
            report += f" {t}: {weights[t]*100:.1f}% (${position_values[t]:,.2f})\n"
        report += (
            f"\nRisk Metrics:\n"
            f" Annualized Volatility: {portfolio_vol:.1f}%\n"
            f" Sharpe Ratio: {sharpe:.2f}\n"
            f" Max Drawdown: {max_drawdown:.1f}%\n"
            f" 95% VaR (daily): ${abs(var_95):,.2f}\n"
            f" Concentration: {max_position} at {max_weight*100:.1f}%\n\n"
            f"Correlation Matrix:\n{corr_matrix.round(2).to_string()}\n"
        )
        return report
    except Exception as e:
        return f"Error analyzing portfolio: {str(e)}"

@tool("Get Market News Sentiment")
def get_market_news(ticker: str) -> str:
    """
    Fetch recent news headlines for a stock ticker.
    No sentiment score is computed; the agent interprets the headlines.
    """
    try:
        stock = yf.Ticker(ticker)
        news = stock.news
        if not news:
            return f"No recent news found for {ticker}"
        formatted_news = f"Recent News for {ticker}:\n\n"
        for i, item in enumerate(news[:5], 1):
            # Some yfinance releases appear to nest article fields under a
            # 'content' key; fall back to the item itself when it is absent
            content = item.get('content') or item
            title = content.get('title', 'No title')
            publisher = content.get('publisher', 'Unknown')
            link = content.get('link', '')
            formatted_news += f"{i}. {title}\n Publisher: {publisher}\n Link: {link}\n\n"
        return formatted_news
    except Exception as e:
        return f"Error fetching news for {ticker}: {str(e)}"
These tools give agents access to real data. A critical point: the tools do the math, not the LLM. Financial calculations done by language models are unreliable. We compute RSI, VaR, and correlation in Python and feed the results as text.
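To make the "tools do the math" point concrete, here is a minimal pure-Python sketch of two of the risk numbers computed above: max drawdown and one-day historical VaR. The function names and the toy return series are illustrative assumptions, not part of the tutorial's tool module. One small difference from the pandas version: this sketch treats the starting equity of 1.0 as the first peak.

```python
import math


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (a negative fraction)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def historical_var(returns, confidence=0.95):
    """Historical VaR as a positive loss fraction.

    Interpolates linearly between order statistics, which matches the
    default behaviour of np.percentile.
    """
    ordered = sorted(returns)
    rank = (1.0 - confidence) * (len(ordered) - 1)
    lower, upper = math.floor(rank), math.ceil(rank)
    value = ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)
    return -value


# Toy daily returns, chosen only to make the arithmetic easy to follow
toy_returns = [0.10, -0.20, 0.05, -0.10, 0.25]
print(f"Max drawdown: {max_drawdown(toy_returns):.1%}")    # about -24.4%
print(f"95% VaR:      {historical_var(toy_returns):.1%}")  # about 18.0%
```

Numbers like these are cheap to compute deterministically, which is why they belong in the tool layer rather than in the prompt.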
Defining the Agents
Each agent gets a specific role, goal, and backstory. The backstory matters more than you'd think — it shapes how the LLM frames its reasoning.
# agents/portfolio_agents.py
from crewai import Agent
from tools.market_tools import (
    get_stock_data,
    get_fundamentals,
    get_technical_indicators,
    analyze_portfolio_risk,
    get_market_news,
)

def create_researcher(llm):
    return Agent(
        role="Market Research Specialist",
        goal=(
            "Identify investment opportunities and risks by analyzing market data, "
            "earnings reports, macroeconomic trends, and breaking news. Provide "
            "factual, data-driven research summaries without speculation."
        ),
        backstory=(
            "You are a seasoned market researcher with 15 years of experience "
            "covering equities. You focus on verifiable data: price action, "
            "fundamental metrics, and news flow. You never fabricate numbers "
            "or make price predictions. When data is insufficient, you say so "
            "clearly rather than guessing. You have a healthy skepticism of "
            "narrative-driven market stories and always look for the data "
            "behind the headlines."
        ),
        tools=[get_stock_data, get_fundamentals, get_market_news],
        llm=llm,
        verbose=True,
        allow_delegation=False,
    )

def create_analyst(llm):
    return Agent(
        role="Quantitative Investment Analyst",
        goal=(
            "Synthesize research data into actionable investment theses. "
            "Evaluate valuations, assess technical setups, score opportunities, "
            "and produce clear buy/hold/sell recommendations with conviction levels."
        ),
        backstory=(
            "You are a CFA charterholder turned quantitative analyst. You "
            "combine fundamental valuation with technical analysis to form "
            "investment views. You assign conviction scores from 1-10 and "
            "always specify your reasoning. You're aware that models have "
            "limitations and explicitly state your assumptions. You compare "
            "each opportunity against a set of valuation benchmarks: P/E "
            "relative to growth, price-to-book vs ROE, and technical "
            "momentum confirmation. You never recommend a position without "
            "a clear thesis and risk/reward estimate."
        ),
        tools=[get_stock_data, get_fundamentals, get_technical_indicators],
        llm=llm,
        verbose=True,
        allow_delegation=False,
    )

def create_trader(llm):
    return Agent(
        role="Portfolio Trader and Position Manager",
        goal=(
            "Translate analyst recommendations into specific trade decisions "
            "with exact position sizes, entry prices, and order types. "
            "Manage position sizing using risk-adjusted frameworks."
        ),
        backstory=(
            "You are a former buy-side trader who managed $500M in US equities. "
            "You think in terms of basis points and risk-adjusted returns. "
            "You use a systematic position sizing framework: no single position "
            "exceeds 5% of portfolio, and you scale into positions using "
            "limit orders. You always specify: ticker, action (BUY/SELL/HOLD), "
            "quantity, order type, and rationale. You consider liquidity, "
            "spread costs, and market impact for any trade. You never "
            "recommend market orders for positions over $50k."
        ),
        tools=[get_stock_data, get_technical_indicators],
        llm=llm,
        verbose=True,
        allow_delegation=False,
    )

def create_risk_manager(llm):
    return Agent(
        role="Chief Risk Officer",
        goal=(
            "Evaluate portfolio-level risk, enforce position limits, check "
            "correlation exposure, and veto or modify trades that violate "
            "risk parameters. Protect capital above all else."
        ),
        backstory=(
            "You are a risk manager who has lived through 2008, 2020, and "
            "2022. You are inherently cautious and your job is to say 'no' "
            "or 'reduce size' when the portfolio takes on too much risk. "
            "You enforce these hard limits: max single position 5% of "
            "portfolio, max sector concentration 25%, max portfolio drawdown "
            "limit 15%, minimum Sharpe ratio target 0.5. You analyze "
            "correlation between holdings and flag when the portfolio is "
            "effectively concentrated in fewer bets than it appears. You "
            "always provide a clear APPROVE, MODIFY, or REJECT decision "
            "with specific reasoning."
        ),
        tools=[analyze_portfolio_risk, get_stock_data],
        llm=llm,
        verbose=True,
        allow_delegation=False,
    )
Notice allow_delegation=False on every agent. In my testing, enabling delegation in financial workflows causes agents to ping-pong responsibility without converging. The sequential process handles the inter-agent communication.
Workflow Orchestration
CrewAI supports Process.sequential and Process.hierarchical. For portfolio management, sequential is the right choice — each agent's output feeds the next. Hierarchical introduces a manager agent that adds latency and often makes worse routing decisions than a fixed pipeline.
# crew/portfolio_crew.py
from typing import Optional

from crewai import Crew, Process, Task
from agents.portfolio_agents import (
    create_researcher,
    create_analyst,
    create_trader,
    create_risk_manager,
)


class PortfolioCrew:
    def __init__(self, llm="gpt-4o-mini", portfolio_value=100000):
        self.llm = llm
        self.portfolio_value = portfolio_value
        self.researcher = create_researcher(llm)
        self.analyst = create_analyst(llm)
        self.trader = create_trader(llm)
        self.risk_manager = create_risk_manager(llm)

    def run_analysis(self, tickers: list[str], current_holdings: Optional[dict] = None):
        """
        Run the full portfolio analysis pipeline.
        Args:
            tickers: List of stock tickers to analyze
            current_holdings: Dict of {ticker: num_shares} for existing positions
        """
        if current_holdings is None:
            current_holdings = {}
        ticker_str = ", ".join(tickers)
        holdings_str = str(current_holdings) if current_holdings else "No current positions (starting fresh)"
        # Task 1: Research
        research_task = Task(
            description=(
                f"Conduct comprehensive research on the following tickers: {ticker_str}\n\n"
                f"For EACH ticker, gather:\n"
                f"1. Current price data and recent performance\n"
                f"2. Key fundamental metrics (P/E, revenue growth, margins)\n"
                f"3. Recent news and any material developments\n\n"
                f"Portfolio context: Total portfolio value is ${self.portfolio_value:,}.\n"
                f"Current holdings: {holdings_str}\n\n"
                f"Output a structured research brief for each ticker with the most "
                f"relevant data points. Flag any red flags (declining revenue, "
                f"high debt, negative news) prominently."
            ),
            expected_output=(
                "A structured research report with sections for each ticker, "
                "including price data, fundamental summary, news highlights, "
                "and a preliminary assessment (positive/neutral/negative outlook)."
            ),
            agent=self.researcher,
        )
        # Task 2: Analysis
        analysis_task = Task(
            description=(
                "Using the research brief from the Market Research Specialist, "
                "perform deep analysis on each ticker.\n\n"
                "For EACH ticker:\n"
                "1. Calculate and interpret technical indicators (RSI, MACD, Bollinger Bands)\n"
                "2. Assess valuation relative to sector and growth rate\n"
                "3. Formulate a clear investment thesis\n"
                "4. Assign a conviction score (1-10) and recommendation (BUY/HOLD/SELL)\n"
                "5. Estimate a risk/reward ratio\n\n"
                f"Portfolio value: ${self.portfolio_value:,}\n"
                f"Current holdings: {holdings_str}\n\n"
                "Be specific about WHY you hold each view. Disagree with the "
                "researcher's preliminary assessment if your analysis warrants it."
            ),
            expected_output=(
                "An investment analysis report with per-ticker recommendations, "
                "conviction scores, technical and fundamental reasoning, "
                "and a ranked priority list for action."
            ),
            agent=self.analyst,
            context=[research_task],
        )
        # Task 3: Trading Decisions
        trading_task = Task(
            description=(
                "Convert the analyst's recommendations into specific trade orders.\n\n"
                f"Portfolio value: ${self.portfolio_value:,}\n"
                f"Current holdings: {holdings_str}\n\n"
                "For each recommended action:\n"
                "1. Specify exact ticker, action (BUY/SELL), and number of shares\n"
                "2. Choose order type (LIMIT/MARKET) with rationale\n"
                "3. For BUY orders, suggest entry price and position size in dollars\n"
                "4. For SELL orders, specify exit strategy\n"
                "5. Apply position sizing rules: max 5% per position\n\n"
                "Output a trade blotter with all proposed trades, total capital "
                "required, and resulting portfolio allocation estimate."
            ),
            expected_output=(
                "A trade blotter table with columns: Ticker, Action, Shares, "
                "Order Type, Price, Dollar Amount, Rationale. Include a summary "
                "of total capital deployment and resulting allocation."
            ),
            agent=self.trader,
            context=[research_task, analysis_task],
        )
        # Task 4: Risk Review
        risk_task = Task(
            description=(
                "Review the proposed trade blotter and the full portfolio analysis.\n\n"
                f"Current holdings: {holdings_str}\n"
                f"Portfolio value: ${self.portfolio_value:,}\n\n"
                "Perform these checks:\n"
                "1. Run portfolio risk analysis on proposed post-trade portfolio\n"
                "2. Check position concentration limits (max 5% per name)\n"
                "3. Check sector concentration (max 25% per sector)\n"
                "4. Evaluate correlation risk between holdings\n"
                "5. Assess whether total drawdown risk is acceptable\n\n"
                "For each proposed trade, provide a decision: APPROVE, MODIFY (with "
                "specific changes), or REJECT (with reasoning).\n\n"
                "Conclude with an overall portfolio risk assessment and any "
                "recommended changes to the trade blotter."
            ),
            expected_output=(
                "A risk review report with per-trade decisions (APPROVE/MODIFY/REJECT), "
                "portfolio-level risk metrics, and a final recommended trade list "
                "that incorporates all risk adjustments."
            ),
            agent=self.risk_manager,
            context=[research_task, analysis_task, trading_task],
        )
        # Assemble and run
        crew = Crew(
            agents=[self.researcher, self.analyst, self.trader, self.risk_manager],
            tasks=[research_task, analysis_task, trading_task, risk_task],
            process=Process.sequential,
            verbose=True,
        )
        result = crew.kickoff()
        return result
The context parameter on each task is how we chain outputs. Task 2 receives Task 1's output automatically. This is cleaner than stuffing everything into a single prompt.
Running the System
# main.py
from crew.portfolio_crew import PortfolioCrew
from dotenv import load_dotenv

load_dotenv()

# Configuration
PORTFOLIO_VALUE = 100_000
WATCHLIST = ["AAPL", "NVDA", "JPM", "UNH", "XOM"]
CURRENT_HOLDINGS = {
    "AAPL": 50,  # ~$9,500
    "MSFT": 30,  # ~$12,000
}


def main():
    crew = PortfolioCrew(
        llm="gpt-4o-mini",
        portfolio_value=PORTFOLIO_VALUE,
    )
    result = crew.run_analysis(
        tickers=WATCHLIST,
        current_holdings=CURRENT_HOLDINGS,
    )
    # Save output
    with open("portfolio_report.md", "w") as f:
        f.write(str(result))
    print("\n" + "="*60)
    print("FINAL PORTFOLIO RECOMMENDATION")
    print("="*60)
    print(result)


if __name__ == "__main__":
    main()
A typical run takes 3-5 minutes and costs roughly $0.15-0.40 with gpt-4o-mini. With gpt-4o, expect $1.50-4.00 per run. The reasoning quality improves noticeably, but the cost adds up fast if you're running daily.
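For a quick sense of how those per-run figures add up, here is a back-of-envelope sketch. It assumes one run per calendar day over a 30-day month; `monthly_cost` is an illustrative helper, and the per-run ranges are the rough estimates quoted above rather than measured prices.

```python
def monthly_cost(cost_per_run: float, runs_per_day: int = 1, days: int = 30) -> float:
    """Estimated monthly spend in dollars, rounded to cents."""
    return round(cost_per_run * runs_per_day * days, 2)


# Per-run ranges taken from the estimates in the text
per_run_ranges = {"gpt-4o-mini": (0.15, 0.40), "gpt-4o": (1.50, 4.00)}
for model, (low, high) in per_run_ranges.items():
    print(f"{model}: ${monthly_cost(low):.2f} to ${monthly_cost(high):.2f} per month")
```

Debugging and re-runs are not included, so treat the result as a floor rather than a budget.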
Adding a Structured Output Parser
Raw text output from agents is messy. Here's how to extract structured trade decisions:
# utils/output_parser.py
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TradeOrder:
    ticker: str
    action: str  # BUY, SELL, HOLD
    shares: Optional[int]
    order_type: str  # LIMIT, MARKET
    price: Optional[float]
    dollar_amount: Optional[float]
    rationale: str
    risk_decision: str  # APPROVE, MODIFY, REJECT


def parse_trade_blotter(output_text: str) -> list[TradeOrder]:
    """
    Extract trade orders from crew output.
    This is inherently fragile because LLM output format varies.
    Uses regex patterns that match common table formats.
    """
    trades = []
    # Pattern for table rows like: | AAPL | BUY | 25 | LIMIT | $185.00 | $4,625 | ... |
    table_pattern = re.compile(
        r'\|\s*(\w+)\s*\|\s*(BUY|SELL|HOLD)\s*\|\s*(\d+)?\s*\|'
        r'\s*(LIMIT|MARKET)\s*\|\s*\$?([\d,.]+)?\s*\|\s*\$?([\d,.]+)?\s*\|',
        re.IGNORECASE
    )
    for match in table_pattern.finditer(output_text):
        trades.append(TradeOrder(
            ticker=match.group(1).upper(),
            action=match.group(2).upper(),
            shares=int(match.group(3)) if match.group(3) else None,
            order_type=match.group(4).upper(),
            price=float(match.group(5).replace(',', '')) if match.group(5) else None,
            dollar_amount=float(match.group(6).replace(',', '')) if match.group(6) else None,
            rationale="",  # Extracted separately
            risk_decision="UNKNOWN",
        ))
    if not trades:
        return trades
    # Extract risk decisions for the tickers found above, instead of a
    # hard-coded ticker list (matches a ticker and a decision on the same line)
    ticker_alternatives = "|".join(sorted({re.escape(t.ticker) for t in trades}))
    decision_pattern = re.compile(
        rf'\b({ticker_alternatives})\b.*?(APPROVE|MODIFY|REJECT)',
        re.IGNORECASE
    )
    for match in decision_pattern.finditer(output_text):
        ticker = match.group(1).upper()
        decision = match.group(2).upper()
        for trade in trades:
            if trade.ticker == ticker and trade.risk_decision == "UNKNOWN":
                trade.risk_decision = decision
    return trades
This parser is brittle by design. I've built enough of these to know that relying on LLM output format is a losing game long-term. For production, you'd use CrewAI's structured output features or Pydantic models:
from pydantic import BaseModel, Field
from typing import Literal


class TradeDecision(BaseModel):
    ticker: str = Field(description="Stock ticker symbol")
    action: Literal["BUY", "SELL", "HOLD"]
    shares: int = Field(description="Number of shares", ge=0)
    order_type: Literal["LIMIT", "MARKET"]
    target_price: float = Field(description="Target entry/exit price")
    conviction: int = Field(description="Conviction score 1-10", ge=1, le=10)


class TradingBlotter(BaseModel):
    trades: list[TradeDecision]
    total_capital_required: float
    cash_remaining: float
    summary: str
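Pass the model to the relevant Task through its output_pydantic parameter (for example, output_pydantic=TradingBlotter on the trading task) so that the task's output is validated against the schema.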
Extending with Scheduled Execution
For a system that runs daily, add a scheduler:
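# scheduler.py
import os
import schedule
import time
from datetime import datetime
from crew.portfolio_crew import PortfolioCrew
# main.py defines main(), not run_analysis(), so reuse its configuration here.
# Importing from main also runs its load_dotenv() call.
from main import PORTFOLIO_VALUE, WATCHLIST, CURRENT_HOLDINGS


def daily_run():
    """Run portfolio analysis at market close."""
    print(f"\n{'='*60}")
    print(f"Starting daily portfolio analysis: {datetime.now()}")
    print(f"{'='*60}\n")
    try:
        crew = PortfolioCrew(llm="gpt-4o-mini", portfolio_value=PORTFOLIO_VALUE)
        result = crew.run_analysis(tickers=WATCHLIST, current_holdings=CURRENT_HOLDINGS)
        # Log results
        os.makedirs("reports", exist_ok=True)
        timestamp = datetime.now().strftime("%Y%m%d_%H%M")
        report_path = f"reports/report_{timestamp}.md"
        with open(report_path, "w") as f:
            f.write(str(result))
        print(f"Report saved: {report_path}")
    except Exception as e:
        print(f"Analysis failed: {e}")


# Run at 16:30, after the US market close. schedule uses the machine's local
# time, so this is 4:30 PM ET only if the host clock is set to Eastern Time.
schedule.every().day.at("16:30").do(daily_run)
# Also run Monday morning for weekly rebalancing
schedule.every().monday.at("09:00").do(daily_run)

if __name__ == "__main__":
    print("Portfolio scheduler started. Waiting for scheduled runs...")
    while True:
        schedule.run_pending()
        time.sleep(60)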
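One thing the scheduler above does not handle: a job registered with every().day also fires on Saturdays and Sundays, when there is no new market data. A minimal guard, assuming a hand-maintained holiday set (the `MARKET_HOLIDAYS` name and its two entries are illustrative, not a complete exchange calendar), might look like this:

```python
from datetime import date

# Illustrative subset only; a real deployment would load the full exchange calendar
MARKET_HOLIDAYS = {
    date(2025, 1, 1),    # New Year's Day
    date(2025, 12, 25),  # Christmas Day
}


def is_trading_day(day: date, holidays: set = MARKET_HOLIDAYS) -> bool:
    """True for Monday to Friday dates that are not in the holiday set."""
    return day.weekday() < 5 and day not in holidays


print(is_trading_day(date(2025, 1, 2)))  # Thursday
print(is_trading_day(date(2025, 1, 4)))  # Saturday
```

A call such as `if not is_trading_day(date.today()): return` at the top of daily_run would skip the crew run, and its API cost, on non-trading days.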
Honest Assessment: Where This Breaks Down
I've built and tested systems like this. Here's what actually matters:
What works well:
- Research gathering is genuinely useful. The researcher agent reliably pulls real data and summarizes it. This alone saves 20-30 minutes of manual work per analysis cycle.
- Risk checking catches concentration issues. The correlation analysis and position limit enforcement are the most production-ready components.
- The structured pipeline forces discipline. Each agent has a focused task and can't skip steps.
What doesn't work well:
- LLMs are bad at numerical reasoning. Despite giving them tools, agents still miscalculate position sizes, confuse percentages, and occasionally invent numbers between tool calls. The risk manager's math should always be validated programmatically.
- No real execution layer. This system generates recommendations, not trades. Integrating with a broker API (Alpaca, IBKR) is a separate engineering project with its own compliance requirements.
- Latency is a problem. Four sequential agents with tool calls take 3-5 minutes. For intraday decisions, that's too slow. This is a daily/weekly rebalancing tool, not a trading system.
- Cost scales with frequency. Running this daily at $0.30/run is $9/month with mini. At $3/run with GPT-4o, that's $90/month. Factor in debugging runs and you're looking at $150-200/month.
- Backtesting is hard. CrewAI doesn't have built-in backtesting. You'd need to wrap the entire pipeline in a simulation framework and replay historical data through it — a significant engineering effort.
- Hallucination risk is real. The researcher might confidently state earnings dates or revenue figures that are wrong. Always cross-reference critical data points with a second source.
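Validating the agents' math programmatically can be as simple as recomputing each blotter row. The sketch below is one way to do it, under stated assumptions: `ProposedTrade` and `check_trade` are hypothetical names, the prices are illustrative, and the cap is applied to the order alone without netting against existing holdings.

```python
from dataclasses import dataclass


@dataclass
class ProposedTrade:
    """Hypothetical container for one blotter row; not a CrewAI type."""
    ticker: str
    shares: int
    price: float
    dollar_amount: float  # the figure the agent reported


def check_trade(trade: ProposedTrade, portfolio_value: float,
                max_weight: float = 0.05, tolerance: float = 0.01) -> list[str]:
    """Return a list of problems; an empty list means the row passed both checks."""
    issues = []
    expected = trade.shares * trade.price
    # 1. Does the reported dollar amount match shares x price (within 1%)?
    if abs(expected - trade.dollar_amount) > tolerance * expected:
        issues.append(
            f"{trade.ticker}: stated ${trade.dollar_amount:,.2f}, "
            f"recomputed ${expected:,.2f}"
        )
    # 2. Does the order alone exceed the per-position cap?
    limit = max_weight * portfolio_value
    if expected > limit:
        issues.append(
            f"{trade.ticker}: ${expected:,.2f} exceeds the ${limit:,.2f} position limit"
        )
    return issues


blotter = [
    ProposedTrade("AAPL", 25, 185.00, 4625.00),  # consistent and under the cap
    ProposedTrade("NVDA", 60, 120.00, 6000.00),  # wrong total and over the cap
]
for trade in blotter:
    print(trade.ticker, check_trade(trade, portfolio_value=100_000) or "OK")
```

Because the check is plain arithmetic, it gives the same answer every time, which is the property the risk manager agent cannot guarantee.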
Production Considerations
If you're serious about deploying this:
| Concern | Recommendation |
|---|---|
| Data reliability | Replace yfinance with a paid API (Polygon, Alpha Vantage). yfinance breaks without warning. |
| Cost control | Cache tool results. Don't re-fetch AAPL data 4 times per run. |
| Auditability | Log every tool call and agent response to a database. You need to explain why trades were made. |
| Guardrails | Add hard-coded position limits that agents can't override. The risk manager is probabilistic; your code should be deterministic. |
| Human-in-the-loop | Never auto-execute. Route the final trade blotter to a human for approval. |
| Model choice | Use GPT-4o for the analyst and risk manager (reasoning-heavy). Use mini for the researcher (data-fetching-heavy). CrewAI supports per-agent LLM configuration. |
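For the cost-control row, a small time-based cache is often enough to stop several agents from re-fetching the same ticker within one run. The decorator below is a minimal sketch, not a CrewAI feature: `ttl_cache` is a hypothetical helper, it handles positional hashable arguments only, and the 15-minute default is an arbitrary assumption.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds: float = 900.0, clock=time.monotonic):
    """Cache results per positional-argument tuple for ttl_seconds."""
    def decorator(func):
        store = {}  # maps args -> (timestamp, value)

        @wraps(func)
        def wrapper(*args):
            now = clock()
            cached = store.get(args)
            if cached is not None and now - cached[0] < ttl_seconds:
                return cached[1]  # still fresh, skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value
        return wrapper
    return decorator


fetch_count = {"calls": 0}


@ttl_cache(ttl_seconds=900)
def fetch_price_summary(ticker: str) -> str:
    """Stand-in for an expensive market-data call."""
    fetch_count["calls"] += 1
    return f"summary for {ticker}"


for _ in range(4):
    fetch_price_summary("AAPL")
print(fetch_count["calls"])  # 1: the other three calls were served from the cache
```

In this project, the safest place to apply it is probably a plain helper function that each tool body calls, rather than stacking it on the tool decorator itself.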
The Complete Architecture
┌──────────────────────────────────────────────────────────┐
│                Scheduler (cron/schedule)                 │
└────────────────────────────┬─────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────┐
│                     CrewAI Pipeline                      │
│                                                          │
│  ┌──────────┐   ┌──────────┐   ┌────────┐   ┌────────┐   │
│  │Researcher│──▶│ Analyst  │──▶│ Trader │──▶│  Risk  │   │
│  └────┬─────┘   └────┬─────┘   └───┬────┘   └───┬────┘   │
│       │              │             │            │        │
└───────┼──────────────┼─────────────┼────────────┼────────┘
        ▼              ▼             ▼            ▼
┌──────────────────────────────────────────────────────────┐
│                        Tool Layer                        │
│   yfinance API │ Technical Calc │ Portfolio Risk Calc    │
└──────────────────────────────────────────────────────────┘
                             ▼
┌──────────────────────────────────────────────────────────┐
│           Output: Trade Blotter + Risk Report            │
│          → Human Review → Broker API (optional)          │
└──────────────────────────────────────────────────────────┘
This system is a starting point, not a finished product. The multi-agent pattern genuinely adds value by decomposing the problem and enforcing a structured workflow. But the gap between "impressive demo" and "trustworthy financial tool" is wide, and