Every strategy looks good in theory. Backtesting is where theory meets reality. By simulating your trading rules on historical data, you can estimate how a strategy would have performed before risking real capital. But backtesting is also where most traders fool themselves — curve-fitting to past data and convincing themselves they have found the holy grail.

Here is how to backtest properly, avoid the common traps, and actually learn something useful from your results.

What Backtesting Actually Tells You

Let's start with what backtesting does NOT tell you:

It does not guarantee future performance
It does not fully capture real-world slippage, especially in illiquid markets
It does not capture the psychological difficulty of executing losing trades
It does not reflect changing market microstructure

What backtesting DOES tell you:

Whether the core logic of your strategy has a statistical edge on historical data
The approximate risk-reward characteristics (drawdowns, win rate, profit factor)
How the strategy performs across different market regimes (trending, ranging, volatile, quiet)
Whether the strategy degrades gracefully or fails catastrophically when conditions change

Backtesting is a necessary filter, not a sufficient one. A strategy that fails backtesting will almost certainly fail live. A strategy that passes backtesting might succeed live — but it needs further validation.

Step 1: Define Your Rules Precisely

The number one backtesting mistake is ambiguous rules. "Buy when the market looks oversold" is not testable. "Buy when RSI(14) closes below 30 on the daily chart" is.

Every rule must specify:

Indicator and parameters: Which indicator, what period, what threshold
Timeframe: Daily, 4-hour, hourly — this changes everything
Entry logic: Exact condition that triggers the trade
Exit logic: Take profit, stop loss, trailing stop, time stop — all defined
Position sizing: Fixed dollar, fixed fractional, or volatility-adjusted
Filters: Any conditions that must also be true (trend filter, volatility filter)

Write your rules down before you touch any data. If you develop rules while looking at the data, you are curve-fitting — the most dangerous trap in backtesting.
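One way to force that precision is to write the rules as data before any backtest code exists. The sketch below encodes the RSI example from above; the class and field names are illustrative, and the exit and sizing values are placeholders rather than recommendations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StrategyRules:
    """One fully specified rule set. Field names are illustrative, not a standard schema."""
    indicator: str                      # which indicator, e.g. "RSI"
    period: int                         # lookback in candles
    entry_below: float                  # enter long when the indicator closes below this
    timeframe: str                      # "1d", "4h", "1h", ...
    take_profit_pct: float              # exit at this gain
    stop_loss_pct: float                # exit at this loss
    risk_per_trade_pct: float           # fixed-fractional position sizing
    trend_filter: Optional[str] = None  # extra condition that must also hold

    def describe(self) -> str:
        return (f"Buy when {self.indicator}({self.period}) closes below "
                f"{self.entry_below:g} on the {self.timeframe} chart")


# The RSI example from the text; exit and sizing values are placeholders.
RSI_OVERSOLD = StrategyRules("RSI", 14, 30.0, "1d",
                             take_profit_pct=6.0, stop_loss_pct=3.0,
                             risk_per_trade_pct=1.0)
print(RSI_OVERSOLD.describe())
```

Because the dataclass is frozen, the rules cannot be quietly edited once the backtest has started.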

Step 2: Choose Your Data Carefully

Data Quality

Use high-quality OHLCV (Open, High, Low, Close, Volume) data from reputable sources. Common issues with crypto data:

Exchange-specific data: Prices differ across exchanges. Use the exchange you plan to trade on, or an aggregate source.
Missing candles: Some data sources have gaps during outages. These gaps can distort backtests, especially for strategies that rely on consecutive candle patterns.
Survivorship bias: If you test on tokens that exist today, you are only testing survivors. The hundreds of tokens that went to zero are not in your dataset.
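Missing candles are easy to detect before you run anything. A minimal sketch, assuming your candle open times are Unix timestamps in seconds, sorted oldest first (the function name is made up for this example):

```python
def find_missing_candles(timestamps, interval_seconds):
    """Return (last_seen, next_seen, missing_count) for every gap in the series."""
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        step = curr - prev
        if step > interval_seconds:
            # Candles that should sit between the two timestamps.
            missing = step // interval_seconds - 1
            gaps.append((prev, curr, missing))
    return gaps


# Hourly candles with two candles missing after the third one.
hourly = [0, 3600, 7200, 18000]
print(find_missing_candles(hourly, 3600))
```

An empty result means the series is contiguous. Any gap is worth inspecting before you trust a pattern that spans it.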

Data Period

Minimum: 2 years of daily data (to capture at least one full market regime change)
Ideal: 4+ years (to cover a full crypto cycle: bull, bear, recovery, bull)
For intraday timeframes: Scale accordingly. A 4-hour strategy needs 6+ months minimum.

In-Sample vs. Out-of-Sample

Split your data:

In-sample (70%): Use this to develop and optimize your strategy
Out-of-sample (30%): Use this to validate. Never optimize on out-of-sample data.

If your strategy performs well in-sample but poorly out-of-sample, it is overfit to the training data. Go back to the drawing board.
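The split has to be chronological. A random split would leak future information into the in-sample set. A minimal sketch, assuming the candles are already sorted oldest first:

```python
def split_in_out_of_sample(candles, in_sample_fraction=0.7):
    """Split a time-ordered series into (in_sample, out_of_sample) without shuffling."""
    if not 0 < in_sample_fraction < 1:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    cut = round(len(candles) * in_sample_fraction)
    return candles[:cut], candles[cut:]


in_sample, out_of_sample = split_in_out_of_sample(list(range(10)))
print(len(in_sample), len(out_of_sample))
```

The out-of-sample slice is always the most recent data, which is the closest stand-in for the future you will actually trade.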

Step 3: Run the Backtest

Account for Transaction Costs

Always include:

Trading fees: Maker and taker fees for your exchange. On Nado Protocol, maker fees are 0.01% + 0.02% builder = 0.03% per fill. Taker fees are higher (0.055%).
Slippage: Estimate 0.02-0.05% per trade for liquid pairs (BTC, ETH). For altcoins, use 0.1-0.5%.
Funding rates: If your strategy holds perpetual futures positions, funding is a real cost/income. Include it.

A strategy that looks great without fees often looks mediocre or negative with them. This is especially true for high-frequency strategies where fees compound across hundreds of trades.
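To see how quickly costs eat an edge, here is a rough per-trade calculation. It treats fees and slippage as simple percentage deductions on entry and exit, which is an approximation, and the function name is illustrative.

```python
def net_return_pct(gross_pct, fee_pct_per_fill, slippage_pct_per_fill, funding_pct=0.0):
    """Approximate net return of one round trip (entry plus exit), all values in percent."""
    round_trip_cost = 2 * (fee_pct_per_fill + slippage_pct_per_fill)
    return gross_pct - round_trip_cost - funding_pct


# Taker fee of 0.055% per fill and 0.03% slippage, both taken from the ranges above.
net = net_return_pct(gross_pct=0.30, fee_pct_per_fill=0.055, slippage_pct_per_fill=0.03)
print(f"net edge per trade: {net:.2f}%")
```

With these inputs, a 0.30% gross edge per trade shrinks to about 0.13%, so more than half of it goes to costs before funding is even considered.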

Simulate Realistically

Use candle closes, not intra-candle prices: Your daily strategy should generate signals only from completed daily candles. A candle's final high, low, and close are not known until it closes, so a signal that depends on them mid-candle could not have been acted on in real time.
Account for order execution: If your signal triggers on a daily close, your actual entry is the next candle's open — not the close price. This matters.
Respect position limits: If your strategy calls for 5x leverage but the exchange allows 3x, your backtest must reflect the actual limit.
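The first two points can be enforced in the structure of the backtest loop itself. In this sketch the signal function only ever sees closed candles, and every fill happens at the next candle's open. The candle format (dicts with "open" and "close") and the function names are assumptions for the example.

```python
def next_open_entries(candles, signal):
    """Return (signal_index, fill_index, fill_price) with fills at the next candle's open.

    candles: list of dicts with "open" and "close", oldest first.
    signal:  function that receives only closed candles and returns True or False.
    """
    fills = []
    for i in range(len(candles) - 1):      # the last candle has no next open
        history = candles[: i + 1]         # only candles that have already closed
        if signal(history):
            fills.append((i, i + 1, candles[i + 1]["open"]))
    return fills


candles = [
    {"open": 100, "close": 95},
    {"open": 94, "close": 97},
    {"open": 98, "close": 99},
]
# Toy signal: the latest close is below 96.
print(next_open_entries(candles, lambda history: history[-1]["close"] < 96))
```

In the toy data the signal fires on the first close at 95, but the fill is recorded at the next open of 94. That one-candle gap is exactly what a naive backtest misses.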

Step 4: Evaluate the Results

The Key Metrics

1. Net Profit / Total Return: The bottom line. But do not stop here: a 200% return means nothing without context on the risk taken.

2. Maximum Drawdown: The largest peak-to-trough decline. This is the most important risk metric. A strategy with 200% return and 60% max drawdown is far worse than one with 100% return and 20% max drawdown.

Rule of thumb: Expect your max drawdown in live trading to be 1.5-2x what the backtest shows. If the backtest shows 20% drawdown, plan for 30-40% live.
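Maximum drawdown is simple to compute from an equity curve. A minimal sketch, assuming a list of positive account values in time order:

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)                   # highest equity seen so far
        worst = max(worst, (peak - value) / peak)
    return worst


dd = max_drawdown([100, 120, 90, 130, 104])
print(f"backtest max drawdown: {dd:.0%}")
print(f"plan for live: {dd * 1.5:.1%} to {dd * 2:.0%}")
```

The last line applies the 1.5-2x rule of thumb to the measured figure.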

3. Profit Factor: Gross profit divided by gross loss. A profit factor of 1.5 means you make $1.50 for every $1.00 you lose.

Profit Factor	Interpretation
< 1.0	Losing strategy
1.0 - 1.2	Marginal (fees may kill it)
1.2 - 1.5	Decent
1.5 - 2.0	Good
> 2.0	Excellent (verify it is not overfit)

4. Win Rate: The percentage of trades that are profitable. Different strategy types have different expected win rates:

Trend following: 35-45%
Mean reversion: 55-70%
Breakout: 45-55%

If your backtest shows an 85% win rate on a trend-following strategy, something is wrong. You are likely overfit.
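Profit factor and win rate both fall out of a plain list of per-trade PnL values. A minimal sketch, where treating a breakeven trade as a non-win is a choice made for this example:

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss; infinite if there are no losing trades."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def win_rate(trade_pnls):
    """Fraction of trades that closed with a profit."""
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)


pnls = [150.0, -100.0, 300.0, -200.0]
print(profit_factor(pnls), win_rate(pnls))
```

Here two winners worth $450 against two losers worth $300 give a profit factor of 1.5 at a 50% win rate, which shows that the two metrics measure different things.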

5. Sharpe Ratio: Risk-adjusted return. Annualized return (in excess of the risk-free rate) divided by the annualized volatility of returns.

Sharpe Ratio	Interpretation
< 0.5	Poor
0.5 - 1.0	Below average
1.0 - 1.5	Good
1.5 - 2.0	Very good
> 2.0	Excellent (or overfit)
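A Sharpe ratio can be estimated from a series of periodic returns. This sketch assumes daily returns and 365 periods per year (crypto trades every day). Both the default and the function name are assumptions for the example.

```python
import math
import statistics


def annualized_sharpe(period_returns, periods_per_year=365, risk_free_rate=0.0):
    """Annualized Sharpe ratio from per-period returns given as fractions."""
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in period_returns]
    volatility = statistics.stdev(excess)          # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility")
    # mean * N divided by stdev * sqrt(N) simplifies to mean / stdev * sqrt(N)
    return statistics.fmean(excess) / volatility * math.sqrt(periods_per_year)


daily_returns = [0.01, -0.005, 0.02, 0.0, 0.005]
print(round(annualized_sharpe(daily_returns), 2))
```

Five data points are far too few for a meaningful figure. The sample only shows the mechanics.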

6. Number of Trades: Statistical significance requires enough trades. A strategy with 5 trades and 100% win rate tells you nothing. You need at least 30 trades (preferably 100+) for the results to be statistically meaningful.
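To see why the trade count matters, put a confidence interval around the observed win rate. The sketch below uses the Wilson score interval, one common choice for small samples. It is an illustration chosen for this example, not something the metrics above prescribe.

```python
import math


def win_rate_interval(wins, trades, z=1.96):
    """Wilson score interval for the true win rate; z=1.96 is roughly 95% confidence."""
    if trades <= 0 or not 0 <= wins <= trades:
        raise ValueError("need trades > 0 and 0 <= wins <= trades")
    p = wins / trades
    denom = 1 + z * z / trades
    center = (p + z * z / (2 * trades)) / denom
    half = z * math.sqrt(p * (1 - p) / trades + z * z / (4 * trades ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


for wins, trades in [(5, 5), (15, 30), (50, 100)]:
    low, high = win_rate_interval(wins, trades)
    print(f"{wins}/{trades}: {low:.0%} to {high:.0%}")
```

Five wins out of five is still consistent with a true win rate below 60%. A 50% win rate over 30 trades leaves an interval of roughly 33% to 67%, and 100 trades narrow it to roughly 40% to 60%.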

Step 5: Stress Test

Walk-Forward Analysis

The gold standard for strategy validation. Instead of a single train/test split, roll the window forward:

Optimize on months 1-12, test on months 13-15
Optimize on months 4-15, test on months 16-18
Optimize on months 7-18, test on months 19-21
... and so on

If the strategy performs consistently across all out-of-sample windows, it is robust. If it only works in some windows, it may be overfit to specific market conditions.
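The rolling schedule above can be generated rather than written by hand. A minimal sketch that reproduces the 12-month train, 3-month test pattern, with months numbered from 1 and a made-up function name:

```python
def walk_forward_windows(total_months, train_months=12, test_months=3):
    """Return ((train_start, train_end), (test_start, test_end)) month ranges, inclusive."""
    windows = []
    start = 1
    while start + train_months + test_months - 1 <= total_months:
        train = (start, start + train_months - 1)
        test = (train[1] + 1, train[1] + test_months)
        windows.append((train, test))
        start += test_months           # roll forward by one test window
    return windows


for train, test in walk_forward_windows(21):
    print(f"optimize on months {train[0]}-{train[1]}, test on months {test[0]}-{test[1]}")
```

Each test window starts right after its training window ends, so no out-of-sample month is ever used for optimization in the same step.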

Regime Analysis

Break your backtest into market regimes and evaluate each separately:

Regime	How to Identify	Expected Behavior
Strong uptrend	BTC up 20%+ in 30 days	Trend strategies excel
Sideways	BTC range < 10% for 30 days	Range strategies excel
Crash	BTC down 20%+ in 7 days	Short strategies/hedges work
Recovery	BTC up 30%+ from a low in 30 days	Momentum strategies excel

A strategy that only works in one regime is not wrong — it just needs to be paired with complementary strategies for other regimes.
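The thresholds in the table can be turned into a rough labeling function. The table does not say which label wins when several conditions hold, so the priority order below (crash, then uptrend, then recovery, then sideways) is an assumption of this sketch, as is the function name.

```python
def classify_regime(closes):
    """Label the latest day from at least 31 daily BTC closes, oldest first."""
    if len(closes) < 31:
        raise ValueError("need at least 31 daily closes")
    window = closes[-31:]                  # today plus the 30 days before it
    last, low, high = window[-1], min(window), max(window)
    change_7d = last / closes[-8] - 1
    change_30d = last / window[0] - 1
    if change_7d <= -0.20:
        return "crash"
    if change_30d >= 0.20:
        return "strong uptrend"
    if last / low - 1 >= 0.30:             # bounced 30%+ off the window low
        return "recovery"
    if high / low - 1 < 0.10:              # whole window inside a 10% range
        return "sideways"
    return "unclassified"


print(classify_regime([100] * 10 + [70] * 20 + [95]))
```

Labeling every day of the backtest this way lets you group trades by the regime they were opened in and compare the metrics per group.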

Monte Carlo Simulation

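Randomly resample your backtest trades and recompute the equity curve 1,000 times. Shuffling the order of the trades shows how path-dependent the drawdown is; drawing trades with replacement also varies the total return, which a pure reshuffle leaves unchanged. Either way, you see a range of possible outcomes, not just the single historical path.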

Key outputs:

95th percentile max drawdown (what is the worst realistic drawdown?)
5th percentile total return (what is the worst realistic return?)
Probability of ruin (what is the chance of hitting -50% at any point?)

If the 95th percentile drawdown is beyond your tolerance, reduce position sizing until it is not.
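A minimal sketch of the simulation follows. It draws trades with replacement (a bootstrap) instead of only reordering them, because reordering alone leaves the compounded total return identical in every run. That substitution, the compounding of per-trade returns, and all the names here are assumptions of the example.

```python
import random


def monte_carlo(trade_returns, runs=1000, ruin_level=0.50, seed=42):
    """Bootstrap per-trade returns (fractions, e.g. 0.02 for +2%) into many equity paths."""
    rng = random.Random(seed)              # seeded so results are reproducible
    drawdowns, totals, ruined = [], [], 0
    for _ in range(runs):
        # Draw with replacement so both the order and the mix of trades vary.
        path = rng.choices(trade_returns, k=len(trade_returns))
        equity = peak = 1.0
        worst_dd = 0.0
        hit_ruin = False
        for r in path:
            equity *= 1 + r
            peak = max(peak, equity)
            worst_dd = max(worst_dd, 1 - equity / peak)
            if equity <= 1 - ruin_level:   # account down 50% from its starting value
                hit_ruin = True
        drawdowns.append(worst_dd)
        totals.append(equity - 1)
        ruined += hit_ruin
    drawdowns.sort()
    totals.sort()
    return {
        "max_drawdown_p95": drawdowns[min(runs - 1, int(0.95 * runs))],
        "total_return_p5": totals[int(0.05 * runs)],
        "probability_of_ruin": ruined / runs,
    }


print(monte_carlo([0.05, -0.04, 0.03, -0.02] * 10))
```

To apply the sizing rule above, scale every trade return by the same factor and re-run until the 95th percentile drawdown fits your tolerance.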

Common Backtesting Mistakes

1. Overfitting (The Cardinal Sin)

Adding parameters until the backtest looks perfect. If your strategy has 8+ optimized parameters for a dataset of 100 trades, you are almost certainly overfit. A robust strategy should have 3-5 parameters at most.

Test for overfitting: Change each parameter by 10-20%. If performance collapses, the strategy is fragile and likely overfit. Robust strategies degrade gracefully — a 10% parameter change produces a 5-10% performance change, not a 50% one.
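The perturbation test is easy to automate. In this sketch, evaluate is a hypothetical function you supply that runs your backtest for a given parameter set and returns one positive score, such as net profit. The toy scoring function at the bottom only exists to make the example runnable.

```python
def parameter_sensitivity(evaluate, params, bump=0.10):
    """For each parameter, return the worst relative score change from a +/- bump."""
    baseline = evaluate(params)            # assumed to be non-zero
    report = {}
    for name, value in params.items():
        changes = []
        for factor in (1 - bump, 1 + bump):
            trial = dict(params, **{name: value * factor})
            changes.append(evaluate(trial) / baseline - 1)
        report[name] = max(changes, key=abs)
    return report


def toy_score(p):
    # Stand-in for a real backtest: smooth in "period", sharp in "threshold".
    return 100.0 - (p["period"] - 14) ** 2 - 10 * abs(p["threshold"] - 30)


print(parameter_sensitivity(toy_score, {"period": 14, "threshold": 30}))
```

In the toy case a 10% change in period moves the score by about 2%, while the same change in threshold moves it by about 30%. The second parameter is the fragile one. Integer parameters such as indicator periods would need rounding before being passed to a real backtest.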

2. Look-Ahead Bias

Using information that would not have been available at the time of the trade. Common examples:

Using the daily high/low to set entries (you do not know the high/low until the day ends)
Buying at the low of a crash (you did not know it was the low at the time)
Using future data in indicator calculations

3. Ignoring Market Impact

If your strategy trades large positions relative to the market's liquidity, your orders will move the price. A backtest assumes you can buy at the current price — but in reality, a $500,000 market buy on a $5M daily volume altcoin will cause significant slippage.
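A quick sanity check is to compare each order with the market's daily volume. The 1% ceiling in this sketch is an assumed placeholder, not an established limit, and the function names are illustrative.

```python
def participation_rate(order_usd, daily_volume_usd):
    """Order size as a fraction of the market's daily traded volume."""
    return order_usd / daily_volume_usd


def fill_is_plausible(order_usd, daily_volume_usd, max_participation=0.01):
    """True if the order is small enough that quoted prices are a fair assumption."""
    return participation_rate(order_usd, daily_volume_usd) <= max_participation


rate = participation_rate(500_000, 5_000_000)
print(f"{rate:.0%} of daily volume, plausible fill: {fill_is_plausible(500_000, 5_000_000)}")
```

The example from the text comes out at 10% of daily volume, far beyond anything a backtest can fill at the quoted price.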

4. Cherry-Picking the Test Period

Testing a long-only strategy from March 2020 (the COVID crash bottom) produces spectacular results. Testing from November 2021 (the cycle top) produces terrible results. Always test across full market cycles.

Backtesting on Otomate

Otomate's Strategy Builder includes a built-in backtesting engine. When you define a strategy in natural language, you can backtest it before deploying real capital.

How it works:

Historical data is sourced from the Hyperliquid candlestick API (1-hour candles)
The same condition evaluator used in live trading evaluates historical data
Backtest periods: 7, 14, 30, or 90 days
Metrics provided: total PnL, win rate, total trades, max drawdown, profit factor
Requires 210+ candles for strategies using EMA200 (warmup period)
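The warmup requirement interacts with the choice of backtest period. The arithmetic below assumes the engine loads only the candles inside the selected period. That is an assumption of this sketch, since the engine may fetch extra history for warmup.

```python
CANDLES_PER_DAY = 24      # 1-hour candles
EMA200_WARMUP = 210       # minimum candles quoted above


def candles_available(period_days):
    """Number of 1-hour candles in a backtest period of the given length."""
    return period_days * CANDLES_PER_DAY


for days in (7, 14, 30, 90):
    count = candles_available(days)
    print(f"{days} days: {count} candles, enough for EMA200: {count >= EMA200_WARMUP}")
```

Under that assumption a 7-day period yields 168 candles, short of the 210 needed, so an EMA200 strategy would call for a period of 14 days or longer.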

Example workflow:

Define: "Go long BTC when EMA 9 crosses above EMA 21 and RSI is below 60. Close when EMA 9 crosses below EMA 21 or loss exceeds 3%."
Backtest: Run on 90 days of data
Evaluate: Check win rate, profit factor, max drawdown
Refine: Adjust parameters if needed
Deploy: Activate the strategy on your Nado subaccount

The backtesting engine uses the same condition evaluator as the live worker, so there is no discrepancy between backtest logic and live execution logic. Your backtest results reflect exactly how the strategy will be evaluated in real time.

From Backtest to Live: The Transition

Even after a positive backtest, do not go straight to full size:

Paper trade (2 weeks): Run the strategy on a small account or track it manually. Verify the live signals match the backtest signals.
Small live (4 weeks): Trade with 25% of your intended capital. Evaluate live performance against backtest expectations.
Scale up (gradual): If live results reach at least 80% of what the backtest predicted for the same period, increase to 50%, then 75%, then 100% of capital over subsequent months.

If live performance is less than 50% of backtest performance after 4 weeks, stop and investigate. The discrepancy is telling you something — slippage, execution delays, changed market conditions, or an overfit backtest.
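The thresholds above can be written as a simple check. The text does not say what to do between 50% and 80%, so holding the current size in that band is an assumption of this sketch, as is the function name.

```python
def transition_decision(live_return, backtest_return):
    """Compare live results with backtest expectations for the same period."""
    if backtest_return <= 0:
        raise ValueError("expects a positive backtest return")
    ratio = live_return / backtest_return
    if ratio < 0.5:
        return "stop and investigate"
    if ratio >= 0.8:
        return "scale up"
    return "hold current size"


print(transition_decision(live_return=9.0, backtest_return=10.0))
print(transition_decision(live_return=3.0, backtest_return=10.0))
```

Feed it returns measured over the same window and at the same position size, otherwise the ratio is not comparing like with like.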

Backtesting is not a crystal ball. It is a filter that separates ideas worth testing from ideas that are not. Use it as the first step in a rigorous validation process, not the last.

Don't trade. Automate.