A backtest answers one question: “If I had run this strategy on historical data, what would have happened?” It’s the only rigorous way to evaluate a systematic strategy before committing real money. No backtest, no basis for conviction.

But backtests are also notoriously easy to abuse. Every strategy looks excellent on the data it was designed on. The skill is building backtests that are honest, and reading other people’s backtests with appropriate scepticism. After 17 years of building systematic strategies, I can tell you: most backtests you’ll encounter are optimistic. Some are outright deceptive.

The backtest gap | what honest testing looks like
Metrics drawn from 38 Nifty 500 backtests audited at QCAlpha since 2019. Only 6 survived the honesty checklist intact.
- 38 strategies audited (QCAlpha research desk)
- 9.6 pp median CAGR overstatement before the fix (vs clean rerun)
- 0.42 Sharpe drop after the delisted universe was added back (survivorship removal)
- 16% passed all three bias checks (6 of 38)
Most pitch-deck backtests crumble the moment you include companies that died and costs that actually apply.
1. Point-in-time data | no future peeks, delisted names included
2. Full cost stack | STT, GST, stamp duty, slippage
3. Walk-forward | rolling out-of-sample
4. Stress windows | 2008, 2013, 2018, 2020
5. Paper then live | micro capital first
Every skipped step is a point of overstatement. Skipping costs alone adds 3 to 5 pp to annual returns.
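Step 3, walk-forward testing, amounts to rolling a train/test split through time. A minimal sketch in Python; the 36-month training window and 6-month test window are illustrative assumptions, not a recommendation:

```python
# Rolling walk-forward: fit on a trailing training window, evaluate on the
# next out-of-sample slice, then roll forward by one test window.
# Window lengths here are illustrative, not a recommendation.
def walk_forward_splits(n_months, train=36, test=6):
    """Return (train_start, train_end, test_end) month indices."""
    splits, start = [], 0
    while start + train + test <= n_months:
        splits.append((start, start + train, start + train + test))
        start += test  # advance by the out-of-sample window
    return splits

# 10 years of monthly data -> rolling 3y-train / 6m-test windows
splits = walk_forward_splits(120)
```

Only the test slices count as out-of-sample performance; stitching them together gives the walk-forward equity curve.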
Where the CAGR overstatement comes from
- Cost skip: 34%
- Survivorship: 28%
- Look-ahead: 22%
- In-sample tuning: 16%
Round-trip cost stack on a 30-stock Nifty 500 rebalance
- STT + GST: 40%
- Slippage: 26%
- Brokerage: 18%
- Stamp + exchange: 16%
For Indian equities, STT alone eats 20 bps round-trip on the cash leg. Slippage on midcap names adds 30 to 50 bps in a rebalance week.
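The cost stack above can be tallied in basis points. The rates below are rough, illustrative approximations of the Indian delivery-trade charges discussed here, not the exact current NSE/SEBI schedules; check the live schedules before relying on them:

```python
# Round-trip cost stack for a single delivery trade, in basis points of
# traded value. Every rate below is an illustrative approximation.
def round_trip_cost_bps(slippage_bps=15):
    stt = 10 + 10                 # STT: ~0.1% on buy and 0.1% on sell (delivery)
    stamp = 1.5                   # stamp duty: ~0.015%, buy side only
    exchange = 0.35 * 2           # exchange transaction charges, both sides
    brokerage = 3 * 2             # assume ~0.03% per side; flat-fee brokers differ
    gst = 0.18 * (exchange + brokerage)  # 18% GST on brokerage + exchange charges
    return stt + stamp + exchange + brokerage + gst + 2 * slippage_bps
```

With 15 bps of slippage per side, this lands near 60 bps round-trip, in line with a 0.5% assumption; midcap slippage of 30 to 50 bps pushes it well past that.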
CAGR drop after honest rerun | 2015 to 2025, Nifty 500 base
Screener viral strategy A | 34.2% → 16.8%
YouTube momentum 30 | 27.4% → 19.1%
Quality + value blend | 22.8% → 18.3%
Nifty 200 Momentum 30 TRI | 20.1% → 18.7%
Nifty 500 TRI passive | 14.2% → 13.8%
Audited numbers shrink in proportion to how aggressive the original assumptions were. Source: RupeeCase rerun on CMIE Prowess + NSE adjusted closes.
TK | The 2016 momentum trap I walked into
My first standalone momentum backtest in 2016 showed 38.4% CAGR over 2010 to 2015. I was ready to fund it with ₹30 lakh of personal capital. Before doing so, I rebuilt the universe to include delisted names: 14 stocks that went to zero between 2011 and 2015 had dropped out of my database. Added them back, and CAGR fell to 26.8%. Added the real Indian cost stack: another 3.1 pp gone. Added 6-month walk-forward out-of-sample testing: it dropped to 19.4%. Still good, but nowhere near 38. That ₹30 lakh went in sized for 19% expectations, not 38%. Position size saved me when the 2018 midcap drawdown hit. If I had sized for the original number, I would have blown through my risk budget in one quarter.

What a backtest actually does

1. Define the rules
Specify the signal (e.g., 12M-1M momentum), universe (Nifty 500), portfolio size (top 30 stocks), rebalance frequency (monthly), position sizing (equal weight), and cost assumptions (0.5% round-trip).
2. Loop through historical dates
For each rebalance date, apply the rules to the data available at that point: rank all 500 stocks by momentum and select the top 30. Critical: you can only use data that would have been available on that date, never future data.
3. Simulate trades and costs
Calculate which stocks enter and exit. Apply realistic transaction costs: brokerage, STT (0.1% both sides), exchange charges, GST, stamp duty, and slippage. This is where most backtests fail: they either ignore costs or use unrealistically low ones.
4. Calculate returns and metrics
Track the simulated portfolio value over time. Calculate CAGR, max drawdown, Sharpe, Sortino, and alpha. Build the equity curve and underwater plot.
5. Compare to benchmark
Run the same period for the Nifty 500 TRI (Total Return Index, which includes reinvested dividends). Compare CAGR, drawdown, and Sharpe. Alpha = strategy return minus the beta-adjusted benchmark expectation.
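The loop in steps 2 to 4 can be sketched in a few lines of Python. This is a toy illustration with invented prices for three fictional tickers, not the RupeeCase engine; the top-2 portfolio, 6-month lookback, and cost treatment are deliberate simplifications:

```python
# Toy backtest skeleton over made-up monthly closes. Illustrates the loop
# structure only; every number here is invented.
prices = {
    "AAA": [100, 104, 110, 108, 115, 121, 118, 126, 131, 128, 135, 140, 146],
    "BBB": [100, 101, 99, 103, 102, 105, 104, 107, 106, 109, 108, 111, 112],
    "CCC": [100, 96, 92, 95, 90, 88, 91, 87, 85, 88, 84, 82, 80],
}
COST = 0.005  # 0.5% round-trip cost, charged on new entries at each rebalance

def momentum(stock, t, lookback=6):
    """Trailing return over `lookback` months, using only data up to month t."""
    return prices[stock][t] / prices[stock][t - lookback] - 1

equity, curve, held = 1.0, [1.0], set()
for t in range(6, 12):                       # each monthly rebalance date
    ranked = sorted(prices, key=lambda s: momentum(s, t), reverse=True)
    target = set(ranked[:2])                 # hold the top 2 by momentum
    turnover = len(target - held) / len(target)
    equity *= 1 - COST * turnover            # pay costs proportional to turnover
    ret = sum(prices[s][t + 1] / prices[s][t] - 1 for s in target) / len(target)
    equity *= 1 + ret                        # next month's equal-weight return
    held = target
    curve.append(equity)

months = len(curve) - 1
cagr = curve[-1] ** (12 / months) - 1        # annualised growth of the curve
peak, max_dd = curve[0], 0.0
for v in curve:                              # max drawdown from the curve
    peak = max(peak, v)
    max_dd = max(max_dd, 1 - v / peak)
```

Note the discipline baked into `momentum`: at rebalance date `t` it only ever reads prices up to index `t`, never beyond.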

The key discipline: every calculation at each historical date must use only data available at that point in time. You cannot use today’s knowledge to make decisions that were simulated in 2015. This sounds obvious, but it’s violated constantly.

The three biases that make backtests misleading

⚠ Survivorship Bias
Running a backtest only on stocks that are currently in the Nifty 500, without including companies that were in the index historically but later delisted, merged, or went bankrupt. The historical universe looks better than it was because you’ve removed all the failures.
Indian example: Yes Bank, Unitech, DHFL, and Jet Airways were once Nifty 500 constituents. Excluding them from your historical backtest universe because they’re no longer listed inflates historical performance. RupeeCase uses point-in-time constituent data: the universe at each historical date reflects what was actually in the Nifty 500 at that time.
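One way to implement point-in-time membership is to store an entry and exit date for each constituent and filter on the decision date. A minimal sketch; the tickers and dates below are illustrative, not an exact constituent history:

```python
# Point-in-time universe lookup: each constituent carries the date interval
# during which it was actually an index member. Dates are illustrative.
MEMBERSHIP = {
    "YESBANK": ("2005-01-01", "2020-03-31"),   # exited after collapse
    "DHFL":    ("2013-01-01", "2019-09-30"),
    "INFY":    ("1999-01-01", "9999-12-31"),   # still a member
}

def universe_on(date_iso):
    """Index members on `date_iso` (ISO date strings compare lexically)."""
    return sorted(s for s, (start, end) in MEMBERSHIP.items()
                  if start <= date_iso <= end)
```

A backtest built on today's membership list would never see YESBANK or DHFL in 2018; this lookup does.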
⚠ Look-Ahead Bias
Using data that would not have been available at the time of the simulated decision. Indian companies publish quarterly results 45 to 60 days after quarter end. Using Q1 results (April to June) to make investment decisions on July 1 contaminates the backtest with information that wasn’t available until mid-August.
Indian example: a strategy using March-quarter earnings to make April 1 buy decisions is using look-ahead data; those results were only published in late May. RupeeCase applies a conservative reporting lag: fundamental data is only considered available after the typical NSE publication window.
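A reporting-lag guard can be a one-line date check: a quarter's fundamentals only become usable a fixed number of days after quarter end. The 60-day lag below is an assumed, deliberately conservative figure:

```python
from datetime import date, timedelta

# A quarter's fundamentals become usable only `lag_days` after quarter end.
# The 60-day lag is an assumed, conservative publication window.
def available_on(quarter_end, decision_date, lag_days=60):
    """True if results for `quarter_end` count as published by `decision_date`."""
    return decision_date >= quarter_end + timedelta(days=lag_days)

march_q = date(2024, 3, 31)
# Under this lag, March-quarter results are not usable for an April 1
# decision, but are usable by mid-June.
```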
⚠ Overfitting (Data Mining Bias)
Testing hundreds of parameter combinations, different lookback windows, portfolio sizes, rebalance frequencies, and reporting only the one that performed best historically. Any strategy will look brilliant on the exact data it was optimised on. The question is whether it will work on new, unseen data.
How to spot it: a strategy with very specific parameters (“rebalance on the 8th of each month, use a 143-day lookback, hold exactly 17 stocks”) is almost certainly overfitted. Robust strategies produce similar results across a range of reasonable parameters; the exact value doesn’t matter much.
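A robustness sweep is easy to automate: run the same backtest across adjacent parameter values and check that the results cluster. A sketch, with a hypothetical stand-in scoring function in place of a real backtest:

```python
# Sweep adjacent lookbacks and flag the strategy as fragile if results
# diverge sharply. `backtest_cagr` is a hypothetical stand-in scorer.
def backtest_cagr(lookback_months):
    # A robust strategy's CAGR varies gently with the parameter
    # instead of spiking at one magic value (smooth toy response).
    return 0.18 - 0.004 * abs(lookback_months - 11)

results = {lb: backtest_cagr(lb) for lb in (9, 10, 11, 12)}
spread = max(results.values()) - min(results.values())
robust = spread < 0.03  # adjacent params within ~3 pp of each other
```

If swapping a 10-month lookback for an 11-month one halves the CAGR, the single good number was noise, not signal.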

The honest backtest checklist

Transaction costs included? Minimum 0.3% round-trip for large caps, 0.5 to 1% for mid/small caps. RupeeCase uses 0.5%. A backtest without costs is fiction.
Point-in-time constituent data? Universe at each historical date must reflect what was actually in the index at that time, including companies that later failed.
Realistic reporting lag for fundamentals? If the strategy uses earnings or balance sheet data, the backtest must respect when results were actually published on NSE/BSE.
Multiple market cycles covered? A 10-year backtest covering only 2014 to 2024 misses the 2008 crash. Good backtests cover at least one major bear market.
Parameter robustness shown? Performance should be similar across adjacent parameter values (9M, 10M, 11M, 12M momentum lookback). If only one exact value works, it’s likely overfitted.
Red flag: Only CAGR shown. No drawdown, Sharpe, or alpha shown? They’re hiding unflattering numbers. Every strategy has drawdowns.
Red flag: Sharpe above 2.5 on long-only equity. Real-world factor strategies achieve Sharpe of 0.8 to 1.5. Sharpe of 3+ almost certainly means overfitting or a bug.
Red flag: “This is our best parameter set.” If they only show the optimised result without showing how adjacent parameters performed, overfitting is likely.

The golden rule: A backtest can tell you what would have happened. It cannot tell you what will happen. Use it to eliminate strategies that couldn’t have worked, not to guarantee strategies that will work. The best backtests are honest about this distinction.

Sources: NSE Historical Data & Corporate Filings (backtest data source); NSE Indices, Nifty 500 Historical Constituent Data.
◆ How RupeeCase builds backtests
Every RupeeCase backtest uses point-in-time Nifty 500 constituent data, 0.5% round-trip cost assumption, realistic reporting lags for fundamental signals, and covers the full available NSE history. Parameters are tested across a range of reasonable values, not just the best-looking one. The platform shows all six risk metrics so you can evaluate strategies honestly.
Honest backtests, always
Run your own backtests on 10+ years of NSE data, costs included
Point-in-time data. No survivorship bias. Full six-metric tearsheet.
Start free →
A note from the author
Most backtests I see are lying, often unintentionally

In 17 years of systematic trading, I’ve built hundreds of backtests and reviewed hundreds more. The honest ones are rare. The most common issues: costs ignored, survivorship bias hidden, parameters cherry-picked. Sometimes it’s intentional; more often, people genuinely don’t realise that their backtest universe silently excludes companies that were delisted during the test period.

At RupeeCase, we obsess over backtest honesty because inflated historical performance sets investors up for disappointment. Our 0.5% round-trip cost assumption is conservative on purpose. Our point-in-time data is the real work: maintaining historical constituent membership is genuinely hard. But without it, the numbers are meaningless.

Tanmay Kurtkoti
Founder & CEO, RupeeCase · 17 years systematic trading · QCAlpha
Put learning into practice. Every concept in Path 2 maps directly to a tool in the RupeeCase terminal.
Explore terminal →
Glossary, Module 2.4
Backtest
A simulation of how a strategy would have performed if applied to historical data according to its defined rules.
Survivorship bias
Distortion from excluding failed/delisted companies from a historical universe, inflating apparent past performance.
Look-ahead bias
Using data in a simulated historical decision that would not actually have been available at that date.
Overfitting
Tuning strategy parameters to fit historical data, producing excellent backtest results but poor forward performance.
Out-of-sample testing
Evaluating a strategy on data not used during design. The most reliable test of genuine robustness vs overfitting.
Point-in-time data
Historical data that accurately reflects what was known at each past date, no contamination from subsequently revealed information.

Calculator

Drawdown Recovery Calculator

Years required to climb back to a prior peak from a given drawdown at a given CAGR.
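The calculator's math: if a drawdown of d must be recovered at a compound growth rate g, then (1 − d)(1 + g)^t = 1, so t = ln(1/(1 − d)) / ln(1 + g). A minimal sketch:

```python
import math

# Years to reclaim a prior peak: solve (1 - d) * (1 + g)**t = 1 for t.
def recovery_years(drawdown, cagr):
    """`drawdown` and `cagr` as fractions, e.g. 0.30 and 0.12."""
    return math.log(1 / (1 - drawdown)) / math.log(1 + cagr)
```

A 30% drawdown at 12% CAGR needs roughly 3.1 years to recover; a 50% drawdown at the same rate needs more than 6.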

Quick check, Module 2.4

3 questions. Get 2 right to mark this module complete.

Up next, Module 2.5
Building Your First Systematic Strategy
Universe, signal, sizing, costs, backtest, and going live. A complete walkthrough of how a systematic strategy is actually built from scratch.
Continue →
PRACTICE WHAT YOU LEARNED
Try systematic strategies on RupeeCase | free paper trading.
Get Started Free →