How to Evaluate Startup Founders and Deals: An Angel Investor's Framework

July 21, 2026, 11 min read

Angel investors don't have months for diligence, but they do need a structured first-pass filter. The most-cited research links more diligence to higher returns (the Kauffman Foundation and Angel Capital Association study), though as this guide shows, that link is widely misread. The leading practitioner frameworks all agree on the dominant variable: founder quality. Y Combinator emphasizes determination. NFX leads with founder-market fit. Hustle Fund uses forced-choice scoring. A defensible process runs three gates: a founder filter, a market and model test, and deal terms read as a character signal. Each gate answers a different question: is the founder worth backing, does the market and the math hold, and are the terms fair to the people writing the first checks? This guide synthesizes those frameworks into a practical first-call rubric you can run in 30 minutes.

This is educational content, not investment, legal, or financial advice. Angel investing involves significant risk, including total loss of capital. All examples are illustrative only. Only accredited investors should consider angel investing, and individual suitability varies materially. Consult a financial advisor before making investment decisions.

What the research says about diligence hours and returns

According to the Wiltbank and Boeker study (ACA and Kauffman Foundation, 2007), across 538 angels and 3,097 investments, time spent before the check correlated with higher return multiples. As Play Money reads it, that correlation is real but widely misread. The study is the most-cited quantitative source in angel investing education, and the anchor for nearly every diligence argument you will hear.

Due diligence hours versus return multiples (Wiltbank and Boeker, 2007):

Below median, under 20 hours: 1.1x return multiple, and 65% of these exits returned less than the original investment.
Above median, 20 to 40 hours: 5.9x return multiple. This is the core threshold.
Top quartile, over 40 hours: 7.1x return multiple, the highest-diligence cohort.
Monthly portfolio engagement: 3.7x in four years, versus 1.3x for annual engagement. Diligence does not stop at the wire.

Two more findings from the same dataset matter for how you build a portfolio. Investment multiples were twice as high when angels invested inside their own domain expertise, where they carried an average of 14 years of relevant experience. And the top 10% of exits produced 75% of total cash returns, which is the power-law structure that makes diversification non-negotiable.

The average return across the full sample was 2.6x in 3.5 years, roughly a 27% IRR, per Seraf's research summary. For current context, the ACA 2025 Angel Funders Report covering 2024 exits found the median MOIC (multiple on invested capital) across non-shutdown exits was 1.3x, and 25% of 2024 exits returned less than the original capital, up from 21% in 2022. That loss-of-capital rate is exactly why disciplined rubrics matter in years without 30x outliers. The singles and doubles are real. So is the downside.

The number gets read as a promise: log 20 hours, earn the 5.9x. Read that way, it misleads, for two reasons. First, correlation is not causation. The Wiltbank data is self-reported and from 2007, and the likeliest driver is selection: experienced, domain-expert angels both spend more time on diligence and pick better companies. The same dataset shows it, with multiples doubling when angels invest inside their own domain expertise. The hours track the quality of the investor and do not, on their own, produce the return. Second, the number counts the wrong unit. It prescribes solo diligence hours, but on a syndicate or platform the diligence is proxied to the deal lead. Writing a check behind a lead, the hours you personally log matter far less than whether that lead did rigorous work and whether you trust their process.

This changes what the hours are for. In the syndicate and lead model most angels actually operate in, the bulk of the work sits with the deal lead, and the demand on any one founder's time stays small by design. Multiple hours of an early-stage founder's time is a lead's job. Asking every angel on the cap table to repeat it slows down the company you just funded, and it does something worse over time.

Multiple hours of an early-stage founder's time on diligence is too much for anything but a lead check. Making that founder redo the work for every downstream angel because you won't share what you did is worse. You're slowing down the business you just funded. Reputations travel fast, and too much diligence becomes negative selection: the strong founders won't sit through it, so you end up attracting the weaker ones.

Cheryl Kellond, founder of Play Money, on Angels Decoded, Episode 20: The Three Deadly Sins of Local Angel Groups.

The lead model also settles a worry that stops a lot of new angels: the fear that they need to personally understand every technology before they can invest. The deep, domain-specific work gets done once, by whoever is closest to the problem, and then it travels to everyone else on the deal.

You don't have to be the technical expert on every deal. That's what a strong lead, or a platform like Play Money, is for: the deep diligence gets done once and shared, so you can build a real portfolio instead of re-running hours on every check.

Cheryl Kellond, founder of Play Money.

There is a deeper reason not to over-diligence a single deal. The variable that actually separates good angel outcomes from bad ones is portfolio size. Pour 60 hours into one company and you have traded the forest for a tree: angel returns follow a power law, where a small number of deals carry the whole book, so breadth of good, right-sized checks beats depth on any one of them.

Angel portfolios work because of diversification. Optimizing for 30+ investments over time will deliver a better return than optimizing for a single variable across a handful of deals. The game isn't to find 'the one perfect deal,' it's to write enough right-sized checks into deals with edge.

Cheryl Kellond, founder of Play Money.
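
To make the diversification argument concrete, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that each check independently has a 10% chance of landing among the top-decile outcomes that carry most of the returns. Real deals are not independent and the 10% hit rate is a hypothetical input, so treat the output as intuition and not as a forecast.

```python
def chance_of_top_decile_hit(num_deals: int, hit_rate: float = 0.10) -> float:
    """Probability that at least one of num_deals lands in the top decile.

    Assumes each deal is an independent draw with the same hit_rate,
    which is an illustrative simplification, not an empirical claim.
    """
    if num_deals < 0:
        raise ValueError("num_deals must be non-negative")
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Complement of "every single deal misses".
    return 1.0 - (1.0 - hit_rate) ** num_deals


if __name__ == "__main__":
    for n in (5, 10, 30):
        print(f"{n} checks: {chance_of_top_decile_hit(n):.0%} chance of at least one hit")
```

Under these assumptions, a handful of checks leaves the portfolio more likely than not to miss the outlier entirely, while 30 checks makes at least one hit very likely. That is the sense in which breadth beats depth.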

The 30-minute first call: what to evaluate and what to defer

The hardest question in angel diligence is not what to ask. It is what to ask first, when you have 30 minutes and a founder who has practiced their pitch 200 times. Seraf's diligence overview and the ACA playbook cover the full process. This is what an experienced angel actually prioritizes in the opening call.

Investor update cadence

One question does more screening work than any other. One Play Money angel who has backed Ollie and Invisalign and leads the Bairitone Health syndicate uses it as her primary filter: "I always ask: 'Can you send me the last investor update you sent, and how often do you send them?' If they pause, hedge, or say it's once a year, I'm out." A founder who already sends monthly updates is telling you how they will treat you after the check clears.

Customer specificity, not mission fluency

The filter that separates surface conviction from operational depth: can the founder name who is desperate for this product and why they will pay again next month? Stories that lean on future behavior change or "once the market matures" are warning signs. As the Play Money evaluation newsletter puts it, "Angels reward conversational momentum and confuse it with operational momentum." Those two things correlate far less than they appear to in a polished pitch.

Self-awareness under pressure

Ask what part of running this business the founder expects to be bad at. A founder who answers cleanly, admits uncertainty, and pushes back thoughtfully on a deliberately bad idea is showing intellectual honesty. The founder who agrees with everything you say is a red flag, not a pleasure to work with.

Bottoms-up market sizing

Most TAM slides, the total addressable market a startup claims it can reach, are built top-down and are wrong. The better test runs the other direction: how many targets exist in the market, at what price, with what repurchase behavior?

Bottoms up forces a founder to understand who really pays, how often, out of which budget, and what has to be true for that to scale.

Cheryl Kellond, founder of Play Money, in a Play Money Angel 101 session.
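
The bottoms-up calculation can be sketched in a few lines: count the buyers, multiply by price, multiply by how often they repurchase. The segment names and every number below are hypothetical placeholders, not data from any real company.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One group of target customers (all fields are illustrative inputs)."""
    name: str
    buyers: int                  # how many targets exist in this segment
    price_per_purchase: float    # what each one pays per purchase
    purchases_per_year: float    # repurchase behavior

    def annual_revenue(self) -> float:
        return self.buyers * self.price_per_purchase * self.purchases_per_year


def bottoms_up_market(segments: list[Segment]) -> float:
    """Sum annual revenue potential across segments."""
    return sum(s.annual_revenue() for s in segments)


# Hypothetical example: two segments paying monthly.
segments = [
    Segment("independent clinics", buyers=4_000, price_per_purchase=300.0, purchases_per_year=12),
    Segment("regional chains", buyers=150, price_per_purchase=2_500.0, purchases_per_year=12),
]

if __name__ == "__main__":
    for s in segments:
        print(f"{s.name}: ${s.annual_revenue():,.0f} per year")
    print(f"total: ${bottoms_up_market(segments):,.0f} per year")
```

The value of the exercise is less the total than the inputs: a founder who can defend each buyer count, price, and repurchase rate is showing the operational depth this section is testing for.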

What to defer out of the first call: full cap table review until close of the call, reference checks until after a term sheet, legal and IP verification until the deal room, and detailed financial model scrutiny until a second meeting once founder quality is confirmed. Spend the scarce first 30 minutes on the things only a live conversation can reveal.

Founder evaluation frameworks: what the leading practitioners say

Three frameworks dominate angel diligence. They agree more than they disagree, and where they diverge, the divergence is itself useful. This section synthesizes them without taking sides.

Y Combinator: determination over intelligence

Y Combinator evaluates founders on five traits: determination, flexibility, imagination, naughtiness, and the quality of the co-founder relationship. Paul Graham's central finding was that determination, not raw intelligence, predicted success. YC's well-known screen asks founders to describe a time they hacked something to their advantage. Scrappiness and a willingness to challenge bad rules are genuine signals. The distinction that matters is which rules the founder chooses to challenge.

NFX: founder-market fit as the primary filter

NFX frames the primary question in its 4 Signs of Founder-Market Fit: does this founder have unique insight into this market that others lack? The four signals are lived experience in the problem domain, obsessive knowledge of customers, unfair access to distribution, and personal authenticity in explaining why they are the right person to build this. NFX treats team composition as the highest-signal diligence target, which lines up with Wiltbank's finding that domain expertise doubles returns.

Hustle Fund: forced-choice scoring

Hustle Fund scores deals 1 to 4 with no middle option across Team, Market, Product, Execution, and Fundraisability. Removing the middle score forces a stance. Their data shows Team dominates the decision, while Product and Fundraisability matter less than most angels assume. As the Play Money founder evaluation newsletter notes, "Angels who use any kind of structured rubric, or who've internalized one, are dramatically more likely to move from check one to check five. The rubric doesn't just help you pick. It builds conviction."
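
A forced-choice scorecard is easy to sketch. The version below only enforces the "1 to 4, no middle option" rule and reports which way each score leans. How Hustle Fund actually weights or aggregates the dimensions is not described here, so the function name and the "lean yes / lean no" labels are assumptions made for illustration.

```python
DIMENSIONS = ("Team", "Market", "Product", "Execution", "Fundraisability")
ALLOWED_SCORES = {1, 2, 3, 4}  # even-sized scale: there is no neutral midpoint


def forced_choice(scores: dict[str, int]) -> dict[str, str]:
    """Validate a scorecard and return the stance each score implies.

    Hypothetical helper: 1-2 reads as "lean no", 3-4 as "lean yes".
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")

    stance = {}
    for dim in DIMENSIONS:
        value = scores[dim]
        # Reject halves, floats, and booleans so nobody can hedge with a 2.5.
        if isinstance(value, bool) or not isinstance(value, int) or value not in ALLOWED_SCORES:
            raise ValueError(f"{dim} must be an integer from 1 to 4, got {value!r}")
        stance[dim] = "lean yes" if value >= 3 else "lean no"
    return stance


if __name__ == "__main__":
    card = {"Team": 4, "Market": 3, "Product": 2, "Execution": 3, "Fundraisability": 2}
    for dim, call in forced_choice(card).items():
        print(f"{dim}: {call}")
```

The design choice worth noticing is the validation, not the labels: an even-numbered scale with no halves means every dimension ends up on one side of the line.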

Customer obsession versus founder energy

There is genuine practitioner disagreement about which signal is stronger. Some experienced angels weight founder energy and narrative force heavily, arguing that missionary founders attract the resources to overcome early product gaps. Others weight demonstrated customer obsession first, arguing that energy without customer insight produces well-told stories that fail at scale. Both views have empirical support. Rather than adjudicate, the practical move is to test both in the first call: does the founder's energy connect to specific customer behavior they have observed, or is it detached from evidence? The most concerning pattern is energy that cannot be grounded in a real, named user whose life is demonstrably better after using the product. A quick test: ask the founder to describe their last ten customer conversations. Specificity that names real people and real objections signals operational depth. Polished generalities signal the opposite.

The 3-gate vetting process

This framework synthesizes ACA, Seraf, NFX, and Hustle Fund guidance into a sequential process that mirrors how the most rigorous angel groups operate. Each gate is binary. A deal advances only if it clears every pass criterion in that gate, and a single fail stops the process. Most angels run this in the wrong order, anchoring on valuation before establishing founder quality, which is the single most common diligence mistake in the Seraf practitioner surveys.

Gate 1: Founder quality (30-minute call)

Customer specificity. Pass: names a real, current customer and explains exactly why that customer pays and returns. Fail: describes a customer type or future adopter and cannot name one real buyer today.
Intellectual honesty. Pass: answers "what will you be bad at?" with a specific, self-aware response. Fail: deflects, pivots to strengths, or gives a non-answer.
Coachability under challenge. Pass: pushes back on a bad idea with a reasoned counterargument. Fail: agrees immediately or becomes defensive.
Investor communication. Pass: sends a monthly or quarterly update on request. Fail: cannot produce a recent update, or cadence is annual or nonexistent.

Gate 2: Market and model (desk research plus second call)

Gate 2 is where a lot of angels quietly kill their best future returns. The instinct is to demand a big current total addressable market and fail anything that looks small today. For category creators, that instinct is backwards. You size the opportunity the company could create, not the market that already exists. Airbnb looked like air mattresses on floors and was really global travel and lodging. Uber looked like the black-car market and was really all of personal transport. Existing-market math would have failed both, which is the substance of the 2014 Damodaran and Bill Gurley debate over Uber's true addressable market ("How to Miss by a Mile"). The burden sits with the investor. The real Gate 2 question is whether you can see the size the founder is building toward, and that judgment is the diligence. Over-diligence that hunts a pre-revenue founder for proof the market already exists will only miss the deal. Two failure modes to guard against: missing the category creator because today's market looks small, and assuming a business can never grow past the market it starts in.

The same stage-awareness applies to the model. Many early companies have no unit economics yet, and demanding a clean CAC/LTV ratio from a pre-revenue founder is asking for theater. What still matters at this stage is business shape: does the founder correctly read who actually buys, whether this is an SMB or an enterprise sale, and the sales motion that follows from that? A founder with no numbers but a clear, correct read of the shape passes. A polished CAC/LTV spreadsheet built on the wrong buyer or the wrong motion fails. You are grading the quality of the thinking, not the presence of the metrics.

Category size, read by the investor. Pass: you can see a large market the company could create, even if today's addressable market looks small. Fail: neither the founder's framing nor your own analysis supports a market worth building toward.
Business shape. Pass: the founder correctly reads who buys, the type of sale (SMB versus enterprise), and the sales motion that follows, and reasons about it coherently, with or without revenue yet. Fail: the model, or a CAC/LTV spreadsheet, is built on the wrong buyer or the wrong motion.

When there is no market proof yet, name what you are actually doing. As Play Money frames it:

Great founder. Compelling problem. Little or no professional diligence or market proof. You're underwriting the human and the quest. Pure belief capital.

Domain match. Pass: founder has direct experience in the customer's industry or problem domain. Fail: no prior exposure to the segment or market structure.

Gate 3: Deal terms and cap table (deal room)

Valuation cap. Pass: at or below market rate for the stage. Fail: above market with no milestone justification.
Note structure. Pass: capped SAFE or priced round. Fail: uncapped note.
Pro-rata rights. Pass: included in the instrument. Fail: absent.
Lead investor. Pass: a named lead with independent diligence on record. Fail: no lead, or angels asked to lead without institutional support.
Cap table. Pass: clean, with no conflicts between existing investors and this round. Fail: prior investor conflicts, missing consents, or undisclosed obligations.

The sequencing is the point. Founder character is call-based and comes first. Market and unit economics are desk research plus a second call. Deal terms come last, in the deal room. With 25% of 2024 angel exits returning less than 1x capital per the ACA 2025 Angel Funders Report, disciplined sequential gating is the structural protection against preventable losses. Sources for the rubric: ACA Due Diligence Playbook, NFX, and Hustle Fund.
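
The sequential, binary logic of the three gates can be sketched as a short checklist runner. The criterion names mirror the lists above. The function name, the data layout, and the choice to treat an unevaluated criterion as a fail are assumptions made for this sketch.

```python
from typing import Optional

# Gates in the order they should be run; each criterion is pass/fail.
GATES = [
    ("Gate 1: Founder quality", [
        "customer specificity",
        "intellectual honesty",
        "coachability under challenge",
        "investor communication",
    ]),
    ("Gate 2: Market and model", [
        "category size",
        "business shape",
        "domain match",
    ]),
    ("Gate 3: Deal terms and cap table", [
        "valuation cap",
        "note structure",
        "pro-rata rights",
        "lead investor",
        "cap table",
    ]),
]


def run_gates(results: dict[str, bool]) -> tuple[bool, Optional[str]]:
    """Walk the gates in order and stop at the first failed criterion.

    Returns (True, None) if everything passes, otherwise
    (False, "<gate>: <criterion>") for the first fail.
    Assumption: a criterion missing from results counts as a fail.
    """
    for gate_name, criteria in GATES:
        for criterion in criteria:
            if not results.get(criterion, False):
                return False, f"{gate_name}: {criterion}"
    return True, None


if __name__ == "__main__":
    results = {c: True for _, criteria in GATES for c in criteria}
    results["pro-rata rights"] = False
    print(run_gates(results))
```

Because the loop returns on the first fail, a deal that stumbles in Gate 1 never reaches valuation at all, which is the ordering this section argues for.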

Deal terms as a character signal

Deal terms are not just financial mechanics. They are information about how a founder thinks about fairness, information asymmetry, and long-term incentives. For the full structural mechanics of SAFEs, priced rounds, and pro-rata rights, see Play Money's breakdown of SAFE versus Series SAFE. This section reads the same terms as signals about the person across the table.

Valuation cap. A cap at or near the top of market rate for the stage asks angels to accept a smaller ownership stake for the same check. That is worth probing. Why is the cap here? What milestone justifies it? A founder who answers clearly is showing transparency. One who deflects is showing you something too.
Uncapped notes. A SAFE or convertible note with no cap means you have no protection against an outsized valuation at the next round. Per Play Money's note on capped versus uncapped instruments, uncapped structures are unfavorable to angels and warrant a direct conversation.
Pro-rata rights. The right, not the obligation, to keep your ownership percentage in future rounds. Its presence or absence tells you whether the founder sees early angels as long-term partners or one-time funders.
Lead investor presence. A named lead who has done independent diligence lowers the proxied burden on individual angels. No lead, or angels asked to lead without institutional support, carries higher information-asymmetry risk.
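
A rough sketch of why the cap matters, under heavy simplification: it treats the check as converting at the lower of the cap and the next round's valuation, and ignores discounts, option pools, the pre-money versus post-money distinction, and dilution from the new round itself. The check size, cap, and round valuation are hypothetical numbers chosen only to show the shape of the math.

```python
import math
from typing import Optional


def approx_ownership(check: float, next_round_valuation: float,
                     cap: Optional[float] = None) -> float:
    """Approximate ownership fraction at conversion (illustrative only).

    With a cap, the check converts at min(cap, next_round_valuation).
    With cap=None (uncapped), it converts at the next round's valuation.
    """
    if check <= 0 or next_round_valuation <= 0:
        raise ValueError("check and next_round_valuation must be positive")
    if cap is not None and cap <= 0:
        raise ValueError("cap must be positive when provided")
    conversion_valuation = next_round_valuation if cap is None else min(cap, next_round_valuation)
    return check / conversion_valuation


if __name__ == "__main__":
    check = 25_000.0            # hypothetical angel check
    next_round = 40_000_000.0   # hypothetical valuation at the next priced round
    capped = approx_ownership(check, next_round, cap=8_000_000.0)
    uncapped = approx_ownership(check, next_round)
    print(f"capped at $8M: {capped:.4%}")
    print(f"uncapped:      {uncapped:.4%}")
    assert math.isclose(capped / uncapped, 5.0)
```

In this example the capped check converts into five times the ownership of the uncapped one, because the company's valuation ran well past the cap. That gap is the protection an uncapped note gives up.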

What you are actually underwriting: the failure-mode lens

Evaluating a startup is an exercise in stress-testing the failure modes before they happen. According to CB Insights data on why startups fail, analyzed by Play Money, each failure mode maps to a specific gate in the framework above.

Ran out of capital, 70%. Caught at unit economics and burn rate (Gate 2) and deal terms (Gate 3).
Poor product-market fit, 43%. Caught by the customer obsession test (Gate 1) and named-customer evidence (Gate 2).
Bad timing, 29%. Caught by the "why now" question (Gate 1) and the market catalyst check (Gate 2).
Wrong team composition, 23%. Caught by co-founder relationship and complementary skills (Gate 1).
Unsustainable unit economics, 19%. Caught by bottoms-up market sizing (Gate 2).

Seventy percent of failures trace to capital management, a Gate 2 and Gate 3 signal. Forty-three percent trace to product-market fit, a Gate 1 and Gate 2 signal. Team quality runs underneath all of it: the First Round Capital 10-Year Project found teams with two or more founders outperformed solo founders by 163%, and teams with at least one female founder outperformed all-male teams by 63%. The failure data is not a list of risks to fear. It is a checklist of what your gates exist to catch. That is the quiet case for spending the hours: the modes that kill startups are mostly observable before the wire rather than discovered after it. A founder who cannot name a paying customer, a model that leans on behavior change that has not happened, an uncapped note with no lead: each of these shows up in a first call or a deal room to an investor who looks.

The AI-era evaluation layer

As of 2026, a founder who can clearly say where AI does and does not improve their product's core value is showing a category of market awareness that separates early adopters from feature-chasers. Three questions are worth asking in the current environment:

Is AI improving the underlying unit economics of this business, or is it a feature that will be commoditized?
Does the product have a data moat that improves with usage, or is it a wrapper on a shared foundation model?
How does the team think about AI-driven competitive threats to their own product in the next 18 to 24 months?

None of these requires technical expertise to ask. They require watching how the founder handles a challenge to their core thesis. Clarity and intellectual honesty in the response matter more than the content of the answer.

Building your investment thesis: the conviction test

Evaluation frameworks are tools for building pattern recognition. The goal is not to find a mechanical reason to say yes or no. It is to build enough conviction to write a check and survive the zeros that will inevitably follow.

As one experienced angel framed it at a Play Money Angel 101 session: "My thesis isn't just a filter. It's emotional armor when the zeros hit."

The research-backed conviction test has four parts. The founder passes Gate 1. The market and model survive bottoms-up scrutiny, with named customers and unit economics that do not depend on behavior change that has not happened. The deal terms sit within market range, with no uncapped notes, no missing pro-rata, and no cap table conflicts. And your thesis can survive the loss.

As the Play Money first-five-investments framework puts it: "If you're under five checks, you're not behind. You're early. The goal isn't perfection. It's shared language, pattern recognition, and disciplined exposure to outcomes." An investment that passes your framework but not your conviction test is not ready. Wait.

A first-pass rubric matters most for angels early in their investing. According to Play Money, 80% of its angels are net new to angel investing, and a repeatable rubric is how they build the pattern recognition that experience eventually makes automatic.

One calibration point from the ACA 2025 Angel Funders Report: the gold exit for 2024 was TCA Venture Group's investment in CaseStack, which returned 22.7x over a 22-year hold. That is not a quick flip. It is patient capital supported by evaluation discipline applied at entry and maintained through engagement. Rubrics do not just help you pick. They define what you are willing to hold, and why.
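
To see what "patient capital" means in annual terms, a multiple can be converted to a compound annual rate. This is a minimal sketch that treats the investment as a single check in and a single payout at exit, which is an assumption. A true IRR depends on the timing of every cash flow and can differ.

```python
def annualized_return(multiple: float, years: float) -> float:
    """Compound annual growth rate implied by a return multiple.

    Assumes one cash outflow at the start and one inflow at the end.
    """
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must be positive")
    return multiple ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # The 22.7x, 22-year exit cited above.
    rate = annualized_return(22.7, 22)
    print(f"22.7x over 22 years is about {rate:.1%} per year")
```

On those assumptions, 22.7x over 22 years works out to roughly 15% a year: a strong result earned through a very long hold, not a fast one.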

Written by Cheryl Kellond, founder of Play Money. Serial founder, MIT Sloan MBA, active angel investor. Not tax advice, and not investment advice. Consult a qualified professional for your specific situation. Last updated: July 2026.

Frequently asked questions

The most-cited research, the Wiltbank and Boeker study for the ACA and Kauffman Foundation, found that angels who spent 20 or more hours on diligence saw a 5.9x return multiple, versus 1.1x for those who spent less, where 65% of low-diligence exits returned less than the original investment. The top quartile, over 40 hours, reached 7.1x. The practical floor is roughly 20 hours, weighted toward founder evaluation early and deal mechanics last. Diligence also continues after the check: angels who engaged with portfolio companies monthly saw 3.7x in four years versus 1.3x for annual engagement.

The leading frameworks converge on founder quality as the dominant variable. Y Combinator weights determination over intelligence. NFX leads with founder-market fit, meaning unique insight into the market that others lack. Hustle Fund's data shows team quality dominates the decision. In a first call, experienced angels test customer specificity (can the founder name a real buyer and why they return?), intellectual honesty (what will you be bad at?), coachability under challenge, and investor-update cadence. The clearest red flag is energy that cannot be grounded in a real, named customer whose life is measurably better.

On the founder side: cannot name a real current customer, deflects on weaknesses, agrees with everything, or sends investor updates only annually. On the market side: top-down TAM with no unit-level math, no paying customers, or unit economics that depend on behavior change that has not happened. On the deal side: an uncapped note, a valuation cap above market with no milestone justification, missing pro-rata rights, no lead investor, or cap table conflicts. Deal terms also read as character signals, since how a founder structures a round tells you how they think about fairness and long-term incentives.

Run a 3-gate process in sequence. Gate 1 is founder quality, evaluated in a 30-minute call: customer specificity, intellectual honesty, coachability, and update cadence. Gate 2 is market and model, done with desk research and a second call: category size as you read it, business shape, and domain match. Gate 3 is deal terms and cap table, reviewed in the deal room. Each gate is binary, and a single fail stops the process. The common mistake is reversing the order and anchoring on valuation before confirming the founder is worth backing.

The correlational evidence points that way, though it does not prove cause. The Wiltbank dataset of 538 angels and 3,097 investments shows return multiples rising with diligence hours: 1.1x below 20 hours, 5.9x at 20 to 40 hours, and 7.1x above 40 hours. Domain expertise roughly doubled multiples. Context matters, though: the ACA 2025 Angel Funders Report found 25% of 2024 exits returned less than 1x capital, and the top 10% of exits historically produce about 75% of total cash returns. Diligence may improve your odds on any single deal, but portfolio construction and diversification carry the returns.

Evaluate three things in order: the founder, the market and model, then the deal terms. Start with a short first call to test whether the founder knows their customer with real specificity and updates investors clearly. Do desk research on market size from the bottom up. Only then dig into terms and the cap table. The data is blunt: angels who spend 20 or more hours on diligence report materially higher returns than those who spend a handful, though that link is a correlation and much of the work can sit with a trusted lead. Treat diligence as real work, not a formality.

Size the problem you can imagine, not the market that already exists. Category creators look tiny against current markets. Airbnb against air mattresses and Uber against taxis would both have failed an existing-TAM test. Play Money treats early-stage diligence as investor judgment: your job is to see the size yourself and to judge whether the founder understands the shape of the business, who buys and how, rather than to demand unit economics a pre-revenue company cannot have. When there is no market proof yet, you are underwriting the human and the quest. That is belief capital.
