Investor Intelligence
A systematic approach to investor matching for Pre-Seed and Seed fundraising in Europe.
Most founders running a fundraise end up in the same place: a long list of investor names assembled from VC lists, Crunchbase, LinkedIn, and recommendations from existing investors and fellow founders. The list might have 150 entries. It might have 300.
The list is not the problem. The problem is that a list is not an answer.
Which of these investors have actually backed companies in your category — not just adjacent ones? Which ones lead rounds at your stage, and which ones only co-invest? Who is deploying from a current fund versus winding down? Which ones would view a competitor in their portfolio as a reason to pass, not a reason to lean in?
These questions cannot be answered by filtering a database. They require judgment — and judgment takes time that most founders spend on the wrong investors.
This paper describes a systematic approach to solving that problem.
The Database
The foundation is a structured database of approximately 2,800 investment entities active in European early-stage investing: VC funds, family offices, and angels across Pre-Seed through Series A.
Scale is not the differentiator. What matters is that every record is built around a shared taxonomy — the same vocabulary used to describe the startup mandate on the other side of the match. This is what makes structured comparison possible.
Each investor record carries:
- Stage focus and ticket range, including whether the fund leads rounds or co-invests only
- Geography (where they deploy capital, not where they are headquartered)
- Taxonomy tags across four dimensions: industries, business models, customer segments, and technology domains where relevant
- Portfolio companies with structured taxonomy tags — not just names
- Fund vintage and current vehicle
- Data quality state: imported, enriched, or manually verified
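As an illustration, the record fields listed above could be modelled roughly as follows. This is a minimal sketch, not the actual schema: every class and field name (`InvestorRecord`, `Taxonomy`, `deploy_geographies`, and so on) is hypothetical, as is the choice to represent unknown values as `None`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class DataQuality(Enum):
    IMPORTED = "imported"
    ENRICHED = "enriched"
    VERIFIED = "manually_verified"


@dataclass
class Taxonomy:
    """Tags across the four shared dimensions."""
    industries: set[str] = field(default_factory=set)
    business_models: set[str] = field(default_factory=set)
    customer_segments: set[str] = field(default_factory=set)
    technology_domains: set[str] = field(default_factory=set)


@dataclass
class PortfolioCompany:
    name: str
    taxonomy: Taxonomy = field(default_factory=Taxonomy)


@dataclass
class InvestorRecord:
    name: str
    # None means "unknown", which is different from "known to be empty".
    stages: Optional[set[str]] = None
    deploy_geographies: Optional[set[str]] = None  # where capital goes, not HQ
    ticket_min_eur: Optional[int] = None
    ticket_max_eur: Optional[int] = None
    leads_rounds: Optional[bool] = None
    taxonomy: Taxonomy = field(default_factory=Taxonomy)
    portfolio: list[PortfolioCompany] = field(default_factory=list)
    fund_vintage: Optional[int] = None
    current_vehicle: Optional[str] = None
    data_quality: DataQuality = DataQuality.IMPORTED


# Fictional example record, for illustration only.
example = InvestorRecord(
    name="Example Ventures",
    stages={"pre-seed", "seed"},
    deploy_geographies={"DACH"},
    ticket_min_eur=250_000,
    ticket_max_eur=1_500_000,
    leads_rounds=True,
    taxonomy=Taxonomy(industries={"fintech"}),
    portfolio=[PortfolioCompany("Example Pay", Taxonomy(industries={"fintech"}))],
)
```

Keeping the portfolio companies on the same `Taxonomy` type as the investor is the point of the shared vocabulary: both can be compared against a mandate with the same logic.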
The portfolio layer is what separates this from a filtered directory. Portfolio companies are a revealed preference. They show what an investor has actually backed, not what their website claims they invest in. The two are often not the same.
The Matching Process
Matching runs in two layers. The sequence matters.
Layer 1 — Structure
The first layer is deterministic. It answers structural questions before any scoring happens.
Three hard filters:
Stage. Does the investor’s active stage range include the mandate’s current stage? An investor who writes Series A cheques is not a fit for a Pre-Seed round, regardless of how well the taxonomy aligns.
Geography. Does the investor cover the geography of the startup’s primary market? This is evaluated against where they deploy capital — a Berlin-based fund with a pan-European mandate is a different counterpart than one that only backs DACH companies.
Ticket size. Is the funding ask within range? A mandate ask that exceeds the investor’s maximum is a hard exclusion — they cannot lead the round. An ask that falls below the investor’s minimum is treated differently: the investor may still participate as a co-investor, and is flagged accordingly rather than dropped from the list.
One design principle worth stating explicitly: null fields on an investor record are not grounds for exclusion. An investor with incomplete geography data is not excluded — absence of data is not evidence of misalignment. Only a populated field with zero overlap triggers exclusion. In practice, this matters: smaller funds and angels often have thin public profiles.
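The three hard filters and the null-handling rule can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the `Investor` and `Mandate` types, the field names, and the three return labels are hypothetical, and geography is reduced to simple set membership, whereas a real implementation would likely need to resolve regions such as DACH to countries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Investor:
    stages: Optional[set[str]] = None
    geographies: Optional[set[str]] = None
    ticket_min: Optional[int] = None
    ticket_max: Optional[int] = None


@dataclass
class Mandate:
    stage: str
    geography: str
    ask: int


def hard_filter(inv: Investor, m: Mandate) -> str:
    """Return 'exclude', 'co_invest_only', or 'pass'."""
    # Missing (None or empty) fields never exclude; only a populated
    # field with zero overlap does.
    if inv.stages and m.stage not in inv.stages:
        return "exclude"
    if inv.geographies and m.geography not in inv.geographies:
        return "exclude"
    # Asymmetric ticket logic: an ask above the maximum is a hard exclusion.
    if inv.ticket_max is not None and m.ask > inv.ticket_max:
        return "exclude"
    # An ask below the minimum is flagged for co-investment, not dropped.
    if inv.ticket_min is not None and m.ask < inv.ticket_min:
        return "co_invest_only"
    return "pass"


mandate = Mandate(stage="pre-seed", geography="DE", ask=500_000)
thin_profile = Investor()  # e.g. an angel with no public data
series_a_fund = Investor(stages={"series-a"})
large_fund = Investor(stages={"pre-seed", "seed"}, ticket_min=1_000_000)
```

Here `thin_profile` passes because nothing is populated, `series_a_fund` is excluded on stage, and `large_fund` is kept but flagged as co-invest only.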
After hard filters, every investor that passes gets a domain score — a weighted overlap ratio across four dimensions:
| Dimension | Weight | Rationale |
|---|---|---|
| Industries | 3× | Strongest signal of market understanding |
| Business models | 2× | Indicates familiarity with commercial mechanics |
| Customer segments | 1× | Secondary confirming signal |
| Technology domains | 1× | Relevant for deep tech mandates only |
When structured portfolio data exists, the domain score blends the investor's stated taxonomy overlap (70%) with the taxonomy overlap of its portfolio companies (30%). Because portfolio companies are a revealed preference, the blend rewards investors whose actual deals confirm their stated focus.
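A worked sketch of the weighted score and the 70/30 blend follows. The weights and blend ratio come from the text above; everything else is an assumption made for illustration. In particular, the overlap ratio is taken to be the share of the mandate's tags that the other side also carries, dimensions the mandate leaves empty are skipped, and portfolio overlap is averaged across portfolio companies. The function names are hypothetical.

```python
WEIGHTS = {
    "industries": 3,
    "business_models": 2,
    "customer_segments": 1,
    "technology_domains": 1,
}

Tags = dict[str, set[str]]


def weighted_overlap(mandate: Tags, other: Tags) -> float:
    """Weighted share of the mandate's tags that also appear in `other`."""
    num = den = 0.0
    for dim, weight in WEIGHTS.items():
        wanted = mandate.get(dim, set())
        if not wanted:
            continue  # assumption: dimension not relevant for this mandate
        have = other.get(dim, set())
        num += weight * len(wanted & have) / len(wanted)
        den += weight
    return num / den if den else 0.0


def domain_score(mandate: Tags, investor: Tags, portfolio: list[Tags]) -> float:
    stated = weighted_overlap(mandate, investor)
    if not portfolio:
        return stated  # no structured portfolio data: stated taxonomy only
    # Assumption: portfolio overlap is the mean over portfolio companies.
    revealed = sum(weighted_overlap(mandate, c) for c in portfolio) / len(portfolio)
    return 0.7 * stated + 0.3 * revealed


mandate = {"industries": {"fintech"}, "business_models": {"saas", "marketplace"}}
investor = {"industries": {"fintech", "health"}, "business_models": {"saas"}}
portfolio = [
    {"industries": {"fintech"}, "business_models": {"saas", "marketplace"}},
    {"industries": {"health"}},
]

stated_only = domain_score(mandate, investor, [])     # (3*1 + 2*0.5) / 5 = 0.8
blended = domain_score(mandate, investor, portfolio)  # 0.7*0.8 + 0.3*0.5 = 0.71
```

In this example the investor's stated focus scores 0.8, but only one of two portfolio companies matches the mandate, so the blended score drops to 0.71.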
Layer 2 — Intelligence
After deterministic scoring, the system runs a portfolio intelligence pass for matched investors. This is where the analysis moves beyond what a database can compute.
For each matched investor, the system produces:
A fit signal — strong, neutral, or weak — based on portfolio evidence against the specific mandate. A strong signal requires concrete grounds: a direct investment in the same category, or multiple portfolio companies that confirm the thesis aligns. Thematic overlap alone is not sufficient.
A written rationale — a substantive assessment of why the portfolio evidence is or is not compelling for this mandate specifically. The same investor will read differently across two mandates with similar taxonomy profiles if the portfolio evidence points in different directions.
The Six-Tier Classification
Every investor is assigned to one of six tiers. Assignment happens in sequence — each check only runs if the prior one has not already placed the investor.
| Tier | Name | Basis |
|---|---|---|
| F | Not a Fit | Hard structural exclusion |
| E | Relationship Building | Covers only the next stage tier above; not now, but worth tracking |
| D | Conditional | Ticket floor above the mandate ask; co-invest possible, not a lead |
| C | Exploratory | Intelligence: weak portfolio signal on a B investor |
| B | Approach | Passes all hard filters; core working list |
| A | Prioritize | Intelligence: strong portfolio signal |
Two tiers, Prioritize and Exploratory, can only be assigned by the intelligence layer. The deterministic scoring engine never assigns A or C. A high domain score establishes structural alignment, which is not the same as portfolio conviction. The quantitative score earns a place on the list. The qualitative signal determines priority.
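The sequential assignment can be sketched as two small functions, one per layer. This is a hedged illustration, not the actual engine: the function names and boolean flags are hypothetical and assumed to be computed upstream, and the sketch assumes the intelligence layer only moves investors who were structurally placed in B (the text states this for C; for A it is an assumption).

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    A = "Prioritize"
    B = "Approach"
    C = "Exploratory"
    D = "Conditional"
    E = "Relationship Building"
    F = "Not a Fit"


def structural_tier(hard_excluded: bool, next_stage_only: bool,
                    ticket_floor_above_ask: bool) -> Tier:
    """Deterministic layer: each check runs only if the prior one did not
    place the investor. It can return F, E, D, or B, never A or C."""
    if hard_excluded:
        return Tier.F
    if next_stage_only:
        return Tier.E
    if ticket_floor_above_ask:
        return Tier.D
    return Tier.B


def final_tier(structural: Tier, fit_signal: Optional[str]) -> Tier:
    """Intelligence layer: fit_signal is 'strong', 'neutral', 'weak', or None."""
    if structural is not Tier.B or fit_signal is None:
        return structural  # assumption: only B investors are re-tiered
    if fit_signal == "strong":
        return Tier.A
    if fit_signal == "weak":
        return Tier.C
    return Tier.B  # neutral signal leaves the investor on the core list


core = structural_tier(False, False, False)
promoted = final_tier(core, "strong")
demoted = final_tier(core, "weak")
```

Splitting the logic this way makes the constraint explicit in code: `structural_tier` has no path that returns `Tier.A` or `Tier.C`.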
The Deliverable
The output is a structured Investor Analysis report, specific to the mandate. It is not a filtered export of the database — every element is generated for this company at this stage at this moment.
Executive summary — all matched investors ranked by tier and domain score, with investor type, recommendation tier, and the key signal that drove tier assignment. This is the working document for preparing outreach.
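The summary ordering amounts to a two-key sort. The sketch below assumes tiers are ranked A through F and that, within a tier, a higher domain score comes first; the row structure and investor names are fictional.

```python
TIER_ORDER = ["A", "B", "C", "D", "E", "F"]

matches = [
    {"investor": "Fund One", "tier": "B", "domain_score": 0.82},
    {"investor": "Fund Two", "tier": "A", "domain_score": 0.64},
    {"investor": "Fund Three", "tier": "B", "domain_score": 0.91},
    {"investor": "Angel Four", "tier": "D", "domain_score": 0.95},
]


def rank(rows: list[dict]) -> list[dict]:
    # Tier first, then domain score descending within each tier.
    return sorted(rows, key=lambda r: (TIER_ORDER.index(r["tier"]), -r["domain_score"]))


ranked_names = [r["investor"] for r in rank(matches)]
# ['Fund Two', 'Fund Three', 'Fund One', 'Angel Four']
```

Note that the highest raw domain score (Angel Four) still ranks last here, because tier takes precedence over score.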
Individual investor profiles — one page per investor in the Prioritize, Approach, and Exploratory tiers. Each profile contains the structural metadata, the recommendation tier, a substantive assessment grounded in the mandate’s specific profile, and the portfolio evidence with a note on why each company is or is not a meaningful signal.
Pipeline tiers — Conditional, Relationship Building, and Not a Fit investors are presented in compressed form: contact data and a brief rationale. Not every investor warrants a full page.
The report comes in two formats designed to complement each other.
The PDF is the deep-dive. It is the document a founder reads before entering a process — understanding who they are approaching, what the investor has backed, and what angle is most likely to resonate.
The XLSX pipeline is the operational companion. Every investor is pre-populated with tier, fit signal, ticket range, lead status, contact names, and LinkedIn links. The founder adds four columns: Stage, Intro Via, Next Step, and Next Step By. The file can be used directly or imported into any CRM.
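The shape of the pipeline file can be sketched with Python's standard `csv` module. CSV is used here purely as a stand-in, since the standard library has no XLSX writer; the pre-populated column names are assumptions, while the four founder columns are the ones named above.

```python
import csv
import io

# Assumed labels for the pre-populated columns.
PREFILLED = ["Investor", "Tier", "Fit Signal", "Ticket Range",
             "Leads Rounds", "Contacts", "LinkedIn"]
# The four columns the founder fills in.
FOUNDER = ["Stage", "Intro Via", "Next Step", "Next Step By"]


def pipeline_csv(investors: list[dict]) -> str:
    """Write one row per investor; founder columns are left blank."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=PREFILLED + FOUNDER,
                            restval="", lineterminator="\n")
    writer.writeheader()
    for inv in investors:
        writer.writerow(inv)
    return buf.getvalue()


# Fictional row for illustration.
output = pipeline_csv([{"Investor": "Example Ventures", "Tier": "A"}])
lines = output.splitlines()
```

A flat table with blank tracking columns is what keeps the file CRM-agnostic: most CRMs can import it as-is, and the founder's additions do not touch the analysis columns.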
The design principle: deep analysis informs who to approach and why. Lightweight structure manages what happens after.
What This Is Not
The analysis does not draft outreach. It does not track responses or manage pipeline state beyond the initial handoff. Those are problems the founder solves in their own tooling — the deliverable is designed to be CRM-agnostic by structure.
The database is also not a real-time product. Freshness is a function of active curation (enrichment runs and manual verification), not live API integrations. For the use case this serves, a well-maintained snapshot is more reliable than a poorly maintained real-time feed.
A Note on Judgment
The scoring weights, tier thresholds, asymmetric ticket logic, and portfolio signal interpretation in this system are not engineering defaults. They reflect years of working directly within and alongside European VC funds — understanding how investment decisions actually get made, what partners look for across a first meeting and a second, and what structural misalignments kill conversations before they start.
The system encodes that judgment. It does not replace it. The output is a starting point for a conversation between the advisor and the founder — not a substitute for one.
Methodology and schema on GitHub:
GitHub Repository →