A research verdict is useful only when it survives contact with the evidence. This guide explains the process Verdikt uses to produce one that does.
A Verdikt report starts with a problem: most founders who want a research-backed view of their startup idea have no efficient way to get one. Hiring a research analyst costs $3,000 to $10,000 per engagement and takes two to three weeks. AI chat tools produce generic, poorly sourced outputs. Doing the research yourself requires 40 to 60 hours of structured work across five research dimensions.
Verdikt is built to close that gap. The output is a structured research memo with named risk thresholds and 40+ cited sources, covering market opportunity, competitive landscape, regulatory context, customer evidence, and economic viability, with every claim cited to its source. Here is what happens between the pitch and the report.
Step one: the guided brief
Every Verdikt report begins with a structured guided brief. The brief collects the inputs that determine the research dimensions and focus: who the buyer is, what geography you are targeting, what price you intend to charge, what your current solution mapping looks like, and which assumptions you are most uncertain about.
The brief is structured because a non-generic report needs at least these inputs. A two-sentence pitch produces a two-page generic memo. A structured brief produces a multi-section cited research report.
The brief is conversational, not a form. It asks follow-up questions based on your answers, probes assumptions that seem underspecified, and confirms the research dimensions before the pipeline runs.
Step two: the research pipeline
The research pipeline runs across five dimensions simultaneously: market, competitive, regulatory, customer evidence, and economic viability.
Market research retrieves current market size data from government databases (Census Bureau, BLS, relevant industry-specific federal agencies) and industry association publications. It builds a bottom-up market size estimate specific to the buyer and geography you described in the brief, not a category-level estimate from a research firm.
Competitive research retrieves current pricing, product positioning, and review data for the named competitors in your space and the current-solution landscape. It analyzes product review sites for negative theme clusters and job posting histories for strategic signals.
Regulatory research retrieves the applicable regulations for your product category and geography from primary government sources. It identifies the specific provisions relevant to your product and the compliance requirements that would affect your launch timeline or cost structure.
Customer evidence research retrieves qualitative and behavioral data about your target buyer from forum discussions, review site commentary, job posting requirements, and published industry surveys. This dimension does not replace customer interviews, but it provides the secondary evidence context that indicates how representative your interview findings are.
Economic research uses the market, competitive, and customer evidence to build a unit economics model with benchmark-referenced CAC, realistic churn assumptions, and a stated price-to-LTV ratio. It outputs a specific verdict on whether the economic model is viable at achievable customer counts.
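To make the economic dimension concrete, here is a minimal sketch of the kind of arithmetic a unit economics model involves. It uses the common constant-churn approximation (expected customer lifetime is 1 divided by monthly churn). The class name, field names, and input values are illustrative assumptions, not Verdikt's actual model or benchmarks, and `price_to_ltv` is only one possible reading of the price-to-LTV ratio mentioned above.

```python
from dataclasses import dataclass


@dataclass
class UnitEconomics:
    monthly_price: float   # price hypothesis (assumed input)
    gross_margin: float    # fraction between 0 and 1
    monthly_churn: float   # fraction of customers lost per month
    cac: float             # customer acquisition cost

    def ltv(self) -> float:
        # Constant-churn approximation: lifetime is 1 / churn months.
        return self.monthly_price * self.gross_margin / self.monthly_churn

    def ltv_to_cac(self) -> float:
        return self.ltv() / self.cac

    def price_to_ltv(self) -> float:
        # One possible reading: monthly price as a share of lifetime value.
        return self.monthly_price / self.ltv()

    def payback_months(self) -> float:
        # Months of gross margin needed to recover CAC.
        return self.cac / (self.monthly_price * self.gross_margin)


# Illustrative numbers only.
model = UnitEconomics(monthly_price=100.0, gross_margin=0.8,
                      monthly_churn=0.04, cac=500.0)
print(round(model.ltv(), 2), round(model.ltv_to_cac(), 2),
      round(model.payback_months(), 2))
```

A viability verdict would compare figures like these against benchmarks and achievable customer counts; that comparison is deliberately left out because the source does not specify it.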
Step three: risk threshold identification
After the five research dimensions are complete, the pipeline identifies the named risks and the thresholds at which they trip: the specific conditions that, if true, would mean the business cannot work as described. These are derived from the research findings rather than from the brief, because the research may surface conditions that the founder did not anticipate when describing the idea.
Each risk threshold is stated as a testable condition, the evidence for and against it is cited, and a confidence level is assigned.
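A risk threshold as described here has three parts: a testable condition, cited evidence on both sides, and a confidence level. The sketch below shows one way such a record could be represented. The class names, the field names, and the three-level confidence scale are assumptions made for illustration, not Verdikt's schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:
    source: str      # citation label (hypothetical)
    supports: bool   # True if it suggests the threshold is tripped


@dataclass
class RiskThreshold:
    condition: str                        # the testable statement
    evidence: List[Evidence] = field(default_factory=list)
    confidence: str = "low"               # assumed scale: low / medium / high

    def evidence_for(self) -> List[Evidence]:
        return [e for e in self.evidence if e.supports]

    def evidence_against(self) -> List[Evidence]:
        return [e for e in self.evidence if not e.supports]


# Hypothetical example; the condition and sources are invented for illustration.
risk = RiskThreshold(
    condition="Target buyers already get this capability bundled for free",
    evidence=[
        Evidence("Vendor pricing page (hypothetical)", supports=True),
        Evidence("Forum thread on unmet needs (hypothetical)", supports=False),
    ],
    confidence="medium",
)
```

Keeping the evidence for and against in the same record makes it easy to show both sides next to the condition in a memo.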
Step four: the verdict
The verdict is a numerical Verdikt Score on a 0 to 100 scale, broken into four sub-scores: Market, Competition, Demand, and Stack Fit.
A high Verdikt Score (70 or higher) indicates that the risk thresholds tested are not currently tripped, the market and economic dimensions support the model, and the available evidence does not identify a fundamental viability problem. It is not a guarantee of success. It is a research-based assessment that the conditions for success are present.
A low Verdikt Score (below 40) indicates that the evidence strongly suggests one or more risk thresholds are tripped, and that the business model as described is unlikely to work under current conditions. The report identifies the specific finding that drives the score and points to what would need to change for the answer to flip.
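The two bands above can be expressed as a small classifier. This is a sketch under stated assumptions: the source defines only the "high" (70 or higher) and "low" (below 40) bands, so the "middle" label for scores from 40 to 69 is a placeholder, and the function name is hypothetical.

```python
def score_band(score: int) -> str:
    """Map a 0 to 100 score to the bands described in the text."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "high"
    if score < 40:
        return "low"
    # The text does not characterize this range; the label is an assumption.
    return "middle"
```

How the four sub-scores combine into the overall score is not specified, so no aggregation is shown.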
Step five: the delivered memo
The delivered memo is a structured document formatted for two use cases: the founder's own decision-making (should I build this?) and team alignment (here is what the research tells us about our strategy).
Every factual claim in the memo traces back to the source that contains it. The source library attached to the report contains 40 or more verified sources with links, so any claim can be verified independently.
The memo is shareable via a unique URL, updated if you run a revision, and archived for your records. If something is wrong on our end (a broken citation, a hallucinated number no source supports, or a verdict that contradicts the evidence the pipeline itself surfaced), email support@tryverdikt.app within fourteen days with what specifically went wrong. We review each request on the evidence and either re-run the report or issue a refund.
The pipeline, stage by stage
The first stage is the guided brief. A founder answers a set of structured questions through a conversational interface that probes for the ICP, the price hypothesis, the 10× claim, the falsifier, and the named risks. The output is a structured brief: a hypothesis tree with one root claim, three to five child claims, and the evidence requirements for each. The brief is parsed by Claude Sonnet 4.6 and validated by GPT-5 before it advances to the next stage.
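The hypothesis tree described above (one root claim, three to five child claims, evidence requirements for each) maps naturally onto a small data structure with a validation step. The sketch below is illustrative only; the class and method names are assumptions, and it does not represent how the brief is actually parsed or validated by the models named above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    statement: str
    evidence_requirements: List[str]


@dataclass
class HypothesisTree:
    root: Claim
    children: List[Claim]

    def validate(self) -> List[str]:
        """Return a list of structural problems; empty means well-formed."""
        problems = []
        if not 3 <= len(self.children) <= 5:
            problems.append(
                f"expected 3 to 5 child claims, found {len(self.children)}"
            )
        for claim in [self.root, *self.children]:
            if not claim.evidence_requirements:
                problems.append(
                    f"no evidence requirements for: {claim.statement}"
                )
        return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every structural gap in a brief at once.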
The second stage is market sizing. Verdikt runs bottom-up TAM from primary databases including SEC EDGAR, BLS, FRED, Census, Eurostat, and country-specific equivalents. SAM is derived from ICP density times an ACV band, cross-checked against three comparable wedges. SOM is modeled across three GTM scenarios with named penetration ceilings. Top-down "X is a $50B market" assertions are explicitly excluded from the source library. The output is a TAM/SAM/SOM table, a sizing memo, and a growth-rate range with two independent sources triangulated.
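The SAM and SOM steps reduce to simple arithmetic, sketched below. This assumes "ICP density" can be treated as a count of qualifying buyers and that each GTM scenario's penetration ceiling is a fraction of SAM. The function names, scenario names, and all figures are invented for illustration and do not come from a real report.

```python
from typing import Dict, Tuple


def sam_band(icp_count: int, acv_low: float, acv_high: float) -> Tuple[float, float]:
    """SAM as ICP count times an ACV band, returned as (low, high)."""
    return (icp_count * acv_low, icp_count * acv_high)


def som_by_scenario(sam: float, ceilings: Dict[str, float]) -> Dict[str, float]:
    """SOM per GTM scenario: SAM scaled by that scenario's penetration ceiling."""
    return {name: sam * ceiling for name, ceiling in ceilings.items()}


# Illustrative inputs only.
low, high = sam_band(icp_count=20_000, acv_low=6_000, acv_high=12_000)
scenarios = som_by_scenario(
    low,
    {"self_serve": 0.01, "sales_led": 0.03, "partner_led": 0.05},
)
```

Running SOM against the low end of the SAM band is a conservative choice made for this example; the source does not say which end of the band the scenarios use.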
The third stage is the competitive map. Direct competitors are pulled from Crunchbase, G2, Capterra, ProductHunt, and recent product announcements. Indirect substitutes including open source, in-house build, spreadsheets, and do-nothing baselines are mapped alongside. Each player is scored on feature parity, distribution reach, capital position, hiring velocity, integration surface, and switching cost. The moat thesis is tested against the strongest competitor’s most recent 90 days of shipping.
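As a rough illustration of scoring players on the six dimensions listed above, the sketch below totals per-dimension scores and picks the strongest player. The 0 to 5 scale, the equal weighting, and the player names are assumptions; the source names the dimensions but not how they are scored or combined.

```python
from typing import Dict

DIMENSIONS = (
    "feature_parity",
    "distribution_reach",
    "capital_position",
    "hiring_velocity",
    "integration_surface",
    "switching_cost",
)


def total_score(scores: Dict[str, int]) -> int:
    """Sum the six dimension scores (assumed 0 to 5 each, equally weighted)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    return sum(scores[d] for d in DIMENSIONS)


def strongest(players: Dict[str, Dict[str, int]]) -> str:
    """Name of the player with the highest total score."""
    return max(players, key=lambda name: total_score(players[name]))


# Hypothetical players, including a do-nothing baseline.
players = {
    "incumbent_a": dict(zip(DIMENSIONS, (4, 5, 5, 3, 4, 4))),
    "spreadsheets": dict(zip(DIMENSIONS, (2, 5, 0, 0, 3, 1))),
    "do_nothing": dict(zip(DIMENSIONS, (0, 5, 0, 0, 0, 0))),
}
```

Scoring substitutes and the do-nothing baseline on the same dimensions as direct competitors keeps them comparable in one map.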
The fourth stage is the 10× claim test. The claim is decomposed into measurable sub-claims (latency, cost, accuracy, coverage, time-to-value). Each is benchmarked against the named competitor and the do-nothing baseline. The named falsifier is run as an inverse test: if the criterion holds today, the claim is downgraded. A second model (typically GPT-5) runs an adversarial pass that tries to break the claim from the strongest counter-argument. The transcript ships with the verdict.
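One way to picture the decomposition is as a set of sub-claims, each with a measured value for the product, the named competitor, and the do-nothing baseline. The sketch below computes improvement ratios and applies the falsifier rule from the text. The class name, field names, status labels, and numbers are assumptions for illustration; the source does not specify how ratios are computed or what downgrading entails.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SubClaim:
    metric: str
    ours: float
    competitor: float
    baseline: float
    lower_is_better: bool

    def ratios(self) -> Tuple[float, float]:
        """Improvement factor versus (competitor, do-nothing baseline)."""
        if self.lower_is_better:
            return (self.competitor / self.ours, self.baseline / self.ours)
        return (self.ours / self.competitor, self.ours / self.baseline)


def claim_status(falsifier_holds_today: bool) -> str:
    # Inverse test from the text: if the falsifier criterion already holds,
    # the claim is downgraded. The status labels are assumptions.
    return "downgraded" if falsifier_holds_today else "standing"


# Illustrative measurement: latency in milliseconds, lower is better.
latency = SubClaim("latency_ms", ours=20, competitor=200, baseline=400,
                   lower_is_better=True)
```

Tracking the direction of each metric matters because a 10× improvement in latency or cost is a reduction, while a 10× improvement in coverage is an increase.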
The fifth stage is synthesis. A multi-section memo is drafted to a fixed template: cover with the numerical Verdikt Score (0 to 100), its four sub-scores, and the named risks; market section; competition section; 10× test; pricing; build plan (when the Verdikt Score is 70 or higher); GTM motion; and source library. Tier-graded citations appear inline.
The 14 quality gates that block a memo from shipping
The memo does not ship if any of the following are true: a numeric claim has no Tier 1 or Tier 2 source; a citation count is below 35; the falsifier check has not been run; the adversarial pass has not produced a counter-argument; the named risks are missing from the cover; the SOM math does not reconcile with the GTM scenario; the moat thesis was not tested against the strongest competitor’s 90-day shipping cadence; the ACV band lacks three comparable wedges; growth rates were averaged rather than triangulated; the do-nothing baseline is missing from the competitive map; the recommendation contradicts the evidence the pipeline surfaced; the reasoning trace is incomplete; the run record is missing; or the re-run hooks for the three weakest claims are not named. Any failure routes the run back to the relevant stage and the verdict is held until the gate clears.
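The gating logic can be pictured as a set of named checks that must all pass before a memo ships. The sketch below implements four of the fourteen gates as an illustration. The memo field names, the gate names, and the dictionary-based memo representation are hypothetical, not Verdikt's internal format.

```python
from typing import Any, Callable, Dict, List

Memo = Dict[str, Any]

# Each gate returns True when the memo passes. Only four of the
# fourteen gates are shown; all field names are hypothetical.
GATES: Dict[str, Callable[[Memo], bool]] = {
    "numeric_claims_sourced": lambda m: all(
        claim["source_tier"] in (1, 2) for claim in m["numeric_claims"]
    ),
    "citation_count": lambda m: m["citation_count"] >= 35,
    "falsifier_check_run": lambda m: bool(m["falsifier_check_run"]),
    "named_risks_on_cover": lambda m: len(m["cover_risks"]) > 0,
}


def failed_gates(memo: Memo) -> List[str]:
    """Names of every gate the memo fails, in declaration order."""
    return [name for name, passes in GATES.items() if not passes(memo)]


def can_ship(memo: Memo) -> bool:
    return not failed_gates(memo)
```

Returning the names of the failed gates, rather than a single boolean, matches the behavior described above, where a failure routes the run back to the relevant stage.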
Refund or re-run
If Verdikt produces a fault, the remedy is a re-run at no additional charge or a full refund, decided on the evidence. "Fault" is a specific term: a broken or misattributed citation, a hallucinated numeric claim that no cited source supports, a verdict that contradicts the evidence the pipeline itself surfaced, or a pipeline failure that produced an incomplete or malformed report. The policy is documented in the terms of service. The reason the policy exists is not generosity. It is that a research product without accountability is just confident output. The fault definition is narrow on purpose: a low Verdikt Score report is not a fault. A disagreement with the conclusion is not a fault. The work product must be defensibly wrong for the remedy to apply.