
How We Rate Evidence

Our scoring system uses a multi-factor weighted algorithm to objectively evaluate supplement research. Every score is computed automatically from the raw study data — no human bias, no cherry-picking.

The Scoring Pipeline

Each supplement's evidence score flows through four stages. Every individual study is scored, then scores are aggregated with a confidence adjustment.

1. Study Design: How rigorous is the methodology?
2. Study Type: Human, animal, or in vitro?
3. Sample Size: How many participants?
4. Outcomes: Positive, negative, or mixed?

1 Study Design Quality

Not all studies are equal. A double-blind, placebo-controlled trial carries far more weight than a case study. We assign a multiplier based on the study's methodological rigor.

Meta-analysis / Systematic Review: 1.5×
Double-blind RCT + Placebo: 1.4×
Double-blind Trial: 1.3×
Randomized Controlled Trial: 1.2×
Controlled / Comparative: 1.0×
Open-label / Pilot: 0.8×
Observational / Cohort: 0.7×
Case Study: 0.5×
Why it matters: A meta-analysis synthesizes data from multiple trials, giving it 3× the weight of a case study.
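The multipliers above amount to a simple lookup table. A minimal sketch in Python (the key names are illustrative, not the site's actual identifiers):

```python
# Design-quality multipliers from the table above.
# Key names are hypothetical labels chosen for this sketch.
DESIGN_MULTIPLIER = {
    "meta_analysis": 1.5,
    "double_blind_rct_placebo": 1.4,
    "double_blind": 1.3,
    "rct": 1.2,
    "controlled": 1.0,
    "open_label": 0.8,
    "observational": 0.7,
    "case_study": 0.5,
}
```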

2 Study Type

Human clinical trials are the gold standard for supplement research. Animal and in vitro studies provide supporting evidence but can't be directly applied to humans.

Human Study (1.0×): Full weight. Clinical trials, cohort studies, and human observational data.

Animal Study (0.5×): Half weight. Useful for understanding mechanisms but may not translate to humans.

In Vitro (0.3×): Low weight. Lab cell studies provide early signals but are far from clinical proof.

3 Sample Size Scaling

Larger studies are more statistically reliable. We use a logarithmic scale so that going from 10 to 100 participants matters more than going from 1,000 to 10,000.

Formula: weight = min(2.2, 0.3 + log10(n) × 0.45). The logarithmic curve means diminishing returns.
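The formula above translates directly to code. A sketch (assuming n ≥ 1 participants):

```python
import math

def sample_size_weight(n: int) -> float:
    """Logarithmic sample-size weight, capped at 2.2.

    Implements the stated formula: min(2.2, 0.3 + log10(n) * 0.45).
    """
    return min(2.2, 0.3 + math.log10(n) * 0.45)
```

Going from 10 to 100 participants raises the weight by 0.45, while going from 1,000 to 10,000 adds the same 0.45 but runs into the 2.2 cap soon after, which is the diminishing-returns behavior described above.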

4 Outcome Analysis

We scan each study's results text for statistically significant language to determine whether findings were positive, negative, or mixed.

Positive (1.0): Study found statistically significant benefits.

Mixed (0.6): Both positive and negative signals found.

Unknown (0.5): Results text doesn't contain clear signals.

Negative (0.15): Study found no significant benefit.

Putting It All Together

Each study gets a weight (design × type × sample size) and an outcome score (0 to 1). These are combined into a weighted average, then adjusted for confidence.

Per-study weight: W = Design × Type × SampleSize
Raw score (weighted average): Raw = (Σ(Wᵢ × Outcomeᵢ) / ΣWᵢ) × 100
Confidence adjustment: Confidence = min(1, ΣW / 6)
Final score: Score = 50 + (Raw − 50) × Confidence
Why confidence matters: With only 1-2 studies, even if both are positive, we can't be confident yet. The confidence adjustment pulls the score toward 50 (uncertain) when evidence is thin.
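The aggregation steps above can be sketched end to end. This is an illustrative implementation under the stated formulas; the field names and data structure are assumptions, not the site's actual schema:

```python
def final_score(studies: list[dict]) -> float:
    """Combine per-study weights and outcomes into a 0-100 evidence score.

    Each study dict carries 'design', 'type', 'sample' (multipliers) and
    'outcome' (0 to 1). These field names are hypothetical.
    """
    weights = [s["design"] * s["type"] * s["sample"] for s in studies]
    total_w = sum(weights)
    # Weighted average of outcomes, scaled to 0-100.
    raw = sum(w * s["outcome"] for w, s in zip(weights, studies)) / total_w * 100
    # Thin evidence (low total weight) pulls the score toward 50.
    confidence = min(1.0, total_w / 6)
    return 50 + (raw - 50) * confidence
```

For example, a single positive double-blind placebo-controlled RCT with 100 participants carries weight 1.4 × 1.0 × 1.2 = 1.68, so confidence is only 0.28 and the score lands at 64 rather than 100. Three such trials push confidence to 0.84 and the score to 92.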

Evidence Level Thresholds

The final score maps to one of four evidence levels.

Weak: 0 – 37
Moderate: 38 – 54
Strong: 55 – 71
Very Strong: 72 – 100
Weak

Limited or no human clinical trials. Evidence may come only from animal or in-vitro studies.

Moderate

Some human evidence exists, but results are mixed or studies have limitations.

Strong

Multiple well-designed human studies show consistent positive results.

Very Strong

Extensive body of high-quality human research with consistently positive outcomes.
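The threshold table above maps directly to a small function. A sketch using the stated boundaries:

```python
def evidence_level(score: float) -> str:
    """Map a final 0-100 score to an evidence level (thresholds from the table above)."""
    if score >= 72:
        return "Very Strong"
    if score >= 55:
        return "Strong"
    if score >= 38:
        return "Moderate"
    return "Weak"
```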

Human Evidence Requirement

Confidence is primarily driven by human clinical data. Animal and in-vitro studies inform mechanisms but cannot replace human trials.

0 human studies

Evidence is capped at Weak regardless of how many positive animal studies exist.

<25% human weight

Evidence is capped at Moderate. Some human data exists but the majority comes from non-human models.

Substantial human data

Strong and Very Strong ratings require meaningful human clinical evidence.
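The caps described above can be sketched as a post-processing step. This is an assumed formulation: the exact boundary handling (e.g. whether exactly 25% human weight counts as capped) is not specified in the text:

```python
LEVEL_ORDER = ["Weak", "Moderate", "Strong", "Very Strong"]

def apply_human_cap(level: str, human_weight_fraction: float) -> str:
    """Cap the evidence level by the share of total study weight from human data.

    Illustrative logic only; boundary behavior at exactly 0.25 is an assumption.
    """
    if human_weight_fraction == 0:
        cap = "Weak"
    elif human_weight_fraction < 0.25:
        cap = "Moderate"
    else:
        return level  # Substantial human data: no cap applied.
    return LEVEL_ORDER[min(LEVEL_ORDER.index(level), LEVEL_ORDER.index(cap))]
```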

Transparency & Limitations

Fully Automated

All scores are computed algorithmically from the study data. No manual overrides, no editorial bias.

Not Medical Advice

These scores summarize research trends. They are not recommendations. Always consult a healthcare professional.

Keyword-Based Outcome Detection

Outcome scoring relies on text pattern matching, which may miss nuance.

Living Database

Scores update automatically as new studies are added. Evidence levels can change over time.