The transparency record

How this works.

The atlas did not fall from the sky. Every number on every page traces back to a dog, a study, a researcher, and a decision about how to present what they found. This page tells you where the data comes from, what we do with it, what we believe about how science should work, and what we will never do with what you share.

If something on this page ever stops being true, we have a problem. Hold us to it.

Where the data comes from

Sniff does not generate the foundational science. We synthesize it. The difference matters.

The genetic backbone of the atlas is the CanVAS dataset: 14,478 dogs genotyped at 77,215 SNP markers on the CanFam4 reference genome, drawn from 15 contributing research cohorts. The atlas covers 342 breeds, sourced from a CanVAS backbone of over 400 breeds. CanVAS was built by researchers who spent years collecting, genotyping, and curating canine DNA. We computed the 256-dimensional coordinate system from their data. The dogs are theirs. The computation is ours. Both are public.

The Golden Retriever Lifetime Study contributes 3,197 Heroes to the atlas. These dogs were enrolled by the Morris Animal Foundation beginning in 2012 and tracked through annual veterinary exams, bloodwork, and owner surveys for over a decade. Their genotypes entered through the CanVAS harmonization. Their phenotype data is not in the atlas today. If it joins, it will come through the Morris Data Commons under their data use terms. We did not run this study. We honor it.

Darwin's Ark contributes the behavioral genetics layer and the mixed-breed coverage that purebred-focused studies miss. The Morrill et al. 2022 finding that breed explains roughly 9% of behavioral variance reshaped how we talk about breed and behavior on every page of this site. Their data is deposited in public repositories. We cite it. We do not claim it.

OMIA (Online Mendelian Inheritance in Animals) is the canonical reference for every genetic disease mentioned on this site. When we say a variant is "well-established," we mean OMIA has it catalogued with a known gene, a mode of inheritance, and peer-reviewed publications supporting the association. When we say "emerging," we mean the evidence exists but replication is limited. When we say "contested," we mean researchers disagree. OMIA's curators at the University of Sydney maintain the standard. We follow it.
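
The three labels above can be read as a small decision rule. The sketch below is an illustration only, not our production code: the field names, the `classify` function, and the order in which the checks run (disagreement first) are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EvidenceTier(Enum):
    WELL_ESTABLISHED = "well-established"
    EMERGING = "emerging"
    CONTESTED = "contested"


@dataclass
class VariantEvidence:
    # Hypothetical fields, chosen for this illustration only.
    in_omia: bool
    gene_known: bool
    inheritance_known: bool
    peer_reviewed_publications: int
    researchers_disagree: bool


def classify(evidence: VariantEvidence) -> Optional[EvidenceTier]:
    """Map the evidence for a variant to one of the three labels."""
    if evidence.peer_reviewed_publications == 0:
        # Nothing citable, so no label and no claim.
        return None
    if evidence.researchers_disagree:
        # Assumption: open disagreement outranks the other two labels.
        return EvidenceTier.CONTESTED
    if evidence.in_omia and evidence.gene_known and evidence.inheritance_known:
        return EvidenceTier.WELL_ESTABLISHED
    # Published evidence exists, but it falls short of the criteria above.
    return EvidenceTier.EMERGING


example = VariantEvidence(
    in_omia=True,
    gene_known=True,
    inheritance_known=True,
    peer_reviewed_publications=4,
    researchers_disagree=False,
)
print(classify(example).value)
```

Returning `None` when nothing is published mirrors the rule stated elsewhere on this page: if we cannot cite it, we do not say it.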

OFA (Orthopedic Foundation for Animals) provides the hip, elbow, cardiac, and patellar screening data cited on breed pages. We use their published statistics and we note their limitation: OFA data is self-selected. Breeders choose which dogs to submit. Dogs that fail screening are less likely to be submitted. The true population prevalence of dysplasia is almost certainly higher than OFA numbers suggest. We say this on every page where OFA data appears because pretending the numbers are unbiased would be dishonest.
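
A small worked example shows why self-selection pushes the published number down. Every figure below is invented for illustration. None of it is OFA data, and the function name and submission rates are assumptions.

```python
def observed_prevalence(true_prevalence: float,
                        submit_rate_affected: float,
                        submit_rate_unaffected: float) -> float:
    """Prevalence seen in a voluntary registry when affected and
    unaffected dogs are submitted at different rates."""
    affected_submitted = true_prevalence * submit_rate_affected
    unaffected_submitted = (1 - true_prevalence) * submit_rate_unaffected
    return affected_submitted / (affected_submitted + unaffected_submitted)


# Hypothetical breed: 20% of dogs are truly dysplastic.
# Hypothetical behavior: results for affected dogs are submitted half as
# often as results for unaffected dogs (30% versus 60%).
true_rate = 0.20
registry_rate = observed_prevalence(true_rate, 0.30, 0.60)

print(f"true prevalence:     {true_rate:.1%}")
print(f"registry prevalence: {registry_rate:.1%}")
```

With these made-up inputs the registry shows roughly 11% where the population rate is 20%. The size of the gap depends entirely on submission rates that nobody can observe directly, which is why we flag the bias rather than try to correct for it.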

Breed-specific health surveys from kennel clubs, breed health foundations, and veterinary cohort studies (particularly the VetCompass program at the Royal Veterinary College) provide the lifespan, cause-of-death, and disease prevalence data on breed pages. When UK data and US data disagree, we report both. When a breed-club survey and a veterinary cohort study produce different numbers, we report both and explain why they differ. The answer is usually methodology, not dishonesty, but the reader deserves to see the divergence.

The Dog10K Consortium provides the imputation reference panel: 1,929 dogs whole-genome sequenced at high coverage, phased, and publicly released. This is the library GLIMPSE2 uses to fill in the gaps when a dog is sequenced at low coverage through the Sniff Panel. We did not build this panel. We use it. We cite it. We contribute to its growth through premium-tier genomes that feed back into the public reference.

Every one of these sources is the product of years of work by people who care deeply about dogs and science. We did not do their work. We built the layer that connects their work to each other and to the people who need it most: the owners, the vets, the breeders, and the researchers who are trying to make dogs' lives better.

The integration is what was missing. Not the data. The data existed. It sat in separate silos, published in separate journals, stored in separate databases, accessible to separate communities. A breed's genetic diversity score lived in one paper. Its cancer mortality rate lived in another. Its OFA hip statistics lived in a third. Its behavioral profile lived in a fourth. Nobody had stitched them together into one place where a person could see the full picture of what it means to be a Golden Retriever, genetically, medically, and behaviorally.

That is what Sniff does. We stitch. We cite every thread. And we make the whole tapestry free.

What we believe

Genetic risk is not disease certainty.

A dog that carries a variant associated with progressive retinal atrophy may never go blind. Penetrance, the probability that a genotype produces the expected phenotype, varies from near-zero to near-complete depending on the variant, the genetic background, and the environment. Every health-related finding on this site carries a penetrance note when data exists, and an honest "penetrance not fully characterized" when it does not. We will never flatten a complex risk into a binary scare. That tactic sells tests and misleads owners.

Attribution is not optional.

Every claim on every page links to its source. Not "studies show" with no citation. Not "research suggests" with no DOI. The actual paper, the actual dataset, the actual methodology. If you want to verify a number we published, the link is right there. If we cannot cite it, we do not say it.

Methodology is visible, not buried.

Our scoring rubrics, our projection pipeline, our imputation accuracy, our confidence tiers, our QC thresholds, all published on the methodology page. Not summarized. Published. A population geneticist can read our methodology and reproduce our results from the same public data. That is the standard. If we ever fall below it, we want to know.

Science evolves and so do we.

When a variant gets reclassified, we reclassify it. When a study is retracted, we remove the citation. When our own computation produces a result we later discover was wrong, we correct it and log the correction with a date and an explanation on the public corrections page. The corrections page is not a shame ledger. It is proof that we are paying attention.

Cohorts are heroes, not subjects.

The 3,197 Golden Retrievers in the GRLS did not volunteer. Their owners did. Their owners committed to 14 years of annual exams and surveys because they believed their dog's life could teach something. We call those dogs Heroes because that is what they are. The 14,478 dogs in the CanVAS backbone are Founders because without them the atlas does not exist. The language matters. "Subject 617940" erases the life. "Ember, Hero 1247, Distinguished Oldie" honors it.

The owner is the steward, not the customer.

Nobody who adds their dog to the atlas is buying a product. They are contributing to something that outlasts any transaction. The science their dog enables is free for everyone. The knowledge that emerges from their dog's neighborhood helps dogs they will never meet. The relationship between an owner and this atlas is stewardship, not consumption.

What Sniff will never do

Sell individual dog data to anyone.

Not to researchers. Not to pharmaceutical companies. Not to insurers. Not to pet food brands. Not to anyone. When a researcher needs dogs like yours for a study, we send you the opportunity. You decide. You contact them. You get compensated. We never hand over your dog's data. This commitment is written into our terms and survives any sale, merger, or acquisition of this company.

Claim certainty where the evidence is uncertain.

If a variant has incomplete penetrance, we say so. If a health association is based on a single study with a small sample, we say so. If two studies disagree, we report both. We do not pick the scarier number to drive engagement. We do not round up to make a breed page more dramatic. The data is the data. Our job is to present it faithfully, not to editorialize it into something more clickable.

Compete with veterinarians.

We inform. Vets diagnose. Every health-related page on this site tells you to consult your veterinarian before making medical decisions. We provide the genetic context. Your vet provides the clinical judgment. Those are different roles and we will never blur the line between them.

Take money from a pet food company while scoring their food.

The food scoring engine is independent. No manufacturer, distributor, or retailer has ever paid us anything. No affiliate commission influences a score. No brand gets advance notice of their rating. The score is the score. If that ever changes, the change is posted on the Pledge page with a date before any related content publishes.

Use anxiety as a sales lever.

"Find out if your dog is at RISK!" is how the industry sells DNA tests. Fear of what you might find is the conversion mechanism. We refuse to participate in that. If your dog carries a variant, we tell you what it means in plain language with honest penetrance data. We do not manufacture urgency. We do not use red warning badges on findings that do not warrant alarm. The emotional state of the owner reading the page is something we consider in every design decision, and we design for clarity, not panic.

Hide behind jargon.

If a finding cannot be explained in language a non-scientist can understand, the page is not done. Every breed page, every gene page, every health finding has a plain-language summary above the fold and technical detail below. The depth is there for the researcher. The clarity is there for the owner. Both deserve access to the same information.

What is true today versus what is coming

The atlas is young. It has strengths and it has gaps. Here is the honest inventory.

What is solid today

14,478 genotyped dogs across 342 breeds with real PCA-256 positions
3,197 GRLS Heroes with Oldie sub-cohort identification
Projection pipeline validated at 99.6% breed-cluster accuracy on external data
GLIMPSE2 imputation pipeline validated at 97.85% leakage-free non-reference concordance (NRC) at 1x coverage and 95.95% at 0.4x coverage
Coverage-sensitivity curve measured at four depths with launch-gate pass at all production tiers
3 carrier-status variants shipped with honest tag-SNP-proxy caveats
Carrier calls validated where possible, documented where validation failed, with the failures published alongside the successes

What is flagged

Carrier-status calls use tag-SNP proxies, not direct causal variant genotyping. Every variant page says this. When direct imputation from whole-genome data arrives, these proxies will be replaced and the upgrade will be logged.
Breed-health data is incomplete for most of the 215 breeds with editorial pages. Some pages have deep health profiles. Many have genetics only. We are filling this in systematically with cited data from published studies. If a breed page lacks health data, it says so rather than showing nothing.
The atlas backbone is computed from microarray data at 77,215 markers. This is dense enough for breed-level and neighborhood-level placement. It is not dense enough for fine-grained within-breed substructure analysis on all breeds. Whole-genome sequencing data (via Sniff Panel) will deepen this resolution over time.
Behavioral associations on breed pages draw from Morrill et al. 2022 (Darwin's Ark). The headline finding (breed explains about 9% of behavioral variance) means breed-based behavioral predictions are weak. We present them as population-level tendencies, never as predictions about your individual dog.

What is coming

Breed-health profiles for all 215 breeds, sourced from VetCompass, OFA, OMIA, Donner 2023, and published breed-specific studies
Whole-genome sequencing via Sniff Panel with validated imputation at production accuracy
Re-imputation of all existing dogs against growing reference panels, with quarterly atlas version releases
Research recruitment infrastructure connecting willing owners with funded studies
API access exposing the 256-dimensional knowledge graph for developers and researchers
Breed-aware imputation refinement trained on Sniff's own data (the thing nobody else can build)

Each of these will be documented when it ships: what changed, what improved, what new caveats were introduced. The corrections page and the atlas changelog are the living record of the system's evolution.

The correction record

When we get something wrong, we say so. The corrections page lives at sniff.world/corrections/. Every entry includes:

What was wrong
When we found it
What we changed
Why the original was incorrect
Who flagged it (if they want to be credited)

We do not delete corrections. They accumulate. The length of the corrections page is not a sign of failure. It is a sign that we are looking.
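
For readers who think in data structures, the entry fields and the no-deletion rule can be sketched as an append-only log. This is an illustration under assumptions, not the code behind the corrections page: the class names, field names, and the example entry are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: an entry cannot be edited after it is logged
class Correction:
    what_was_wrong: str
    found_on: date
    what_we_changed: str
    why_original_was_incorrect: str
    flagged_by: Optional[str] = None  # None when the reporter declines credit


class CorrectionLog:
    """Append-only: entries can be added and read, never removed."""

    def __init__(self) -> None:
        self._entries: List[Correction] = []

    def add(self, entry: Correction) -> None:
        self._entries.append(entry)

    def entries(self) -> Tuple[Correction, ...]:
        # Return a tuple so callers cannot mutate the underlying list.
        return tuple(self._entries)


log = CorrectionLog()
log.add(Correction(
    what_was_wrong="Placeholder: a figure that did not match its source",
    found_on=date(2026, 1, 1),  # placeholder date, not a real correction
    what_we_changed="Placeholder: figure replaced with the cited value",
    why_original_was_incorrect="Placeholder: transcription error",
))
print(len(log.entries()))
```

The class deliberately has no method for deleting or editing an entry. A mistake in a correction would be fixed by adding another entry.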

The invitation

If you find a number that does not match its cited source, tell us. If you find a breed page making a claim without a citation, tell us. If you find a gene page oversimplifying penetrance, tell us. If you are a researcher whose work we cited incorrectly, tell us. If you are a breed-club health coordinator who sees data that conflicts with your registry's findings, tell us.

[email protected]

The atlas gets better when people who know more than we do take the time to say so. Every correction is a contribution. Every contribution is credited if you want it to be.

This is an open system. The data is free. The methodology is published. The corrections are public. The science belongs to everyone.

That is how this works.

Last updated May 31, 2026

Sources: CanVAS (Brundage 2026) · GRLS · Darwin's Ark · Dog10K · OMIA · OFA · VetCompass