Memory-Reliability Experts Are Not Categorically Barred: Eleventh Circuit Clarifies Rule 702/403 Limits Beyond Eyewitness-ID Testimony
1. Introduction
In United States v. Stefan Eberhard Zappey (11th Cir. Jan. 21, 2026), the Eleventh Circuit affirmed a former Department of Defense elementary school teacher’s convictions (and life sentence) for sexually abusing multiple children under twelve, in violation of 18 U.S.C. §§ 3261, 2241(c), 2244(a)(5). The prosecution relied heavily on adult witnesses’ recollections of abuse suffered as young children (roughly 6–9 years old) more than a decade earlier, alongside corroborative testimony from additional former students and school personnel who had observed inappropriate conduct.
The appeal turned on evidentiary gatekeeping: whether the district court abused its discretion by limiting defense expert Dr. Christopher Tillitski’s testimony on memory science and by excluding defense expert Dr. Jeffrey Neuschatz’s testimony (proffered outside the jury’s presence). Zappey argued these rulings prevented him from presenting a full defense centered on the alleged unreliability of childhood memories and the risk of false or distorted recall.
2. Summary of the Opinion
The Eleventh Circuit held that:
- The district court properly limited Dr. Tillitski’s testimony to avoid impermissible expert commentary on witness credibility and to keep the testimony tied to issues genuinely helpful to the jury under Federal Rule of Evidence 702 and the balancing constraints of Rule 403.
- The district court did not abuse its discretion in excluding Dr. Neuschatz’s testimony as needlessly cumulative under Rule 403, given the overlap with Dr. Tillitski’s admitted testimony.
- Even assuming arguable error, the court emphasized the breadth of the evidence against Zappey and explained why any such error would be harmless under circuit standards.
Most notably, the court clarified an important doctrinal boundary: while Eleventh Circuit precedent categorically excludes certain expert testimony about eyewitness reliability, that per se rule does not automatically extend to the broader field of memory-reliability science. Instead, admissibility of memory-science testimony is a case-specific Rule 702 “helpfulness” inquiry, constrained by the longstanding bar against experts opining—directly or by implication—on witness credibility.
3. Analysis
3.1. Precedents Cited
A. Standards of review and the trial court’s evidentiary discretion
- United States v. Reeves and United States v. Barton: The panel grounded its review in deferential abuse-of-discretion principles, stressing the “range of permissible choices” available to trial judges.
- Kumho Tire Co., Ltd. v. Carmichael and United States v. Frazier: These anchor the “manifestly erroneous” threshold and reinforce the district court’s gatekeeping role for expert testimony.
- United States v. Brown and United States v. Henderson: Cited for what constitutes abuse of discretion (wrong legal standard, clear error of judgment, erroneous view of law, improper procedure).
- United States v. Langford and United States v. Henderson: Provide the circuit’s harmless-error framing for evidentiary rulings—whether there is a reasonable likelihood the error affected substantial rights or substantially influenced the outcome.
B. Rule 702/Daubert gatekeeping and Rule 403 backstops
- Daubert v. Merrell Dow Pharms., Inc.: The court relied on Daubert’s reliability-and-relevance framework, emphasizing that expert testimony must help the jury and have a “scientific connection” to the dispute.
- Knepfle v. J-Tech Corp. and Hughes v. Kia Motors Corp.: Used to explain that even Daubert-satisfying expert testimony may still be excluded under other evidentiary rules.
- Hibiscus Assocs. Ltd. v. Bd. of Trs. of Policemen & Firemen Ret. Sys. of Detroit: Supports excluding expert opinions that merely address matters of common understanding or are otherwise unnecessary.
- United States v. Frazier: Central to the court’s Rule 403 analysis: expert testimony can have “talismanic significan[ce]” and may mislead or confuse; it can be excluded as cumulative or time-consuming.
- Cook ex rel. Est. of Tessier v. Sheriff of Monroe Cnty. and Prosper v. Martin: Reinforce that the proponent bears the burden to establish expert qualification, reliability, and—critically here—helpfulness.
C. The credibility line: eyewitness reliability, memory science, and the jury’s role
- United States v. Thevis: The foundational rule against expert testimony that comments on witness credibility or invites “a barrage of marginally relevant psychological evidence.”
- United States v. Smith and United States v. Daniels: Cited for the proposition that expert testimony on eyewitness identification reliability is generally unhelpful because jurors can evaluate it with cross-examination tools—and for the Eleventh Circuit’s “attitude of disfavor” toward testimony that edges into credibility judgments.
- United States v. Gillis: Used as an analogy in emphasizing that expert testimony cannot do the jury’s job by leaving “more than just an inference” to draw on ultimate issues (there discussed through a Rule 704(b) lens).
- Sorrels v. NCL (Bahamas) Ltd.: Supports the idea that wholesale exclusion can be an abuse when some portion is reliable, but here the district court admitted substantial testimony and only limited specific parts.
- United States v. Wilk: Reinforces that even arguably relevant evidence may be properly excluded when it is not “crucial or necessary” to establishing a valid defense.
D. Cumulative evidence doctrine (Rule 403) and multi-expert overlap
- Johnson v. United States and Tran v. Toyota Motor Corp.: Supply the Eleventh Circuit’s multi-factor test for whether expert testimony is cumulative (relative qualifications, evidentiary bases, comprehensiveness, duplication).
- United States v. Gaskell: Notes the preference to admit evidence where the question is close, but does not eliminate trial court discretion to exclude truly duplicative proof.
- United States v. Nunez: Although addressing lay evidence, it supports a general principle: excluding testimony that only increases the number of witnesses saying the same thing does not infringe defense rights.
- Bonar v. Dean Witter Reynolds, Inc.: Cited as a contrast—where calling testimony “merely cumulative” may be unfair if it is uniquely influential or uniquely directed to an issue.
E. Sister-circuit guidance and comparative perspective
- United States v. Rouse (8th Cir.): Offered persuasive support for a middle path: experts may educate juries on dangers of implanted memory and suggestive interviewing, but may not opine on credibility.
- Cohen v. Cohen (3d Cir.): Illustrates rigorous policing of memory-repression theories, excluding testimony lacking sound methodology or a good factual “fit.”
- Guam v. McGravey (9th Cir.): In dicta, recognized that defendants can present expert testimony about children’s susceptibility to suggestion, while reaffirming trial-court discretion.
- Thill v. Richardson (7th Cir.): Referenced by analogy, acknowledging expert testimony on false memory implantation and social influences affecting childhood memory.
3.2. Legal Reasoning
A. The key doctrinal clarification: eyewitness-ID experts vs. memory-science experts
The court reaffirmed that under Eleventh Circuit precedent, district courts must exclude expert testimony “related to the credibility of eyewitnesses,” emphasizing United States v. Thevis, United States v. Smith, and United States v. Daniels. But the panel then made the opinion’s most consequential move: it distinguished eyewitness identification reliability from the broader domain of memory reliability.
The panel explicitly rejected the idea that the categorical bar applicable to eyewitness-identification experts automatically swallows all expert testimony about memory science. Instead, it held that memory-reliability testimony may be admissible if it satisfies Rule 702 (helpfulness/fit) and survives Rule 403, while also respecting the prohibition against invading the jury’s credibility function.
This is framed as a “novel question” of increasing practical importance in delayed-disclosure sexual abuse prosecutions, where a victim’s memory may be the primary evidence.
B. Why limiting Dr. Tillitski was within discretion
The district court admitted substantial defense expert testimony about memory formation and decay, suggestibility, flawed memories, memory cues, and children’s difficulty distinguishing appropriate from inappropriate conduct—mirroring and countering topics raised by the government’s expert. But it excluded or narrowed certain lines (e.g., certainty/accuracy linkage, “independent verification” framing tied to lack of physical proof, and segments that risked becoming a proxy credibility attack).
The Eleventh Circuit accepted this as a careful “fit” and “helpfulness” judgment, not a legal error. The appellate court emphasized that the district court did not exclude memory science wholesale; rather, it trimmed areas that, in its view, (1) were matters of common sense, (2) fit the factual disputes poorly, (3) risked confusing the jury, or (4) crossed into telling the jury how to weigh the witnesses’ testimony.
C. Why excluding Dr. Neuschatz as cumulative was within discretion
Applying Johnson v. United States and Tran v. Toyota Motor Corp., the panel found that Dr. Neuschatz’s proffer largely duplicated Dr. Tillitski: similar professional level, overlapping topics, and a proffer that primarily supplied additional laboratory examples (e.g., “Bugs Bunny at Disney,” animal attack, alien abduction paradigm) without a distinct incremental contribution beyond what the jury had already heard.
Because Rule 403 permits exclusion of “needlessly cumulative” evidence, and because the district court had already permitted a significant memory-science defense through Dr. Tillitski, the exclusion fell within the “range of permissible choices.”
D. Harmlessness as a reinforcing rationale
The court underscored that the government’s proof was not limited to the four charged victims’ recollections. Two additional women testified to similar abuse, and multiple adult witnesses (teachers, a dental assistant, and a parent) provided corroborative evidence of inappropriate conduct and contemporaneous reporting. This broader evidentiary record made it harder for Zappey to show that any additional expert testimony on memory would likely change the outcome.
3.3. Impact
The decision is likely to influence three recurring evidentiary battlegrounds in child sexual abuse prosecutions—especially delayed-disclosure cases:
- Expanded doctrinal space for memory-science experts (with limits): Trial courts in the Eleventh Circuit are not required to treat “memory reliability” testimony as categorically inadmissible merely because “eyewitness reliability” experts are disfavored. This invites more tailored Rule 702 litigation and more nuanced admissibility rulings.
- Sharper policing of the credibility boundary: The opinion reiterates that memory experts cannot become credibility referees. Even when memory testimony is permitted, courts will scrutinize whether the expert is effectively telling the jury which witnesses to believe.
- Greater emphasis on Rule 403 cumulativeness analysis when defendants offer multiple memory experts: The ruling signals that a defendant generally gets to present a memory-science theory, but not necessarily through multiple experts repeating the same core points with different studies and examples.
Practically, litigants should expect more detailed pretrial hearings and more granular slicing of expert topics—allowing general education about memory mechanisms while excluding “application” testimony that tracks the witnesses too closely or implies a credibility conclusion.
4. Complex Concepts Simplified
- Rule 702 (Expert Testimony): An expert may testify only if the expert’s knowledge will help the jury decide something it cannot easily decide without specialized information, and the testimony is based on reliable methods properly applied.
- Daubert / Gatekeeping: The judge screens expert testimony for reliability and relevance (“fit”) before the jury hears it.
- Rule 403 (Balancing / Cumulative Evidence): Even relevant evidence can be excluded if it risks unfair prejudice, confusion, wasted time, or if it is needlessly repetitive.
- “Invading the province of the jury” (Credibility): Experts generally cannot tell the jury that a witness is truthful or untruthful, or dress that conclusion in scientific language; credibility is the jury’s job.
- Delayed disclosure & grooming: Delayed disclosure refers to victims reporting abuse long after it occurred; grooming describes tactics used by offenders to gain access and reduce resistance/reporting. The opinion treats these as legitimate expert topics, but still subject to evidentiary limits.
- False memory / suggestibility: A person may recall an event inaccurately, sometimes influenced by suggestion or repeated prompting. The court accepts that this subject can be scientifically studied, but whether it is admissible depends on whether it helps the jury in the specific case and does not become a credibility verdict.
5. Conclusion
United States v. Stefan Eberhard Zappey both affirms Zappey’s convictions and clarifies evidence law in the Eleventh Circuit: while expert testimony attacking eyewitness identification reliability remains categorically disfavored, that doctrine does not automatically bar memory-reliability science more broadly. Instead, district courts retain discretion to admit memory-science testimony when it truly assists the jury and stays clear of credibility determinations, and to exclude additional experts when the testimony becomes duplicative under Rule 403.
The opinion’s larger significance lies in its realism about modern delayed-disclosure litigation: as older claims reach court and memory becomes central evidence, admissibility fights will increasingly hinge not on categorical rules, but on careful, case-specific “fit,” credibility-boundary enforcement, and cumulative-evidence discipline.