11 Comments

There is a lot of the blind leading the blind in peer review on statistical issues. It's a hard problem: many quantitatively savvy scientists advocate bad statistical ideas while being perfectly sound on the other quantitative issues in their discipline.

There was a great discussion of the problems with post-hoc power a few years ago, when surgeons insisted that post-hoc power analysis was needed and vehemently defended it:

https://discourse.datamethods.org/t/observed-power-and-other-power-issues/731/13
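For anyone who hasn't read that thread, the core objection fits in a few lines. Below is a minimal sketch of my own (not taken from the linked discussion), assuming a one-sided z-test with known variance: the "observed power" you get by plugging the observed effect back in is just a monotone function of the p-value, so it tells you nothing the p-value didn't already.

```python
# Minimal sketch (assumption: one-sided z-test, known variance).
# "Observed power" treats the observed z as if it were the true effect,
# so it is a deterministic, monotone transform of the p-value.
from scipy.stats import norm

def observed_power(p_value, alpha=0.05):
    z_obs = norm.isf(p_value)       # observed z implied by the p-value
    z_crit = norm.isf(alpha)        # one-sided critical value
    return norm.sf(z_crit - z_obs)  # power evaluated at effect = z_obs

for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.2f}")
# p = 0.05 always maps to observed power 0.50, regardless of the data.
```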

Peer review made sense in the era of print journals when space limitations required selecting only a fraction of submitted articles. However, with the advent of the internet, its usefulness seems less clear to me. This example is just one among many that highlights its shortcomings.

Feedback and revisions are undoubtedly crucial, but relying on a small sample of five so-called experts to decide publication often results in arbitrary decisions. Moreover, the peer review process significantly slows scientific progress and tends to push researchers toward more conservative, non-controversial ideas and methodologies.

I’d advocate for a centralized platform where all work can be published, allowing the scientific community to assess its value through votes, comments, and ongoing feedback. This approach would foster transparency, encourage diverse perspectives, and accelerate the exchange of ideas.

Something very similar happened to us. Although in our case, we were using an unusual statistical technique called "conformal prediction", so maybe we can cut the reviewer some slack. We explained the purpose of the technique very clearly, but the reviewer totally misunderstood the technique. After some protest, we convinced the journal to bring in a statistical reviewer. The statistical reviewer also didn't understand what we were doing, and thought we were "cheating" to get the result we wanted. One of our co-authors was Michael I. Jordan, one of the leading statisticians alive today. After more protest from faculty on our author list, the journal said they couldn't accept the publication due to the poor response from the statistical reviewer. Instead of fighting it further we decided to take the article elsewhere since it wasn't getting a fair treatment. Maybe I'll write up an account of the ordeal. The worst part was the journal made our first author spend many hours reformatting the article to the journal's standards, only to reject it because the statistical reviewer was clueless about what we were doing. So a lot of time was wasted. Very frustrating experience!
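For readers who haven't met conformal prediction, here is a generic split-conformal sketch in Python. To be clear, this is a textbook toy example with made-up data, not what we actually did in the paper: you fit on one half of the data, compute residuals on the other half, and the residual quantile gives prediction intervals with finite-sample marginal coverage under exchangeability. The calibration half never touches the fitting step, which is exactly why there is no "cheating" involved.

```python
# Toy split-conformal regression sketch (standard textbook version; the data
# and the linear model here are made up purely for illustration).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=500)

# Fit on one half, calibrate on the other; the calibration half is
# untouched by model fitting.
X_fit, y_fit, X_cal, y_cal = X[:250], y[:250], X[250:], y[250:]
model = LinearRegression().fit(X_fit, y_fit)

# Conformity scores: absolute residuals on the calibration half.
scores = np.abs(y_cal - model.predict(X_cal))
alpha = 0.1
q = np.quantile(scores, np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores))

# Interval for a new point; marginal coverage >= 1 - alpha holds
# assuming exchangeability of the data.
x_new = rng.normal(size=(1, 3))
pred = model.predict(x_new)[0]
print(f"90% conformal interval: [{pred - q:.2f}, {pred + q:.2f}]")
```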

You co-authored with Mike Jordan! Very cool!

Well it’s just a preprint right now due to the aforementioned SNAFU, but yes =)

How can you not find it sad when an editor doesn't do their job properly... Gatekeepers being stupid is the worst :(

Indeed, don’t count on journal editors to have a correct understanding of basic statistics.

For more than two years now I have been telling editors of physics journals that the statistics behind Aspect's 2022 Nobel Prize in Physics are flawed.

In order to violate the CHSH inequality, Aspect requires:

1. cos(x)= P(x,=) - P(x,≠)

with

1a. P(x,=)=N(x,=)/N

1b. P(x,≠)=N(x,≠)/N

1c. N(x,=)=N(x,+,+)+N(x,-,-)

1d. N(x,≠)=N(x,+,-)+N(x,-,+)

1e. N=N(x,=)+N(x,≠)

And so,

2. cos(x)=1-2sin²(x/2), x in [0,2π)

3. P(x,=)+P(x,≠)=1

gives:

4. P(x,≠)=sin²(x/2), x in [0,2π).
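Spelling out how 4. follows from 1., 2., and 3., with nothing extra assumed beyond rearranging:

```latex
\cos x = P(x,=) - P(x,\neq), \qquad P(x,=) + P(x,\neq) = 1
\;\Longrightarrow\;
P(x,\neq) = \tfrac{1}{2}\bigl(1 - \cos x\bigr) = \sin^{2}\!\bigl(\tfrac{x}{2}\bigr),
\qquad x \in [0, 2\pi).
```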

The x is the angle between a and b, in [0,2π).

Here, a is Alice's instrument parameter vector and b is Bob's instrument parameter vector.

The angle is measured in the plane orthogonal to the A-S-B axis. This "variation of x in the plane orthogonal to the A-S-B axis" is sufficient variation for understanding the statistics of the experiment, and it is also physically valid.

Point 4 cannot be met by any data.

For x in [0,2π), the alleged associated probability density is f(x)=(1/2)sin(x).

This alleged probability density is derived from 4. via f(x)=dF(x)/dx.
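Working out that derivative explicitly (plain chain rule):

```latex
f(x) = \frac{d}{dx}\,\sin^{2}\!\Bigl(\frac{x}{2}\Bigr)
     = 2\sin\!\Bigl(\frac{x}{2}\Bigr)\cos\!\Bigl(\frac{x}{2}\Bigr)\cdot\frac{1}{2}
     = \frac{1}{2}\sin x,
\qquad x \in [0, 2\pi).
```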

But please observe,

f(x)=(1/2)sin(x), x in [0,2π),

isn't a probability density: it is negative for x in (π,2π).

Therefore Aspect's experiment is flawed.

Thank you.

Of course, after having this pointed out, they agreed to look at it again.

I hope?

Yeah, well, I once wrote a fable for a collection of fables, only to be told it would be a no because my fable (I researched what fables actually are --- FAIL) had no character development. The whole point of fables is that there is no character development; the animals are ciphers for human foibles... sheesh. They wanted something they were literally not asking for. Moral of this story: do not give them what they ask for, do not do your own research, find out which frameworks the requesters are assuming without any conscious, intentional inquiry on their part, and give them that. Easy peasy.

I have a stats PhD (though I admit I've worked mostly in AI/ML since graduating a while ago), but I still don't understand why post hoc power analysis, while not preferred for clarity of communication, isn't totally feasible. I can easily imagine running the exact same simulations, with the same assumptions I would have made a priori, and using those to calculate relevant statistics that give you an idea of how likely it is that your result is a true or false positive/negative. You'd have to be careful and set up a complicated simulation with sound logic about what you can actually claim and what your assumptions about the potential sampling distributions mean, and you definitely can't just plug in the empirical estimates of the moments from the study. But it seems false to me that you can't gather any additional information to better understand your result by doing post hoc analysis, simulation, and reasoning.
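For concreteness, here is a rough sketch of the kind of post hoc simulation I mean, with the effect size and SD set to the values one would have assumed a priori (all numbers below are made up), rather than the observed moments from the completed study:

```python
# Rough sketch: re-run the power simulation you would have specified a priori
# (assumed effect size and SD, actual sample size), not one that plugs in the
# observed estimates from the study.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)

def simulated_power(effect, sd, n_per_arm, alpha=0.05, n_sims=5_000):
    """Fraction of simulated two-arm trials whose t-test rejects at alpha."""
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, sd, n_per_arm)
        treated = rng.normal(effect, sd, n_per_arm)
        if ttest_ind(treated, control).pvalue < alpha:
            rejections += 1
    return rejections / n_sims

# Assumptions that would have gone into an a priori calculation.
print(simulated_power(effect=0.5, sd=1.0, n_per_arm=40))
```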
