My limited knowledge on this subject:
The z-score is how many standard deviations you are from the mean.
In statistical analysis, results are often evaluated against a p-value (probability) threshold of 0.05 (or 5%), which corresponds to a z-score of 1.96 (roughly 2) on a two-tailed test.
So, when you’re looking at your data, things with a z-score > 2 or < -2 would correspond to findings that are “statistically significant,” meaning there’s less than a 5% chance you’d see a result that extreme from random chance alone.
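If it helps to see it in numbers, here’s a minimal sketch (using scipy, with made-up example values) of how a z-score maps to a two-tailed p-value and the roughly-2 cutoff:

```python
from scipy.stats import norm

# Hypothetical example numbers: an observed value, plus the mean and
# standard deviation of whatever it's being compared against.
observed = 112.0
mean = 100.0
std_dev = 5.0

# z-score: how many standard deviations the observation is from the mean
z = (observed - mean) / std_dev

# Two-tailed p-value: probability of a result at least this extreme
# (in either direction) if there were no real effect
p = 2 * norm.sf(abs(z))

print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
print("significant at p < 0.05?", abs(z) > 1.96)

# Sanity check: z = 1.96 corresponds to p of about 0.05
print(f"p at z = 1.96: {2 * norm.sf(1.96):.3f}")
```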
As others here have pointed out, z-scores closer to 0 correspond to findings where the researchers couldn’t be confident that whatever was being tested was any different from the control, the kind of boring paper that often doesn’t get published. “We tried some stuff but idk, didn’t seem to make a difference.” But it could also make for an interesting paper: “We tried putting healing crystals above cancer patients but it didn’t seem to make any difference.”
There’s certainly a lot to discuss about experimental design and ethics. Peer review and good design hopefully minimize the clearly undesirable scenarios you describe, as well as other, subtler sources of error.
I was really just trying to explain what we’re looking at on OP’s graph.
i’m in a couple “we tried some stuff but it really didn’t work” medical “research” papers, which we published so no one would try the same thing again.
But then you have competing bad outcomes:
Some people will refuse other treatments regardless, so you’re not changing the outcome.