**I was a bit shocked when I saw that the researchers report a highly significant result in their paper and reject the null hypothesis. What I found was a non-significant result and a slightly higher probability for the null hypothesis. But let’s start from the beginning...**

Today I took the time to analyse a study about cyclic movement therapy with the MOTOmed, published in 2005 by Kamps & Schule.

I was most interested in the results of the 6-Minute-Walk-Test (6-MWT) because I've found some evidence that cyclic movement therapy may have a beneficial effect on mobility in general and especially on walking capacity after stroke. My aim was a detailed investigation of the reported effect sizes and the statistical power of the study.

### Step 1: Power-analysis

First I ran a post hoc power calculation with G*Power to estimate the power (1 − β error probability) of the post-intervention 6-MWT comparison between the intervention and control group. I wasn't really surprised that the study is highly underpowered, given the small effect size (d = 0.42), the small sample size (n = 31) and the high variability of the data.

Given the fact (and this is often neglected!) that low power reduces the likelihood that a statistically significant result reflects a true effect in the population, I planned to run a summary-statistics Bayesian independent samples t-test. This Bayesian alternative to a conventional Student's t-test provides much richer information about the samples and the difference in means than a simple *p*-value and its more or less subjective interpretation of probability.

**Fig 1.** Post hoc power calculation of the reported 6-Minute-Walk-Test results of the study.
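For readers without G*Power, the same kind of post hoc power estimate can be sketched in base R from the reported summary statistics. This is only a rough cross-check under assumptions: it assumes a two-sided independent samples t-test at α = 0.05 and Cohen's d based on the pooled SD, which may not match the exact G*Power settings behind Fig 1.

```r
# Summary statistics as reported for the post-intervention 6-MWT
m1 <- 237.84; s1 <- 115.66; n1 <- 16   # intervention group
m2 <- 195.29; s2 <- 85.94;  n2 <- 15   # control group
alpha <- 0.05                          # ASSUMPTION: two-sided significance level

# Cohen's d based on the pooled standard deviation
sd_pooled <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
d <- (m1 - m2) / sd_pooled

# Post hoc power of the two-sided test, from the noncentral t distribution
df     <- n1 + n2 - 2
ncp    <- d * sqrt(n1 * n2 / (n1 + n2))   # noncentrality parameter
t_crit <- qt(1 - alpha / 2, df)           # critical t-value
power  <- pt(t_crit, df, ncp = ncp, lower.tail = FALSE) +
          pt(-t_crit, df, ncp = ncp)

round(c(d = d, power = power), 2)
```

The power that comes out of this sketch should sit far below the conventional 0.80 target, which is the point of calling the study underpowered.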

### Step 2: Computation of t-value

For the Bayesian approach I needed the t-value of the test statistic, which was not reported in the paper. So I ran a sum t-test (a t-test computed from summary statistics) in R to get the t-value from the reported means, SDs and sample sizes of the groups:

    # Write function
    t.test2 <- function(m1, m2, s1, s2, n1, n2, m0 = 0, equal.variance = TRUE) {
      if (equal.variance == FALSE) {
        se <- sqrt((s1^2 / n1) + (s2^2 / n2))
        # welch-satterthwaite df
        df <- ((s1^2 / n1 + s2^2 / n2)^2) / ((s1^2 / n1)^2 / (n1 - 1) + (s2^2 / n2)^2 / (n2 - 1))
      } else {
        # pooled standard deviation, scaled by the sample sizes
        se <- sqrt((1 / n1 + 1 / n2) * ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
        df <- n1 + n2 - 2
      }
      t <- (m1 - m2 - m0) / se
      dat <- c(m1 - m2, se, t, 2 * pt(-abs(t), df))
      names(dat) <- c("Difference of means", "Std Error", "t", "p-value")
      return(dat)
    }

    # Calculate t-statistic and p-value
    t.test2(m1 = 237.84, m2 = 195.29, s1 = 115.66, s2 = 85.94, n1 = 16, n2 = 15,
            equal.variance = TRUE)

**Fig 2.** R-script for sum.t.test

### Step 3: A shock!

I was a bit shocked by the result you see below:

**Fig 3.** Result of the sum.t.test

The sum t-test returned a t-value of *1.15* and a **p-value of .26**, which means that the result of the t-test is NOT(!) significant. However, Kamps & Schule reported a **p-value of .003**, which reflects a highly significant result of their group interaction calculation.

Even though it's clear that they performed an ANOVA and not a t-test, I'm wondering how they got this highly significant result. To my understanding that seems to be impossible!

**How did they get this result? I don't know! Did the researchers report wrong results, or is something wrong with my computation?**
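One quick way to probe the second possibility is to repeat the comparison without the equal-variance assumption, since the two reported SDs differ quite a bit. The sketch below runs Welch's t-test on the same reported summary statistics. It only re-checks the post-intervention group comparison and says nothing about the ANOVA interaction term reported in the paper.

```r
# Reported post-intervention 6-MWT summary statistics
m1 <- 237.84; s1 <- 115.66; n1 <- 16   # intervention group
m2 <- 195.29; s2 <- 85.94;  n2 <- 15   # control group

# Welch's t-test: no equal-variance assumption
v1 <- s1^2 / n1                        # squared standard error, group 1
v2 <- s2^2 / n2                        # squared standard error, group 2
se_welch <- sqrt(v1 + v2)
t_welch  <- (m1 - m2) / se_welch

# Welch-Satterthwaite approximation of the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
p_welch  <- 2 * pt(-abs(t_welch), df_welch)   # two-sided p-value

round(c(t = t_welch, df = df_welch, p = p_welch), 3)
```

With the reported values the Welch p-value comes out at roughly .25, so the choice between pooled and unpooled variances does not explain the gap to .003.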

I immediately contacted Schule via ResearchGate and sent him my calculations. I also asked him to send me a protocol of the statistical analysis they ran and/or give me access to the raw data.

Now I'm curious how the story ends and for sure I'll keep you updated!

By the way... If I run the Bayesian t-test with my computed values in JASP, I find a higher probability for the null hypothesis!

**Fig 4.** Result of the Bayesian independent samples t-test with a Cauchy prior (scale 0.707) and my computed value of t = 1.15.
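The Bayes factor can also be roughly cross-checked outside JASP. The sketch below computes a JZS Bayes factor from the t-value and the group sizes by numerical integration, following the formulation in Rouder et al. (2009). The helper `jzs_bf10` is a hypothetical name of my own, and it is an assumption that this matches what JASP computes for a two-sided test with a Cauchy prior of scale 0.707, so small numerical differences to Fig 4 are possible.

```r
# JZS Bayes factor (BF10) for a two-sample t-test from summary statistics.
# Hypothetical helper, not part of JASP; ASSUMES a two-sided test with a
# Cauchy(0, r) prior on the standardized effect size delta.
jzs_bf10 <- function(t, n1, n2, r = 0.707) {
  nu    <- n1 + n2 - 2            # degrees of freedom
  n_eff <- n1 * n2 / (n1 + n2)    # effective sample size

  # Marginal likelihood under H1: integrate over g, where
  # delta | g ~ Normal(0, g) and g ~ Inverse-Gamma(1/2, r^2 / 2),
  # which implies delta ~ Cauchy(0, r). Evaluated on the log scale.
  integrand <- function(g) {
    exp(-0.5 * log1p(n_eff * g) -
          (nu + 1) / 2 * log1p(t^2 / ((1 + n_eff * g) * nu)) +
          log(r) - 0.5 * log(2 * pi) - 1.5 * log(g) - r^2 / (2 * g))
  }
  marg_h1 <- integrate(integrand, lower = 0, upper = Inf)$value

  # Likelihood under H0 (delta = 0), up to the same constant
  marg_h0 <- (1 + t^2 / nu)^(-(nu + 1) / 2)

  marg_h1 / marg_h0
}

bf10 <- jzs_bf10(t = 1.15, n1 = 16, n2 = 15)
bf01 <- 1 / bf10   # evidence for the null relative to the alternative
c(BF10 = bf10, BF01 = bf01)
```

A BF01 above 1 means the data are more likely under the null than under the alternative, which is the sense in which the null comes out slightly ahead here.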

**UPDATE - 2019 Sep. 17th:** He faded away once it was clear that I wanted to see the data... I got him as a new follower on ResearchGate but didn't receive any answer!