I'm trying to convert the following R code to Python:
stdev.cust <- function(x) {
  sd(x, na.rm = TRUE)
}

.stdev.min.obs <- function(x, min.n) {
  n.s <- sum(!is.na(x))
  ifelse(n.s >= min.n, stdev.cust(x), NA_real_)
}

df[, col2 := rollapply(col1,
                       width = 35,
                       .stdev.min.obs,
                       min.n = 10,
                       fill = NA_real_,
                       partial = TRUE,
                       align = 'right'),
   keyby = 'date']
This is what I came up with:
import numpy as np

def stdev_min_obs(x, min_n):
    # count the non-missing observations in the window
    ns = np.sum(~np.isnan(x))
    return np.std(x) if ns >= min_n else np.nan

df['col2'] = df.groupby('date')['col1'].apply(
    lambda x: x.rolling(window=35, min_periods=10).apply(lambda y: stdev_min_obs(y, 10))
)
But when I ran this on the real data I'm working with, the results didn't match the R output. The Python version is also a lot slower, presumably because apply is very slow?
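
One likely source of the mismatch is that `np.std` defaults to `ddof=0` (population standard deviation), while R's `sd` divides by n - 1. Missing values may also be handled differently from `na.rm = TRUE`, depending on how the window is passed to the function. A minimal sketch of an alternative, assuming pandas' built-in rolling `std` (which uses `ddof=1` by default and counts only non-missing observations toward `min_periods`) is an acceptable stand-in for the custom function. The helper name `rolling_sd` and the sample data are hypothetical, for illustration only:

```python
import numpy as np
import pandas as pd

def rolling_sd(df, window=35, min_n=10):
    # Hypothetical helper: right-aligned rolling sample standard deviation
    # within each 'date' group. Rolling.std() uses ddof=1, ignores NaNs, and
    # returns NaN when a window has fewer than min_n non-missing values.
    return df.groupby("date")["col1"].transform(
        lambda s: s.rolling(window=window, min_periods=min_n).std()
    )

# Made-up sample data, only to exercise the helper
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "date": np.repeat(["2020-01-01", "2020-01-02"], 50),
    "col1": rng.normal(size=100),
})
df.loc[[3, 17, 60], "col1"] = np.nan

df["col2"] = rolling_sd(df)
```

This still calls a Python lambda once per group, but the rolling computation inside each group runs in pandas' compiled code rather than calling a Python function per window, which is usually where the time goes with `rolling(...).apply(...)`.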