Autocorrelation

Abbreviation: ACF

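# https://github.com/Arturus/kaggle-web-traffic/blob/master/make_features.py#L88
import numpy as np


def single_autocorr(series, lag):
    """
    Autocorrelation for single data series
    :param series: traffic series (NumPy array)
    :param lag: lag, days (must be >= 1, since series[:-0] is an empty slice)
    :return: correlation between the series and its lagged copy, 0 if undefined
    """
    s1 = series[lag:]    # series shifted forward by `lag`
    s2 = series[:-lag]   # series truncated to the same length
    ms1 = np.mean(s1)
    ms2 = np.mean(s2)
    ds1 = s1 - ms1       # deviations from each part's own mean
    ds2 = s2 - ms2
    divider = np.sqrt(np.sum(ds1 * ds1)) * np.sqrt(np.sum(ds2 * ds2))
    # A constant part has zero variance, so the correlation is reported as 0
    return np.sum(ds1 * ds2) / divider if divider != 0 else 0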

$$
R_{k}=\frac{\sum_{i=1}^{n-k}\left(X_{i}-\bar{X}\right)\left(X_{i+k}-\bar{X}\right)}{\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}}
$$
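The formula uses one global mean and normalizes by the variance of the whole series, while `single_autocorr` computes a Pearson correlation in which each of the two parts has its own mean. The two usually agree closely for long series but are not identical. A minimal sketch of the formula as written, assuming NumPy; the name `acf_formula` is hypothetical and not part of the referenced repository:

```python
import numpy as np


def acf_formula(x, k):
    """R_k with a single global mean (hypothetical helper, for illustration)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < n")
    d = x - x.mean()                 # deviations from the global mean
    denom = np.sum(d * d)            # sum over i = 1..n of (X_i - mean)^2
    if denom == 0:
        return 0.0                   # constant series: same convention as single_autocorr
    # Numerator pairs X_i with X_{i+k} for i = 1..n-k
    return float(np.sum(d[: n - k] * d[k:]) / denom)
```

With this definition, lag 0 always gives 1 for a non-constant series, and the numerator has fewer terms as `k` grows, so values shrink toward 0 at large lags.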

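  • Values range from -1 to 1. Values near 1 mean the series closely matches its lagged copy, values near -1 mean it moves opposite to it, and values near 0 mean little linear relationship.

  • For example, if a series of length L is periodic with period t, the subsequence 0:L-1-t and the subsequence t:L-1 line up cycle for cycle, so their correlation is the highest: the autocorrelation peaks at lag t (and at multiples of t).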
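The periodicity point can be checked on synthetic data. The sketch below is illustrative only: the weekly pattern, noise level, and the helper name `lagged_corr` are assumptions, and the helper repeats the same computation as `single_autocorr` so the block runs on its own.

```python
import numpy as np


def lagged_corr(series, lag):
    # Correlation between series[lag:] and series[:-lag]; assumes lag >= 1
    s1, s2 = series[lag:], series[:-lag]
    d1, d2 = s1 - s1.mean(), s2 - s2.mean()
    divider = np.sqrt(np.sum(d1 * d1)) * np.sqrt(np.sum(d2 * d2))
    return float(np.sum(d1 * d2) / divider) if divider != 0 else 0.0


# Synthetic series (assumed data): a 7-step pattern repeated 20 times plus mild noise
rng = np.random.default_rng(0)
period = 7
pattern = np.array([5.0, 3.0, 4.0, 8.0, 9.0, 2.0, 1.0])
series = np.tile(pattern, 20) + rng.normal(0.0, 0.1, size=period * 20)

# Score every candidate lag and pick the one with the highest correlation
scores = {lag: lagged_corr(series, lag) for lag in range(1, 11)}
best_lag = max(scores, key=scores.get)
print(best_lag, round(scores[best_lag], 3))
```

Scanning a range of lags and taking the maximum is one simple way to estimate an unknown period. For real traffic data the candidate lags are often fixed in advance instead, for example a week, a quarter, or a year.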


Author

Ryen Xiang

Published

2024-10-05

Updated

2025-03-11
