标签: didi - 算法花园

2023-03-122023-03-12 智能路 12 分钟读完 (大约1819个字)

【滴滴 HierETA】Interpreting Trajectories from Multiple Views A Hierarchical Self-Attention Network for Estimating the Time of Arrival

滴滴和华南理工在 2022 年 KDD 上发表的 ETA 论文，从多个视角解释轨迹，引入 Hierarchical Self-Attention Network 方法进行建模，最终在滴滴内部数据集上获得指标提升。

2022-08-142024-10-05 随手记 20 分钟读完 (大约2958个字)

@Interpreting Trajectories from Multiple Views: A Hierarchical Self-Attention Network for Estimating the Time of Arrival

[[Attachments]]

Interpreting Trajectories from Multiple Views-2022.pdf

关键信息

关键字 [[ETA]] [[self-attention]] [[didi]]

相关工作

traffic flow prediction [[Traffic Flow Forecasting]]
- GMAN: A graph multi-attention network for traffic prediction 基于图的多注意力机制来预测交通状况 GMAN [ 50] employs a graph multi-attention structure to extract the spatial and temporal relationships
  ls-type:: annotation
  hl-page:: 2
  hl-color:: green
  - 图学习通常会受到不相关的空间邻域区域的负面影响，尤其当区域变大，这种影响会导致误差传播 graph representation learning generally suffers from the negative impact from irrelevant spatial neighboring regions, resulting in error propagation especially when the involved area grows larger
    ls-type:: annotation
    hl-page:: 2
    hl-color:: green
  - 图建模被限制在狭窄的邻近区域，在开发大规模城市系统中存在不足 graph modeling is limited to process only narrow neighboring regions and falls short on developing large-scale urban-wise systems
    ls-type:: annotation
    hl-page:: 2
    hl-color:: yellow
    [[ConSTGAT]]
travel time estimation
trajectory recovery and inference
DeepTTE
ls-type:: annotation
hl-page:: 3
hl-color:: yellow
raw GPS sequences geo-convolutional network LSTM
[[WDR]] wide-deep-recurrent network
- CoDriverETA 2020 滴滴
ConSTGAT 和 CompactETA 图建模
DeepGTT
ls-type:: annotation
hl-page:: 3
hl-color:: yellow
deep generative model for learning the distribution of travel time
[[HetETA]] learns the representation of spatio-temporal information using a multi-relational network;
ls-type:: annotation
hl-page:: 3
hl-color:: blue
[[TTPNet]] 张量分解和图embedding从历史轨迹中学习速度和表示 extracts the travel speed and representation of road network from historical trajectories based on tensor decomposition and graph embedding.
ls-type:: annotation
hl-page:: 3
hl-color:: blue

核心贡献

利用三个视图之间的层次关系对道路底层结构进行建模 HierETA exploits the hierarchical relationship among the three views to portray the underlying road structure
ls-type:: annotation
hl-page:: 2
hl-color:: yellow
分层自自注意力网络根据 segment, link, intersection 之间自然关系进行高效组织 proposed hierarchical self-attention network organizes the segment-, link-, and intersection-views efficiently according to their natural relationships.
ls-type:: annotation
hl-page:: 2
hl-color:: yellow
自适应自注意力网络合并，以在多视图表示框架中共同利用全局和局部模式进行时空依赖建模。 adaptive self-attention network to jointly leverage the global and local patterns for spatio-temporal dependency modeling within the multi-view representation framework.
ls-type:: annotation
hl-page:: 2
hl-color:: yellow
- 利用多视图序列明确捕捉轨迹的时空依赖关系 we design an adaptive self-attention network to explicitly capture the spatio-temporal dependencies of the trajectory using multi-view sequences
  ls-type:: annotation
  hl-page:: 3
  hl-color:: yellow
hierarchy-aware attention decoder
ls-type:: annotation
hl-page:: 2
hl-color:: yellow
利用从不同粒度的信息上学习到上下文特征预估最终 ETA

核心问题

传统 ETA 方法采用分治策略，将一个轨迹拆分成多个小段，然后累加每个小段的预测结果得到整体 ETA traditional ETA algorithms mainly employ the divide-and-conquer strategy by representing a trajectory as a segment sequence and then summing up the local predictions
ls-type:: annotation
hl-page:: 1
hl-color:: blue
- segment-view 将轨迹拆分成多个小段，然后通过小段计算同行时间 most of them decompose a trajectory into several segments and then compute the travel time by integrating the attributes from all segments
  ls-type:: annotation
  hl-page:: 1
  hl-color:: blue
  - 累积误差
多视图下建模困难
- 常规方法用 segment 建模，不考虑 link On the one hand
  ls-type:: annotation
  hl-page:: 2
  hl-color:: yellow
  - 不使用 link 建模，现有的研究很困难对同一个 link 中的多个段之间的一致性建模 However, without explicitly modeling the link-view characteristics, existing studies can hardly model the coherent consistency across segments within the same links.
    ls-type:: annotation
    hl-page:: 2
    hl-color:: red
- segment 和 intersection 的属性是不一致的，很难用同一个网络去建模，大部分选择忽视路口或简化建模 On the other hand
  ls-type:: annotation
  hl-page:: 2
  hl-color:: yellow
- ETA 会受到路口等待影响
什么是 trajectory
- 三个角度 link、Intersection、Segment
  - segement
    - segment 是人工生成的，用来捕捉细粒度的局部交通情况，在表征道路网络结构方面并不完全 segment-view representation is artificially produced to capture the fined-grained local traffic conditions, which is however not comprehensive in characterizing the natural structure of the road network
      ls-type:: annotation
      hl-page:: 1
      hl-color:: green
  - link
    - 提供静态道路属性，pavement type，道路宽度、道路等级 preserve static road characteristics, such as pavement type, road width and road functional level
      ls-type:: annotation
      hl-page:: 1
      hl-color:: green
  - intersection
    - 等待时间、交通灯数量、历史车流量 valued information such as the waiting time, the number of traffic lights, and the historical traffic volume
      ls-type:: annotation
      hl-page:: 1
      hl-color:: green
- link 和 intersection 粗粒度表示轨迹属性，link 可以进一步拆分成多个小段，segment 可以细粒度对空间依赖性进行建模 the link- and intersection-views characterize the trajectory attributes from a coarse perspective; a link can be further decomposed into several segments, and hence the segment-view representation models the spatial dependencies at a fine granularity
  ls-type:: annotation
  hl-page:: 2
  hl-color:: yellow
  - 猜测基于作者的假设，两个Intersection 之间的整条路被称之为 Link
- [:span]
  ls-type:: annotation
  hl-page:: 2
  hl-color:: green

HierETA Hierarchical Self-Attention Network for Estimating the Time of Arrival

[:span]
ls-type:: annotation
hl-page:: 4
hl-color:: yellow

tags:: [[Model Architecture]] [[ETA]]

Attribute Feature Extractor
ls-type:: annotation
hl-page:: 3
hl-color:: yellow
- 连续特征 z-score
- 类别特征 embedding
- 全局特征共享
Hierarchical Self-Attention Network for Multi-View Trajectory Representation
ls-type:: annotation
hl-page:: 4
hl-color:: yellow
- segment encoder
  - 对同一个 link 内的时空依赖进行建模 capture the spatiotemporal dependencies of segments in the same link
    ls-type:: annotation
    hl-page:: 4
    hl-color:: yellow
    - a segment encoder is developed to capture the spatio-temporal dependencies at a fine granularity
      ls-type:: annotation
      hl-page:: 1
      hl-color:: yellow
  - 利用 BiLSTM 处理 $[x^s_j|x_r]$ ，正向和反向结果 concat 成 segment 的表示 $H^s_j$
  - 同一个 link 内 segement 记作 $H^s=\left[H_1^s, \ldots, H_n^s\right] \in \mathbb{R}^{n \times d_s}$
    - 计算出 j-th segment 和 link 内其他 segment 的全局相似度 $G P_j=\frac{Q_j K^T}{\sqrt{d}_s}$
    - a local semantic pattern
      ls-type:: annotation
      hl-page:: 4
      hl-color:: yellow
      局部相似度
      - $L P_j(k)= \begin{cases}G P_j(k), & |j-k| \leq \omega \\ -\infty, & \text { otherwise }\end{cases}$
      - 取 j 相邻 $\omega$ 个 segment 计算相似度
      - 捕获局部 segment 的依赖，加强局部的拥堵转移
  - 用门控机制平衡全局和局部attention结果 a gating mechanism
    ls-type:: annotation
    hl-page:: 4
    hl-color:: yellow
    - $F_j^s=\left(1-z_j\right) \odot \operatorname{Att}\left(G P_j\right)+z_j \odot \operatorname{Att}\left(L P_j\right)$
      - 控制参数怎么学 $z_j=\sigma\left(W_h H_j^s+W_g A t t\left(G P_j\right)+W_l A t t\left(L P_j\right)+b_z\right)$
    - ResNet + LN
  - 所有 link 的 encoder 参数共享以及并行计算
    - n adaptive self-attention module is designed to boost performance
      ls-type:: annotation
      hl-page:: 1
      hl-color:: yellow
- Joint Link-Intersection Encoder.
  ls-type:: annotation
  hl-page:: 4
  hl-color:: yellow
  - 道路属性
  - o characterize the natural trajectory structure consisting of alternatively arranged links and intersections
    ls-type:: annotation
    hl-page:: 1
    hl-color:: yellow
  - 为什么要设计这个模块？
    - segment-view 无法对同一个 link 内 segment 共享的一致性进行建模
      - t fails to model the consistency shared within the same link
        ls-type:: annotation
        hl-page:: 4
        hl-color:: red
    - 粗粒度表示 coarse-scale representation
      ls-type:: annotation
      hl-page:: 4
      hl-color:: red
    - link 和 intersections 交替出现 as links and intersections appear alternatively
      ls-type:: annotation
      hl-page:: 4
      hl-color:: yellow
  - link 表示： $x_i^l=\sum_{j=1}^n \gamma_{i j} h_{i j}^s$
    - link 内的 segment 表示是 $\left\{h_{i j}^s\right\}_{j=1}^n$
    - 加权融合 segment 得到 link 表示，权重计算方法 [[Attention]] $\gamma_{i j}=\operatorname{softmax}_j\left(W_\gamma h_{i j}^s+b_\gamma\right)$
  - 得到 link 和 intersections 的表示后，分别用编码 employ two BiLSTMs to respectively encode the links and intersections
    ls-type:: annotation
    hl-page:: 5
    hl-color:: yellow
    得到 ${H^l_i}$ 和 ${H^c_i}$ ，concat 在一起得到 $\hat{H}_i^l=\left[H_i^l \mid H_i^c\right]$
  - 上一步得到向量经过 the joint link-intersection encoder also includes a self-attention layer, a residual connection and a layer normalization
    ls-type:: annotation
    hl-page:: 5
    hl-color:: yellow
    得到 $\left\{h_i^l\right\}_{i=1}^m$
  - 去除这个 encoder 中的 local pattern
    ls-type:: annotation
    hl-page:: 5
    hl-color:: red
    ，因为相邻 link 之间的交通影响更加弱和稀疏，避免过拟合
  - 总结
    - segement-view 捕捉局部交通信息 segment-view context feature that captures the local traffic conditions
      ls-type:: annotation
      hl-page:: 5
      hl-color:: yellow
    - link-intersection context 表达道路属性 joint link-intersection context feature that preserves the common road attributes
      ls-type:: annotation
      hl-page:: 5
      hl-color:: yellow
Hierarchy-Aware Attention Decoder
ls-type:: annotation
hl-page:: 5
hl-color:: yellow
层次感知注意力解码器
- realize a tradeoff between the multi-view spatio-temporal features
  ls-type:: annotation
  hl-page:: 1
  hl-color:: yellow
- sub-route 对于最后的 eta 贡献是不一样的（拥堵路口和道路应该给予更多关注）
  - travel time estimation is closely related to the critical components
    ls-type:: annotation
    hl-page:: 5
    hl-color:: green
- ETA $\mathcal{R}=(1-\lambda) \sum_{i=1}^m \sum_{j=1}^n \alpha_{i j} h_{i j}^s+\lambda \sum_{i=1}^m \beta_i h_i^l$
  - segment 的表示以及 link-intersection 的表示
  - alpha 和 beta 都是注意力权重
  - 设计注意力引导机制，利用 link-view 之间的关系调整 segment-view 之间的 attention attention guidance that adopts the link-view consistency to further adjust the segment-view attention
    ls-type:: annotation
    hl-page:: 5
    hl-color:: red
    - 先计算 link 的注意力 $\beta_i=\underset{i}{\operatorname{softmax}}\left(f^l\left(h_i^l, x^r\right)\right)$
      - $f^l\left(h_i^l, x^r\right)=v^T \tanh \left(w_1 h_i^l+w_2 x^r+b\right)$
      - xr 是外部影响因素
    - 根据 link 注意力计算 segment 之间注意力 $\alpha_{i j}=\underset{(i, j)}{\operatorname{softmax}}\left(\beta_i f^s\left(h_{i j}^s, x^r\right)\right)$
      - 考虑 segment 之间的重要性，如果不考虑 link 之间的重要性， separately processing each segment without considering the link-view correlation is problematic as it lacks the feedback from the link-view consistency.
        ls-type:: annotation
        hl-page:: 5
        hl-color:: red
  - 改方法可以自适应选择不同表示粒度中最相关的特征 we can adaptively select the most relevant features from different representation granularities.
    ls-type:: annotation
    hl-page:: 5
    hl-color:: yellow
    - 可以实现是几个 link 权重大，还是几个 segment 权重大
- $\mathcal{L}(\Theta)=\frac{1}{N} \sum_{k=1}^N\left|Y_k-\hat{Y}_k\right|$

EXPERIMENTS
ls-type:: annotation
hl-page:: 5
hl-color:: green

20 天训练，1 天评估，预测 7 天
数据分布 probability density functions (PDFs) and cumulative distribution functions (CDFs)
ls-type:: annotation
hl-page:: 5
hl-color:: green
- [:span]
  ls-type:: annotation
  hl-page:: 6
  hl-color:: yellow
We repeat each experiment for five times except the statistics-based approach Route-ETA and report the mean and the standard deviation of different runs.
ls-type:: annotation
hl-page:: 6
hl-color:: green
训练 5 次取平均
指标 mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and satisfaction rate (SR), similar to existing approaches [ 23 ]. Specifically, SR refers to the fraction of trips with error rates less than 10% and a higher SR indicates better performance and customer satisfaction
ls-type:: annotation
hl-page:: 6
hl-color:: green
实验结果
- [:span]
  ls-type:: annotation
  hl-page:: 7
  hl-color:: yellow
- [[ConSTGAT]] ConstGAT considers the graph structures of the road network to exploit the joint relations of spatio-temporal information.
  ls-type:: annotation
  hl-page:: 7
  hl-color:: blue
- HierETA 更具有可解释性，对潜在道路网络结构进行建模
- 误差分析：所有距离分桶中指标都提升了，长单提升更加明显。
  - 层次化建模效果好 That is, interpreting the trajectory from multiple views effectively portrays the hierarchical structure of road network and eases the error propagation for estimating the travel time.
    ls-type:: annotation
    hl-page:: 7
    hl-color:: green
  - [:span]
    ls-type:: annotation
    hl-page:: 7
    hl-color:: green
模型分析
- window sizes
  ls-type:: annotation
  hl-page:: 8
  hl-color:: green
  - 局部窗口效果好
  - segment 之间距离越远，之间的关联性越弱 he correlation between adjacent segments slightly decreases while the modeling uncertainty increases.
    ls-type:: annotation
    hl-page:: 8
    hl-color:: green
  - [:span]
    ls-type:: annotation
    hl-page:: 8
    hl-color:: yellow
- segment 和 link 的权重
  - 只考虑其中一个指标差
- [[Ablation Study]]
  - 有无 local 或 global 特征
    - 建模细粒度交通信息 The local attention in encoder is removed to verify the effectiveness for modeling the semantic traffic condition.
      ls-type:: annotation
      hl-page:: 8
      hl-color:: green
    - 提取结构化交通模式 verify the necessity of extracting the structural traffic pattern.
      ls-type:: annotation
      hl-page:: 8
      hl-color:: green
  - 有无 guide
    - 无引导信息
  - 有无路况信息
  - 有无层次化结构 removing the joint link-intersection encoder
    ls-type:: annotation
    hl-page:: 8
    hl-color:: green
    ，没有这个效果显著下降
  - 从 1s 就是变化很大来说，这些网络结构都挺重要的
    - HierETA performs better than both variants that eliminating local and global attentions, which is contributed to the introduction of the global structural and local semantic patterns.
      ls-type:: annotation
      hl-page:: 8
      hl-color:: yellow
  - [:span]
    ls-type:: annotation
    hl-page:: 8
    hl-color:: green

didi, ETA, KDD/2022

2019-07-282023-03-12 智能路 15 分钟读完 (大约2280个字)

(WDR) Learning to Estimate the Travel Time

严重申明：本篇文章所有信息从论文、网络等公开渠道中获得，不会透露滴滴地图 ETA 任何实现方法。

这篇论文是滴滴时空数据组 2018 年在 KDD 上发表的关于在 ETA 领域应用深度学习的文章，里面提到模型和技巧大家都应该耳熟能详，最大亮点是工业界的创新。

简单介绍一下背景：ETA 是 Estimate Travel Time 的缩写，中文大概能翻译成到达时间估计。这个问题描述是：在某一个时刻，估计从 A 点到 B 点需要的时间。对于滴滴，关注的是司机开车把乘客从起点送到终点需要的时间。抽象出来 ETA 就是一个时间空间信息相关的回归问题。CTR 中常用的方法都可以在这里面尝试。

对于这个问题：文章首先提到一个最通用的方法 Route ETA：即在获得 A 点到 B 点路线的情况下，计算路线中每一段路的行驶时间，并且预估路口的等待时间。最终 ETA 由全部时间相加得到。这种方法实现起来很简单，也能拿到一些收益。但是仔细思考一下，没有考虑未来道路的通行状态变化情况以及路线的拓扑关系。针对这些问题，文章中提到滴滴内部也有利用 GBDT 或 FM 的方法解决 ETA 问题，不过没有仔细写实现的方法，我也不好继续分析下去。

评价指标

对于 ETA 问题来说，工业界和学术界常用的指标是 MAPE(mean absolute percentage error)， ${y_i}$ 是司机实际从 A 点到 B 点花费的时间， ${f(x_i)}$ 是 ETA 模型估计出来的时间。得到计算公式如下：

{min_f \sum_{i=1}^{N}\frac{|y_i - f(x_i)|}{y_i}}

多说一句，如果使用 GBDT 模型实现 ETA 时，这个损失函数的推导有点困难，全网也没有看见几个人推导过。

这个公式主要考虑预估时间偏差大小对用户感知体验的影响，目前我们更加关心极端 badcase 对用户的影响。

特征

特征：
- 空间特征：路线序列、道路等级、POI等
- 时间特征：月份、星期、时间片等
- 路况特征：道路的通行速度、拥堵程度
- 个性化信息：司机特征、乘客特征（有「杀熟」风险）、车辆特征
- 附近特征：天气、交通管制

模型

模型包含 3 个部分：

Wide Learning Models：Wide & Deep 这一部分使用的是 LR + 特征工程，希望模型能记忆一部分的模型。这篇论文中直接利用交叉积学习，减少手动特征工程。
Deep Neural Networks：对 sparse feature 做一次 Embedding，使用 3 层 MLP 和 ReLU 的网络。
Long-Short Term Memory：前两部分模块没用使用路线序列特征，所以这部分采用 LSTM 抽取路线特征。由于路线序列是不定长的，论文中的模型是使用最后一个隐藏状态，我们也可以把全部的隐藏状态 reduce_sum 输入到最后的模块。
Regressor：将 3 个模型的输出综合起来，作为最后的 ETA 预估。MAPE 作为损失函数，利用 BP 训练模型。

WDR

上面模型中使用的特征分类：

Dense feature：行程级别的实数特征，比如起终点球面距离、起终点 GPS 坐标等。
Sparse feature：行程级别的离散特征，比如时间片编号、星期几、天气类型等。
Sequential feature：link 级别的特征，实数特征直接输入模型，而离散特征先做 embedding 再输入模型。注意，这里不再是每个行程一个特征向量，而是行程中每条 link 都有一个特征向量。比如，link 的长度、车道数、功能等级、实时通行速度等。

评估

包括两部分：离线评估和在线评估。

离线评估中取滴滴 2017 年北京前6个月的订单数据，分成两类 pickup （平台给司机分单后，司机开车去接乘客的过程）和 trip （司机接到乘客并前往目的地的过程）。具体数据集划分如下。

离线效果

离线使用 MAPE 来评价模型。在线评估时，为了更好的与用户体验挂钩，采用多个指标来衡量 ETA 的效果。包括：

APE20: absolute percentage error 小于 20% 的订单占比。（越大越好）
Badcase率：APE 大于 50% 或者 AE 大于 180s 的订单占比，定义为对用户造成巨大影响的情况。（越小越好）
低估率：低估订单的比例。（越小越好）

离线结果如下图所示，说来汗颜 PTTE 和 TEMP 是什么算法我都不知道…… WD-MLP 指的是将 WDR 中的 R 部分换成 MLP 。最终 WDR 较 route-ETA 有巨大提升，而且 LSTM 引入的序列信息也在 pikcup 上提升了 0.75%。文章的最后还提出来，LSTM 也可以换成是 Attention，这样替换有什么优点和缺点留给大家思考。

pickup 和 trip 效果

在线实验结果如下图所示，滴滴 ETA MAPE 明显小于 com1、com2、com3 ，这三家地图公司具体是哪三家，大家也能猜到吧。

线上实验效果

ETA 服务工程架构

工程架构

从上面的图中可以看出 ETA 服务工程架构主要包括三个部分：

Data Aggregation：包括利用 Map Matching 将司机上传到平台的 GPS 对应到滴滴的 Map Info 中得到司机真实行驶过的路线信息，Order Context 指的是订单相关的信息，augmented Data 额外数据比如上文说的交通情况相关信息。
Offline Training：利用上一步得到的历史数据训练模型。这里可以值得一提的是，ETA 模型是和时间强相关的（节假日和工作日的数据分布明显不同），所以在文章中作者指出将拿出最新的一部分数据用来 fine-tune 训练出来的 WDR 模型。
Online Service：这里需要一个完整的模型服务系统，其他公司也有很多分享，所以原文没有多提。

FMA-ETA: Estimating Travel Time Entirely Based on FFN With Attention

模型架构

WDR 模型中 RNN 耗时长，探索基于 Attention 机制的模型
对特征分组（multi-factor）去做 Attention 效果比多头要好
实验结果分析这部分没有看懂

The deep modules with attention achieve better results than WDR on MAE and RMSE metrics, which means attention mechanism can help to extract features and sole the long-range dependencies in long sequence.
遗憾之处
- 新模型预测时延减少，但是没有线上实验结果。
- 暂时没有公开代码和数据集。

总结

从上面简单的介绍来看，ETA 可以使用 CTR 和 NLP 领域的很多技术，大有可为。最后，滴滴 ETA 团队持续招人中（社招、校招、日常实习等），感兴趣者快快和我联系。

说点题外话你为什么从滴滴出行离职？ - 知乎中提到一点：

8.同年大跃进，在滴滴中高层的眼里，没有BAT。滴滴单量超淘宝指日可待，GAFA才是滴滴要赶超的对象。百度系，LinkedIn系，学院派，uber帮，联想系，MBB就算了，据说连藤校都混成了一个小圈子。。一个项目A team ，B team。一个ETA，投入了多少人力自相残杀？MAPE做到0%又如何？用户体验就爆表了吗？长期留存就高枕无忧了吗？风流总被雨打风吹去，滴滴是二龙山，三虫聚首？是不是正确的事情不知道，反正跟着公司大势所趋，升D10保平安。

参考

KDD 2018：滴滴提出WDR模型显著提升ETA预测精度 | 雷锋网

LBS工业界ETA应用及滴滴WDR技术 – Semocean

didi, lstm, widedeep

评价指标

特征

模型

评估

ETA 服务工程架构

FMA-ETA: Estimating Travel Time Entirely Based on FFN With Attention

总结

参考

分类

链接

最新文章

标签