BEV Queries

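The BEV queries are a grid of learnable parameters Q of shape H × W × C, where H and W are the spatial shape of the BEV plane and C is the feature (channel) dimension.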

Each grid cell in the BEV plane corresponds to a real-world size of s meters.

The center of BEV features corresponds to the position of the ego car by default.
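The grid geometry described above can be sketched as a small index-to-meters conversion. This is a minimal illustration, not a reference implementation: the function names, the choice that columns run along x and rows along y, and the example values H = W = 200 and s = 0.5 are all assumptions made for the sketch.

```python
def bev_cell_to_ego_xy(row, col, H, W, s):
    """Offset (x, y) in meters of a BEV grid index from the ego car.

    Assumptions (not stated in the text): the ego car sits at the grid
    centre (H/2, W/2), columns index the x axis, rows index the y axis,
    and each cell spans s meters along each axis.
    """
    x = (col - W / 2) * s
    y = (row - H / 2) * s
    return x, y


def bev_extent_meters(H, W, s):
    """Total real-world size (height, width) in meters covered by the grid."""
    return H * s, W * s


if __name__ == "__main__":
    H, W, s = 200, 200, 0.5  # hypothetical grid shape and cell size
    print(bev_extent_meters(H, W, s))              # (100.0, 100.0)
    print(bev_cell_to_ego_xy(100, 100, H, W, s))   # (0.0, 0.0): the ego position
    print(bev_cell_to_ego_xy(0, 0, H, W, s))       # (-50.0, -50.0): one grid corner
```

With these example values the grid covers 100 m by 100 m, and the ego car maps to the origin because it sits at the grid centre.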

Spatial Cross-Attention

Temporal Self-Attention