Vector Memory Access

Submodule List

Submodule       Description
VLSplit         Vector Load uop splitting module
VSSplit         Vector Store uop splitting module
VLMergeBuffer   Vector Load flow merge module
VSMergeBuffer   Vector Store flow merge module
VSegmentUnit    Vector Segment execution module
VfofBuffer      Collects and writes back the uop that updates the VL register for vector fault-only-first instructions

Functional Description

  • Supports all memory access instructions in RVV 1.0
  • Supports out-of-order scheduling for Vector Load/Store instructions
  • Supports out-of-order execution of Uops split from Vector Load/Store instructions
  • Supports vector out-of-order violation checking and recovery
  • Supports unaligned vector memory access
  • Does not support vector memory access to non-memory address space

Parameter Configuration

Parameter        Configuration
VLEN             128 (bits)
VLMergeBuffer    16 (entries)
VSMergeBuffer    16 (entries)
VSegmentBuffer   8 (entries)
VFOFBuffer       1 (entry)

Functional Overview

The Dispatch stage allocates Load Queue or Store Queue indices for vector memory access instructions before they enter the VLSIssueQueue. After the backend splits a vector memory access instruction into uops, each uop is first decoded in the corresponding split module (VLSplit or VSSplit), which calculates its mask and address offset and requests a MergeBuffer entry. The new vector memory access architecture reuses the scalar LoadUnit and StoreUnit, as well as the Load Queue and Store Queue.
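The mask and address-offset calculation can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the RTL: it assumes a unit-stride access, one uop per 128-bit vector register, and a byte-granular mask. The names `Uop` and `split_unit_stride` are hypothetical and do not appear in the design.

```python
from dataclasses import dataclass

VLEN_BITS = 128                 # VLEN from the parameter table
VLEN_BYTES = VLEN_BITS // 8


@dataclass
class Uop:
    uop_idx: int    # position of this uop within the instruction
    offset: int     # byte offset from the base address
    addr: int       # base + offset
    byte_mask: int  # bit i set => byte i of this register is accessed


def split_unit_stride(base, vl, sew_bytes, v0_mask=None):
    """Hypothetical model: derive per-uop offset and byte mask.

    base      -- base address of the access
    vl        -- number of active elements
    sew_bytes -- element width in bytes (1, 2, 4 or 8)
    v0_mask   -- optional element mask (bit i enables element i);
                 None models an unmasked instruction
    """
    elems_per_reg = VLEN_BYTES // sew_bytes
    # Ceiling division; keep at least one uop even when vl == 0.
    n_uops = max(1, -(-vl // elems_per_reg))
    elem_bytes = (1 << sew_bytes) - 1  # sew_bytes consecutive mask bits

    uops = []
    for u in range(n_uops):
        byte_mask = 0
        for e in range(elems_per_reg):
            elem = u * elems_per_reg + e
            enabled = v0_mask is None or (v0_mask >> elem) & 1
            if elem < vl and enabled:
                byte_mask |= elem_bytes << (e * sew_bytes)
        offset = u * VLEN_BYTES
        uops.append(Uop(u, offset, base + offset, byte_mask))
    return uops


if __name__ == "__main__":
    for uop in split_unit_stride(0x1000, vl=6, sew_bytes=4):
        print(uop.uop_idx, hex(uop.addr), hex(uop.byte_mask))
```

In this model a 6-element, 32-bit access yields two uops: the first covers all 16 bytes of its register, the second only the low 8 bytes. Strided, indexed and segment accesses would need a different offset calculation and are not modelled here.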

Vector Load and Store share two Issue Queues. For vector Load, the two Issue Queues connect to two VLSplits, which correspond to LoadUnit0 and LoadUnit1 respectively. For vector Store, the two Issue Queues connect to two VSSplits, which correspond to StoreUnit0 and StoreUnit1 respectively. When a vector Load requires replay via the Replay Queue, it may be resent to a different load unit. After a vector memory access completes execution in the pipeline, its results are aggregated by the merge buffer and written back.
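The aggregation step can be pictured with a small behavioural model of a merge buffer. This is a sketch under assumptions rather than a description of the actual VLMergeBuffer: it assumes each entry tracks a count of outstanding flows and a 128-bit data register, and that an entry is written back and freed once the count reaches zero. The class and method names are hypothetical.

```python
class MergeBuffer:
    """Hypothetical model of a merge buffer with a fixed entry count."""

    def __init__(self, num_entries=16, vlen_bytes=16):
        self.vlen_bytes = vlen_bytes
        self.entries = [None] * num_entries  # None marks a free entry

    def allocate(self, expected_flows):
        """Reserve an entry; return its index, or None if the buffer is full."""
        for idx, entry in enumerate(self.entries):
            if entry is None:
                self.entries[idx] = {
                    "pending": expected_flows,
                    "data": bytearray(self.vlen_bytes),
                }
                return idx
        return None  # caller has to stall and retry

    def merge(self, idx, byte_offset, payload):
        """Merge one returning flow; flows may arrive in any order.

        Returns the full register value once every flow has arrived,
        otherwise None.
        """
        entry = self.entries[idx]
        entry["data"][byte_offset:byte_offset + len(payload)] = payload
        entry["pending"] -= 1
        if entry["pending"] == 0:
            self.entries[idx] = None  # free the entry on write-back
            return bytes(entry["data"])
        return None


if __name__ == "__main__":
    mb = MergeBuffer()
    idx = mb.allocate(expected_flows=2)
    # The upper half returns first, for example after a replay on another unit.
    print(mb.merge(idx, 8, b"\x02" * 8))  # still waiting for the lower half
    print(mb.merge(idx, 0, b"\x01" * 8))  # complete register value
```

Counting outstanding flows per entry means the buffer does not care which load unit a flow came back from, which fits the note above that a replayed vector Load may be resent to a different load unit.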

Overall Block Diagram

Overall block diagram pending update