[Feature] Implement DMA support#293
Conversation
…RecvIfcRTL. Replace `mem` with `dram` for clarity.
|
Hi @tancheng @BenkangPeng , I summarized two direction of DMA design as below:
I prefer the second method but I think there are still some logic should be written in DataMemControllerRTL. WDTY? |
|
Hi @HobbitQia, option 2 looks good to me. Though I am not sure what logic should be additionally in |
I am thinking if we want enable the concurrent running of DMA and traditional load/store, then we need to multiplex the port of Data SPM and I think this logic can be implemented in |
I thought they are the same latency if we can distinguish the Adding logic inside the |
If the DMA data should go through the controller packet path, there may be extra latency of packeting, and there may be competitions between NoC/CPU/tile request to SPM? Or we have two separate paths in Controller? |
Packing can just be combinational logic before putting into a queue.
I was thinking about separate paths, so mentioned |
Got it. I mean the conflict between different paths. Even we have |
Oh, we don't need to distinguish VectorCGRA/controller/ControllerRTL.py Line 207 in eb71842 |
… error of pytml verilog backend.
…interface for enhanced data transfer capabilities.
… then drives types from them
…te requests and adjust related signal handling for clarity and consistency.
…TL by passing DmaDataType and DmaCmdType as parameters, and updating related type definitions for improved clarity and consistency.
…rite requests, enhancing type definitions for DmaCmdType and DmaDataType
|
|
||
| return mk_bitstruct(new_name, { | ||
| 'dram_data': DramDataType, | ||
| 'dram_mask': DramMaskType, |
There was a problem hiding this comment.
explain what is dram_mask with comment?
| 'dram_data': DramDataType, | ||
| 'dram_mask': DramMaskType, | ||
| 'spm_data': SpmDataType, | ||
| 'spm_mask': SpmMaskType, |
There was a problem hiding this comment.
explain what is spm_mask with comment?
We should also include the figure 2 into our |
@BenkangPeng, as you introduce |
| # # TODO: Handle other cmd types. | ||
| # assert(False) | ||
|
|
||
| if has_dma_ports & s.dma_done.val: |
There was a problem hiding this comment.
I am thinking the possible conflict between CMD_DMA_CONE and CMD_COMPLETE or other commands that will be sent back to CPUs. Maybe we need some logic to judge these conflicts if at the same cycle the CMD_DMA_DONE and CMD_COMPLETE is valid?
| s.opcode_ff <<= s.dma_cmd.msg.opcode | ||
| s.dram_addr_ff <<= s.dma_cmd.msg.dram_addr | ||
| s.spm_addr_ff <<= s.dma_cmd.msg.spm_addr | ||
| s.words_left_ff <<= s.dma_cmd.msg.nbytes >> 2 # Converts the transfer size from bytes to words. |
There was a problem hiding this comment.
There may be a truncation? If we want transfer 3 bytes, then words_left_ff = 0 so no data will be transferred.
For now I think we don't need to consider such fine-grained data movement since usually DMA should handle the large-scale data. I think we can add some assertions or checking to ensure nbytes%4 == 0 ?
There was a problem hiding this comment.
There may be a truncation? If we want transfer 3 bytes, then
words_left_ff = 0so no data will be transferred.For now I think we don't need to consider such fine-grained data movement since usually DMA should handle the large-scale data. I think we can add some assertions or checking to ensure
nbytes%4 == 0?
Yes, I ignore it. I will add an assertion to ensure it.
There was a problem hiding this comment.
Now the data transfer granularity between DRAM and SPM is 1 word (4 bytes), with a mask of 1 bit per word.
| bank_index_store_from_dma = trunc((recv_waddr_from_dma - s.address_lower) >> per_bank_addr_nbits, XbarOutWrType) | ||
| else: | ||
| bank_index_store_from_dma = XbarOutWrType(num_banks_per_cgra) | ||
| s.wr_pkt[dma_wr_idx] @= MemWritePktType(dma_wr_idx, # src |
There was a problem hiding this comment.
Is this mask actually used?
|
I summarized the TODOs and something that needs to be clarified here as below. If there is something missing, please feel free to add it. @tancheng @BenkangPeng This PR should address the following items:
|
…is an integer multiple of 4
…SendRTL to replace them.
…dyRecv/SendIfcRTL for improved clarity and consistency in DMA signal handling.
… with ValRdyRecv/SendIfcRTL
…rite request type and update corresponding tests for consistency
…arity and consistency by renaming signals related to memory requests and responses.
| kAttrOpcode = 'opcode' | ||
| kAttrDramAddr = 'dram_addr' | ||
| kAttrNBytes = 'nbytes' | ||
| kAttrTag = 'tag' |
There was a problem hiding this comment.
tag -> dram_tag, and comment on what this tag is used for?
There was a problem hiding this comment.
tag -> dma_tag, right?
This tag isn't used now. Maybe we will use it to distinguish different DMA commands? @HobbitQia
There was a problem hiding this comment.
right, dma_tag sounds good. You then can leave comment like:
This tag isn't used now. We may use it to distinguish different DMA commands.
| kAttrNBytes = 'nbytes' | ||
| kAttrTag = 'tag' | ||
| kAttrSpmAddr = 'spm_addr' | ||
| kAttrSpmData = 'spm_data' |
There was a problem hiding this comment.
We already have kAttrDataAddr, but we are now distinguishing it from DRAM data. Can you file an issue, and do the cleanup (consolidate these attributes) in later PR?
There was a problem hiding this comment.
Oh, do you mean that we should rename kAttrDataAddr/kAttrData into kAttrSpmAddr/kAttrSpmData, since kAttrDataAddr/kAttrData of current codebase actually indicate the address/data of SPM? Is my understanding correct?
There was a problem hiding this comment.
Right, I feel some kAttr need to be refactored, but we don't need to do that in this PR.
| s.recv_from_ctrl_spm_wr_req = RecvIfcRTL(DmaSpmWriteReqType) | ||
| s.recv_from_ctrl_spm_rd_req = RecvIfcRTL(DmaSpmReadReqType) | ||
| s.send_to_ctrl_spm_rd_resp = SendIfcRTL(DmaSpmReadRespType) |
There was a problem hiding this comment.
Why do we need to access ctrl_spm in DataMemControllerRTL?
There was a problem hiding this comment.
recv_from_ctrl_spm_wr_req means that receiving the request of writing into spm from Controller.
We connect the Controller with the DataMemController, and the wr_req, rd_req, rd_resp of SPM are transferred by these 3 ports.
I will update the current design diagram ASAP.
There was a problem hiding this comment.
then plz rename it to recv_from_controller_spm_wr_req.
The current name sounds like:
VectorCGRA/controller/ControllerRTL.py
Line 344 in eb71842
BTW, how to send to ctrl_ring then? We not yet distinguish ctrl from data?




Related issue: coredac/CGRA-SoC#2
This PR introduces
CgraDmaRTLwhich integrates the CGRA with a DMA engine, enabling direct memory transfers between external DRAM(don't implement now) and the CGRA's dataSPM.