Direct memory access

#1
01-27-2021, 05:50 PM
You know how direct memory access changes everything once you get the CPU out of the data path. I saw this happen in my first real server setup and it blew my mind how much faster things moved. You probably noticed the CPU gets tied down when every byte has to pass through it first. With DMA the peripheral, or a DMA controller acting for it, requests the bus and moves data straight into memory. I tried explaining that to a colleague last week and he kept thinking the processor still watches every transfer. It does not: the controller handles the addressing on its own, and the CPU typically hears about the transfer only through a completion interrupt.
You start to see why cycle stealing feels so clever once you watch it in action. The DMA unit borrows single bus cycles between CPU accesses, so the CPU stalls only briefly. I ran some tests on an older board and the throughput was far higher than with plain programmed I/O. But you have to watch for bus contention when multiple devices fight for the same path. That is the bus arbiter's job: it sorts out priorities so one device does not starve the others. Now picture a disk controller grabbing blocks while the processor keeps crunching numbers in the background. That separation keeps everything humming without the CPU having to copy each word itself.
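If it helps to see the idea in numbers, here is a toy simulation of cycle stealing. Everything in it is an assumption made for illustration: one bus, one CPU, one DMA unit, and a made-up policy where the DMA unit uses idle bus cycles for free and otherwise steals at most one cycle at a time. Real arbiters are more involved, and the name `simulate_cycle_stealing` is mine.

```python
def simulate_cycle_stealing(words, cpu_needs_bus):
    """Toy model of a DMA unit moving `words` words on a shared bus.

    cpu_needs_bus holds one boolean per CPU step: True if that step
    needs the bus, False if the CPU works out of its registers.
    Returns (total_bus_cycles, cpu_stall_cycles).
    """
    remaining = words          # words the DMA unit still has to move
    stalls = 0                 # cycles the CPU lost to the DMA unit
    total_cycles = 0
    dma_just_stole = False     # the DMA unit may not steal twice in a row
    step = 0                   # index of the next CPU step

    while step < len(cpu_needs_bus) or remaining > 0:
        cpu_active = step < len(cpu_needs_bus)
        wants_bus = cpu_active and cpu_needs_bus[step]

        if remaining > 0 and not wants_bus:
            # Bus is idle: DMA moves a word while the CPU works internally.
            remaining -= 1
            if cpu_active:
                step += 1
            dma_just_stole = False
        elif remaining > 0 and not dma_just_stole:
            # CPU wants the bus, but DMA steals this one cycle.
            remaining -= 1
            stalls += 1
            dma_just_stole = True
        else:
            # CPU gets the bus (or simply runs its step).
            step += 1
            dma_just_stole = False
        total_cycles += 1

    return total_cycles, stalls


if __name__ == "__main__":
    pattern = [True, False] * 4   # CPU needs the bus every other step
    print(simulate_cycle_stealing(4, pattern))
```

In this model the CPU finishes only a couple of cycles late while four words get moved, which is the "barely feels it" effect. How close that is to a real board depends entirely on the bus and the workload.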
I always wondered how scatter-gather lists fit into the picture until I actually coded around one. You set up a chain of memory regions and the DMA engine walks through them without extra help. But the buffers still need careful alignment or you hit weird cache problems later. On hardware without cache-coherent DMA, the device driver also has to flush buffers before the device reads them and invalidate them after the device writes, so stale data does not sneak back in. You learn that the hard way after a few corrupted packets. Burst mode helps too: the controller grabs several words per bus grant, and that cuts arbitration overhead even more. I noticed the difference clearly when moving large video frames across the PCI bus.
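A rough sketch of what walking a scatter-gather list looks like, modeled in plain Python with a `bytearray` standing in for system memory. The names `SgEntry`, `dma_scatter` and `is_cache_aligned` are hypothetical, and the 64-byte cache line is an assumption, not something from a particular platform.

```python
from dataclasses import dataclass

CACHE_LINE = 64  # assumed cache line size in bytes


@dataclass
class SgEntry:
    addr: int    # start offset of this region in "memory"
    length: int  # size of this region in bytes


def is_cache_aligned(entry: SgEntry) -> bool:
    """True if the region starts and ends on a cache line boundary."""
    return entry.addr % CACHE_LINE == 0 and entry.length % CACHE_LINE == 0


def dma_scatter(memory: bytearray, sg_list, payload: bytes) -> int:
    """Copy payload into the regions of sg_list, in order.

    Returns the number of bytes placed. Stops early when the payload
    is used up or the list runs out of regions.
    """
    offset = 0
    for entry in sg_list:
        if offset >= len(payload):
            break
        chunk = payload[offset:offset + entry.length]
        end = entry.addr + len(chunk)
        if entry.addr < 0 or end > len(memory):
            raise ValueError("scatter-gather entry outside memory")
        memory[entry.addr:end] = chunk   # same-length slice, size unchanged
        offset += len(chunk)
    return offset


if __name__ == "__main__":
    mem = bytearray(32)
    moved = dma_scatter(mem, [SgEntry(4, 3), SgEntry(16, 5)], b"ABCDEFGH")
    print(moved, bytes(mem[4:7]), bytes(mem[16:21]))
```

A real driver would hand the device physical or bus addresses rather than offsets, and would do the cache maintenance around the transfer. The alignment helper is only there to show the kind of check you might add before queuing a region.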
You end up trusting the DMA controller more once you see how it manages its own channels and registers. I mapped out the control block for a network card and realized it could queue several transfers ahead of time. But you still have to handle completion signals or the whole chain stalls. If an error bit gets set, you may have to reset the channel yourself. Now think about how USB devices rely on this to push bulk data without waking the CPU every millisecond. The host controller does the DMA itself and the rest of the system stays responsive. I tested a similar setup on a Windows box and the latency dropped noticeably.
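Here is a minimal model of that queue-ahead pattern, assuming a made-up descriptor layout with three status bits. The `Status`, `Descriptor` and `Channel` names are hypothetical and do not correspond to any real network card. It only shows the shape of the handshake: the driver queues descriptors, the device marks each one done or failed, and an error halts the channel until the driver resets it.

```python
from dataclasses import dataclass
from enum import IntFlag


class Status(IntFlag):
    NONE = 0
    OWNED_BY_DEVICE = 1   # driver has handed the descriptor to the device
    DONE = 2              # device finished the transfer
    ERROR = 4             # device gave up on the transfer


@dataclass
class Descriptor:
    addr: int
    length: int
    status: Status = Status.NONE


class Channel:
    def __init__(self, memory_size: int):
        self.memory_size = memory_size
        self.queue = []
        self.halted = False

    def submit(self, addr: int, length: int) -> Descriptor:
        """Driver side: queue a transfer ahead of time."""
        desc = Descriptor(addr, length, Status.OWNED_BY_DEVICE)
        self.queue.append(desc)
        return desc

    def device_run(self) -> None:
        """Device side: process descriptors in order until an error."""
        for desc in self.queue:
            if self.halted:
                break
            if not desc.status & Status.OWNED_BY_DEVICE:
                continue
            if desc.addr + desc.length > self.memory_size:
                desc.status = Status.ERROR
                self.halted = True      # chain stalls until reset
            else:
                desc.status = Status.DONE

    def reap(self):
        """Driver side: collect finished descriptors, keep pending ones."""
        finished_bits = Status.DONE | Status.ERROR
        finished = [d for d in self.queue if d.status & finished_bits]
        self.queue = [d for d in self.queue if not d.status & finished_bits]
        return finished

    def reset(self) -> None:
        """Driver side: clear the halt so the device can continue."""
        self.halted = False
```

If the driver never calls `reap()` or `reset()`, the third transfer in a chain like good, bad, good just sits there, which is the stall described above.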
You realize the same idea scales up when you look at modern chipsets with multiple DMA engines running in parallel. I watched one system push storage traffic and network traffic at the same time without the processor breaking a sweat. But you have to map memory regions correctly or the engine writes to the wrong spot. On newer platforms an IOMMU adds a protection layer that older designs never had, restricting each device to the memory mapped for it. Either way, the software still has to program the base address and count registers before anything starts. I keep coming back to that basic handshake because it decides whether your transfers fly or crawl.
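To make the base-and-count handshake concrete, here is a small sketch with a stand-in for the IOMMU check. Both classes, `ToyIommu` and `ToyDmaEngine`, are invented for this example. A real IOMMU translates addresses through page tables rather than checking a list of windows, so treat this as the protection idea only.

```python
class ToyIommu:
    """Tracks which memory windows a device is allowed to touch."""

    def __init__(self):
        self.windows = []   # list of (start, end) pairs, end exclusive

    def map(self, start: int, length: int) -> None:
        self.windows.append((start, start + length))

    def allows(self, start: int, length: int) -> bool:
        end = start + length
        return any(lo <= start and end <= hi for lo, hi in self.windows)


class ToyDmaEngine:
    """One channel with a base address register and a count register."""

    def __init__(self, memory: bytearray, iommu: ToyIommu):
        self.memory = memory
        self.iommu = iommu
        self.base = 0    # where the next byte will land
        self.count = 0   # how many bytes are still allowed

    def program(self, base: int, count: int) -> None:
        """Software step: must happen before any transfer starts."""
        self.base = base
        self.count = count

    def start(self, data: bytes) -> int:
        """Move up to `count` bytes from the device into memory."""
        n = min(self.count, len(data))
        if not self.iommu.allows(self.base, n):
            raise PermissionError("transfer outside mapped window")
        self.memory[self.base:self.base + n] = data[:n]
        self.base += n    # address register advances
        self.count -= n   # count register runs down to zero
        return n


if __name__ == "__main__":
    mem = bytearray(64)
    iommu = ToyIommu()
    iommu.map(16, 16)
    engine = ToyDmaEngine(mem, iommu)
    engine.program(16, 8)
    print(engine.start(b"12345678ZZ"), bytes(mem[16:24]))
```

Programming a base address outside the mapped window makes `start()` raise instead of silently writing to the wrong spot, which is the failure mode older designs could not catch.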
You see why DMA matters for real-time work once audio or video streams start flowing. I set up a capture card that used DMA and the frames arrived without gaps even under heavy load. But the driver had to lock those pages so they would not get swapped out mid-transfer. The operating system marks the region as non-pageable (pinned) and everything stays put. The same trick appears in graphics cards when they pull textures straight from system RAM. I tried moving a big model without DMA and the frame rate tanked immediately.
You get used to thinking about bus mastering as just another tool after a while. I explained the concept to another junior guy and he finally understood why his old polling loop felt so slow. But you still need to watch for deadlock if two masters grab resources in the wrong order. Some designs have a timeout circuit that kicks in and forces a release before things freeze. Others leave it to the firmware to give the CPU a chance to intervene when it detects trouble. I ran into that exact situation once and had to patch the driver on the spot.
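The timeout-and-release idea has a close software analogue, sketched here with ordinary Python locks standing in for bus resources. This is an analogy, not how a hardware timeout circuit works. The helper name `acquire_both` and the timeout value are assumptions for the example.

```python
import threading


def acquire_both(first: threading.Lock, second: threading.Lock,
                 timeout: float = 0.1) -> bool:
    """Try to take two resources, backing off instead of waiting forever.

    Returns True with both locks held, or False with neither held.
    """
    if not first.acquire(timeout=timeout):
        return False
    if not second.acquire(timeout=timeout):
        # Could not get the second resource in time: give the first one
        # back so another master can make progress.
        first.release()
        return False
    return True


if __name__ == "__main__":
    bus = threading.Lock()
    buffer_slot = threading.Lock()

    buffer_slot.acquire()                          # someone else holds it
    print(acquire_both(bus, buffer_slot, 0.01))    # backs off
    buffer_slot.release()
    print(acquire_both(bus, buffer_slot))          # now succeeds
```

The other half of the fix is to have every master take the resources in the same order, so the wrong-order case never comes up in the first place.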

bob
Joined: Dec 2018
© by FastNeuron Inc.
