Next: What is the difference between using IOCTL to write/read IO ports vs. modifying IOPM and using _INP/OUTP_ ??
From: L337 on 1 Sep 2009 14:12
Hello guys.

Brief question for someone who understands the Windows I/O Manager and I/O subsystem model. I have a PCI device that operates at I/O-mapped address 0x300. To use this card, we used a DLL written in C and called from VB6 back in Win 98. In WinXP, I now use UserPort, which claims to modify the permission map of the Windows subsystem to let all user-mode programs run at Ring 0.

When I use the C DLL to do writes and reads, I achieve a speed of 1.65 uS between writes. Decent speed. But I wondered whether I could get any faster than this, so I decided to try IOCTL calls such as IOCTL_WRITE_PORT_UCHAR and IOCTL_READ_PORT_UCHAR, as provided by PortTalk and other I/O port programs. So now I am essentially using a C program that calls the device driver through an IOCTL to access the hardware. The best speed I got was 8 to 10 uS between consecutive writes or reads. By consecutive, I mean doing something like this in the main() routine:

outportb(0x378, 0xFF);
outportb(0x378, 0xFF);

where the function outportb is actually a DeviceIoControl call.

How am I measuring the time between requests? Simple: in real time with an oscilloscope. Each time an I/O request is made on the PCI bus, my PCI card responds with DEVICE SELECT. I use this signal to trigger the scope, and I can see when the device asserts and releases the bus. This is how I know how much time it takes for the OS to process my write/read requests.

So my question is: why does the system process my I/O writes and reads more slowly when I go through the kernel-mode driver, while I get much faster speeds when I use UserPort or PortTalk to modify the IOPM (I/O Permission Map) of the Windows XP subsystem and then use either assembly code or Visual C++ DLL commands like inportw() or outpw()?

I would prefer to go through a kernel-mode driver and talk to it rather than modify the IOPM to let me through, as many websites have said. But why is it so slow when I do that? Is there THAT much overhead?
From: L337 on 1 Sep 2009 14:13
Also, here is the other thing I am wondering about. Which part of the I/O subsystem or I/O Manager is responsible for handling the INP and OUTP instructions? I don't understand which part of the OS, kernel, or I/O subsystem handles these requests and passes them to the HAL and then to the hardware, because it is doing so a heck of a lot faster than DeviceIoControl and IOCTL calls.

I did some reading and came to some conclusions:

- The C source code uses the INP/OUTP functions provided by the Visual C++ DLL.
- But the actual INP/OUTP is in assembly, such as:
    __asm mov edx,portid
    __asm mov al,value
    __asm out dx,al
- The x86 processors (Intel-based) have built-in routines to handle IN/OUT instructions as a "processor-level" command. So does this mean there is a kernel driver, maybe NTIO.DLL or something, that passes the IN/OUT command straight to the processor level, thereby bypassing the entire I/O Manager?

Thanks
From: Don Burn on 1 Sep 2009 14:28
It is slower because there is more overhead. The problem is that you have not written a driver designed to manage the card; instead you are trying to use a generic driver to access ports one at a time. So with the driver you pay the overhead of a system call on every access.

Of course, if this is a real device, using the IOPM modification trick will have your customers screaming, since you have just opened a security hole. So the question is: is this a real device that will be distributed? If so, start looking at writing a driver. If this is a device only for the lab, go ahead and modify the permission map.

--
Don Burn (MVP, Windows DKD)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply
From: L337 on 1 Sep 2009 16:39
Thanks for the response. I appreciate it.

Yes, I am using a skeletal driver from the DDK to learn how to send the commands. No, this PCI card will not be distributed to customers. It is used primarily in our lab and company-wide.

Basically, the card can only accept a single burst I/O write transaction (a single DWORD) at a time, because the hardware it is attached to outside the PC clocks in the data at about 40 MHz. Sending multiple DWORDs in the same PCI transaction would be pointless, as our hardware cannot clock them in that fast. At the same time, we want to minimize the "access time" or "response time" from executing a software command to the moment the PCI bus executes the I/O transaction.

Basically, is it possible to match the same speed if I were to develop the proper driver? Perhaps I can keep the ports opened with DeviceIoControl, so that they are already available the next time I come around and do multiple burst writes or reads?
From: L337 on 1 Sep 2009 16:44
On Sep 1, 1:12 pm, Kalle Olavi Niemitalo <k...(a)iki.fi> wrote:
> L337 <vern.engineer...(a)gmail.com> writes:
> > -the x86 processors (intel-based) have built-in routines to handle IN/
> > OUT instructions on a "processor-level" command. So does this mean,
> > there is a kernel driver, maybe NTIO.DLL or something that passes the
> > IN/OUT command straight to the processor level, thereby bypassing the
> > entire IO Manager?
>
> After UserPort has changed the I/O permission map, the processor
> does not trap to the kernel at all when it executes the in and
> out instructions in your program. It thus avoids not only the
> overhead of the I/O manager but the rest of the kernel as well.

Interesting. Someone told me this, but I didn't quite understand it. So basically, you're saying that my software is now bypassing the I/O Manager and the HAL? Or is it communicating straight with the HAL? I am going by this NT architecture model:
http://en.wikipedia.org/wiki/File:Windows_2000_architecture.svg

The part that I do not understand clearly is that the INP/OUTP commands in Visual C (DLL) are actually macros for the IN/OUT instructions in assembly. I read that these are actual x86 Intel "processor-level" instructions. What I would like to understand is: after I send these commands from software, which part of the hardware layer takes these assembly instructions and sends them to either the HAL or the processor? Is it the I/O Manager or the HAL itself?

I just don't understand, in the NT model, which layer receives these instructions and passes them down to the HAL once I have changed the IOPM and am using IN/OUT commands via the DLL.

Thanks.