From: Charles Oliver Nutter on 25 Jan 2010 13:12

On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio(a)gmail.com> wrote:
> Charlie, you're making a great case against using FFI.

FFI is much better than writing any C code at all, due to the
security, stability, and portability problems of writing your own C
bindings. If you are permitted to load a given library and that
library is available and you *must* use that library, FFI is the only
logical choice. But it doesn't get around the fact that you need the
library you're binding to be available and loadable on your target
platform. FFI > C bindings, but [platform-independent binary] > FFI.
And that usually means Java-based.

I should also point out that you don't necessarily have to write JVM
libraries in Java; you could also use Scala or Fan or similar
languages, and it would be just as portable (albeit a bit larger due
to the runtime dependency on those languages' runtime libraries).

But yes, at the end of the day, I believe writing stuff in a portable
binary format like JVM bytecode (or CLR bytecode) is a better choice
than writing in a language that has to be recompiled for every target
system. You ought to know that already...would I be working on JRuby
if I believed any differently? :)

And yes...I'd love to be able to recommend that everyone just use Ruby
for everything. But I don't think it's simply a performance issue;
there are some pretty amazing things you can get for free with a rich
static type system.

- Charlie

From: Aaron Patterson on 25 Jan 2010 13:53

On Tue, Jan 26, 2010 at 03:12:17AM +0900, Charles Oliver Nutter wrote:
> On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio(a)gmail.com> wrote:
> > Charlie, you're making a great case against using FFI.
>
> FFI is much better than writing any C code at all, due to the
> security, stability, and portability problems of writing your own C
> bindings.

References please. Last I checked, it was just as easy to segv from an
FFI library as a C library. Plus with FFI you don't get any benefits of
compile time checks. You can't, for example, check for #define
constants.

With FFI you must:

  1. Duplicate header files (see below for more problems)
  2. Understand struct layouts and the sizeof() for each member
  3. Do runtime checking of library features
  4. Worry about weak ref maps when using void pointers (see the
     id2ref problem in nokogiri)
  5. Pay a runtime conversion price from ruby data types to FFI types
  6. Educate users on LD_LIBRARY_PATH
  7. Worry about 32bit and 64bit issues (like Tony mentioned)

The duplication of header files becomes an even larger problem if the
library you're wrapping changes its struct layout. Where a simple
recompile would have solved the problem, now (without warning) you're
getting surprising values in your FFI program. Plus typical debugging
tools like gdb get you nowhere.

Example:

Library "foo" ships with a struct like this:

  struct awesome {
    float hello;
    char * world;
  };

Then later changes to:

  struct awesome {
    char * world;
    float hello;
  };

You wrapped the first one, upgrade the library, then boom. It doesn't
work. With a compiled program, you wouldn't care.

Unfortunately, none of the problems I've just listed off are
theoretical. I have personally run into every one of them and can
provide you with real world examples.

FFI is awesome for certain, confined, small, stable use cases. I use
FFI, and I enjoy it. But saying that it's "the only logical choice"
seems wrong.

I am curious what your experience has been, and why you haven't run
into the same problems? How do other people overcome these issues?

--
Aaron Patterson
http://tenderlovemaking.com/

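The struct-reordering failure in Aaron's example can be reproduced in miniature. The sketch below is only an illustration under stated assumptions: it uses Python's ctypes module as a stand-in for an FFI binding layer (the thread itself is about Ruby FFI), and the names AwesomeV1, AwesomeV2, field_offsets and read_hello are hypothetical, not part of any library mentioned in the thread. It shows that a hand-copied layout keeps "working" after the upgrade but reads the wrong bytes.

```python
import ctypes


class AwesomeV1(ctypes.Structure):
    """Layout copied by hand from the first release of library "foo"."""
    _fields_ = [("hello", ctypes.c_float),
                ("world", ctypes.c_char_p)]


class AwesomeV2(ctypes.Structure):
    """Layout after the upgrade: same members, swapped order."""
    _fields_ = [("world", ctypes.c_char_p),
                ("hello", ctypes.c_float)]


def field_offsets(struct_type):
    """Map each field name to its byte offset inside the struct."""
    return {name: getattr(struct_type, name).offset
            for name, _ctype in struct_type._fields_}


def read_hello(binding_type, hello=1.5):
    """Build the struct as the upgraded library would (V2 layout), then
    read `hello` through `binding_type`, the layout the binding assumes."""
    produced = AwesomeV2(world=b"world", hello=hello)
    raw = ctypes.string_at(ctypes.addressof(produced),
                           ctypes.sizeof(produced))
    view = binding_type.from_buffer_copy(raw)
    # Only the float is read: dereferencing a misplaced pointer could crash.
    return view.hello


if __name__ == "__main__":
    print("V1 offsets:", field_offsets(AwesomeV1))
    print("V2 offsets:", field_offsets(AwesomeV2))
    print("up-to-date binding reads hello =", read_hello(AwesomeV2))
    print("stale binding reads hello      =", read_hello(AwesomeV1))
```

With the stale V1 description, the value printed for hello is whatever the first four bytes of the pointer happen to decode to as a float, and no error is raised. That is the "surprising values, without warning" case described above; a C extension rebuilt against the new header would pick up the new offsets at compile time.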
From: Mike Dalessio on 25 Jan 2010 14:12

On Mon, Jan 25, 2010 at 1:12 PM, Charles Oliver Nutter <headius(a)headius.com> wrote:
> On Mon, Jan 25, 2010 at 6:25 PM, Mike Dalessio <mike.dalessio(a)gmail.com>
> wrote:
> > Charlie, you're making a great case against using FFI.
>
> FFI is much better than writing any C code at all, due to the
> security, stability, and portability problems of writing your own C
> bindings. If you are permitted to load a given library and that
> library is available and you *must* use that library, FFI is the only
> logical choice. But it doesn't get around the fact that you need the
> library you're binding to be available and loadable on your target
> platform. FFI > C bindings, but [platform-independent binary] > FFI.
> And that usually means Java-based.
>
> I should also point out that you don't necessarily have to write JVM
> libraries in Java; you could also use Scala or Fan or similar
> languages, and it would be just as portable (albeit a bit larger due
> to the runtime dependency on those languages' runtime libraries).
>
> But yes, at the end of the day, I believe writing stuff in a portable
> binary format like JVM bytecode (or CLR bytecode) is a better choice
> than writing in a language that has to be recompiled for every target
> system. You ought to know that already...would I be working on JRuby
> if I believed any differently? :)

I agree with everything you're saying, more or less.

However, none of that relates at all to what I think is the crux of the
issue, which is that everyone writing a non-pure-Ruby gem today is forced
to choose one of these options:

1) Support nearly everyone by maintaining two ports of your code: FFI for
   JRuby; C for MRI, Rubinius and MacRuby. Don't support GAE.
2) Support everyone by maintaining two ports of your code: JVM for JRuby
   and GAE; C for MRI, Rubinius and MacRuby.
3) Maintain only a single port, FFI, and force everyone on MRI to take a
   performance hit of some kind. Oh, and don't support Rubinius, MacRuby
   or GAE.
4) Don't support JRuby or GAE. Just write it in C.
5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.

Complicated? Yes. I've summed it all up in a nice matrix here:

  http://gist.github.com/286126

I personally think these choices all suck, and I refuse to paint a happy
face on any of them. We chose option 1 for Nokogiri (you're welcome,
intarnets), but everyone who's writing a gem today has to make this
decision for themselves.

My point is that every one of these choices involves a tradeoff, and
stating that one in particular "hurts" people more than another is just
disingenuous. I'd rather help people understand the tradeoffs.

From: Chuck Remes on 25 Jan 2010 14:17

On Jan 25, 2010, at 1:12 PM, Mike Dalessio wrote:
> I agree with everything you're saying, more or less.
>
> However, none of that relates at all to what I think is the crux of the
> issue, which is that everyone writing a non-pure-Ruby gem today is forced to
> choose one of these options:
>
> 1) Support nearly everyone by maintaining two ports of your code: FFI for
> JRuby; C for MRI, Rubinius and MacRuby. Don't support GAE.
> 2) Support everyone by maintaining two ports of your code: JVM for JRuby and
> GAE; C for MRI, Rubinius and MacRuby.
> 3) Maintain only a single port, FFI, and force everyone on MRI to take a
> performance hit of some kind. Oh, and don't support Rubinius, MacRuby or
> GAE.
> 4) Don't support JRuby or GAE. Just write it in C.
> 5) Don't support MRI, Rubinius, or MacRuby. Just write it for the JVM.

FFI originated with rubinius, so I would wager that it will work once
the FFI APIs get synched up again. Also, MacRuby has FFI support on its
roadmap.

That changes your picture a bit.

cr

From: Aaron Patterson on 25 Jan 2010 14:31

On Tue, Jan 26, 2010 at 04:17:32AM +0900, Chuck Remes wrote:
> FFI originated with rubinius, so I would wager that it will work once
> the FFI APIs get synched up again. Also, MacRuby has FFI support on its
> roadmap.
>
> That changes your picture a bit.

Rubinius implements enough of the MRI C API that it will run Nokogiri
today. MacRuby will follow suit, and I expect that to happen sooner than
it supports FFI (though this is conjecture).

With minor tweaks to your C code, you can have a native extension that
runs on all three *today*.

--
Aaron Patterson
http://tenderlovemaking.com/