Prev: Black-Jewish History FAQ - Is The Secret Relationship between blacks and jews a “hate literature”?
Next: which library to generate xml with namespaces?
From: Tim Bradshaw on 18 Jun 2010 11:09 On 2010-06-17 21:46:23 +0100, Eric Wolf said: > And I thought: "Hmm that doesn't look elegant" but I'm not used to > recursion and functional programming and that stuff so I didn't > came up with a better version. This is kind of a counterclaim to the "use some complicated abstract data structure" stuff (See Gabriel on that). Depends on tail-call elimintation if you want to use it on long lists (and of course in real life you would use alist functions not roll your own as here). (defun occ (list &key (test #'eql)) (labels ((oc (lt al) (if (null lt) al (oc (rest lt) (oca (first lt) al al)))) (oca (s alt al) (cond ((null alt) (cons (cons s 1) al)) ((funcall test s (car (first alt))) (incf (cdr (first alt))) al) (t (oca s (rest alt) al))))) (oc list nil)))
From: RG on 18 Jun 2010 11:22 In article <4c1b7527$0$274$14726298(a)news.sunsite.dk>, "Captain Obvious" <udodenko(a)users.sourceforge.net> wrote: > R> IMHO a better way to approach this, particular for beginners, is to use > R> abstract associative maps rather than association lists. A CL > R> implementation of abstract associative maps can be found here: > > What's wrong with hash-tables which are in the standard? > > R> (defun count-items (list) > R> (let ((c (make-instance 'counter))) > R> (dolist (item list) (add c item)) > R> c)) > > (defun count-items (list) > (let ((ht (make-hash-table))) > (dolist (item list) > (incf (gethash item ht 0))) > ht)) > > The only difference I see is that GETHASH accessor has default value and you > can use INCF directly on it, no need for a goddamn additional function for > that. Wow, you're the second person to react violently to the lack of a setf expander for REF with a default argument. Why is that so important? It seems like such a trivial issue to me. You can already do this: (defmethod add ((c counter) item) (incf (refd c item 0))) if the explicit call really rubs you the wrong way, and making it work for REF would be trivial. But to answer your question directly: > What's wrong with hash-tables It's a premature optimization. What if hash tables are not the right data structure? What if it turns out that a red-black tree would be more efficient for the particular data you are processing? Or what if you have too much data to fit in main memory and you want to use a database as a backing store? If you've hard-coded in calls to GETHASH you now have to change ALL your code. If you use abstract associative maps you only have to change the underlying implementation. You can even do this at run time: ? (setf c (make-instance 'counter)) #<COUNTER HASH-TABLE { }> ? (add c 'foo) 1 ? (add c 'baz) 1 ? (add c 'baz) 2 ? c #<COUNTER HASH-TABLE { BAZ 2 FOO 1 }> ? (describe (slot-value c 'implementation)) #<HASH-TABLE :TEST EQUAL size 2/60 #x3020019B5CCD> Type: HASH-TABLE Class: #<BUILT-IN-CLASS HASH-TABLE> TYPE: (HASH-TABLE . #<CCL::CLASS-WRAPPER HASH-TABLE #x3000400296BD>) NHASH.KEYTRANSF: #<Compiled-function CCL::%%EQUALHASH #x3000000AA3AF> NHASH.COMPAREF: #<Compiled-function EQUAL #x30000007B23F> NHASH.REHASH-BITS: NIL NHASH.VECTOR: #<HASH-TABLE-VECTOR #x3020019B5DCD> NHASH.LOCK: 524288 NHASH.OWNER: NIL NHASH.GROW-THRESHOLD: 58 NHASH.REHASH-RATIO: 1.1764705 NHASH.REHASH-SIZE: 1.5 NHASH.PUTHASH-COUNT: 0 NHASH.EXCLUSION-LOCK: #<RECURSIVE-LOCK [ptr @ #xB8EB0C0] #x3020019B5D4D> NHASH.FIND: #<Compiled-function CCL::GENERAL-HASH-FIND #x3000000AD27F> NHASH.FIND-NEW: #<Compiled-function CCL::GENERAL-HASH-FIND-FOR-PUT #x3000000ADF4F> NHASH.READ-ONLY: NIL ? (change-implementation c 'plist) #<COUNTER PLIST { FOO 1 BAZ 2 }> ? (describe (slot-value c 'implementation)) #<PLIST #x302001A3ED2D> Class: #<STANDARD-CLASS PLIST> Wrapper: #<CCL::CLASS-WRAPPER PLIST #x3020018AD7CD> Instance slots PLIST: (FOO 1 BAZ 2) ? You can even make an abstract associative map that automatically chooses an underlying implementation based on run-time profiling without changing your client code. rg
From: Tim Bradshaw on 18 Jun 2010 11:53 On 2010-06-18 16:22:29 +0100, RG said: > You can even make an abstract associative map that automatically chooses > an underlying implementation based on run-time profiling without > changing your client code. I think there's really two approaches here: You can go for the "huge complicated reusable library which does everything" approach. This is what Java does, and is what you're proposing. This is great, except the library will probably not actually do everything, and will turn out to have design deficiencies which make it eithr annoying or impossible to use. If you are using a minority language, then the library will not be very well tested because too few people will be using it. It will fail at crucial moments with results varying from you getting woken up at 3AM every day to death. Your program will also become enormous, you will not be able to understand why it performs so badly, and you will spend the rest of your life trying to understand the library, smelling increasingly bad and eventually dying on a street corner somewhere. Or you can use the facilities provided by the language, and build things which work for your application without stressing too much about making them do everything. You will be laughed at by your peers, but your program will work, you will understand it, and it will perform well. You will become rich, and will be able to spend your time foisting huge complicated reusable libraries on your former peers, who are now your slaves, and laughing at them as they starve in the gutter while trying to understand a million pages of documentation, which you change every two weeks. Very strong-willed people can achieve the latter even when using a language such as Java which is predicated around the former.
From: Captain Obvious on 18 Jun 2010 12:11 ??>> The only difference I see is that GETHASH accessor has default value ??>> and you can use INCF directly on it, no need for a goddamn additional ??>> function for that. R> Wow, you're the second person to react violently to the lack of a setf R> expander for REF with a default argument. Why is that so important? It R> seems like such a trivial issue to me. You can already do this: R> (defmethod add ((c counter) item) R> (incf (refd c item 0))) R> if the explicit call really rubs you the wrong way, and making it work R> for REF would be trivial. Yes, it is trivial issue, but when problem itself is so simple, the only things which are left are trivial issues. R> But to answer your question directly: ??>> What's wrong with hash-tables R> It's a premature optimization. Using something else other than hash-tables is a premature optimization! Because they are in standard, they work reasonably well for a wide variety of situations and have quite sane API. Each line of code which is unnecesarry and does not do anything useful is over-engineering and premature optimization. In your example those are: (require :dictionary) (defclass counter (dictionary) ()) (defmethod add ((c counter) item) (setf (ref c item) (1+ (ref c item 0)))) Four unnecessary lines vs four useful lines. So half of your code was over-engineering/premature optimization. R> What if hash tables are not the right data structure? You can think about it when you have performance problems. I'd recommend you to read something on premature optimization, you've got it not just wrong, but in reverse! R> What if it turns out that a red-black tree would be more efficient for R> the particular data you are processing? Then I'll refactor my code, but I'll not waste my time thinking about all possible situations before I need that. R> Or what if you have too much data to fit in main memory and you want R> to use a database as a backing store? If you've hard-coded in calls to R> GETHASH you now have to change ALL your code. If it that bad, packages can do that. R> If you use abstract associative maps you only have to change the R> underlying implementation. You can even do this at run time: It is cool, but it is over-engineering. Over-engineering usually looks cool. R> You can even make an abstract associative map that automatically chooses R> an underlying implementation based on run-time profiling without R> changing your client code. This is over-engineering classic.
From: Captain Obvious on 18 Jun 2010 12:39
TB> You can go for the "huge complicated reusable library which does TB> everything" approach. This is what Java does, and is what you're TB> proposing. This is great, except the library will probably not TB> actually do everything, and will turn out to have design deficiencies TB> which make it either annoying or impossible to use. Definitely. When I was working as a C++ programmer I used STL and Boost libraries which are meant to be very general, abstract and flexible, but unfortunately they are often PITA to use and do not have useful functionality people need. I think there might be a fundamental rule which imposes limit on usefulness of "huge complicated reusable" libraries. It might be related to Kolmogorov complexity and its invariance theorem. There is a good discussion of reusability in Richard P. Gabriel's "Patterns of Software" book. |