Posts Tagged lists

Relative CFLoop Performance for Various Loop Structures

Introduction

Jim over at Ben Nadel’s blog made the assertion that looping a list is faster than looping a struct.  It’s an interesting assertion that looping a list would be faster than looping an array.  I did a test of my own to find out.

Setup

Starting with objects with 1 entry populated, I increased the number of populated entries by 1,000 until I reached 100,001 entries in each of an Array, a List, and a Struct.

I also compared the following syntaxes:
<cfloop array=”myArray” index=”x”>
<cfloop from=”1″ to=”#ArrayLen(myArray)#” index=”x”>
<cfloop collection=”#myStruct#” item=”x”>
<cfloop from=”1″ to=”#StructCount(myStruct)#” index=”x”>
<cfloop list=”myList” index=”x”>
<cfloop from=”1″ to=”#ListLen(myList)#” index=”x”>

For each type of test, I did a <cfsavecontent> to cfoutput the relevant entries from each object without causing a ton of data to be sent back to the browser.  The actual <cfloop> and output was tight (all on one line to minimize the total content going to cfsavecontent).

I ran each incremental test 10 times to get an average duration.  I did this because my in my earliest tests it was obvious that garbage collection was causing significant variance (shorter tests often took longer than their longer brethren).  Even still it’s obvious garbage collection played a bigger factor in efficiency than the actual test results, making them somewhat difficult to decipher meaningfully.

Execution

Early on, I dropped the final test (which involved ListGetAt() for the output portion) since while the others were still taking a few milliseconds, this test was already taking multiple seconds.

It should be no surprise that 1 entry of each took no measurable time (getTickCount() had not incremented).

At 10,001 entries, <cfloop array> and <cfloop list> took 5 and 4 milliseconds respectively, but at 11,001 (the next level up), it was 3 and 7 milliseconds respectively – neck and neck, looking in favor of <cfloop array>.  <cfloop ArrayLen()> was taking 9 and 6 seconds, <cfloop collection> 21 and 17 seconds, and <cfloop StructCount()> waas taking 21 and 14.  At this level we could say <cfloop array> and <cfloop list> were close, while the others were not.

At 50,001 entries, <cfloop array> and <cfloop list> are still neck and neck.  They reported identical times of 11.00 milliseconds.  <cfloop ArrayLen()> weighed in at 56.10ms, <cfloop collection> at 126.2ms, and <cfloop StructCount()> at 91.2ms.

By the final test of 100,001 entries, <cfloop array> and <cfloop list> had finally started to show a real difference.  <cfloop array> pulled ahead as the clear winner, with at least a 10ms margin on all but one of the the 10 tests leading up to 100,001.

Cfloop Performance Comparison
(Click for full version)

So what causes such a performance difference?

<cfloop ListLen()> was the clear loser by a wide margin.  ListGetAt() is horribly inefficient.  Each time you call it, it has to consume the string from the start, counting delimiters until it reaches position-1, then continue consuming until it reaches position.  This is no surprise at all.

<cfloop collection> was the clear next-to-last-place loser.  The most likely reason for this to me is that internally it needs to loop over some other type of object (in this case, most likely an array of struct keys), then needs to do a lookup within the actual structure itself (a HashMap of some form).

<cfloop StructCount()> came next.  This was a bit of an non-real world scenario.  We didn’t have to loop over a collection of struct keys, we knew what the struct keys were and were able to perform the loop itself in a faster fashion than looking up keys.  But to output the value, we still needed to do HashMap lookups on the keys themselves.

<cfloop ArrayLen()> won the bronze.  Seems to me that this is because we have to do a positional lookup to output the value for each entry.  That shouldn’t be huge because internally this should be represented as a vector of pointers to the actual entries, and since each vector has a fixed memory size, we take ArrayMemoryStartPoint + position * sizeof(vector).  Well, would be nice if that’s the case, but it’s not quite that simple since these arrays are dynamically sized.  We actually have to do a bit of work with a table of memory references, blah blah blah.  Point is we have to do a lookup (albeit cheaper than a struct lookup) to find the correct entry.

<cfloop list> took the silver.  This makes sense, the iterator involved in the cfloop can remember where it left off, and each time through the loop only has to consume up to the next delimiter.  It knows what the index of the previous delimiter is, and so the only lookup it has to do at all is collecting the characters from offset A to offset B (which it could actually have kept track of while it was consuming).

<cfloop array> took the gold, but not by a large margin.  This also makes sense because it has the same sort of performance gain that <cfloop list> has (already knowing its current offset for the next time through the loop), and all it has to do is return a pointer to the next element in the vector.  The vector iterator handles this stuff at a lower level and is able to make the best use of its own internal structures to make this as fast as possible.

Conclusion

<cfloop array> and <cfloop list> are pretty interchangeable even though the former wins the contest.

All of this being said, we are still looking at loop overhead of less than 300 milliseconds even for the slowest (other than <cfloop ListLen()>) loop type for 100,000 records.  This is incredibly trivial overhead for that amount of data.  So in reality which type you use probably depends more on what structures you’re already using – trying to convert to a different data container for your loop will probably cost you as many or more cycles than using the native loop type associated with wherever your data currently is.

, , , ,

6 Comments