GetNth Revisited - Helpful function? Or SPAWN OF THE DEVIL?
Bob Balaban August 16 2008 04:22:09 AM
Greetings, Geeks!!I was browsing my feed reader this morning, and came across a couple of interesting posts on the TeamStudio site (see the posts titled "Show Time" and "Time Differences").
Sigh. More discussion about NotesDocumentCollection.GetNth..... This topic has been a pimple on the butt of Notes developers for, let's see, EVER (ok, well, actually since Notes 4.0 shipped in January, 1996, so really only for 12 years....). I was moved to blog about it, one more (and I sincerely hope, LAST) time.
I've written about it, talked about it at conferences and among friends, and it STILL comes up every now and then. So, ok. ONE MORE TIME!
IF (you have a DocumentCollection object -- doesn't matter whether it's LotusScript, Java, COM, whatever language binding),
AND (your DocumentCollection object is an UNsorted list -- i.e., the result of a Database.AllDocuments, or Database.Search operation, NOT an FTSearch)
THEN iterating that collection using the DocumentCollection.GetFirst/GetNext pattern is MUCH (i.e., A LOT) faster than iterating using DocumentCollection.GetNth(index).
CAVEATS:
- For small collections the difference doesn't matter
- I'm talking about navigation time, regardless of what happens to the documents themselves inside your iteration loop.
- I'm ONLY talking about DocumentCollection navigation here, VIEW navigation is completely different (GetNth in a View is just as fast as GetFirst/GetNext)
- Your results may well be version-dependent! (more on this later...)
How do I know? How do i know for sure? I wrote an experiment, which I will now share with you, gentle geek readers. Here's what I did:
- Using Notes/Designer v7.02,
- Created a blank database
- Wrote a LS agent to populate the database with 50,000 documents (see code, below)
- Wrote another LS agent to iterate all the documents in the database (see code, below), first using GetFirst/Next, then using GetNth
- The agent times each full iteration, and reports the elapsed time of each.
Results? The envelope, please....
- GetFirst/Next: 1 second
- GetNth: 446 seconds
You can't tell me that's not a significant difference. I wondered whether someone at Lotus had managed to fix this forever problem recently, so I asked my colleague Steve Mullen to run the test on Notes 8. Steve reported results almost identical to the ones I receved ("painfully slow"). Twelve years and counting, DocumentCollection.GetNth is still.....painful.
So, What's Going on Here?
Why so slow with GetNth? It's because unsorted DocumentCollections, internally, represent a data structure called an "IDTable", a compressed list of Note IDs (see the Lotus CAPI toolkit for documentation, this stuff has been around a long time). There's no way to find the Nth entry in the table other than by starting at the beginning and counting to N. So, to iterate the entire list of X entries will take (x*(x-1))/2 CAPI calls (which is Order(n-squared). In other words, for 50,000 documents, you're going to call the IDTable interface something like 1,249,975,000 times. As Kurt Vonnegut might have said, "That's a lot of times!" The shocker (to me) is that it ONLY takes 7 or 8 minutes (on my laptop).
So, there you go. I think I'll write one more blog entry on this topic, as I recently came across a situation where I had no choice -- I HAD to use GetNth to iterate a document collection (first time in 12 years!). Stay tuned!
Here's the code for my "PopulateDb" agent:
Sub Initialize
Dim s As New notessession
Dim db As NotesDatabase
Dim i As Long
Dim doc As NotesDocument
Set db = s.CurrentDatabase
For i = 0 To 49999
Set doc = db.CreateDocument()
doc.somefield = i
doc.Save True, False
Next
End Sub
And here's the code for my test agent:
Sub Initialize
Dim s As New notessession
Dim db As notesdatabase
Dim dc As NotesDocumentCollection
Dim doc As notesdocument
Dim dtStart As NotesDateTime
Dim dtEnd As NotesDateTime
Dim elapsed As Long
Dim i As Long
Set db = s.CurrentDatabase
Set dc = db.AllDocuments
Set dtEnd = s.CreateDateTime("")
Set dtStart = s.CreateDateTime("")
dtStart.SetNow
Set doc = dc.GetFirstDocument
While Not (doc Is Nothing)
Set doc = dc.GetNextDocument(doc)
Wend
dtEnd.SetNow
elapsed = dtEnd.TimeDifference(dtStart)
Msgbox "GetFirst/Next elapsed time = " & elapsed
dtStart.SetNow
For i = 0 To dc.Count
Set doc = dc.GetNthDocument(i)
Next
dtEnd.SetNow
elapsed = dtEnd.TimeDifference(dtStart)
Msgbox "GetNth elapsed time = " & elapsed
End Sub
Try it yourself!
(Need expert application development architecture/coding help? Contact me at: bbalaban, gmail.com)
Follow me on Twitter @LooseleafLLC
This article ©Copyright 2009 by Looseleaf Software LLC, all rights reserved. You may link to this page, but may not copy without prior approval.
- Comments [0]