Bob Balaban's Blog

     
    Bob Balaban

     

    Geek-o-Terica 9: Memory Sub-allocator (or, why your server’s memory usage only goes up), and Happy Father’s Day

    Bob Balaban  June 26 2009 12:12:42 AM
    I did an exploration of product memory management many years ago. I worked on a couple of different Lotus products (1-2-3, Notes) that had sophisticated memory-management packages in them. All of them were what you might call "sub-allocators": they grabbed big chunks of memory from the operating system, then broke each big chunk up into smaller chunks for program code to use.

    This past Sunday was Father's Day, a nice, family-oriented holiday, essentially invented by Hallmark. I got to thinking about my father, and also about my father's father. He was an ordinary, yet interesting man, born either on the boat over from the Ukraine (or maybe Byelorus, the people who knew for sure have been dead a long time, and nobody's sure), or shortly after arrival in 1900. His name was Sam. My father relates that Sam went into the clothing business in New York City as a young man. When the Depression hit hard, in the early 1930s (editorial comment: it was MUCH worse than the one we're in now, bad as that is...), Sam got laid off, and had the guts to start his own business. He formed a company to buy large amounts of fabric from the cloth manufacturers (this was when the United States still had a textile industry, now long gone), and sell that cloth in smaller chunks to the clothing makers of New York's Garment District. One of his sons, my Uncle Ed, joined the business. My father, Al, did not. Al went to college, then got drafted into the Army during World War 2.

    My grandfather used to be a sub-allocator, though he never would have called it that. The analogy works really well. Sam thought of himself as a broker: he'd get enough money together (couldn't have been easy in the early 30's) to buy big bolts of cloth, haul a sample case around to the clothing manufacturers and sell them what they needed. He tended to specialize in cloth for women's dresses. Sam's company lasted until the mid-1960s; I remember going to visit "the office" once or twice. Rolls of cloth everywhere.

    How is that like memory management in Notes or 1-2-3 (does anyone still use 1-2-3?)? There were 2 problems with memory allocation in earlier versions of Windows. First, each chunk that you got from the OS represented one "system handle", regardless of the size of the chunk. System handles were a finite resource: you couldn't get more than (in one version of Windows NT, as I recall) 4096 of them. When they were gone, you were "out of memory", and you'd most likely crash. The second problem was that it was slooowwww. Not so much getting the memory, but "freeing" it: returning memory you no longer needed back to the system was very slow. Why? Because of something called "free block consolidation". When you told the OS you were done with a particular block of memory, it would try to figure out what other "free blocks" there were (if any) adjacent to yours, so it could consolidate them, raising the odds that the next big allocation would easily find a contiguous chunk of the size needed. Otherwise, the system's memory would become "fragmented", and allocating new chunks would become much more difficult.

    So what does sub-allocation do for you? Say 3 pieces of a program each want around 100KB of memory for a little while. They could each go to the Windows API and request 100KB, but that would use up 3 (scarce) system handles. But what if, instead, we had a "memory manager", a "broker" (we could name it Sam), who would track requests for memory allocations? "Sam" could request, say, 64MB of memory from the OS all at once, then carve 3 100KB blocks out of that big chunk as requested, using only 1 system handle. If we wrote Sam to be clever enough, he'd have an efficient way of knowing which bits of his big 64MB chunk were being used, and which were not. And when the 3 program tasks were done with their 100KB bits, they could "free" them with a call back to Sam, who could efficiently consolidate them back into his pool of available memory.
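
    To make "Sam" a little more concrete, here is a minimal sketch in plain Java. It is only a toy: the class name SubAllocator, the first-fit search, and the use of integer offsets instead of real memory are all my assumptions for illustration, not how the 1-2-3 or Notes memory managers were actually written. It shows one big region being carved into blocks, and freed blocks being merged back with their free neighbors.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy sub-allocator ("Sam"): one big region, handed out as smaller blocks. */
class SubAllocator {
    // Free blocks, keyed by start offset -> length; sorted so neighbors are easy to find.
    private final TreeMap<Integer, Integer> free = new TreeMap<>();
    // Blocks currently handed out, keyed by start offset -> length.
    private final Map<Integer, Integer> used = new HashMap<>();

    SubAllocator(int capacity) {
        free.put(0, capacity); // the whole region starts out as one free block
    }

    /** First-fit allocation. Returns the block's offset, or -1 if nothing fits. */
    int allocate(int size) {
        Integer start = null;
        for (Map.Entry<Integer, Integer> e : free.entrySet()) {
            if (e.getValue() >= size) { start = e.getKey(); break; }
        }
        if (start == null) return -1;
        int length = free.remove(start);
        if (length > size) free.put(start + size, length - size); // keep the remainder free
        used.put(start, size);
        return start;
    }

    /** Returns a block to the pool, merging it with any adjacent free blocks. */
    void release(int offset) {
        Integer size = used.remove(offset);
        if (size == null) throw new IllegalArgumentException("not allocated: " + offset);
        int start = offset;
        int length = size;
        Map.Entry<Integer, Integer> before = free.lowerEntry(offset);
        if (before != null && before.getKey() + before.getValue() == offset) {
            free.remove(before.getKey());           // merge with the free block just before
            start = before.getKey();
            length += before.getValue();
        }
        Integer after = free.remove(offset + size); // merge with the free block just after
        if (after != null) length += after;
        free.put(start, length);
    }

    int freeBlockCount() { return free.size(); }

    public static void main(String[] args) {
        SubAllocator sam = new SubAllocator(64 * 1024 * 1024);
        int a = sam.allocate(100 * 1024);
        int b = sam.allocate(100 * 1024);
        int c = sam.allocate(100 * 1024);
        sam.release(b);
        sam.release(a);
        sam.release(c);
        System.out.println("free blocks after releasing everything: " + sam.freeBlockCount());
    }
}
```

    Once all three 100KB blocks have been released, the pool is back to a single free block, which is the whole point of consolidating inside the broker rather than asking the OS to do it.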

    We (the 1-2-3 and Notes product teams) did write such memory managers (we didn't call them Sam, though), and they worked a whole lot better and a whole lot faster than the Windows API did.

    One interesting side-effect of using a memory sub-allocator in a product is something that people who pay careful attention to system memory usage often notice. If you use a system monitor (say Windows Task Manager) to track system memory usage, and then run some kind of performance test suite (maybe a browser client hitting a web page over and over, or hitting different web pages over and over to avoid the page-caching optimization), you see what might appear to be strange behavior: memory usage goes up, up, up, up, and then plateaus, but rarely does it go back down again.

    Why does it happen this way? Well, Sam could have told you. The sub-allocator grabs memory from the OS in big chunks, and parcels it out in smaller chunks. If its block gets used up, it goes back to the OS for another (big) block. Both big blocks would be noticed by Task Manager. But sub-allocating out of those blocks is not registered by Task Manager, because it's done only within the program. When a task (HTTP server, for example) asks for a bunch of memory to process an outgoing web page, it gets a sub-allocated block. When the HTTP task returns ("frees") that smaller block back to the memory manager ("Sam"), the memory manager does NOT go return it to the OS. First of all, it's too small: it doesn't (usually) represent a full "handle's worth" of memory that was obtained from the OS in the first place. Secondly, Sam is going to keep it around because he knows that it's probably going to be needed again, later, for something else.

    That's why you see (at the system level) more and more memory going to the product's process, and only once in a while (when Sam has an entire big block of memory that he got from Windows free again, and decides he doesn't need it anymore) does the number go down again, maybe when the server is idle for a while.
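
    The "goes up, plateaus, rarely comes down" behavior can be sketched in a few lines of plain Java. This is a deliberately simplified model, and every name in it (ChunkPool, CHUNK_SIZE, trimWhenIdle) is hypothetical: it only counts bytes per big chunk and ignores fragmentation inside a chunk. What it illustrates is that the number a system monitor sees depends on how many big chunks the pool holds, not on how much of them is in use.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the OS-visible footprint of a sub-allocating process. */
class ChunkPool {
    static final int CHUNK_SIZE = 1024 * 1024; // assumed size of one request to the OS

    /** One big block obtained from the OS; tracks only how many bytes are handed out. */
    static final class Chunk {
        int used;
    }

    private final List<Chunk> chunks = new ArrayList<>();

    /** Sub-allocates from an existing chunk if one has room, else gets a new chunk. */
    Chunk allocate(int size) {
        if (size > CHUNK_SIZE) throw new IllegalArgumentException("larger than one chunk");
        for (Chunk c : chunks) {
            if (CHUNK_SIZE - c.used >= size) {
                c.used += size;
                return c;
            }
        }
        Chunk fresh = new Chunk(); // "go back to the OS" for another big block
        fresh.used = size;
        chunks.add(fresh);
        return fresh;
    }

    /** Freed memory goes back to the pool, not to the OS. */
    void free(Chunk chunk, int size) {
        chunk.used -= size;
    }

    /** Only chunks that are entirely unused can be handed back to the OS. */
    void trimWhenIdle() {
        chunks.removeIf(c -> c.used == 0);
    }

    /** What a tool like Task Manager would report for this pool. */
    long osVisibleBytes() {
        return (long) chunks.size() * CHUNK_SIZE;
    }
}
```

    In this model, allocating three 400KB blocks pushes osVisibleBytes() to 2MB, and it stays there after all three are freed. It only drops when trimWhenIdle() runs and finds chunks that are completely empty.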

    So, there you have it. Yet another example of art imitating life. Here's to my father Al, on Father's Day, and here's to my Grandfather Sam, sadly now no longer with us. I love you, Dad.
    And here's to all fathers out there, everywhere. Happy Father's Day!

    Geek ya later!

    (Need expert application development architecture/coding help? Contact me at: bbalaban, gmail.com)

    Geek-o-Terica 8: LotusScript, To Delete, or Not to Delete?

    Bob Balaban  June 15 2009 10:41:32 PM
    Greetings, Geeks!

    I have written at length recently about garbage collection in both LotusScript and Java (see Geek-o-Tericas 3, 5, and 6 especially). I went on (and on...) about how and why you need to use the "recycle()" call in the Java classes for Notes, and why that call doesn't really exist in LotusScript, which has a more robust and automated gc mechanism integrated with the Notes classes.

    But, I neglected to mention the built-in LotusScript "delete" statement, and since I've received 1 or 2 questions about it, I decided to do another (not so long, I hope) post on that topic.

    "Delete" in LotusScript is very close in function to recycle() in Java: it destroys an object instance, and causes it to be garbage-collected immediately. If the object you delete is a Notes back-end (OR front-end) class object, then all of the memory associated with that object is freed.

    Ok, that was easy! But the question remains: why would you ever need (or want) to use delete, when the LotusScript gc mechanism is invoked after every statement anyway?

    I have only ever needed it in one particular situation, having to do with (wait for it...) NotesAgent.Run() and NotesAgent.RunOnServer(). Why? Because there's a cute trick you can use to pass "parameters" to agents that you invoke from LotusScript, and which you can also use to get back complex results from an agent. BUT, you need delete.

    So, let's say you have an agent that you want to invoke from a piece of LotusScript code. Maybe the agent is local, or maybe it lives on a server somewhere. The first thing you need to do is "navigate" your way to the agent. Typically you'd do that by using NotesSession.GetDatabase() to get the database containing the agent, then NotesDatabase.GetAgent() to get an instance of NotesAgent.

    If you want the agent code to execute on the same machine as the code that invokes it, you will be using NotesAgent.Run(). If the target agent lives on a server, you can use NotesAgent.RunOnServer() to have the agent code run on that server (your calling program waits for the agent to finish before resuming in both cases). For both methods, as of Notes v5.02, you may optionally supply the NOTEID of a "parameter" document. This comes in handy when you want to be able to pass runtime info to the executing agent to tell it specifically what to do. You just create a document instance (NotesDatabase.CreateDocument) -- locally for the Run() method, on the target server for the RunOnServer() method. You can put any data you want into the new document; obviously, you have to know what the target agent is expecting.

    Then, you have to Save() the document object to disk. Why? Because you need a NOTEID (NotesDocument.NoteID) to pass to the agent, and a document object does not acquire a NOTEID until you've saved it. So then you can pass your newly minted NOTEID to Run() or RunOnServer().

    The target agent is set up and invoked. That agent can find out what NOTEID was passed to it as follows:

         ' Runs inside the target agent: locate the parameter document the caller passed in
         Dim s as New NotesSession
         Dim db as NotesDatabase
         Dim currentagent as NotesAgent
         Dim noteid as String
         Dim parameters as NotesDocument
         set currentagent = s.CurrentAgent
         noteid = currentagent.ParameterDocID     ' the NOTEID given to Run() or RunOnServer()
         set db = s.CurrentDatabase
         set parameters = db.GetDocumentByID(noteid)

    And off you go. Pretty convenient, huh? But we haven't even got to the REALLY cool part yet!

    What if your agent runs, and wants to pass a bunch of results BACK to the calling agent? You can! Just use the existing parameters document you already have. Your target agent can fill it up with whatever data you want to "pass back", and (of course) you have to call Save() on it again to get those changes committed to disk. Then, in your original calling agent, you have to get the modified version of that document back. You can't just use the NotesDocument instance you created to store your invocation parameters, because that document object still exists in memory, and it does not have your target agent's result data in it.

    THAT is why you need delete! Your calling agent has to delete the original parameter document, then re-fetch it with a NotesDatabase.GetDocumentByID() call (it will still have the same NOTEID it had before). Cool, huh?

    Does it work in Java? YES, with only one small difference: you don't get the "current agent" from the Session in Java, you get it from the AgentContext, which you get from the Session. Otherwise, this mechanism works the same in Java as in LotusScript.
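
    If the "stale copy" problem is hard to picture, here is a toy model of it in plain Java. To be clear about the assumptions: ToyDatabase and ToyDocument are made-up stand-ins, not the Notes classes, and the "disk" is just a map. The only thing the sketch is meant to show is that an in-memory document object is a snapshot, so the caller has to throw its old copy away and re-fetch by note ID to see what the agent saved.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a database on disk; NOT the Notes API. */
class ToyDatabase {
    private final Map<String, Map<String, String>> disk = new HashMap<>();
    private int nextId = 1;

    /** An in-memory copy of a stored document, like an open document object. */
    final class ToyDocument {
        String noteId; // assigned on first save, as with a real NOTEID
        final Map<String, String> items = new HashMap<>();

        void save() {
            if (noteId == null) noteId = "NT" + (nextId++);
            disk.put(noteId, new HashMap<>(items)); // commit a copy to "disk"
        }
    }

    ToyDocument createDocument() {
        return new ToyDocument();
    }

    /** Loads a fresh in-memory copy of whatever is currently on "disk". */
    ToyDocument getDocumentById(String noteId) {
        Map<String, String> stored = disk.get(noteId);
        if (stored == null) return null;
        ToyDocument doc = new ToyDocument();
        doc.noteId = noteId;
        doc.items.putAll(stored);
        return doc;
    }
}

class ParameterDocDemo {
    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        ToyDatabase.ToyDocument params = db.createDocument();
        params.items.put("Action", "archive");
        params.save(); // now it has a note ID to pass along

        // The "agent" opens its own copy by note ID, writes a result, and saves.
        ToyDatabase.ToyDocument agentCopy = db.getDocumentById(params.noteId);
        agentCopy.items.put("Result", "done");
        agentCopy.save();

        System.out.println("old copy has Result? " + params.items.containsKey("Result"));
        String id = params.noteId;
        params = null;                   // stands in for LotusScript "delete"
        params = db.getDocumentById(id); // re-fetch to pick up the agent's changes
        System.out.println("re-fetched Result: " + params.items.get("Result"));
    }
}
```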

    So, there's the one situation I've come across personally that requires "delete" (thanks to Daniel Lehtihet for suggesting this as a topic).

    Geek ya later!


    Geek-o-Terica 7: Garbage, threads and the CORBA classes in Notes

    Bob Balaban  May 23 2009 11:15:00 AM
    Greetings, Geeks!

    I actually thought that my previous Geek-o-Terica post on garbage collection (gc) and java and Notes and threads would be the final one on that topic. For one thing, there's only so much you can say about it without falling asleep.

    However, I had a couple of requests to map that discussion onto the "other" Java APIs for Notes, the so-called CORBA classes. It's a good point, so here we go.

    First, a brief review of what's different about the CORBA classes. They're different from the "regular" back-end classes for Notes in that they constitute a Java library (NCSO.jar) that does not need a locally installed Notes Client or Domino Server in order to operate. They work remotely with any Domino server that runs the DIIOP server task. The CORBA classes are so named because the remoting technology they use is based on the Object Management Group's "Common Object Request Broker Architecture" specification.

    The classes in NCSO.jar implement the exact same Java interfaces (Session, Database, View, Document, etc etc) that the "local" classes in Notes.jar do, so you (with a few very minor exceptions) use them the same way. But the implementation is entirely different. When you invoke a method on a "local" object instance, a piece of Java wrapper code calls into the C++ code in Notes/Domino that implements the back-end classes (lsxbe). The C++ code uses the Notes C API, and things rock on.

    The job of the "remote" Java classes is entirely different. Each object instance in NCSO.jar is really just a proxy object: its job is to assemble the input parameters into a command buffer, format it according to the CORBA wire-protocol specification (IIOP, thus the name DIIOP for the server task), and ship it to the server. Each CORBA proxy object is bound to a "real" back-end object on the server, which is the thing that does the real work.

    This architecture has a number of different implications for how you code your CORBA-based Java app:

         1) You still have to use recycle(). Not so much to protect the memory heap of your local JVM (because there are no "real" back-end classes to de-allocate there), but to recover lsxbe/C++ memory on the server.
         
         2) There's no (discernable) connection between threads you may generate on your client machine and threads within the DIIOP server task. Since your program is not directly using the CAPI, your CORBA program is not bound by any of the threading rules you have when running on a Notes/Domino machine using the local classes. You can freely use CORBA objects across threads, no init/term required.

         3) You can actually connect to multiple servers using the CORBA classes. You'd do that by creating one Session instance for each server. But you can NOT mix and match child objects across servers, that dog just won't hunt.

    Hope this helps. (Hope this is enough.)

    Geek ya later!


    Welcome Rocky back to the "independence-sphere"!

    Bob Balaban  May 20 2009 11:29:09 AM
    Allow me to be among the first to welcome my pal Rocky Oliver back to the land of the independent consultants! Read the full story here.

    One small observation I can't help making: for years it looked as though my "career path" (loosely defined) was following Rocky's. I re-joined IBM a few months after he did in 2005, and I left IBM again a few months after he did in late 2007/early 2008.

    This time, I'm actually ahead of him!

    So, Rocky: Welcome back! It is my strong prediction that you will be doing great things, and it is my fervent hope  that we get to work together again.

    Geek-o-Terica 6: Now it Gets Complicated: Java, Garbage-Collection, Notes, and Threads

    Bob Balaban  May 17 2009 12:00:00 PM
    Greetings, Geeks!

    If you're not interested in the swirling, sexy synergy at the intersection of Java memory management and Notes back-end classes and multi-threaded programs, you might want to go catch up on events elsewhere. Don't worry, I don't take this stuff personally.

    Ok! For the 2 of you still here, have you already scanned my previous post on the topic of garbage collection (gc) in Java? The point of this gloss on the topic is to try to explain what happens to gc, recycle() and so on when you also mix in multi-threading. As Jeff Foxworthy says, "And then it got WEIRD..."

    You probably already know that the Notes back-end classes, regardless of what programming language you use to access them, are implemented internally as a bunch of C++ code, which in turn manipulates the product (Notes/Domino) via a rather large C API. Thus, the needs and requirements of the C API drive some of the fundamental behavior of the C++ code that implements the back-end classes, and, to one extent or another, also impose constraints on the various language-specific implementations of your agents, standalone programs, etc.

    One of these "requirements" (or constraints, if you prefer) has to do with a nasty (IMHO) little programming technique called "thread-local storage", or TLS. TLS is one of those ideas that sounds cool, if you're a codegeek, but which can actually really tie you in knots in some situations. Basically, TLS means that any given "thread of execution" within a process can allocate some memory that only it can "see". In other words, your code has to be running on the thread that allocated the TLS in order for that code to be able to access that memory.

    Why would you do that? Well, it's a convenient way to "remember" context in certain situations. If you're a server, for example, and you bind a thread to each client session, so that all client requests are serviced by a single thread, you can use TLS to easily remember stuff like, who is this client anyway? Is she authenticated? What databases is she accessing, and so on.
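
    Java has its own flavor of TLS, the java.lang.ThreadLocal class, and it makes the "only the owning thread can see it" rule easy to demonstrate. The sketch below is plain Java and has nothing to do with how Notes allocates its TLS internally; the CURRENT_USER name and the "nobody" default are just illustrative choices.

```java
/** Shows that a thread-local value is visible only on the thread that stored it. */
class TlsDemo {
    // Every thread gets its own independent copy, starting out as "nobody".
    static final ThreadLocal<String> CURRENT_USER = ThreadLocal.withInitial(() -> "nobody");

    /** Reads CURRENT_USER from a brand-new thread and returns what that thread saw. */
    static String readOnAnotherThread() {
        String[] seen = new String[1];
        Thread other = new Thread(() -> seen[0] = CURRENT_USER.get());
        other.start();
        try {
            other.join(); // join() also makes the other thread's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        CURRENT_USER.set("alice");
        System.out.println("this thread sees:  " + CURRENT_USER.get());
        System.out.println("other thread sees: " + readOnAnotherThread());
    }
}
```

    The thread that called set() gets "alice" back; the second thread only ever sees its own untouched copy.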

    Where it gets nasty is that the use of TLS in Notes imposes two non-breakable requirements on multi-threaded programs which use the CAPI to get things done: 1) Any TLS allocated on a given thread must also be de-allocated on that same thread, and 2) All threads accessing the Notes C API must explicitly initialize (and terminate) themselves before using any CAPI services. Why? Because Notes uses TLS. Since only the thread that creates TLS can "see" it, it's logical that no other thread can de-allocate it. And, like most storage/memory systems in any complicated program (like Notes or Domino), TLS requires per-thread initialization, and therefore, termination.

    Now, when you're using (or writing) a single-threaded program using the CAPI (or back-end classes), there's no problem, because everything happens on the same thread. Examples of single-threaded programs using the Notes C API include: all LotusScript programs; a Notes CAPI program where you don't create extra threads; Java agents or standalone programs where you don't create extra ("child") threads; most of the Notes Client.

    It gets weird fast in Java, though, partly because it's so easy to spin off child threads in that language. So what happens if you create an instance of the java.lang.Thread class and run some code that accesses the back-end classes on that thread? The answer is: it won't work, unless you do the right thread initialization. There are 2 ways to do that:

         A) Make your code use NotesThread instead of Thread (your class can extend it, or you can launch it in any of the ways you can launch Thread). NotesThread extends Thread, and adds just a little bit of logic: it calls the correct Notes CAPI entry points to do the required initialization when it starts running, and when it terminates, it does the required call for termination.

         B) If you don't want to extend NotesThread (or can't, for some reason), any instance of java.lang.Thread can be explicitly initialized (and terminated) with a pair of "static" methods on the NotesThread class. "Static" means that you can call the methods without having an actual instance of the class around. You can call NotesThread.sinitThread() and NotesThread.stermThread() from any thread instance, and then go use back-end classes as you wish.
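
    The shape of the second option is worth spelling out: initialize at the top of the thread's work, and terminate in a finally block so it always happens, and always on the same thread. Since the real NotesThread class needs a Notes installation to run, the sketch below uses a made-up ToyRuntime whose initThread() and termThread() merely play the roles of sinitThread() and stermThread(). Treat it as an illustration of the pattern, not of the actual Notes behavior.

```java
/** Toy stand-in for a native runtime that needs per-thread init/term; NOT the Notes API. */
class ToyRuntime {
    private static final ThreadLocal<Boolean> INITIALIZED = ThreadLocal.withInitial(() -> false);

    static void initThread() { INITIALIZED.set(true); } // plays the role of sinitThread()
    static void termThread() { INITIALIZED.remove(); }  // plays the role of stermThread()

    /** Any "API call" fails on a thread that has not been initialized. */
    static String openDatabase(String name) {
        if (!INITIALIZED.get()) throw new IllegalStateException("thread not initialized");
        return "handle:" + name;
    }

    /** Runs one API call on a plain java.lang.Thread, with or without init/term. */
    static String runOnWorker(boolean initialize) {
        String[] result = new String[1];
        Thread worker = new Thread(() -> {
            if (!initialize) { // the mistake: using the API on an uninitialized thread
                try {
                    result[0] = openDatabase("names.nsf");
                } catch (IllegalStateException e) {
                    result[0] = "error: " + e.getMessage();
                }
                return;
            }
            initThread();
            try {
                result[0] = openDatabase("names.nsf");
            } finally {
                termThread(); // always terminate, and on the same thread that initialized
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnWorker(true));
        System.out.println(runOnWorker(false));
    }
}
```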

    Ok. Still with me? Next question: What the heck does this have to do with recycle()? The bottom line: remember from my previous post how I attributed the need to explicitly recycle() to the fact that Java gc doesn't have a way to invoke Notes API to free up resources? Well, it's actually worse than that. Even if there WERE such a way (e.g., if the finalize() call in Java were actually reliable), it would still be NO GOOD!

    Why? Because of TLS. And, because in the Java virtual machine, garbage collection takes place on its own thread. So even if there were a way to reliably invoke the Notes API when an object (say an instance of lotus.domino.Database) was being gc'ed, it would violate the rule that TLS MUST be de-allocated on the thread that created it!

    So, what happens, then, when you create a Notes object (such as a Database) on one thread, and then use it on another? It works! Why? Because the back-end classes code internally figures out that you're accessing the internal data structures of that object (a CAPI thing called a DBHANDLE, in the case of a Database) on another thread, and it compensates by creating new TLS for that object on your thread. When your thread terminates, the back-end classes logic has to go find EVERY object that has TLS allocated on that thread and reclaim it. This happens automatically if you're using NotesThread, or explicitly when you use the static stermThread() call.

    And now, to get FULLY weird (and, I hope, your head will not explode): what happens if you create (say) a Database object on ThreadA, then access it on ThreadB and ThreadC, and then you decide you're ready to recycle it? Think about it (but not TOO hard)! Which thread should you use to call Database.recycle()? Whichever one it is, doesn't that leave 2 threads' worth of TLS still allocated?

    So, when is a "recycle" not a "recycle"? Answer: when you have an object "open" on multiple threads! The only way to avoid memory leaks in the above situation is to not fully destroy the object until you're done with it on ALL threads that have accessed it.

    How much of this does the everyday Java developer using Notes back-end classes have to worry about? Some, but not a whole lot (again, IMHO and YMMV). Here are a few "best practices" I have developed over the years to keep my own code relatively (if not squeaky) clean:

         1) Don't write multi-threaded programs. They're harder to code, harder to debug and harder to maintain. And you can get unintended memory leaks...
         2) If you MUST write multi-threaded programs, make sure you really have to. If you don't REALLY have to, see point #1. Yes, you can derive big performance gains with multi-threading. But ask yourself this: is it worth the extra pain?
         3) Don't share Notes objects across threads.
         4) If you MUST share Notes objects across threads, make sure you really have to. If you don't REALLY have to, see point #3.
         5) Adopt this convention: If a thread created the object, that thread "owns" the object and ONLY that thread may recycle it. Of course there are a couple of corollaries to this:
              a) ALL objects should be recycled when you're done with them
              b) When you go to recycle an object on the owning thread, make sure no other threads expect that object to still be there. Ideally, you'd have the non-owning threads terminate before the owning thread, then that last-thread-standing can recycle the object(s) safely, and all TLS gets cleaned up.
         6) Try to avoid writing multi-threaded programs, unless you really know what you're doing.

    Isn't this fun?

    Geek ya later!


    Geek-o-Terica 5: Taking out the Garbage (Java)

    Bob Balaban  May 4 2009 02:00:00 PM
    Greetings, Geeks!

    I posted a short article last week on garbage collection in LotusScript. This post is about a whole different set of issues you need to be aware of if you're writing Java code for Notes or Domino.

    Like LotusScript, the Java language creates objects with a built-in "new" operator, and, like LotusScript, sweeping up the no-longer-used memory ("garbage collection", or gc) is supposed to be automatic. But. The gc mechanism in Java is nothing like the one in LotusScript -- you have to do more work to avoid memory leaks, and you have to know more about how it all works. The basic saying people like to quote about gc in Java is: "You never have to worry about memory." I generally add to that: "Until you run out."

    There are two major differences in the way LotusScript and Java each handle allocated memory: the first has to do with how many kinds of memory there are, and the second has to do with when each language does gc.

    How many kinds of memory are involved with a Notes/Domino Java program? At least 2. First there's all the memory your Java program allocates directly in the Java Virtual Machine (JVM): space for your code, space for your objects, space for the JVM itself to use. This all comes from the Java "heap": a big pool of memory that the JVM gets from the operating system, and parcels out to your program, and any other Java programs that might be running at the time.

    The second set of memory that your Java program is going to consume comes from a completely different place: It comes from the "Notes runtime heap" -- a different big pool of memory that the Notes/Domino core (which, after all, has nothing to do with any JVMs) gets from the operating system. This pool is what the Notes back-end classes use, as well as the Notes core itself. So, for example, when you instantiate a lotus.domino.Session object (or Notes does it for you, if you're running an Agent), a number of things happen:

         - If the JVM isn't running yet, it starts up. A bunch of .jar files are pre-loaded, using up some JVM heap.
         - Notes (or Domino) creates an instance of the Session class for the agent to use. This uses a bit more memory from the JVM heap. The new Java Session object calls into the back-end classes DLL ("lsxbe") to initialize the corresponding C++ Session object. Every Notes Java object is linked to a corresponding lsxbe back-end C++ object.
         - The C++ object that gets created to go with the new Java object uses up some memory of its own. This memory, however, comes from the "other" heap -- what I called the "Notes runtime heap" above. ALL of the lsxbe objects that are linked to the Java objects you create and use in your Agent come from this other heap. Some of them (Documents, Databases, and others) consume Notes CAPI resources, such as NOTEHANDLEs and DBHANDLEs, which might represent lots and lots of other allocated memory in the Notes core (just as one example, an "open" Notes document that consumes 10MB of disk space might also consume 10MB of memory).

    So here's where it gets interesting: The JVM has a background thread that runs all the time (at lower priority than "normal" application threads), looking for objects which have been allocated out of the JVM heap and which are no longer used anywhere. When it finds such, it frees the memory used by those objects, and that memory in the JVM heap is then available for re-use. HOWEVER, there is no automatic mechanism by which the C++ objects associated with those Java objects can be notified to free up the memory THEY are consuming (which is often far larger). If nothing is done, all of that memory taken from the Notes runtime heap is "leaked": it never gets released.

    Thus, because there's no way for Notes to know for sure when a given Java object is being garbage-collected, we need another way to tell the back-end classes to clean up and take out the garbage. Unfortunately, the only way is to enlist the aid of the developer: you have to TELL lsxbe to clean up with the notorious recycle() call. What makes this unfortunate, from a product point of view, is that developers have to know to do this, and have to know how to do it correctly, so that they don't mess themselves up accidentally.

    So what, exactly, does recycle() do? First, it finds the link stored in the Java object to the corresponding C++ object. Then it invokes some code in lsxbe to destroy that C++ object. The "destructor" code in the back-end classes does a couple of things: it first finds and destroys any "owned" objects that it knows about. When you invoke recycle() on a lotus.domino.Database object, for example, any Document objects that were instantiated out of that database are also destroyed. The object's destructor also knows how to link back to the JVM and tell it to invalidate (and garbage-collect) the corresponding Java object. It also tells the Notes CAPI to release any temporary memory it has consumed, and then the memory taken up by the C++ object itself is reclaimed.
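
    Here is a rough model of that cascade in plain Java. ToyHandle is a hypothetical class, and the liveNativeObjects counter merely stands in for memory on the Notes runtime heap; none of this is the real lsxbe code. The sketch shows two things: dropping a Java reference does nothing to the "native" side, and recycling an owner also recycles everything it owns.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a Java wrapper bound to a native-side object; NOT the Notes API. */
class ToyHandle {
    static int liveNativeObjects = 0; // stands in for usage of the Notes runtime heap

    private final List<ToyHandle> owned = new ArrayList<>();
    private boolean recycled = false;

    /** Creating the Java object also "allocates" its native-side partner. */
    ToyHandle(ToyHandle owner) {
        liveNativeObjects++;
        if (owner != null) owner.owned.add(this);
    }

    /** Destroys owned objects first, then this object's own native-side memory. */
    void recycle() {
        if (recycled) return; // recycling twice is harmless in this model
        for (ToyHandle child : owned) child.recycle();
        owned.clear();
        recycled = true;
        liveNativeObjects--;
    }

    boolean isRecycled() {
        return recycled;
    }

    public static void main(String[] args) {
        ToyHandle database = new ToyHandle(null);
        ToyHandle doc1 = new ToyHandle(database);
        ToyHandle doc2 = new ToyHandle(database);
        System.out.println("native objects alive: " + liveNativeObjects);
        database.recycle(); // takes doc1 and doc2 with it
        System.out.println("native objects alive: " + liveNativeObjects);
        System.out.println("doc1 recycled? " + doc1.isRecycled() + ", doc2 recycled? " + doc2.isRecycled());
    }
}
```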

    Try this experiment: Find (or create) a Notes database that contains a view with 50,000 or so documents in it (the number of fields in the document is relatively unimportant for this purpose). Then write a Java agent that walks the view, and accesses each document (just use View.getFirstDocument/getNextDocument). Don't use recycle(). If your view is big enough, you might actually crash Notes this way, because every time you assign a new Document object to your local variable inside the iteration loop, the previous object referred to by that variable will be gc'ed, but the corresponding C++ document object will not.
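
    You can get a feel for the experiment without a 50,000-document database. The sketch below is a simulation in plain Java: ToyView and ToyDoc are invented stand-ins that mimic the getFirstDocument/getNextDocument shape, and the openDocs counter plays the part of C++ document objects on the Notes runtime heap. The loop uses a common idiom: fetch the next document first, then recycle the current one.

```java
/** Toy simulation of walking a view with and without recycle(); NOT the Notes API. */
class ToyView {
    static int openDocs = 0;     // native-side document objects currently allocated
    static int peakOpenDocs = 0; // the most that were ever allocated at one time

    static final class ToyDoc {
        final int index;

        ToyDoc(int index) {
            this.index = index;
            openDocs++;
            peakOpenDocs = Math.max(peakOpenDocs, openDocs);
        }

        void recycle() {
            openDocs--;
        }
    }

    private final int size;

    ToyView(int size) {
        this.size = size;
    }

    ToyDoc getFirstDocument() {
        return size > 0 ? new ToyDoc(0) : null;
    }

    ToyDoc getNextDocument(ToyDoc current) {
        int next = current.index + 1;
        return next < size ? new ToyDoc(next) : null;
    }

    /** Walks the whole view and returns the peak number of open native documents. */
    static int walk(ToyView view, boolean recycle) {
        openDocs = 0;
        peakOpenDocs = 0;
        ToyDoc doc = view.getFirstDocument();
        while (doc != null) {
            ToyDoc next = view.getNextDocument(doc); // fetch the next one BEFORE recycling
            if (recycle) doc.recycle();
            doc = next;
        }
        return peakOpenDocs;
    }

    public static void main(String[] args) {
        System.out.println("peak without recycle: " + walk(new ToyView(50000), false));
        System.out.println("peak with recycle:    " + walk(new ToyView(50000), true));
    }
}
```

    Without recycle() the peak in this model grows with the size of the view; with it, no more than two documents are ever open at once.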

    Of course, even when you leak megabytes of memory in this way, Notes (and the operating system) get it all back when the Agent terminates. Why? Because the CurrentDatabase and the Session objects that Notes created for your Agent to use get recycled automatically when the Agent is done, and therefore all other objects "owned" by that Session and Database (i.e., all of them) get recycled automatically. Of course, if you run out of memory before that point, you're screwed.

    So: recycle early, recycle often!

    Believe it or not, there's actually a bunch more to say on this topic. Look for my next blog post in a couple of days: "So Now it Gets Complicated: Java, Garbage-Collection, Notes, and Threads".

    Geek ya later.


      Cool new blog, AWESOME rocket launch video

      Bob Balaban  April 29 2009 06:44:50 PM
      New blog from Teamstudio:
      http://voices.teamstudio.com/voices/TSVoices.nsf/d6plinks/ROLR-7RKJSP

      I do not like plagiarism

      Bob Balaban  April 29 2009 08:17:02 AM
      Check out Kevin Pettitt's blog. If things are as he reports, someone needs some serious head-shaping.
      http://planetlotus.org/4b7f58

      Geek-o-Terica 4: AutoUpdate on View navigation

      Bob Balaban  April 29 2009 03:44:00 AM
      Greetings, Geeks!

      I was getting ready to write a new Geek-o-Terica post about how the NotesView.AutoUpdate property (which also exists in Java) can affect view navigation. But Kathy Brown already did, so I'm just going to link to her post.

      Geek ya later!


      Geek-o-Terica 3: Taking out the garbage (LotusScript)

      Bob Balaban  April 27 2009 04:56:24 AM
      Greetings Geeks!

      LotusScript is a cool language. One of the reasons it's cool is a feature called "automatic garbage collection". What does that mean? It means that you can litter your code landscape with snippets of memory to your heart's content, and never have to worry about keeping track or picking it up ("deallocating", or "freeing" the allocated memory chunks) later.

      It means you can code a loop like this (skipping all the DIMs, but you get the idea):

         set view = db.GetView("$all")
         set doc = view.GetFirstDocument()
         while not (doc is nothing)
              ' do something...
              set doc = view.GetNextDocument(doc)
         wend

      Every time you do "set doc = ", the Notes back-end classes infrastructure is generating a new NotesDocument object in memory. What happens to the object that used to be stored in the "doc" variable? LotusScript automatically keeps track of all references to objects, and when there are none (in this case, when you assign a new object reference to "doc", overwriting the old one), the memory used by that object is reclaimed.

      When does garbage collection happen? It happens in LotusScript after the execution of every statement (which can lead to some interesting side-effects, as we'll see in a minute). It wasn't always like that, though. In Notes V4.0 (the first appearance of LotusScript in the product), garbage collection only happened at the end of each Sub or Function. But that wasn't good enough, because, as in the above snippet, if the View had, say, 100,000 documents in it, you'd run out of memory before you got to the end of the function.

      But, as with all great boons to humanity, automatic garbage collection in LotusScript has its dark side. Consider this:

         dim s as new NotesSession
         dim entry as NotesACLEntry
         set entry = s.CurrentDatabase.ACL.GetFirstEntry

      Seems innocuous, right? But you'll never get an object in "entry". Why? Because there's another rule I haven't reminded you of yet (you probably already knew it, but you didn't realize it could bite you like this in some cases). The rule is that when an object is garbage-collected, all of its child objects are also garbage-collected.  That might seem needlessly destructive, but it's actually necessary. Notes does not allow free-floating objects that exist outside their container, or "owner" object. Documents, for example, must belong to a database. Items must belong to a document, and ACLEntries must belong to an ACL object.

      So, what really happens in the above snippet? The expression "s.CurrentDatabase.ACL.GetFirstEntry" is perfectly legal in an agent, where the NotesSession will always exist, as will the CurrentDatabase. And we know that every Notes database has an ACL object with at least one ACLEntry in it. So the expression will always evaluate and create an ACLEntry object, which then gets assigned to the "entry" variable.

      But then, once that all happens, garbage collection kicks in. It says, "What can I sweep up that is no longer referenced?" Session and CurrentDatabase are special, they have to stay around until the agent is done. But, AHA! The ACL object isn't used anymore after this statement is done (the object reference is not saved in any local variables). So it gets destroyed. And, invoking the rule above, all of its instantiated children also get destroyed, including the ACLEntry instance that we just saved in "entry". All before the next statement is executed. Easy come, easy go.

      Everything would have been fine if we had coded the above thusly:

         dim s as new NotesSession
         dim entry as NotesACLEntry
         dim acl as NotesACL
         set acl = s.CurrentDatabase.ACL
         set entry = acl.GetFirstEntry

      This way, the NotesACL object hangs around until the Sub exits, and so our ACLEntry object is also preserved.

      Now consider this bit of logic:

         set item = doc.items(0)

      Do we have a similar problem here? Will "item" always be Nothing? Actually, no, this works. It's true that the array of NotesItem objects created by the "doc.items" property is garbage-collected, but the single NotesItem object we saved in "item" is fine. Why? Because the item is "owned" by the NotesDocument object, not by the array. The array disappears, but the document doesn't.

      Of course you would be making a mistake to code a loop such as this:

         for i = 0 to ubound(doc.items)
            set item = doc.items(i)
            ' do something....
         next i

      This is horribly inefficient, because every invocation of "doc.items" above will create an array, and then populate that array with a set of NotesItem objects. Then at the end of each statement, the array will be garbage-collected, along with all but one of the NotesItem objects. That's a lot of work to do over and over. Much better to access the array one time, save it in a Variant variable, and then iterate over the array.
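
      A minimal sketch of that better version, assuming "doc" already holds a valid NotesDocument (the variable names "allItems" and "i" are made up for illustration):

      ```lotusscript
      Dim allItems As Variant
      Dim item As NotesItem
      Dim i As Integer

      ' Evaluate doc.Items exactly once and hold on to the resulting array
      allItems = doc.Items
      For i = 0 To Ubound(allItems)
         Set item = allItems(i)
         ' do something....
      Next i
      ```

      Because "allItems" keeps a reference to the array for the whole loop, the array and its NotesItem objects are built once instead of on every pass.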

      Next time: How garbage collection in Java is way different.
      Geek ya later!

      Geek-o-Terica 2: Using parentheses in LotusScript

      Bob Balaban  April 21 2009 03:49:49 AM
      Greetings, Geeks!

      Parentheses in LotusScript can be confusing. For example, check out this seemingly innocuous statement:

         MyFunc(arg1, arg2)

      Easy, right? Well, it's fine if "MyFunc" is a Function, but it will cause a compiler error if "MyFunc" is a Sub. Want to know why?

      The difference between a Sub and a Function (you already know this) is that a Function returns a value, and can therefore be used on the right-hand side of an assignment statement ("="), while a Sub does not return a value.

      So think about this from the compiler's point of view. An invocation of a Sub does not need parentheses around the parameter list being passed to the Sub, syntactically. Because the statement that invokes the Sub stands alone (no value is returned, so you can't assign the result of the invocation to something else, or use it in a more complex expression), there's no ambiguity:

         MyFunc "arg1", "arg2"

      works fine, I (the compiler) know just what to do. Likewise, if MyFunc is an actual Function that returns something, but I (the programmer) don't care, I can still code the above statement and the compiler knows what to do. BUT, if I use MyFunc's return value in some way -- assign it to a variable, or use it to invoke another Sub or Function -- I MUST enclose the parameter list in parentheses:

         stvar = MyFunc("arg1", "arg2")
         MySub MyFunc("arg1", "arg2"), 7

      You can see that if the language allowed me to omit the parens, line 1 might be ok, but line 2 would be impossible to parse.

      So, why is it "wrong" (compiler error) to code my original example above if "MyFunc" is a Sub? BECAUSE enclosing a parameter in parentheses during a Sub invocation has ANOTHER MEANING! It means "pass this parameter by value, not by reference".

      What's the difference, and why would anyone care? Consider this sample agent:

      option declare     ' ALWAYS use Option Declare!
      Sub Initialize
            Dim st As String
            st = "set in Initialize"
            subr st
            Msgbox "String is: " & st
            st = "reset in initialize"
            subr (st)
            Msgbox "Now string is: " & st
      End Sub
      Sub subr(arg As String)
            arg = "set in subr"
      End Sub

      The result of the first Msgbox will be: "set in subr". The result of the second one will be: "reset in initialize". Why?

      Because the first call to "subr" passes the "st" parameter as a reference -- a pointer to the variable's memory location. So when the Sub "subr" modifies the parameter, that memory location gets updated with the new string, and Initialize "sees" the change. In the second call, the parens around "st" tell the compiler to "pass by value": the calling code does not pass a reference to "st" to the Sub, it passes a COPY of the VALUE of "st" (the string "reset in initialize"). The Sub modifies the copy's memory location, but Initialize does not "see" that after the call.

      You can accomplish the same thing as using pass-by-value parens by changing the declaration of "subr" to:

         Sub subr(ByVal arg As String)
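
      As a sketch, here is the sample agent again, trimmed down and using that ByVal declaration. Nothing here is new except the ByVal keyword; the point is that even the call WITHOUT parens now hands the Sub a copy:

      ```lotusscript
      Option Declare
      Sub Initialize
            Dim st As String
            st = "set in Initialize"
            subr st      ' no parens, but the declaration forces pass-by-value
            ' st should still hold "set in Initialize" at this point
            Msgbox "String is: " & st
      End Sub
      Sub subr(ByVal arg As String)
            ' This only changes the Sub's private copy
            arg = "set in subr"
      End Sub
      ```

      The design trade-off: parens at the call site let the CALLER decide, one call at a time, while ByVal in the declaration lets the Sub's author decide once for every caller.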

      So, now you can see why

         MyFunc(arg1, arg2)

      is weird when MyFunc is a Sub: pass-by-value around TWO parameters like that makes no sense, syntactically, so the compiler throws an error.
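
      If you really want parentheses around a Sub's argument list, the Call keyword allows them. A small sketch, where "MySub" is a hypothetical two-parameter Sub invented for this example:

      ```lotusscript
      Option Declare
      Sub Initialize
            Dim a As String, b As String
            a = "one"
            b = "two"
            MySub a, b           ' no parens: fine for a Sub
            Call MySub(a, b)     ' Call plus parens: also fine
            ' MySub(a, b)        ' the form discussed above: compiler error for a Sub
      End Sub
      Sub MySub(arg1 As String, arg2 As String)
            Msgbox arg1 & ", " & arg2
      End Sub
      ```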

      Get it? Yes, it's a little odd, but don't blame LotusScript -- this little feature was copied from BASIC, as was most of the LotusScript language (BASIC as it was back in 1990, that is).

      Geek ya later.
      (Topic inspired by Kathy Brown. Thanks Kathy!)

      Introducing "Geek-o-terica"

      Bob Balaban  April 18 2009 12:17:05 PM
      Greetings, Geeks!

      I was getting bored with stuff, so I decided to make up a new term: Geek-o-terica. "Geek" because this is kind of a blog for geeks, and I've always introduced postings by shouting that out.

      The "o-terica" part is there because it is reminiscent of (is actually a non-null substring of, for you Geeks out there) the word "esoterica", which basically (pun intended) means "things understood by or meant for a select few" (NOT the face cream!). It also means "recondite", but if I said that, you'd have to go look that up too. This is also cool, because the origin of the word "esoteric" is Greek, which, if you remove the "r" is.... ok, never mind.

      So. I thought I'd start a series (meaning, when I feel like it) of blog posts on specific Notes/Domino/Whatever technical topics. LotusScript or @Function or CAPI snippets or tips, perhaps other topics which interest me, and which I therefore assume have at least a small chance of interesting someone else. This will be different from SNTT (Show 'N' Tell Thursdays), because I'm not going to force myself to offer complete code samples, and I'm certainly not going to be constrained by what day of the week it happens to be when I get inspired to post.

      I've started a list of future Geek-o-terica topics, please post suggestions for topics you'd like to see.

      For my kickoff Geek-o-teric offering, here's a little something I like to call, "Multi-Language Character Sets, What They Are, and How to Use Them" (MLCS-WTAAHTUT). This isn't new material, in fact I first wrote it over 2 or 3 days in a hotel room in London in 2001. It was originally a 2-hour Freelance presentation (remember Freelance?) for a customer who wanted some technical background on character sets, LMBCS, Unicode, and fonts. I used Symphony to convert it to PDF format, and here it is. Sorry for the slidemaster...

      Ever wondered what the differences are between LMBCS and Unicode? Hungering to know the real impact of coding strings in utf-8 vs. utf-32? Didn't know there was such a thing as utf-32? Always had a secret itch because you weren't clear on the precise distinction between "character" and "glyph"? Mystified by ligatures? Never knew for sure why programmers groan when confronted with EBCDIC?

      Well then, you have most definitely come to the right place, my fellow Geeks! Read on, and enjoy!


      Geek ya next time.

      Come Visit at Lotus Admin/Developer ’09, in Boston

      Bob Balaban  April 15 2009 02:07:20 AM
      Greetings, Geeks!

      Consider this my personal invitation to you to come visit The View's Admin/Developer09 conference, in Boston (at the Sheraton Hotel, as always).

      Today was Jumpstart and booth set-up day. I'm there working for one of my customers, BCC. I will be working in the BCC booth (come visit!) Wednesday afternoon and all day Thursday, as well as helping to present a vendor session Thursday afternoon on BCC's premier product, Client Genie. Essentially, Client Genie is a Notes Client management tool that takes over where policies peter out. There are other cool products on display too, so come on by.

      The Wednesday morning keynote (8:30am) will be presented by Lotus General Manager Bob Picciano, should be a good session. My understanding is that you can catch a live-blog of it here.

      I was at the conference site for a couple of hours this afternoon, helping with booth setup. Not too much activity, as the jumpstarts were still going on. Because I live only 15 miles from the conference site, it looks like I'll be commuting from home each day.

      Hope to see you there!

      Relief for the COBRA bite

      Bob Balaban  April 10 2009 03:45:57 PM
      Greetings, Geeks!

      I got some VERY good news yesterday, and for those of you in the U.S. who are on COBRA health plans, it may apply to you as well, so I thought I'd share. (Non-U.S.ians or those who are on employer-sponsored health plans probably won't care, go do something useful.)

      As you may have read in these pages, I got laid off from my job back in January. Given that unemployment meant that my previously employer-sponsored health insurance was terminated, I needed to do something. So I converted the previous plan to a "COBRA" plan. That means that I was able to continue the health insurance coverage for me and for my family by paying the full amount of the monthly premium. This "conversion" ability is mandated by US Federal law. The idea is that you (the newly unemployed worker) pick up the full cost of the insurance (your former employer pays nothing), but the cost is still calculated at the group rate you were getting before, so (in theory...) the premiums are still lower than they would be if you had to go get your own policy.

      So, that's a nice thing, especially during an economic melt-down, when it's not so easy to get a new job. Unfortunately, health insurance is still incredibly expensive, even at group rates. My monthly bill for a family plan is approximately $1500. Of that, maybe $150 or so is dental coverage, the rest is "normal" health coverage. Naturally, there's still deductibles and co-pays and all the rest, just like before.

      That's a LOT of money to come up with every month when you're unemployed. Luckily, President Obama and the Democratic Congress (I say that not only because the Democratic Party has a majority in both houses of Congress now, but because almost all of the Republicans voted against the bill I'm talking about) added a provision to the recently passed "Stimulus Package" that subsidizes COBRA payments.

      I heard something about this on the radio, and phoned my former employer's benefits department. DO NOT RELY on what I am about to tell you here: go check FOR YOURSELF!!

      Here's a summary of what they told me:

           Eligibility - you can get the subsidy if you were involuntarily terminated from your job, and are currently paying for a COBRA health plan. If you quit, you're not eligible.

           When does it start? - The subsidies are retroactive to March 1, 2009.

           What do I get? - The Government will pay 65% of your COBRA payments, you continue to pay 35%.

           How does it work? - Your insurance company will be in touch with you later this month. As you can imagine, they are scrambling to catch up with all this. Your payments (supposedly starting in May, assuming they get you the info in time) will be reduced to the new, lower amount. The retroactive amount (the "overpayment" you made for March and April, if any) is applied to future payments.

           How long does the subsidy continue? - I didn't think to ask that, though I'm sure they'll let me know. Anyone else know the answer? Please post it!

      This is a GREAT deal and, if it works out as expected, will save me a lot of money each month.

      Remember - Check with your own COBRA provider to find out exactly how this applies to you, don't rely on what I was told.

      Anyone have more detail or corrections? Please post 'em!

      Personal comment: DAMN I'm GLAD I voted for that guy!!

      Huzzah! New LSX Toolkit is posted (Part 1)!!

      Bob Balaban  March 31 2009 01:13:39 AM
      Greetings, Geeks!

      I am pleased to report that a very long and sorry saga, lasting more than 6 years, has finally come to a close! I am, of course, referring to the LSX Toolkit. I got my new version of it today, the previous release having come out in 2001. That's a long time to wait, but Friends! The waiting is Over!

      Few living now remember (or care) what the brouhaha (brew? ha ha!) was all about, but I will give you a radically shortened history.

      What is the LSX Toolkit anyway?


      LSX (originally "LotusScript Extension") is an architecture that was invented in the course of developing the LotusScript "back-end classes" for Notes version 4.0, back in 1994 or so. We (primarily me) were developing the original set of back-end classes (meaning, they would run on either the client or the server) using what we'd now call "extension points" in the LotusScript interpreter code.

      English translation:
      A developer could create a bunch of C++ classes, put them in a DLL, and have the "host product" (Notes in the initial case) load the DLL. The host, as part of the load-and-register protocol, would provide to the DLL code a specially crafted object instance of the LotusScript interpreter. This object has methods that allow "foreign" (or "extension") code to "register" the classes and methods and properties and errors that it implements with LotusScript. From the LotusScript developer's point of view, suddenly a whole bunch of new classes were available to be used in the LS editor and runtime environments.

      The Notes back-end classes were the first ever LSX. The LSX-i-ness of that DLL (nlsxbe.dll on Windows, other name prefixes and suffixes on other OS platforms) resided in the fact that without that DLL, Notes had no "native" LotusScript classes at all (beyond the ones built to implement the editor and debugger). The DLL is hard-coded to load automatically when Notes (and now Domino) starts up. Other LSXs can be dynamically loaded when needed with a "USELSX" type statement in a LS program.

      This had two very interesting and immediate impacts on the thinking of how LotusScript could be used:
          1. Since, at that time (1995) Lotus had several products which embedded LotusScript (remember SmartSuite? 1-2-3? Freelance? Wordpro? Approach?), these "desktop products" could (easily, it turned out) be made to ALSO use the Notes LSX! You could run a script in the spreadsheet that could (behind the scenes, as these were the "back-end" classes) have 1-2-3 talk directly to the Notes objects. Cool!

          2. Anyone could write ANY DLL that did, well, ANYTHING, and make it a "plug-in" to either the client or the server, exposing entirely new (and not necessarily strictly Notes-related) functionality.

      And both of these things were done. SmartSuite products learned to be able to load the Notes LSX. People actually built apps using it. More people created new LSXs to be used with Notes/Domino to do different things: connectors to outside data sources (eventually rationalized and collected into the Lotus Connector LSX, a single LotusScript API to talk to a variety of outside (non-Notes) relational databases). Still in use today!!
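
      For the curious, here is a rough sketch of what dynamically loading an LSX looks like from the script side, using the Lotus Connector LSX as the example. It assumes the Connector LSX is installed and registered under the name "lsxlc" on your machine; the variable name "lcSession" is made up:

      ```lotusscript
      Option Public
      Option Declare
      ' Load the Lotus Connector LSX; this statement goes in the module's (Options) section
      UseLSX "*lsxlc"

      Sub Initialize
            ' LCSession is one of the classes that LSX registers with LotusScript
            Dim lcSession As New LCSession
            ' ... connect to an outside data source from here ...
      End Sub
      ```

      Once the UseLSX statement has run, the classes the DLL registered can be used like any built-in class.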

      One major problem remained: it was HARD to configure an LSX just so, so that your classes and methods and constants and errors would all get registered properly. And it was HARD to set up the code so that when the LS Interpreter needed to invoke a function whose code resided in an LSX, it would happen correctly, passing through the right data structures and so on. The first few LSXs that were built were all hand-crafted, with a lot of help from those few who'd done it before.

      So the cry went out across the Lotus Rogers St. building in Cambridge, Massachusetts: "We need a TOOLKIT for this bugger! It needs to be easier! And it CAN be made easier!"

      Fortunately, management at the time were very responsive and supportive. A team was created at Lotus, and they architected and built the first edition of the LSX Toolkit. It was AWESOME. A thing of beauty. You operated it by creating documents in a Notes database, describing what you wanted (classes, methods, etc). Then you pushed a button, and it generated a crapload of C++ code that created the "skeleton" of your implementation (of course, the "Wizard" as it was called knew nothing about your business logic, but it COULD create all the code you needed to build, register with LotusScript and accept callbacks for evaluation at runtime). You would then go and "flesh out the skeleton" to add the business logic.

      I think that first version came out in 1998 or 1999, only 4 years after Notes v4.0 had shipped.

      I wrote an LSX to sell as a product at my then new company (I had departed Iris Associates in 1997 to create Looseleaf Software). I had decided that it might make me some money (and be fun) to write an LSX that expanded the reach of the Lotus-owned back-end classes across more of the CAPI functionality than was currently available to LotusScript. Is there a Notes CAPI function you want that you can't (easily) access from LotusScript? Tell me, and I'll "wrap" it in a class in my "ScriptExpander" LSX, and you can buy it from me.

      I found lots, and people suggested others. I sold a bunch of copies over a few years. In fact: Here's what I'm gonna do. EVERYONE reading this blog post can HAVE (for free!) my Notes database that contains the entire product, except for the DLL itself. You can read the documentation, take a look at all the sample agents that show you how each and every one of the 100 or so LSX functions worked. Go on, take it! I said it was free (about 1.5mb zip file containing 1 NSF):
      lsxdoc.zip

      Much cool stuff in there. You want to set the replicaid of a database from LS? It's in there. You want to create a replica stub on another server (not a full replica, the way Notes -- at the time -- made you do it)? It's in there. Need text fields up to 2gb in size? It's there.

      Other people wrote LSXs too. The most famous is probably Ben Langhinrichs' MIDAS rich text conversion classes (http://www.geniisoft.com). Bill Buchan (http://www.hadsl.com/) wrote one for his FIRM product.

      Life was good (or at least, fun). The toolkit supported all of the client and server platforms for Notes and Domino. An updated kit came out in 2001. Some of the code that glues your LSX stuff to the LotusScript stuff and brings them together is actually delivered in the kit in source code form. I found a couple of bugs there (Reported them, of course!), but I was able to just fix them locally, 'cause I had the code. Life went on.

      As a few of us used LSXs more and more frequently, we noticed some, er, glitches that made things, um, inconvenient. Some could be worked around by tweaking the skeleton code after it got generated; for other stuff (crashes in the toolkit wizard, for example), we had no source code. A list of complaints began to accumulate, but Lotus was, um, slllloooooowwww to acknowledge that fixing these was a priority. The group that had been maintaining the kit was disbanded (ALWAYS a bad sign).

      Yet, I (and others) had real products out in the market depending on this technology. What could we do? It was frustrating. So. I think it was 2003, or maybe it was 2004 (I admit to being hazy about this, heck it might have been 2002) that I showed up at the "Meet the Developers" Session at Lotusphere (I think now it's called "Ask the Developers", we just always called it "Beat the Developers"). I happened to be invited to ask the first question. I mentioned that I was having a bunch of problems with the lsx toolkit, and said that, if Lotus wasn't really staffed to fix the problems, I would be happy to donate some time for free to sit in the Westford lab (where I used to work) and do it for them.

      At first I thought something might actually happen. One Vice President asked for my bizcard, and said he'd call me to set it up. He never did. The following year, I was again able to position myself to be the first one called on at "Beat the Developers" at Lotusphere. I repeated my offer. And again the following year! This continued right up to Lotusphere 2006, the first one at which I was again an IBM employee. This time, a few other people asked if, now that Bob was back at IBM, would the kit get fixed. Damn thing was turning into a joke, and all I wanted was for it to be fixed.

      I was given permission at Lotusphere 2007 to pull a project together to begin the work. Of course things had deteriorated. Whereas the 2001 version of the kit built fine (on Windows) with Visual Studio 2003, it did not work with the 2005 version. There was an extensive list of outright bugs and a fairly long list of important enhancements. To be honest, I didn't spend a lot of time on this project. But I did convince a couple of other developers to do most of the work. We engaged with the release management team to get through the paperwork and legal review. At Lotusphere 2008 "Beat the Developers", I "announced" that we were finally going to ship it within a few months.

      Then, in February of 2008, I left IBM to join a Lotus Business Partner company. LSX pretty much fell off my radar. But I did notice, around early January of 2009, that the updated kit had still not shipped. EIGHT YEARS, and they were still sitting on it. Once again, for the 5th or 6th (maybe) time, I complained at "Beat the Developers, Lotusphere 2009 Edition". Brent Peters made a joke at my expense, implying that the delay had been caused by the poor quality of my code. I guess he didn't realize that virtually none of the code in the LSX Toolkit was mine.

      But! Brent also promised that the kit would be released by Feb. 1. Whoo hoo! Light-at-end-of-tunnel! AND, light-is-not-train-coming-my-way! I forgave him for the joke.

      After Lotusphere, Feb. 1 came and went. Brent blogged that the date had slipped to Feb. 15, which also came and went. Buchan blogged about the delays in a way designed (successfully) to embarrass IBM. Brent stepped up in a posted reply and said he agreed, enough was enough, he was going to make sure it got out. Soon.

      Sure enough! I read on Brent's blog last week (http://www.LotusStaffNotes.com/LotusStaffNotes/LSNblog.nsf/) that it was REALLY READY!!! (And! he dedicated it to me. How sweet!).

      BUT! I couldn't find it on the old toolkit site (http://www.notes.net). Only the old one was there. But TODAY! I got the link to the page where it REALLY lives:
      http://www14.software.ibm.com/webapp/download/preconfig.jsp?id=2006-09-14+09%3A05%3A37.162620R&S_TACT=104CBW71&S_CMP=

      Note: This is the windows-only version, unix version to come....soon. You have to register to get the download, but it's free.

      Phew. I'm exhausted. Eight years! And now it's here. I'm thinking I may just begin a project to revive and extend ScriptExpander with this baby, see how that goes. You can assume I will be blogging about that experience.

      But for now, MANY thanks to:
           Brent Peters, George Langlais, Scott Morris, Willie Arbuckle, Roberto Olivares, John Beck
      and everyone else who helped get the darn thing FINALLY out the door.

      The saga continues! This has been but the end of the beginning!