Bob Balaban's Blog

     

     

    Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Bob Balaban  March 21 2011 05:00:00 AM
    Greetings, Geeks!

    MIME is a data format that has become central to transmission of email over the Internet. The nice thing about it is that everyone uses it for mail interchange, and that it's standard. The Domino server converts incoming MIME-formatted messages into Notes documents, and outgoing Notes email documents into MIME format. As of (I think) Notes v6, you can specify that you want incoming MIME messages to remain in their native format within the NSF database. In these cases, the Notes client will automatically convert any document that resides in MIME format to "regular" Notes rich text document format when you open it in the client UI.

    What does MIME format look like? It can get complicated, but the easiest way to think of it is as a sectioned box, with each compartment within the box holding data in a specified format. So, plain text is just a plain-text section. "Rich text" is usually represented as HTML, with embedded "pointers" to other sections containing embedded objects, such as images or attachments. Other data, such as file attachments, reside in their own sections, with their own "headers" specifying size, encoding (e.g., base-64) and format (e.g., JPEG or GIF). The different sections in the MIME file are separated by "boundaries", essentially a unique string. There's much more to it, of course, but these are the basics.
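    To make the "sectioned box" picture concrete, here is a minimal sketch in plain Java (no Notes classes involved) that assembles a two-part message by hand: a plain-text section and a base-64 attachment section, separated by a boundary string. The class name MultipartSketch, the build() helper, and the boundary value are made up for illustration, and the sketch assumes ASCII-only text and simple file names.

```java
import java.util.Base64;

// Hypothetical helper for illustration only; not part of the Notes/Domino API.
final class MultipartSketch {
    static final String CRLF = "\r\n"; // MIME lines end with CR LF

    // Builds a multipart/mixed message with one text part and one attachment part.
    static String build(String boundary, String text, byte[] attachment, String fileName) {
        StringBuilder sb = new StringBuilder();

        // Overall message headers, then a blank line
        sb.append("MIME-Version: 1.0").append(CRLF);
        sb.append("Content-Type: multipart/mixed; boundary=\"").append(boundary).append("\"").append(CRLF);
        sb.append(CRLF);

        // Section 1: plain text. Each section starts with "--" + boundary.
        sb.append("--").append(boundary).append(CRLF);
        sb.append("Content-Type: text/plain; charset=US-ASCII").append(CRLF);
        sb.append(CRLF);
        sb.append(text).append(CRLF);

        // Section 2: attachment, with its own headers naming format and encoding
        sb.append("--").append(boundary).append(CRLF);
        sb.append("Content-Type: application/octet-stream; name=\"").append(fileName).append("\"").append(CRLF);
        sb.append("Content-Transfer-Encoding: base64").append(CRLF);
        sb.append("Content-Disposition: attachment; filename=\"").append(fileName).append("\"").append(CRLF);
        sb.append(CRLF);
        sb.append(Base64.getMimeEncoder().encodeToString(attachment)).append(CRLF);

        // Closing delimiter: "--" + boundary + "--"
        sb.append("--").append(boundary).append("--").append(CRLF);
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {0, 1, 2, (byte) 0xFF};
        System.out.print(build("=_sketch_boundary_001", "Greetings, Geeks!", data, "data.bin"));
    }
}
```

    Running main() prints the headers, the two sections, and the closing delimiter, which is roughly the shape of the .eml files discussed in this post (real messages carry many more headers).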

    What if you want to write script (LotusScript, Java, other) to programmatically convert Notes documents to MIME format? If you're using a recent (v8.5x) version of Notes or Domino, then most of the work is done for you with new methods in the back-end classes. This functionality is, of course, based on entry points in the Notes C API. If you're not using LotusScript or Java, you can accomplish MIME translation with a C or C++ program using these entry points, or, even easier, use the COM classes. I'll discuss how to do MIME conversion using the C and COM APIs in Part Deux of this post.

    Here's a basic Java program showing the essential techniques for conversion. I'm leaving out all the surrounding code for acquiring Document objects, preparing FileOutputStreams, and so on. There is only one slightly tricky thing about this program, which we'll get to after this first section.

    Session session = NotesFactory.createSession();
    // turn off automatic mime conversion on document open
    // if doc is already in MIME, leave it so
    session.setConvertMIME(false);
    Document doc = .... // get document somewhere
    // kill any $KeepPrivate items
    doc.removeItem("$KeepPrivate");
    doc.convertToMIME(lotus.domino.Document.CVT_RT_TO_HTML, 0);   // note: Designer doc has wrong spelling
    WriteOutputMIME(doc, outDir);   // outDir: a java.io.File for the output directory, set up elsewhere

    So far, so good. We suppress automatic conversion on document open, to save work (if the document is already in MIME format, we can skip the convert step and just write it out). If the "$KeepPrivate" item is present in the document, conversion will fail, so we remove that. Then all we do is call the Document.convertToMIME() method, specifying that we want rich text converted to HTML.

    After the convert call, the document in memory is now a sequence of items containing MIME headers, and a (possibly multi-part) body representing the rich text body of the original document, plus any attachments it contains. We can (almost) proceed to iterate over these items and write them out to disk (or wherever).

    I say "almost", though, because there's a glitch in the conversion code deep underneath the C API layer: it does not automatically convert attachment contents to a base-64 encoding (even though the API documentation says it will). It leaves them in binary format, which cannot be written out as text. So, we have to look for those items, and force them to be converted to base-64 text. This next section of Java code for the WriteOutputMIME() function shows how to do that. Again, I've pared the code down to the essential bits:

          private void WriteOutputMIME(Document doc, File outDir)
          throws Exception
    {
          File outFile = null;
          MIMEEntity mE = null;
          MIMEEntity mChild = null;
          String contenttype = null;
          String headers = null;
          String content = null;
          String preamble = null;
          int encoding;
          FileWriter output = null;
          String noteid = doc.getNoteID();
          int index;
         
          // access document as mime parts
          mE = doc.getMIMEEntity("Body");
          outFile = new File(outDir, noteid + ".eml");
          output = new FileWriter(outFile);
         
          try {
                  contenttype = mE.getContentType();
                  headers = mE.getHeaders();
                  encoding = mE.getEncoding();
                 
                  // message envelope. If no MIME-version header, add one
                  index = headers.indexOf("MIME-Version:");
                  if (index < 0)
                          output.write("MIME-Version: 1.0\n");
                  output.write(headers);
                 
                  // for multipart, usually no main-msg content
                  content = mE.getContentAsText();
                  if (content != null && content.trim().length() > 0)
                          {
                          output.write(content);
                          output.write("\n");
                          }

              // For multipart, examine each child entity,
              // re-code to base64 if necessary                
                  if (contenttype.startsWith("multipart"))
                          {
                          preamble = mE.getPreamble();
                          mChild = mE.getFirstChildEntity();
                          while (mChild != null)
                                  {
                                  headers = mChild.getHeaders();
                                  encoding = mChild.getEncoding();
                                 
                                  // convert binary parts to base-64
                                  if (encoding == MIMEEntity.ENC_IDENTITY_BINARY)
                                          {
                                          mChild.encodeContent(MIMEEntity.ENC_BASE64);
                                          headers = mChild.getHeaders(); // get again, because changed
                                          }
                                 
                                  preamble = mChild.getPreamble();
                                  content = mChild.getBoundaryStart();
                                  output.write(content);
                                  if (!content.endsWith("\n"))
                                          output.write("\n");
                                  output.write(headers);
                                  output.write("\n");
                                 
                                  content = mChild.getContentAsText();
                                  if (content != null && content.length() > 0)
                                          output.write(content);
                                  output.write(mChild.getBoundaryEnd());
                                 
                                  mChild = mChild.getNextSibling();
                                  } // end while
                          } // end multipart
                 
                  // end of main envelope
                  output.write(mE.getBoundaryEnd());
                  }
          finally {
                          if (output != null)
                                  output.close();
                          }
         
    } // end WriteOutputMIME

    So, a little tricky, but not too bad. You have to get the boundaries right, as well as the line breaks. Otherwise, it's really just copying stuff out to disk. Remember that the message has some overall headers (mE.getHeaders()), and each child entity has its own header section as well, describing what's in that chunk of data. When we re-code an entity from binary format to base-64 format, we need to re-read the entity headers, because they'll reflect that change.
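    As a side illustration of why the binary parts get re-coded, here is a small sketch using only the standard java.util.Base64 class (not the Notes MIMEEntity.encodeContent() call used above). It shows that raw bytes pushed through a String do not survive, while a base-64 round trip does. The class and method names (Base64RoundTrip, toMimeBase64, fromMimeBase64, throughText) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical illustration; the blog's code relies on MIMEEntity.encodeContent() instead.
final class Base64RoundTrip {

    // Encode raw bytes as MIME-style base-64 text (lines of at most 76 characters, CRLF-separated).
    static String toMimeBase64(byte[] raw) {
        return Base64.getMimeEncoder().encodeToString(raw);
    }

    // Decode MIME-style base-64 text back to bytes; line breaks are ignored.
    static byte[] fromMimeBase64(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    // The lossy path: treat arbitrary bytes as UTF-8 text, then turn the text back into bytes.
    static byte[] throughText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First bytes of a PNG file: 0x89 is not valid UTF-8 on its own
        byte[] raw = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A};

        boolean textIntact = Arrays.equals(raw, throughText(raw));
        boolean base64Intact = Arrays.equals(raw, fromMimeBase64(toMimeBase64(raw)));

        System.out.println("bytes intact after String round trip:  " + textIntact);
        System.out.println("bytes intact after base-64 round trip: " + base64Intact);
    }
}
```

    The same reasoning applies when a base-64 part is decoded again on the reading side: keep the decoded content as bytes rather than passing it through a String.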

    A final comment about HTML conversion: it theoretically existed back in R5 (you know what they say about "in theory"...), but it didn't start working well for real until v7.03. And it has been improving ever since, so the later the version of the product you have, the better off you'll be.

    In my next blog post (part deux), I'll show you how to adapt this basic code for the Notes COM classes, where (for some stupid reason) the Document.convertToMIME() function does not exist. We will not be thwarted! I'll show you how to use the Notes C API from a C-sharp program to do the MIME conversion.

    Happy coding! Geek ya later!

    (Need expert application development architecture/coding help?  Want me to help you invent directory services based on RDBMS? Need some Cloud-fu or some web services? Contact me at: bbalaban, gmail.com)
    Follow me on Twitter @LooseleafLLC
    This article ©Copyright 2011 by Looseleaf Software LLC, all rights reserved. You may link to this page, but may not copy without prior approval.


    Comments

    1Jim Knight  3/21/2011 8:08:05 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    thanks Bob, good timing on this! i just happened to need this functionality. the agent i was using previously wasn't cutting it.

    2Jim Knight  3/21/2011 8:27:47 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    i did hit one error on a 20mb file attachment pptx type.

    Exception in thread "AgentThread: JavaAgent" java.lang.OutOfMemoryError

    at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:155)

    at java.io.OutputStreamWriter.write(OutputStreamWriter.java:244)

    at java.io.Writer.write(Writer.java:152)

    at JavaAgent.WriteOutputMIME(Unknown Source)

    at JavaAgent.NotesMain(Unknown Source)

    at lotus.domino.AgentBase.runNotes(Unknown Source)

    at lotus.domino.NotesThread.run(Unknown Source)

    3Scott Leis  4/4/2011 9:12:43 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    WriteOutputMIME seems to be omitting some document content from the output. I'm testing it on emails in Notes 8.5.2.

    After calling Document.convertToMIME, a document has multiple Body fields of type "MIME Part". Some have a content-type starting with "multipart", there are two of type "image/gif", one of type "text/plain", and one of type "text/html".

    The output file shows one "multipart/alternative" item, and both "image/gif" items. All other MIME items are absent.

    I suspect the cause is that the "text" items are not counted as children of the first entity, and it would be necessary to loop through all items on the document and call getMIMEEntity() on each item to include the text.

    4Cain Wong  6/10/2011 8:37:34 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Thanks for this post Bob! It really gave me a head start on my current project.

    One issue I've run into however:

    encodeContent(MIMEEntity.ENC_BASE64) produces junk.

    The resulting content looks a lot like Base64, but it is not. In fact, it's more like "Base62" in that there are alphanumeric characters, but none of the "+" and "/" characters.

    Strangely, calling doc.generateXML() produces DXL with file attachments properly Base64 encoded.

    Has anyone else run into this problem?

    As it stands, I'm looking at needing to extract the attachments to disk (since Lotus doesn't provide access to the binary data otherwise... grrr) and then using apache commons for Base 64 encoding.

    5Cain Wong  6/10/2011 11:48:38 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Please disregard my last post. The conversion issue that I was having had to do with how I was converting String to bytes or vice-versa. :(

    The encodeContent function seems to be working correctly.

    6Bob Balaban  6/10/2011 4:31:54 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Glad you figured it out, thanks for posting!

    7Joseph Cameron  9/29/2011 10:13:32 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    We are in the process of moving to a paperless process and things are working well except exporting an email to a file that looks like the email as viewed by the end user. It would be preferable that we use native lotus notes functionality.

    This is something we will use an outside contractor to create for us, as we have limited Lotus Notes development capabilities in house.

    We do not need the attachments dealt with, because we have a solid solution up and running well. It is those pesky email bodies. All we need to do is get the email file to the filesystem and we can take it from there. Some metadata, like the name of the file, and we are good to go.

    One note: there will be 150,000-plus emails needing to be converted in the course of the year, but hardware is not an issue for us.

    8Bob Balaban  9/29/2011 2:40:06 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    A simplified version of the code I posted should work for you (easier if you don't have to deal with converting the attachments from binary to base64).

    You can adapt that, or if you want someone else to do it for you, send me an email

    9Hitesh Zinzuwadia  11/3/2011 10:22:31 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    This code fails on a document with multiple body fields.

    I tried Midas, lsx from giini, it's pretty good but it has some problems that I hope they will resolve. I tried multiple approaches to iterate thru all body fields but no success. Any thoughts on this.

    10Bob Balaban  11/4/2011 4:36:51 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Multiple body fields in a document is normal, assuming that they all really represent a single rich-text item that might be bigger than 64KB. If that's not the case, i.e., if there are really 2 or more independent rich text items named "body", then that is not considered a correctly formed Notes document. Duplicate items are generally invalid.

    11Hitesh Zinzuwadia  1/5/2012 3:04:04 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    is there anyway we can keep Attachment Icons available with mime, currently it replaces with:

    "(See attached file: .....)

    Thank you.

    12Bob Balaban  1/5/2012 8:38:12 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    No, I don't know of any way of controlling that. Although once it's converted, you could always edit the HTML text (and fix up the headers)

    13Hitesh Zinzuwadia  1/16/2012 10:51:55 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    One more and hopefully last question, sometimes Document MIMEEntity header does not contain Boundary content, so eml breaks, any idea how to enforce convert mime to generate Boundary=... in headers?

    Appreciate your help.

    14Bob Balaban  1/16/2012 3:30:35 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    The only time I've seen what you describe is when the output is not of type "multipart", because then you don't need a boundary.

    If that's not the situation you have, I don't know what would cause that.

    I believe in Notes 8.x you can drag an email message to the desktop and it will do a MIME conversion, so you can verify your results that way.

    15Renato  2/7/2012 12:59:16 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Hi,

    I have problem when I receive an email containing an image embedded in the message body. In my case I want to get only the HTML field "Body".

    thank you

    Renato Wagner

    16Bob Balaban  2/7/2012 6:10:24 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    You should be able to use the existing LotusScript/Java classes to do that. If you have the Session.ConvertMIME property set to TRUE, then when you open the document, it will automatically be converted to notes rich text, and you can find the "body" richtextitem and look at the attachments. Or, if you don't convert automatically, you can get the "body" MIMEPart and parse that

    17Hitesh Zinzuwadia  3/7/2012 1:58:05 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Another issue we found is when the document form has subforms, we get mime only for mainform contents and not for any content that is displayed by subform(s). if we merge subform(s) in the main form then we get full mime, so it is definitely related to subform(s). Any idea how to resolve this?

    Thank you,

    Hitesh

    18Bob Balaban  3/7/2012 4:39:00 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Sorry to say it sounds like a bug. Have you reported it to IBM?

    19Hitesh Zinzuwadia  3/7/2012 4:51:05 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Bob,

    I ran a couple more tests on the issue, and concluded that it is safe to assume that this is indeed a bug (I tried with different databases/forms and each time subform MIME failed).

    I have not reported it to IBM.

    In addition, I found one more issue (in fact, I can't really classify it as an issue, since it is more a matter of different engines rendering MIME in different ways): when a document has embedded MIME (I believe embedded eml file contents), the Lotus Notes client renders everything in the body, including the content of the embedded email, but when you open the exported MIME eml file, the embedded content becomes an attachment. Any suggestions on how to display that content in the eml file (in Outlook Express/Windows Mail) just as it is displayed in the Lotus Notes client?

    Thank you,

    Hitesh

    20Jasper Duizendstra  3/31/2012 8:05:22 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Besides the subform issue I ran into another issue with the MIME conversion.

    When a document in the document library is converted to MIME, a hide-when based on an @ClientType formula is triggered. If the document is opened in the client, this formula evaluates to "Notes", but during the MIME conversion it evaluates to "Web". This also happens when the form is rendered to RT, or if the formula is evaluated in LotusScript (Evaluate("@ClientType")).

    The result is a missing field in the HTML part of the MIME. The workaround that I can think of is to change the design, removing the @ClientType formula.

    I posted it here also:

    { Link }

    21Bob Balaban  3/31/2012 8:35:26 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Thanks for posting. Sounds like a bug in @ClientType. I wonder if IBM is aware of it...

    22Need help with creating html formatted email  12/11/2014 6:57:50 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Hi

    I have problem with Lotus Notes 8.5.3 and creating html email through API. ( Lotus Notes is not connected to Domino server, it is connected to my gmail account )

    I have following script in VBScript. It creates email and everything is ok, but formatting of the html not works...

    Cannot get bold working - any way, as You can see i tried many ways. I need Your help ASAP.

    If You know, what's wrong with the script, or maybe with Lotus Notes settings, I would be very happy

    Regards,

    Tomasz

    Dim oNotesSession : Set oNotesSession = CreateObject("Notes.NotesSession")

    oNotesSession.ConvertMIME = False ' Do not convert MIME to rich text

    Dim oNotesUIWorkspace : Set oNotesUIWorkspace = CreateObject("Notes.NotesUIWorkspace")

    Dim oNotesDatabase : Set oNotesDatabase = oNotesSession.CurrentDatabase

    Dim EmailDocument : Set EmailDocument = oNotesDatabase.CreateDocument

    Dim Body : Set Body = EmailDocument.CreateMIMEEntity("Body")

    Dim oHeader : Set oHeader = Body.CreateHeader("Content-Type")

    Call oHeader.SetHeaderVal("multipart/mixed")

    Dim AlternativeBody : Set AlternativeBody = Body.CreateChildEntity()

    Dim oHeader2 : Set oHeader2 = AlternativeBody.CreateHeader("Content-Type")

    Call oHeader2.SetHeaderVal("multipart/alternative")

    Dim oChildMIMEEntity2 : Set oChildMIMEEntity2 = AlternativeBody.CreateChildEntity()

    Dim oNStream2 : Set oNStream2 = oNotesSession.CreateStream()

    Call oNStream2.WriteText("Tylko tekst")

    Call oChildMIMEEntity2.SetContentFromText(oNStream2, "text/plain;charset=UTF-8", ENC_IDENTITY_7BIT)

    Call oNStream2.Close()

    Dim RelatedBody : Set RelatedBody = AlternativeBody.CreateChildEntity()

    Dim oHeader3 : Set oHeader3 = RelatedBody.CreateHeader("Content-Type")

    Call oHeader3.SetHeaderVal("multipart/related")

    Dim oChildMIMEEntity : Set oChildMIMEEntity = RelatedBody.CreateChildEntity()

    Dim oHeader4 : Set oHeader4 = oChildMIMEEntity.CreateHeader("Content-Disposition")

    Call oHeader4.SetHeaderVal("inline")

    Dim oNStream : Set oNStream = oNotesSession.CreateStream()

    Call oNStream.WriteText("<html><head></head><body>")

    Call oNStream.WriteText("<strong>Tekst w strongu</strong><br>")

    Call oNStream.WriteText("<b>Tekst przez b</b><br>")

    Call oNStream.WriteText("<bold>Tekst przez b</bold><br>")

    Call oNStream.WriteText("<span><strong>Tekst w spanie</strong></span>")

    Call oNStream.WriteText("<div class=""gruby"">Tekst w divie z klasa gruby</div>")

    Call oNStream.WriteText("<span style='font-style: bold'>Tekst w z czcionka</span>")

    Call oNStream.WriteText("</body></html>")

    Call oChildMIMEEntity.SetContentFromText(oNStream, "text/HTML;charset=UTF-8", ENC_IDENTITY_7BIT)

    Call oNStream.Close()

    Call EmailDocument.CloseMIMEEntities(True, "Body")

    Call oNotesUIWorkspace.EDITDOCUMENT(True,EmailDocument)

    MsgBox "Composed"

    oNotesSession.ConvertMIME = True

    Set EmailDocument = Nothing

    Set oNotesDatabase = Nothing

    Set oNotesUIWorkspace = Nothing

    Set oNotesSession = Nothing

    23Manisha  7/2/2015 2:20:13 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Hi Bob,

    Please help me to resolve this issue.I'm in urgent need.

    (1) I'm creating Email Body with RichTextItem by using Document.createRichTextItem("Body") and appending text and Table and then send this email with java.

    (2) When I retrieve this email from my Inbox, the format gets changed, as I'm reading the body as a String using doc.getItemValueString(EmailRoboConstant.BODY);

    How can I get the same formating which was created in Point-1

    Highly appreciate your help.

    Thanks & Regards,

    Manisha

    24David Conway  9/4/2015 11:39:13 AM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    I used your example code in creating a standalone Java program to loop through all of the views/folders in a given mail file, and extract all of the items in those views/folders as eml files. I am noticing that the attachments are in the eml files, but I cannot open them. Could this be caused by converting binary to base64? Thanks.

    - David Conway

    25Bob Balaban  9/4/2015 8:45:59 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    I don't know if that's the cause of your problem or not. But certainly you could try leaving the attachment part as binary, and see if it works better

    26sahana  9/27/2016 3:40:24 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    hi,

    In the above code, the following line, when used with IBM Notes 9, crashes the JVM with an access violation exception related to nnotes.dll:

    doc.convertToMIME(lotus.domino.Document.CVT_RT_TO_HTML, 0); // note: Designer doc has wrong spelling

    The same is working fine with ibm notes client 8.5.

    Any help regarding the same is appreciated.

    Regards

    Sahana

    27Bob Balaban  9/29/2016 3:52:20 PM  Geek-O-Terica 15: Easy conversion of Notes documents to MIME format (Part 1)

    Hi Sahana. Sounds like you found a bug in the API. I strongly suggest you report it to IBM, they do tend to take compatibility issues like this seriously