Home » excel » c# – object life-cycle for COM objects passed from .NET into VBA

c# – object life-cycle for COM objects passed from .NET into VBA

Posted by: admin April 23, 2020 Leave a comment

Questions:

My organization sometimes needs to use Excel to generate a bunch of formatted statements (in the sense of a document stating “your account balance is $X”), print them to PDF, and then combine them into one big PDF. The method generally used involves a single sheet driven by an index cell and a list of people/data on another sheet. A VBA macro iterates the index cell from 1 to N, then uses the Adobe Distiller API to print the formatted sheet each time and combine the results.

For various reasons I would like to implement most of this macro’s logic in C# in our VSTO Excel Add-In, so that the VBA side of the process would be reduced to a few lines.

I decided to expose an API that looks roughly like this:

AcroPDDoc PdfBegin(Worksheet worksheet, string filename);
void PdfAddPage(AcroPDDoc pdf, Worksheet worksheet);
void PdfComplete(AcroPDDoc pdf);

The idea being you write VBA of the form:

Sub PrintToPdf()
    Dim obj As IMySharedObject
    Set obj = Application.COMAddIns("MyAddIn").Object

    Dim pdf As Acrobat.AcroPDDoc

    Dim i As Long
    For i = 1 To 10
        Range("counter").Value = i

        If i = 1 Then
            Set pdf = obj.PdfBegin(Sheets("Statement"), "C:\myFile.pdf")
        Else
            PdfAddPage pdf, Sheets("Statement")
        End If
    Next i

    PdfComplete pdf
End Sub

I grew curious/worried about the lifecycle of the AcroPDDoc object and about open file handles, Acrobat.exe processes etc. in the event that the macro hit an error or was terminated mid-execution. Not super worried because “close Excel and reopen it” is an acceptable solution if necessary. I wrote the following code in C#:

internal static class Printing
{
    private static WeakReference weakref;

    public static AcroPDDoc PdfBegin(Worksheet worksheet, string filename)
    {
        SetAdobeOutputFile(filename);
        worksheet.PrintOut(ActivePrinter: "Adobe PDF");

        AcroPDDoc pdf = new AcroPDDoc();
        pdf.Open(filename);
        weakref = new WeakReference(pdf);

        return pdf;
    }

    public static void GC()
    {
        System.GC.Collect();
    }

    public static void test(AcroPDDoc pdf)
    {
        if (weakref != null) {
            System.Diagnostics.Debug.WriteLine("IsAlive pre: " + weakref.IsAlive);
            if (weakref.IsAlive) System.Diagnostics.Debug.WriteLine("ReferenceEquals: " + Object.ReferenceEquals(pdf, weakref.Target));
        }

        GC.Collect();

        if (weakref != null) System.Diagnostics.Debug.WriteLine("IsAlive post: " + weakref.IsAlive);
    }
}

I’ve elided a bunch of extra Debug.WriteLines and some other extraneous code. I tested it with the following VBA:

Sub foo()
    Dim obj As IUDFSharedObject
    Set obj = Application.COMAddIns("MyAddIn").Object

    Dim pdf As Acrobat.AcroPDDoc
    Set pdf = obj.PdfBegin(Sheets("Statement"), "C:\myFile.pdf")
    'obj.GC
    'obj.test pdf
End Sub

What I found generally was that .NET does not include the reference sent off to VBA-land in its reference count for garbage collection.

For example, if I uncomment just obj.GC and obj.test pdf, I am informed that weakref is not alive.

However, if I uncomment only obj.test pdf, weakref is alive both before and after (and I emit “ReferenceEquals: true”).

Note that pdf is in scope in VBA the entire time. I originally tested to see what happens if you let pdf escape VBA scope too, but it turns out not to matter.

This is a much, much bigger problem for me than a resource link. Is there any solution short of permanently storing each AcroPDDoc object generated in a List somewhere to keep the reference count above zero?

How to&Answers:

Thanks to @yms above I’ve figured out what is going on and have come up with a solution I’m fairly happy with. First, the slight modification to the API:

void PdfBegin(AcroPDDoc pdf, Worksheet worksheet, string filename);
void PdfAddPage(AcroPDDoc pdf, Worksheet worksheet);
void PdfComplete(AcroPDDoc pdf);

Each C# method will call Mashal.ReleaseComObject(pdf) before returning. I did read Marshal.ReleaseComObject considered dangerous, but I’ve tested for the particular failure mode he calls out and found it does not appear to occur in practice.

The VBA now has to supply the AcroPDDoc object from the beginning. Thus the typical usage will look like:

Sub PrintToPdf()
    Dim obj As IMySharedObject
    Set obj = Application.COMAddIns("MyAddIn").Object

    Dim pdf As New AcroPDDoc

    Dim i As Long
    For i = 1 To 10
        Range("counter").Value = i

        If i = 1 Then
            obj.PdfBegin pdf, Sheets("Statement"), "C:\myFile.pdf"
        Else
            obj.PdfAddPage pdf, Sheets("Statement")
        End If
    Next i

    obj.PdfComplete pdf
End Sub

Essentially just that the declaration is now As New AcroPDDoc rather than As AcroPDDoc with a later Set.

Testing revealed that VBA is very prompt about decrementing the refcount of the AcroPDDoc as soon as it falls out of scope or the reference is set to Nothing. This includes cases where an error is raised mid-subroutine and the user ends execution.

Finally, the Acrobat.exe process is also prompt about killing itself off as soon as its refcount reaches zero, even if it has a file open.

Answer:

Note that your references to AcroPDDoc in .Net are actually wrappers to an out of process COM object that you are passing through different ecosystems, .Net framework does not fully control the lifetime of the underlying object, the reference counting is controlled by the COM server, and as long as there is a COM reference to one object, either from VBA or from .Net, the object will stay alive.

I believe you will find this question and its answers interesting:
RCW & reference counting when using COM interop in C#