
excel – Querying a pre-existing in-memory ADODB recordset, using filtering and aggregate functions (advanced ADODB library?)


Question:

I have some existing code that queries a SQL database repeatedly with different parameters, and I suspect it would perform better if I instead selected one big chunk of data into an ADODB.Recordset at the start and then, within the loop, queried that recordset rather than the database itself.

One additional caveat is that I need to use aggregate functions (SUM, MIN, MAX, AVG) in these sub-queries.

Coding this myself wouldn’t be terribly difficult, but something this obvious seems like it would have been done thousands of times before, which makes me wonder whether there is an open source library out there with this kind of functionality. I swear I encountered one a few years back but have been unable to track it down on Google.
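For reference, a minimal hand-rolled sketch of the idea, assuming `rs` is an already-open ADODB.Recordset; the field and criteria names are placeholders. Note that the Filter property accepts only simple `Field <operator> value` clauses joined with AND/OR:

```vb
' Minimal sketch: filter an open ADODB.Recordset in memory and compute
' SUM/MIN/MAX/AVG over one column. Field and criteria names are placeholders.
Function AggregateFiltered(rs As Object, criteria As String, _
                           fieldName As String) As Variant
    Dim total As Double, mn As Variant, mx As Variant
    Dim n As Long, v As Variant

    rs.Filter = criteria             ' e.g. "Region = 'East' AND Qty > 10"
    Do Until rs.EOF                  ' setting Filter repositions to the first match
        v = rs.Fields(fieldName).Value
        total = total + v
        If IsEmpty(mn) Or v < mn Then mn = v
        If IsEmpty(mx) Or v > mx Then mx = v
        n = n + 1
        rs.MoveNext
    Loop
    rs.Filter = 0                    ' 0 = adFilterNone: clear the filter
    If n > 0 Then AggregateFiltered = Array(total, mn, mx, total / n)
End Function
```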

EDIT:
A good suggestion (by TimW) in the comments was to do all the aggregation on the database server, pass the results back, and do only the filtering on the client.
(Although in this case that won’t work, as two of the columns being filtered are DateTime columns.)

UPDATE

Here is the library I previously encountered:
http://code.google.com/p/ado-dataset-tools/

Not sure whether the author has abandoned it (his plan seemed to be to update it and convert it to C#), but the VBA versions of the various libraries seem to be available here:
http://code.google.com/p/ado-dataset-tools/source/browse/trunk/ado-recordset-unit-tests.xls?spec=svn8&r=8#ado-recordset-unit-tests.xls

The specific ADO library I was interested in is here:
http://code.google.com/p/ado-dataset-tools/source/browse/trunk/ado-recordset-unit-tests.xls/SharedRecordSet.bas

See specifically the GroupRecordSet() function.
Only the SUM, MIN, and MAX aggregate functions appear to be supported.

Another possible alternative (if running within Excel)

Writing SQL Queries Against Virtual Tables in Excel VBA
http://www.vbaexpress.com/forum/showthread.php?t=260
Not sure how this would perform, but pulling the raw data (with partial pre-aggregation) into a local worksheet in Excel and then using that worksheet as a data source in subsequent queries might be a viable option.
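For what it’s worth, a sketch of that approach, querying a sheet in the current workbook through the ACE OLE DB provider. The sheet name RawData and the column names are assumptions, and the workbook must be saved to disk; on older setups the Jet 4.0 provider with Extended Properties "Excel 8.0" is used instead:

```vb
' Sketch: treat a worksheet as a table and run an aggregate SQL query
' against it. Sheet and column names are placeholders.
Sub QueryLocalSheet()
    Dim cn As Object, rs As Object
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
            "Data Source=" & ThisWorkbook.FullName & ";" & _
            "Extended Properties=""Excel 12.0 Macro;HDR=Yes"""

    Set rs = cn.Execute( _
        "SELECT Region, SUM(Amount) AS Total, AVG(Amount) AS AvgAmount " & _
        "FROM [RawData$] GROUP BY Region")

    Do Until rs.EOF
        Debug.Print rs!Region, rs!Total, rs!AvgAmount
        rs.MoveNext
    Loop
    rs.Close: cn.Close
End Sub
```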

Answers:

My own experience is that it’s actually far more efficient to make many small calls to the database than to load a large amount of data into a recordset and then try to filter/query it locally.

I’m also under the impression that your ability to filter/query data in an existing ADO recordset is fairly limited compared with making individual calls to the database. Back when I was trying to do this, I thought it would be as simple as creating a second ADO recordset by querying the first one with SQL. I never found a way to do that; I’m pretty sure it isn’t possible.

Edit1
To help you understand the difference: I once wrote some code that read new price data from a text file and updated prices in a Visual FoxPro database using ADO and the VFP OLE DB provider. The table I was querying had about 650,000 records. I thought it would be best to load all the records into a recordset and then use ADO’s Filter method; with that approach my code took three to four hours to run. I changed it to look up each record individually, and it then ran in one minute and two seconds. I posted about this problem on SO, and you can take a look at the various responses I received: Speed up this Find/Filter Operation – (VB6, TextFile, ADO, VFP 6.0 Database)
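For comparison, the "many small calls" approach usually looks like a single prepared, parameterized command executed once per lookup, so the statement is parsed and planned only once. The table, column, and parameter details here are placeholders:

```vb
' Sketch: prepare one parameterized command and execute it repeatedly.
' cn is assumed to be an open ADODB.Connection; names are placeholders.
Sub LookupOneAtATime(cn As Object)
    Dim cmd As Object, rs As Object, code As Variant
    Set cmd = CreateObject("ADODB.Command")
    Set cmd.ActiveConnection = cn
    cmd.CommandText = "SELECT Price FROM Items WHERE ItemCode = ?"
    cmd.Prepared = True
    cmd.Parameters.Append cmd.CreateParameter("p1", 200, 1, 20) ' 200=adVarChar, 1=adParamInput

    For Each code In Array("A100", "B205", "C310")
        cmd.Parameters(0).Value = code
        Set rs = cmd.Execute
        If Not rs.EOF Then Debug.Print code, rs!Price
        rs.Close
    Next code
End Sub
```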

Answer:

If your performance issue stems from a remote SQL Server database over a slow connection, then local caching can make a certain amount of sense when you have to work with the data intensively.

One way to get a lot of versatility would be to use a local Jet MDB as your cache.

You could do the initial “caching” query through Jet, doing a SELECT from your remote SQL Server database INTO a local table, then CREATE indexes on it. From there you can run any number of queries against the local table. When you need to work with another subset, just DROP the local table and its indexes and requery the remote database.
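A sketch of that sequence, assuming an empty cache.mdb already exists on disk; the server, database, table, and column names are placeholders. Jet can pull from an ODBC source directly in the FROM clause:

```vb
' Sketch: cache a remote SQL Server table in a local Jet MDB, index it,
' query it repeatedly, then drop it. All names are placeholders.
Sub BuildLocalCache()
    Dim cn As Object, rs As Object
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Temp\cache.mdb"

    cn.Execute "SELECT * INTO LocalOrders FROM " & _
        "[ODBC;Driver={SQL Server};Server=MyServer;" & _
        "Database=MyDb;Trusted_Connection=Yes;].Orders"
    cn.Execute "CREATE INDEX idxOrderDate ON LocalOrders (OrderDate)"

    ' ...any number of local aggregate queries...
    Set rs = cn.Execute("SELECT AVG(Amount) FROM LocalOrders " & _
                        "WHERE OrderDate >= #2020/01/01#")
    Debug.Print rs(0)

    ' When you need a different subset:
    cn.Execute "DROP TABLE LocalOrders"
    cn.Close
End Sub
```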

But unless the connection to the remote database is slow, this usually doesn’t buy you much.

Answer:

From my research into this subject, there is no easy answer: no existing library or commercial product. The only viable approach, as far as I can tell, is to bite the bullet and hand-code a solution, which is more work than it’s worth to me.

So I am marking this as the correct answer despite it not being the solution to the problem. 🙂