regex – How to write a regular expression to populate a list with given file types & exclude certain folders

Posted by: admin, April 23, 2020

Questions:

I’ve done a lot of searching on this but still can’t quite put it all together.

I’m trying to create an Excel VBA program that populates a spreadsheet based on regular expressions entered by the user, so that I can process the files with other VBA programs.

So, for example, if I want to populate the list with all Autodesk Inventor file types, I would use:

.*\.(iam|ipt|ipn|idw)

and from what I have read, if I want a regex to skip a file that is in a given folder or whose path contains a given string, I would use something like:

(?iOldVersions)

but, like I mentioned, I am having trouble putting this together into a single regex call. I also need it to work when there are multiple strings that I want it to skip (e.g. the folders OldVersions and Legacy).

I think I would like to keep it as a regex, although I’m guessing I could also use WScript.Shell (or whatever that object is), but it would be nice to just get familiar with regular expressions for now.

The code I am using is the same as in this post, except that I added a parameter to pass the pattern to the top-level code by pulling it from a cell in Excel.

List files of certain pattern using Excel VBA

Again, any help would be greatly appreciated!

Thanks again, all!

Edit: Latest attempt:

Private Sub FindPatternMatchedFiles()

Dim objFile As String
objFile = "C:\OldVersions\TestFile.iam"

Dim objRegExp As Object
Set objRegExp = CreateObject("VBScript.RegExp")

'objRegExp.Pattern = "(.*\.(iam|ipt|ipn|idw))(?!(\z))."
objRegExp.Pattern = "(^((?!OldVersions).)*$)(.*\.(iam|ipt|ipn|idw))"

objRegExp.IgnoreCase = True

Dim res As Boolean
res = objRegExp.Test(objFile)
MsgBox res

'Release the RegExp object
Set objRegExp = Nothing
End Sub

Answers:

To exclude paths that contain \OldVersions\ or \Legacy\, add anchors and a negative lookahead at the start:

^(?!.*\\(?:OldVersions|Legacy)\\).*\.(?:iam|ipt|ipn|idw)$

Details:

  • ^ – start of string
  • (?!.*\\(?:OldVersions|Legacy)\\) – a negative lookahead failing the match if there is \ + either OldVersions or Legacy + \ after 0+ chars other than \r and \n (.*).
  • .* – 0+ chars other than \r and \n, as many as possible, up to the last…
  • \. – literal .
  • (?:iam|ipt|ipn|idw) – one of the alternatives in the non-capturing group
  • $ – end of string.
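
As a minimal sketch of how the pattern above could be plugged into the VBA code from the question: the helper name IsWantedFile and the sample paths are hypothetical and used only for illustration. The backslash has no special meaning inside a VBA string literal, so the pattern can be pasted in unchanged.

```vba
Option Explicit

' Hypothetical helper: returns True when filePath ends in one of the
' wanted extensions and does not pass through \OldVersions\ or \Legacy\.
Private Function IsWantedFile(ByVal filePath As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Same pattern as in the answer. VBA strings do not treat "\" as an
    ' escape character, so no extra doubling is needed.
    re.Pattern = "^(?!.*\\(?:OldVersions|Legacy)\\).*\.(?:iam|ipt|ipn|idw)$"
    re.IgnoreCase = True

    IsWantedFile = re.Test(filePath)
End Function

' Sample paths are made up for the demonstration.
Private Sub DemoIsWantedFile()
    Debug.Print IsWantedFile("C:\OldVersions\TestFile.iam")  ' False: excluded folder
    Debug.Print IsWantedFile("C:\Projects\TestFile.iam")     ' True
    Debug.Print IsWantedFile("C:\Projects\Notes.txt")        ' False: wrong extension
End Sub
```

Because the lookahead requires a backslash on both sides of the folder name, only folders named exactly OldVersions or Legacy are excluded; a folder such as OldVersions2 would still be listed.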