c# – Parsing out Excel functions from Formula string

Posted by: admin, April 23, 2020

Question:

I have a string that contains an Excel formula. How can I parse out each function name from that string?

I can’t figure out how to write the regex for this. Basically, a function name is the run of characters directly before a ( that isn’t inside single or double quotes.

For example:

  1. =VLOOKUP($A9,'Summary'!$A$10:$C$30,3,FALSE) – Should return VLOOKUP

  2. =IFERROR((C10/B10),"N/A") – Should return IFERROR

  3. ='New Chart Data (Date)'!L70 – Should return nothing because there is no function

  4. =IFERROR((C10/B10),Len(E30)) – Should return IFERROR and LEN

  5. ='New Chart Data(Date)'!L70 + Len(5) – Should return Len. This is the tricky one: many patterns also return Data, which is wrong.

Any ideas?

Thanks in advance.

Answers:

You can use something like this:

(?<=[=,])[A-Za-z2]+(?=\()

There is one catch: in a formula such as =IFERROR((C10/B10), Len(E30)), the space after the comma means Len is not matched. You can use this one instead and trim the leading whitespace from each match:

(?<=[=,])\s*[A-Za-z2]+(?=\()
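
A minimal C# sketch of how that pattern might be applied, assuming the formula is already available as a string. The names `LeadingSpaceDemo` and `Extract` are made up for illustration. Because `\s*` sits outside the lookbehind, any whitespace becomes part of the match, so each value is passed through `Trim()`.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LeadingSpaceDemo
{
    // Pattern from the answer: optional whitespace is included in the match,
    // so every value has to be trimmed afterwards.
    private static readonly Regex Pattern =
        new Regex(@"(?<=[=,])\s*[A-Za-z2]+(?=\()");

    // Hypothetical helper: returns the trimmed function names found in a formula.
    public static List<string> Extract(string formula)
    {
        var names = new List<string>();
        foreach (Match m in Pattern.Matches(formula))
        {
            names.Add(m.Value.Trim());
        }
        return names;
    }
}
```

Calling `LeadingSpaceDemo.Extract("=IFERROR((C10/B10), Len(E30))")` should give `IFERROR` and `Len`, the second one without its leading space.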

Or, since C# supports variable-length lookbehinds, move the whitespace into the lookbehind so that it is not part of the match:

(?<=[=,]\s*)[A-Za-z2]+(?=\()

I think this takes a bit more resources than the previous pattern.

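EDIT: I hadn’t considered that a sheet name can itself look like a function call, such as =Sheet(2) in ='=Sheet(2)'!A1. A negative lookahead rejects any name whose next single quote is immediately followed by !, which is how a quoted sheet name ends: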

(?<=[=,])\s*[A-Za-z2]+(?=\()(?![^']*'!)

EDIT2: I forgot about operators as well: a function can follow +, - and so on, not only = or a comma. I’ll use a word boundary like Andy’s instead of the lookbehind:

\b[A-Za-z2]+(?=\()(?![^']*'!)
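
As a rough check, the final pattern can be run against the five formulas from the question. This is only a sketch: the class and method names (`FormulaFunctions`, `Extract`) are invented here, and the pattern is used exactly as written above.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class FormulaFunctions
{
    // \b[A-Za-z2]+(?=\()  : a run of letters (or the digit 2) directly before "(".
    // (?![^']*'!)         : reject it when the next single quote is followed by "!",
    //                       i.e. when the name sits inside a quoted sheet name.
    private static readonly Regex Pattern =
        new Regex(@"\b[A-Za-z2]+(?=\()(?![^']*'!)");

    // Hypothetical helper: returns the matched names in order of appearance.
    public static string[] Extract(string formula) =>
        Pattern.Matches(formula).Cast<Match>().Select(m => m.Value).ToArray();
}
```

With this helper, example 5 (`='New Chart Data(Date)'!L70 + Len(5)`) should yield only `Len`, and example 3 should yield nothing. Names are returned as typed, so example 4 gives `Len` rather than `LEN`.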

Answer:

I think it could be simplified by using a word boundary (\b) rather than a lookbehind:

\b([A-Za-z2]+)(?=\()
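
A possible way to read the captured group in C# is sketched below. Two things in it are assumptions that go beyond the pattern above. First, the `(?![^']*'!)` guard from the earlier answer is kept, since without it example 5 still returns Data. Second, the character class is widened to a letter followed by letters, digits or dots, because `[A-Za-z2]` cannot match names such as LOG10 or STDEV.S in full. The names `CapturedNames` and `Extract` are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CapturedNames
{
    // Assumption (not from the answers): a name is a letter followed by
    // letters, digits or dots, so LOG10 and STDEV.S are kept whole.
    // The '! guard from the earlier answer is kept to skip sheet names.
    private static readonly Regex Pattern =
        new Regex(@"\b([A-Za-z][A-Za-z0-9.]*)(?=\()(?![^']*'!)");

    public static List<string> Extract(string formula)
    {
        var names = new List<string>();
        foreach (Match m in Pattern.Matches(formula))
        {
            // Group 1 is the parenthesised name; upper-case it so that
            // "Len" and "LEN" compare equal.
            names.Add(m.Groups[1].Value.ToUpperInvariant());
        }
        return names;
    }
}
```

Upper-casing is a design choice made here so that example 4 reports IFERROR and LEN as the question asks; drop `ToUpperInvariant()` to keep the names as typed.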