excel – Extract substring of list based on another list

Posted by: admin May 14, 2020

Questions:

I have two lists: one consisting of names with added information in various forms (list 1, see the example below), and one consisting of the cleanly formatted names with no added information (list 2).

List 1
--------
Netto City      | Value
Imerco City     | Value
Bilka Suburb    | Value
Bauhaus, City   | Value
City FDB Superb | Value

List 2
------
Netto
Imerco
Bilka
Bauhaus
FDB Super

What I am trying to do is create a filter so that, no matter what the first column of my source data (list 1) looks like, I will be able to sum the values based on list 2.

Something similar to this: Excel – extracting data based on another list

I tried using VLOOKUP, but that does not search for substrings. Then I tried:

=IF(COUNTIF(A$4:A$9;"*"&D5&"*")>0;
    INDIRECT(ADDRESS(MATCH("*"&D5&"*";A$4:A$9;0);4));"not found")

But that appears to do the opposite: it searches list 1 for a single cell value from list 2.
I can’t quite work out whether that would work just as well. I haven’t been able to get it to work anyway, hence my search for the other direction: searching list 2 for each item from list 1.

Ultimately, though, what I am trying to accomplish is to create a clean list from the source data, which I can then use to categorize each item in list 1 based on list 3.

List 3
Bilka     | Cat1
Imerco    | Cat2
FDB Super | Cat1
etc. 

For that to work, I need a clean list of the source data, without all the extra information that comes with it.

I use the following SUMIFS

=SUMIFS($F$3:$F$703;$B$3:$B$703;
    "="&$H4;$D$3:$D$703;">="&I$2;$D$3:$D$703;"<="&I$3)

to sum all amounts belonging to a particular item in List 3 (where I’ve manually created List 3) between two dates.

The purpose of this is to create a sheet that contains all expenditures at a particular store or in a category of one’s own choosing. For instance, the stores listed in List 1 are primarily food stores.

Edit – Clarification.

What I am proposing to do is a multistage process.
Stage 1:
Insert original source data (done)

Stage 2:
Filter source data for unique values (done)

Stage 3:
Create a list of approved names for each item in the source data
– i.e. Bilka Suburb becomes Bilka, Netto City becomes Netto
Here ‘Netto’ and ‘Bilka’ are approved names, which are created manually to allow for grouping in Stage 4. I am looking to automate this step.

Stage 4:
Group each item from the Stage 3 list by name and date interval (weekly, monthly, or whatever). This is done and works on my manually corrected data, so it only needs Stage 3 to work.

Stage 5:
Select the appropriate category and type for each item in the resulting list from Stage 3:
Bilka is a food place, so it gets the category ‘food’, the same as Netto, whereas Bauhaus gets the category ‘Building Supplies’. Each of these items gets the type ‘expense’, whereas, say, wage would get the type ‘income’ (done).

The solution to Stage 5 is just a VLOOKUP on the category into a table that lists each category with a type, so that is simple enough.
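
The Stage 5 lookup chain can be illustrated outside Excel. The following is only a rough Python sketch of the idea, not part of the workbook: the two dictionaries stand in for the lookup tables, their contents are taken from the examples above, and the names `store_category`, `category_type` and `classify` are hypothetical.

```python
# Hypothetical stand-ins for the two lookup tables described in Stage 5.
store_category = {
    "Bilka": "food",
    "Netto": "food",
    "Bauhaus": "Building Supplies",
}
category_type = {
    "food": "expense",
    "Building Supplies": "expense",
    "wage": "income",
}


def classify(approved_name):
    """Return (category, type) for an approved name, or None where no entry exists."""
    category = store_category.get(approved_name)
    # A missing category simply yields None here, where VLOOKUP would show #N/A.
    return category, category_type.get(category)


print(classify("Bilka"))
print(classify("Bauhaus"))
```

An approved name that is missing from the first table comes back as `(None, None)`, which makes gaps in the manually maintained tables easy to spot.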

Final Solution: Requires that the list to iterate over is in column G and the approved names are in column H; the matching approved name is written two columns to the right of each source item (column I). One flaw is that it cannot tell the difference between items such as “Super” and “SU”, and I don’t know how to fix that. If anyone has any suggestions, I am all ears.

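Sub LoopCells()
    Dim ws As Worksheet
    Dim LRApproved As Long, LRsource As Long
    Dim approvedcell As Range, sourcecell As Range

    Set ws = Worksheets("RawData")
    'Last used rows in the approved (H) and source (G) columns
    LRApproved = ws.Cells(ws.Rows.Count, "H").End(xlUp).Row
    LRsource = ws.Cells(ws.Rows.Count, "G").End(xlUp).Row

    For Each approvedcell In ws.Range("H2:H" & LRApproved).Cells   'Approved stores entered by users
        If Len(approvedcell.Value) > 0 Then                        'Skip blanks: InStr matches an empty string everywhere
            For Each sourcecell In ws.Range("G2:G" & LRsource).Cells   'Items found from bank statement export
                'Case-insensitive substring test; the result goes two columns right of G
                If InStr(UCase(sourcecell.Value), UCase(approvedcell.Value)) <> 0 Then
                    sourcecell.Offset(0, 2).Value = approvedcell.Value
                End If
            Next sourcecell
        End If
    Next approvedcell
End Sub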
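
One possible way to handle the “Super” versus “SU” ambiguity is to prefer the longest approved name that occurs in the source item, rather than keeping whichever approved name happens to match last, as the macro does. The following is only a rough Python sketch of that heuristic, not a tested VBA fix; the function name `best_approved_name` and the sample strings are hypothetical.

```python
def best_approved_name(source, approved_names):
    """Return the longest approved name found in source (case-insensitive), or None."""
    source_upper = source.upper()
    # Collect every non-empty approved name that occurs as a substring.
    matches = [name for name in approved_names if name and name.upper() in source_upper]
    # Prefer the longest match, so "Super" beats "SU" when both are present.
    return max(matches, key=len) if matches else None


approved = ["SU", "Super", "Bilka"]
for item in ["Super City", "SU City", "Bilka Suburb", "Unknown shop"]:
    print(item, "->", best_approved_name(item, approved))
```

This is a heuristic only: a short approved name can still match inside an unrelated word when no longer name applies, so matching on whole words may be a better fit for some data.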

Thanks for all the help.
Edit: Added final solution and VBA tag.

How to & Answers:

This works for me:

=SUM(B$3:B$7*NOT(ISERROR(SEARCH(A11,A$3:A$7))))

This assumes that your example list 1 is in range A3:B7 and your list 2 in A11:B15. Paste the above formula in cell B11 and press Ctrl+Shift+Enter to enter it as an array formula. Then you can drag-copy it all the way down to B15.

Explanation: SEARCH for e.g. “Netto” in the cells of List 1. For cells that do not contain that string, SEARCH returns an error. So we’re looking for cells that do not return an error. We now have an array of booleans indicating this. Multiply it element-by-element by the array of values. In this multiplication, TRUE is interpreted as 1 and FALSE as zero, so you’re screening out the values that don’t correspond to “Netto”.
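
The masking idea in the explanation can be mimicked outside Excel. The following is only an illustrative Python sketch: the names come from list 1 in the question, the amounts are made up, and `sum_for` is a hypothetical helper, not an Excel function.

```python
names = ["Netto City", "Imerco City", "Bilka Suburb", "Bauhaus, City", "City FDB Superb"]
values = [10, 20, 30, 40, 50]  # made-up amounts for illustration


def sum_for(needle, names, values):
    """Sum the values whose name contains needle, ignoring case."""
    # Boolean mask: True where the name contains the search text.
    mask = [needle.lower() in name.lower() for name in names]
    # True counts as 1 and False as 0, so non-matching rows contribute nothing.
    return sum(value * hit for value, hit in zip(values, mask))


print(sum_for("Netto", names, values))
print(sum_for("FDB Super", names, values))
```

The comparison ignores case, as SEARCH does, but the sketch does not reproduce the wildcard characters that SEARCH also accepts.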

Answer:

Perhaps I’ve misunderstood but can’t you use SUMIF?

=SUMIF(A$4:A$9;"*"&D5&"*";B$4:B$9)

Answer:

Instead of going with VBA, you can extract this with a short formula: =INDEX(List2!A2:A10,MATCH(1,COUNTIF(List1!A2,"*"&List2!A2:A10&"*"),0)) (press Ctrl+Shift+Enter to enter it as an array formula). This assumes you want to extract the matching list 2 name for each item in list 1.