I have a huge Excel file that contains the result of an online survey. The person who built the survey messed up the formatting in several respects, and the mess-up I need to take care of first is converting HTML entities to regular text.
From what I can see, only two HTML entities are used, &#44; and &quot;, but the document is over 12,000 rows so I cannot be sure there are no other HTML entities used… and if other HTML entities are used I want them converted to text as well.
I have successfully made a macro to convert the two HTML entities I mentioned into text, but I don’t know how to make the macro execute on the entire file (i.e. I have to hold down on the macro hot key to make it execute… and it is taking forever).
If there was a macro already available to do what I want that would be great because I could also use a modified version of it for my next task of arranging all the columns and rows in the proper order.
This is the version of my macro that searches for &#44;. It works, I just have to hold down the hot key, which takes forever. If I could make this run on the entire Excel file that would be great, and then I can just adjust the macro for each HTML entity until I have eliminated them all.
    Sub Macro2()
    '
    ' HTML_Converter Macro
    '
        Cells.Find(What:="&#44;", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
            :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:= _
            False, SearchFormat:=False).Activate
        ActiveCell.Replace What:="&#44;", Replacement:=",", LookAt:=xlPart, _
            SearchOrder:=xlByRows, MatchCase:=False, SearchFormat:=False, _
            ReplaceFormat:=False
        Cells.Find(What:="&#44;", After:=ActiveCell, LookIn:=xlFormulas, LookAt _
            :=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:= _
            False, SearchFormat:=False).Activate
    End Sub
Create a backup of the workbook.
Open the VBA editor by pressing Alt+F11.
Double-click “ThisWorkbook” in the tree view on the left, under the workbook that you are working with.
Copy and paste the following:
    Sub UnescapeCharacters()
        ' Set this to match your sheet name
        Dim sheetname As String
        sheetname = "Sheet1"

        Dim sheet As Worksheet
        Set sheet = Me.Worksheets(sheetname)

        ' Walk every cell in the used range, wherever that range starts
        Dim cell As Range
        For Each cell In sheet.UsedRange.Cells
            ' Define all your replacements here
            ReplaceCharacter cell, "&quot;", """" ' quadruple quotes required
            ReplaceCharacter cell, "&#44;", ","
        Next cell
    End Sub

    Sub ReplaceCharacter(ByRef cell As Range, ByVal find As String, ByVal replacement As String)
        ' Only rewrite plain text cells that contain the entity, so that
        ' formulas, numbers, and dates are left untouched
        If Not cell.HasFormula And VarType(cell.Value) = vbString Then
            If InStr(cell.Value, find) > 0 Then
                cell.Value = Replace(cell.Value, find, replacement)
            End If
        End If
    End Sub
This just iterates over every cell in the specified worksheet and replaces everything you define. The provided code replaces the two character codes you mentioned.
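Since the question mentions that other entities might be hiding somewhere in the 12,000 rows, a quick scan can show which ones need a replacement line. The following is only a rough sketch: the name ListEntities is made up for illustration, it looks at the active sheet only, and it assumes Windows, where the VBScript.RegExp and Scripting.Dictionary objects are available.

```vba
' Hypothetical helper (not part of the original answer): prints each distinct
' entity-like token, such as &quot; or &#44;, to the Immediate window (Ctrl+G).
' Assumes Windows, where VBScript.RegExp and Scripting.Dictionary exist.
Sub ListEntities()
    Dim re As Object, seen As Object
    Set re = CreateObject("VBScript.RegExp")
    Set seen = CreateObject("Scripting.Dictionary")
    re.Global = True
    ' An ampersand, an optional #, letters or digits, then a semicolon
    re.Pattern = "&#?[A-Za-z0-9]+;"

    Dim cell As Range, m As Object
    For Each cell In ActiveSheet.UsedRange.Cells
        ' Only inspect cells that hold text
        If VarType(cell.Value) = vbString Then
            For Each m In re.Execute(cell.Value)
                If Not seen.Exists(m.Value) Then
                    seen.Add m.Value, True
                    Debug.Print m.Value
                End If
            Next m
        End If
    Next cell
End Sub
```

The pattern is deliberately loose, so ordinary text that happens to look like an entity (an ampersand followed by a word and a semicolon) will also be listed. Treat the output as a list of candidates to check by eye.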
You can run UnescapeCharacters as a macro, or just place the caret inside the subroutine and hit F5.
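If the cell-by-cell loop turns out to be slow on a sheet this size, an alternative is to let Excel do the replacing with Range.Replace, which is the same method the recorded macro in the question uses, applied to the whole used range of every sheet at once. This is a minimal sketch under a few assumptions: the name UnescapeAllSheets and the list of entities are illustrative, and the list should be adjusted to whatever actually appears in the data.

```vba
' Hypothetical sketch: the sub name and the entity list are assumptions,
' not something taken from the question.
Sub UnescapeAllSheets()
    Dim entities As Variant, replacements As Variant
    ' &amp; goes last so that text like "&amp;quot;" is not decoded twice
    entities = Array("&quot;", "&#44;", "&lt;", "&gt;", "&amp;")
    replacements = Array("""", ",", "<", ">", "&")

    Dim ws As Worksheet
    Dim i As Long
    For Each ws In ThisWorkbook.Worksheets
        For i = LBound(entities) To UBound(entities)
            ' One call replaces every occurrence in the sheet's used range
            ws.UsedRange.Replace What:=entities(i), Replacement:=replacements(i), _
                LookAt:=xlPart, MatchCase:=False
        Next i
    Next ws
End Sub
```

Unlike the loop above, Range.Replace also rewrites matching text inside formulas, so keep the backup from the first step around.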
I’ve made an Excel add-in that has this feature: