Issue reading text files with HTML tags and converting them into htmldoc (VBA)


I have downloaded a webpage's HTML and stored it in a local folder. Now I want to read the same HTML file using an Excel VBA macro and parse a particular tag. The issue: the HTML tag attributes get changed when I try to read the local HTML file and assign the entire file's data to an HTML object.

I am not able to get the correct HTML attributes, and hence I am not able to parse it. When I read the HTML, assign it to an HTML object and write the data to a file, I see the results below. That is the reason I am not able to parse correctly with td.className = "detb".

For example, part of the original tag in the local HTML file:

<tbody> <tr height=""22""> <td width=""40%"" class=""detb"" colspan=""1""></td>  <td align=""right"" class=""detb"">mar 13</td> <td align=""right"" class=""detb"">mar 12</td> <td align=""right"" class=""detb"">mar 11</td> <td align=""right"" class=""detb"">mar 10</td> <td align=""right"" class=""detb"">mar 09</td> </tr> 

Below is the kind of data I get when I read the file, assign it to an HTML object and display it or write it to a file:

<tbody><tr height="""" 22""""=""""> <td width="""" colspan=""1"" 1""""="""" 40%""""="""" detb""""=""""></td>  <td align="""" right""""="""" detb""""="""">mar 13</td> <td align="""" right""""="""" detb""""="""">mar 12</td> <td align="""" right""""="""" detb""""="""">mar 11</td> <td align="""" right""""="""" detb""""="""">mar 10</td> <td align="""" right""""="""" detb""""="""">mar 09</td> </tr> 

Code:

    Set mybrowser = CreateObject("InternetExplorer.Application")
    With mybrowser
        .Navigate << html file path >>
        .Visible = True
        Set htmldoc = mybrowser.Document

        Open myfileprev2 For Output As #1
        Write #1, htmldoc.body.innerHTML
        Close #1

        .Quit
    End With

Can anyone please help me?

Cheers, Raghav

    Write #1, htmldoc.body.innerHTML

should be

    Print #1, htmldoc.body.innerHTML
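For context, here is a minimal sketch of how the whole routine might look with that one-line change applied. The two file paths, the procedure name and the loop that waits for the page to finish loading are assumptions added for illustration and are not part of the original question. The final loop only shows that td.className = "detb" can be matched once the attributes are intact.

```vba
Sub SaveBodyHtml()
    ' Hypothetical paths; replace them with your own locations.
    Const HTML_PATH As String = "C:\temp\page.html"
    Const OUT_PATH As String = "C:\temp\page_body.txt"

    Dim mybrowser As Object
    Dim htmldoc As Object
    Dim cell As Object
    Dim f As Integer

    Set mybrowser = CreateObject("InternetExplorer.Application")
    With mybrowser
        .Navigate HTML_PATH

        ' Assumption: wait for the load to finish before touching .Document
        ' (4 = READYSTATE_COMPLETE).
        Do While .Busy Or .ReadyState <> 4
            DoEvents
        Loop
        Set htmldoc = .Document

        ' Print # writes the string as-is, without adding quotation marks.
        f = FreeFile
        Open OUT_PATH For Output As #f
        Print #f, htmldoc.body.innerHTML
        Close #f

        ' List the cells the question is trying to find.
        For Each cell In htmldoc.getElementsByTagName("td")
            If cell.className = "detb" Then Debug.Print cell.innerText
        Next cell

        .Quit
    End With
End Sub
```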

https://msdn.microsoft.com/en-us/library/office/gg264524(v=office.14).aspx

Unlike the Print # statement, the Write # statement inserts commas between items and quotation marks around strings as they are written to the file. You don't have to put explicit delimiters in the list. Write # inserts a newline character, that is, a carriage return–linefeed (Chr(13) + Chr(10)), after it has written the final character in outputlist to the file.
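The difference is easy to see by writing the same string with both statements. The sketch below is only an illustration: the procedure name and the output file name are made up, and the note about embedded quotes being doubled is inferred from the output shown in the question, since the quoted documentation does not spell it out.

```vba
Sub CompareWriteAndPrint()
    Dim html As String
    Dim f As Integer

    ' The string value is: <td class="detb">mar 13</td>
    html = "<td class=""detb"">mar 13</td>"

    ' Hypothetical output file in the user's temp folder.
    f = FreeFile
    Open Environ$("TEMP") & "\write_vs_print.txt" For Output As #f

    ' Write # wraps the string in quotation marks; the embedded quotes
    ' should come out doubled, as in the question's output.
    Write #f, html

    ' Print # writes the string unchanged.
    Print #f, html

    Close #f
End Sub
```

Opening the resulting file in a text editor should show the two lines side by side, which makes it clear why a file produced with Write # no longer parses as the original HTML.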

