Say I have an HTML table whose contents vary (product titles, prices, and ratings in no particular order). Sometimes a cell in the "Price:" or "Rating:" column has no value. I scrape each column into its own array and then send the columns to a ListView, so when a value is missing from, say, the "Price:" column, the columns no longer line up row by row.
Here is an example of an HTML table that I'm trying to scrape data from. Notice that row #5 is missing its price. This is what throws the data out of sync when it is added to the ListView in VB.NET:
HTML:
<html>
<head>
<style>
table {
margin:auto;
margin-top:50px;
font-family: arial, sans-serif;
border-collapse: collapse;
width: 40%;
}
td{
border: 3px solid #000;
text-align: left;
padding: 3px;
}
th {
border: 3px solid #000;
background-color:gold;
text-align: left;
padding: 3px;
}
tr:nth-child(even) {
background-color: #dddddd;
}
</style>
</head>
<body>
<table>
<tr><th>#</th><th>Product Title:</th><th width="60">Price:</th><th width="60">Rating:</th></tr>
<tr><td width="20">1</td><td class="ProductTitle">Minera Natural Dead Sea Salt, 5lbs Bulk Bag - Fine Grain</td><td class="Price">$20.00</td><td class="Rating">9/10</td></tr>
<tr><td width="20">2</td><td class="ProductTitle">Minera Dead Sea Salt 2lb Bag Fine Grain, 100% Pure Mineral Salt Treatment</td><td class="Price">$9.99</td><td class="Rating">6/10</td></tr>
<tr><td width="20">3</td><td class="ProductTitle">Minera Pure Dead Sea Salt 10lbs Fine Grain</td><td class="Price">$15.95</td><td>8/10</td></tr>
<tr><td width="20">4</td><td class="ProductTitle">Dead Sea Warehouse - Amazing Minerals Dead Sea Bath Salts, Temporary Relief from...</td><td class="Price">$16.00</td><td class="Rating">5/10</td></tr>
<tr><td width="20">5</td><td class="ProductTitle">Natural Planet Dead Sea Salt, 5lbs Fine Grain - 100% Pure Bath Salt - For Psoriasis...</td><td></td><td class="Rating">5/10</td></tr>
<tr><td width="20">6</td><td class="ProductTitle">Art Naturals Himalayan Salt Body Scrub 20oz -Deep Cleansing Exfoliator With Shea...</td><td class="Price">$13.95</td><td class="Rating">7/10</td></tr>
<tr><td width="20">7</td><td class="ProductTitle">Dead Sea Salt 2.2lb try for Psoriasis, Eczema, and Dermatitis (1 x Resealable...</td><td class="Price">$9.99</td><td class="Rating">4/10</td></tr>
<tr><td width="20">8</td><td class="ProductTitle">Premier Dead Sea Aromatherapy Mineral Body Treatment, Silver, Salt Scrub, 425...</td><td class="Price">$15.95</td><td class="Rating">8/10</td></tr>
<tr><td width="20">9</td><td class="ProductTitle">Dead Sea Warehouse - Amazing Minerals Dead Sea Bath Salts, Temporary Relief from...</td><td class="Price">$16.00</td><td class="Rating">6/10</td></tr>
<tr><td width="20">10</td><td>Natural Planet Dead Sea Salt, 50lbs Fine Grain - 100% Pure Bath Salt - For Psoriasis...</td><td class="Price">$90.25</td><td class="Rating">10/10</td></tr>
</table>
</body>
</html>
Here is the VB.NET code I'm using to collect the data from this table:
VB.NET:
Imports System.Text.RegularExpressions

Public Class Form1
    Dim ITEM As New ListViewItem
    Dim ProductTitle As String
    Dim ProductPrice As String
    Dim ProductRating As String

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        ListView1.Items.Clear()
        ProductTitle = ""
        ProductPrice = ""
        ProductRating = ""
        Dim keyword As String = TextBox1.Text
        keyword = keyword.Replace(" ", "+")
        Try
            'This is the HTML table that I'm talking about:
            Dim html As String = "THE HTML TABLE SPECIFIED"

            'Product Title:
            Dim regx1 As New Regex("td class=""ProductTitle"">.+?</td>", RegexOptions.IgnoreCase)
            Dim matches1 As MatchCollection = regx1.Matches(html)
            For Each match1 As Match In matches1
                ProductTitle += match1.Value & "^"
                ProductTitle = ProductTitle.Replace("td class=""ProductTitle"">", "").Replace("</td>", "")
            Next

            'Price (the class name in the table is "Price"):
            Dim regx2 As New Regex("td class=""Price"">.+?</td>", RegexOptions.IgnoreCase)
            Dim matches2 As MatchCollection = regx2.Matches(html)
            For Each match2 As Match In matches2
                ProductPrice += match2.Value & "^"
                ProductPrice = ProductPrice.Replace("td class=""Price"">", "").Replace("</td>", "")
            Next

            'Rating (the class name in the table is "Rating"):
            Dim regx3 As New Regex("td class=""Rating"">.+?</td>", RegexOptions.IgnoreCase)
            Dim matches3 As MatchCollection = regx3.Matches(html)
            For Each match3 As Match In matches3
                ProductRating += match3.Value & "^"
                ProductRating = ProductRating.Replace("td class=""Rating"">", "").Replace("</td>", "")
            Next

            'Create the split & add all items to the ListView.
            'Each column was collected on its own, so a missing cell shortens only that column.
            Dim split1() As String = ProductTitle.Split("^"c)
            Dim split2() As String = ProductPrice.Split("^"c)
            Dim split3() As String = ProductRating.Split("^"c)
            For i As Integer = 0 To split1.Length - 2
                ITEM = ListView1.Items.Add(split1(i))
                ITEM.SubItems.Add(split2(i))
                ITEM.SubItems.Add(split3(i))
            Next
        Catch ex As Exception
            'Show the error instead of silently swallowing it.
            MessageBox.Show(ex.Message)
        End Try
    End Sub
End Class
Again, the problem is that I can't know in advance which rows will be missing a value (such as the price), and a missing value leaves the rows of the ListView out of sync. How could I fix this in the code I've written above? Thanks.
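One direction that might work is to match each table row first and then read the cells inside that row, so a missing value becomes an empty string in its own row instead of shortening a whole column. Below is a minimal sketch of that idea, not a tested fix. It assumes every data row keeps its four <td> cells in the order shown in the sample (#, title, price, rating), even when a cell is empty. The names TableScraper, ParseRows, and CellAt are made up for illustration.

```vbnet
Imports System.Collections.Generic
Imports System.Net
Imports System.Text.RegularExpressions

' Hypothetical helper module: the names here are illustrative only.
Public Module TableScraper
    ' Captures the inner HTML of one <tr>...</tr>, even if it spans several lines.
    Private ReadOnly RowPattern As New Regex("<tr\b[^>]*>(.*?)</tr>",
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)

    ' Captures the inner HTML of one <td>...</td>; (.*?) also matches an empty cell.
    Private ReadOnly CellPattern As New Regex("<td\b[^>]*>(.*?)</td>",
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)

    ' Returns the decoded, trimmed text of the cell at the given position,
    ' or an empty string when the row has fewer cells than expected.
    Private Function CellAt(cells As MatchCollection, index As Integer) As String
        If index >= cells.Count Then Return ""
        Return WebUtility.HtmlDecode(cells(index).Groups(1).Value).Trim()
    End Function

    ' Returns one {title, price, rating} array per data row, so the three
    ' values of a product always travel together.
    Public Function ParseRows(html As String) As List(Of String())
        Dim rows As New List(Of String())
        For Each rowMatch As Match In RowPattern.Matches(html)
            Dim cells As MatchCollection = CellPattern.Matches(rowMatch.Groups(1).Value)
            ' The header row has only <th> cells, so it yields no matches here.
            If cells.Count = 0 Then Continue For
            ' Assumed cell order: 0 = #, 1 = title, 2 = price, 3 = rating.
            rows.Add(New String() {CellAt(cells, 1), CellAt(cells, 2), CellAt(cells, 3)})
        Next
        Return rows
    End Function
End Module
```

In Button1_Click, the three column loops and the Split step could then be replaced by a single loop over TableScraper.ParseRows(html), adding row(0) as the ListView item and row(1) and row(2) as its sub-items. Reading cells by position rather than by class also picks up cells that lack a class attribute, but it would misalign if a page dropped a <td> entirely instead of leaving it empty.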