Resolved: HTML DOM: retrieve only direct children cells within table.



JShor
01-13-2012, 04:51 PM
I'm using this parser:
http://simplehtmldom.sourceforge.net/

I want to get all of the cells that are direct children of the parent table, not all of the cells within the table. This is because I could have nested tables within cells.

Consider the following example:


<table border="1" id="mytable">
<tr>
<td bgcolor="#e8e8e8">1</td>
<td bgcolor="#e8e8e8">2</td>
</tr>
<tr>
<td bgcolor="#e8e8e8">3</td>
<td bgcolor="#e8e8e8">4</td>
</tr>
<tr>
<td bgcolor="#e8e8e8">
<table border="1">
<tr>
<td bgcolor="#fff">11</td>
<td bgcolor="#fff">12</td>
</tr>
<tr>
<td bgcolor="#fff">13</td>
<td bgcolor="#fff">14</td>
</tr>
<tr>
<td bgcolor="#fff">15</td>
<td bgcolor="#fff">16</td>
</tr>
</table>
</td>
<td bgcolor="#e8e8e8">6</td>
</tr>
</table>


Right now, if I execute this code:


// find() returns an array unless an index is given, so ask for the first match.
$t = $html->find('#mytable', 0)->find('td');

foreach ($t as $k) {
    echo $k->innertext;
}


...it will retrieve: 1, 2, 3, 4, 11, 12, 13, 14, 15, 16, 6.

I want it to print ONLY the content of the direct children cells (1, 2, 3, 4, <table>...</table>, 6).

I tried writing a recursive function to get the nested level of the table in the DOM tree, but that didn't work and made the code convoluted. Any suggestions?

JShor
01-13-2012, 07:21 PM
Update: Well, I figured out a solution. Since table cells are the grandchildren of the <table> tag in the DOM tree, the table can be reached from any cell by calling parent() twice. From there I can have the script check whether the cell's parent()->parent() is the currently active <table>; if it is, the cell belongs to that table directly rather than to a nested one.



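$tables = $html->find('table');
$c = 0;

foreach ($tables as $table) {
    // Give every table a unique id so it can be recognized from its cells.
    // Note: this overwrites any id the table already had (e.g. "mytable").
    $c++;
    $table->id = "tbl_$c";

    // Only cells inside this table can have it as their grandparent.
    $tds = $table->find('td');

    echo "Table tbl_$c has the following:<br />";
    foreach ($tds as $td) {
        // td -> tr -> table: two parent() calls reach the owning table.
        if ($td->parent()->parent()->id == "tbl_$c") {
            // Execution for all cells that are direct children of currently active table.
        }
    }
}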


It's probably not the best solution, but hey, it works for the small one-time task I'm trying to accomplish.
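
For anyone landing here later, the same "cell is exactly two levels below the table" idea can also be sketched without relabeling ids. This is only an illustration under some assumptions: it uses PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM, directCells() is a made-up helper name, and the markup is a trimmed-down version of the example table above.

```php
<?php
// Sketch using PHP's built-in DOM extension, NOT Simple HTML DOM.
// directCells() is a hypothetical helper, not part of either library.

function directCells(DOMXPath $xpath, DOMElement $table): array
{
    // Match cells exactly at table > tr > td, plus table > tbody > tr > td
    // in case the markup (or the parser) supplies a <tbody>.
    $nodes = $xpath->query('./tr/td | ./tbody/tr/td', $table);

    $cells = [];
    foreach ($nodes as $td) {
        $cells[] = $td;
    }
    return $cells;
}

// Shortened version of the table from the first post.
$markup = <<<HTML
<table id="mytable">
<tr><td>1</td><td>2</td></tr>
<tr><td><table><tr><td>11</td><td>12</td></tr></table></td><td>6</td></tr>
</table>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($markup);
$xpath = new DOMXPath($doc);

$table = $xpath->query('//table[@id="mytable"]')->item(0);

$labels = [];
foreach (directCells($xpath, $table) as $td) {
    // A cell that directly holds a nested table is shown as a placeholder.
    $hasNested = $xpath->query('./table', $td)->length > 0;
    $labels[] = $hasNested ? '<table>...</table>' : trim($td->textContent);
}

echo implode(', ', $labels), "\n";
```

Because the XPath is evaluated relative to the outer table, the cells of the nested table (11, 12) are never matched, so no parent()->parent() comparison is needed. Whether this is preferable to the approach above probably depends on whether switching parsers is an option.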