Grabbing Table Data From Page
flagbrew
02-02-2001, 07:09 PM
Well, I've been trying to figure this one out for days now. I was hoping to include on my homepage public information from a government webpage that is updated every fifteen minutes or so. The page is http://water.usgs.gov/az/nwis/current?site_no=09506000&pmcode=00060 and I just want to fetch the streamflow data (between td tags) and be able to use it as a string on my page, so my visitors can be more aware of important things such as 'when can I go kayaking?' Is this kind of thing possible, or am I crazy? I've gone to CPAN (confuses me) and DevShed. Any help would be appreciated ;).
petesmc
02-04-2001, 12:28 PM
I would say this is impossible without them letting you do it.
You could ask them; then basically they would set something up, and you would just link to a URL via CGI which displays it.
However, they must set this up first.
gnorthey
02-04-2001, 12:32 PM
As far as I know, they would have to put something on their page with the string in a variable format and then you would go 'fetch' the variable and display it on your page.
If you're a CGI programmer you might know more than me, but I've never seen anything that can do that.
petesmc
02-04-2001, 12:38 PM
Some sites offer this, like the weather service (can't remember the URL). They provided code for you, and basically you requested a URL like:
http://www.weatherserveice.com/location.cgi?location=calafornia
And basically you grabbed it.
I personally know NO CGI at all.
flagbrew
02-04-2001, 12:57 PM
Well, here's what I found, copied from http://www.htmlwizard.net/resources/phpMisc/scripts/plain/html_parse.php3
<?
/*
void html_parse(string html, string element_handler, string tag)
This function opens and parses $html_file for $tag
and returns its content and its attributes to the
callback function $element_handler.
$element_handler is a custom function which acts upon
the content and the attributes of $tag and gets called
every time $tag is closed. It must accept the following parameters:
- $attributes (attributes of $tag)
- $content (content of the element $tag)
Addition 1999-01-22:
$html_file no longer needs to be a file but can also be a string.
*/
function html_parse($html_file, $element_handler, $tag)
{
    /*
    This is now quite fast (< 1 sec), but it loads the entire file into memory.
    */
    if (file_exists($html_file))
    {
        $fd = fopen($html_file, "r") or die("Error: Unable to open file $html_file");
        $file_content = fread($fd, filesize($html_file));
        fclose( $fd );
        $file_content = stripslashes($file_content);
    }
    else
    {
        $file_content = $html_file;
    }
    while ($full_tag = strstr($file_content, "<$tag"))
    {
        $full_tag = substr($full_tag, 0, strpos($full_tag, "</$tag>")+strlen($tag)+3);
        $open_tag = substr($full_tag, 0, strpos($full_tag, ">")+1);
        $open_tag = ereg_replace("[<>]|$tag", "", $open_tag);
        // Split the string into key/value pairs: first split it into key=value,
        // later into a hash.
        $tmp_array = split ("[$\"] +", $open_tag);
        for ($i=0; $i<count($tmp_array); $i++)
        {
            $tmp_array[$i] = trim($tmp_array[$i]);
            $tmp_array[$i] = ereg_replace("\"", "", $tmp_array[$i]);
            $tmp_attribs = split("=", $tmp_array[$i]);
            for ($j=0; $j<count($tmp_attribs); $j++)
            {
                // Don't add empty pairs :)
                if ($tmp_attribs[$j] != "" && $tmp_attribs[$j+1] != "")
                {
                    $attribs[trim($tmp_attribs[$j])] = trim($tmp_attribs[$j+1]);
                }
            }
        }
        $content = substr ($full_tag, strpos($full_tag, ">"));
        $content = substr($content, 1, strpos($content, "</$tag>")-1);
        $element_handler($attribs, trim($content));
        $file_content = substr($file_content, strpos($file_content, "</$tag>")+strlen($tag)+3);
    }
}
/*
* Example usage:
    require("html_parse.php3");
    function my_handler($attribs, $content)
    {
        echo $content;
    }
    html_parse("index.html", "my_handler", "title");
*
*/
?>
This has the ability to grab things between whatever tag you name (<title> in this case), but I was hoping to be able to pick out, say, the third <td> tag encountered. I'll keep playing with it.
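Picking out "the third <td>" can be sketched with a single pattern match over the whole page. This is only an illustration under assumptions: nth_td() is a hypothetical helper that is not part of html_parse.php3, the sample markup is made up, and a simple pattern like this will not cope with nested tables.

```php
<?php
// Hypothetical helper (not part of html_parse.php3): return the text of the
// $n-th <td> element (1-based) in $html, or null if there are fewer cells.
function nth_td(string $html, int $n): ?string
{
    // Flags: "i" ignores case, "s" lets "." match newlines as well.
    // <td\b[^>]*> accepts attributes; (.*?) is non-greedy, so each match
    // stops at the first closing </td> it meets.
    if ($n < 1 || !preg_match_all('#<td\b[^>]*>(.*?)</td>#is', $html, $matches)) {
        return null;
    }
    if (!isset($matches[1][$n - 1])) {
        return null;
    }
    // Drop any inner markup and surrounding whitespace.
    return trim(strip_tags($matches[1][$n - 1]));
}

// Made-up sample markup; the real page will look different.
$sample = '<table><tr><td>02/04</td><td>12:00</td><td align="right"> 215 </td></tr></table>';
echo nth_td($sample, 3), "\n";
```

Counting cells by position is fragile: if the source page adds or removes a column, the index has to change with it.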
One definite problem with this is that when the $html string points to something outside the directory you're in, the script doesn't work. If, in the example, index.html was replaced with http://www.flagbrew.com/ it wouldn't work. Is this some kind of inherent PHP problem with reserved characters or something? Thanks for the help!!
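A likely explanation, though nobody in the thread confirms it: html_parse() uses file_exists() to decide whether its argument is a file, and file_exists() returns false for http:// URLs, so the URL text itself ends up being parsed as if it were the HTML. A minimal sketch of a loader that checks for a URL first follows. load_html() is a hypothetical name, and reading a URL this way assumes allow_url_fopen is enabled on the server.

```php
<?php
// Hypothetical helper: resolve $source to an HTML string.
// $source may be an http(s) URL, a local file path, or raw HTML.
function load_html(string $source): string
{
    $is_url = preg_match('#^https?://#i', $source) === 1;
    if ($is_url || is_file($source)) {
        // Reading a URL this way needs allow_url_fopen enabled in php.ini.
        $content = @file_get_contents($source);
        if ($content === false) {
            throw new RuntimeException("Unable to read $source");
        }
        return $content;
    }
    // Anything else is treated as markup that is already in memory.
    return $source;
}

// Raw markup passes through unchanged.
echo load_html('<title>Home</title>'), "\n";
```

The returned string could then be handed to html_parse(), which already accepts raw HTML in place of a file name.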
flagbrew
02-04-2001, 09:51 PM
Well, I think I have found the solution to the question, and yes, it is possible to do this. The following script will allow you to glean the text between the tags. In this example it's the <td> tags being output from index.html.
<HTML>
<BODY>
<?php
$page = file("index.html");
if (!$page) {
    echo "<p>Unable to open remote file.\n";
    exit;
}
while ( list( $num, $line ) = each( $page ) ) {
    if (eregi("<td>(.*)</td>", $line, $out)) {
        echo "<BR>".$num." ".$out[1]."\n";
    }
}
?>
</BODY>
</HTML>
I must say this is a simpler script than the previous example. If you want to see it in action you can check it at http://www.flagbrew.com/rivers.php. Thought I'd tell you I found a solution.
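For anyone running the script above on a current PHP: each() was removed in PHP 8.0 and the ereg family (including eregi()) in PHP 7.0. The same line-by-line scan can be sketched with foreach and preg_match(). td_lines() is a hypothetical name, and the sample lines are made up to stand in for the result of file("index.html").

```php
<?php
// Hypothetical helper: collect the contents of the first <td> that opens and
// closes on each line, keyed by zero-based line number.
function td_lines(array $lines): array
{
    $found = [];
    foreach ($lines as $num => $line) {
        // (.*?) is non-greedy, so a line holding several cells yields the
        // first cell only rather than everything up to the last </td>.
        if (preg_match('#<td\b[^>]*>(.*?)</td>#i', $line, $out)) {
            $found[$num] = $out[1];
        }
    }
    return $found;
}

// Made-up lines standing in for file("index.html").
$page = ["<table>\n", "<tr><td>Flow</td><td>215</td></tr>\n", "</table>\n"];
foreach (td_lines($page) as $num => $cell) {
    echo $num, " ", $cell, "\n";
}
```

Like the original, this misses cells whose opening and closing tags sit on different lines.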
petesmc
02-17-2001, 07:51 PM
Here's mine:
<?
$url = 'http://water.usgs.gov/az/nwis/current?site_no=09506000&pmcode=00060';
$lines_array = file($url);
$lines_string = implode('', $lines_array);
eregi("<!-- start of data table -->(.*)</TABLE><HR>", $lines_string, $head);
echo $head[0];
?>
Peter
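The script above relies on eregi(), which no longer exists in PHP 7 and later, and its greedy (.*) runs to the last "</TABLE><HR>" on the page. The same idea of cutting out everything between two known markers can be sketched with plain string functions. slice_between() is a hypothetical name and the sample fragment is made up; only the two marker strings come from the post.

```php
<?php
// Hypothetical helper: return the text between the first $start marker and
// the next $end marker after it (case-insensitive), or null if either is missing.
function slice_between(string $html, string $start, string $end): ?string
{
    $from = stripos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = stripos($html, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($html, $from, $to - $from);
}

// Made-up page fragment; the marker strings are the ones used in the post.
$sample = "<p>intro</p><!-- start of data table --><TABLE><TR><TD>215</TD></TR></TABLE><HR><p>footer</p>";
echo slice_between($sample, '<!-- start of data table -->', '</TABLE><HR>'), "\n";
```

Unlike $head[0] in the post, the result here excludes the markers themselves, so the closing </TABLE> would need to be added back before echoing it into a page.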