
Checker - help requested!



Thanks Pam, Jock, etc. for your reports on how the checker works (or
doesn't work...). In response to your reported observations, I have made
a couple of adjustments in ring.pm, and if you upload Ringlink now, you
can simply replace the previous ring.pm with the adjusted one. I have
also replied to a couple of your postings with more specific comments.
Honestly, I'm a little disappointed with the results. Maybe I should have
realized that creating a checker isn't easy...
Let me give you a simple description of how it is intended to work. When
checking a site, it first makes sure that the server is up. If it is,
Ringlink requests the site's URL from the server, and the returned
information (ideally the contents of the HTML page...) is stored in a
Perl variable. Finally, in the page where the Ringlink HTML fragment is
supposed to be, it checks that the Ringlink links selected before the
checker was started are correct.
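
In rough terms, those three steps look like the sketch below. It is
written in Python rather than Perl and is not the actual code from
ring.pm; the function names fetch_raw and missing_links, the request
format and the example.com links are all made up for illustration.

```python
import socket
from html.parser import HTMLParser
from urllib.parse import urlsplit


def fetch_raw(url, timeout=10):
    """Steps 1 and 2: connect to the server, request the URL, keep the reply."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # If the server is down, create_connection raises OSError.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Assumed request format; ring.pm may send something different.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Status line, headers and body, all in one string.
    return b"".join(chunks).decode("latin-1")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def missing_links(html, expected_links):
    """Step 3: return the expected ring links that are not in the page."""
    collector = LinkCollector()
    collector.feed(html)
    found = set(collector.hrefs)
    return [link for link in expected_links if link not in found]
```

A site would pass the check when missing_links returns an empty list for
the links that were selected for checking.
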
Believe it or not, the checker actually works sometimes. :) For
instance, if you log into the "demo" ring at my site and run the
checker, you will find that the result is accurate. Even Site 1 passes
the check, and if you go to
http://www.gunnar.cc/ringlink/demo/site1.html and check out the "list"
link in the source code, you'll understand what I mean.
But in real life, it appears that web servers behave in many different
ways, and often they don't return the HTML code in response to a request
from the Ringlink program. For instance, one of the URLs that Pam
reported as failing, though it should have passed, is
http://www.pamster.addr.com/sogmdedication.htm. When requested from a
browser, that URL is obviously working, but in response to the request
from the Ringlink checker, the server sends the following:
    HTTP/1.1 404 Not Found
    Date: Wed, 08 Nov 2000 17:21:03 GMT
    Server: Apache/1.3.6 Ben-SSL/1.36 (Unix) mod_perl/1.21 PHP/3.0.12 FrontPage/4.0.4.3
    Connection: close
    Content-Type: text/html
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <HTML><HEAD>
    <TITLE>404 Not Found</TITLE>
    </HEAD><BODY>
    <H1>Not Found</H1>
    The requested URL /sogmdedication.htm was not found on this server.<P>
    </BODY></HTML>
There's not much of an HTML fragment to check there, is there?
The reasons for most of the "Invalid URL" results are server responses
similar to the above. If you upload the adjusted ring.pm, you will be
able to see the response code from the first line of the server
response. If that code is 3xx, 4xx or 5xx, the checker assumes that the
actual HTML code has not been received, and accordingly it doesn't look
for any HTML fragment.
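
The rule can be sketched roughly as follows. Again this is Python for
illustration only, not the code in ring.pm; the names status_code and
should_check_html are made up, and treating an unreadable status line as
"don't check" is my own assumption.

```python
import re

# Matches a status line such as "HTTP/1.1 404 Not Found".
STATUS_LINE = re.compile(r"^HTTP/\d+\.\d+\s+(\d{3})\b")


def status_code(raw_response):
    """Return the numeric code from the first line, or None if unreadable."""
    first_line = raw_response.split("\n", 1)[0].strip()
    match = STATUS_LINE.match(first_line)
    return int(match.group(1)) if match else None


def should_check_html(raw_response):
    """True only when the response looks like it carries the real page."""
    code = status_code(raw_response)
    if code is None:
        # Assumption: no readable status line means the body can't be trusted.
        return False
    # 3xx, 4xx and 5xx mean the actual HTML code was not received.
    return not (300 <= code <= 599)
```

With the response quoted above, status_code gives 404, so the checker
would skip the search for the HTML fragment.
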
Anyway, there is apparently a need for a more reliable way to request
the information from the servers. And here we have reached the point
where I'm feeling helpless.
So if anybody on the list could provide some help in this respect, I'd
be very grateful. The present code is in the subroutine "checksites" in
ring.pm.
/ Gunnar

Follow-Ups from:
"Boatbuilding Ring" <boatbuilding@boatbuildingring.com>
