My efforts to download the site have come to an end. With WD4W I managed to download about 718 MB (27,400 files in 1,161 folders) of the site, with some (no idea how many) of the links working well. It seems that any link that goes to a file with "node" in the name has not downloaded (or is not linked correctly). I am not quite sure how all the links have been redirected in WD4W, so some of the pages that don't link correctly might still have been downloaded.
Does anyone know a way to run a link check on my downloaded files? Maybe I could somehow import the site (the files I have downloaded) into Dreamweaver and run some kind of link check, or something to make the links relative to the root folder?
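One way to do a link check without Dreamweaver is a small script run over the download folder. The following is only a rough sketch in Python, not anything WD4W provides: the names `LinkCollector` and `find_broken_links` are made up for illustration, and it assumes the pages are saved as .htm/.html files and that root-relative links (starting with "/") should resolve against the download folder. It only reports links whose target file is missing on disk; it does not try to repair them.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collects href/src attribute values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_broken_links(root):
    """Return sorted (page, link) pairs whose local target file is missing."""
    root = Path(root).resolve()
    broken = []
    for page in root.rglob("*.htm*"):
        if not page.is_file():
            continue
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        for link in collector.links:
            try:
                parts = urlsplit(link)
            except ValueError:
                continue  # malformed link, nothing sensible to check
            # Skip external links (http:, mailto:, javascript:) and pure #anchors.
            if parts.scheme or parts.netloc or not parts.path:
                continue
            path = unquote(parts.path)
            if path.startswith("/"):
                # Assumption: root-relative links point into the download folder.
                target = root / path.lstrip("/")
            else:
                target = page.parent / path
            if not target.exists():
                broken.append((page.relative_to(root).as_posix(), link))
    return sorted(broken)
```

Calling `find_broken_links` with the path of the download folder returns a list of (page, link) pairs, which can be printed or counted to see how many of the /node/ links are actually missing. Note that the query string is dropped before checking, so links whose saved file name includes the query part may be reported as broken when they are not.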
If anyone would like a copy of the files then just PM me and we will try to sort something out.
You might find that /node/ is listed as an exclusion in robots.txt ... this is often the case with these sites (having fun with the vids from Teachers' TV atm)
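To test that idea, the rules in a robots.txt file can be checked with Python's standard `urllib.robotparser` module. This is just a sketch: the helper name `is_allowed`, the sample rules, and the www.example.com address are placeholders, since the real site's file is not shown in this thread.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_text, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, NOT taken from the real site.
SAMPLE = """\
User-agent: *
Disallow: /node/
"""

print(is_allowed(SAMPLE, "http://www.example.com/node/165326.html"))  # False
print(is_allowed(SAMPLE, "http://www.example.com/index.html"))        # True
```

A robots.txt file is normally served from the root of the site (/robots.txt), so it can also be opened directly in a browser. To check the live file from Python instead of pasted text, `RobotFileParser` has `set_url()` and `read()` methods that fetch it before `can_fetch()` is called.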
Hi Grumbledook, I have just checked to see if the site has a robots.txt file, but it does not seem to have downloaded (that is not to say the site does not have one, I just can't find it). Inside the node folder there are 55 objects, all with relevant data, but I think there were meant to be hundreds of files inside that folder. They have weird names like '335017@uc=force_deep.html' and '165326.html'.
The downloaded site seems to trip up when you hit a link that has some sort of redirect (not sure whether the redirect is at fault or the file it is trying to reach simply has not downloaded). I have tried to look at the link while it is redirecting, but it flashes by so fast that I cannot see it (and it does not show up in the history). Any idea how I could view the link the page is trying to find?
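If the redirect is written into the saved page itself, the target can be read straight out of the file rather than caught in the browser. The sketch below is a guess at how that could be done in Python: it assumes the redirect is either a meta refresh tag or a simple JavaScript `location` assignment, and the name `find_redirect_targets` is made up for illustration. If the original site redirected on the server side, nothing will show up in the saved file.

```python
import re

# Matches <meta http-equiv="refresh" ...> tags.
META_REFRESH = re.compile(
    r"""<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*>""", re.IGNORECASE
)
# Pulls the url=... part out of a meta refresh tag.
META_URL = re.compile(r"""url\s*=\s*['"]?([^'"\s>;]+)""", re.IGNORECASE)
# Matches location = "...", location.href = "...", location.replace("...").
JS_LOCATION = re.compile(
    r"""location(?:\.href)?\s*(?:=|\.replace\(|\.assign\()\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def find_redirect_targets(html):
    """Return the redirect targets found in one page's HTML source."""
    targets = []
    for tag in META_REFRESH.findall(html):
        match = META_URL.search(tag)
        if match:
            targets.append(match.group(1))
    targets.extend(JS_LOCATION.findall(html))
    return targets


# Made-up page source for illustration only.
page = '<meta http-equiv="refresh" content="0; url=node/165326.html">'
print(find_redirect_targets(page))
```

Reading a saved page with `open(...).read()` and passing the text to `find_redirect_targets` lists whatever targets it finds, which can then be compared with the files that are actually in the node folder. Opening the page in a plain text editor and searching for "refresh" or "location" does the same job by hand.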
Did any of you Edugeekers get any further with this?