Test this morning starting at 9:30 on the dot!!
After that I will mark everyone's assignments that have been submitted so far and give you feedback as I go.
Click here to see the example code
Here are this week's notes and exercises
This week we are looking at how we can present and format our XML documents using CSS and XSLT
While most of the time we want our online content to appear in Google and other search engines, there are other times when you may want to make sure that certain pages and folders, or even the whole site, do not. For example, if you have sensitive content that you do not want appearing in search engines, or if the site is just a test one, then you probably wouldn’t want Google to be promoting it all over the internet.
If this is the case you can use a robots.txt file to tell the robots which pages or folders of your site they should not index.
A robots.txt file is, as the name suggests, a plain text file which simply tells any robots (any piece of software which crawls the net retrieving content and information) what they can and can’t index. You should be aware though that malicious bots will simply ignore robots.txt files, so if you are trying to protect sensitive content from anything other than search engines then a robots.txt will probably not be enough.
But for our simple requirement of keeping our webpages out of the search engines it should be fine.
Another important thing to be aware of is that the robots.txt file must go in the root directory of the site, i.e. the root of the domain, because this is where the robots will look for it. They won’t bother looking anywhere else and will simply index the full site if they cannot find a robots.txt to tell them otherwise.
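To make the "root of the domain" point concrete, here is a small Python sketch that works out where a robot would look for robots.txt given any page on a site. The function name `robots_url` and the example.com address are just made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the address where a robot would look for robots.txt.

    Hypothetical helper: it keeps the scheme and host of the page
    and replaces everything else with /robots.txt.
    """
    parts = urlsplit(page_url)
    # Drop the page's own path, query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Illustrative address only; any page on the same domain gives the same result.
print(robots_url("https://www.example.com/blog/2012/post.html"))
# prints https://www.example.com/robots.txt
```

So a robots.txt sitting in a subfolder such as /blog/ would never be looked at.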
You can either open Notepad and type in the directives or use a robots.txt creator such as this one at seobook.com. Either way the resulting file is incredibly simple. For example, if we want to stop all search engines from indexing any of our files then the robots.txt would just contain the following:
User-Agent: *
Disallow: /
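If you want to check what a set of rules actually blocks before you upload it, Python's standard library has a robots.txt parser you can try it out with. This is only a rough sketch: the `build_parser` helper, the `/private/` folder and the example.com addresses are made up for illustration, and real search engines may read the rules slightly differently.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Hypothetical helper: load robots.txt rules from a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The two-line file from the notes: keep every robot out of everything.
block_all = build_parser("User-Agent: *\nDisallow: /\n")

# A variation that only blocks one folder (the folder name is made up).
block_folder = build_parser("User-Agent: *\nDisallow: /private/\n")

print(block_all.can_fetch("Googlebot", "http://www.example.com/index.html"))
# prints False
print(block_folder.can_fetch("Googlebot", "http://www.example.com/private/notes.html"))
# prints False
print(block_folder.can_fetch("Googlebot", "http://www.example.com/index.html"))
# prints True
```

The second set of rules shows how you would block just certain folders rather than the whole site.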
For further information just do a quick search on Google.