Company XYZ is going to launch a new line of widgets. All employees have been operating under strict secrecy for months – communicating with each other using encrypted email and password-protected areas of the company website. Nothing has been overlooked, except Google.
How can this be, you ask? Why should XYZ worry about a search engine? The answer lies in the very nature of how Google operates. Googlebot, the program Google uses to scour the web, looks at everything – not just webpages. Any file hosted on a webserver is fair game. Googlebot indexes the content of files such as PDFs and MS Word documents and adds that content to Google’s search database.
XYZ has gone to great lengths to ensure the confidentiality of its information. Unfortunately, thanks to Googlebot, it has not gone far enough. Simply typing the keywords “confidential” and “do not distribute” into a Google search yields over 60,000 confidential documents.
The lesson to be learned is simple – do not put sensitive information on your webserver.
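One way to act on that lesson is to audit the webserver's document directory before a crawler does. The following is only a minimal sketch, not a complete audit tool: the function name `find_exposed_documents`, the list of file extensions, and the idea of scanning raw bytes are all assumptions made for illustration. A raw byte search can miss text inside compressed PDF or Word files, so any listed document deserves a manual look even when no marker phrase is found.

```python
import os

# Assumed for illustration: extensions of document types a crawler might index.
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".ppt", ".txt"}

# Marker phrases taken from the search example in the article.
MARKERS = (b"confidential", b"do not distribute")


def find_exposed_documents(web_root):
    """Return (path, has_marker) pairs for document files under web_root.

    has_marker is True when the raw file bytes contain one of MARKERS
    (case-insensitive). Compressed document formats may hide such text,
    so False does not prove a file is safe to publish.
    """
    findings = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        for name in sorted(filenames):
            extension = os.path.splitext(name)[1].lower()
            if extension not in DOCUMENT_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                content = handle.read().lower()
            has_marker = any(marker in content for marker in MARKERS)
            findings.append((path, has_marker))
    return findings
```

Calling `find_exposed_documents("/var/www/html")` (a hypothetical document root) lists every matching document under that directory, which gives a starting point for deciding what should be moved off the webserver entirely.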