Sample Datasets for Hadoop Testing and Evaluation

To do any development on Hadoop effectively, it's important to have a dataset that you can work with.  Here are a few links to datasets on the web.  Start with basic CSV datasets and work your way up to more complicated formats, like XML and JSON.

A full Employee Dataset for MySQL: https://dev.mysql.com/doc/employee/en/employees-installation.html

XML Datasets: http://www.cs.washington.edu/research/xmldatasets/

Build your own: https://github.com/dstreev/hdp-data-gen
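
If you only need a small CSV file to get started, a few lines of Python may be enough. The sketch below is not how hdp-data-gen works; it is a minimal stand-alone illustration. The column names (emp_id, department, salary), the value ranges, and the output file name employees.csv are all assumptions made up for this example.

```python
import csv
import random


def generate_employee_rows(count, seed=0):
    """Yield `count` synthetic employee records as dicts.

    Every field here is invented for illustration; adjust the
    columns and value ranges to match the data you want to test.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same file
    departments = ["engineering", "sales", "finance", "support"]
    for emp_id in range(1, count + 1):
        yield {
            "emp_id": emp_id,
            "department": rng.choice(departments),
            "salary": rng.randint(30000, 150000),
        }


def write_csv(rows, out):
    """Write rows to the open text file `out`; return the number of data rows."""
    writer = csv.DictWriter(out, fieldnames=["emp_id", "department", "salary"])
    writer.writeheader()
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written


if __name__ == "__main__":
    # newline="" lets the csv module control line endings itself.
    with open("employees.csv", "w", newline="") as f:
        write_csv(generate_employee_rows(1000), f)
```

The resulting file can then be copied into HDFS with hdfs dfs -put employees.csv followed by a target directory of your choice.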