Popular Machine Learning Datasets
The links below point to datasets that I collected from the web or from papers. Feel free to play with them!
In most machine learning tasks, data comes from one of three sources: it is provided by an organization (public or private), generated by oneself, or crawled from web pages.
Here I list some websites that host public datasets for others to explore.
Within Programming Languages
R: data()
Python: scikit-learn
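As a minimal sketch of the Python route, assuming scikit-learn is installed: the snippet below loads one of the small datasets bundled with the library. The Iris dataset is only an illustrative choice on my part, not one singled out in this list.

```python
from sklearn.datasets import load_iris

# Load the Iris dataset that ships with scikit-learn (no download required).
iris = load_iris()

# Feature matrix and class labels as NumPy arrays.
X, y = iris.data, iris.target

print(X.shape)                   # (number of samples, number of features)
print(y.shape)                   # one label per sample
print(list(iris.feature_names))  # column names for X
print(list(iris.target_names))   # class names for y
```

In R, calling data() with no arguments lists the datasets available in the currently loaded packages.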
Dataset Repositories
UCI: start with the Most Popular Data Sets (recommended)
Online Recognition
IAM On-Line Handwriting Database
Image Recognition
Text Generation/Classification
Machine Translation
Machine QA
See this paper’s appendix for data generation.