Take my journals, and run a relatively simple word separation algorithm over them.
Shuffle up those words and pay to have them annotated.
Reconstruct the dataset from there.
Take my journals, and run a relatively simple word separation algorithm over them.
Shuffle up those words and pay to have them annotated.
Reconstruct the dataset from there.