Are the existing training corpora unnecessarily large?