Susan Li
1 min read · Nov 22, 2018


By default, max_df is 1.0, which means “ignore terms that appear in more than 100% of the documents”, so the default filters out nothing. By contrast, min_df=5 means “ignore terms that appear in fewer than 5 documents”.
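To make the two thresholds concrete, here is a minimal pure-Python sketch of document-frequency filtering. The function `filter_vocabulary` and the toy corpus are hypothetical, written only for illustration. The sketch assumes the usual convention that an integer threshold is an absolute document count and a float is a proportion of the documents. It is not the library's own implementation.

```python
from collections import Counter


def filter_vocabulary(docs, min_df=1, max_df=1.0):
    """Hypothetical helper: keep terms whose document frequency is in range.

    Assumption: an int threshold is an absolute number of documents,
    a float threshold is a proportion of all documents.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc.split()))
    low = min_df if isinstance(min_df, int) else min_df * n_docs
    high = max_df if isinstance(max_df, int) else max_df * n_docs
    # Terms below `low` are too rare; terms above `high` are too common.
    return sorted(t for t, df in doc_freq.items() if low <= df <= high)


docs = [
    "apple banana",
    "apple cherry",
    "apple banana date",
    "apple elderberry",
]

# Defaults: max_df=1.0 removes nothing, since no term can exceed 100%.
vocab_default = filter_vocabulary(docs)
# min_df=2 drops terms that appear in fewer than 2 documents.
vocab_min = filter_vocabulary(docs, min_df=2)
# max_df=0.75 drops terms that appear in more than 75% of the documents.
vocab_max = filter_vocabulary(docs, max_df=0.75)
```

In this toy corpus "apple" occurs in all four documents, so only a max_df below 1.0 can remove it, while the terms that occur in a single document are the ones min_df=2 removes.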



Changing the world, one post at a time. Sr Data Scientist, Toronto Canada. https://www.linkedin.com/in/susanli/