Natural Language Processing (often referred to as 'NLP') is the application of computational techniques to the analysis and synthesis of natural language and speech.
This lesson uses word embeddings and clustering algorithms in Python to identify groups of similar documents in a corpus of approximately 9,000 academic abstracts. It will teach you the basics of dimensionality reduction for extracting structure from a large corpus and how to evaluate your results.
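To give a feel for the clustering step, here is a minimal sketch using only the Python standard library. It is not the lesson's own code, which presumably relies on dedicated libraries. The `kmeans` helper and the `doc_vectors` values are hypothetical: the vectors stand in for document embeddings that have already been reduced to two dimensions.

```python
import math

def kmeans(points, k, iterations=20):
    """Toy k-means: return one cluster label per point."""
    # Deterministic start (an assumption for this sketch):
    # use the first k points as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels

# Hypothetical 2-D vectors standing in for reduced document embeddings.
doc_vectors = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.1), (0.85, 0.95), (0.2, 0.15)]
labels = kmeans(doc_vectors, k=2)
print(labels)
```

Documents that share a label fall in the same group. On a real corpus the vectors would come from an embedding model followed by dimensionality reduction, and the choice of `k` would need to be evaluated rather than fixed in advance.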
The goal of this course is to introduce key concepts and workflows in Natural Language Processing (NLP) to humanities scholars who have little or no experience with the field.
Researchers often need to search a corpus of texts for a defined list of terms, and historians are often interested in particular places named in a text or texts. This lesson details how to programmatically search documents for a list of terms, including place names, and then how to obtain coordinates and map historical place names with the World Historical Gazetteer.
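The term-search step can be sketched with Python's standard `re` module. This is an illustration under stated assumptions, not the lesson's own code: the `find_terms` helper, the file names, the sample sentences and the `place_names` list are all hypothetical. Looking up coordinates in the World Historical Gazetteer is not shown here.

```python
import re

def find_terms(text, terms):
    """Return {term: [start offsets]} for every term that occurs in text."""
    hits = {}
    for term in terms:
        # Whole-word, case-insensitive match; re.escape guards against
        # punctuation inside a term being read as regex syntax.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        offsets = [match.start() for match in pattern.finditer(text)]
        if offsets:
            hits[term] = offsets
    return hits

# Hypothetical documents and place-name list, for illustration only.
documents = {
    "letter_01.txt": "We sailed from Bristol to Cork, then returned to Bristol.",
    "letter_02.txt": "The market at York was busy.",
}
place_names = ["Bristol", "Cork", "York", "New York"]

results = {name: find_terms(text, place_names) for name, text in documents.items()}
for name, hits in results.items():
    print(name, hits)
```

Matching on word boundaries avoids hits inside longer words, but a short name such as "York" will still match inside "New York", so overlapping entries in a real term list need separate handling.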
Since their beginnings in the 17th century, newspapers have recorded billions of events, stories and personal names every day, in almost every language and country. This course from DariahTeach provides an introduction to digitised historical newspaper analysis, incorporating methods of Natural Language Processing for discovering, exploiting and visualising newspapers.