How can one create a network representation of a book corpus spanning more than two hundred years? In this paper, we present a method based on text data vectorization for building a complex and multifaceted network representation of an early modern corpus of 239 natural philosophy textbooks published in Latin, French, and English. The method works along two dimensions. On the one hand, we use unsupervised methods (namely topic modeling, term frequency–inverse document frequency (TF-IDF), and multilingual word embeddings) to represent the broader features of this corpus, such as the homogeneity of style and linguistic usage, both among works written in the same language and across languages. On the other hand, we use collocate analysis of specific keywords to explore how certain concepts were understood, reshaped, and disseminated in the corpus; we call this the ‘semantic dimension.’ Each of these two dimensions provides a different way of correlating the books via text data vectorization and representing them as a network. Since each dimension is in itself complex and multifaceted, the network we construct for each is a multiplex one, made of several layer-graphs. Furthermore, provided that enough information is available about the authors of the works included in our inventory, this research lays the groundwork for expanding the described network representation into a third multiplex, one that explores some of the social features of the authors in question.
- History of philosophy
- Early modern natural philosophy
- Network analysis
- Text data vectorization
- Semantic features
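
As a rough illustration of the general idea summarized above, the following sketch shows how vectorized texts might be turned into layer-graphs over the same set of books and collected into a multiplex. It is not the pipeline used in this paper. The three toy sentences, the function names, the whitespace tokenizer, the smoothed TF-IDF weighting, the collocation window, and the similarity threshold of 0.5 are all assumptions made for the example. A real early modern corpus would at least need lemmatization and stop-word handling.

```python
import math
from collections import Counter
from itertools import combinations

# Toy stand-ins for books; NOT taken from the corpus described in the abstract.
docs = [
    "motion of bodies follows the laws of nature",
    "the laws of nature govern the motion of bodies",
    "the soul is the form of the living body",
]

def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (term -> weight) per text."""
    tokenized = [text.lower().split() for text in texts]
    n_docs = len(tokenized)
    # Document frequency: in how many texts does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty texts
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors

def collocate_vectors(texts, keyword, window=2):
    """Count the words found within `window` tokens of `keyword` in each text."""
    vectors = []
    for text in texts:
        tokens = text.lower().split()
        counts = Counter()
        for pos, token in enumerate(tokens):
            if token == keyword:
                left = tokens[max(0, pos - window):pos]
                right = tokens[pos + 1:pos + 1 + window]
                counts.update(left + right)
        vectors.append(dict(counts))
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_layer(vectors, threshold):
    """Build one layer-graph: edges (i, j) -> similarity for pairs above threshold."""
    edges = {}
    for i, j in combinations(range(len(vectors)), 2):
        score = cosine(vectors[i], vectors[j])
        if score >= threshold:
            edges[(i, j)] = score
    return edges

# A multiplex here is simply a mapping from layer name to a layer-graph,
# with all layers sharing the same nodes (the indices of the texts).
multiplex = {
    "tfidf": similarity_layer(tfidf_vectors(docs), threshold=0.5),
    "collocates_nature": similarity_layer(
        collocate_vectors(docs, keyword="nature"), threshold=0.5
    ),
}

for layer_name, edges in multiplex.items():
    print(layer_name, sorted(edges))
```

In this toy case both layers link only the first two texts, but the layers are computed independently: one from the overall vocabulary, the other from the immediate context of a single keyword. Keeping them as separate layer-graphs over a shared node set is what allows the two views to be compared or combined later.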