Novel Graph Dataset

A comprehensive dataset of translated novels and their interrelations

A graph of novel relationships: node colors represent novel categories, and edges link a novel to its recommended novels. The graph is generated by choosing a random starting node and following edges outward to a maximum depth of 3.

This small project grew out of my interest in building unique graph-based datasets. It compiles data on translated novels from NovelUpdates. As of the latest update, the dataset covers 21,831 English-translated novels originating from eight languages (Chinese, Japanese, Korean, Malaysian, Filipino, Indonesian, Khmer, and Thai). It includes per-novel statistics, such as chapter counts and rankings, as well as relational data linking novels to one another. These links are particularly valuable because they make it possible to build graph structures for analysis.
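
As a rough illustration of how the relational data can be turned into a graph and sampled in the way the figure describes (random start, expansion to a maximum depth of 3), here is a minimal Python sketch. It is not the project's actual code: the record fields `id` and `recommended`, the function names, and the choice to treat recommendations as directed edges are all assumptions made for the example.

```python
import random
from collections import deque


def build_graph(records):
    """Map each novel id to the list of novel ids it recommends.

    `records` is a hypothetical list of dicts with the keys
    "id" and "recommended"; the real dataset may use other names.
    """
    graph = {}
    for rec in records:
        graph.setdefault(rec["id"], [])
        for target in rec["recommended"]:
            graph[rec["id"]].append(target)
            # Make sure novels that only appear as targets are nodes too.
            graph.setdefault(target, [])
    return graph


def sample_subgraph(graph, max_depth=3, start=None, rng=None):
    """Breadth-first expansion from a start node, limited to max_depth hops.

    Returns a dict of node -> depth and a list of (source, target) edges.
    """
    rng = rng or random.Random()
    if start is None:
        # Pick a random starting node, as in the figure description.
        start = rng.choice(sorted(graph))
    depth = {start: 0}
    edges = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in graph[node]:
            edges.append((node, nxt))
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth, edges


if __name__ == "__main__":
    # Toy records standing in for rows of the dataset.
    toy = [
        {"id": "a", "recommended": ["b"]},
        {"id": "b", "recommended": ["c"]},
        {"id": "c", "recommended": ["d"]},
        {"id": "d", "recommended": ["e"]},
        {"id": "e", "recommended": []},
    ]
    nodes, edges = sample_subgraph(build_graph(toy), max_depth=3, start="a")
    print(nodes)
    print(edges)
```

The returned node and edge lists can be passed to any graph library for drawing; coloring nodes by category would additionally need a category field per novel, which is omitted here.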

The dataset and accompanying code are publicly available on GitHub. I typically update the dataset annually to incorporate new novels and refresh existing information.