Abstract

Ancient Chinese guqin books hold significant historical and academic value within traditional musical literature, representing a key expression of ancient musical aesthetics and artistic accomplishment. However, a lack of systematic organization and in-depth exploration currently limits their study and application. This study adopts digital humanities methodologies to explore feasible approaches to the semantic modeling of ancient Chinese guqin books, aiming to achieve fine-grained organization, intelligent processing, and dynamic inheritance of these texts. The research follows a structured workflow of three main steps: ontology construction, knowledge graph construction, and knowledge graph visualization and application. First, key concepts within ancient Chinese guqin books are defined through ontology, which includes the creation of class hierarchies and relational attributes. Using Protégé, this ontology is constructed and validated to ensure semantic accuracy. Next, the ontology is mapped to the Neo4j graph database to create a knowledge graph that represents multi-dimensional relationships between guqin compositions, related personas, and historical contexts. Finally, the knowledge graph is visualized and queried using Cypher to uncover hidden knowledge and facilitate deeper exploration of ancient Chinese guqin books. The semantic modeling approach proposed in this study represents the complex semantic knowledge embedded in the fragmented and diverse resources of ancient Chinese guqin books in a simplified and intuitive triplet format, which facilitates the effective integration and clear presentation of multi-source knowledge. Additionally, intelligent querying and graph-based reasoning techniques are employed to uncover hidden knowledge associations and support knowledge discovery. These findings not only enhance the understanding of ancient guqin texts but also provide perspective and methodological references for research in related fields of the humanities. This study introduces a novel, systematic approach for the organization and application of ancient Chinese guqin books in the digital age, advancing the preservation and modernization of historical knowledge. It also lays the theoretical foundation for constructing a knowledge resource system in the era of digital intelligence that integrates “knowledge organization, data mining, and interactive perception”, as well as a technology-driven framework that connects knowledge, intelligence, and interconnectivity. The integration of ontology and knowledge graphs offers an actionable framework for knowledge discovery and serves as a valuable reference for future digital humanities applications in the preservation of ancient texts.

1. Introduction

The guqin, also known as the “seven-stringed zither”, is a quintessential Chinese musical instrument with a history spanning thousands of years and profound cultural significance (Yang et al, 2024). Acknowledging its historical and cultural importance, the United Nations Educational, Scientific and Cultural Organization (UNESCO) (2008) proclaimed the guqin and its music a Masterpiece of the Oral and Intangible Heritage of Humanity in 2003 and inscribed it on the Representative List of the Intangible Cultural Heritage of Humanity in 2008, highlighting its role as a symbol of China’s rich and enduring musical traditions (Yong, 2024). Widely regarded as one of the most traditional and influential instruments in Chinese music, the guqin is celebrated for its distinctive musical style, elegant timbre, and profound cultural connotations. It occupies a central place in Chinese cultural heritage, embodying the essence of traditional Chinese aesthetics and philosophy. With its extensive historical legacy, vast collection of related literature, and remarkable state of preservation, guqin music is often described as a “living fossil” of ancient Chinese music, offering a unique window into China’s musical and cultural evolution (Zhang et al, 2023). Moreover, the abundant resources of ancient Chinese guqin books, including instructional manuals, theoretical treatises, and historical records, provide invaluable opportunities for in-depth historical research and cultural exploration, further enriching the study of this extraordinary instrument.

Research on ancient Chinese guqin books, essential historical materials for guqin studies, has traditionally focused on bibliography, literature, and musicology. Bibliographical studies have organized, classified, and preserved these books, emphasizing their role in transmitting historical and cultural heritage. Zhao (2020) and Zhao (2010) have explored their origins, content, and academic value, laying the groundwork for their preservation and reuse. Literary research has examined the symbolic and intrinsic qualities of guqin books within literati culture (Zhao, 2021), while musicological studies have increasingly emphasized the literature’s technical aspects, such as finger techniques and musical notations (Liu, 2023). Additionally, research has delved into the acoustic characteristics and transmission of guqin knowledge across different periods and social strata (Ouyang, 2017; Yan, 2012). However, the compartmentalization of traditional scholarship has created “academic silos”, limiting the comprehensive exploration and reuse of the historical knowledge embedded in these books.

The rise of digital humanities has broken traditional barriers within the humanities by integrating new technologies such as ontology, linked data, knowledge graphs, Geographic Information System (GIS), social networks, artificial intelligence, and blockchain (Cheng and Chou, 2022; Liu et al, 2024; Wang et al, 2022). These innovations have revitalized and transformed the development of information resources, leading to significant changes in knowledge organization (Li et al, 2024b). Projects like Europeana (Baierer et al, 2017), Sampo (Hyvönen, 2023), Beyond 2022 (Debruyne et al, 2022), and Digital Dunhuang (Li and Xia, 2023) have profoundly influenced the frameworks and presentation of traditional humanities resources. Semantic web technologies, including knowledge graphs, ontology, and linked data, have expanded the methods of knowledge organization within the humanities. They have been applied across various regions and fields, such as cultural heritage collections (Li and Bikakis, 2023), ancient architecture (Chen and Ou, 2020), Thai culture (Chansanam and Tuamsuk, 2022; Thunyaluk et al, 2023), ancient Chinese time (Wang et al, 2024), and oral history (Vrachliotou and Papatheodorou, 2024). Furthermore, Kuremoto (2024) has attempted to use deep learning techniques to automatically identify musical elements such as rhythm, tempo, and harmony embedded in guqin music. These studies demonstrate the feasibility of semantic web technologies and deep learning techniques in the knowledge organization of humanities resources.

As an essential component of cultural heritage, ancient books have also been significantly impacted by these developments. The evolution of digital humanities has created favorable conditions for the entire lifecycle of ancient books, from reorganization and processing to development and utilization (Chen and Chang, 2019; Li et al, 2024a). Research on ancient books has rapidly emerged as a critical area of study, encompassing text mining (Joo et al, 2022), knowledge organization (Liu et al, 2023), and the construction of specialized databases or knowledge bases for ancient books (Chen and Chang, 2019; Zhao and Chen, 2023). Semantic web technologies have been extensively applied in the study of ancient books, resulting in exemplary research cases, such as studies on medieval geographical literature (Bartalesi et al, 2023), ancient Greek texts (Foka et al, 2020), Chinese ancient books (Wang et al, 2023), catalogues of collections (Ortolja-Baird et al, 2019), and Dante’s literary texts (Bartalesi and Meghini, 2015). These studies illustrate how semantic web technologies can enhance the organization and sharing of knowledge about ancient books, offering rich practical experiences and methodological references for subsequent research.

The literature reviewed above shows that, in the digital humanities era, the application of digital technologies to the study of ancient books has gained widespread attention and recognition, offering new methods and tools for traditional research on ancient Chinese guqin books within the disciplines of bibliography, literature, and musicology. From the semantic representation and comprehensive interpretation of knowledge concepts based on ontology to the visual representation and intelligent querying of knowledge concepts through knowledge graphs, the semantic knowledge organization model has taken root and yielded numerous theoretical and practical achievements in the field of digital humanities. This provides methodological references for the exploration, organization, and utilization of knowledge within ancient guqin books. Current research has deconstructed, reorganized, and correlated the inherent historical and cultural knowledge contained in ancient books and has made valuable strides in various areas, yet the exploration of specialized, vertical domains, such as ancient guqin books, remains underdeveloped. As a result, accurately uncovering the musical elements embedded within ancient guqin books remains a significant challenge. Furthermore, these texts are often plagued by unclear concepts, a lack of systematic structure, and difficulties in their practical use, all of which hinder their modern development and application. To address these challenges, knowledge organization tailored specifically to ancient guqin books is essential. Such an approach would enable a more structured and effective way to interpret and utilize these valuable cultural resources.

This paper, grounded in an evidence-based perspective, focuses on ancient guqin books, whether fully preserved or partially fragmented, and proposes a semantic modeling methodology and implementation pathway through the analysis of musical structure and the standardization of related concepts. This methodology aims to address the challenges associated with the preservation and study of these texts. Firstly, it employs ontology technology to clarify the conceptual content within ancient guqin books, combining traditional bibliography with a musical perspective to accurately define the knowledge system of these books. Secondly, it constructs a knowledge graph of ancient guqin books using the Neo4j graph database, accurately representing textual knowledge through graph patterns and utilizing graph reasoning to further explore hidden knowledge and facilitate multidimensional knowledge discovery. By conducting research on the organization of knowledge within ancient guqin books, this study aims to promote the development and utilization of historical knowledge in the new era, thereby advancing the study of guqin music. Furthermore, the accessibility of knowledge graphs lowers the barrier to entry, facilitating the transition from specialized research to public dissemination.

2. Materials and Methods

This study utilizes methodologies from the fields of digital humanities to systematically organize, analyze, and leverage ancient Chinese guqin books. The methodology is designed to ensure both transparency and reproducibility, starting with clear definitions of key concepts and followed by detailed explanations of the software tools and methodological steps used throughout the research process.

2.1 Key Concept Definitions

Ontology: An ontology, defined as an explicit specification of a conceptualization, is one of the most widely used methods for semantic knowledge organization and modeling (Gruber, 1993; Osman et al, 2022). The concept of ontology originates from philosophy, where it pertains to the study of being and existence. In the 1990s, ontology was adopted in computer science and information science as a formal representation of structured knowledge within a specific domain. Ontology provides a shared vocabulary and a conceptual framework that enables consistent data modeling and knowledge representation across different systems (Garcia Trelles et al, 2024; Padmavathi and Krishnamurthy, 2017). In this study, ontology is employed to formally describe the knowledge embedded in ancient Chinese guqin books. The ontology design focuses on six key aspects: scenarios, reuse, entities, relations, constraints, and evaluation. This standardized descriptive framework facilitates the construction of a conceptual knowledge system for ancient Chinese guqin books.

Knowledge graph: The term “knowledge graph” gained widespread recognition with Google’s introduction of its Knowledge Graph in 2012, which connected information through entities and their relationships. However, the idea of representing knowledge through graph structures dates back to earlier developments in artificial intelligence and semantic networks during the 1970s and 1980s. A knowledge graph is a data structure that represents entities (nodes) and their interrelationships (edges), enabling the visualization and exploration of complex knowledge networks (Ehrlinger and Wöß, 2016). In this study, the knowledge graph is used to represent the multi-dimensional relationships within guqin books. This structure enhances the visualization of knowledge and allows for the discovery of hidden relationships through graph reasoning, thereby supporting more in-depth academic research.
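
As a minimal illustration of the node-and-edge structure described above, the following sketch stores a few triples in plain Python and indexes them as an adjacency list. All entity and relation names (`Piece_X`, `Book_A`, `recordedIn`, and so on) are hypothetical placeholders chosen for illustration and are not data from this study.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); all names are placeholders.
triples = [
    ("Piece_X", "recordedIn", "Book_A"),
    ("Book_A", "compiledBy", "Person_P"),
    ("Person_P", "livedIn", "Dynasty_D"),
]


def build_graph(triples):
    """Index triples as an adjacency list: node -> [(relation, neighbor)]."""
    graph = defaultdict(list)
    for subj, pred, obj in triples:
        graph[subj].append((pred, obj))
    return graph


graph = build_graph(triples)

# Every subject and object is a node; every triple is a directed, labeled edge.
nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}

for subj, pred, obj in triples:
    print(f"({subj}) -[{pred}]-> ({obj})")
```

Each printed line corresponds to one edge of the graph, which is the same triplet form that a graph database stores as two nodes and a relationship.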

2.2 Software Tools

Protégé: Protégé is an open-source ontology editing tool developed by Stanford University, widely recognized for its capabilities in creating, editing, validating, and visualizing ontologies. Supporting a variety of ontology languages, including Web Ontology Language (OWL), Protégé has become a standard tool in ontology engineering. Its intuitive graphical user interface facilitates the modeling of complex domains (Gennari et al, 2003; Musen, 2015). In this study, Protégé was used to construct the ontology model for guqin books. Researchers utilized Protégé to define the hierarchical structures and relationships among the core concepts within the guqin books domain. Additionally, Protégé’s reasoning tools, such as the Pellet reasoner, were employed to validate the logical consistency of the ontology model, ensuring its semantic accuracy and reliability in representing domain knowledge.

Neo4j: Neo4j is a high-performance, NoSQL graph database management system specifically designed to handle complex relational data. Its graph-based architecture makes it particularly well-suited for managing interconnected data and is ideal for applications that require complex relationship queries (Angles, 2012). In this study, Neo4j was used to store and manage the ontology-based knowledge graph of guqin books. Neo4j supports powerful query capabilities through its Cypher query language, enabling researchers to perform deep queries and reasoning tasks within the knowledge graph. This functionality was essential for exploring the intricate relationships between entities in guqin books, facilitating advanced knowledge discovery and inference.

2.3 Methodological Steps

The research process is structured into three primary steps: ontology construction, knowledge graph construction, and knowledge graph visualization and application. The workflow incorporates iterative feedback mechanisms at each stage, ensuring that the outcomes of subsequent steps inform the refinement and optimization of earlier stages. This adaptive approach enhances the accuracy and robustness of the entire process. The specific workflow is illustrated in Fig. 1.

Fig. 1.

Semantic modeling knowledge workflow diagram.

2.3.1 Ontology Construction

Needs analysis and concept extraction: The methodology begins with an in-depth analysis of the guqin book domain to identify core concepts and knowledge elements. Drawing on traditional bibliographic and musicological theories, and incorporating expert opinions, this step ensures the comprehensiveness and accuracy of concept extraction.

Ontology structure construction: Based on the extracted key concepts, the ontology model is constructed using Protégé. This process involves defining classes, establishing hierarchical structures among the classes, and specifying the semantic relationships between different concepts. To ensure the extensibility and generalizability of the ontology, existing standard ontologies are integrated, enabling future interoperability with other datasets.

Ontology validation and optimization: After the ontology design is completed, it is validated using Protégé’s reasoning functions to ensure there are no logical inconsistencies within the model. Based on the reasoning results, the ontology model is further optimized to enhance its semantic expressiveness and flexibility for practical use.
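
To give a sense of what a logical inconsistency looks like, the sketch below checks one simple pattern: an individual asserted to belong to two classes that are declared disjoint. It is a toy stand-in written in plain Python and covers only a small fraction of what a description logic reasoner such as Pellet checks. The disjointness axiom between lost and surviving books, and the individuals `book_001` and `book_002`, are assumptions made for illustration and are not taken from the study's ontology.

```python
# Assumed disjointness axiom (not stated in the paper):
# a book cannot be both lost and surviving.
disjoint_pairs = [("GB:LostGuqinBook", "GB:SurvivingGuqinBook")]

# Hypothetical individuals and the classes asserted for each.
assertions = {
    "book_001": {"GB:SurvivingGuqinBook"},
    "book_002": {"GB:LostGuqinBook", "GB:SurvivingGuqinBook"},  # deliberate clash
}


def find_clashes(assertions, disjoint_pairs):
    """Return (individual, class_a, class_b) for every violated disjointness axiom."""
    clashes = []
    for individual, classes in sorted(assertions.items()):
        for class_a, class_b in disjoint_pairs:
            if class_a in classes and class_b in classes:
                clashes.append((individual, class_a, class_b))
    return clashes


clashes = find_clashes(assertions, disjoint_pairs)
for individual, class_a, class_b in clashes:
    print(f"{individual} is asserted as both {class_a} and {class_b}")
```

A reasoner reports clashes of this kind, among many others, so that the class definitions or the instance data can be corrected before the ontology is mapped to the graph database.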

2.3.2 Knowledge Graph Construction

Data mapping and transformation: The completed ontology model is mapped into the Neo4j graph database using Resource Description Framework (RDF) formatting, forming the foundational data structure of the knowledge graph. This step ensures the structured representation of knowledge, allowing complex relational data to be efficiently stored and managed within the graph database.

Knowledge graph construction and reasoning: Using Neo4j’s graph pattern mapping capabilities, nodes and edges are created to construct the knowledge structure. Nodes represent entities, while edges represent the relationships between entities. Graph reasoning techniques are then applied to uncover hidden relationships and latent knowledge structures within the graph, revealing previously overlooked connections in guqin books.
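
One way to picture the creation of nodes and edges is to render each triple as a Cypher `MERGE` statement, as in the sketch below. This is an illustrative assumption, not the import pipeline used in this study: the `Entity` label, the `name` property, and the sample triples are hypothetical, and the strings are only generated, not sent to a database. String interpolation is used for readability; a real loader should pass values as query parameters instead.

```python
def triple_to_cypher(subj, pred, obj):
    """Render one (subject, predicate, object) triple as a Cypher MERGE statement.

    MERGE creates a node or relationship only if it does not exist yet,
    so repeated triples do not produce duplicates.
    """
    rel_type = pred.upper()
    return (
        f"MERGE (a:Entity {{name: '{subj}'}}) "
        f"MERGE (b:Entity {{name: '{obj}'}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )


# Hypothetical triples; names are placeholders, not data from the study.
triples = [
    ("Piece_X", "recorded_in", "Book_A"),
    ("Book_A", "compiled_by", "Person_P"),
]

statements = [triple_to_cypher(*t) for t in triples]
for statement in statements:
    print(statement)
```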

Data validation and adjustment: After constructing the knowledge graph, it is validated to ensure consistency with the actual data. Based on the validation results, necessary adjustments and optimizations are made to the graph, ensuring its accuracy and completeness.

2.3.3 Knowledge Graph Visualization and Application

Visualization: The knowledge graph is visualized through Neo4j’s interface, providing an intuitive representation of the complex knowledge relationships within guqin books. The visualization not only aids researchers in understanding these complex relationships but also offers a user-friendly interface for non-specialist users to explore the rich content of guqin culture.

Intelligent querying and knowledge discovery: Cypher query language is used within Neo4j for deep retrieval and intelligent querying of the knowledge graph. Through graph traversal and reasoning, researchers can quickly locate specific knowledge points and discover hidden relationships and patterns within the literature. This process demonstrates the powerful functionality of the knowledge graph in knowledge discovery and academic research.
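
The kind of graph traversal described above can be sketched without a database. The example below runs a breadth-first search over a handful of hypothetical edges to find the shortest chain of relationships linking a musical piece to a dynasty. The node names and relation names are placeholders, and the Cypher query in the comment is a roughly analogous formulation that assumes nodes carry a `name` property.

```python
from collections import deque

# Roughly analogous Cypher (assuming a `name` property on nodes):
#   MATCH p = shortestPath((a {name: 'Piece_X'})-[*..5]-(b {name: 'Dynasty_D'}))
#   RETURN p

# Hypothetical edges (subject, relation, object); names are placeholders.
edges = [
    ("Piece_X", "recordedIn", "Book_A"),
    ("Book_A", "compiledBy", "Person_P"),
    ("Person_P", "livedIn", "Dynasty_D"),
    ("Piece_Y", "recordedIn", "Book_A"),
]


def shortest_path(edges, start, goal):
    """Breadth-first search over undirected edges; returns the node path or None."""
    neighbors = {}
    for subj, _, obj in edges:
        neighbors.setdefault(subj, []).append(obj)
        neighbors.setdefault(obj, []).append(subj)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path(edges, "Piece_X", "Dynasty_D")
print(" -> ".join(path))
```

The piece and the dynasty are never linked directly in the input; the connection emerges only by following the book and its compiler, which is the sense in which traversal surfaces relationships that are implicit in the data.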

Case studies and practical application: To validate the practical utility of the knowledge graph, several case studies were conducted in areas such as textual criticism, knowledge verification, and the discovery of musical structure knowledge. These case studies highlight the application value of the knowledge graph and demonstrate its innovative potential within digital humanities research. Through this series of methodological steps, this study constructs a systematic and structured knowledge system for guqin books and showcases the wide-ranging application potential of the knowledge graph in academic research, cultural preservation, and educational dissemination. Future research can build upon the findings of this study by expanding the coverage of the knowledge graph and applying it to broader cultural heritage research domains.

3. Ontology Model Design

This study utilizes ontology to formally describe the knowledge within the field of ancient guqin books. To ensure a comprehensive and detailed approach, the ontology construction is based on the principles of China’s traditional Six Tactics (Liutao) philosophy. Using the “Six Tactics Method” (Zhou et al, 2024), the ontology model for ancient guqin books is designed through six key dimensions: scenarios, reuse, entities, relations, constraints, and evaluation. Each of these dimensions plays a crucial role in providing a standardized descriptive framework for research on the organization of knowledge within ancient guqin books, thereby facilitating the construction of a conceptual knowledge system for these historical texts. This multi-layered approach ensures that the ontology is both robust and adaptable, enabling thorough analysis and application in the study of ancient guqin books.

3.1 Scenarios: Analysis of Application Scenarios of Ancient Chinese Guqin Books

The clarification of scenarios is the foundation of ontology modeling, and only by clearly defining scenarios can subsequent research work continue. This paper addresses the application scenarios of semantic modeling for ancient guqin books from two perspectives: the analysis of resource categories and the practical demand scenarios. By doing so, it provides a thorough clarification of the application scenarios for semantic modeling of ancient guqin books, enabling a detailed analysis of both the hidden and explicit knowledge structures within these texts. This approach aims to promote the dynamic inheritance and innovative development of ancient musical cultural heritage resources, with ancient Chinese guqin books serving as a key representative.

3.1.1 Analysis of Categories of Ancient Chinese Guqin Books

To ensure the scope of ontology modeling remains focused and the content is relevant, it is essential to clarify and analyze the core concepts related to ancient guqin books. Identifying and defining these key concepts is crucial for establishing a well-structured and meaningful ontology. To reveal the complete structure of ancient guqin books, this study considers two types of guqin books: surviving guqin books and guqin books with lost or unknown titles.

The surviving guqin books can be further categorized into titled surviving guqin books and untitled surviving guqin books. These books serve as the primary research subjects in this study. The titled surviving guqin books, although their contents have been lost, still hold significant literary value as they provide fundamental materials for guqin studies. These books can be classified into four categories: qinpu (guqin tablature), techniques of playing, construction of guqin, and qin theory. Each category contributes to the understanding of guqin music and its cultural significance.

The untitled surviving guqin books contain valuable information that can be used to enrich knowledge of ancient guqin music. They can be further classified into four categories based on their content: qin theory, qinpu, classical works, and monographs. By analyzing these books, researchers gain insights into various aspects of guqin music, from theoretical discussions to practical playing techniques.

Apart from the surviving guqin books, there are fragmented guqin books with content scattered elsewhere. These fragments are drawn from other documentary materials, such as collections, classical works, ancient commentaries, and poetry collections. The inclusion of these fragmented sources helps to partially reconstruct the lost literary materials and provides additional perspectives on guqin music.

Finally, guqin books in need of further study often lack sufficient detail, requiring domain experts to participate in further investigations. Experts need to examine aspects such as authors, titles, contents, and dates of composition to better understand the context and significance of these guqin books.

3.1.2 Analysis of Practical Demand Scenarios

To maximize the effectiveness of demand matching and domain applicability in ontology modeling within the field of guqin studies, it is essential to clarify the diverse practical needs of semantic modeling for ancient guqin book resources, starting from real-world application scenarios. Therefore, this section, based on the theory of cultural heritage resource data production and reproduction (Zhang et al, 2025), and in combination with the characteristics of ancient guqin resources, identifies three practical demands: the digital preservation demand of ancient guqin books, the fine-grained knowledge reorganization demand, and the intelligent development and utilization demand of ancient guqin books, as shown in Fig. 2.

Fig. 2.

Analysis of the practical demand for semantic modeling of ancient Chinese guqin books.

As illustrated in Fig. 2, the semantic modeling process of ancient guqin books consists of three stages: data production, data quasi-production, and data reproduction. “Data production” refers to the process of converting scattered, multimodal, paper-based ancient guqin book resources into digital forms using technologies such as 3D scanning and Optical Character Recognition (OCR). This generally occurs during the data collection and metadata processing stages. In this phase, it is necessary to digitize and store existing ancient guqin books while ensuring the authority, comprehensiveness, and completeness of the resources. This is an essential practical demand for the effective protection of scattered resources and the ongoing transmission of artistic heritage. Categories involved include qinpu, techniques of playing, construction of guqin, qin theory, classical works, and research monographs.

“Data quasi-production” refers to the use of digital humanities research paradigms, employing technologies such as knowledge graphs and domain ontologies to extract, process, and reorganize fine-grained knowledge from raw data. This corresponds to the ontology modeling and knowledge graph construction outlined in this study’s research framework. Therefore, exploring and proposing knowledge reorganization and association models specifically applicable to ancient guqin books is a core demand of this study. It is also a key element influencing the effectiveness of subsequent data mining and exploration.

“Data reproduction” involves using quantitative analysis techniques and knowledge inference methods to extract new knowledge or generate new interpretations of existing knowledge from the established knowledge association network. This corresponds to the knowledge graph visualization and application stages in the research framework. In this phase, it is crucial to explore how to mine implicit knowledge related to ancient guqin books based on completed knowledge reorganization, enhance the degree of digital intelligence in information services, and improve user satisfaction levels. These are key follow-up tasks and significant objectives for advancing the research on semantic modeling of ancient guqin books.

In summary, the practical demands for the semantic modeling of ancient guqin books are structured around these three stages, each contributing to the overarching goals of digital preservation, knowledge reorganization, and intelligent utilization, ultimately supporting the dynamic transmission and innovative development of this precious cultural heritage.

3.2 Reuse: Ontology Reuse in Practice

The focus of ontology reuse is on evaluating existing vocabularies for concept attributes and reusing classes and properties that align with the concepts within the specific domain. Reusing established ontologies not only reduces the cost and effort involved in ontology construction but also minimizes redundant work, enhancing the reliability and portability of the ontology. This is because researchers follow common standards for defining shared entities. In the context of reuse, existing ontological models are carefully considered.

While the CIDOC Conceptual Reference Model (CIDOC-CRM) is a well-known ontology widely used in the field of cultural heritage, it was not selected for this study. CIDOC-CRM is designed with a broad cultural heritage focus and provides general conceptual structures that may not offer the fine-grained, domain-specific semantic frameworks needed for this research. By contrast, ontologies such as Geographical Vocabularies (GEO) (GeoNames, n.d.), Time Ontology (TIME) (Time Ontology in OWL, 2017), and Dublin Core (DC) (Dublin Core, n.d.) offer more targeted semantic descriptions that are better aligned with the needs of this study, which requires detailed and precise representations of time, geographical locations, and other related entities.

Therefore, this study adopts widely used ontologies such as DC, Friend of a Friend (FOAF) (Friend of a Friend, 2014), TIME, GEO, and the Shanghai Library Ontology (SHL) (Ontology Query Center, n.d.), which provide specific and fine-grained structures. Additionally, custom terms, collected in the Guqin Books (GB) vocabulary, were developed to supplement these existing frameworks. This approach ensures that the ontology is both aligned with well-established standards and tailored to the specific needs of the guqin book domain, where more specialized semantic detail is required.
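
In practice, reusing vocabularies means binding each prefix to a namespace and writing class names as prefixed terms. The sketch below shows this with a small prefix table and an expansion function. The DC, FOAF, and TIME namespaces are the published ones; the GEO namespace is an assumption (the W3C Basic Geo vocabulary, which defines `SpatialThing`), and the SHL and GB namespaces are `example.org` placeholders because the paper does not give their addresses.

```python
PREFIXES = {
    "DC": "http://purl.org/dc/elements/1.1/",
    "FOAF": "http://xmlns.com/foaf/0.1/",
    "TIME": "http://www.w3.org/2006/time#",
    # Assumption: W3C Basic Geo vocabulary, which defines SpatialThing.
    "GEO": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    # Placeholders: the real SHL and GB namespaces are not given in the paper.
    "SHL": "http://example.org/shl#",
    "GB": "http://example.org/gb#",
}


def expand(term):
    """Expand a prefixed term such as 'FOAF:Person' into a full IRI."""
    prefix, _, local_name = term.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local_name


for term in ("FOAF:Person", "TIME:Interval", "GB:GuqinBook"):
    print(term, "->", expand(term))
```

Because reused terms resolve to the same IRIs that other datasets use, instances typed with them can later be linked to external resources without additional mapping.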

3.3 Entities: Class and Hierarchy Structure Design

Entities, the fundamental elements that constitute the ontology, are known as classes. There is a hierarchical structure between superclasses and subclasses that forms the backbone of the ontology model. Classes are defined based on a clear understanding of the basic scenarios, and beyond fundamental concepts, elements such as agents, objects, spatiotemporal components, bibliographic sources, and musical theory are also considered. The systematic categorization of knowledge levels within ancient guqin book resources forms the foundational basis for the construction of the explanatory mechanism and logical inference during the knowledge discovery process. The rationality and granularity of this categorization not only affect the interpretability of the knowledge discovery results but also largely determine the coherence and integration of the knowledge narrative system across texts and contexts. Therefore, this section completes the mutual mapping between the 5W1H (Who, What, Where, When, Why, and How) elements and the knowledge elements of ancient guqin books, based on the six factors of time (When), place (Where), person (Who), cause (Why), content (What), and method (How) in digital storytelling, as shown in Fig. 3.

Fig. 3.

Division of knowledge elements in ancient Chinese guqin books. 5W1H, Who/What/Where/When/Why/How.

As shown in Fig. 3, Time (When) corresponds to the TemporalEntity element in the ancient guqin book ontology, representing the specific time of discovery, the period the book belongs to, or the lifetime of the book’s creator. Place (Where) corresponds to the SpatialThing element, representing the locations where the guqin books are preserved, discovered, created, or where the creator was born or died. Method (How) corresponds to the Guqin Book element, which serves as the text medium recording diverse guqin knowledge. It is not only a critical point for constructing and exploring the associations of guqin knowledge but also a key path to revealing the inherent links between guqin technique inheritance and theoretical context. Content (What) corresponds to the Music Structure and Musical Piece elements, representing the specific musical scores and structural content recorded within the guqin books. Person (Who) corresponds to the Person element, representing individuals related to the creation, transmission, and dissemination of ancient guqin books. Cause (Why) corresponds to the Sources element, representing the data sources of guqin knowledge. The authenticity and authority of these sources are key to ensuring the reliability and credibility of knowledge associations and serve as an important guarantee for the logical consistency and historical continuity of the entire guqin knowledge system.
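
The mapping just described can be written down as a simple lookup table, which is a convenient form for checking that every knowledge element is assigned to exactly one 5W1H question. The sketch below uses the prefixed class names that appear in this paper; the helper function name `question_for` is hypothetical.

```python
# 5W1H question -> ontology classes, as described in the text.
FIVE_W_ONE_H = {
    "When": ["TIME:TemporalEntity"],
    "Where": ["GEO:SpatialThing"],
    "Who": ["FOAF:Person"],
    "What": ["GB:MusicStructure", "GB:MusicalPiece"],
    "Why": ["GB:Sources"],
    "How": ["GB:GuqinBook"],
}


def question_for(ontology_class):
    """Return the 5W1H question that a class answers, or None if unmapped."""
    for question, classes in FIVE_W_ONE_H.items():
        if ontology_class in classes:
            return question
    return None


all_classes = [c for classes in FIVE_W_ONE_H.values() for c in classes]
print(question_for("GB:GuqinBook"))
print(len(all_classes), "classes mapped,", len(set(all_classes)), "distinct")
```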

From the perspective of the knowledge system structure of ancient guqin books, this study divides it into seven main entities: Guqin Book (GB:GuqinBook), Musical Piece (GB:MusicalPiece), Music Structure (GB:MusicStructure), Person (FOAF:Person), Space (GEO:SpatialThing), Time (TIME:TemporalEntity), and Source Documents (GB:Sources). Subclasses further divide these entities. For example, Guqin Book includes subcategories such as Lost Guqin Book (GB:LostGuqinBook), Surviving Guqin Book (GB:SurvivingGuqinBook), Collected Lost Guqin Book (GB:CollectedLostGuqinBook), and Authenticated Guqin Book (GB:AuthenticatedGuqinBook). The specific hierarchical relationships are illustrated in Fig. 4. Through the definition of these core classes, the setting of the ontology classes for ancient guqin books is completed, with detailed conceptual analyses of each class provided subsequently.

Fig. 4.

Ancient Chinese guqin books ontology and hierarchical structure. owl, Web Ontology Language.

From the perspective of creating proprietary classes for the ancient guqin book ontology, Guqin Book is the core component of the ontology. All other entity elements are associated with this central class. Musical Piece represents the specific score recorded in guqin books, while Music Structure provides a musical description of the specific structure of each piece, such as Paragraph, Sentence, Pitch, Rhythm Tempo, and other types of information. Source Documents feature a variety of characteristics, including the collection sources of the guqin books, scores, and biographies, and serve as the foundation for literature verification and the construction of a bibliographic evidence chain.

From the perspective of general classes, Person in ancient guqin books primarily exists as the author of the book and the creator of the music. Considering that individuals, as social agents, are inevitably constrained by space and time, they are associated with guqin books, spaces, times, and source documents. Spatial and temporal entity instances depend on and are associated with entities such as guqin books, musical pieces, persons, and source documents, forming the spatiotemporal components of these entities. Spatial Thing includes subclasses for Region (SHL:Region) and Place (SHL:Place) to describe the characteristics of areas and points in physical space. Temporal Entity is divided into Instant (TIME:Instant) and Interval (TIME:Interval) to describe the characteristics of points and durations in time. Considering the cultural specificity of time description in ancient China, the subclass Interval is further subdivided into Reign (SHL:Reign), Dynasty (SHL:Dynasty), and specific time ranges (TIME:ProperInterval).
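The class hierarchy described in this section can be sketched as a child-to-parent table. This is an illustrative sketch under two assumptions: the four Guqin Book subclasses are treated as direct children of GB:GuqinBook (Fig. 4 may nest them differently), and the names `SUBCLASS_OF`, `ancestors`, and `is_a` are hypothetical helpers rather than part of the ontology.

```python
# Child -> parent table for the classes named in the text.
# Assumption: all four Guqin Book subclasses sit directly under GB:GuqinBook.
SUBCLASS_OF = {
    "GB:LostGuqinBook": "GB:GuqinBook",
    "GB:SurvivingGuqinBook": "GB:GuqinBook",
    "GB:CollectedLostGuqinBook": "GB:GuqinBook",
    "GB:AuthenticatedGuqinBook": "GB:GuqinBook",
    "SHL:Region": "GEO:SpatialThing",
    "SHL:Place": "GEO:SpatialThing",
    "TIME:Instant": "TIME:TemporalEntity",
    "TIME:Interval": "TIME:TemporalEntity",
    "SHL:Reign": "TIME:Interval",
    "SHL:Dynasty": "TIME:Interval",
    "TIME:ProperInterval": "TIME:Interval",
}


def ancestors(cls: str) -> list:
    """Walk up the hierarchy and return the superclasses, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls: str, target: str) -> bool:
    """True if cls is target itself or one of its (transitive) subclasses."""
    return cls == target or target in ancestors(cls)


if __name__ == "__main__":
    print(ancestors("SHL:Dynasty"))
```

With this table, a dynasty instance is recognized as a temporal entity through the intermediate Interval class, which mirrors the two-level subdivision of time described above.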

3.4 Relations and Constraints: Design Attribute List Settings

Relations are ubiquitous among entity concepts, which are specifically manifested as object properties and data properties in ontologies. In this analysis, relationships between entities form object properties, whereas inherent characteristics of an entity constitute its data properties. Moreover, the concept of constraints in ontologies is concretized through the definition of properties’ domains and ranges, articulating the connotations and extensions of relations.

Object properties have categories as both their domains and ranges, whereas data properties have a category as their domain and a specific data type as their range. This study focuses on the associations between entity categories in the construction of the ontology for ancient Chinese guqin books. Object properties play a pivotal role here: they are directional and can express one-to-many or many-to-one relationships between objects. The object properties defined in this study are listed in Table 1. Data properties describe the intrinsic attributes of the knowledge concepts of ancient guqin books and are listed in Table 2.

Table 1. Object properties of ancient Chinese guqin books.
Property | Note | Domain | Range
DC:creator | Creator/Author | GB:GuqinBook; GB:MusicalPiece; GB:Sources | FOAF:Person
GB:releventPerson | Related Figures | GB:GuqinBook; GB:MusicalPiece | FOAF:Person
GB:subordinate | Source | GB:GuqinBook; GB:MusicalPiece; FOAF:Person | GB:Sources
SHL:temporal | Dynasty | GB:GuqinBook; GB:MusicalPiece; GB:Sources; FOAF:Person | SHL:Dynasty
GB:releventMusicalPiece | Related Pieces | GB:GuqinBook; GB:MusicalPiece; FOAF:Person | GB:MusicalPiece
GB:recordMusicalPiece | Recorded Pieces | GB:GuqinBook | GB:MusicalPiece
GB:includeStructure | Included Musical Structure | GB:MusicalPiece | GB:MusicStructure
GB:createdTime | Creation Time | GB:GuqinBook; GB:MusicalPiece; GB:Sources | TIME:TemporalEntity
GB:createdPlace | Creation Place | GB:GuqinBook; GB:MusicalPiece; GB:Sources | GEO:SpatialThing
GB:collectionPlace | Collection Location | GB:GuqinBook; GB:Sources | GEO:SpatialThing
SHL:beginYear | Starts At | TIME:Interval | TIME:Instant
SHL:endYear | Ends At | TIME:Interval | TIME:Instant
GB:belongsToRegion | Region | SHL:Place | SHL:Region
SHL:birthDay | Year of Birth | FOAF:Person | TIME:Instant
SHL:deathday | Year of Death | FOAF:Person | TIME:Instant
SHL:nativePlace | Native Place | FOAF:Person | SHL:Place
GB:beginVoice | Starting Note | GB:MusicStructure | GB:MusicStructure
GB:endVoice | Ending Note | GB:MusicStructure | GB:MusicStructure
GB:speed | Tempo | GB:MusicStructure | GB:MusicStructure
GB:function | Function | GB:MusicStructure | GB:MusicStructure
GB:include | Containment Relationship | GB:MusicStructure | GB:MusicStructure
GB:continueWith | Sequential Relationship | GB:MusicStructure | GB:MusicStructure

DC, Dublin Core; GB, Guqin Books; SHL, Shanghai Library Ontology; FOAF, Friend of a Friend; TIME, Time Ontology; GEO, Geographical Vocabularies.
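To illustrate how the domains and ranges in Table 1 constrain relations, the sketch below checks whether a candidate triple is compatible with a property's declared domain and range. It rests on stated assumptions: only four of the Table 1 properties are included, a property with several listed domains is read as accepting any one of them, and the check is a simple closed-world validation, which is stricter than OWL semantics (where domain and range axioms drive inference rather than reject data). The names `OBJECT_PROPERTIES` and `check_triple` are hypothetical.

```python
# Subset of Table 1: property -> (allowed subject classes, object class).
# Assumption: multiple listed domains are read as "any one of these".
OBJECT_PROPERTIES = {
    "DC:creator": (
        {"GB:GuqinBook", "GB:MusicalPiece", "GB:Sources"},
        "FOAF:Person",
    ),
    "SHL:temporal": (
        {"GB:GuqinBook", "GB:MusicalPiece", "GB:Sources", "FOAF:Person"},
        "SHL:Dynasty",
    ),
    "GB:recordMusicalPiece": ({"GB:GuqinBook"}, "GB:MusicalPiece"),
    "GB:includeStructure": ({"GB:MusicalPiece"}, "GB:MusicStructure"),
}


def check_triple(subject_class: str, prop: str, object_class: str) -> bool:
    """Return True if the subject and object classes fit the property's
    declared domain and range. Subclass reasoning is deliberately left out."""
    domains, range_class = OBJECT_PROPERTIES[prop]
    return subject_class in domains and object_class == range_class


if __name__ == "__main__":
    print(check_triple("GB:GuqinBook", "DC:creator", "FOAF:Person"))
    print(check_triple("FOAF:Person", "GB:recordMusicalPiece", "GB:MusicalPiece"))
```

A fuller version would combine this check with the class hierarchy so that, for example, a surviving guqin book is accepted wherever GB:GuqinBook is the declared domain.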

Table 2. Data properties of ancient Chinese guqin books.
Property | Note | Domain | Range
DC:title | Title/Name | OWL:Thing | RDFS:Literal
DC:description | Description | OWL:Thing | RDFS:Literal
DC:type | Type | OWL:Thing | RDFS:Literal
DC:identifier | Resource Identifier | OWL:Thing | RDFS:Literal
DC:image | Image | OWL:Thing | XSD:anyURI
GB:content | Content | GB:GuqinBook; GB:MusicalPiece; GB:Sources; GB:MusicStructure | RDFS:Literal
GB:volumeNumber | Volumes | GB:GuqinBook | RDFS:Literal
GB:prologue | Prologue | GB:GuqinBook | RDFS:Literal
GB:epilogue | Epilogue | GB:GuqinBook | RDFS:Literal
GB:prompt | Prompt | GB:MusicStructure | RDFS:Literal
GB:version | Edition | GB:GuqinBook; GB:MusicalPiece; GB:Sources | RDFS:Literal
SHL:courtesyName | Courtesy Name | FOAF:Person | RDFS:Literal
SHL:pseudonym | Pseudonym | FOAF:Person | RDFS:Literal
TIME:inXSDDateTime | Date | TIME:TemporalEntity | RDFS:Literal
SHL:city | City | SHL:Region | RDFS:Literal
SHL:county | County | SHL:Region | RDFS:Literal
SHL:province | Province | SHL:Region | RDFS:Literal
SHL:country | Country | SHL:Region | RDFS:Literal

RDFS, Resource Description Framework Schema; XSD, XML Schema Definition; XML, eXtensible Markup Language.

3.5 Evaluation: Model Assessment and Demonstration

In this study, following the establishment of core concepts and attribute vocabulary, the Protégé tool was used for the formal ontology construction. The constructed ontology model of ancient guqin books was then evaluated with the Pellet reasoner integrated with Protégé (Khamparia and Pandey, 2017). A consistency check was run to confirm that the model adhered to semantic norms and contained no logical conflicts; the model passed this check without any conflicts, confirming its logical integrity. After the model construction was completed, a coverage assessment was also performed to evaluate the extent to which the ontology model covered the core concepts and attributes of the guqin books domain. Specifically, key concepts were identified through discussions with experts in the field of guqin book studies, and the model was checked to ensure that these concepts were represented. The evaluation results showed that the ontology model covered most of the core concepts, meeting the expected coverage standards. The study also provides a visualization of the ontology model, as shown in Fig. 5. This top-level semantic description of ancient guqin books sets a standard for concept normalization, clarifies the architectural framework of guqin book knowledge concepts, and lays the conceptual foundation for subsequent knowledge graph construction.

Fig. 5.

Ancient Chinese guqin books ontology model.
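The coverage assessment can be pictured as a comparison between an expert-supplied concept list and the terms modeled in the ontology. The source does not state how coverage was computed, so the following is only one plausible formulation: the function `coverage`, the ratio it returns, and the sample concept "Fingering" are assumptions made for illustration, not reported results.

```python
def coverage(expert_concepts, ontology_terms):
    """Return (share of expert concepts found in the ontology, missing ones).

    Matching is by case-insensitive exact name, which is a simplification:
    a real assessment would also need to handle synonyms and translations.
    """
    expert = {c.strip().lower() for c in expert_concepts}
    modeled = {t.strip().lower() for t in ontology_terms}
    missing = sorted(expert - modeled)
    if not expert:
        return 1.0, missing
    return (len(expert) - len(missing)) / len(expert), missing


if __name__ == "__main__":
    # Hypothetical expert list; "Fingering" is an invented example concept.
    expert_list = ["Guqin Book", "Musical Piece", "Person", "Dynasty", "Fingering"]
    modeled_terms = ["Guqin Book", "Musical Piece", "Music Structure", "Person", "Dynasty"]
    ratio, missing = coverage(expert_list, modeled_terms)
    print(ratio, missing)
```

Returning the missing concepts alongside the ratio keeps the assessment actionable, since each gap can be reviewed with domain experts.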

4. Knowledge Graph Construction and Application
4.1 Case Selection and Data Collection

Zhou Qingyun, a renowned Confucian businessman of the late Qing and early Republican era, harbored a special fondness for ancient Chinese guqin culture. He amassed a collection of ancient guqins, guqin books, and related materials and conducted in-depth research on guqin studies. His works include Qin Shu Cun Mu, Qin Shu Bie Lu, Qin Cao Cun Mu, Qin Shi Bu, and Qin Shi Xu. Among these, Qin Shu Cun Mu, compiled in 1914, followed traditional bibliographic methods to systematically organize guqin books from the pre-Qin period to the end of the Qing dynasty. It provides introductions, textual research, and historical tracing of each entry, making it a valuable historical resource.

Our case selection is based on the inventory of guqin books cataloged in Zhou Qingyun’s Qin Shu Cun Mu, from which we expanded the dataset. We collected data on 315 guqin books, 400 source documents, 855 pieces of guqin music, and information on 720 individuals. The biographical data were cross-referenced with works like Chronological Biography of Chinese Historical Figures and A Comprehensive Dictionary of Chinese Historical Figures. Temporal data were primarily referenced from chronologies compiled by the Shanghai Library. After manual data collection and verification, the research structured the data and saved it in Comma-Separated Values (CSV) format, providing a solid foundation of structured data for this case study.
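As an illustration of the structured CSV stage, the sketch below loads book records and checks that the expected columns are present. The column names (`book_id`, `title`, `status`, `dynasty`) and the function `load_books` are hypothetical, since the source does not describe the file layout; the two sample rows reuse titles that appear later in this article.

```python
import csv
import io

# Hypothetical layout; the real CSV schema is not given in the article.
REQUIRED_COLUMNS = {"book_id", "title", "status", "dynasty"}

SAMPLE_CSV = """book_id,title,status,dynasty
I1,Illustrated Manual of Qin Music,Surviving,Ming
A1,The Annotations on the Guqin,Surviving,Ming
"""


def load_books(handle):
    """Read book rows from an open CSV handle and strip stray whitespace."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"CSV is missing columns: {sorted(missing)}")
    return [{key: (value or "").strip() for key, value in row.items()}
            for row in reader]


if __name__ == "__main__":
    books = load_books(io.StringIO(SAMPLE_CSV))
    print(len(books), books[0]["title"])
```

Failing early on a missing column is a deliberate choice here: manually collected data is easier to correct at load time than after it has been turned into graph nodes.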

4.2 Knowledge Mapping and Storage

In this study, after collecting and processing data on ancient guqin books and completing knowledge annotation and fusion, the structured data were imported into an ontology using rule-based methods to generate RDF-formatted data, forming the foundation for building a knowledge graph. To store and manage this knowledge graph, the Neo4j graph database was adopted, leveraging its core capability of modeling knowledge through graph structures composed of interconnected nodes and edges. Graph databases—originating from mathematical graph theory—store and query data using simple graph-based models, in which nodes represent entities and edges denote relationships. Known for their efficiency, flexibility, and scalability, they support rapid retrieval of inter-entity relationships through algorithms such as shortest path search. A graph database comprises three core components: nodes, relationships (edges), and attributes, with the entire graph being a collection of labeled nodes and their associated properties.

Formally, the knowledge graph is modeled as G = <Vi, Ei>, where Vi denotes the set of nodes and Ei the set of edges. More specifically, Vi = <Li, Pi>, where Li represents node labels (classes), and Pi denotes node properties (attributes) that describe the characteristics of each entity. Similarly, Ei = <Li, subject, object>, where Li specifies the relationship type and subject/object define the start and end nodes connected by the edge. Entities and ontological classes derived from RDF data are mapped to nodes in Neo4j, while hierarchical class relationships and object properties are represented as edges. To realize this mapping, a set of correspondence rules between ontology constructs and graph database structures was designed (Fig. 6). According to these rules, classes and subclasses from the guqin book ontology are stored as labeled nodes, object properties and inter-entity relations as edges, and data properties as node attributes. The final result is a semantically rich, structurally organized graph-based knowledge repository of ancient guqin texts.

Fig. 6.

Ontology to graph database mapping process.
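The mapping rules can be sketched in memory before any database is involved: class membership becomes node labels, data properties become node attributes, and object properties become edges of the form (relationship type, subject, object). This is a minimal sketch of those rules, not the authors' import pipeline; `Node`, `build_graph`, and the sample identifiers are illustrative names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set = field(default_factory=set)   # classes of the entity
    props: dict = field(default_factory=dict)  # data properties


def build_graph(type_triples, data_triples, object_triples):
    """Apply the three mapping rules to simple tuple inputs.

    type_triples:   (entity, class)            -> node label
    data_triples:   (entity, property, value)  -> node attribute
    object_triples: (subject, property, object) -> edge
    """
    nodes = {}
    edges = []
    for entity, cls in type_triples:
        nodes.setdefault(entity, Node()).labels.add(cls)
    for entity, prop, value in data_triples:
        nodes.setdefault(entity, Node()).props[prop] = value
    for subject, prop, obj in object_triples:
        nodes.setdefault(subject, Node())
        nodes.setdefault(obj, Node())
        edges.append((prop, subject, obj))
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_graph(
        [("I1", "GB:SurvivingGuqinBook"), ("M1", "SHL:Dynasty")],
        [("I1", "DC:title", "Illustrated Manual of Qin Music"),
         ("M1", "DC:title", "Ming")],
        [("I1", "SHL:temporal", "M1")],
    )
    print(nodes["I1"], edges)
```

Keeping edges as (type, subject, object) tuples follows the edge definition given in the formal model above.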

4.3 Knowledge Graph Visualization and Application
4.3.1 Knowledge Visualization

After completing knowledge storage, this study conducted an in-depth visual analysis and display of the ancient guqin books knowledge graph using the Neo4j graph database interface. This process, primarily through graphical means, intuitively presents related knowledge content. In the knowledge graph, the “node-edge-node” triplets form the core of the knowledge structure, clearly displaying the relational characteristics between nodes (Pouriyeh et al, 2019), offering a comprehensive method to grasp the knowledge context in the field of ancient Chinese guqin books. Additionally, the entity’s data attribute information can be detailed through individual node properties, further enriching the visualization of knowledge. Employing the Cypher query language, the study achieved in-depth retrieval and discovery of ancient guqin book knowledge via graph traversal, path calculation, knowledge inference, and other methods.

In this study, the visualization of the ancient guqin books knowledge graph forms an interconnected semantic network that links guqin books, types, figures, spatiotemporal data, source literature, and scores. This facilitates the mining of implicit knowledge contained within the ancient texts and supports knowledge discovery, transforming static, flat data into a dynamic, multi-dimensional knowledge network. Ultimately, this promotes the efficient utilization of ancient guqin book resources from the supply side and conveys the richness and depth of the related knowledge.

4.3.2 Knowledge Application

Compared to traditional Structured Query Language (SQL), Cypher offers greater usability in operation and comprehension. Knowledge discovery involves extracting valid, comprehensive, and systematic knowledge from raw data based on specific needs (Verma et al, 2023). By inputting relevant search terms in knowledge graph operations, units of knowledge with the same relational attributes can be sequentially associated.

For guqin book resources, personalized queries can be conducted based on specific time periods or types of guqin books. For instance, as Fig. 7 illustrates, the surviving Ming (M1) Dynasty guqin treatises primarily include five texts: Illustrated Manual of Qin Music (I1), The Sixteen Methods of the Cold Immortal’s Zither Playing (S1), The Annotations on the Guqin (A1), Diagrams of Qin Bridges and the Five Tones of the Qin (D1), and Qinglian Fang’s Elegant Qin Music (Q1).

Fig. 7.

Ming dynasty surviving guqin treatises query. A1, The Annotations on the Guqin; D1, Diagrams of Qin Bridges and the Five Tones of the Qin; I1, Illustrated Manual of Qin Music; M1, Ming; Q1, Qinglian Fang’s Elegant Qin Music; S1, The Sixteen Methods of the Cold Immortal’s Zither Playing.
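A query of this kind might look like the Cypher text in the sketch below, shown next to an in-memory equivalent so that the matching logic can be followed without a database. The label names (`SurvivingGuqinBook`, `Dynasty`), the relationship type `temporal`, and the `title` property are assumptions about how the ontology terms were loaded into Neo4j; the article does not give the actual query.

```python
# Assumed Cypher form of the Fig. 7 query; label and property names are guesses.
MING_SURVIVING_QUERY = """
MATCH (b:SurvivingGuqinBook)-[:temporal]->(d:Dynasty {title: $dynasty})
RETURN b.title AS title
ORDER BY title
"""

NODES = {
    "M1": {"labels": {"Dynasty"}, "title": "Ming"},
    "I1": {"labels": {"SurvivingGuqinBook"},
           "title": "Illustrated Manual of Qin Music"},
    "S1": {"labels": {"SurvivingGuqinBook"},
           "title": "The Sixteen Methods of the Cold Immortal’s Zither Playing"},
    "A1": {"labels": {"SurvivingGuqinBook"},
           "title": "The Annotations on the Guqin"},
    "D1": {"labels": {"SurvivingGuqinBook"},
           "title": "Diagrams of Qin Bridges and the Five Tones of the Qin"},
    "Q1": {"labels": {"SurvivingGuqinBook"},
           "title": "Qinglian Fang’s Elegant Qin Music"},
}
EDGES = [("temporal", book, "M1") for book in ("I1", "S1", "A1", "D1", "Q1")]


def surviving_books_of(dynasty, nodes, edges):
    """In-memory equivalent of the pattern match in MING_SURVIVING_QUERY."""
    titles = []
    for rel, subject, obj in edges:
        if (rel == "temporal"
                and "SurvivingGuqinBook" in nodes[subject]["labels"]
                and "Dynasty" in nodes[obj]["labels"]
                and nodes[obj]["title"] == dynasty):
            titles.append(nodes[subject]["title"])
    return sorted(titles)


if __name__ == "__main__":
    for title in surviving_books_of("Ming", NODES, EDGES):
        print(title)
```

Passing the dynasty as the `$dynasty` parameter rather than embedding it in the query text lets the same query serve any period.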

From a textual criticism perspective, knowledge graphs provide reliable evidence chains for guqin books requiring verification. They methodically list sources to authenticate ambiguities or uncertainties such as authorship, book names, content, and time of composition, allowing for a more objective and precise clarification of specific information about guqin books. For example, Zhou Qingyun conducted a textual analysis to determine whether Chronicles of Qin Scores (C1) and Collection of Rhythmic Patterns in Qin Music (C2) were the same work. Both titles appear in Guo Maoqian’s (G1) Collection of Music Bureau Poems (C3) from the Northern Song (N1) Dynasty, with entirely different collections of guqin scores, leading to the conclusion that they are indeed separate works. As illustrated in Fig. 8, the evidence chain presented in the knowledge graph’s structure clearly supports this view.

Fig. 8.

Verification and knowledge inference of guqin books. C1, Chronicles of Qin Scores; C2, Collection of Rhythmic Patterns in Qin Music; C3, Collection of Music Bureau Poems; G1, Guo Maoqian; N1, Northern Song.
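The evidence chain in this example can be traced as a path through the graph. The sketch below encodes only the links stated in the text (both titles recorded in C3, which is by G1 and dated to N1) and finds the shortest connection between C1 and C2 with a breadth-first search. The choice of relationship names from Table 1 and the function `shortest_path` are assumptions; the differing score collections are not modeled because the individual pieces are not listed here.

```python
from collections import deque

# Links stated in the text; relationship names are assumed from Table 1.
EDGES = [
    ("C1", "GB:subordinate", "C3"),  # Chronicles of Qin Scores cited in C3
    ("C2", "GB:subordinate", "C3"),  # Collection of Rhythmic Patterns cited in C3
    ("C3", "DC:creator", "G1"),      # C3 compiled by Guo Maoqian
    ("C3", "SHL:temporal", "N1"),    # C3 dated to the Northern Song
]


def shortest_path(start, goal, edges):
    """Breadth-first search over the graph, ignoring edge direction."""
    neighbours = {}
    for subject, _, obj in edges:
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(shortest_path("C1", "C2", EDGES))
```

The path runs through the shared source C3, which is the node a researcher would inspect to compare the two titles.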

This study conducts a detailed, sentence-level granular analysis of specific guqin compositions, aiming to professionally construct a knowledge graph of guqin books from a musical perspective. This approach addresses the limitations in current research that exclusively focuses on the organization of external literary knowledge. By mining the knowledge of musical structures, which are a concrete manifestation of the internal knowledge in ancient guqin texts, it is possible to uncover more nuanced hidden knowledge within these ancient books. For instance, a granular representation of the musical structure is demonstrated in Section 3 of the guqin piece Autumn Moon over the Han Palace (A2), as shown in Fig. 9. This not only reveals external knowledge features like literary sources and authors but also details internal musical structural characteristics.

Fig. 9.

Verification and knowledge inference of guqin books. A2, Autumn Moon over the Han Palace; A3, Authentic Transmission Zither Manual; C4, Cao Dajia; L1, Liaohuai Hall Qin Manual; T1, The Primordial Elegance of Rationality; T2, The Unified Guqin and Se of the Fan Family; Y1, Yang Lun’s Method of the Heart of Bo Ya.

In terms of external knowledge features such as literary sources and authorship, Autumn Moon over the Han Palace was composed by Cao Dajia (C4) and appears in five guqin books: Authentic Transmission Zither Manual (A3), Yang Lun’s Method of the Heart of Bo Ya (Y1), The Primordial Elegance of Rationality (T1), Liaohuai Hall Qin Manual (L1), and The Unified Guqin and Se of the Fan Family (T2). In terms of internal musical structure, the third section of Autumn Moon over the Han Palace consists of four phrases, functions as the “Start” of the composition, and is characterized by a slow tempo. The study marks the tonal characteristics of each phrase’s beginning and ending sounds using the five-tone scale (Gong, Shang, Jue, Zhi, Yu). For instance, the third phrase begins with the Yu tone and ends with the Jue tone. This detailed tonal analysis provides new insights into the structure and expressive style of the piece. The knowledge graph’s data attributes suggest that although this section serves as an introduction, it still requires contrasts in dynamics and articulation, reflecting the distinctive style of the Guangling school of guqin music. The piece demands a flowing and echoing touch, conveying a reflective and melancholic sentiment of “speechlessly gazing at the moon”. This nuanced emotional expression not only reflects the profound artistic conception of guqin music but also embodies the Guangling school’s high pursuit of musical expression.
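The sentence-level structure described above can be represented as a small data model and flattened into triples using the Music Structure properties from Table 1. This is a sketch under stated assumptions: the classes `Phrase` and `Section`, the function `to_triples`, and the node identifiers are hypothetical, and only the tones given in the text (third phrase: Yu to Jue) are filled in, with the other phrases left empty rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional

PENTATONIC = ("Gong", "Shang", "Jue", "Zhi", "Yu")


@dataclass
class Phrase:
    index: int
    begin_voice: Optional[str] = None  # None means not stated in the text
    end_voice: Optional[str] = None

    def __post_init__(self):
        for tone in (self.begin_voice, self.end_voice):
            if tone is not None and tone not in PENTATONIC:
                raise ValueError(f"unknown tone: {tone}")


@dataclass
class Section:
    index: int
    function: str
    speed: str
    phrases: List[Phrase]


def to_triples(section: Section):
    """Flatten a section into (subject, property, object) triples."""
    sec = f"section{section.index}"
    ids = [f"{sec}_phrase{p.index}" for p in section.phrases]
    triples = [(sec, "GB:function", section.function),
               (sec, "GB:speed", section.speed)]
    for pid, phrase in zip(ids, section.phrases):
        triples.append((sec, "GB:include", pid))
        if phrase.begin_voice:
            triples.append((pid, "GB:beginVoice", phrase.begin_voice))
        if phrase.end_voice:
            triples.append((pid, "GB:endVoice", phrase.end_voice))
    # Consecutive phrases are chained with the sequential relationship.
    for current, following in zip(ids, ids[1:]):
        triples.append((current, "GB:continueWith", following))
    return triples


# Third section of Autumn Moon over the Han Palace, as described in the text.
SECTION_3 = Section(index=3, function="Start", speed="slow",
                    phrases=[Phrase(1), Phrase(2), Phrase(3, "Yu", "Jue"), Phrase(4)])

if __name__ == "__main__":
    for triple in to_triples(SECTION_3):
        print(triple)
```

Leaving unknown tones as `None` keeps the model honest about what the source records, while the containment and sequence triples still capture the four-phrase layout.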

5. Discussion and Conclusions

This study investigates the application of digital humanities methods in the systematization and knowledge discovery of ancient Chinese guqin books. By integrating evidence-based bibliographic analysis with cataloging and musicology through ontology-based knowledge modeling, this research constructs a knowledge graph that facilitates the organization, association, and discovery of complex information within guqin books. This approach proposes a feasible pathway for the knowledge organization and semantic modeling of traditional performing arts resources, such as guqin, in the digital age, further promoting interdisciplinary collaboration between traditional humanities and digital humanities. It also empowers research in traditional humanities through relevant digital humanities technologies. The ontology developed for the field of ancient guqin books demonstrates excellent reusability and transferability, providing a framework and methodological support that can be applied to knowledge organization and semantic modeling for other similar performing arts texts. The graph reasoning techniques and quantitative analysis methods employed in this study can uncover new, implicit knowledge hidden across resources, thereby forming a development trend in which digital discoveries feed back into traditional humanities research. In recent years, the rise of digital humanities has introduced innovative methodologies for studying traditional cultural texts, enabling a more comprehensive understanding of historical knowledge through the use of digital tools. Unlike traditional studies that often focus on isolated aspects of bibliology, literature, or musicology, this research leverages ontology to standardize domain knowledge and create a reusable, evolving conceptual framework. 
This standardization enhances the utility of guqin studies by providing scholars and enthusiasts with accessible, organized, and retrievable knowledge, reducing the need for exhaustive manual searches through extensive texts. By constructing a detailed and accurate evidence chain from bibliographic sources, the knowledge graph ensures that guqin knowledge is both verifiable and traceable, aligning with contemporary standards of scholarly rigor.

Moreover, this study makes a significant contribution to the field by incorporating musical elements and exploring the causal relationships between various types of knowledge within guqin books. The inclusion of these elements in the knowledge graph allows for a more nuanced understanding of the internal structures of guqin music, which are often overlooked in traditional research methodologies. By doing so, the research highlights the potential of digital humanities to uncover hidden knowledge and to facilitate interdisciplinary dialogue between bibliology, musicology, and digital studies. In addition, this study is grounded in the knowledge structure characteristics of traditional ancient guqin book resources. Compared to traditional ontology development, this research integrates the unique and often overlooked musical elements and performance techniques into the semantic organization and ontology modeling process as core entity categories. By employing a knowledge graph, it presents the complex internal structure of guqin music in a relatively clear and intuitive manner, further refining the granularity of semantic relationships within traditional performing arts books, represented by the guqin. As such, it offers valuable insights for scholars across disciplines, further promoting the integration of digital humanities in academic research and cultural preservation initiatives.

However, the scope of this research also points to the need for further integration of advanced digital technologies. While the current study successfully demonstrates the value of ontology-based knowledge modeling, it also identifies limitations in fully capturing the complexity of guqin books. Future research should focus on expanding the knowledge graph’s coverage, integrating more comprehensive datasets, and employing AI-driven tools to enhance the automation of ontology generation and knowledge inference. This will not only broaden the scope of guqin studies but also improve the accuracy and efficiency of research outcomes.

Availability of Data and Materials

The data that support this study are available from the corresponding author upon reasonable request.

Author Contributions

SZ and FM designed the research study. SZ performed the research. YH provided help and advice on validation. SZ analyzed the data. SZ wrote the manuscript. FM and YH revised the manuscript critically for important intellectual content. All authors contributed to editorial changes in the manuscript. All authors read and approved the final manuscript. All authors have participated sufficiently in the work and agreed to be accountable for all aspects of the work.

Acknowledgment

Not applicable.

Funding

This research received no external funding.

Conflict of Interest

The authors declare no conflict of interest.

Declaration of AI and AI-Assisted Technologies in the Writing Process

During the preparation of this work, the authors used ChatGPT-3.5 to check spelling and grammar. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

References

Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
