Identifying Human Remains from 20th Century Warfare: A State of the Field Essay

² MOE Key Laboratory of Contemporary Anthropology, Department of Anthropology and Human Genetics, School of Life Sciences, Fudan University, 200433 Shanghai, China

*Correspondence: wlx.wang@foxmail.com (Lingxiang Wang); wenshaoqing@fudan.edu.cn (Shaoqing Wen)
Academic Editor: Cristoforo Pomara

Front. Biosci. (Landmark Ed) 2022, 27(9), 271; https://doi.org/10.31083/j.fbl2709271

Submitted: 18 July 2022 | Revised: 17 August 2022 | Accepted: 31 August 2022 | Published: 29 September 2022

(This article belongs to the Special Issue Forensic pathology and forensic genetics: past, present and future)

This is an open access article under the CC BY 4.0 license.

Abstract

As we continually reflect on the wars of the 20th century, identification of the remains of victims takes an increasingly prominent position in ongoing research. Existing work on the identification of human remains from 20th century wars primarily covers the determination of phenotypic characteristics, kinship and geographic origins, supporting the establishment of genetic information databases. Compared with standard forensic methods, DNA analysis has proven more effective. The process of DNA analysis includes DNA extraction, genetic marker testing and data analysis. Protocols from ancient DNA research can be applied to degraded remains, and next-generation sequencing (NGS) techniques can compensate for shortcomings in the most commonly used PCR-capillary electrophoresis typing. As it stands, wide-ranging inter-governmental and inter-institutional collaboration is necessary in order to set up NGS-based public databases, and thereby promote the identification of human remains and archaeological forensics.

Keywords

forensic science

physical anthropology

degraded samples

STR typing

next-generation sequencing

1. Introduction

The remains of tens of millions of servicemen and civilian victims abandoned or buried during the wars of the twentieth century have begun to enter the spotlight of meaningful archaeological enquiry. With the new century, an increasing volume of human remains from 20th century wars has been identified, bringing renewed awareness of the cruelty of war and its impact on individuals. Identifying war victims brings new meaning to the value of peace. Researchers have collected and identified the remains of servicemen who lost their lives in foreign lands, repatriated these remains and located relatives in order to provide comfort and solace. Similarly, the identification of civilian victims has provided comfort for their families. Identification of major figures from these conflicts has also provided crucial new historical information.

Present research on the identification of human remains from 20th century conflicts has primarily covered servicemen and civilian casualties from World Wars I and II, the Spanish Civil War (1936–1939), the war in Bosnia and Herzegovina, the war in Croatia (1991–1995), the Chinese Civil War (1927–1949), the Korean War (1950–1953) and the Vietnam War (1955–1975). Databases containing the physical anthropological features and DNA profiles of victims have been established for the purpose of identification. In missing persons databases, a reference database containing mitochondrial DNA profiles of the maternal relatives of missing people, paired with a separate database containing mitochondrial DNA profiles of unknown human remains, has proven valuable in victim identification. A case in point concerns mass graves in Croatia and Bosnia and Herzegovina, where positive identification was achieved for 703 victims from 1155 skeletal samples through either standard forensic methods or DNA analysis [1]. These databases are also bound up with their relevant organizations. The Network for Genetic Identification of Victims (SIGO), administered by the Polish Genetic Database of Victims of Totalitarianism (PBGOT), is committed to the search for and identification of victims of Nazism in Poland, based on a victim list compiled by the Polish Institute of National Remembrance (IPN) [2]. The Slovenian Government Commission on Concealed Mass Graves has found more than 600 hidden burial pits, amounting to nearly 100,000 victims of extrajudicial killings during and after the Second World War [3]. The Institute for Genetic Engineering and Biotechnology in Sarajevo, the Department of Molecular Medicine in the Forensic Genetics Group of the Ruđer Bošković Institute and other institutions have identified skeletal remains excavated from mass graves in Slovenia [4]. 
In Croatia, a joint US-Croatian forensic anthropology project has recovered and identified missing individuals, including war victims [5]. The Casualty Identification Program of the Canadian Armed Forces, founded in 2007, aims to identify the remains of Canadian soldiers and airmen [6]. The Finnish Association for Cherishing the Memory of War Dead has supported the search for Finnish World War II soldiers [7]. The United States Army Central Identification Laboratory in Hawaii (CILHI) has identified war remains returned to the U.S. with the help of the Armed Forces DNA Identification Laboratory (AFDIL) [8]. These projects and organizations have played a major role in promoting the identification of human remains.

In 2018, in order to protect the soldiers who lost their lives in the wars, safeguard public interest, and uphold and pass on the patriotic spirit of such individuals, the PRC (People’s Republic of China) passed relevant laws to legally enshrine September 30th as Memorial Day. At the same time, the Chinese government has made strides beyond its borders for the repatriation, confirmation and reburial of war remains. From 2014 to 2021, China and South Korea successfully performed eight separate handovers, amounting to the remains of a total of 825 Chinese soldiers from the Korean War. This work continues, with the ninth batch due to be handed over in September of 2022. Additionally, the nongovernmental project “Veterans’ Homecoming” is dedicated to the search for and identification of soldiers from World War II. For the Chinese government and grieving relatives of these soldiers, identification of remains is a top priority. The Forensic Archaeology Laboratory of the Institute of Archaeological Science in Fudan University has been committed to the construction of a DNA database of war remains, which aims to locate the descendants and relatives of the victims, primarily through DNA comparison. To date, this database has accumulated archaeological and physical anthropological data of 572 remains from eight sites and accomplished DNA identification of the remains of men serving in the Chinese Expeditionary Force in Burma (1942–1945) [9] and in the Huaihai Campaign (1948–1949) [10].

2. Application of War-Related Human Remains Databases

The databases mentioned above lend great assistance to the purposes of identification. Methodological application divides into the following aspects:

(1) Determining the actual identities of war victims by restoring their phenotypic characteristics and comparing these with existing records. Standard forensic methods can be applied in this process. For example, in investigations at mass graves in Bosnia and Herzegovina, researchers relied on the medical and dental records of victims, on distinguishing features such as clothing and belongings, and on video superimposition to identify the victims [11]. Some phenotypic features, such as eye, skin, and hair color [12], baldness [13] and height [14], can be inferred from phenotype-related SNPs (single nucleotide polymorphisms). Successful applications of these techniques include the identification of a Slovenian elite by comparison of her eye and hair color with a portrait painting [15]. However, this identification work remains a challenge when records such as residential registration and military status are incomplete.

(2) Determining the kinship of war victims with living relatives by comparing their DNA profiles. This is a common post-war research agenda that has been successfully realized for a Vietnam War serviceman [8], a Slovenian elite couple [15], victims in mass graves of the Spanish Civil War [16, 17, 18], Bosnia and Herzegovina [19, 20], Croatia [21] as well as Slovenia [4, 22, 23], Red Army soldiers [24], Italian victims in the Fosse Ardeatine mausoleum [25, 26], a Royal Hungarian First Lieutenant casualty in Ukraine [27], victims of the Korean War [28], Norwegian [29] and Finnish [7] World War II Soldiers, and Polish victims of totalitarian regimes [30]. Reliable comparison databases are required in order to conduct such kinship identification, though some servicemen died childless, and relatives of victims are often likely to have passed away or lost contact, further complicating the establishment of comparative databases.

(3) Determining the geographic origins of the war victims using genetic data. Genetic markers on the Y chromosome and mitochondrial DNA (mtDNA) represent monophyletic genetic markers, exhibiting specific lineages for different populations, and are therefore often used to infer individual bio-geographical origin in forensic sciences. In the identification study of the Chinese Expeditionary Force, Y-haplogroups were inferred from Y-STRs (short tandem repeats on the Y chromosome) as an indicator of provincial origins, with the results echoing a Chinese Expeditionary Force list maintained by a private website [9]. A small panel of SNPs has also been used to further determine the Y-haplogroups of Chinese Civil War soldiers [10]. However, recent human migrations may lead to certain biases when determining geographical origin [31]. With the development of genomic studies, specific SNPs can serve as ancestral informative markers that will allow for more accurate homing in on geographic origins.
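The kinship comparison described in item (2) is commonly summarized as a likelihood ratio (LR) between two hypotheses: the tested pair are parent and child, or they are unrelated. The following is a minimal sketch of that calculation for a single autosomal locus, assuming Hardy-Weinberg proportions and no mutation model. The function names, the locus and the allele frequencies are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def transmit_prob(parent, allele):
    """Probability that a parent with genotype `parent` passes on `allele`."""
    return Counter(parent)[allele] / 2.0


def parent_child_lr(parent, child, freqs):
    """Single-locus LR: 'parent and child' versus 'unrelated'.

    parent, child: tuples of two allele labels, e.g. (12, 14).
    freqs: dict mapping allele label -> population frequency (hypothetical).
    Mutation is ignored, so a pair sharing no allele yields LR = 0.
    """
    a, b = child
    if a == b:
        # Homozygous child: one copy from the parent, one from the population.
        numerator = transmit_prob(parent, a) * freqs[a]
        denominator = freqs[a] ** 2
    else:
        # Heterozygous child: either allele may be the one inherited.
        numerator = (transmit_prob(parent, a) * freqs[b]
                     + transmit_prob(parent, b) * freqs[a])
        denominator = 2 * freqs[a] * freqs[b]
    return numerator / denominator


def combined_lr(parent_profile, child_profile, freq_table):
    """Multiply single-locus LRs over loci typed in both profiles."""
    lr = 1.0
    for locus in parent_profile:
        if locus in child_profile:
            lr *= parent_child_lr(parent_profile[locus],
                                  child_profile[locus],
                                  freq_table[locus])
    return lr


# Hypothetical allele frequencies for one illustrative locus.
example_freqs = {12: 0.1, 14: 0.2, 15: 0.3}
example_lr = parent_child_lr((12, 14), (12, 14), example_freqs)
```

Multiplying across loci assumes the loci are independent. Casework calculations also model mutation, drop-out and population substructure, all of which are left out of this sketch.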

3. The Establishment of War-Related Human Remains Database

The methods used in establishing databases on human remains from warfare can be categorized as either standard forensic methods or DNA techniques, the latter mainly including capillary electrophoresis-based STR typing, mitochondrial DNA sequence analysis and genome-wide SNP analysis. Standard individual identification methods in forensic science include recognition through facial features, as well as through hand geometry, iris, tattoos and scars, fingerprints, skeletal and dental morphology, and bite marks [1]. The advantage of standard forensic methods lies in the reconstruction of individual features that cannot be accurately inferred through DNA analysis, such as age at death [32], surgical traces, types of trauma, bone length, robustness and muscle crest development [5]. These methods benefit identification and also provide biographical information on victims’ experiences. For example, the forensic investigation of World War II victims at three karst sinkholes in southern Croatia showed that young and healthy people suffered more violence than elderly or invalided individuals [33].

Considering that several decades have often passed since the events of 20th century warfare, and that remains may be in a poor state of preservation, DNA analysis stands out as the most feasible method to identify the remains in the absence of relevant records [1]. The effectiveness of identification using standard forensic methods is often lower than that of DNA analysis [34]. The sections below focus on the relevant DNA techniques that have been employed.

3.1 Burial Environment and Sample Collection

Difficulties in the identification of war victims primarily originate in the degraded condition of the skeletal remains. During the lifetime of the organism, DNA integrity and stability are maintained through the cell’s DNA repair mechanisms. Following death, DNA is gradually degraded by nucleases and microorganisms. Because the remains of these relatively recent casualties have been exposed to the sun and rain or interred for decades prior to collection, they share similar characteristics with ancient skeletal samples. DNA in remains aged over four years will have degraded to fragments, and cytosine may convert to uracil following deamination [35]. No direct correlation has been shown to hold between sample age and the length of the DNA fragments, though there are more adenine residues at the 5′ end of DNA in samples dated less than a century old, and more guanine residues in those dated > 40,000 BP [36]. In the PCR process, nucleotide modifications and cross-links may prevent DNA polymerase from functioning, thus impairing amplification [37]. In addition, humic acids and calcium chloride are likely to be present in excavated materials as PCR inhibitors [38]. These characteristics call for more effective identification procedures to produce reliable DNA profiles.

Under the same external conditions, the anatomical location of bones impacts DNA quality and yield to a certain degree [39]. In analysis of ancient skeletal samples dated 1800–10,000 cal. BP, high endogenous DNA could be obtained from the petrous portion of the temporal bone [40], which is widely chosen in ancient DNA research. However, for skeletal samples of a relatively young age, the DNA yields of small cancellous bones are, on average, higher than those of dense cortical bones [39]. This might be the consequence of the soft tissues in cancellous bones, which contain more DNA than dense cortical bones and will not have completely degraded in such circumstances, as confirmed by SR micro-CT [41] and X-ray photoelectron spectroscopy [42]. A team from the Institute of Forensic Medicine at the University of Ljubljana performed a series of studies on the differences in DNA quantity and quality extracted from different skeletal elements of World War II victims. Using PowerQuant (Promega) and the PowerPlex ESI 17 Fast System (Promega) to compare DNA yield and STR typing success between the petrous portion of the temporal bone and metacarpals III, they demonstrated that no significant difference existed between the two elements. Therefore, when the skull cannot be collected, metacarpals III can be used as a substitute, which also avoids the contamination risk of performing excisions on the skull [43]. Similarly, in a comparison of 48 different types of bones from the head, torso, arm, leg, hand and foot of three victims, the small cancellous bones of the foot and hand as well as the petrous portion of the temporal bone provided the highest DNA yields and the most complete STR profiles. This study also demonstrated the impact of micro-environmental conditions on sample quality [44]. 
In situations where only torso bones could be collected, the vertebral arches of the 12th thoracic vertebra are recommended for identification [45] due to the higher degree of bone remodeling [46] or cortical bone characteristics. Rib bones are the most highly recommended, especially at the proximal or vertebral ends [47]. The 12th thoracic vertebra and the first rib are both suitable for sampling, because their DNA yields and STR typing success rate show greatest conformity [48]. Using these high-yield skeletal sites can effectively improve the success rate of identification.

At present, the identification of war victims mainly follows or refers to the guidelines and protocols for Disaster Victim Identification (DVI) developed by the International Criminal Police Organization (INTERPOL) and ratified by 190 member nations [49]. The importance of DNA analysis procedure for DVI was proposed during a round table discussion on the 2004 Indian Ocean Earthquake and Tsunami as part of the 21st congress of the International Society for Forensic Genetics [50], spawning a series of technical recommendations. One recommendation suggested that different sample types for the same individual should be collected for DNA testing in order to avoid mistakes and re-sampling. To meet this recommendation, for mixed remains, individuals would need to be distinguished by physical anthropologists prior to DNA analysis.

3.2 DNA Extraction and Quality Control

More efficient extraction methods are required to obtain highly degraded DNA from human remains. Prior to extraction, surface dirt on samples must be removed with a scalpel. Bones should then be immersed in 5% sodium hypochlorite solution for at least 15 minutes, washed with ethanol, ground and placed in sealed tubes. EDTA, Proteinase K and SDS are used to induce cell lysis and proteolysis, which can significantly increase the yield of endogenous DNA [51]. Because the amount of endogenous DNA in the remains of war dead is far lower than that of environmental contaminants, complete protection during sampling, a sterile and clean operating environment, DNA authenticity evaluation and repeated, independent analyses are all necessary in order to demonstrate the absence of contamination [52].

DNA extraction methods used in present research on human remains from warfare include organic extraction methods relying on phenol and chloroform [53], magnetic bead-based methods [54], and silica-binding methods [55]. In organic extraction methods, proteins partition into the organic phase while DNA remains in the aqueous phase. However, up to 75% of the DNA can be lost through organic extraction [56]. Magnetic bead extraction methods involve a positively charged binding buffer and negatively charged magnetic beads which can bind with DNA. This is usually semi-automated, such as through Promega DNA IQ [57] or robot-guided Enlighten Biotech extraction [58]. It is worth noting that in the extraction of highly degraded ancient DNA, silica-coated magnetic beads provide high yields in a short time [59], meaning this protocol has definite potential in the identification of victims. As with magnetic beads, silica columns are negatively charged, allowing for DNA adsorption in the binding buffer. The performance of organic extraction methods and silica-binding methods has been compared in a study on victims of armed conflicts in the Balkans from 1992–1995. DNA extracted by silica-binding methods exhibited three times the purity of DNA extracted using organic extraction methods, and subsequent amplification was more successful owing to the reduction of amplification inhibitors [60]. The extraction protocol established by Dabney et al. [61] is widely used in such degraded DNA research. Silica-binding methods can recover DNA fragments as short as 25 bp [62], and the corresponding analysis method can handle DNA fragments as short as 35 bp [63], which satisfies requirements in the identification process. Therefore, in the work of identification, magnetic beads and silica columns are recommended for degraded DNA extraction over phenol and chloroform methods. 
Considering that the lysis and demineralization process is time-consuming, the Promega Bone DNA Extraction kit, custom-made for the Maxwell FSC extraction robot, along with the Hamilton AutoLys tube, provides a rapid extraction protocol based on partial demineralization, suitable for large-scale identification [64]. It is worth noting that silica-binding methods are suited to extracting short fragments from poorly preserved samples, and that longer DNA fragments extracted from less degraded remains may have to be broken into shorter ones. Furthermore, for highly degraded samples, whole genome amplification (WGA) has the potential to improve the identification success rate [65], but PCR amplification bias may be introduced and subsequently impact downstream detection.

Following DNA extraction, preliminary quality control is also needed, such as through DNA quantification and fragment length analysis. The methods used for DNA quantification mainly include quantitative PCR (qPCR), quantitative spectrophotometer analysis and fluorescence quantification. Results from quantitative spectrophotometer analysis might be unreliable, while the effectiveness of qPCR has been confirmed in ancient and World War II bone samples [66]. Fluorescence quantification is widely used in the biological research field. For instance, the Thermo Fisher Qubit 2.0 performs well in the quantification of ancient DNA, and has also been used in the DNA identification of soldiers’ remains [9]. Since microorganism and modern human DNA is significantly longer than that from human remains, fragment length analysis can assist in determining the main component of DNA templates. Agarose gel electrophoresis is an important and traditional method for fragment length analysis, but on-chip electrophoresis is now commonly used, e.g., the Agilent 2100 Bioanalyzer, together with the software 2100 Expert and the High Sensitivity DNA Kit.

When using next-generation sequencing technology, in addition to the standard contamination-control operation during the experiment, contamination assessment and quality control of the sequencing data are also essential. The latest contamination assessment method is based on identifying endogenous DNA sequences by the characteristic DNA damage found in human remains [67]. Preliminary quality control of read length from original data is carried out through leeHom [68], before the damage pattern and the length distribution from the BAM file containing sequence alignment information are evaluated using mapDamage2.0 (http://ginolhac.github.io/mapDamage/) [69]. For assessment of contamination based on mitochondria, sex chromosomes and autosomes, the software Schmutzi [70], ANGSD [71], and the MCMC algorithm in DICE can be applied respectively [72].
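The damage pattern mentioned above follows from cytosine deamination: a uracil left in the template is read as thymine, so authentic degraded DNA tends to show an excess of C-to-T mismatches near the 5′ ends of reads. The sketch below illustrates that idea only. It is not how mapDamage2.0 is implemented, and the function name, the input format (pairs of gap-free, equal-length reference and read strings in 5′ to 3′ orientation) and the toy data are assumptions made for illustration.

```python
def ct_misincorporation(alignments, n_positions=10):
    """Per-position C->T mismatch frequency at the 5' end of reads.

    alignments: iterable of (reference, read) string pairs, already aligned
                without gaps and oriented 5'->3' (a simplifying assumption).
    Returns a list of length n_positions; an entry is None when no reference
    cytosine was observed at that position.
    """
    ref_c = [0] * n_positions    # reference C observed at position i
    c_to_t = [0] * n_positions   # ... of which the read shows T
    for ref, read in alignments:
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                ref_c[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / ref_c[i] if ref_c[i] else None
            for i in range(n_positions)]


# Toy alignments (hypothetical, not real data).
toy = [("CACG", "TACG"), ("CCGT", "CTGT"), ("ACGT", "ACGT")]
profile = ct_misincorporation(toy, n_positions=3)
```

A frequency that is elevated at the first positions and falls towards the read interior is the kind of signal used to argue for authenticity, whereas a flat, near-zero profile is more typical of modern contaminant DNA.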

3.3 PCR-Capillary Electrophoresis Typing and Next-Generation Sequencing

At present, PCR-capillary electrophoresis (CE) for the autosomal and Y chromosome STR loci is the most commonly used method in the identification of human remains from warfare. For instance, PowerPlex® 16 (15 STRs) [4], PowerPlex® Y23 (23 Y-STRs) [17], AmpFlSTR® NGM™ PCR Amplification Kit (15 STRs and Amelogenin), AmpFlSTR® Y-Filer™ PCR Amplification Kit (15 Y-STRs) [24] and Investigator ESSplex SE QS (16 STRs and Amelogenin) [45] have proven applicable in research on war victims. Mini-STRs can bring about better results for short DNA fragments in the degraded samples, with amplification efficiency improved due to smaller amplicons [73]. When identifying the victims at mass graves in Slovenia, 5 mini-STR loci were used to improve observation sensitivity and increase tolerance to common inhibitors and thus successfully obtain complete profiles from a small amount of DNA [74]. Mini-STR loci were also effective in confirming the kinship between Korean War victims and their relatives [28].

As has been noted, SNPs can be used to infer the phenotype of the victims. The HIrisPlex assay, a single panel covering 24 phenotypic markers, is a successful application [75]. This system has been effective in predicting the eye and hair colors of Second World War victims in mass graves, combined with STR profiles to confirm kin relations to potential victims [76]. The PCR-CE typing method has also been used in mtDNA sequencing. Hypervariable region I (HVI) and hypervariable region II (HVII) are the main targets [7, 17, 18, 23, 25, 27], and small amplicons as well as primers with low nucleotide variability are useful for successful amplification [28]. When samples are poorly preserved, the more durable mitochondrial DNA can provide valid information; however, mitochondrial heteroplasmy at the individual level and haplotype frequency differences at the population level may have a negative impact on identification.

There remain, however, some drawbacks with traditional PCR-CE typing methods. The first pertains to limits on the number of genetic markers that can be detected at one time. Although upgrades from a monochromatic to polychromatic fluorescence system have increased the number of markers, the present number still stands below 40, limiting the total detectable forensic genetic markers [77]. This hinders the accuracy of kinship identification under certain circumstances, e.g., where local inbreeding may result in an unexpectedly high number of false matches between unrelated individuals in STR typing [78]. Furthermore, when the number of genetic loci in the established database is lower than that for the newly generated data, the risk of a false conclusion rises remarkably [79], making database uniformity essential. Secondly, when multiple types of genetic markers are involved, choosing identification kits is challenging and inefficient as a result of the low number and short length of DNA templates in human remains. Multiple iterative tests (autosomal STR, Y-STR, Y-SNP, X-STR, and mitochondrial DNA) would consume a considerable proportion of the limited DNA templates. In addition, when amplicon length exceeds the length of the DNA templates extracted from degraded samples (70–150 bp), the instability of capillary electrophoresis typing and abnormal events will increase [80], e.g., through ladder-like bands, stutter bands, unbalanced amplification of alleles, allele drop-out and PCR substitution errors. Solutions to these problems involve shortening amplicon length, improving typing sensitivity [81] and using new SNP typing systems [82]. Notably, next-generation sequencing (NGS) can meet the above requirements, providing a feasible scheme for DNA identification of human remains.
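The relationship between amplicon length and template length can be made concrete with a small calculation: a fragment can only serve as a PCR template if it is at least as long as the amplicon. The sketch below computes the share of fragments meeting that necessary condition. The function name, the fragment lengths and both amplicon sizes are hypothetical values chosen for illustration; in practice the lengths would come from a measured fragment length distribution.

```python
def usable_template_fraction(fragment_lengths, amplicon_length):
    """Fraction of fragments long enough to span an amplicon.

    This is only a necessary condition: the fragment must also cover the
    locus and both primer sites, which this sketch does not model.
    """
    if not fragment_lengths:
        raise ValueError("fragment_lengths must not be empty")
    usable = sum(1 for length in fragment_lengths
                 if length >= amplicon_length)
    return usable / len(fragment_lengths)


# Hypothetical fragment lengths (bp) for a degraded extract.
fragments = [70, 85, 100, 120, 150, 60, 95, 110]

# Hypothetical amplicon sizes: a short mini-STR-like amplicon
# versus a longer conventional one.
short_amplicon = usable_template_fraction(fragments, 90)
long_amplicon = usable_template_fraction(fragments, 250)
```

With these illustrative numbers, most fragments can support the 90 bp amplicon and none can support the 250 bp one, which is the intuition behind shortening amplicons for degraded samples.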

Development of NGS technology has resulted in dramatic reductions in sequencing costs, making possible the application of such techniques to a large number of human remains. Compared with capillary electrophoresis, the advantages of NGS mainly lie in its ability to detect a greater number of genetic markers, distinguish mutations within STRs and differentiate mixed samples. Reliable results can be acquired even with 1 ng of DNA input [83], demonstrating the suitability of this technique for degraded samples. The MiSeq FGx™ & ForenSeq™ system, launched by Illumina, shows great promise for forensic science; it involves 26 autosomal STRs, 24 Y-STRs, 7 X-STRs (plus Amelogenin), 95 identity SNPs, 22 phenotypic and 56 geographical ancestry SNPs (optional) [84]. This system has been used to generate DNA profiles of 13 individuals recovered from mass graves from the Spanish Civil War. In this context it was demonstrated that enhanced library inclusions seemed to have increased the number of reads, thus improving overall performance. The probability of excluding unrelated individuals was > 99.99% for first-degree relatives and exceeded 99% for second-degree relatives, overcoming the inaccuracy of STR typing [16]. Nevertheless, between approximately 8% and 34% of the loci showed allele drop-out, possibly because the STR amplicons generally exceed 150 bp in length, which is too long for amplifying degraded DNA in human remains. Meanwhile, the error rate of first cousin testing was relatively high, hence more markers were required [85]. In addition, the ancestral informative SNPs and phenotypic informative SNPs in this system did not form a versatile set, since databases containing these SNPs had not been developed across all populations. For example, these ancestral informative SNPs could not meet the need for distinguishing different populations in East Asia, especially when distinguishing between the remains of Chinese and Japanese soldiers. 
Phenotypic informative SNPs also failed to provide effective information in the East Asian case, since pigment-related phenotypes (skin color, iris color, hair color, etc.) in these populations exhibited low diversity. Another frequently used kit is the Ion Torrent™ HID SNP 169-plex, containing 51 autosomal SNPs from SNPforID, 85 autosomal SNPs from Kiddlab and 33 Y-SNPs [86], but its utility in the analysis of human remains is yet to be assessed.

3.4 Complex Kinship and Biogeographic Inference

The essence of DNA identification of human remains is to ascertain the identities and geographical origins of the samples. Kinship analysis is the key method used for determining the identities of remains. Kinship may include the full-sib relationship, half-sib relationship, and uncle-nephew relationship, among other relations. The degradation or mixing of DNA in remains, and the absence of family members available for testing, have exacerbated the problems associated with kinship analysis [87]. Although full siblings can be identified by adding STR markers [88], existing STR typing systems are often ineffective for more distant kinship, such as secondary kinship. Therefore, more genetic markers are required, including SNPs, in order to enhance detection ability within complex kinship analysis. To date, a number of NGS-based panels have indeed been developed to identify complex kinship. Ida Grandell et al. [89] formulated a panel containing 140 autosomal SNPs that could be used to identify complex kinship. Manfred Kayser et al. [90] designed a multiplex PCR panel containing 530 Y-SNPs, which improved resolution power for detecting paternal inheritance. Zhang et al. [91] selected 233 autosomal SNPs, 9 Y-SNPs and 31 X-SNPs for secondary kinship identification. Mo et al. [92] found that 85, 127, 491 and 1858 SNPs were needed to distinguish parents, siblings, half siblings/uncles, and cousins respectively, and then developed the SNP2kin panel for secondary kinship identification. In terms of data analysis, Bian et al. [80] established a statistical decision-making method based on a ‘NGS+’ forensic genetic marker analysis system, i.e., the distribution patterns and characteristics of genetic markers between two individuals within third-degree relatives. As next-generation sequencing technology matures, people are increasingly entrusting companies to sequence their own genomes, which are then uploaded to public databases. 
Despite the ethical and privacy issues that come with this new phenomenon, these databases have proven effective in determining individual identity in the absence of close relatives. High-density SNPs enable us to trace distant or unknown relatives using the likelihood ratio (LR) and ‘identical by state’ (IBS) measures, which show a higher true classification rate when used in combination [93]. A classic case is the arrest of the “Golden State Killer”, accomplished with the help of the public genetic database GEDmatch [94]. Outside of the investigation of criminal cases, forensic genealogy can also be used to search for the ancestors of individuals and the relatives of war victims.
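The ‘identical by state’ measure mentioned above can be illustrated with a short sketch: at each biallelic SNP, two individuals share 0, 1 or 2 alleles by state, and the average over many SNPs rises with relatedness. The genotype coding (0, 1 or 2 copies of the alternate allele, `None` for a missing call), the function names and the toy genotypes are assumptions made for illustration, not the scoring scheme of any cited study.

```python
def ibs_counts(genotypes_a, genotypes_b):
    """Count SNPs at which two individuals share 0, 1 or 2 alleles by state.

    Genotypes are coded as 0, 1 or 2 copies of the alternate allele;
    None marks a missing call and the SNP is skipped.
    """
    counts = {0: 0, 1: 0, 2: 0}
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        # For biallelic SNPs the number of shared alleles is 2 - |a - b|.
        counts[2 - abs(a - b)] += 1
    return counts


def mean_ibs(genotypes_a, genotypes_b):
    """Average proportion of alleles shared by state (0.0 to 1.0)."""
    counts = ibs_counts(genotypes_a, genotypes_b)
    n_snps = sum(counts.values())
    if n_snps == 0:
        raise ValueError("no SNPs with calls in both individuals")
    return (counts[1] + 2 * counts[2]) / (2 * n_snps)


# Toy genotypes (hypothetical) for two individuals at five SNPs.
person_a = [0, 1, 2, 1, None]
person_b = [0, 2, 0, 1, 1]
sharing = mean_ibs(person_a, person_b)
```

One design note: barring genotyping error and mutation, a true parent-child pair never shares zero alleles at a SNP, so the `counts[0]` tally is itself informative. With degraded samples, allele drop-out can inflate that tally, which is one reason IBS is usually combined with a likelihood-based method.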

Biogeographic ancestry analysis, which examines markers whose frequencies vary between populations in order to infer the specific geographic origin of individuals, stands out as the major method for determining the origin of remains. As mentioned above, a simple reliance on uniparental genetic markers is prone to false inferences, meaning that the use of additional markers is recommended. In the past decade, many panels containing a limited number of ancestry informative markers (AIMs) have been developed. A comparison study of 21 independently developed AIM panels covering 1397 SNPs revealed that few SNPs overlapped in any of these panels [95]. The EUROFORGEN Global AIM-SNPs panel contained 128 markers that were mainly used to distinguish populations from five continents [96], and the Precision ID Ancestry kit launched by Thermo Fisher contained 165 AIMs selected from Seldin [97] and Kidd [98]. These panels greatly facilitate work to distinguish individuals from different areas of the world who come together in intercontinental wars such as World War II. In terms of STRs, the global population can be divided into five continental clusters, with the Eurasian cluster splitting into three groups: Europe, Middle East, and Central and South Asia [99]. When analyzing 650,000 SNPs, the genetic structure of different populations shows a similar pattern [100]. However, small AIM panels are hardly applicable to scenarios involving neighboring subpopulations or admixed populations, given the difficulty in balancing differentiation between populations [101]. That is, the cumulative population-specific divergence (PSD) values among these groups are hard to balance with limited SNPs. This imbalance of ancestral information in small AIM sets increases the redundancy of the whole panel while reducing its resolution. Phillips [102] has therefore proposed carefully balancing the PSD of all comparison groups in the development of an AIM panel. 
Kidd [95] proposes a two-tier approach: in the first stage, a smaller number of SNPs is used to distinguish between continental groups, while in the second stage another panel is developed to perform fine-grained ancestry inference according to the actual regional situation. Overall, in order to solve the above problems and achieve higher resolution, more genetic markers need to be detected simultaneously. Hence, the second-tier AIM panel should be developed with the help of large-scale population genome databases such as 1000 Genomes (https://www.internationalgenome.org/) [103] and gnomAD (http://www.gnomad-sg.org/) [104], and be further verified on more related populations. Considering all the above, the Forensic Archaeology Laboratory of the Institute of Archaeological Science at Fudan University developed Panel A, a multiplex PCR amplification system based on short amplicons covering hypervariable region I (16025–16399, 375 bp) and hypervariable region II (65–371, 307 bp) of mitochondrial DNA and three sex-determination markers (Y-indel, SRY and Amel), and Panel B, containing 47 Y-mini STRs and 485 Y-SNPs, which cover the common haplogroups in East Asia. These two panels were used to identify maternal lineages, paternal lineages and sex. Moreover, Panel C, containing 145 AI-SNPs, has been used to infer the geographical origins of East Asian populations.
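
The two-tier logic described above can be sketched in a few lines of Python. This is a simplified illustration rather than the method of any cited panel: it assumes independent biallelic SNPs in Hardy-Weinberg equilibrium, scores each reference population by the log-likelihood of the observed genotype, and only consults a regional (second-tier) panel once a continental group has been chosen. The function names, population labels and allele frequencies are all hypothetical and invented for the example.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Genotype = Sequence[Optional[int]]  # copies of the reference allele (0/1/2); None = missing
FreqTable = Dict[str, Sequence[float]]  # population -> reference-allele frequency per SNP


def genotype_log_likelihood(genotype: Genotype, freqs: Sequence[float]) -> float:
    """Log-likelihood of a genotype given allele frequencies, assuming HWE and independent SNPs."""
    total = 0.0
    for g, p in zip(genotype, freqs):
        if g is None:
            continue  # missing calls contribute no information
        p = min(max(p, 1e-3), 1 - 1e-3)  # clamp so that log(0) cannot occur
        q = 1.0 - p
        prob = (q * q, 2 * p * q, p * p)[g]
        total += math.log(prob)
    return total


def best_population(genotype: Genotype, table: FreqTable) -> str:
    """Return the population label with the highest genotype log-likelihood."""
    return max(table, key=lambda pop: genotype_log_likelihood(genotype, table[pop]))


def two_tier_assign(
    tier1_genotype: Genotype,
    tier2_genotype: Genotype,
    tier1_freqs: FreqTable,
    tier2_freqs: Dict[str, FreqTable],
) -> Tuple[str, Optional[str]]:
    """First assign a continental group, then refine with that group's regional panel if one exists."""
    continent = best_population(tier1_genotype, tier1_freqs)
    regional = tier2_freqs.get(continent)
    if not regional:
        return continent, None
    return continent, best_population(tier2_genotype, regional)


if __name__ == "__main__":
    # Hypothetical frequencies for illustration only; real panels use many more SNPs.
    tier1 = {"EAS": [0.9, 0.8, 0.1], "EUR": [0.1, 0.2, 0.9]}
    tier2 = {"EAS": {"north": [0.7, 0.3], "south": [0.3, 0.7]}}
    print(two_tier_assign([2, 2, 0], [2, 0], tier1, tier2))
```

A maximum-likelihood assignment of this kind does not by itself handle admixed individuals or closely related neighboring populations, which is precisely where the text notes that small panels lose resolution.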

While multiplex PCR panels fail to meet higher resolution requirements, high-throughput sequencing based on probe hybridization capture offers greater potential, especially for complex kinship determination. For example, the 1240K panel, first described in 2015, has been widely used in ancient DNA research; this panel includes about 1.15 million autosomal SNPs, 49,000 X-SNPs and 33,000 Y-SNPs [105]. Compared with whole genome sequencing, this kit reduces sequencing and analysis costs by orders of magnitude, particularly for degraded samples with low-quality endogenous human DNA. Because sequences enriched by probe capture are highly specific, this kit can effectively avoid potential contamination introduced through PCR. One newly developed kit based on the 1240K panel is the Daicel Arbor Biosciences myBaits Expert Human Affinities Kit (https://arborbiosci.com/), containing more than 2 million sites.

4. Conclusions and Prospects

As DVI-related techniques have improved, the identification of human remains from 20th century warfare has been actively carried out in many countries. The structures of existing databases can serve as references for the establishment of new databases, whose applications cover the restoration of phenotypic characteristics, kinship analysis and biogeographic ancestry inference. DNA analysis is an effective supplement to physical anthropological methods, and PCR-CE is the most commonly used technique in existing research. The experimental and analytical techniques of ancient DNA studies can also be applied to war remains, including DNA extraction with magnetic beads and silica columns, high-throughput sequencing based on probe hybridization capture, contamination assessment and quality control. Since next-generation sequencing can overcome the shortcomings of some traditional DNA analysis techniques, including capillary electrophoresis, the various multiplex PCR and probe capture kits developed with sequencing technology promise to be more widely used in the identification of human remains from warfare.

In addition to DNA analysis, we should give full consideration to the uniqueness of each sample and proceed on a case-by-case basis when identifying specific human remains. Interdisciplinary methods and techniques may also be introduced where circumstances permit. For example, Someda et al. [106] have provided new scientific evidence for clearly distinguishing Japanese and American World War II servicemen by comparing carbon and oxygen stable isotope levels in tooth enamel. Stable isotope analysis has been widely used in studies of ancient human diet and migration in recent years due to its strong correlation with regional dietary practice [107]. Analysis and comparison of carbon, nitrogen and other stable isotopes in different human tissues can yield important information on human diets and living environments. These features can help to identify some specific remains, such as the remains of overseas Chinese on the Chinese battlefields of World War II that are unidentifiable using DNA analysis.

The identification process includes the excavation of remains, the transportation and storage of samples, physical anthropological identification and DNA analysis, and the collection of information on victims and their relatives. Completing this process requires not only the permission and support of the relevant governments, but also considerable information about the remains from historians, physical anthropologists, forensic anthropologists and even social workers. This in turn demands extensive cooperation between governments, biotech companies and university research institutes. The European Network of Forensic Science Institutes (ENFSI) and the European DNA Profiling Group (EDNAP) have been collaborating on the standardization of DNA profiling throughout Europe, strengthening cooperation between DNA laboratories [72]. Similar standardization could achieve the same effect in other regions, provided that protocols are agreed upon.

In China, forensic archaeological research has been applied to a mere 1% of the remains of soldiers or civilian victims of warfare, leaving a huge gap between fieldwork and laboratory work. Additional gaps exist in the corresponding standards for sample collection, preservation, experimental norms and safety regulations, and especially in professional ethics. The Regulations of the People’s Republic of China on the Administration of Human Genetic Resources, which came into force on July 1st, 2019, clearly stipulate that human genetic resources are genetic materials containing the human genome and genes, a definition which naturally covers human remains. There is therefore an urgent need to introduce specific ethical regulations for unearthed human remains, as well as for informed consent and privacy protection for relatives of the soldiers and victims. In this context, using state-of-the-art NGS technologies (multiplex PCR panels, probe capture panels and shotgun sequencing), the Forensic Archaeology Laboratory of the Institute of Archaeological Science at Fudan University, working jointly with related research institutes, has begun establishing a public database [9, 10, 108], which will promote the development of DVI-related techniques and archaeological forensics in China with reference to successful international experience.

Author Contributions

SW and YX designed this study. YX and LW organized the references. YX, EA and LW wrote the manuscript. LW, EA and SW revised the manuscript. All authors reviewed the manuscript.

Ethics Approval and Consent to Participate

Not applicable.

Acknowledgment

Not applicable.

Funding

This work was funded by the National Natural Science Foundation of China (32070576).

Conflict of Interest

The authors declare no conflict of interest.

Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

[1]

Andelinović S, Sutlović D, Erceg Ivkosić I, Skaro V, Ivkosić A, Paić F, et al. Twelve-year experience in identification of skeletal remains from mass graves. Croatian Medical Journal. 2005; 46: 530–539.