1 Department of Pharmacy, First Hospital of Shanxi Medical University, 030001 Taiyuan, Shanxi, China
2 School of Pharmacy, Shanxi Medical University, 030001 Taiyuan, Shanxi, China
Abstract
β-lactam allergy labels (BALs) are commonly found in patient records but are often inaccurate. This can lead to suboptimal antibiotic selection, increased healthcare costs, and antimicrobial resistance. Most existing risk assessment tools were developed in Western settings and are not directly applicable to Chinese clinical contexts. This study developed and pilot-tested a pharmacist-led BAL risk assessment tool tailored to the Chinese healthcare environment.
The study was conducted in three phases: (1) A systematic review of 90 studies to identify key β-lactam allergy risk factors; (2) Grounded theory and text co-occurrence analysis to extract high-risk features and construct the assessment framework; and (3) A pilot implementation in a tertiary hospital to evaluate the tool’s feasibility, clinical impact, and patient outcomes using a quasi-experimental design.
The final tool comprised eight dimensions, 35 subdimensions, and 1328 distinct coded nodes. Of the 289 patients involved in the pilot, 18.7% were classified as high risk. After implementation, patients without high-risk features had significantly shorter hospital stays than the pre-intervention group (8.5 ± 4.3 vs. 10.6 ± 5.5 days; p < 0.001), lower hospitalization costs (CNY 17,800 ± 6200 vs. 21,000 ± 7500; p = 0.0011), and lower allergy event rates (0% vs. 6.5%; p = 0.002). β-lactam use increased (75.3% vs. 40.3%; p < 0.001), whereas second-line antibiotic use decreased (24.7% vs. 59.7%; p < 0.001). The tool also demonstrated high feasibility, achieving a 100% completion rate and strong adherence among pharmacists.
This pharmacist-led risk assessment tool has strong potential for accurately identifying high-risk β-lactam allergy patients and optimising antimicrobial stewardship in Chinese hospitals. Further large-scale validation is warranted.
Keywords
- beta-lactam
- allergy and immunology
- risk assessment
- antimicrobial stewardship
Several international tools have been developed in recent years to help clinicians identify low-risk patients who are eligible for de-labeling. These include the PEN-FAST score in Australia [10], Shenoy’s multidimensional algorithm in the US [11], the Antibiotic Allergy Assessment Tool [12], and electronic screening tools tailored for paediatric populations [13]. In 2022, the American Academy of Allergy, Asthma & Immunology and the American College of Allergy, Asthma & Immunology jointly updated their clinical practice parameters, further reinforcing the importance of risk stratification based on the nature and severity of prior reactions [14]. Through the use of structured symptom checklists and risk-scoring algorithms, these tools have demonstrated success in reducing the unnecessary use of broad-spectrum antibiotics, shortening hospital stays, and lowering healthcare costs.
However, most existing tools were developed in Western settings, focusing on specific populations or use-case scenarios. Their heterogeneous assessment criteria and limited indicator coverage have hindered their adoption in clinical practice. Key variables, such as the timing of the initial reaction, reaction type and severity, organ system involvement, and comorbidities, are inconsistently defined across tools, which reduces the comparability and generalisability of results. Furthermore, cultural and systemic differences pose additional challenges to their application in non-Western healthcare environments.
The clinical use of allergy risk assessment tools remains limited in China. Although BALs are recorded for 4–5.8% of hospitalised patients [15, 16, 17], there is no standardised, validated tool tailored to the Chinese population. Risk evaluation often relies heavily on clinician experience rather than objective criteria, which contributes to a persistently high false-positive rate. This issue is compounded by vague symptom descriptions, incomplete or fragmented medical records, and a lack of systematic follow-up for allergy re-evaluation. Existing international tools cannot be directly adopted by the Chinese healthcare system due to differences in language, symptom expression, phenotype, and healthcare infrastructure.
Furthermore, pharmacists in China are frequently underutilised in the allergy assessment process and are primarily responsible for dispensing medication rather than making active clinical decisions. However, given their expertise in taking medication histories and evaluating cross-reactivity, pharmacists are well-positioned to lead allergy risk assessments if they are supported by the appropriate tools and frameworks.
Therefore, there is an urgent need to develop a pharmacist-led, evidence-based, China-specific BAL risk assessment tool to address this critical issue. This would improve diagnostic accuracy, optimise the use of antimicrobials, and support national initiatives that promote the rational use of drugs and antimicrobial stewardship.
This study used a three-phase, mixed-methods approach to develop and trial a
pharmacist-led
Fig. 1.
Flowchart of the research process.
To identify commonly used domains, indicators, and scoring methodologies in
existing
This systematic review was conducted in accordance with Preferred Reporting
Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines. Four
electronic databases were searched: PubMed, Embase, Web of Science, and the
Cochrane Library. The search strategy included combinations of Medical Subject
Headings (MeSH) and keywords such as: ‘
The search covered articles published between January 2000 and July 2024. Additionally, the reference lists of relevant articles and guidelines were manually screened to identify further eligible studies. The search strategy is provided in Supplementary Table 1.
Studies were included if they met the following criteria: reported the
development, validation, or implementation of
Two independent reviewers screened the titles, abstracts, and full texts. Any discrepancies were resolved through discussion or by a third reviewer. A standardised data extraction form was used to collect the following information: (1) Study characteristics (year, country, setting, and population); (2) Tool type (score-based, algorithmic, or checklist); (3) Assessment domains (e.g., reaction type, timing, severity, treatment, and comorbidities); (4) Risk stratification methods (e.g., numerical score or tiered categories); (5) Validation metrics (e.g., sensitivity, specificity, and predictive values); and (6) Clinical implementation outcomes (e.g., antibiotic prescribing, delabelling, and infection rates).
The extracted indicators and domains were categorised and mapped using a narrative synthesis approach. Dimensions that recurred across multiple tools were identified and grouped thematically to form the preliminary framework for tool development in Phase 2.
The aim was to develop a structured, pharmacist-administered
A three-stage, grounded theory-inspired coding framework was applied to
publications, guidelines, and reviews on
Two trained researchers performed open coding independently, identifying meaningful semantic units related to allergy risk. These included symptom descriptors (e.g., urticaria, rash, wheezing, and anaphylaxis), temporal markers (e.g., immediate and within 1 hour), and exposure indicators (e.g., multiple doses and cross-reactivity). Diagnostic terms were also included (e.g., IgE, skin tests, and drug provocation test).
During axial coding, these concepts were categorised based on their shared clinical meaning. For instance:
• Symptom-related terms were categorised by organ system (e.g., skin/mucosa, respiratory, cardiovascular).
• Timing-related expressions were grouped under reaction latency and reaction duration.
• Exposure and outcome data were integrated into drug history, tolerance, and recurrence domains.
During selective coding, a draft framework comprising eight primary sections was constructed:
(1) Basic patient demographics, (2) Allergy history, (3) Suspected drug characteristics, (4) Reaction characteristics, (5) Management measures, (6) Clinical outcomes, (7) Diagnostic indicators, and (8) Data sources and patient concerns.
The tool’s structure emphasised terminology and question logic suited to Chinese clinical environments. These include specific local risk factors, such as the use of traditional Chinese medicine and hospital insurance classification, as well as symptom expressions that are common in Chinese medical documentation.
Theoretical saturation was reached when five consecutive articles yielded no new codes. Coding consistency was monitored through iterative consensus discussions.
To complement and validate the results of the literature-based coding analysis, a text co-occurrence analysis [18] was performed on the full texts of the 90 included articles. Term frequencies were calculated, and the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm was applied to identify important terms across the literature. A network analysis of co-occurrence with clinical descriptors such as ‘high-risk’, ‘severe’, ‘immediate’, and ‘contraindicated’ was then conducted using the NetworkX library in Python to extract keywords semantically related to severe or high-risk allergic events. These keywords were then mapped to the relevant domains of the assessment framework, such as reaction characteristics (symptom descriptors and temporal patterns), auxiliary diagnostics (immunological or laboratory markers), and drug history. This supported the data-driven construction of preliminary risk stratification logic, whereby the presence of certain semantic features could trigger a high-risk classification.
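The two text-mining steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the three-sentence corpus is hypothetical, the TF-IDF weighting is the plain log(N/df) variant, co-occurrence is counted at sentence level (the window used in the study is not stated), and plain counters stand in for a NetworkX graph so that the example needs only the standard library.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for the full texts of the included articles.
DOCS = [
    "Immediate urticaria and angioedema indicate a high-risk reaction.",
    "A severe reaction with anaphylaxis required epinephrine.",
    "Mild delayed rash without systemic features was tolerated on rechallenge.",
]
# Clinical descriptors named in the text as co-occurrence anchors.
ANCHORS = {"high-risk", "severe", "immediate", "contraindicated"}


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens (internal hyphens allowed)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def tf_idf(docs):
    """Return one {term: tf-idf} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def cooccurrence_with_anchors(docs, anchors):
    """Count terms that share a sentence with at least one anchor descriptor."""
    counts = Counter()
    for doc in docs:
        for sentence in re.split(r"[.!?]", doc):
            tokens = set(tokenize(sentence))
            if tokens & anchors:
                counts.update(tokens - anchors)
    return counts


if __name__ == "__main__":
    print(tf_idf(DOCS)[2]["rash"])
    print(cooccurrence_with_anchors(DOCS, ANCHORS).most_common(5))
```

Stop words are not removed in this sketch, so function words such as "a" and "with" appear among the co-occurring terms; a real analysis would filter them before ranking.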
A structured draft tool was created based on the coded domains and text-derived semantic indicators. Patients were classified as high risk if they met one or more of the following conditions:
- Presence of one or more of the 50 high-risk terms in historical or reported symptoms.
- Onset of symptoms within 1 hour of
- History of allergy to multiple
The tool embedded management recommendations for both high- and lower-risk groups.
- High-risk patients: Referral to an allergy specialist or use of
- Lower-risk patients: Considered eligible for
Each item in the tool was formatted as a closed or semi-structured question with standardised response options, making it suitable for both paper-based and electronic implementation. This structure enables pharmacists to use the tool in inpatient, outpatient, and emergency care settings.
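The "one or more conditions" rule described above could be encoded roughly as follows. This is a sketch under stated assumptions, not the tool's actual logic: the term list is an illustrative subset (the full list of 50 high-risk terms is not reproduced here), and the field names `onset_within_1_hour` and `multiple_drug_allergy` are hypothetical stand-ins for the second and third conditions.

```python
import re
from dataclasses import dataclass

# Illustrative subset only; the tool's full list of 50 high-risk terms is not shown here.
HIGH_RISK_TERMS = ("anaphylaxis", "DRESS", "epinephrine")

# Whole-word, case-insensitive match so that "dressing" does not trigger "DRESS".
_TERM_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in HIGH_RISK_TERMS) + r")\b",
    re.IGNORECASE,
)


@dataclass
class AllergyHistory:
    symptom_text: str            # free-text historical or reported symptoms
    onset_within_1_hour: bool    # hypothetical flag for the onset-time condition
    multiple_drug_allergy: bool  # hypothetical flag for the multiple-allergy condition


def classify(history: AllergyHistory) -> str:
    """Return 'high risk' if any single trigger is present, otherwise 'lower risk'."""
    has_term = _TERM_PATTERN.search(history.symptom_text) is not None
    if has_term or history.onset_within_1_hour or history.multiple_drug_allergy:
        return "high risk"
    return "lower risk"


if __name__ == "__main__":
    example = AllergyHistory("Rash two days after amoxicillin", False, False)
    print(classify(example))
```

Keyword matching of this kind does not handle negation ("no anaphylaxis"), which is one reason a structured, pharmacist-completed form is preferable to automatic flagging of free text.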
A single-centre, quasi-experimental pilot study was conducted at a tertiary general hospital in northern China. The tool was implemented across multiple internal medicine and surgical wards as part of routine, pharmacist-led medication reconciliation at the time of patient admission. A pre-post design was employed, supplemented by interrupted time series (ITS) analysis and propensity score matching (PSM) to evaluate changes in outcomes and control for confounding factors. The detailed steps of Phase 3 are illustrated in Fig. 2.
Fig. 2.
Research steps of phase 3.
Eligible participants were adult inpatients (aged
For patients classified as lower-risk and considered eligible for
The outcome evaluation focused on the feasibility of implementing the tool and
its preliminary clinical impact. Feasibility indicators included pharmacist
adherence, completion rate of the tool, and the time required for each
assessment. To assess clinical utility, several outcome variables were collected
and compared before and after implementation, including length of hospital stay,
total hospitalisation cost, utilisation rate of
The data were managed using Microsoft Excel 2021 (Microsoft Corp., Redmond, WA, USA) and analysed using IBM SPSS Statistics (version 25.0; IBM Corp., Armonk, NY, USA), R (version 4.4.0; R Foundation for Statistical Computing, Vienna, Austria), and Python (Spyder, Py3; Python Software Foundation, Wilmington, DE, USA). Descriptive statistics summarised patient characteristics and implementation metrics. Between-group comparisons were conducted using chi-square or Fisher’s exact tests for categorical variables and t-tests or Mann–Whitney U tests for continuous variables. PSM was employed to balance baseline differences between the pre- and post-intervention groups with respect to age, sex, comorbidities, and prior antibiotic use. Additionally, an ITS analysis was performed to evaluate changes in the level and trend of outcome indicators following the implementation of the tool. A two-tailed p-value of less than 0.05 was considered statistically significant.
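For readers unfamiliar with ITS, the level-and-trend model it usually refers to can be sketched as a segmented regression, y_t = b0 + b1·t + b2·post_t + b3·(t − t0)·post_t, where b2 is the change in level and b3 the change in slope after implementation. The code below is an assumption-laden illustration, not the authors' analysis: the series is synthetic, the fit is ordinary least squares without any adjustment for autocorrelation, and the helper names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_its(y, start):
    """Least-squares fit of the segmented model; returns [b0, b1, b2, b3]."""
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), post, (t - start) * post])
    p = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


if __name__ == "__main__":
    # Synthetic series: 6 pre-implementation and 6 post-implementation periods.
    series = [10 + 0.1 * t + (-2 - 0.05 * (t - 6) if t >= 6 else 0) for t in range(12)]
    print(fit_its(series, start=6))
```

On this noise-free series the fit recovers the generating coefficients (baseline 10, pre-trend 0.1, level change −2, slope change −0.05) up to floating-point error; with real data, standard errors and autocorrelation checks would be needed before interpreting them.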
A total of 4657 records were identified through systematic database searches. Following the removal of 1175 duplicates, 3482 titles and abstracts were screened. Of these, 3162 were excluded due to irrelevance or a lack of methodological clarity. The remaining 320 full-text articles were reviewed in detail.
Subsequently, a further 247 articles were excluded for the following reasons:
use of previously published tools without a methodological explanation (n = 56);
non-
An additional 17 eligible articles were identified by screening the reference lists of the included reviews and clinical practice guidelines. In total, 90 studies were included in the final synthesis. The selection process is illustrated in Fig. 3 (PRISMA flowchart).
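The counts reported in this selection process are internally consistent, which can be checked with a few lines of arithmetic. The sketch below simply replays the numbers stated in the text; the function and key names are illustrative.

```python
def prisma_flow():
    """Replay the study-selection counts reported in the text."""
    identified = 4657
    duplicates_removed = 1175
    screened = identified - duplicates_removed          # titles and abstracts
    excluded_at_screening = 3162
    full_text_reviewed = screened - excluded_at_screening
    excluded_at_full_text = 247
    added_from_reference_lists = 17
    included = full_text_reviewed - excluded_at_full_text + added_from_reference_lists
    return {
        "screened": screened,
        "full_text_reviewed": full_text_reviewed,
        "included": included,
    }


if __name__ == "__main__":
    print(prisma_flow())
```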
Fig. 3.
PRISMA flowchart. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses.
Supplementary Table 2 summarises the characteristics of the included studies. These studies were conducted in multiple countries (including the United States, Canada, China, and the United Kingdom) and employed various designs (such as observational studies, randomised controlled trials, and quality improvement studies). The study populations mainly consisted of adult and paediatric patients with penicillin allergy labels (PALs) and BALs. Sample sizes ranged from a few individuals to over a million, and risk stratification approaches included qualitative classification, screening questionnaires, decision algorithms, and scoring systems. These studies reflect the diverse practices involved in the global management of penicillin and
Fig. 4 shows the geographical distribution of the studies, and Fig. 5 presents their publication years, revealing geographical and temporal trends in this research area.
Fig. 4.
Country distribution of included studies.
Fig. 5.
Publication years of included studies.
A total of 1328 distinct open codes related to the assessment of risk of
During axial coding, these codes were clustered into multiple hierarchical levels. The final framework, which was constructed using selective coding, comprised 35 secondary-level categories and over 150 specific data elements. This formed the basis of a multidimensional assessment tool. Fig. 6 (tree diagram) visualises this structure, and Supplementary Table 3 presents representative concepts, citation frequencies, and document coverage per subcategory.
Fig. 6.
Tree diagram of the
The most frequently coded elements were skin and mucosal symptoms (e.g., urticaria, rash, and angioedema), which appeared in 74 studies with 446 citations; multisystem involvement (e.g., anaphylaxis, hypotension, and fever), which appeared in 75 studies with 201 citations; and respiratory features (e.g., wheezing and dyspnoea), which appeared in 46 studies with 128 citations. Temporal patterns (e.g., onset time after administration or symptom duration) were covered by 13–20 studies, depending on the indicator. Drug reuse/tolerance and reaction recurrence were cited in seven and two studies, respectively.
Inter-coder agreement analysis indicated a raw agreement rate of 89.7% and a Cohen’s kappa coefficient of 0.82, suggesting substantial reliability in the coding process. To confirm theoretical saturation, five additional studies were sequentially reviewed after code stabilisation. None produced new categories, which supports the sufficiency and comprehensiveness of the framework.
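Raw agreement and Cohen's kappa differ because kappa discounts the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected from each coder's label frequencies. A minimal sketch follows; the two label sequences are hypothetical and are not the study's coding data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Return (raw agreement, Cohen's kappa) for two equal-length label sequences."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of units")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: product of the two coders' marginal proportions per label.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return observed, (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical codes assigned to ten text segments by two coders.
    a = ["skin"] * 4 + ["resp"] * 3 + ["timing"] * 3
    b = ["skin"] * 3 + ["resp"] * 4 + ["timing"] * 3
    print(cohens_kappa(a, b))
```

In this toy example the coders disagree on one segment out of ten, so raw agreement is 0.90 while kappa is lower because a third of the agreement would be expected by chance.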
To validate and enhance the findings from the qualitative coding analysis, a
text mining analysis was conducted on the full text of the 90 included studies. A
total of 220 unique keywords that frequently co-occurred with “high-risk” or
“severe”
The top 20 high-frequency co-occurring terms, such as ‘anaphylaxis’, ‘urticaria’, ‘angioedema’, ‘systemic’, ‘immediate’, ‘eosinophilia’, ‘epinephrine’, ‘DRESS’, and ‘Stevens–Johnson syndrome’, were then quantitatively ranked based on normalised term frequency and visualised using a word cloud and a tree map (Figs. 7 and 8). The three most frequently occurring terms were anaphylaxis (n = 97, 6.06%), urticaria (n = 92, 5.75%), and angioedema (n = 84, 5.25%) (Table 1). The full version is shown in Supplementary Table 4.
Fig. 7.
Word cloud of text co-occurrence analysis for “high-risk” terms.
Fig. 8.
Tree map of top 20 co-occurring terms associated with “high-risk”.
| Keyword | Count | Percentage (%) |
| Anaphylaxis | 97 | 6.06% |
| Urticaria | 92 | 5.75% |
| Angioedema | 84 | 5.25% |
| Systemic | 67 | 4.18% |
| Immediate | 62 | 3.87% |
| Positive | 53 | 3.31% |
| Swelling | 51 | 3.19% |
| Eosinophilia | 49 | 3.06% |
| DRESS | 47 | 2.94% |
| Serum | 42 | 2.62% |
| Sickness | 41 | 2.56% |
| Hives | 40 | 2.50% |
| Johnson | 40 | 2.50% |
| Stevens | 40 | 2.50% |
| Epidermal | 38 | 2.37% |
| Necrolysis | 38 | 2.37% |
| Toxic | 38 | 2.37% |
| Wheezing | 33 | 2.06% |
| Shortness | 26 | 1.62% |
| Epinephrine | 24 | 1.50% |
Based on the structured coding framework and keyword frequency analysis, the
final
Among the 150+ extracted elements, three indicators were empirically selected as
primary stratification triggers: (1) High-risk term presence (e.g., anaphylaxis,
DRESS, epinephrine); (2) Immediate reaction onset (
Patients meeting any of these criteria were preliminarily classified as high
risk. Risk tiering was directly linked to tailored clinical recommendations
(e.g., specialist referral, desensitization, pharmacist-supervised use) and
informed the tool’s final logic (Risk Assessment Tool for
A total of 289 patients were included in the pilot phase, with 139 assessed before tool implementation and 150 afterwards. There were no statistically significant differences in baseline characteristics between the groups, confirming the effectiveness of propensity score matching. No differences were observed in sex (p = 0.67), age (p = 0.15), major diagnoses, or comorbidities such as diabetes (p = 0.47) and hypertension (p = 0.27) (Table 2).
| Variable | Pre-intervention group (n = 139) | Post-intervention group (n = 150) | Test statistic | p-value |
| Sex (Male) | 85 (61.2%) | 88 (58.7%) | 0.19 | 0.67 |
| Age (years) | 47.3 | 45.1 | 1.45 | 0.15 |
| Primary diagnosis | | | | |
| Pneumonia | 42 (30.2%) | 40 (26.7%) | 0.43 | 0.51 |
| Urinary tract infection | 30 (21.6%) | 32 (21.3%) | 0.00 | 0.96 |
| Oral infection | 14 (10.1%) | 18 (12.0%) | 0.27 | 0.60 |
| Osteomyelitis | 20 (14.4%) | 27 (18.0%) | 0.69 | 0.41 |
| Neurological infection | 17 (12.2%) | 20 (13.3%) | 0.08 | 0.78 |
| Others | 16 (11.5%) | 13 (8.7%) | 0.65 | 0.42 |
| Department | | | | |
| Department of Infectious Diseases | 56 (40.3%) | 70 (46.7%) | 1.19 | 0.28 |
| Department of Urology | 30 (21.6%) | 35 (23.3%) | 0.13 | 0.72 |
| Department of Stomatology | 14 (10.1%) | 18 (12.0%) | 0.27 | 0.60 |
| Department of Orthopedics | 22 (15.8%) | 22 (14.7%) | 0.08 | 0.78 |
| Department of Neurosurgery | 17 (12.2%) | 5 (3.3%) | 8.12 | |
| Type of allergic drugs | | | | |
| Penicillins | 30 (21.6%) | 32 (21.3%) | 0.00 | 0.96 |
| Cephalosporins | 15 (10.8%) | 17 (11.3%) | 0.02 | 0.88 |
| Comorbidities | | | | |
| Diabetes | 50 (36.0%) | 48 (32.0%) | 0.51 | 0.47 |
| Hypertension | 55 (39.6%) | 50 (33.3%) | 0.20 | 0.27 |
| Length of hospital stay (days) | 10.6 ± 5.5 | 8.5 ± 4.3 | 3.60 | |
| Hospitalization cost (CNY, Chinese Yuan) | 21,000 ± 7500 | 17,800 ± 6200 | 3.94 | |
| First-line antibiotic usage rate | 56 (40.3%) | 113 (75.3%) | 36.54 | |
| Second-line antibiotic usage rate | 83 (59.7%) | 37 (24.7%) | 35.80 | |
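The categorical comparisons in Table 2 are 2 × 2 chi-square tests of a count in each group. As a worked illustration, the sketch below recomputes the diabetes row (50 of 139 vs. 48 of 150). It assumes an uncorrected Pearson statistic, since the text does not say whether a continuity correction was applied, and uses the identity that the upper-tail probability of a chi-square variable with one degree of freedom equals erfc(√(χ²/2)).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Diabetes row: 50 of 139 pre-intervention vs. 48 of 150 post-intervention patients.
    stat, p = chi_square_2x2(50, 139 - 50, 48, 150 - 48)
    print(round(stat, 2), round(p, 3))
```

The statistic comes out at about 0.51 with a p-value of about 0.476, in line with the values listed for that row.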
Further feasibility indicators supported clinical integration. The tool achieved a completion rate of over 95%, with each assessment taking an average of 5–7 minutes. Pharmacists consistently adhered to assessment protocols at a rate above 90% throughout the study period, demonstrating the tool’s high acceptability and ease of integration into routine clinical workflows.
Of those assessed post-intervention, 18.7% were categorised as high risk based
on predefined semantic and clinical criteria. Patients without identified
high-risk features demonstrated significantly better clinical and economic
outcomes. Specifically, their average length of hospital stay decreased from 10.6
Fig. 9.
ITS segmented regression plot. ITS, interrupted time series.
Fig. 10.
Residuals of the segmented regression model.
This study developed and piloted a multidimensional, pharmacist-led
One of the most notable findings was the significant improvement in
antimicrobial stewardship outcomes following its implementation. Among patients
without high-risk allergy features, the tool enabled the safe reintroduction of
In this study, 18.7% of patients were classified as high risk using the tool.
This is broadly consistent with previous reports, which have ranged from 10% to
20% [25, 26, 27, 28, 29]. However, it is still higher than the true prevalence of
Future research should aim to validate the tool across multiple hospital levels and regions, and to explore real-time integration into hospital information systems. Further enhancement of natural language processing functions may also enable automatic flagging of high-risk features in narrative documentation. Longitudinal monitoring of allergy-related outcomes and resistance trends would be essential to assess broader public health impacts.
In summary, this study presents a locally adapted, semantically grounded
Despite the promising results, this study has several limitations.
First, the pilot validation was conducted in a single tertiary hospital, with a relatively small sample size and short follow-up period, which may limit the generalisability of the findings.
Second, although the tool’s stratification logic was consistent with keyword analysis and expert consensus, it has not yet been validated through objective diagnostic testing such as skin tests, specific IgE assays, or drug provocation tests. This represents an important limitation. Future research should design prospective validation studies using allergist-confirmed diagnoses as the gold standard to systematically evaluate the tool’s predictive performance.
Third, some of the tool’s inputs relied on retrospective recall or manual extraction from medical records, which may introduce recall or documentation bias [26].
Finally, the tool has not yet been tested across multiple hospital levels or regions, and real-time integration into hospital information systems remains to be explored.
This study developed and preliminarily validated a pharmacist-led,
multidimensional
Pilot implementation demonstrated the tool’s feasibility and clinical utility,
supporting more accurate identification of high-risk patients, reducing
unnecessary avoidance of
The datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request.
XZ: Supervision, provision of resources, and critical revision of the manuscript; contributed to study design and interpretation of data. XY: Conceptualization, methodology, project administration, and writing—original draft; participated in study design and data interpretation. NS: Data collection, investigation, revision of the manuscript and validation; contributed to analysis and interpretation of results. WC: Formal analysis, visualization, and writing—review & editing; contributed to interpretation of data and preparation of figures and tables. All authors have read and approved the final manuscript, contributed sufficiently to the work, and agree to be accountable for all aspects of the work in accordance with ICMJE authorship criteria.
The study was reviewed and approved by the Ethics Committee of the First Hospital of Shanxi Medical University (Ethics No. KYLL-2025-293) and was carried out in accordance with the guidelines of the Declaration of Helsinki. Written informed consent was obtained from all participants prior to data collection.
The authors wish to thank the clinical pharmacists, physicians, and nursing staff at the First Hospital of Shanxi Medical University for their collaboration and support during the pilot study.
This research received no external funding.
The authors declare no conflict of interest.
During the preparation of this work, the authors used ChatGPT-3.5 to check spelling and grammar. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Supplementary material associated with this article can be found, in the online version, at https://doi.org/10.31083/Pharmazie51744.
References
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.










