
Assessment of α-Synuclein Pathology: A Study of the BrainNet Europe Consortium

Irina Alafuzoff MD, PhD, Laura Parkkinen PhD, Safa Al-Sarraj FRCPath, Thomas Arzberger MD, Jeanne Bell MD, FRCPath, Istvan Bodi FRCPath, Nenad Bogdanovic MD, PhD, Herbert Budka MD, Isidro Ferrer MD, Ellen Gelpi MD, Stephen Gentleman PhD, Giorgio Giaccone MD, Wouter Kamphorst MD, PhD, Andrew King FRCPath, Penelope Korkolopoulou MD, PhD, Gábor G. Kovács MD, PhD, Sergey Larionov MD, David Meyronet MD, Camelia Monoranu MD, Jodie Morris PhD, Piero Parchi MD, PhD, Efstratios Patsouris MD, PhD, Wolfgang Roggendorf MD, Danielle Seilhean MD, PhD, Nathalie Streichenberger MD, Dietmar R. Thal MD, Hans Kretzschmar MD
DOI: http://dx.doi.org/10.1097/nen.0b013e3181633526 125-143 First published online: 1 February 2008

Abstract

To determine the reliability of assessment of α-synuclein-immunoreactive (αS-IR) structures by neuropathologists, 28 evaluators from 17 centers of BrainNet Europe examined current methods and reproducibility of αS-IR evaluation using a tissue microarray (TMA) technique. Tissue microarray blocks were constructed of samples from the participating centers that contained αS-IR structures. Slides from these blocks were stained in each center and assessed for neuronal perikaryal inclusions, neurites, and glial cytoplasmic inclusions. The study was performed in 2 phases. First, the TMA slides were stained with the antibody of the center's choice. In this phase, 59% of the sections were of good or acceptable quality, and 4 of 9 antibodies used performed consistently. Differences in interpretation and categorization of αS-IR structures, however, led to differing results between the laboratories. Prior to the second phase, the neuropathologists participated in a training session on the evaluation of αS-IR structures. Based on the results of the first phase, selected antibodies using designated antigen retrieval methods were then applied to TMA slides in the second phase. When the designated methods of both staining and evaluation were applied, all 26 subsequently stained TMA sections evaluated were of good/acceptable quality, and a high level of concordance in the assessment of the presence or absence of specific αS-IR structures was achieved. A semiquantitative assessment of αS-IR neuronal perikaryal inclusions yielded agreements ranging from 49% to 82%, with best concordance in cortical core samples. These results suggest that rigorous methodology and dichotomized assessment (i.e. determining the presence or absence of αS-IR) should be applied, and that semiquantitative assessment can be recommended only for the cortical samples. Moreover, the study demonstrates that there are limitations in the scoring of αS-IR structures.

Key Words
  • α-Synuclein
  • BrainNet Europe
  • Interrater reliability
  • Immunohistochemistry
  • Tissue microarray

Introduction

The intracytoplasmic aggregation of α-synuclein (αS) is a common denominator found in a group of neurodegenerative disorders currently known as synucleinopathies, including Parkinson disease, dementia with Lewy bodies (DLB), and multiple system atrophy (MSA). In these synucleinopathies, abnormal aggregates of αS are present as neuronal perikaryal inclusions (NPIs), neurites, or glial cytoplasmic inclusions (GCIs) (1). Thus, current consensus guidelines of Parkinson disease, DLB, and MSA recommend the use of αS immunohistochemistry (IHC) (2-4) and emphasize the importance of a semiquantitative rating of immunoreactive (IR) structures. However, it seems that quantification might be greatly dependent on the methodology and, specifically, on the antibodies (Abs) used, as was recently reported by Croisier et al (5). Surprisingly, there are no large-scale interlaboratory studies that have assessed the influence of Abs and the tissue section pretreatments on the semiquantitative rating or dichotomized assessment, that is, the determination of the presence or absence of αS-IR structures.

This BrainNet Europe (BNE) interlaboratory study was designed to evaluate the reliability of assessment of αS-IR neuronal and glial inclusions by experienced neuropathologists. This included an assessment of current practices followed by each center, a microscope teaching exercise designed to reach a good concordance with respect to categorization of αS-IR structures, and, finally, another assessment of results when methodologic aspects were standardized. It is crucial for neuropathologic evaluations to be comparable between centers, and the optimization and harmonization of methodologies are fundamental responsibilities of the BNE consortium. To ensure that the same tissue samples were analyzed by all evaluators, we used the tissue microarray (TMA) technique (6, 7), as described previously (8). This report summarizes the results when close to 30 evaluators from 17 centers assessed the αS-IR structures in the TMA core samples.

Materials and Methods

Construction of the TMA Blocks

The flow chart in Figure 1 delineates the study design. Each participating BNE center provided routinely processed, paraffin-embedded blocks of cortex, pigmented nuclei of brainstem, amygdala, and striatum with αS-IR inclusions to the coordinating center. Sixty blocks from 40 cases (26 with Lewy body disease and 14 with MSA) were obtained for the construction of the TMA blocks. The demographics of the subjects and the characteristics of the tissue samples, including postmortem delay, fixation time, fixative type, embedding medium, and storage time, are given in Table 1. The age at death of the subjects (15 women/25 men) ranged from 50 to 89 years (mean, 71 years); the postmortem delay varied from 5 hours to 6 days (mean, 31 hours); and the fixation time with various fixatives ranged from 1 week to more than 4 years (mean, 29 weeks). The maximum temperature of the embedding medium ranged from 54°C to 65°C, and the storage duration of the blocks ranged from 1 month to 12 years. Two core samples were taken from most of the brain regions listed above (a single sample was taken from 4 regions). Four TMA blocks were constructed: blocks A, B, and C together contained 128 core samples, and block D contained 43 core samples (Fig. 2A, B).

FIGURE 1.

The flowchart delineates the logistics of this study. Work performed by the 28 evaluators from 17 participating BNE centers and by the coordinating center is represented. αS, α-Synuclein; BNE, BrainNet Europe; TMA, tissue microarray.

FIGURE 2.

The flowchart summarizes the construction of the tissue microarray (TMA) blocks and the given assessment instructions. (A) Paraffin sections from tissue were first cut and stained with routine hematoxylin and eosin, and regions of interest were marked carefully on these slides (circles). The hematoxylin and eosin-stained sections overlaid on the surface of the donor blocks then guided sampling from the morphologically representative areas. The TMA blocks were constructed by taking core tissue samples 2 mm in diameter at different locations of the donor block and inserting them into the recipient paraffin blocks A to D. (B) The final number of core samples to be analyzed in each block after the lost or damaged cores were excluded. (C) Summary of the instructions for the analyses and rating of the αS-positive structures in the first part of the study. (D) The categorization of the αS-positive structures after the training session, and the assessment sheet for the second phase of this trial is given. αS, α-Synuclein; BFB, basal forebrain; CG, cingulate gyrus; DMV, dorsal motor nucleus of vagus; ECx/CA2, entorhinal cortex/the CA2 region of the hippocampus; FCx, frontal cortex; LC, locus coeruleus; SN, substantia nigra; STR, striatum.

TABLE 1.

The core biopsies were taken using a Manual Tissue Arrayer 1 instrument (Beecher Instruments, Sun Prairie, WI). A 2-mm-diameter needle was used to obtain a representative sample, and serial 7-μm-thick sections were placed on commercial SuperFrost Plus microscope slides (6, 7). To determine the uniformity of the sections shipped to different laboratories, 4 sections (every 10th) were stained by the coordinating center using monoclonal Ab to rat synuclein 1, clone 42 (Transduction Laboratories, Lexington, KY) at a dilution of 1:1000.

BNE Participant Efforts With Current IHC Practice

In the first phase, each participating center received sections cut from the TMA blocks A to C that were stained using their own choice of anti-αS Ab with no recommendations regarding antigen retrieval methods. The shipment included data sheets for recording of assessments and a recent publication with figures of different αS-IR structures (9). The Abs, dilutions, and pretreatment methods used by the participating centers are summarized in Table 2. The 28 participating neuropathologists were asked to assess each αS-IR structure within the core sample. They were instructed to use 100× magnification when counting, and they were requested to count the number of somal NPIs (up to 25) in each core sample. The participants evaluated αS-positive neurites as follows: absent (0); some, but needing to be sought out (1); moderate, when readily seen (2); and numerous (3). Glial cytoplasmic inclusions were rated as absent (0); present occasionally, that is, needing to be sought out (1); present moderately, approximately 20 to 30 (2); or present extensively (3) (Fig. 2C). All assessment sheets and the stained slides were shipped back to the coordinating center.

TABLE 2.
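
The phase 1 rating instructions can be written down as a small record type. The following is only an illustrative sketch: the class name `CoreAssessment`, its field names, and the validation behaviour are assumptions introduced here and are not part of the original data sheets; only the scale values (0 to 3) and the cap of 25 NPIs come from the instructions above.

```python
from dataclasses import dataclass

# Category labels paraphrased from the phase 1 instructions.
NEURITE_SCALE = {
    0: "absent",
    1: "some, need to be sought out",
    2: "moderate, readily seen",
    3: "numerous",
}
GCI_SCALE = {
    0: "absent",
    1: "occasional, need to be sought out",
    2: "moderate, approximately 20 to 30",
    3: "extensive",
}
MAX_NPI_COUNT = 25  # evaluators counted somal NPIs up to 25 per core


@dataclass(frozen=True)
class CoreAssessment:
    """One evaluator's phase 1 record for one TMA core (hypothetical layout)."""

    block: str      # TMA block, "A", "B" or "C" in phase 1
    core: str       # core label within the block, e.g. "1E"
    npi_count: int  # number of somal NPIs, 0..25
    neurites: int   # neurite score, 0..3
    gcis: int       # GCI score, 0..3

    def __post_init__(self):
        # Reject values that the written instructions do not allow.
        if not 0 <= self.npi_count <= MAX_NPI_COUNT:
            raise ValueError("NPI count must lie between 0 and 25")
        if self.neurites not in NEURITE_SCALE:
            raise ValueError("neurite score must be 0, 1, 2 or 3")
        if self.gcis not in GCI_SCALE:
            raise ValueError("GCI score must be 0, 1, 2 or 3")
```

A record such as `CoreAssessment("C", "1E", npi_count=12, neurites=2, gcis=0)` is accepted, whereas an NPI count of 26 raises a `ValueError`.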

Training Session

To standardize the assessment prior to the second phase of this trial, a training session was held around a multiheaded microscope attended by 25 participants. Several either poorly or well-stained sections that had been produced by the participants were examined. Many examples of labeled structures were inspected and discussed. The participants emphasized the need for detailed and specific guidelines for both the categorization and scoring of the IR structures. Therefore, new instructions were designed to be unambiguous, simple, and easy to follow.

BNE Participant Efforts With Designated IHC Methodology

In the second phase of the study, the BNE participants received 1 section (block D) to be stained with 1 of 4 Abs that had been selected on the basis of the results obtained from the first phase. The guidelines included a protocol on Ab dilutions and pretreatments (Table 3), written instructions for assessment based on the group discussions during the training session, and representative photos of the lesions (Fig. 3). As shown in Figure 2D, various patterns of immunoreactivity were to be assessed and recorded into the assessment sheets. The details of lesions to be assessed were also given.

TABLE 3.
FIGURE 3.

Immunohistochemical staining, Novocastra α-synuclein (αS) monoclonal antibody, clone KM51, dilution 1:1000. In cortex: (A) no staining and (B) synaptic staining. In substantia nigra: neurons with (C) αS-immunoreactive (IR) aggregates (arrows), (D) αS-IR neurites/neuropil threads (arrowheads), and (E) αS-IR macrophages (open arrow); (F) punctate cytoplasmic labeling; (G) punctate cytoplasmic labeling with ovoid inclusions; (H) multiple rounded inclusions; and (I) extracellular Lewy body-like inclusion.

Data Analysis at the Coordinating Center

All stained sections received were reassessed by 2 evaluators at the coordinating center (Fig. 2C). The sections were first evaluated with respect to loss and damage of cores and quality of staining. They were scored on a scale incorporating both staining intensity (good, acceptable, or poor) and background staining. The slides were then reassessed following specific criteria. Only IR structures fulfilling these criteria were included.

To simplify the comparison between primary and reassessments, the counts of NPIs were reassigned on a semiquantitative assessment scale (Fig. 2C). During the reassessment, it became evident that it was not possible to distinguish unequivocally a cross-sectioned neurite from a GCI when the nucleus of the glial cell was not clearly visible. Therefore, these 2 categories were modified (Fig. 2C). For statistical analysis, a mean value was obtained from the semiquantitative scores of all included cores in each of the following brain regions: cortex, pigmented nuclei of brainstem, amygdala, and striatum. By contrast, in the second phase in which there were stringent instructions, the results were reported for each individual core, rather than as means of IR structures for different neuroanatomical regions. This method was chosen to assess the uniformity of the semiquantitative assessment because there were variable numbers of IR structures in different cores.
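
The regional summary used in the first phase (a mean of the semiquantitative scores of all included cores per brain region) can be sketched as follows. This is a minimal illustration, not the procedure run at the coordinating center: the function name `region_means` and the input layout are assumptions, and the standard error is computed here as the sample standard deviation divided by the square root of the number of cores, which the text does not specify.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def region_means(core_scores):
    """Summarise semiquantitative scores per brain region.

    core_scores: iterable of (region, score) pairs, one pair per included core.
    Returns {region: (mean, standard_error)}.
    """
    by_region = defaultdict(list)
    for region, score in core_scores:
        by_region[region].append(score)

    summary = {}
    for region, scores in by_region.items():
        # Assumed definition: SE = sample SD / sqrt(n); 0.0 for a single core.
        se = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else 0.0
        summary[region] = (mean(scores), se)
    return summary
```

In the second phase, by contrast, scores were reported core by core, so no such regional averaging would be applied.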

Statistical Analysis and Photomicrography

For statistical analyses, the SPSS program for Windows (SPSS, Inc., Chicago, IL) was used. The statistical differences in the distributions of semiquantitative assessment of IR structures were estimated by the nonparametric Kruskal-Wallis (between primary assessments and between reassessments) and Wilcoxon tests (between primary and reassessment). The Spearman correlation test was used to assess the rank-based (monotonic) relationship of core values between the primary and reassessment. The value of absolute agreement (%) was calculated, that is, the proportion of core samples assessed equivalently in the primary and reassessments. Digital images were taken using a Leica DM4000 B microscope equipped with a Leica DFC 320 digital camera (Leica Microsystems Wetzlar, Ltd., Heerbrugg, Germany).
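
Two of the measures above, absolute agreement and the Spearman correlation, are simple enough to illustrate directly. The sketch below is not the SPSS procedure used in the study; the function names are assumptions, ties are handled by average ranks, and the Kruskal-Wallis and Wilcoxon tests are deliberately left out.

```python
from math import sqrt


def absolute_agreement(primary, reassessment):
    """Percentage of cores given the same score in both assessments."""
    if len(primary) != len(reassessment) or not primary:
        raise ValueError("need two non-empty score lists of equal length")
    matches = sum(p == r for p, r in zip(primary, reassessment))
    return 100.0 * matches / len(primary)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / sqrt(vx * vy)
```

For example, if 5 cores are scored 0, 1, 2, 3, 2 in the primary assessment and 0, 1, 2, 2, 2 in the reassessment, 4 of the 5 scores match and the absolute agreement is 80%.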

Results

Comparability of Consecutive Sections

Uniformity of the consecutive sections shipped to the participants was verified by the coordinating center. The mean value of the semiquantitative scores of αS-IR NPIs in TMA blocks A to C was calculated for the whole section (4 sections), and these values did not differ significantly (Kruskal-Wallis; p > 0.4). In TMA block D, each core was assessed with respect to αS-IR NPIs, neurites, and GCIs (4 sections). Their dichotomous assessment values (i.e. present or absent) did not differ significantly. Furthermore, regarding the semiquantitative assessment of GCIs, a complete agreement was achieved in all cores, whereas the agreement was 95% for neurites and 76% for NPIs.

Loss or Damage of Core Samples

The estimated total loss of core samples was 6%. Only those cores in which 75% of the tissue remained in more than half of the evaluations were included in the reassessment analysis. Some cores were excluded when they were not representative of a specific area; therefore, in TMA section A, 29 of 37 cores remained available; in section B, 17 of 39 cores; in section C, 27 of 52 cores; and in section D, 34 of 43 cores remained available (Fig. 2B).
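
The tissue-based inclusion rule can be expressed as a short check. The sketch assumes that "75% of the tissue remained" means at least 75%, and that the fraction of remaining tissue was noted per evaluated slide; the function name and the input format are hypothetical. Exclusion of cores that were not representative of a specific area is a separate judgment and is not modelled here.

```python
def core_is_included(tissue_fractions, min_fraction=0.75):
    """Decide whether a core enters the reassessment analysis.

    tissue_fractions: fraction (0.0 to 1.0) of the core's tissue remaining
    on each evaluated slide, one value per evaluation.
    The core is kept when more than half of the evaluations show at least
    min_fraction of the tissue.
    """
    if not tissue_fractions:
        return False
    adequate = sum(f >= min_fraction for f in tissue_fractions)
    return adequate > len(tissue_fractions) / 2
```

Under this reading, a core with fractions 0.9, 0.8, and 0.5 is included (2 of 3 evaluations adequate), whereas a core with 0.9 and 0.5 is not (exactly half, not more than half).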

Quality of Staining

The coordinating center received 28 evaluations from 17 BNE centers in the first phase and 26 evaluations from 15 BNE centers in the second phase. In the first phase, the centers used 9 different Abs with a variety of pretreatments (Table 2), and the labeling and staining intensity of IR structures varied extensively in these. In all slides (A + B + C = 3 × 28 = 84), the staining quality was judged to be good in 29%, acceptable in 30%, and poor in 41% of the slides. The best staining results were obtained with Novocastra-KM51, Transduction-Syn 42, Alexis-15G7, and Signet-4D6 Abs (Table 2); consequently, these were chosen for the second phase. The Transduction-Syn42 stain required section pretreatment with formic acid, whereas the Novocastra-KM51 and Alexis-15G7 Abs required both heat pretreatment and formic acid. In addition to pretreatment, the specific clone of the Ab and the mode of staining (manual vs automatic) seemed to influence the staining quality (Fig. 4). Furthermore, the quality of staining was found to vary within a single section, that is, various cores stained differently in the same section. This was possibly attributable to many factors such as fixation time, type of fixative, and other factors related to the characteristics of the tissue material (Table 1, Fig. 5). When using the same Ab, core samples (i.e. 1E, 5I) fixed up to 3 months were almost equally stained regardless of the pretreatment strategy, whereas core samples (i.e. 2B, 5B, 6G) fixed over 6 months were virtually unstained with 1 of the used pretreatments. The synaptophysin-like gray matter staining varied from strong to nonexistent, and, occasionally, nonspecific staining was seen irrespective of the type of tissue, gray or white matter. Other occasionally encountered labeling that can be mistaken for αS-IR inclusions included the staining of corpora amylacea (Fig. 6A), Nissl substance, and/or lipofuscin/lipopigment (Fig. 6B) and, in some cases, evidence of starlike glial αS-IR (Fig. 6C).

FIGURE 4.

The effect of different clones of the same antibody (Ab) and modes of staining on the quality of staining in 200× magnification (scale bars = 20 μm). (A, B) The same core sample of substantia nigra (tissue microarray [TMA] block B, core 1A) stained by 2 centers using different clones of the same monoclonal Ab (A) 4D6 (B) 4B12 produced by Signet using human α-synuclein (αS) as immunogen. Note the difference in visualization of neuronal inclusions that are clearly demarcated in (A) but only faintly visible and difficult to discern from the neuromelanin in (B) (arrows). (C, D) The same core sample of pons (TMA block B, core 5F) stained with Alexis 15G7 Ab by 2 centers using different modes of stainings, (C) manual and (D) automatic. The αS-positive oligodendroglial inclusions were rated to be present extensively (C), whereas only a few weakly positive structures were detected (D).

FIGURE 5.

The effect of pretreatment and characteristics of the tissue (i.e. fixation time) on the quality of staining. Magnification: 200×; scale bars = 20 μm. Five core samples from the section of tissue microarray (TMA) block C (cores 1E, 2B, 5B, 5I, 6G) were stained by the same Signet 4D6 antibody (same clone) by 2 different centers (for pretreatments, see Table 2, assessment codes 13 and 14). Note the variability in labeling of the α-synuclein-positive structures in the same core sample fixed over 6 months (C vs D, E vs F, and I vs J).

FIGURE 6.

Staining of corpora amylacea (A), Nissl substance, and/or lipofuscin/lipopigment (B) and starlike glia (C). Magnification: 400×; scale bars = 10 μm.

In the second round, all 26 sections were assessed to be good/acceptable in quality, and, in general, all 4 of the selected Abs performed similarly. Eight sections each were stained with Novocastra-KM51 and Transduction-Syn 42, whereas 5 evaluations each were performed with Alexis-15G7 and Signet-4D6 (Table 3).

Results in the First Phase: Current IHC Practices

Results are given when only evaluations of the sections of good/acceptable staining quality are included (Tables 4-8).

TABLE 4.
TABLE 5.
TABLE 6.
TABLE 7.
TABLE 8.

Cortex

Table 4 shows that the semiquantitative scores differed significantly in the primary assessments (NPIs ranged from 0.4 ± 0.1 to 2.2 ± 0.3 and neurites/GCIs from 0 to 1.8 ± 0.2; Kruskal-Wallis; p < 0.000), whereas the variation between reassessment results of NPIs was nonsignificant (Kruskal-Wallis; p > 0.05). Moreover, the primary and reassessment mean values differed significantly (Wilcoxon; p < 0.05) in 56% of evaluations for the NPIs and in 46% of evaluations for neurites/GCIs. An "excellent" correlation (Spearman r, 0.8-1.0) was found in 63% of evaluations for NPIs and in 22% of evaluations for neurites/GCIs. Furthermore, the absolute agreement between the primary and reassessments ranged from 52% to 86% for NPIs and from 19% to 95% for neurites/GCIs.

Pigmented Nuclei

The mean values (±SE) for the semiquantitative scores differed significantly among the primary assessments (NPI range, 0.8 ± 0.2-2.6 ± 0.5; neurites/GCIs, 0.9 ± 0.2-2.6 ± 0.5; Kruskal-Wallis; p < 0.000) (Table 5), whereas the variation between the reassessment results of both NPIs and neurites/GCIs was nonsignificant (Kruskal-Wallis; p > 0.05). The primary and reassessment mean values differed significantly (Wilcoxon; p < 0.05) in 56% of evaluations for both NPIs and neurites/GCIs. An excellent correlation (Spearman r, 0.8-1.0) was found in 53% of evaluations for NPIs and in 33% of evaluations for neurites/GCIs. Furthermore, the absolute agreement between the primary and reassessments ranged from 14% to 87% for NPIs and from 20% to 65% for neurites/GCIs.

Amygdala

The mean values (±SE) for the semiquantitative scores differed significantly between the primary (NPI range, 1.2 ± 0.3-3.9 ± 0.4; neurites/GCIs, 1.0 ± 0.3-3.2 ± 0.3; Kruskal-Wallis; p < 0.000/p < 0.001) and reassessment (NPI range, 0.4 ± 0.2-2.9 ± 0.6; neurites/GCIs, 0.9 ± 0.3-2.2 ± 0.2; Kruskal-Wallis; both p < 0.000) (Table 6). The primary and reassessment mean values differed significantly (Wilcoxon; p < 0.05) in 53% of evaluations for NPIs and in 36% of evaluations for neurites/GCIs. An excellent correlation (Spearman r, 0.8-1.0) was found in only 41% of evaluations for NPIs but in 65% of evaluations for neurites/GCIs. Furthermore, the absolute agreement between the primary and reassessments ranged from 20% to 79% for NPIs and from 13% to 73% for neurites/GCIs.

Striatum

The mean values (± SE) for the semiquantitative scores differed significantly among the primary assessments of both types of labeled structures (NPI range, 0-4.3 ± 0.4; neurites/GCIs, 1.4 ± 0.4-2.8 ± 0.5; Kruskal-Wallis; p < 0.000), whereas the variation between reassessment results of both NPIs and neurites/GCIs was nonsignificant (Kruskal-Wallis; p > 0.05) (Table 7). The primary and reassessment mean values differed significantly (Wilcoxon; p < 0.05) in 57% of evaluations for NPIs and in 25% of evaluations for neurites/GCIs. None of the evaluations of NPIs exhibited an excellent correlation (Spearman r, 0.8-1.0), whereas this was found in 56% of evaluations for neurites/GCIs. Furthermore, the absolute agreement between the primary and reassessments ranged from 0% to 100% for NPIs and from 42% to 75% for neurites/GCIs.

Agreement in the Scoring of αS-Positive Structures on a Single Core Level

Table 8 shows variation in the agreement of scoring of IR structures within a core sample. The absolute agreement levels indicate the proportion of cores evaluated with the same value in both primary and reassessments. With respect to αS-positive NPIs in the entorhinal cortex, considerable variation was detected between the primary assessments. The range of scores extended over a few stages in most cores, and in 15% of cores, all possible ratings were given. Absolute agreement in scoring of entorhinal core samples between the primary and reassessments ranged extensively from 18% to 95%, and it was excellent (≥80%) in only 4 of the 20 cortical core samples.

Some variation was also detected in the assessment of neurites in the substantia nigra between the primary assessments, and all possible ratings from none to numerous were given in 2 of the 17 nigral core samples (12%). The proportion of matching scores between the primary and reassessments varied significantly from 27% to 83%, and absolute agreement was excellent (≥80%) in only 2 of the 17 nigral core samples.

In the assessment of αS-IR GCIs in the striatum, a significant variation was detected between the primary assessments such that in 4 of the 12 striatal core samples, the rater assessment ranged from none to extensive. Absolute agreement ranged extensively from 13% to 81%, being excellent (≥80%) in only 3 of the 12 striatal core samples.

Training Session

During the exercise around the multiheaded microscope, several issues were raised, including 1) the difficulty of strictly specifying whether a rounded labeled structure without a notable nucleus represented a cross section of a large neurite or was an intracellular inclusion, and 2) whether a tiny dot-like structure represented a GCI or a cross section of smaller threads. Therefore, prior to proceeding, there was an agreement on the categorization of αS-IR structures to be followed in the second phase (Figs. 2D, 3). Furthermore, other possible pitfalls were discussed, including IR corpora amylacea, Nissl substance, lipofuscin pigment, and glial processes (Fig. 6).

Results in the Second Phase With Standardized IHC Methodology

Table 9 lists the results of the dichotomous and semiquantitative assessments using the 4 selected Abs. There was a good agreement between the dichotomous assessments in which only the presence or absence of αS-IR was determined. This was particularly good among the 8 evaluations that had used the Transduction-Syn42 Ab to stain αS-IR. When the Alexis-15G7 Ab was used in 3 cores and with Signet-4D6 in 2 cores, negative results were reported by some evaluators. The most variable results were obtained using the Novocastra-KM51 Ab. In contrast, the agreement between semiquantitative assessments of αS-IR (NPIs + neurites + GCIs) was poor, and this was not independent of the Ab used. With Signet-4D6, in 71% of core samples, at least 2 different scores were given. The range of scores extended over a few stages also with Alexis-15G7 (74%), Novocastra-KM51 (79%), and Transduction-Syn42 (82%).

TABLE 9.
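
The two kinds of summary reported for the second phase, dichotomous presence or absence and the spread of semiquantitative scores per core, can be illustrated as follows. This is a sketch under stated assumptions: the function names and the `{core: [scores]}` layout are hypothetical, and a score greater than 0 is taken to mean "present".

```python
def dichotomize(score):
    """Reduce a semiquantitative score to present (True) or absent (False)."""
    return score > 0


def share_with_multiple_scores(scores_by_core):
    """Percentage of cores for which evaluators gave at least 2 different scores.

    scores_by_core: {core_label: [score from each evaluator]}
    """
    if not scores_by_core:
        raise ValueError("no cores supplied")
    multiple = sum(len(set(scores)) >= 2 for scores in scores_by_core.values())
    return 100.0 * multiple / len(scores_by_core)


def unanimous_presence(scores_by_core):
    """Core labels for which every evaluator reported the structure as present."""
    return [
        core
        for core, scores in scores_by_core.items()
        if all(dichotomize(s) for s in scores)
    ]
```

With made-up scores such as `{"1A": [1, 2, 3], "1B": [2, 2, 2], "2A": [0, 1, 1], "2B": [0, 0, 0]}`, half of the cores receive at least 2 different scores, while presence is unanimous in cores 1A and 1B. The example shows how dichotomous agreement can be complete in a core (1A) whose semiquantitative scores still disagree.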

Assessment of αS-Labeled Morphologic Entities

Irrespective of which Ab had been used (Tables 10, 11), a similar trend in dichotomous and semiquantitative assessments was found. Both NPIs and neurites were rather easily detected and dichotomously assessed in all brain regions except the striatum. Neuronal perikaryal inclusions were reported to be present by all evaluators in 12 of the 24 core samples representing cortex, substantia nigra, or amygdala (Table 10). In these 12 cores, the absolute agreement of dichotomized assessment of NPIs between the primary and reassessments also reached 100%. In the remaining 12 cores, some false-negative results were obtained mostly in the cortex. Overall, the absolute agreement between the primary and reassessment regarding the dichotomous assessment of NPIs was excellent (≥80%) in most (20/24) cores. Additionally, αS-IR neurites and threads were reported as being homogenously present in 11 of 24 cores, where absolute agreement also reached 100%. In general, absolute agreement of neurites/threads was excellent (≥80%) with the exception of 1 core.

TABLE 10.
TABLE 11.

In contrast, the semiquantitative assessment of both NPIs and neurites achieved poorer agreement within the primary assessments and between the primary and reassessments (absolute agreement values). With respect to NPIs, all possible ratings from zero to greater than 20 inclusions were given in 17%, 3 possible ratings in 50%, and 2 different ratings in 29% of cores. In the cortical core samples, agreement regarding the extent of NPIs ranged from 69% to 100% (mean, 82%), and in only 1 core was agreement absolute. Agreement was even poorer for the primary semiquantitative assessment of neurites, where all possible ratings were given for 21%, 3 possible ratings for 63%, and 2 different scores for 17% of cores. None of the cores achieved a complete agreement. Absolute agreement for NPIs was excellent (≥80%) in most cortical core samples (7/8), whereas in substantia nigra in only 1 of 10 core samples and in none of 6 core samples in amygdala was an excellent agreement achieved. The proportion of matching scores between the primary and reassessment for neurites and threads was even poorer, achieving excellent values (≥80%) in only 2 of 24 cores.

In the striatum, GCIs were easily identified dichotomously, and 100% agreement between primary and reassessments was achieved in most (8/10) core samples (Table 11). Nonetheless, the semiquantitative assessment of GCIs was again poorer, so that 2 or more ratings were given for most core samples. The absolute agreement between the primary and reassessment of semiquantitative assessments of GCIs reached excellent agreement (≥80%) in only 3 of 10 core samples. Neuronal perikaryal inclusions and neurites were more challenging to identify in this region, and in none of the cores did the absolute agreement of semiquantitative assessments reach excellent values.

Discussion

This interlaboratory study is the first of a series devoted to problems encountered in the methodology and interpretation of αS-labeled structures. The study was designed to evaluate both the quality of immunostainings and the level of agreement in assessments. These are important issues because the current guidelines for the assessment of αS-IR structures emphasize both regional distribution (3) and semiquantitative grading of αS-IR structures (4). To address the variability in estimates that might be caused by the topography (10), we used a 2-mm-punch TMA technique (6). Battifora (11) predicted in 1986 that the TMA technique would be of value in large-scale quality control studies. The success of this technique has recently been confirmed in our large interlaboratory assessment of Alzheimer disease-related pathology (8). Furthermore, the advantage of using TMA as a tool in interlaboratory quality control trials has been discussed by Mengel et al (12), who evaluated the proliferation activity in tumor cells by using Ki-67 IHC in a multicenter study. One significant benefit of using the TMA method is that all of the participating observers evaluated the same regions with only minor, but not significant, variations due to the sectioning process. The latter factor was controlled by staining a series of sections from all 4 TMA blocks, and the results did not significantly differ.

To our surprise, when participating laboratories followed their own current methodologic practices in visualizing αS-IR structures, the variations were remarkable; some Abs and specific clones were clearly better than the others. This is consistent with the recent study by Croisier et al (5), who examined the influence of Abs in labeling structures of diagnostic relevance. The diverse labeling of IR structures reported by Croisier et al when the sections were stained in 1 laboratory was clearly exaggerated by the multiplicity of laboratories in the present study. We found that the same Ab yielded both good and poor results, indicating that the key factor for the quality of staining was not only the Ab but also the antigen retrieval method. For example, with Transduction-Syn42, the lack of formic acid pretreatment abolished the staining almost completely. In line with this, Pletnikova et al (13) showed that with this same Ab, formic acid pretreatment greatly enhanced the immunoreactivity. Overall, the poorest quality (and, therefore, the lowest counts of αS-IR structures) was obtained with all Abs without pretreatment. Thus, the standardized use of formic acid and heat pretreatments to enhance the αS immunoreactivity is a prerequisite for obtaining good results with most current commercial Abs (14, 15). In addition to pretreatment, the characteristics of the tissue (i.e. fixation time and postmortem delay) seemed to influence the staining results significantly. It was noted, for example, that when samples were fixed for more than 6 months, staining was abolished with certain pretreatment strategies. This indicates that the IHC labeling of αS in fixed brain tissue is more complex than is currently appreciated, and further assessment of the influence of tissue characteristics on the quality of αS labeling is urgently required.

Even when only good and acceptable stainings were included, the assessments of the 15 to 17 BNE participants in the first phase differed significantly for all types of IR structures in each anatomic region. This is partly explained by diverse categorization of the labeled structures. In the first phase, the raters were asked to interpret the stained structures and assess them according to criteria from the current literature without receiving any additional detailed descriptions. The variability among observers of these assessments of αS-IR structures was clearly evident during the joint evaluation of stained sections performed around a multiheaded microscope.

Interestingly, in almost all regions, the variability in the reassessments diminished in the second phase. This is explained by the stringent evaluation criteria applied by the reassessor and, most importantly, by the strict criteria for the categorization of IR structures. Both from slide to slide and from staining to staining, the interpretations of IR structures were more uniform. However, there were still significant differences in the reassessment of neurites/GCIs in the cortex and problems in the reassessment of all IR structures in the amygdala. The differences found in assessing IR structures might partly be explained by the various Abs used (5). In approximately half of the good/acceptable stainings, when current methodologic practices were applied, we found a significant difference between the primary assessment and the reassessment. Although the correlations between assessments were relatively good in most examined areas, there were significant differences between the semiquantitative assessments. Thus, even if the 2 raters (primary and reassessor) detected αS-IR structures in the same cores, they tended to score them differently. The BNE centers occasionally assessed the NPIs more generously than the reassessor in all of the examined brain regions. Hence, the question arises as to why different evaluators give discordant ratings for the same IR structures. As discussed above, this is partly due to variations in the interpretation and categorization of the labeled structures. In addition, in many instances, the heavy staining of background in a synaptophysin-like manner made it very difficult to discern labeled structures. When this occurred, the counting or rating of labeled structures became arbitrary and not reproducible.

In contrast, in the second phase, when designated methodologic instructions were applied both regarding the choice of Ab and particularly the antigen retrieval method, concordant and reliable αS IHC staining results were achieved by the different laboratories. The Abs used here are not the only good Abs available, but each product requires thorough testing, and, moreover, each staining batch should include a good positive control. In addition, adherence to continuous quality control is advisable.

Regrettably, our results in the second phase following designated methodologic instructions indicated that even when the staining quality was optimal, there were limitations in what could be expected of a semiquantitative assessment of the αS-IR structures. In contrast, concordance between raters was found to be good with a dichotomous assessment. In particular, when disregarding the type of IR structures and only assessing the overall presence or absence of αS-IR (NPI + neurite + GCI), a very high level of agreement was reached by the 26 evaluators. The study by Müller et al (16) also addressed interrater variability by undertaking a dichotomized assessment of αS-IR, and they found good interrater reliability when the 5 raters evaluated the same set of stained sections. When our participants assessed 26 stained sections, they achieved a high agreement level not only overall but also when the raters were asked to identify different types of IR structures (NPIs, neurites, and GCIs). In the recently revised consensus guidelines of DLB (5), in addition to regional distribution and dichotomized assessment, significant emphasis is assigned to the semiquantitative assessment of αS-IR structures. As shown here, this kind of quantitative classification is prone to problems. All raters with expertise and practice reached full agreement in assessing NPIs in only 1 core sample of 24. Interestingly, however, when assessing NPIs in 8 cortical samples, the agreement in semiquantitative assessments was quite high (mean, 82%) when compared with other brain regions. Semiquantitative assessment of neurites never yielded full agreement because 2 or more possible ratings were given in all core samples. Based on the poor agreement in the semiquantitative assessment of both NPIs and neurites, we believe that a high concordance following the current consensus criteria of DLB might be difficult to achieve in an interlaboratory setting.
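The contrast between dichotomized and semiquantitative agreement can be illustrated with a minimal sketch. It assumes a 0 to 3 semiquantitative scale and defines agreement as the share of raters who gave the most common rating for a core; both are illustrative assumptions rather than the statistical method used in this study, and the scores shown are hypothetical, not study data.

```python
from collections import Counter


def modal_agreement(ratings):
    """Fraction of raters who gave the most common rating for one core.

    This definition of agreement is an assumption made for illustration.
    """
    counts = Counter(ratings)
    return counts.most_common(1)[0][1] / len(ratings)


def dichotomize(ratings):
    """Collapse semiquantitative scores to absent (0) or present (1)."""
    return [1 if r > 0 else 0 for r in ratings]


# Hypothetical scores from 6 raters for a single core (assumed 0-3 scale).
scores = [1, 2, 2, 3, 1, 2]

semiquantitative = modal_agreement(scores)          # 3 of 6 raters chose "2"
dichotomous = modal_agreement(dichotomize(scores))  # all 6 raters say "present"

print(f"semiquantitative agreement: {semiquantitative:.0%}")
print(f"dichotomous agreement: {dichotomous:.0%}")
```

In this hypothetical core every rater detects αS-IR structures, so the dichotomized agreement is complete, whereas the graded scores split across 3 categories.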

Immunohistochemistry has been used since the 1980s as a diagnostic tool, although it is known to be capricious. This is well illustrated in our study, where the results depended on the characteristics of the tissue, staining quality, and the expertise of the evaluator. The first phase of our trial clearly highlights the urgent need to acknowledge all pitfalls regarding αS IHC, that is, those related to characteristics of the tissue, methodology, and categorization of the αS-IR structures. The second phase of our trial indicates that even if training and standardizing methodology are prerequisites to achieving reliable results between different laboratories, there are still limitations that need to be acknowledged regarding the scoring of labeled structures.

Acknowledgments

The authors thank Helen C. Cairns, Frances Carnie, Lynne Christian, Louisa Djerbib, Helga Flicker, Tarja Kauppinen, Maria Kemerli, M. Kooreman, Tanja Treutlein, A. Van den Berg, J. Wouda and all other laboratory technicians of BNE members for their skillful technical assistance; Vesa Kiviniemi for his assistance in statistics; Ewen MacDonald for critical reading of the article; and Dr Kirstin Goldring, manager of the UK Parkinson's Disease Society Tissue Bank.

The study has been authorized by the Ethics Committee of Kuopio University Hospital.

Footnotes

  • Supported by Grant No. FP6: BNEII No LSHM-CT-2004-503039 from the European Union.

  • This article reflects only the authors' views, and the Community is not liable for any use that may be made of the information contained therein.

References

  1. 1.
  2. 2.
  3. 3.
  4. 4.
  5. 5.
  6. 6.
  7. 7.
  8. 8.
  9. 9.
  10. 10.
  11. 11.
  12. 12.
  13. 13.
  14. 14.
  15. 15.
  16. 16.