ONLINE ONLY
Table I. Kendall's W for interobserver agreement on cervical stages at the initial and 3-week times
Columns: Kendall's W; Statistical level of agreement

Table II. Cervical stage differences for interobserver disagreements for the individual lateral cephalograms at the initial and 3-week times
Cervical stages apart    Initial time point    3-week time point
3                        5.84%                 6.33%
4                        0.58%                 1.35%
5 0%50%
Kendall's W was used to assess interobserver agreement at both observations; it is a preferred statistical method over the Pearson correlation coefficient because of the use of ordinal data (ranking/staging) in this type of study.24 Additionally, we used the weighted kappa statistic for ordinal data to assess intraobserver agreement. These statistical analyses not only assess agreement, but also weigh disagreement and provide an accurate assessment of the reproducibility of the CVM method.

We randomly selected lateral cephalograms from subjects in the circumpubertal age range. This sample provided a unique insight into the varied ranges of CVM scores assigned to these subjects. In up to 10% of the subjects, no variation was observed, and all observers staged these cephalograms identically. Apparently, some cephalograms can be staged easily, but, in most other cases, staging of the cervical vertebrae by trained observers is clearly much more difficult. Several recent studies used small sample sizes that were reduced from large populations, calling into question the randomness of the sample.36,39,40 In addition, these authors apparently also failed to test their findings on a separate, larger, and random sample population.

We also tried to determine whether a longitudinal, side-by-side comparison of 2 lateral cephalograms of the same subject would improve interobserver and intraobserver agreement. This CVM observer agreement design has not been tested in previous studies. We found interobserver agreement values of 0.76 and 0.74 (P < 0.0001) (Kendall's W) for the initial and 3-week time points, respectively. This is only a slight improvement over the interobserver agreement results for the individual cephalograms: 0.74 and 0.72 (P < 0.0001) for the initial and 3-week time points, respectively. Additionally, the level of intraobserver agreement ranged from 46.7% to 73.3% in staging pairs of lateral cephalograms 3 weeks apart. This compares with intraobserver agreement of 43.3% to 80% for the individual lateral cephalograms. In total, 5 of the 10 observers had a lower percentage of agreement for a pair of longitudinal lateral cephalograms of the same patient. When comparing the weighted kappa coefficients, 7 of the 10 observers had improved kappa coefficient values for the pairs of lateral cephalograms compared with the individual ones. These 7 observers on average improved the weighted kappa coefficient by 0.11, but, despite the improvement, the level of agreement in all cases remained well below the levels reported in the literature. Longitudinal comparisons of lateral cephalograms are important in orthodontics when making clinical diagnoses and treatment-planning decisions. It can be concluded from this study that CVM reproducibility results do not improve significantly when clinicians are offered a longitudinal comparison of cervical vertebrae morphology.

Table III. Intraobserver agreement percentages between the initial and 3-week times for individual and pairs of cephalograms

Table V. Cervical stage differences for intraobserver disagreements for individual lateral cephalograms
Columns: Cervical stages apart; Total disagreements

CONCLUSIONS
1. Interobserver agreement for CVM staging in our sample of practicing orthodontists was below 50%. Agreement improved marginally with the use of 2 longitudinal radiographs.
2. On average, clinicians agreed with their own staging only 62% of the time, 3 weeks later.
3. Staging for a select sample of subjects is much more consistent than for a larger random sample.
4. When our clinicians disagreed on the stage of a particular subject, there was a wide range (2 or more stages) of disagreement in approximately 1 in 4 cases.
5. Reproducibility for our trained clinicians was significantly below the level claimed in the literature. Based on these results, we cannot recommend the CVM method as a strict clinical guideline for the timing of orthodontic treatments.
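The headline percentages in the conclusions can be re-derived from the counts reported in the Results. The short Python check below is only a sketch of that arithmetic. The paper does not say how the 1350 interobserver observations were obtained; reading it as every pair of the 10 observers compared on each of the 30 cephalograms is an assumption that happens to match the reported total.

```python
from math import comb

OBSERVERS = 10      # practicing orthodontists (from the Methods)
CEPHALOGRAMS = 30   # individual lateral cephalograms (from the Methods)

# Assumption: one interobserver observation = one pair of observers on one image.
pairs_per_image = comb(OBSERVERS, 2)
interobserver_total = pairs_per_image * CEPHALOGRAMS

# Assumption: one intraobserver observation = one observer restaging one image.
intraobserver_total = OBSERVERS * CEPHALOGRAMS


def percent(part: int, whole: int) -> int:
    """Whole-number percentage, as the paper reports it."""
    return round(100 * part / whole)


# Counts taken from the Results section.
initial_agreement = percent(607, interobserver_total)
three_week_agreement = percent(665, interobserver_total)
intraobserver_agreement = percent(187, intraobserver_total)

print(interobserver_total, intraobserver_total)
print(initial_agreement, three_week_agreement, intraobserver_agreement)
```

Under these assumptions both interobserver figures fall below 50% and the intraobserver figure rounds to 62%, in line with conclusions 1 and 2.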
REFERENCES
1. Ricketts RM. Planning treatment on the basis of the facial pattern and an estimate of its growth. Angle Orthod 1957;27:14-37.
2. Bjork A. Variations in the growth pattern of the human mandible: longitudinal radiographic study by the implant method. J Dent Res 1963;42:400-11.
23. Mitani H, Sato K. Comparison of mandibular growth with other variables during puberty. Am J Orthod Dentofacial Orthop 1992;62:217-22.
24. Hassel B, Farman A. Skeletal maturation evaluation using cervical vertebrae. Am J Orthod Dentofacial Orthop 1995;107:58-66.
Introduction: The cervical vertebrae maturation (CVM) method has been advocated as a predictor of peak mandibular growth. This method relies on the clinician's ability to determine the stage of maturation of the vertebrae. Careful examination of reports of this technique shows methodologic flaws that can lead to inflated levels of reproducibility. The purpose of this study was to evaluate the reproducibility of CVM stage determination by using a more stringent methodology. Methods: Ten practicing orthodontists, trained in the CVM method, evaluated 30 individual and 30 pairs of cephalometric radiographs in 2 sessions to determine the CVM stage. Interobserver and intraobserver reliability was determined by using the Kendall coefficient of concordance and the weighted kappa statistic. Results: All degrees of interobserver and intraobserver agreement were moderate (Kendall's W, 0.4-0.8). Interobserver agreement levels for CVM staging of the 10 orthodontists at both times were below 50%. Agreement improved marginally with the use of 2 longitudinal radiographs. Intraobserver agreement was only slightly better; on average, clinicians agreed with their own staging only 62% of the time. Conclusions: Based on these results, we cannot recommend the CVM method as a strict clinical guideline for the timing of orthodontic treatment. (Am J Orthod Dentofacial Orthop 2009;136:478.e1-478.e7)
Achieving excellence in adolescent orthodontic treatment is predicated on proper management of growth for patients with skeletal jaw discrepancies. The ability to predict future growth would greatly aid diagnosis and treatment for these patients. Better therapeutic decisions could be made regarding timing of treatment, appliance selection, extraction patterns, retention, and possible need for surgery. With this information, therapy could truly be tailored to each patient for achieving optimal results in a shorter time.

Many authors have proposed techniques to predict patterns of facial growth based on cephalometric radiographs.1-11 Some, such as Ricketts1 and Skieller et al,9 produced prediction methods that were thought to apply to any growing patient. However, others3,7,8,12,13 concluded that there is minimal validity to prediction methods, and some completely rejected the concept of facial growth prediction.14-16

Investigations to determine the timing of maximum facial growth have also shown conflicting results. Most have studied the relationship between statural and facial growth, and the ability to predict when a growth spurt will occur. As summarized by Moyers and Enlow,17 somatic and craniofacial growth are generally related, but the relationship is difficult to use as a precise, practical prediction of facial dimensional change. Their claims have been supported by Bjork,2,5 Bambha and Van Natta,18 Bishara et al,19 Jamison et al,20 and Lewis et al.21

More recently, a method to assess skeletal maturation and a possible relationship to facial growth has used the morphology of the cervical vertebrae. Lamparski22 created a set of standards for cervical vertebrae maturation (CVM) and correlated this to the hand-wrist radiograph. He reported that his set of standards was as accurate an indicator of skeletal maturity as the hand-wrist method. Subsequently, other authors have reported, to differing extents, the relationship between CVM and skeletal maturity as assessed on hand-wrist radiographs,23-33 but others reported a correlation between CVM and mandibular growth.34-40

a Major, United States Air Force, Okinawa, Japan.
b Professor, Department of Orthodontics, College of Dentistry, University of Iowa, Iowa City.
c Associate research scientist, Department of Preventive and Community Dentistry, College of Dentistry, University of Iowa, Iowa City.
d Visiting associate professor, Department of Orthodontics, College of Dentistry, University of Iowa, Iowa City; private practice, Chicago, Ill.
e Associate professor, Department of Anthropology, College of Liberal Arts and Sciences, University of Iowa, Iowa City.
f Professor and head, Department of Orthodontics, College of Dentistry, University of Iowa, Iowa City.
The authors report no commercial, proprietary, or financial interest in the products or companies described in this article.
Reprint requests to: Karin A. Southard, Department of Orthodontics, College of Dentistry, S-221 Dental Science Bldg, Iowa City, IA 52242; e-mail, karin-southard@uiowa.edu.
Submitted, May 2007; revised and accepted, July 2007.
0889-5406/$36.00
Copyright © 2009 by the American Association of Orthodontists.
doi:10.1016/j.ajodo.2007.08.028
478.e2 Gabriel et al American Journal of Orthodontics and Dentofacial Orthopedics
October 2009
After careful examination of the studies of the CVM methods, we had questions regarding the specific methodology. Many authors reported interobserver and intraobserver reproducibility of the CVM method.24,26,27,30,32,33,36 In all, the cited interobserver and intraobserver reproducibility exceeds 90%, with the exception of Kucukkeles et al,26 when 2 of 3 intraobserver tests of reproducibility were 45% and 65%. Most of those who cited high reproducibility results, however, used tracings of the cervical vertebrae instead of the actual radiograph during the CVM staging process. This method, which involves observers tracing lateral cephalograms and then having other observers stage these same tracings, can introduce bias in the staging results and thus affect the reproducibility outcomes. In other words, we are all aware that a certain latitude must be granted to a person in tracing cephalometric radiographs since this is not an exact science. The persons tracing the radiographs in CVM studies could easily, although unintentionally, influence the later staging of those radiographs. Additionally, most observers performing the reproducibility tests are the authors themselves. It is possible that the authors' "research-level" understanding of the CVM method could overstate the reproducibility results.

Other potential problems with CVM study methodology include small sample sizes, some of which appear to be significantly reduced from larger samples so that the overall randomness of the sample is questionable, the same sample in several studies, and the failure to test study outcomes on separate, larger, and random samples. Finally, some authors determined values of reproducibility using a correlation coefficient that is a measure of association between 2 variables. However, a more stringent measure of association, especially for use with ordinal data, is recommended for measuring agreement between judges.41-43

The CVM method has been advocated as a tool for timing orthopedic treatment. Baccetti et al40 provided CVM clinical guidelines for the treatment of malocclusions in all 3 dimensions. These guidelines require strict and accurate identification of the CVM stage to be clinically applicable. If accurate and reproducible at the clinical level, these guidelines would give clinicians a valuable and reliable growth assessment of their patients from a routine radiograph: the lateral cephalogram. However, before clinical use of the CVM method can be advocated, its accuracy and reproducibility should be assessed by using methods that eliminate the methodologic shortcomings of previous studies. The purpose of this study was to evaluate the reproducibility of CVM stage determination while avoiding these flaws.

MATERIAL AND METHODS
Our sample was randomly selected from longitudinal growth records of untreated subjects. Thirty individual lateral cephalograms and 30 pairs of lateral cephalograms of good quality with complete visualization of cervical vertebrae 1 through 4 from 15 white male and 15 white female subjects were selected. The pairs of cephalograms were taken from the same subject within 2 years of each other. The lateral cephalograms were scanned at 600 dpi for presentation as high-resolution images in TIF format to maintain the original radiographic quality.

Ten private practice orthodontists with an average of 19.2 years of clinical experience were chosen as observers. They did not participate in the design or construction of this project. Each observer was trained in the CVM method according to Baccetti et al40 using exact figures and legends (Fig 1). The observers received a hard-copy handout of the reference material in Figure 1 and a high-resolution image presentation containing 30 individual and 30 longitudinal pairs of lateral cephalograms. The lateral cephalograms were cropped to include cervical vertebrae C1 to C4 and to eliminate any additional information such as stage of dentition that might bias the observer about the cervical stage (CS) (Fig 2). The observers were given instructions to first stage the 30 individual lateral cephalograms and then the 30 pairs of lateral cephalograms using the reference materials provided. The pairs of radiographs (A and B) were displayed side by side so that a longitudinal comparison of cervical vertebrae morphology could be made. For the pairs of lateral cephalograms, the observers were told that film A was taken first and film B of the same subject was taken within 2 years of film A. The observers were told that the pairs could be at the same or a different CS.

Three weeks after the first observation, the 10 observers were retrained in the CVM method. They received the identical materials as the initial observation, except that the 30 individual and the 30 pairs of lateral cephalograms were in a random order from the original presentation. The observers were given the same instructions to first stage the 30 individual lateral cephalograms and then the 30 pairs.

Fig 1. Definitions of cervical vertebrae morphology and definitions of CVM stages (CS 1-6): trapezoid, the superior border is tapered from posterior to anterior; rectangular horizontal, the heights of the posterior and anterior borders are equal, and the superior and inferior borders are longer than the anterior and posterior borders; square, the posterior, superior, anterior, and inferior borders are equal; rectangular vertical, the posterior and anterior borders are longer than the superior and inferior borders. CS1, the lower borders of all 3 vertebrae (C2-C4) are flat, and the bodies of C3 and C4 are trapezoid shaped; CS2, concavity at the lower border of C2 in 80% of cases, and the bodies of both C3 and C4 are trapezoid shaped; CS3, concavities at the lower borders of both C2 and C3, and the bodies of C3 and C4 are either trapezoid or rectangular horizontal in shape; CS4, concavities at the lower borders of C2, C3, and C4, and the bodies of C3 and C4 are rectangular horizontal in shape; CS5, concavities at the lower borders of C2, C3, and C4, and at least 1 of the bodies of C3 and C4 is square, and, if not square, the body of the other cervical vertebra still is rectangular horizontal; and CS6, concavities at the lower borders of C2, C3, and C4, and at least 1 of the bodies of C3 and C4 is rectangular vertical, and, if not rectangular vertical, the body of the other cervical vertebra is square (reprinted from Baccetti et al40 published in Semin Orthod 11, The cervical vertebral maturation method for the assessment of optimal treatment timing in dentofacial orthopedics, 119-29, Copyright Elsevier 2005.).

Statistical analysis
The statistical analysis was conducted separately for the individual and pairs of lateral cephalograms. Analysis of the individual ones involved answering the following questions. What was the agreement between the observers at the initial time and 3 weeks later (interobserver agreement)? When the scoring was repeated for each observer, what was the agreement between the 2 times for each observer (intraobserver agreement)? Was there a wide or narrow range of CSs between subjects; that is, were some subjects harder to stage than others? When the observers differed (both interobserver and intraobserver), what was the spread in terms of number of CSs?

Analysis of the longitudinal pairs of lateral cephalograms involved answering the following questions. What was the agreement between the observers at the 2 times (interobserver agreement)? When the scoring was repeated for each observer, what was the agreement between the 2 times for each observer (intraobserver agreement)?

The Kendall coefficient of concordance (Kendall's W) was used to assess interobserver agreement at the 2 times for the individual and the pairs of lateral cephalograms. Kendall's W varies between 0, for no agreement, and 1, for maximal agreement. Kendall's W values of 0.4 to 0.8 are generally considered to indicate moderate agreement. The weighted kappa statistic for ordinal data was computed to evaluate the level of staging agreement between the 2 time points for each observer. This was computed for the individual and the pairs of lateral cephalograms. The closer the kappa value is to 1, the greater the agreement between the 2 ratings. Kappa coefficients of 0.4 to 0.8 are generally considered to indicate moderate agreement. All tests were evaluated with a 0.05 level of statistical significance.

RESULTS
There were 1350 interobserver observations for the 30 individual lateral cephalograms at both time points. There were 607 (45%) agreements and 743 (55%) disagreements at the initial time, and 665 (49%) agreements and 685 (51%) disagreements at the 3-week point (Fig 3).

Table I provides the results of interobserver agreement for the 10 observers at the 2 times for both individual and pairs of lateral cephalograms by using Kendall's W. In all cases, interobserver agreement was moderate. For the individual radiographs, Kendall's W values were 0.74 and 0.72 for the initial and 3-week times, respectively. When the observers were given a set of longitudinal radiographs, Kendall's W scores improved minimally.
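To make the interobserver statistic concrete, here is a minimal Python sketch of Kendall's W with the usual correction for tied ranks (ties are unavoidable when 30 subjects are assigned to only 6 stages). It is an illustration under stated assumptions, not the authors' analysis: the function names and the small `hypothetical_stages` data set are invented for the example, and the paper does not say which software or tie handling was used.

```python
def average_ranks(values):
    """Return 1-based ranks for `values`; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's coefficient of concordance with tie correction.

    ratings[r][s] is the stage observer r assigned to subject s.
    Returns a value between 0 (no agreement) and 1 (maximal agreement).
    """
    m = len(ratings)        # number of observers
    n = len(ratings[0])     # number of subjects
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[s] for row in rank_rows) for s in range(n)]
    mean_total = m * (n + 1) / 2
    s_stat = sum((t - mean_total) ** 2 for t in totals)

    # Tie correction: sum of (t^3 - t) over each observer's groups of tied values.
    tie_term = 0
    for row in ratings:
        counts = {}
        for value in row:
            counts[value] = counts.get(value, 0) + 1
        tie_term += sum(c ** 3 - c for c in counts.values())

    denominator = m ** 2 * (n ** 3 - n) - m * tie_term
    if denominator == 0:
        raise ValueError("W is undefined when every observer gives every subject the same stage")
    return 12 * s_stat / denominator


# Hypothetical data: 3 observers staging 5 subjects (not from the study).
hypothetical_stages = [
    [1, 2, 3, 4, 5],
    [1, 3, 3, 4, 6],
    [2, 2, 4, 4, 5],
]
print(round(kendalls_w(hypothetical_stages), 2))
```

Because W is computed on ranks, observers who order the subjects similarly score well even when they rarely choose the identical stage, which is one way a W in the 0.7 range can coexist with exact agreement below 50%.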
The interobserver disagreements for the individual lateral cephalograms were analyzed to find the number of stages apart for each disagreement. Table II shows the percentages of total disagreements that each CS difference represents (ie, 1 stage apart, 2 stages apart, and so on). Although most disagreements were 1 stage apart, a significant percentage were 2 or more stages apart, and this discrepancy between observers worsened at the 3-week point, indicating greater disagreement then. For a few subjects (up to 10%), all observers agreed.

There were 300 intraobserver observations for the 30 individual lateral cephalograms, with 187 agreements and 113 disagreements. In other words, an orthodontist can be expected to agree with his or her own CVM staging, on average, only 62% of the time. Intraobserver agreement percentages for the 10 observers between the 2 time points for both individual and pairs of lateral cephalograms are shown in Table III.

Intraobserver agreement for the 10 observers measured by the weighted kappa coefficient between the 2 time points for both individual and pairs of lateral cephalograms is shown in Table IV. All values for the kappa coefficient are below 0.8 (0.4-0.8 is moderate agreement). The weighted kappa coefficient not only measures the level of agreement, but also factors in the level of disagreement. This explains why 2 observers (3 and 6), who had a decrease in agreement percentage between individual and pairs of lateral cephalograms, showed an increase in the kappa coefficient. Both observers

Fig 2. Example of cropped lateral cephalogram including cervical vertebrae C2 to C4.

DISCUSSION
The fact that there were more disagreements than agreements between observers and that an orthodontist agreed with himself or herself only 62% of the time underscores the principal finding of this study: the CVM method is too variable to be used as a strict clinical guideline for assessing skeletal maturity and timing of orthopedic treatment. Additionally, there was little improvement in the reproducibility of the CVM method when orthodontists were given a clinically applicable, side-by-side comparison of 2 longitudinal lateral cephalograms. Finally, the level of disagreement between orthodontists (the number of stages apart) indicates some randomness to staging lateral cephalograms with the CVM method.

Recently, several studies reported high levels of accuracy and reproducibility with the CVM method, but most had methodologic flaws that call into question the accuracy of their results. Additionally, no studies have examined the reproducibility of the CVM method by trained private-practice clinicians.

We designed this study to address some of these methodologic issues. First, we avoided bias by having the observers stage the lateral cephalogram images directly, not from tracings of cervical vertebrae as in previous studies.24-30,36 Second, we had a panel of 10 private-practice orthodontists with no knowledge of the design of the study and who received standardized training in the CVM method with exact reference materials according to Baccetti et al.40 In several previous studies, the authors themselves performed the interobserver and intraobserver tests of reproducibility.24,27,30,32,33,36 In 1 case, the authors who participated in the test of agreement actually developed the CVM criteria.24 In all of these studies, authors who serve as observers have a "research-level" understanding of the CVM method, and, because of this, reproducibility results might be overstated.

Kendall's W was used to assess interobserver agreement at both observations; it is a preferred statistical
American Journal of Orthodontics and Dentofacial Orthopedics Gabriel et al 478.e5
Volume 136, Number 4
[Figure: bar chart of the frequency of occurrence of agreements and disagreements at the initial and 3-week time points.]
Table I. Kendall’s W for interobserver agreement on cervical stages at the initial and 3-week times

                                   Kendall’s W   Statistical level of agreement
Individual lateral cephalograms
  Initial                          0.74          Moderate
  3 week                           0.72          Moderate
Pairs of lateral cephalograms
  Initial                          0.76          Moderate
  3 week                           0.74          Moderate

Table II. Cervical stage differences for interobserver disagreements for the individual lateral cephalograms at the initial and 3-week times

Cervical stages apart   Total disagreements (initial time point)   Total disagreements (3-week time point)
1                       73.89%                                     68.32%
2                       18.43%                                     25.26%
3                       6.33%                                      5.84%
4                       1.35%                                      0.58%
5                       0%                                         0%

method over the Pearson correlation coefficient because of the use of ordinal data (ranking/staging) in this type of study.24 Furthermore, we used the weighted kappa statistic for ordinal data to assess intraobserver agreement. These statistical analyses not only assess agreement, but also weigh disagreement and provide an accurate assessment of the reproducibility of the CVM method.

We randomly selected lateral cephalograms from subjects in the circumpubertal age range. This sample provided a unique look at the varied ranges of CVM scores assigned to these subjects. In up to 10% of the subjects, no variance was noted, and all observers staged these cephalograms identically. Apparently, a few cephalograms are easily staged, but, in most other cases, staging the cervical vertebrae by trained observers is clearly much more difficult. Several recent studies used small sample sizes that were reduced from large populations, which calls into question the randomness of the sample.36,39,40 Furthermore, these authors also apparently failed to test their findings on a separate, larger, and random sample population.

We also sought to determine whether a longitudinal, side-by-side comparison of 2 lateral cephalograms from the same subject would improve interobserver and intraobserver agreement. This CVM observer agreement design has not been tested in previous studies. We found interobserver agreement values of 0.76 and 0.74 (P < 0.0001) (Kendall’s W) for the initial and 3-week time points, respectively. This is only a slight improvement from the interobserver agreement results for the individual cephalograms: 0.74 and 0.72 (P < 0.0001) for the initial and 3-week time points, respectively. Furthermore, the level of intraobserver agreement varied from 46.7% to 73.3% when staging pairs of lateral cephalograms 3 weeks apart. This compares with the 43.3% to 80% intraobserver agreement for the individual lateral cephalograms. Overall, 5 of
478.e6 Gabriel et al American Journal of Orthodontics and Dentofacial Orthopedics
October 2009
the 10 observers had a lower agreement percentage for a pair of longitudinal lateral cephalograms from the same patient. In comparing weighted kappa coefficients, 7 of the 10 observers had improved kappa coefficient values for the pairs of lateral cephalograms compared with the individual ones. These 7 observers on average improved the weighted kappa coefficient by 0.11, but, despite the improvement, the level of agreement in all cases remained well below the reported levels in the literature. Longitudinal comparisons of lateral cephalograms are important in orthodontics when making clinical diagnoses and treatment-planning decisions. It can be concluded from this study that CVM reproducibility results do not improve significantly when clinicians are offered a longitudinal comparison of cervical vertebrae morphology.

Table III. Intraobserver agreement percentages between the initial and 3-week times for individual and pairs of cephalograms

Observer   Agreement percentage for            Agreement percentage for
           individual lateral cephalograms     pairs of lateral cephalograms
1          56.7% (17/30)                       68.3% (41/60)
2          73.3% (22/30)                       61.7% (37/60)
3          80.0% (24/30)                       68.3% (41/60)
4          66.7% (20/30)                       65.0% (39/60)
5          43.3% (13/30)                       53.3% (32/60)
6          63.3% (19/30)                       60.0% (36/60)
7          63.3% (19/30)                       73.3% (44/60)
8          70.0% (21/30)                       71.7% (43/60)
9          43.3% (13/30)                       46.7% (28/60)
10         63.3% (19/30)                       51.7% (31/60)

Table IV. Weighted kappa coefficients for intraobserver agreement between the initial and 3-week times for individual and pairs of cephalograms

Observer   Weighted kappa coefficient for      Weighted kappa coefficient for
           individual lateral cephalograms     pairs of lateral cephalograms
           (95% confidence limits)             (95% confidence limits)
1          0.49 (0.28-0.70)                    0.68 (0.53-0.83)
2          0.79 (0.64-0.93)                    0.71 (0.60-0.82)
3          0.70 (0.49-0.92)                    0.79 (0.70-0.88)
4          0.60 (0.36-0.83)                    0.47 (0.28-0.65)
5          0.36 (0.14-0.58)                    0.58 (0.44-0.71)
6          0.58 (0.37-0.80)                    0.70 (0.59-0.81)
7          0.66 (0.47-0.85)                    0.72 (0.60-0.85)
8          0.70 (0.49-0.90)                    0.75 (0.62-0.87)
9          0.45 (0.26-0.63)                    0.51 (0.37-0.64)
10         0.67 (0.51-0.83)                    0.61 (0.49-0.73)

Table V. Cervical stage differences for intraobserver disagreements for individual lateral cephalograms

Cervical stages apart   Total disagreements
1                       66.37%
2                       25.66%
3                       5.31%
4                       2.65%
5                       0%

The validity of the CVM method is in its reproducibility among clinicians. If clinicians cannot agree on the stage of maturation, the method lacks clinical relevance. We also studied the range of disagreements; ie, when observers disagreed, how many stages apart were their observations? Disagreements between observers of 2 or more stages apart occurred in 26% and 31% of the total disagreements for the initial and 3-week time points, respectively. Furthermore, intraobserver disagreements that were 2 or more stages apart occurred in 34% of all intraobserver disagreements. These findings indicate a level of randomness to staging lateral cephalograms with the CVM method.

What do these results mean to a practicing orthodontist? To be of clinical value, this technique should be robust enough to allow clinicians to correctly identify the CS of maturation. However, our results demonstrate that, for our panel of 10 private-practice clinicians with mean experience of 19 years, this is not the case. At the clinical level, these indicators should be used only to augment the orthodontist’s other observations in making clinical decisions.

CONCLUSIONS
1. Interobserver agreement for CVM staging in our sample of practicing orthodontists was below 50%. Agreement improved marginally with the use of 2 longitudinal radiographs.
2. On average, clinicians agreed with their own staging only 62% of the time, 3 weeks later.
3. Staging for a select sample of subjects is much more consistent than for a larger random sample.
4. When our clinicians disagreed about the stage of a particular subject, there was a wide range (2 or more stages) of disagreement in approximately 1 in 4 cases.
5. The reproducibility for our trained clinicians was significantly below the level that is purported in the literature. Based on these results, we cannot recommend the CVM method as a strict clinical guideline for the timing of orthodontic treatments.
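
The aggregate figures quoted in the discussion and conclusions (62% intraobserver agreement, 5 of 10 observers with lower agreement for pairs, 7 of 10 with a higher weighted kappa, and a mean gain of 0.11) can be re-derived from the per-observer values in Tables III and IV. The Python sketch below is only an illustrative check; the variable names are assumptions introduced here, and the numbers are transcribed from the two tables.

```python
# Agreements per observer from Table III: out of 30 individual cephalograms
# and out of 60 cephalograms shown as pairs.
individual_agreements = [17, 22, 24, 20, 13, 19, 19, 21, 13, 19]
pair_agreements = [41, 37, 41, 39, 32, 36, 44, 43, 28, 31]

# Weighted kappa coefficients per observer from Table IV.
kappa_individual = [0.49, 0.79, 0.70, 0.60, 0.36, 0.58, 0.66, 0.70, 0.45, 0.67]
kappa_pairs = [0.68, 0.71, 0.79, 0.47, 0.58, 0.70, 0.72, 0.75, 0.51, 0.61]

# Pooled intraobserver agreement across the 10 observers.
total_individual = sum(individual_agreements)            # agreements out of 300
overall_individual_pct = 100.0 * total_individual / 300
overall_pairs_pct = 100.0 * sum(pair_agreements) / 600

# Observers whose agreement rate dropped when staging pairs.
# p / 60 < i / 30 is equivalent to p < 2 * i, which avoids float comparison.
lower_with_pairs = sum(
    p < 2 * i for i, p in zip(individual_agreements, pair_agreements)
)

# Observers whose weighted kappa rose with pairs, and their mean gain.
kappa_gains = [kp - ki for ki, kp in zip(kappa_individual, kappa_pairs) if kp > ki]
mean_kappa_gain = sum(kappa_gains) / len(kappa_gains)

print(f"Individual cephalograms: {total_individual}/300 = {overall_individual_pct:.1f}%")
print(f"Pairs of cephalograms: {overall_pairs_pct:.1f}%")
print(f"Observers with lower agreement for pairs: {lower_with_pairs}")
print(f"Observers with higher kappa for pairs: {len(kappa_gains)}")
print(f"Mean kappa gain among those observers: {mean_kappa_gain:.2f}")
```

Pooling the counts rather than averaging the per-observer percentages gives the same result here because every observer staged the same number of cephalograms.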