The Melasma Area and Severity Index (MASI), the most commonly used outcome measure for melasma, has not been validated.
We sought to determine the reliability and validity of the MASI.
After standardized training, 6 raters independently rated 21 patients with mild to severe melasma once daily over a period of 2 days to determine intrarater and interrater reliability. Validation was performed by comparing the MASI with the melasma severity scale. The darkness component of the MASI was validated by comparing it with the difference between mexameter scores for affected versus adjacent normal-appearing skin. The area component of the MASI was validated by comparing it with the area of each section of the face determined by computer-based measurement software.
The MASI score showed good reliability within and between raters and was found to be valid when compared with the melasma severity scale, mexameter scores, and area measurements. Homogeneity assessment by raters showed the least agreement and can be removed from the MASI score without any loss of reliability.
Patients were limited to Hispanic, African, and Asian backgrounds.
The MASI is a reliable measure of melasma severity. Area of involvement and darkness are sufficient for accurate measurement of the severity of melasma and homogeneity can be eliminated.
The Melasma Area and Severity Index (MASI) is a reliable measure of the severity of melasma, showing temporal stability and internal consistency.
The melasma severity scale, colorimetric readings of pigmentation intensity, and area measurement using computer software established validation of the MASI.
Measurement of darkness and area of involvement is sufficient in determining severity of melasma, whereas homogeneity is unreliable and should be eliminated from the MASI.
A new modified MASI score is recommended as a result of this study.
Melasma is a common, persistent disorder of hyperpigmentation affecting a significant portion of the population, particularly women with skin types IV to VI living in areas with intense ultraviolet radiation.
Conventional treatment of melasma includes elimination of any causative factors coupled with the use of sunscreens and hypopigmenting agents, often in combination with other therapies. Despite the availability of these measures, melasma is often recalcitrant to treatment and frustrating for both patients and clinicians. To develop new agents for melasma, valid and reliable outcome measures are critical to determining clinical severity of disease and the significance of treatment results when performing clinical trials. These outcome measures should be easy to learn, responsive, inexpensive, and applicable worldwide. Unfortunately, no such measure exists for melasma.
The Melasma Area and Severity Index (MASI), an outcome measure designed to quantify the severity of melasma and its changes during therapy more accurately, was developed by Kimbrough-Green et al, who based it on a similar scoring system devised for psoriasis (Fig 1). The MASI score is calculated by subjective assessment of 3 factors: area (A) of involvement, darkness (D), and homogeneity (H), with the forehead (f), right malar region (rm), left malar region (lm), and chin (c), corresponding to 30%, 30%, 30%, and 10% of the total face, respectively. The area of involvement in each of these 4 areas is given a numeric value of 0 to 6 (0 = no involvement; 1 = <10%; 2 = 10%-29%; 3 = 30%-49%; 4 = 50%-69%; 5 = 70%-89%; and 6 = 90%-100%). Darkness and homogeneity are rated on a scale from 0 to 4 (0 = absent; 1 = slight; 2 = mild; 3 = marked; and 4 = maximum). The MASI score is calculated by adding the sum of the severity ratings for darkness and homogeneity, multiplied by the value of the area of involvement and weighted by the region's share of the face, for each of the 4 facial areas:
MASI = 0.3A(f)[D(f) + H(f)] + 0.3A(rm)[D(rm) + H(rm)] + 0.3A(lm)[D(lm) + H(lm)] + 0.1A(c)[D(c) + H(c)]
The total score range is 0 to 48. The MASI is the most commonly used outcome measure for melasma trials but has never been validated. The purpose of this prospectively designed study was to determine the reliability and validity of the MASI score.
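The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not software used in the study: the function name `masi_score`, the dictionary layout, and the example ratings are all hypothetical, and the regional weights are taken from the 30%, 30%, 30%, and 10% proportions stated in the text.

```python
# Regional weights from the text: forehead 30%, each malar region 30%, chin 10%.
WEIGHTS = {"f": 0.3, "rm": 0.3, "lm": 0.3, "c": 0.1}


def masi_score(ratings):
    """Return the MASI total for one patient.

    `ratings` maps each region key ("f", "rm", "lm", "c") to a tuple
    (area 0-6, darkness 0-4, homogeneity 0-4). The layout is an assumption
    made for this sketch.
    """
    total = 0.0
    for region, weight in WEIGHTS.items():
        area, darkness, homogeneity = ratings[region]
        if not (0 <= area <= 6 and 0 <= darkness <= 4 and 0 <= homogeneity <= 4):
            raise ValueError(f"rating out of range for region {region!r}")
        # Sum of darkness and homogeneity, multiplied by area, weighted by region.
        total += weight * area * (darkness + homogeneity)
    return total


# Hypothetical ratings for illustration only (not study data).
example = {"f": (3, 2, 2), "rm": (4, 3, 2), "lm": (4, 3, 3), "c": (1, 1, 1)}
print(round(masi_score(example), 1))
```

With the maximum rating in every region (area 6, darkness 4, homogeneity 4) the function returns the upper bound of 48 given in the text, up to floating-point rounding.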
This prospective study was approved by the institutional review board of the University of Texas Southwestern Medical Center, Dallas, and all patients gave written informed consent. Six raters were recruited for the study, including 5 dermatologists and one dermatology resident. Two raters had extensive experience, two had minimal experience, and two had no experience scoring the MASI. The raters were given a 30-minute lecture and slide presentation describing the MASI; shown examples of different levels of area, darkness, and homogeneity; and shown 4 practice images of patients with melasma. These images were scored by raters, after which the answers were discussed. The raters were also shown an image of each level of the melasma severity scale (MSS) (Fig 2), which is a nonvalidated, simple, easy-to-use categorical scale (range: 0 = none, 1 = mild, 2 = moderate, and 3 = severe) developed for a large multicenter trial using a combination cream for patients with melasma.
In all, 21 patients with mild to severe melasma (MSS score 1-3) were recruited from local dermatology clinics. Inclusion criteria included bilateral epidermal or mixed melasma and age 18 years or older. The 6 raters examined the patients independently and rated the patient's degree of melasma using the MASI and the MSS. The following day, the same patients were seated in different rooms and the raters scored each patient again.
All patients completed a short demographic questionnaire and were photographed using a stereotactic face device (Canfield Scientific, Fairfield, NJ). A narrowband reflectance spectrophotometer (Mexameter MX-16, Courage and Khazaka, Cologne, Germany) was used to measure degree of pigmentation of involved and adjacent uninvolved skin in all affected areas. The percent of involved skin on digital images was measured using computer imaging software (Scion Image for Windows, Scion Corp, Frederick, MD).
Weighted kappa statistics were computed to assess the intrarater reliability for each rater by comparing MASI scores on all patients from day 1 to 2, so that ratings that were directly adjacent to one another received a higher weight than ratings spaced further apart. Strength of agreement of kappas was as follows: fair = 0.21 to 0.40, moderate = 0.41 to 0.60, and substantial = 0.61 to 0.80.
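A weighted kappa of the kind described above can be illustrated as follows. The text does not state whether linear or quadratic weights were used, so this sketch assumes linear weights. The function names and the example ratings are hypothetical, and the study itself used SAS and SPSS rather than this code.

```python
def weighted_kappa(day1, day2, n_categories):
    """Linearly weighted kappa for two ratings of the same subjects.

    Ratings are integers in 0..n_categories-1 (for example 0-4 for darkness).
    Linear weighting is an assumption; the source does not specify the scheme.
    """
    if len(day1) != len(day2) or not day1:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(day1)
    k = n_categories
    # Observed joint proportions of (day 1 rating, day 2 rating).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(day1, day2):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    obs_dis = 0.0  # weighted observed disagreement
    exp_dis = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            # Adjacent categories are penalized less than distant ones.
            w = abs(i - j) / (k - 1)
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]
    if exp_dis == 0:
        raise ValueError("kappa undefined: all ratings fall in one category")
    return 1.0 - obs_dis / exp_dis


def agreement_label(kappa):
    """Label a kappa using the bands quoted in the text."""
    value = round(kappa, 2)
    if 0.21 <= value <= 0.40:
        return "fair"
    if 0.41 <= value <= 0.60:
        return "moderate"
    if 0.61 <= value <= 0.80:
        return "substantial"
    return "outside the bands listed in the text"


# Hypothetical darkness ratings (0-4) by one rater on two days.
day1 = [0, 1, 2, 3, 4]
day2 = [1, 1, 2, 3, 3]
kappa = weighted_kappa(day1, day2, n_categories=5)
print(round(kappa, 2), agreement_label(kappa))
```

Here two of the five hypothetical ratings shift by one category between days, so the disagreement is penalized only lightly and the kappa falls in the "substantial" band.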
MASI scores were compared by location (forehead, right malar region, left malar region, and chin) combined with area of involvement, darkness of melasma, and homogeneity of hyperpigmentation for intrarater reliability assessment. Interrater reliability was assessed using Spearman rank order correlations comparing MASI darkness ratings between raters separately on days 1 and 2, by location. Both intrarater and interrater reliabilities were also assessed using intraclass correlation coefficients, in which a reliability of 0.6 or greater is considered good.
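As an illustration of the interrater comparison, a Spearman rank order correlation between two raters' darkness ratings can be computed as below. This is a minimal sketch: the helper names and the example ratings are hypothetical. Tied ratings, which are common on a 0 to 4 scale, are given average ranks, a standard convention that the text does not spell out.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx = sum(rx) / len(rx)
    my = sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho undefined: one rater gave identical ratings")
    return cov / sqrt(vx * vy)


# Hypothetical forehead darkness ratings by two raters for six patients.
rater_a = [1, 2, 2, 3, 4, 0]
rater_b = [1, 1, 2, 3, 3, 0]
print(round(spearman_rho(rater_a, rater_b), 2))
```

In the study, one such correlation would be obtained for each pair of raters, facial location, and day, and then averaged.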
Validation of the MASI was performed comparing MSS (ratings for all subjects performed by all raters), mexameter (darkness of affected skin compared with normal-appearing adjacent skin), and size, based on computer imaging (area of skin affected through digital images). The MASI total score was compared with the MSS using analysis of variance (ANOVA) models combined across raters for each day with MASI total score as the continuous response variable and MSS score as the grouping variable. Validity was considered good if the ANOVA models were significant and the mean MASI score increased with increasing category scores for MSS using the linear contrast for trend; Tukey-Kramer adjustments for multiple pairwise mean comparisons were also conducted. Additional analyses (results not reported) compared MASI total score by levels of MSS for each rater separately by day using ANOVA. Pairwise differences between MASI and MSS scores by each rater were also examined using Spearman correlations. The darkness subcomponent of MASI was correlated with the difference in mexameter scores between normal-appearing and involved skin for each of the 4 facial locations. The area subcomponent was correlated with area calculation of involved skin on digital images using software (Scion Image for Windows, Scion Corp). Validity was considered good if the Spearman correlation between these measures was high (≥0.40) and significant.
In all analyses, two-sided P values less than .05 were considered statistically significant. Data were analyzed using software (SAS, Version 9.1.3, SAS Institute Inc, Cary, NC; and SPSS 15.0, SPSS Inc, Chicago, IL).
A total of 21 patients were recruited for this study. Demographic information is presented in Table I. The majority of the patients identified themselves as Hispanic, whereas two patients self-identified as Asian and one as African. An item analysis of the ratings was performed, which showed that all levels for all 3 components of the MASI in all 4 areas of the face were used, indicating a wide spectrum of melasma severity in the patient sample.
The average results across raters for intrarater reliability (weighted kappa) presented in Table II (available online at www.eblue.org) show moderate agreement for the area, darkness, and homogeneity values for the forehead and darkness values for the malar region. There was fair agreement for all chin values and area and homogeneity values for the malar region. Average interrater reliability (Spearman) was greater than 0.4 in all 4 locations of the face on both days (Table III, available online at www.eblue.org). Average correlations improved from day 1 to 2 and met the minimum criterion for good reliability except for the chin. Overall, the average intrarater reliability comparing the MASI total score on days 1 and 2 (intraclass correlation coefficient = 0.75) (Table IV, available online at www.eblue.org) exceeded the minimum criterion of 0.6.
Validity analyses using ANOVA models showed that mean MASI total scores increased significantly with increasing categories of MSS levels across raters on both days (P < .0001 for linear trend on both days) (Fig 3). All pairwise comparisons for MASI total score between MSS categories were significant (maximum P < .0001) on both days, separately, except for mild versus moderate on day 2 (P = .061). These results were robust for each individual rater when examining the linear trend test and pairwise comparisons among mild, moderate, and severe MSS categories by day. Validity using Spearman correlation for MASI total score versus MSS score revealed good agreement overall for all raters on both days (average rho = .73) (Table V, available online at www.eblue.org). A similar analysis of correlations between MASI darkness ratings and the mexameter difference between normal-appearing and involved skin showed poor agreement (rho = 0.14 and 0.25) for the forehead but good agreement on the malar areas and the chin (rho = 0.45-0.61) (Table VI, available online at www.eblue.org). Validity analysis between the area of involvement calculated by computer imaging software (Scion Image for Windows, Scion Corp) and MASI area ratings by rater revealed good correlation for most raters and locations (Table VII, available online at www.eblue.org). Correlations for the forehead and chin were low on day 1 and for rater 3 (average day 1 rho = 0.2653); however, this improved on day 2 (average day 2 rho = 0.4184). The MASI total score was substantially higher when the MSS category was severe, and the severe melasma category could be distinguished clearly from the mild and moderate categories. When the MSS rating was changed from mild to moderate for these subjects, the corresponding change in the mean MASI total score was 0.7 (±2.5) points; therefore, the MASI, being a continuous scale, is able to provide more precise numeric estimates of melasma severity than the categorical MSS.
Until now, the MASI has never been tested for reliability or validity. Although a simple outcome, such as number of patients who cleared, is optimal, melasma is frequently a chronic, recalcitrant disorder; therefore, a measure of disease severity of all degrees is needed. Multiple instruments to measure hyperpigmentation have been developed; however, these instruments are expensive, and measurements can be time-consuming. We were able to use such instruments to validate clinical measures of melasma in the current study, indicating that the MASI measures what it is designed to measure.
Outcome measures should have internal consistency, have construct validity, and demonstrate sensitivity to change.
The MASI has face validity, as it attempts to measure the size and darkness of pigmentation associated with melasma, which are the most frequent symptoms of patients with this disorder. Experts in melasma have used the MASI score as the predominant outcome measure for almost 20 years, suggesting that it has content validity as well.
This study demonstrated that the overall MASI is reliable in the measurement of melasma and that raters were consistent in their assessments between days 1 and 2, demonstrating temporal stability. Interrater reliability testing showed consistency in the use of the scale; however, individual components of the MASI were problematic, particularly assessment of homogeneity and the chin. Homogeneity was the most difficult component to reliably assess in our patients. Removal of homogeneity did not alter reliability or validity measures; therefore, we recommend removal of homogeneity from the MASI score. The chin may be more difficult to assess than other components of the face because of its small size and less involvement with melasma than other areas. Nevertheless, we recommend that the evaluation of the chin be retained, as it is an integral part of evaluation of the face, represents only 10% of the overall score, and variations in assessment cause little change to the MASI score. The modified MASI we propose is easy to learn and perform and is scored as follows:
Modified MASI = 0.3A(f)D(f) + 0.3A(rm)D(rm) + 0.3A(lm)D(lm) + 0.1A(c)D(c)
The range of the total score is 0 to 24. Area and darkness are scored as follows: area of involvement: 0 = absent, 1 = <10%, 2 = 10%-29%, 3 = 30%-49%, 4 = 50%-69%, 5 = 70%-89%, and 6 = 90%-100%; darkness: 0 = absent, 1 = slight, 2 = mild, 3 = marked, and 4 = severe.
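The modified score, which drops homogeneity and keeps the same regional weights, can be sketched as follows. The function names, the data layout, and the example values are hypothetical. Treating the published area bands as half-open intervals (so that 29.5% still maps to a score of 2) is an assumption of this sketch, since the text lists only whole-number percentages.

```python
# Regional weights from the text: forehead 30%, each malar region 30%, chin 10%.
WEIGHTS = {"f": 0.3, "rm": 0.3, "lm": 0.3, "c": 0.1}


def area_score(percent_involved):
    """Map percent involvement of one facial region to the 0-6 area score."""
    if not 0 <= percent_involved <= 100:
        raise ValueError("percent_involved must be between 0 and 100")
    if percent_involved == 0:
        return 0
    # Upper bounds of bands 1-5; anything from 90% upward scores 6.
    for score, upper in enumerate((10, 30, 50, 70, 90), start=1):
        if percent_involved < upper:
            return score
    return 6


def modified_masi(ratings):
    """Return the modified MASI; `ratings` maps region -> (area 0-6, darkness 0-4)."""
    total = 0.0
    for region, weight in WEIGHTS.items():
        area, darkness = ratings[region]
        if not (0 <= area <= 6 and 0 <= darkness <= 4):
            raise ValueError(f"rating out of range for region {region!r}")
        total += weight * area * darkness
    return total


# Hypothetical percent involvement and darkness per region (not study data).
measured = {"f": (35, 2), "rm": (55, 3), "lm": (60, 3), "c": (5, 1)}
ratings = {r: (area_score(pct), dark) for r, (pct, dark) in measured.items()}
print(round(modified_masi(ratings), 1))
```

The maximum ratings in every region (area 6, darkness 4) give 24, matching the range stated above.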
Validity testing showed that the MASI, being a continuous scale, is able to provide more precise numeric estimates of melasma severity than the categorical MSS. Importantly, the mean MASI scores presented in Fig 3 may be useful as threshold scores for entry of patients with moderate and severe melasma in future studies, as there is currently no consensus on threshold levels for entry into a melasma study. For the modified MASI proposed in the current study, the values should be reduced by half.
In summary, the MASI score shows good reliability within and between raters and good validity when compared with the MSS, mexameter readings, and computer-assisted area measurements. Removing chin and homogeneity measurements does not improve intraclass correlation coefficient reliability, but because kappa statistics were poor for homogeneity and because it is a difficult assessment to teach and perform, we recommend removal of homogeneity from the MASI score. We also recommend that chin assessments be preserved in the MASI, as the chin is an important part of facial assessment and represents only 10% of the whole index. Practice of raters is important, as more agreement was noted on day 2 versus day 1. The validation and reliability testing of the modified MASI score required a training module, including practice images, before initiating patient assessments; therefore, this training should be given to raters to ensure reliability in future studies. If desired, readers may contact the corresponding author for access to this training module. Future studies assessing the modified MASI over time in individual patients should be performed to determine the sensitivity to change of this outcome measure.
The authors would like to thank Christina Carrigan, RN, for assistance in conducting this study.
Intrarater reliability: weighted kappa statistics comparing Melasma Area and Severity Index ratings across days
Supported by an unrestricted educational grant from Galderma International.
Disclosure: Drs Pandya, Grimes, Nordlund, Ortonne, Rendon and Taylor are members of the Pigmentary Disorders Academy and receive honoraria as consultants from Galderma International for their work on behalf of the Academy. Drs Hynan, Bhore, Guevara, Gottschalk, and Agim, and Ms Copeland Riley have no conflicts of interest to declare.