Segmentation algorithms are usually evaluated by comparing their output with gold-standard delineations, without regard to the ultimate medical decision-making task. We compare two segmentation evaluation methods, a Dice similarity coefficient (DSC) evaluation and a diagnostic classification task-based evaluation, using lesions from breast computed tomography. In our investigation, we use results from two previously developed lesion-segmentation algorithms: a global active contour (GAC) model and a global + local active contour (GLAC) model. Although the two models obtained similar DSC values (0.80 versus 0.77), we show that the GLAC model, as compared with the GAC model, yields significantly improved classification performance in terms of area under the receiver operating characteristic (ROC) curve in the task of distinguishing malignant from benign lesions [area under the ROC curve … compared to 0.63]. This is mainly because the GLAC model better captures the detailed information required in the calculation of morphological features. Based on our findings, we conclude that the DSC metric alone is not sufficient for evaluating lesion segmentations in computer-aided diagnosis tasks.
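To make the two evaluation criteria concrete, the following is a minimal sketch of how each could be computed. It is an illustration under stated assumptions, not the procedure used in the study: the function names `dice_coefficient` and `auc_mann_whitney` are hypothetical, masks are assumed to be flattened binary sequences, the toy numbers are invented for the example, and the empirical (Mann-Whitney) AUC estimator is a stand-in because the abstract does not say how the ROC curve was fitted.

```python
def dice_coefficient(seg, ref):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Assumption: masks are flat sequences of 0/1 of equal length
    (e.g. a flattened voxel grid); this layout is illustrative only.
    """
    if len(seg) != len(ref):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for s, r in zip(seg, ref) if s and r)
    total = sum(1 for s in seg if s) + sum(1 for r in ref if r)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


def auc_mann_whitney(malignant_scores, benign_scores):
    """Empirical AUC: fraction of (malignant, benign) pairs in which the
    malignant case receives the higher classifier score; ties count 1/2.

    Assumption: a nonparametric estimator chosen for illustration.
    """
    if not malignant_scores or not benign_scores:
        raise ValueError("both classes need at least one case")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


if __name__ == "__main__":
    # Toy data, invented for this example only.
    segmentation = [1, 1, 1, 0, 0, 0]
    gold_standard = [0, 1, 1, 1, 0, 0]
    print(f"DSC = {dice_coefficient(segmentation, gold_standard):.3f}")

    malignant = [0.9, 0.8, 0.4]
    benign = [0.3, 0.5, 0.2]
    print(f"AUC = {auc_mann_whitney(malignant, benign):.3f}")
```

The DSC depends only on voxel overlap, whereas the AUC depends on classifier scores derived from features of the segmented lesion, so two segmentations with near-identical DSC can still lead to different AUC values.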