
J Res Med Sci. 2014 Jan; 19(1): 47–56.

Abstract

Background:

Selecting the correct statistical test and data mining method depends highly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are reviewed based on these scales, using several medical examples. We present two ordinal-variable clustering examples, ordinal variables being the more challenging type in analysis, using the Wisconsin Breast Cancer Data (WBCD).

Ordinal-to-Interval scale conversion example:

A breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold-standard groups of malignant and benign cases that had been identified by clinical tests.

Results:

The sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively; their specificity was comparable.

Conclusion:

By using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance can be achieved. Moreover, descriptive and inferential statistics, as well as the modeling approach, must be selected based on the scale of the variables.

Keywords: Biostatistics, breast cancer, cluster analysis, data mining, research design

INTRODUCTION

In medical research, the design of a study is the most important part, directing all subsequent steps of the research, especially the data analysis. A badly designed study can never be retrieved, whereas a poorly analyzed one can usually be re-analyzed.[1] Other important issues, such as sample size calculation, also depend on the kind of experimental design and the kind of measurements in the study. Above all, the main question is: what types of data are being measured? The subsequent steps of the analysis are indeed determined by the type of variable used.[2,3,4,5,6] In this regard, analysts assume that the variables have specific levels of measurement.

Stevens proposed his typology in 1946.[7] In his article, Stevens claimed that all measurement in science is conducted using four types of scales, which he called ‘nominal’, ‘ordinal’, ‘interval’, and ‘ratio’, unifying both qualitative variables (described by his ‘nominal’ type) and quantitative ones (to a different degree, all the rest of his scales). The concept of scale types later received the mathematical rigor that it lacked at its inception through the work of the mathematical psychologists Theodore Alper,[8,9] Louis Narens,[10,11] and R. Duncan Luce.[12,13,14] Nowadays, the ordinal scale is considered a qualitative variable.[15] However, this scale typology has received a lot of criticism,[6,16,17,18] and alternative scale taxonomies have therefore been suggested,[19] consisting of grades, ranks, counted fractions, counts, amounts, and balances.[6] Most of the conflict between the pro-Stevens (‘conservative’) and anti-Stevens (‘liberal’) camps begins after both sides agree that a certain variable is ordinal; they part company when analyzing the data generated by that variable. The exchange in Nursing Research between Armstrong and Knapp is illustrative of the competing positions.[20]

Measurement scales

Nominal scales are used only for qualitative classification. They measure only whether individual items belong to certain distinct categories; it is not possible to quantify or rank order the categories. Nominal data have no order, and the assignment of categories is arbitrary. It is also not possible to perform arithmetic or logical operations on nominal data.[18] Briefly, nominal data have three distinct features: 1) no ordering of the different categories, 2) no measure of distance between values, and 3) the categories can be listed in any order without affecting the relationships between them. Nominal variables are also called (nonranked) categorical variables in the literature. The number of occurrences in each category is referred to as the frequency count for that category.[6] Dichotomous (binary) variables are nominal variables that have only two categories or levels. Examples of nominal variables are gender, marital status, eye color, nationality, affiliation, religious preference, surgical outcome (dead/alive), blood type, epidemiological status (healthy, patient), and having any symptom in a questionnaire (yes/no).

A discrete-ordinal scale is a nominal variable whose different states are ordered in a meaningful sequence. Ordinal data have order, but the intervals between scale points may be uneven. Because of the lack of equal distances, arithmetic operations are not possible, but logical operations can be performed.[21] Under an ordinal scale, the subjects or objects are ranked in terms of the degree to which they possess a characteristic of interest.[6] An ordinal scale indicates direction, in addition to providing nominal information. In medicine, ordinal variables often describe the patient's characteristics, attitude, behavior, or status. Examples of ordinal variables include: stage of cancer (stage I, II, III, IV), education level (elementary, secondary, college), pain level (1-10 scale), satisfaction level (very dissatisfied, dissatisfied, neutral, satisfied, very satisfied), social status (upper, middle, lower), type of degree (BS, MS, PhD), Likert variables[22] such as an attitudinal response variable (agreement level) with four levels (strongly disapprove, disapprove, approve, strongly approve) or a four-item rating scale (always, often, sometimes, never), graduation rank, the visual analog scale (VAS), and BMI (body mass index)-based nutritional status (severely thin, thin, normal, overweight, and obese).

Continuous-ordinal scales occur when the measurements are continuous, but one is not certain whether they are on a linear scale, the only trustworthy information being the rank order of the observations. For example, if a scale is transformed by an exponential, logarithmic, or any other nonlinear monotonic transformation, it loses its interval-scale property. Here, it would be expedient to replace the observations by their ranks.[21]

Interval scales are metric scales that have constant, equal distances between values, but the zero point is arbitrary. They are measured on a linear scale, and can take on positive or negative values. It is assumed that the intervals keep the same importance throughout the scale.[21] In an interval scale, such as body temperature (°C, °F) or calendar dates, a difference between two measurements has meaning, but their ratio does not.[23] Counts are interval scale measurements, such as counts of publications or citations, years of education, intelligence (IQ test score), BMI, and age (years).

Ratio scales are metric scales and the most informative. A ratio scale is an interval scale with the additional property that its zero position indicates the absence of the quantity being measured. Briefly, ratio scales have equal intervals between values, a meaningful zero point, and meaningful numerical relationships (e.g., division) between numbers. Examples of ratio scales include weight, pulse rate, respiratory rate, body temperature (°K), and body length in infants or height in adults. Since the statistical tests for ratio scales are the same as those for interval scales, the inferential statistics will be discussed for nominal, ordinal, and interval scales.

Statistics are part of our everyday life. Anyone who lacks fundamental statistical literacy, reasoning, and thinking skills might not be able to perform acceptable research. Kuzma provided a formal definition of the term ‘statistics’:[24]

‘A body of techniques and procedures dealing with the collection, organization, analysis, interpretation, and presentation of information that can be stated numerically.’ Statistical analysis is divided into two important branches: descriptive and inferential analysis.

Descriptive and inferential statistics for different types of variables

Descriptive statistics is the strategy of quantitatively describing the main features of a collection of data, presented through measures of central tendency and dispersion. The central tendency of nominal variables is the mode, the most common item. For ordinal variables, the median (the middle-ranked item) or the mode can be used as the central tendency estimate. For interval variables, the mode, median, and arithmetic mean can all be used; in addition to these, the geometric mean (the nth root of the product of the n data values) and the harmonic mean (the reciprocal of the arithmetic mean of the reciprocals of the data values) are allowed for ratio variables.
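As a minimal sketch of these rules (the data values below are illustrative assumptions, not taken from the paper), the Python standard library provides each of the central-tendency measures named above:

```python
# Central-tendency measures allowed at each scale (illustrative values).
from statistics import mode, median, mean, geometric_mean, harmonic_mean

blood_type = ["A", "O", "O", "B", "AB"]    # nominal: mode only
pain_level = [2, 3, 3, 5, 7]               # ordinal: mode and median
weight_kg = [61.0, 72.5, 80.2, 68.3]       # ratio: all measures below

print(mode(blood_type))                    # nominal scale
print(median(pain_level))                  # ordinal scale and above
print(mean(weight_kg))                     # interval scale and above
print(geometric_mean(weight_kg), harmonic_mean(weight_kg))  # ratio only
```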

Statistical dispersion is not defined for nominal and ordinal scales. For interval variables, the range and standard deviation can be used as dispersion measures; in addition to these, the studentized range (the difference between the largest and smallest data values, divided by the standard deviation) and the coefficient of variation (the ratio of the standard deviation to the mean) are allowed for ratio variables. Inferential statistics describes systems of procedures for drawing conclusions from datasets that are affected by random variation. Any statistical inference requires some assumptions. Testing, and possibly rejecting, a hypothesis is an important part of inferential statistics, using suitable parametric or nonparametric statistical tests. In parametric tests, the probability distributions describing the data-generating process are assumed to be fully described by a family of probability distributions involving only a finite number of unknown parameters, whereas in nonparametric tests far fewer assumptions are made about the data-generating process, which may be left completely unspecified. The purpose of the analysis and the measurement scale of the data define the suitable statistical test.[4] Usually, parametric statistical tests rely on the normality of the distribution of the interval-scale data; thus, normality tests such as the Kolmogorov-Smirnov or Shapiro-Wilk test are used to check the normality assumption.[25] The power of a parametric test is higher than that of the corresponding nonparametric test, so transformation of the interval variables is sometimes used to satisfy the normality assumption.[26]
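The dispersion measures defined above take only a few lines to compute; a minimal sketch with illustrative values (the variable name is an assumption):

```python
# Dispersion measures for a ratio-scale variable (illustrative values).
import numpy as np

ldl = np.array([4.1, 5.0, 5.2, 6.3, 7.9])   # hypothetical lab values
sd = ldl.std(ddof=1)                         # sample standard deviation
print(ldl.max() - ldl.min())                 # range
print((ldl.max() - ldl.min()) / sd)          # studentized range
print(sd / ldl.mean())                       # coefficient of variation
```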

The appropriate tests for different variable scales, for comparisons between two or more groups containing independent or paired samples, are listed in [Table 1]. The following clinical examples are given to elaborate the choice of the correct statistical test.

Table 1

Selecting the appropriate test for comparisons between two or more groups based on different scales


  • To compare the HDL (high-density lipoprotein) value between healthy and diabetic patients, the two-independent-sample t-test is used if the HDL values are normally distributed within the groups; otherwise, the Wilcoxon-Mann-Whitney test is used (see the sketch after this list).

  • To identify whether gender is equally distributed among abdominally obese people, the Chi-square test can be used.

  • To test whether the distribution of BMI-based nutritional status (severely thin, thin, normal, overweight, and obese) is the same among patients with liver cancer and controls, the Wilcoxon-Mann-Whitney test is used.

  • To find whether the prevalence of high diastolic pressure is similar in the normoalbuminuria, microalbuminuria, and macroalbuminuria groups, the Chi-square test can be used.

  • The effectiveness of an educational program on the correct diagnosis of a disorder is assessed using the McNemar test.

  • The difference in blood vitamin-D concentration among normal, pre-diabetic, and diabetic patients is assessed using one-way ANOVA.

  • The comparison of blood HbA1C concentration among pregnant women in the first, second, and third trimesters of pregnancy is performed using one-way repeated-measures ANOVA.
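As referenced in the first bullet, the sketch below shows one way such a decision could be coded with SciPy; the data are simulated and the 0.05 threshold is an illustrative convention, not a recommendation from the paper. The Chi-square example from the second bullet is included as well:

```python
# Choose a parametric or nonparametric two-group test after a normality
# check, then run a Chi-square test on a 2x2 contingency table.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
hdl_healthy = rng.normal(55, 8, 40)     # hypothetical HDL values (mg/dL)
hdl_diabetic = rng.normal(48, 8, 40)

if all(stats.shapiro(g)[1] > 0.05 for g in (hdl_healthy, hdl_diabetic)):
    p = stats.ttest_ind(hdl_healthy, hdl_diabetic)[1]     # parametric
else:
    p = stats.mannwhitneyu(hdl_healthy, hdl_diabetic)[1]  # nonparametric
print("two-group comparison p-value:", p)

table = np.array([[30, 20],    # rows: male, female
                  [25, 25]])   # columns: abdominally obese, not obese
chi2, p_chi, dof, expected = stats.chi2_contingency(table)
print("Chi-square p-value:", p_chi)
```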

Additionally, appropriate modeling methods for different variable scales are listed in [Table 2]. Modeling is usually used when we want to reduce the effect of confounders, and the type of modeling is determined by the scale of the dependent variable(s). Here are some clinical modeling examples.

Table 2

Selecting the appropriate test or model for different categories of dependent and independent variables


  • The gender-specific difference in blood vitamin-D concentration among normal, pre-diabetic, and diabetic patients is assessed using factorial ANOVA.

  • The effect of air-pollutant concentration on birth weight, considering the mother's nutritional status and supplementary intake, is determined using multiple linear regression.

  • In the latter example, if birth weight is categorized into underweight and normal groups, simple logistic regression is used (see the sketch after this list).

  • The effectiveness of a treatment on tumor stage (grades I-IV), controlling for confounders such as gender, age, and immunologic factors of the patients, is determined using ordered logistic regression.
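As referenced in the list above, here is a minimal sketch of the simple logistic regression example with a binary outcome; the data are simulated, the variable names are illustrative, and scikit-learn's LogisticRegression stands in for any standard implementation:

```python
# Simple logistic regression: underweight (1) vs. normal (0) birth weight
# as a function of air-pollutant concentration and supplementary intake.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 200
pollutant = rng.normal(40, 10, n)     # simulated pollutant concentration
supplement = rng.integers(0, 2, n)    # supplementary intake (no/yes)
X = np.column_stack([pollutant, supplement])
underweight = (pollutant - 10 * supplement
               + rng.normal(0, 15, n) > 45).astype(int)

model = LogisticRegression().fit(X, underweight)
print(model.coef_, model.intercept_)  # log-odds per unit of each predictor
```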

For a detailed description of the aforementioned methodologies, the reader is referred to selected textbooks and guidelines.[4,6,27,28,29,30]

Data mining for different types of variables

Data mining (DM) is the process of discovering new patterns embedded in large data sets, and it uses this information to build predictive models. Healthcare systems generate a great deal of complex data for which manual analysis has become impractical. DM can generate information useful to health care, for example by identifying effective treatments for patients. DM of medical data requires specific medical and DM knowledge. Medical DM activities include clustering, classification and estimation, and assessing treatment effectiveness.[31,32,33] In this section, we focus on clustering; however, the issues considered can be extended to other DM methods.

Clustering is the task of grouping a set of objects in such a way that objects belonging to the same cluster are similar to each other (homogeneity) and objects belonging to different clusters are dissimilar to each other (separation). A clinical example clarifies the clustering procedure: in 2000, Alizadeh et al. published a paper in Nature[34] in which the gene expression profiles (microarray) of 72 patients diagnosed with either acute myeloid leukemia (AML) or acute lymphoblastic leukemia (ALL) were analyzed. The authors could distinguish two groups corresponding to AML and ALL by clustering and could match the groups with the routine leukemia diagnoses. Based on this, Roland Eils designed an expert system for the prediction of genetic diseases.[35] In other words, when a new microarray gene profile is tested, it is possible to diagnose the type of leukemia.

The similarity between objects plays an important role in any clustering algorithm, since similar objects belong to the same cluster. An object could be a patient with a variety of recorded clinical data (features); similar objects have similar features. Features can be interval, ordinal, or nominal variables. The question is: how is similarity measured for the various types of data scales?

The dissimilarity measure (distance) can be easily defined for interval variables. The Euclidean, Manhattan, Maximum, Minkowski, Mahalanobis, Average, Chord, Canberra, and Czekanowski distances can be used in this case.[36] For nominal variables, the simple matching, Russell-Rao, Jaccard, Dice, Rogers-Tanimoto, and Kulczynski distances might be used, while more than 76 distance measures, such as the Yule, Sokal-Sneath-c, and Hamann measures, are available for binary data.[36,37,38] An example is shown in [Figure 1] for clarification. However, there are many problems in defining dissimilarity measures for ordinal variables: a distance measure for ordinal data cannot be defined unless an ordinal-to-interval variable conversion is used. Moreover, defining a proper similarity measure also affects statistical feature reduction and visualization techniques such as multidimensional scaling (MDS), in which the distance measure is defined for different measurement scales (e.g., using the weighted Euclidean model).[39,40,41,42]

Figure 1. An example of calculating the distance between two objects of ordinal variables, using the simple dissimilarity measure.
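A hedged sketch of a few of the measures named above, using scipy.spatial.distance; the feature vectors are illustrative assumptions:

```python
# Interval-scale distances (top) and binary/nominal distances (bottom).
import numpy as np
from scipy.spatial import distance

a = np.array([1.2, 3.4, 5.6])         # two interval-scale feature vectors
b = np.array([2.1, 3.0, 4.8])
print(distance.euclidean(a, b),
      distance.cityblock(a, b),       # Manhattan
      distance.chebyshev(a, b),       # Maximum
      distance.minkowski(a, b, p=3))

u = np.array([1, 0, 1, 1, 0], dtype=bool)   # two binary feature vectors
v = np.array([1, 1, 1, 0, 0], dtype=bool)
print(distance.jaccard(u, v),
      distance.dice(u, v),
      distance.rogerstanimoto(u, v),
      distance.russellrao(u, v))
```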

Ordinal to interval variable conversion

Consider the four-item rating scale (always, often, sometimes, never) that is widely seen in the questionnaires of psychological,[43] gastrointestinal,[44] nutritional,[45] and public health[46] research. One approach to handling ordinal variables is to introduce a dummy binary variable by merging [always, often] and [sometimes, never] into ‘yes’ and ‘no’. The ordering information is thus discarded, and a suitable binary distance measure can be used. However, some information is lost that could have improved the predictive performance of the groups’ dissimilarity.[47]

The other strategy is the monotonic nonrandom or random assignment of numbers to the rank order, treating them as if they conform to an interval scale.[48,49] The first approach is called equal-distance scoring (EDS), while the second is called monotonic random scoring (MRS) in the literature. Using EDS, interval values such as [0, 1, 2, 3] are used for the four-item rating scale. Accordingly, the distance between ‘sometimes’ and ‘never’ is the same as that between ‘sometimes’ and ‘often’, which is not necessarily correct. EDS has also received criticism in the literature and has proved inefficient even in correlation analysis in some cases where the ranks are not uniformly distributed.[50] Although MRS has been used extensively in the literature, it has also received criticism.[51] In MRS, uniform or normal monotonic random numbers are generated and used instead of the ordinal scale. Using MRS, the aforementioned four-item rating scale might be represented by the uniform monotonic random numbers [0.1270, 0.8147, 0.9058, 0.9134]; running the random number generator again, the new mapping would be [0.0975, 0.2785, 0.5469, 0.6324]. The questions are whether the transformation is unique at every MRS run, and whether the problem mentioned for EDS is resolved.
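A minimal sketch of the two scoring schemes just described, for the four-item rating scale; the seed is arbitrary, and re-running the MRS part with another seed illustrates that its mapping is not unique:

```python
# EDS assigns equally spaced scores; MRS assigns sorted random scores.
import numpy as np

levels = ["never", "sometimes", "often", "always"]
eds = np.arange(len(levels))                   # equal-distance scoring: 0,1,2,3

rng = np.random.default_rng(42)
mrs = np.sort(rng.uniform(size=len(levels)))   # monotonic random scoring
print(dict(zip(levels, eds)))
print(dict(zip(levels, mrs)))                  # changes with every seed/run
```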

The optimal ordinal-to-interval conversion is still debatable, and many sophisticated approaches have been introduced in the literature.[51,52] In none of them, however, was the mapping defined so as to maximize the separation of the groups in the clustering procedure. In the next section, clustering methods defined for different variable scales are discussed, and the relationship between this mapping and clustering is considered.

Clustering methods for different variable scales

Most previous clustering methods focus on interval data for which the dissimilarity could be calculated easily, such as density-based (DBSCAN,[53] OPTICS[54]), partitioning (k-means,[55] k-medoids,[56] fuzzy c-means,[57] ISODATA[58]), hierarchical (different linkage algorithms,[59,60] MONA,[61] DIANA[62]), and grid-based (WaveCluster,[63] Fractal Clustering[64]).

Nonranked categorical clustering algorithms have been extensively proposed in the literature, such as LIMBO,[65] COOLCAT,[66] CACTUS,[67] ROCK,[68] MMR,[69] CLICKS,[70] HD vector,[71] AUTOCLASS,[72] K-modes,[73] fuzzy K-modes,[74] fuzzy centroids,[75] and genetic fuzzy k-modes.[76] However, the dissimilarity measures and cluster representatives have a great impact on clustering performance and convergence.[77,78,79]

It is possible to use dummy binary variables for ordinal data and then use any of the above clustering methods, at the expense of losing detail. Few algorithms have been proposed for clustering ordinal data; examples are median fuzzy c-means[80] and a modified fuzzy c-means clustering method in which the ordinal-to-interval mapping is simultaneously determined by particle swarm optimization.[81] In the latter method, the mapping is calculated so as to maximize the inter-cluster distance and minimize the intra-cluster distance. This algorithm is one of the few clustering methods in which the mentioned transformation is adaptively estimated for each ordinal variable; it will be used in the next section of this manuscript for clustering a cancer dataset with ordinal variables.

Latent variable models

Latent variable models, specifically item response theory, have also been used for the modeling and clustering of ordinal data.[82,83,84] A mixture of item response models can be used for the clustering of such data: it is assumed that the observed ordinal data are discrete versions of an underlying latent Gaussian variable, and clustering is then achieved by fitting a mixture model to the latent Gaussian data.[85] However, this method relies on the posterior mean of the latent Gaussian data, and the Gaussian assumption is valid only for a sufficiently large data set (in the number of variables and the number of levels of the ordinal variables), which cannot always be taken for granted.[85]

Latent class analysis

Latent class analysis (LCA) is a subset of structural equation modeling used to find groups or subtypes of cases in multivariate categorical data. These subtypes are called ‘latent classes’.[86] One of the common statistical applications of LCA is clustering, for which LC cluster models have been introduced. These models have advantages over traditional clustering methods, such as probability-based classification (similar to fuzzy memberships), the handling of continuous, categorical, count,[87] or mixed-mode data,[88,89,90] and the use of demographics and other covariates in the clustering analysis.[91,92,93,94] LC models are model-based clustering methods in which explicit assumptions are made about the form of the probability density function describing the population of the observed data.[95,96] Clustering analysis and further inferences about the number of clusters and cluster membership are based on estimation of the unknown parameters of the probability model used.[97] The two main methods for estimating the parameters of the various types of LC cluster models are the maximum-likelihood (ML) and maximum-posterior (MAP) methods; a well-known problem in LC analysis is the occurrence of local solutions, so the analyst must interpret estimates cautiously. Moreover, the weak identifiability of LC clustering[98] and the complexity of the likelihood function and likelihood surface make the procedure sensitive to initial estimates.[99] Model selection, that is, estimating the number of clusters and the form of the model given the number of clusters, is also one of the main research topics in LC clustering. The Akaike (AIC), Bayesian (BIC), and consistent Akaike (CAIC) information criteria have been used for model selection.[100] Software packages such as MCLUST,[101] Mplus,[102] poLCA,[103] Latent GOLD,[104] and SAS[99] can be used for LC cluster analysis.[105]

Mixed data

In many applications, each instance in a data set is described by more than one type of attribute. For example, we might like to group people based on their recorded anthropometric or clinical data; this grouping can identify different diseases. The recorded data for each person contain gender (binary), the assignment to the underweight, normal, overweight, or obese class (ordinal), HDL and LDL cholesterol values (interval), and so on. This is an example of mixed-type data, for which the similarity and dissimilarity between two instances (e.g., people) cannot be calculated using the methods discussed so far. A general distance coefficient and a generalized Minkowski distance have been introduced for mixed-type data in the literature,[36] and other methods have also been proposed.[106,107,108,109,110,111,112]
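A hedged sketch of one such general coefficient, in the spirit of Gower's distance (a simplified, unweighted reading; the attribute kinds and ranges are illustrative assumptions, not the exact coefficient of reference [36]):

```python
# Gower-style distance for a mixed-type record: binary attributes use
# simple matching; ordinal/interval attributes are range-normalized.
def gower_distance(x, y, kinds, ranges):
    total = 0.0
    for xi, yi, kind, rng in zip(x, y, kinds, ranges):
        if kind == "binary":
            total += float(xi != yi)       # simple matching
        else:
            total += abs(xi - yi) / rng    # ordinal or interval
    return total / len(x)

# gender (binary), BMI class coded 0-3 (ordinal), HDL in mg/dL (interval)
p1 = (0, 2, 51.0)
p2 = (1, 3, 44.0)
print(gower_distance(p1, p2, ("binary", "ordinal", "interval"), (1, 3, 60.0)))
```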

ORDINAL-TO-INTERVAL SCALE CONVERSION EXAMPLE

Since there are few studies on ordinal data clustering, an example is given based on the breast cancer database obtained from the Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Original)). This database, known as the Wisconsin Breast Cancer Data (WBCD), with 98,032 web hits, was obtained from the University of Wisconsin Hospitals, Madison, by Dr. William H. Wolberg[113,114,115,116] and has been extensively used as a clustering benchmark in the literature.[81,117,118] There are 699 patient records in the database, and each attribute has 10 ordinal values. Sixteen patient records had missing values and were excluded; thus, the sample size was 683. Each record represents nine measurements made on a fine needle aspirate (FNA) taken from the patient's breast. The nine cytological measurements are clump thickness, size uniformity, shape uniformity, marginal adhesion, cell size, bare nuclei, bland chromatin, normal nucleoli, and mitosis. Each of these measurements is described by an ordinal integer label between 1 and 10; the larger the number, the greater the likelihood of malignancy.[115] These ratings were made by clinical experts. All malignant aspirates were histologically confirmed, whereas FNAs diagnosed as benign masses were biopsied only at the patient's request; the remainder of the benign cytologies were confirmed by clinical re-examination 3 and 12 months after the aspiration. Masses that produced unsatisfactory or suspicious FNAs were surgically biopsied.[114] Accordingly, 239 cases were diagnosed as malignant and 444 as benign. The class labels were saved as the gold standard, kept for comparison, and excluded from the data set; thus, a 683-record, nine-dimensional ordinal dataset was used for clustering. The number of clusters (groups) was estimated, and the accuracy of the malignant/benign classification was assessed by comparison with the gold standard.
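A hedged sketch of how the dataset described above could be loaded and prepared; the raw-file URL reflects the UCI repository layout at the time of writing and may change, and the column names are our own labels for the nine attributes:

```python
# Load the WBCD, drop the 16 records with missing values, and separate
# the nine ordinal attributes from the gold-standard class labels.
import pandas as pd

url = ("http://archive.ics.uci.edu/ml/machine-learning-databases/"
       "breast-cancer-wisconsin/breast-cancer-wisconsin.data")
cols = ["id", "clump_thickness", "size_uniformity", "shape_uniformity",
        "marginal_adhesion", "cell_size", "bare_nuclei", "bland_chromatin",
        "normal_nucleoli", "mitosis", "class"]  # class: 2=benign, 4=malignant
df = pd.read_csv(url, names=cols, na_values="?").dropna()

X = df[cols[1:-1]].astype(int)                  # 683 x 9 ordinal matrix
gold = df["class"].map({2: "benign", 4: "malignant"})
print(X.shape, gold.value_counts().to_dict())   # (683, 9), 444/239 split
```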

Since ordinal data clustering is more challenging than clustering other types of data, we consider two different ordinal clustering methods for analyzing the WBCD. The first approach was taken from the literature, while the second is proposed by the authors of this manuscript.

Ordinal data clustering based on modified FCM analysis (clustering #1)

Using the ordinal dataset, a modified fuzzy c-means whose ordinal-to-interval conversion was estimated by particle swarm optimization was used.[81] The algorithm was run for 2 to 10 clusters, and the clustering structure with the optimum Xie-Beni clustering validity index[119] was selected. In other words, the number of clusters with the best relative compactness (minimum intra-cluster distance) and separation (maximum inter-cluster distance) was chosen.[120] In the selected clustering structure, the malignant and benign clusters were identified by comparison with the gold standard, and the errors were reported. Errors comprised the number of malignant cases in the benign cluster and vice versa.

Ordinal data clustering based on modified OPTICS analysis (clustering #2)

The ordinal data were converted to interval data using the EDS algorithm, because the ranks were assigned equal spacing in the absence of prior expert-based knowledge. Then, the density-based clustering method OPTICS was used to identify the clustering structure. OPTICS resolves the problem of detecting meaningful clusters in data of varying density by ordering the points such that those that are spatially closest in the multidimensional space become neighbors in the ordering. OPTICS can identify the clustering structure and, unlike FCM, does not need major input parameters or postprocessing such as clustering validity analysis.[121] As with the previous clustering method, the malignant and benign clusters were identified by comparison with the gold standard, and the errors were reported.
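Continuing the loading sketch above, clustering #2 can be approximated with scikit-learn's OPTICS implementation; min_samples=40 mirrors the 40-nearest-neighbor setting reported in the Results, and is our assumption about how that setting maps onto this particular implementation:

```python
# EDS here simply keeps the equally spaced ranks 1..10 as interval values,
# then OPTICS orders the points and exposes the reachability distances.
from sklearn.cluster import OPTICS

optics = OPTICS(min_samples=40)          # density-based; no cluster count
labels = optics.fit_predict(X.values)    # X: the 683 x 9 matrix from above
rd_plot = optics.reachability_[optics.ordering_]  # values behind the RD-plot
print(sorted(set(labels)))               # -1 marks points treated as noise
```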

Clustering performance analysis

The values of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) were calculated for each of the aforementioned clustering methods by comparing the clustering results with the gold standard. Then, the information-theoretic parameters were calculated as follows:

Sensitivity (Se) = Recall (Re) = TP/(TP+FN);

Specificity (Sp) = TN/(FP+TN);

Precision (Pr) = TP/(TP+FP);

Type I error: FP rate (α) =1-Sp;

Type II error: FN rate (β) =1-Se;

Power =1-β =Se;

F-score =2*(Pr*Re)/(Pr+Re) = harmonic mean (Pr, Re);

Accuracy (Acc) = (TP+TN)/(TP+TN+FN+FP);
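A minimal sketch of the measures defined above as a single function; the counts passed in at the end are illustrative, not the paper's results:

```python
# Performance measures from the TP/TN/FP/FN counts defined above.
def performance(tp, tn, fp, fn):
    se = tp / (tp + fn)                  # sensitivity = recall = power
    sp = tn / (fp + tn)                  # specificity
    pr = tp / (tp + fp)                  # precision
    f = 2 * pr * se / (pr + se)          # harmonic mean of Pr and Re
    acc = (tp + tn) / (tp + tn + fp + fn)
    return {"Se": se, "Sp": sp, "Pr": pr, "alpha": 1 - sp,
            "beta": 1 - se, "F": f, "Acc": acc}

print(performance(tp=234, tn=431, fp=13, fn=5))   # illustrative counts
```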

The code for the two clustering algorithms given above and the validation program were written in Matlab (Matlab and Statistics Toolbox Release 2012b, The MathWorks, Inc., Natick, Massachusetts, United States) and is available upon request from the authors.

RESULTS

In the first clustering method, the Xie-Beni index showed the optimum value at two clusters, indicating that there were two clusters in the data, which is quite reasonable. The FCM clustering algorithm was run 10 times, and the clustering results with the best compactness and separation were used.[122] The ordinal-to-interval conversion matrix for nine ordinal variables with 10 ranks is listed in [Table 3]. The ranks of the different ordinal variables were transformed differently; in other words, the transformation was done so as to optimize the clustering structure. The performance of the first clustering method, compared with the gold standard, is listed in [Table 4].

Table 3

The ordinal-to-interval conversion matrix for nine ordinal variables (columns) with 10 ranks (rows) studied on the WBCD using the clustering method #1


Table 4

The performance of the clustering methods studied on the WBCD


Using clustering method #2 with 40 nearest neighbors (40-NN), the reachability distance plot (RD-plot) is shown in [Figure 2]. This 1D plot shows the clustering structure of the multidimensional data, in which each major local minimum corresponds to a cluster. In this plot, two major clusters were detected, corresponding to the malignant and benign groups, respectively. Although the major local minima could be detected manually, there are also methods for detecting clusters automatically.[121] The performance of this clustering method is shown in [Table 4].

Figure 2. The clustering structure of the WBCD, found by the second ordinal-variable clustering method. Each major valley (local minimum) of the reachability distance plot (RD-plot) corresponds to a possible cluster. In this example, the first cluster is the malignant group, while the second is the benign group.

The power of both clustering methods was 98%, while the type-I error (α) was 0.03 and 0.09 for clustering methods #1 and #2, respectively. In both clustering methods, the FN rate (β) was 0.02. An FN is much more serious than an FP, since it means that the subject will not be treated.[81] Both of the aforementioned methods showed ‘almost perfect agreement’ with the gold standard.

DISCUSSION

One of the important elements of good medical research is identifying the key variables of the study, together with their method of measurement (measurement scale) and unit of measurement.[123] In addition to the different types of variables,[124] such as independent (risk factors), dependent (outcome), confounding (intervening), and background variables, the scale of the variables (qualitative versus metric) plays an important role in selecting appropriate statistical tests. Because of the importance of selecting appropriate statistical comparison and modeling methods, they are listed in detail in [Tables 1 and 2], and clinical examples taken from different medical studies are given in this paper for better elaboration. Although the selection of appropriate tests has been studied previously,[4,6] this manuscript is one of the first of its kind to discuss different variable scales together with their suitable statistical and data mining methods, with several examples. Much of what has been written in the literature concerns clustering analysis and validity analysis of interval data,[62,120,125] while little has been said about the analysis of categorical variables. In this paper, we discussed different clustering methods for categorical data and, as a first in a review, used two different clustering methods to analyze the ordinal WBCD. The first approach had already been proposed and tested,[81] while the second was proposed by the authors. We hope that this review will be of use to researchers in the field of biomedical sciences.

One of the main limitations of this manuscript is that most of the nominal-data clustering methods were only mentioned and cited; no selection criterion was applied in this paper. We have been contacting the authors of the corresponding papers, and most of the clustering programs have been received; some were re-compiled on other operating systems, for example Linux, with the help of data-mining researchers from different countries. We will try to run several clustering algorithms for categorical data on standard benchmark datasets to obtain a fair comparison; this will be the focus of our future work.

ACKNOWLEDGEMENTS

The authors would like to thank Mr. Sobhan Goudarzi for the implementation of the first ordinal clustering algorithm. This study was supported by the University of Isfahan and Isfahan University of Medical Sciences.

Footnotes

Source of Support: This study was supported by the University of Isfahan and Isfahan University of Medical Sciences

Conflict of Interest: None declared

REFERENCES

1. Campbell MJ, Machin D. Medical Statistics: A Commonsense Approach. 2nd ed. London: Wiley; 1993. [Google Scholar]

2. Swinscow TDV, Campbell MJ. Statistics at Square One. London: BMJ Books; 2002. [Google Scholar]

3. Marusteri M, Bacarea V. Comparing Groups for Statistical Differences: How to Choose the Right Statistical Test? Biochemia medica. 2010;20:15–32. [Google Scholar]

4. McCrum-Gardner E. Which Is the Correct Statistical Test to Use. British Journal of Oral and Maxillofacial Surgery. 2008;46:38–41. [PubMed] [Google Scholar]

5. McDonald JH. Vol. 2. Sparky House Publishing Baltimore; 2009. Handbook of Biological Statistics. [Google Scholar]

6. Lawal B. Categorical Data Analysis with SAS and SPSS Applications. Mahwah, NJ: Lawrence Erlbaum Associates; 2003. p. vii, 561 p. [Google Scholar]

7. Stevens SS. On the Theory of Scales of Measurement. Science. 1946;103:677–80. [Google Scholar]

8. Alper M. A Note on Real Measurement Structures of Scale Type (M, M+1) Journal of Mathematical Psychology. 1985;29:73–81. [Google Scholar]

9. Alper TM. A Classification of All Order-Preserving Homeomorphism Groups of the Reals That Satisfy Finite Uniqueness. Journal of Mathematical Psychology. 1987;31:135–54. [Google Scholar]

10. Narens L. A General Theory of Ratio Scalability with Remarks About the Measurement-Theoretic Concept of Meaningfulness. Theory and Decision. 1981;13:1–70. [Google Scholar]

11. Narens L. On the Scales of Measurement. Journal of Mathematical Psychology. 1981;24:249–75. [Google Scholar]

12. Luce RD. Uniqueness and Homogeneity of Ordered Relational Structures. Journal of Mathematical Psychology. 1986;30:391–415. [Google Scholar]

13. Luce RD. Measurement Structures with Archimedean Ordered Translation Groups. Order. 1987;4:165–89. [Google Scholar]

14. Luce RD. Conditions Equivalent to Unit Representations of Ordered Relational Structures. Journal of Mathematical Psychology. 2001;45:81–98. [PubMed] [Google Scholar]

15. Mendenhall W, Sincich T. A Second Course in Statistics. 1996 [Google Scholar]

16. Lord FM. On the Statistical Treatment of Football Numbers. 1953 [Google Scholar]

17. Guttman L. A General Nonmetric Technique for Finding the Smallest Coordinate Space for a Configuration of Points. Psychometrika. 1968;33:469–506. [Google Scholar]

18. Velleman PF, Wilkinson L. Nominal, Ordinal, Interval, and Ratio Typologies Are Misleading. The American Statistician. 1993;47:65–72. [Google Scholar]

19. Mosteller F, Tukey JW. Data Analysis and Regression: A Second Course in Statistics. Addison-Wesley Series in Behavioral Science: Quantitative Methods. Reading, MA: Addison-Wesley; 1977. [Google Scholar]

20. Knapp TR. Treating Ordinal Scales as Interval Scales: An Attempt to Resolve the Controversy. Nursing Research. 1990;39:121–3. [PubMed] [Google Scholar]

22. Likert R. A Technique for the Measurement of Attitudes. Archives of psychology. 1932 [Google Scholar]

23. Campbell MJ, Machin D, Walters SJ. Wiley.com; 2010. Medical Statistics: A Textbook for the Health Sciences. [Google Scholar]

24. Kuzma JW. 1st edn. Palo Alto, Calif: Mayfield Pub. Co; 1984. Basic Statistics for the Health Sciences; p. xiv. 274 p. [Google Scholar]

25. Bland M. Oxford University Press; 2000. An Introduction to Medical Statistics. [Google Scholar]

26. Siegel S. Nonparametric Statistics for the Behavioral Sciences. 1956 [Google Scholar]

27. Pallant J. Open University Press; 2004. Spss Survival Manual: Version 12. [Google Scholar]

28. Brown BW. Vol. 40. CRC Press; 1997. Beyond Anova: Basics of Applied Statistics. [Google Scholar]

29. Forthofer RN, Lee ES, Hernandez M. Academic Press; 2006. Biostatistics: A Guide to Design, Analysis and Discovery. [Google Scholar]

30. Rosner BA. CengageBrain.com; 2011. Fundamentals of Biostatistics. [Google Scholar]

31. Lavrač N. Selected Techniques for Data Mining in Medicine. Artificial intelligence in medicine. 1999;16:3–23. [PubMed] [Google Scholar]

32. Lavrač N, Zupan B. Springer; 2005. Data Mining in Medicine. [Google Scholar]

33. Lavrač N, Zupan B. Data Mining in Medicine, in Data Mining and Knowledge Discovery Handbook. In: Maimon O, Rokach L, editors. US: Springer; 2005. pp. 1107–37. [Google Scholar]

34. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, et al. Distinct Types of Diffuse Large B-Cell Lymphoma Identified by Gene Expression Profiling. Nature. 2000;403:503–11. [PubMed] [Google Scholar]

35. Eils R. Expert System for Classification and Prediction of Genetic Diseases. WO Patent 2,002,047,007. 2002 [Google Scholar]

36. Gan G, Ma C, Wu J. Similarity and Dissimilarity Measures. In: Data Clustering: Theory, Algorithms, and Applications. Philadelphia: SIAM; 2007. pp. 67–106. [Google Scholar]

37. Boriah S, Chandola V, Kumar V. Similarity Measures for Categorical Data: A Comparative Evaluation. red. 2008;30:3. [Google Scholar]

38. Choi S-S, Cha S-H, Tappert C. A Survey of Binary Similarity and Distance Measures. Journal of Systemics, Cybernetics and Informatics. 2010;8:43–8. [Google Scholar]

39. Mead A. Review of the Development of Multidimensional Scaling Methods. The Statistician. 1992;41:27–39. [Google Scholar]

40. Young FW, Null CH. Multidimensional Scaling of Nominal Data: The Recovery of Metric Information with Alscal. Psychometrika. 1978;43:367–79. [Google Scholar]

41. Cox TF, Cox MA. Chapman & Hall; 1994. Multidimensional Scaling. Number 59 in Monographs on Statistics and Applied Probability. [Google Scholar]

42. Borg I. Springer; 2005. Modern Multidimensional Scaling: Theory and Applications. [Google Scholar]

43. Kristensen TS, Hannerz H, Høgh A, Borg V. The Copenhagen Psychosocial Questionnaire-a Tool for the Assessment and Improvement of the Psychosocial Work Environment. Scandinavian journal of work, environment & health. 2005:438–49. [PubMed] [Google Scholar]

44. Adibi P, Keshteli AH, Esmaillzadeh A, Afshar H, Roohafza H, Bagherian-Sararoudi H, et al. The Study on the Epidemiology of Psychological, Alimentary Health and Nutrition (Sepahan): Overview of Methodology. J Res Med Sci. 2012;17:S291–7. [Google Scholar]

45. Alderman MH, Cohen H, Madhavan S. Dietary Sodium Intake and Mortality: The National Health and Nutrition Examination Survey (Nhanes I) The Lancet. 1998;351:781–5. [PubMed] [Google Scholar]

46. Bruce B, Fries JF. The Stanford Health Assessment Questionnaire: A Review of Its History, Issues, Progress, and Documentation. The Journal of rheumatology. 2003;30:167–78. [PubMed] [Google Scholar]

47. Frank E, Hall M. Springer; 2001. A Simple Approach to Ordinal Classification. [Google Scholar]

48. Labovitz S. The Assignment of Numbers to Rank Order Categories. American Sociological Review. 1970;35:515–24. [Google Scholar]

49. O’Brien RM. The Use of Pearson's with Ordinal Data. American Sociological Review. 1979;44:851–7. [Google Scholar]

50. Mayer L. A Note on Treating Ordinal Data as Interval Data. American Sociological Review. 1971;36:519–20. [Google Scholar]

51. Allen MP. Conventional and Optimal Interval Scores for Ordinal Variables. Sociological Methods & Research. 1976;4:475–94. [Google Scholar]

52. Granberg-Rademacker JS. An Algorithm for Converting Ordinal Scale Measurement Data to Interval/Ratio Scale. Educational and Psychological Measurement. 2010;70:74–90. [Google Scholar]

53. Ester M, Kriegel H-P, Sander J, Xu X. in KDD; 1996. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise; pp. 226–31. [Google Scholar]

54. Ankerst M, Breunig MM, Kriegel H-P, Sander J. Optics: Ordering Points to Identify the Clustering Structure. ACM Sigmod Record. 1999;28:49–60. [Google Scholar]

55. MacQueen J. California, USA: in Proceedings of the fifth Berkeley symposium on mathematical statistics and probability; 1967. Some Methods for Classification and Analysis of Multivariate Observations; p. 14. [Google Scholar]

56. Kaufman L, Rousseeuw P. Clustering by Means of Medoids. 1987 [Google Scholar]

57. Bezdek JC. Kluwer Academic Publishers; 1981. Pattern Recognition with Fuzzy Objective Function Algorithms. [Google Scholar]

58. Ball GH, Hall DJ. Isodata, a Novel Method of Data Analysis and Pattern Classification. DTIC Document. 1965 [Google Scholar]

59. Defays D. An Efficient Algorithm for a Complete Link Method. The Computer Journal. 1977;20:364–6. [Google Scholar]

60. Sibson R. Slink: An Optimally Efficient Algorithm for the Single-Link Cluster Method. The Computer Journal. 1973;16:30–4. [Google Scholar]

61. Kaufman L, Rousseeuw PJ. Vol. 344. Wiley.com; 2009. Finding Groups in Data: An Introduction to Cluster Analysis. [Google Scholar]

62. Xu R, Wunsch D. Vol. 16. Neural Networks, IEEE Transactions; 2005. Survey of Clustering Algorithms; pp. 645–78. [PubMed] [Google Scholar]

63. Sheikholeslami G, Chatterjee S, Zhang A. Wavecluster: A Wavelet-Based Clustering Approach for Spatial Data in Very Large Databases. The VLDB Journal. 2000;8:289–304. [Google Scholar]

64. Barbará D, Chen P. Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining ACM; 2000. Using the Fractal Dimension to Cluster Datasets; pp. 260–4. [Google Scholar]

65. Andritsos P, Tsaparas P, Miller RJ, Sevcik KC. Springer; 2004. Limbo: Scalable Clustering of Categorical Data, in Advances in Database Technology-Edbt 2004; pp. 123–46. [Google Scholar]

66. Barbará D, Li Y, Couto J. Proceedings of the eleventh international conference on Information and knowledge managementACM; 2002. Coolcat: An Entropy-Based Algorithm for Categorical Clustering; pp. 582–9. [Google Scholar]

67. Ganti V, Gehrke J, Ramakrishnan R. Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining ACM; 1999. Cactus—Clustering Categorical Data Using Summaries; pp. 73–83. [Google Scholar]

68. Guha S, Rastogi R, Shim K. Rock: A Robust Clustering Algorithm for Categorical Attributes. Information systems. 2000;25:345–66. [Google Scholar]

69. Parmar D, Wu T, Blackhurst J. Mmr: An Algorithm for Clustering Categorical Data Using Rough Set Theory. Data & Knowledge Engineering. 2007;63:879–93. [Google Scholar]

70. Zaki MJ, Peters M. Data Engineering, 2005. ICDE 2005. Proceedings. 21st International Conference onIEEE; 2005. Clicks: Mining Subspace Clusters in Categorical Data Via K-Partite Maximal Cliques; pp. 355–6. [Google Scholar]

71. Zhang P, Wang X, Song PX-K. Clustering Categorical Data Based on Distance Vectors. Journal of the American Statistical Association. 2006;101:355–67. [Google Scholar]

72. Stutz J, Cheeseman P. Springer; 1996. Autoclass — a Bayesian Approach to Classification, in Maximum Entropy and Bayesian Methods; pp. 117–26. [Google Scholar]

73. Huang Z. Extensions to the K-Means Algorithm for Clustering Large Data Sets with Categorical Values. Data Mining and Knowledge Discovery. 1998;2:283–304. [Google Scholar]

74. Huang Z, Ng MK. Vol. 7. Fuzzy Systems, IEEE Transactions; 1999. A Fuzzy K-Modes Algorithm for Clustering Categorical Data; pp. 446–52. [Google Scholar]

75. Kim D-W, Lee KH, Lee D. Fuzzy Clustering of Categorical Data Using Fuzzy Centroids. Pattern recognition letters. 2004;25:1263–71. [Google Scholar]

76. Gan G, Wu J, Yang Z. A Genetic Fuzzy k-Modes Algorithm for Clustering Categorical Data. Expert Systems with Applications. 2009;36:1615–20. [Google Scholar]

77. Ng MK, Li MJ, Huang JZ, He Z. Vol. 29. Pattern Analysis and Machine Intelligence, IEEE Transactions; 2007. On the Impact of Dissimilarity Measure in K-Modes Clustering Algorithm; pp. 503–7. [PubMed] [Google Scholar]

78. Cao F, Liang J, Li D, Bai L, Dang C. A Dissimilarity Measure for the K-Modes Clustering Algorithm. Knowledge-Based Systems. 2012;26:120–7. [Google Scholar]

79. Bai L, Liang J, Dang C, Cao F. The Impact of Cluster Representatives on the Convergence of the K-Modes Type Clustering. 2012 [PubMed] [Google Scholar]

80. Geweniger T, Zülke D, Hammer B, Villmann T. Median Fuzzy C-Means for Clustering Dissimilarity Data. Neurocomputing. 2010;73:1109–16. [Google Scholar]

81. Brouwer RK, Groenwold A. Modified Fuzzy C-Means for Ordinal Valued Attributes with Particle Swarm for Optimization. Fuzzy sets and systems. 2010;161:1774–89. [Google Scholar]

82. Johnson VE, Albert JH. Springer; 1999. Ordinal Data Modeling. [Google Scholar]

83. Lee S-Y. Access Online via Elsevier; 2011. Handbook of Latent Variable and Related Models. [Google Scholar]

84. Qu Y, Piedmonte MR, Medendorp SV. Latent Variable Models for Clustered Ordinal Data. Biometrics. 1995:268–75. [PubMed] [Google Scholar]

85. McParland D, Gormley I. Clustering Ordinal Data via Latent Variable Models. In: Lausen B, Van den Poel D, Ultsch A, editors. Algorithms from and for Nature and Life. Springer International Publishing; 2013. pp. 127–35. [Google Scholar]

86. Lazarsfeld PF, Henry NW. Latent Structure Analysis. New York: Houghton Mifflin; 1968. p. ix, 294. [Google Scholar]

87. Bartholomew DJ, Knott M, Moustaki I. Vol. 899. Wiley.com; 2011. Latent Variable Models and Factor Analysis: A Unified Approach. [Google Scholar]

88. Everitt BS. A Finite Mixture Model for the Clustering of Mixed-Mode Data. Statistics & probability letters. 1988;6:305–9. [Google Scholar]

89. Everitt BS, Merette C. The Clustering of Mixed-Mode Data: A Comparison of Possible Approaches. Journal of Applied Statistics. 1990;17:283–97. [Google Scholar]

90. Moustaki I. A Latent Trait and a Latent Class Model for Mixed Observed Variables. British journal of mathematical and statistical psychology. 1996;49:313–34. [Google Scholar]

91. Magidson J, Vermunt JK. Latent Class Factor and Cluster Models, Bi-Plots, and Related Graphical Displays. Sociological methodology. 2001;31:223–64. [Google Scholar]

92. McLachlan GJ, Basford KE. Statistics: Textbooks and Monographs. New York: Dekker 1988; 1988. Mixture Models. Inference and Applications to Clustering; p. 1. [Google Scholar]

93. Vermunt JK, Magidson J. Latent Class Cluster Analysis. Applied latent class analysis. 2002:89–106. [Google Scholar]

94. Magidson J, Vermunt JK. Latent Class Models. The Sage handbook of quantitative methodology for the social sciences. 2004:175–98. [Google Scholar]

95. McLachlan G, Peel D. Wiley.com; 2004. Finite Mixture Models. [Google Scholar]

96. Fraley C, Raftery AE. Model-Based Clustering, Discriminant Analysis, and Density Estimation. Journal of the American Statistical Association. 2002;97:611–31. [Google Scholar]

97. Moustaki I, Papageorgiou I. Latent Class Models for Mixed Variables with Applications in Archaeometry. Computational statistics & data analysis. 2005;48:659–75. [Google Scholar]

98. Berzofsky M, Biemer PP. Weak Identifiability in Latent Class Analysis. [Google Scholar]

99. Lanza ST, Collins LM, Lemmon DR, Schafer JL. Proc Lca: A Sas Procedure for Latent Class Analysis. Structural Equation Modeling. 2007;14:671–94. [PMC free article] [PubMed] [Google Scholar]

100. Fraley C, Raftery AE. How Many Clusters. Which Clustering Method. Answers Via Model-Based Cluster Analysis? The Computer Journal. 1998;41:578–88. [Google Scholar]

101. Fraley C, Raftery AE. Mclust: Software for Model-Based Cluster Analysis. Journal of Classification. 1999;16:297–306. [Google Scholar]

102. Muthen LK, Muthén L. Los Angeles, CA: Muthén & Muthén; 1998. Mplus [Computer Software] [Google Scholar]

103. Linzer DA, Lewis JB. Polca: An R Package for Polytomous Variable Latent Class Analysis. Journal of Statistical Software. 2011;42:1–29. [Google Scholar]

104. Vermunt JK, Magidson J. Belmont (Mass.): Statistical Innovations Inc; 2005. Technical Guide for Latent Gold 4.0: Basic and Advanced. [Google Scholar]

105. Haughton D, Legrand P, Woolford S. Review of Three Latent Class Cluster Analysis Packages: Latent Gold, Polca, and Mclust. The American Statistician. 2009;63:81–91. [Google Scholar]

106. Ng MK, Wong JC. Clustering Categorical Data Sets Using Tabu Search Techniques. Pattern Recognition. 2002;35:2783–90. [Google Scholar]

107. Morlini I. A Latent Variables Approach for Clustering Mixed Binary and Continuous Variables within a Gaussian Mixture Model. Advances in Data Analysis and Classification. 2012;6:5–28. [Google Scholar]

108. Shih M-Y, Jheng J-W, Lai L-F. A Two-Step Method for Clustering Mixed Categroical and Numeric Data. Tamkang Journal of Science and Engineering. 2010;13:11–9. [Google Scholar]

109. Huang Z. Singapore: Proceedings of the 1st Pacific-Asia Conference on Knowledge Discovery and Data Mining, (PAKDD); 1997. Clustering Large Data Sets with Mixed Numeric and Categorical Values; pp. 21–34. [Google Scholar]

110. Ahmad A, Dey L. A k-Mean Clustering Algorithm for Mixed Numeric and Categorical Data. Data & Knowledge Engineering. 2007;63:503–27. [Google Scholar]

111. Hsu C-C, Chen C-L, Su Y-W. Hierarchical Clustering of Mixed Data Based on Distance Hierarchy. Information Sciences. 2007;177:4474–92. [Google Scholar]

112. Fayyad U, Bradley PS, Reina CA. Scalable System for Clustering of Large Databases Having Mixed Data Attributes. Google Patents. 2003 [Google Scholar]

113. Mangasarian OL, Street WN, Wolberg WH. Breast Cancer Diagnosis and Prognosis Via Linear Programming. Operations Research. 1995;43:570–7. [Google Scholar]

114. Wolberg WH, Mangasarian OL. Vol. 87. Proceedings of the National Academy of Sciences; 1990. Multisurface Method of Pattern Separation for Medical Diagnosis Applied to Breast Cytology; pp. 9193–6. [PMC free article] [PubMed] [Google Scholar]

115. Mangasarian OL, Setiono R, Wolberg W. Pattern Recognition Via Linear Programming: Theory and Application to Medical Diagnosis. Large-scale numerical optimization. 1990:22–31. [Google Scholar]

116. Bennett KP, Mangasarian OL. Robust Linear Programming Discrimination of Two Linearly Inseparable Sets. Optimization methods and software. 1992;1:23–34. [Google Scholar]

117. Akay MF. Support Vector Machines Combined with Feature Selection for Breast Cancer Diagnosis. Expert Systems with Applications. 2009;36:3240–7. [Google Scholar]

118. Setiono R, Liu H. Neural-Network Feature Selector. Neural Networks, IEEE Transactions. 1997;8:654–62. [PubMed] [Google Scholar]

119. Xie XL, Beni G. Vol. 13. Pattern Analysis and Machine Intelligence, IEEE Transactions; 1991. A Validity Measure for Fuzzy Clustering; pp. 841–7. [Google Scholar]

120. Halkidi M, Batistakis Y, Vazirgiannis M. Clustering Validity Checking Methods: Part Ii. ACM Sigmod Record. 2002;31:19–27. [Google Scholar]

121. Marateb HR, Muceli S, McGill KC, Merletti R, Farina D. Robust Decomposition of Single-Channel Intramuscular Emg Signals at Low Force Levels. Journal of Neural Engineering. 2011;8:066015. [PubMed] [Google Scholar]

122. Goudarzi S. Clustering Ordinal Data: Diagnosing Functional Gastrointestinal Disorders [in Farsi]. University of Isfahan; 2013. p. 92. [Google Scholar]

124. Fathalla MF, Fathalla MM. A Practical Guide for Health Researchers. World Health Organization, Regional Office for the Eastern Mediterranean; 2004. [Google Scholar]

125. Halkidi M, Batistakis Y, Vazirgiannis M. Cluster Validity Methods: Part I. ACM Sigmod Record. 2002;31:40–5. [Google Scholar]

