rVariables

rVariables all have the same dimensionality (number of dimensions and dimension sizes). An example of the type of data set that may be stored in a CDF's rVariables is shown in Table 1.1. Each record holds one value for each of the four variables: Time, Longitude, Latitude, and Temperature. CDF can store scalar data in a "flat" (0-dimensional) representation such as this, but storing it this way may hide fundamental relationships among the data values.

Consistent repetitions in the data suggest another way to organize this data set. Every fourth record is an observation at the same point on Earth at a different time, a fact that is not immediately clear from the flat representation. Looking more closely, only two distinct values are recorded for Longitude and, similarly, only two distinct values for Latitude. This repetition suggests a 2-dimensional array structure whose dimensions are defined by Longitude and Latitude: for each of the two Longitude values there are two Latitude values. Time repeats for each Longitude/Latitude pair because the observations were taken simultaneously at all of the longitude/latitude locations. Consequently, the number of distinct Time values specifies the number of records needed in the CDF.

Each record conceptually contains one 2-dimensional array per rVariable (Table 1.2). The array structure defines the dimensionality of the rVariables in the CDF. Although there are four rVariables, the number of dimensions and the sizes of those dimensions are determined only by Longitude and Latitude. Temperature varies across the entire array, while Time tells us how many records to expect. The example, when reduced as described, therefore defines a CDF with 2-dimensional rVariables. The size of each dimension is the number of distinct values of the rVariable that defines it. For example, Longitude has two unique values, so the dimension defined by Longitude has a size of two.
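
The reorganization described above can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the CDF library: the sample values are hypothetical (they are not the values of Table 1.1), and the names `flat_records`, `unique_in_order`, and `to_conceptual` are invented for this sketch.

```python
# Hypothetical flat records: (time, longitude, latitude, temperature).
# The values are invented for illustration; they are not those of Table 1.1.
flat_records = [
    (0, -165.0, 40.0, 20.0),
    (0, -165.0, 30.0, 21.7),
    (0, -150.0, 40.0, 19.2),
    (0, -150.0, 30.0, 20.7),
    (100, -165.0, 40.0, 18.2),
    (100, -165.0, 30.0, 19.3),
    (100, -150.0, 40.0, 22.0),
    (100, -150.0, 30.0, 19.2),
]


def unique_in_order(values):
    """Return the distinct values in order of first appearance."""
    seen = []
    for v in values:
        if v not in seen:
            seen.append(v)
    return seen


def to_conceptual(records):
    """Group flat records into one 2-D Temperature array per distinct Time."""
    times = unique_in_order(r[0] for r in records)
    lons = unique_in_order(r[1] for r in records)
    lats = unique_in_order(r[2] for r in records)
    # One [longitude][latitude] array per record, initially empty.
    temperature = [[[None] * len(lats) for _ in lons] for _ in times]
    for t, lon, lat, temp in records:
        temperature[times.index(t)][lons.index(lon)][lats.index(lat)] = temp
    return times, lons, lats, temperature


times, lons, lats, temperature = to_conceptual(flat_records)
dim_sizes = (len(lons), len(lats))  # dimension sizes from distinct values
num_records = len(times)            # one record per distinct Time
```

With this sample data, the eight flat records reduce to two records, each holding a 2 x 2 array, which mirrors the step from the flat view to the conceptual view.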

Table 1.1: Example Data Set, "Flat" Representation (0-Dimensional)

Table 1.2: Example CDF, 2-Dimensional Representation (Conceptual)

Adding another independent rVariable, for instance Pressure, poses no difficulty for the example. Temperature would then depend on a specific Longitude, Latitude, and Pressure, giving a 3-dimensional array structure. In this 3-dimensional example, Longitude, Latitude, and Pressure define the number of dimensions for the rVariables in the CDF, and the size of each dimension is determined by the number of distinct values contained in the corresponding rVariable. Additional dependent rVariables would be stored in the same way as Temperature.
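
As a rough illustration of how the dimensionality follows from the independent rVariables, the sketch below derives the dimension sizes for the 3-dimensional case. The coordinate values, including the three Pressure levels, are assumptions made up for this example, and `independent` is a hypothetical name.

```python
# Hypothetical coordinate values for each independent rVariable.
independent = {
    "Longitude": [-165.0, -150.0],
    "Latitude": [40.0, 30.0],
    "Pressure": [1000.0, 850.0, 500.0],
}

# Each independent rVariable contributes one dimension whose size is
# the number of distinct values it contains.
dim_sizes = [len(set(values)) for values in independent.values()]
num_dims = len(dim_sizes)

# Number of values in one full array of a dependent rVariable
# (such as Temperature) in a single record.
values_per_record = 1
for size in dim_sizes:
    values_per_record *= size
```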

Although conceptually there is a 2-dimensional array structure for each rVariable in each record of the CDF, storing the data that way would not be efficient. For instance, the time for each record need only be stored once, rather than four times as shown in each 2-dimensional array (Table 1.2). This problem is avoided by specifying "variances." Each rVariable has variances associated with the array dimensions as well as with the records. "Record variance" indicates whether or not an rVariable has unique values from record to record in the CDF. Time changes for each record, so the record variance for Time is [TRUE]; one could also say that Time is record-variant. Latitude and Longitude repeat their values from record to record, so the record variance for each is [FALSE]; they are non-record-variant (NRV). The Temperature values change from record to record, so Temperature is record-variant. The record variances for this example are shown in Table 1.3.
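
The notion of record variance can be illustrated by testing whether an rVariable's conceptual array changes from one record to the next. The sketch below is an assumption-laden illustration: in a real CDF the variances are specified when a variable is created, not inferred from the data, and the values and the names `conceptual` and `record_variance` are hypothetical.

```python
# Conceptual view: for each rVariable, one 2-D array [longitude][latitude]
# per record. The values are hypothetical.
conceptual = {
    "Time": [[[0, 0], [0, 0]], [[100, 100], [100, 100]]],
    "Longitude": [[[-165.0, -165.0], [-150.0, -150.0]]] * 2,
    "Latitude": [[[40.0, 30.0], [40.0, 30.0]]] * 2,
    "Temperature": [
        [[20.0, 21.7], [19.2, 20.7]],
        [[18.2, 19.3], [22.0, 19.2]],
    ],
}


def record_variance(records):
    """Return True if any record differs from the first record."""
    return any(rec != records[0] for rec in records[1:])


rec_vary = {name: record_variance(recs) for name, recs in conceptual.items()}
```

For this sample data, Time and Temperature come out record-variant, while Longitude and Latitude come out non-record-variant, matching the description above.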

Similarly, the term "dimension variance" indicates whether or not an rVariable changes with respect to the CDF dimensions. In the example above with 2-dimensional rVariables, the Longitude rVariable defines the first dimension of the CDF, with its values repeating along the second dimension, so its dimension variances are [TRUE,FALSE]. The Latitude rVariable defines the second dimension of the CDF, with its values repeating along the first dimension, so its dimension variances are [FALSE,TRUE]. Because the Temperature values change for each latitude/longitude location, its dimension variances are [TRUE,TRUE]. Time does not change from one latitude/longitude location to another, so its values are the same along both dimensions and its dimension variances are [FALSE,FALSE]. The dimension variances for the above example are shown in Table 1.3.
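
Dimension variance can be illustrated in the same spirit by checking, for each dimension, whether the values change as the index along that dimension changes. This is a minimal sketch under the same assumptions as before: the sample values are hypothetical, `dimension_variances` is an invented helper, and a real CDF takes these variances as a specification instead of computing them.

```python
# One conceptual 2-D array [longitude][latitude] per record, per rVariable.
# The values are hypothetical.
conceptual = {
    "Time": [[[0, 0], [0, 0]], [[100, 100], [100, 100]]],
    "Longitude": [[[-165.0, -165.0], [-150.0, -150.0]]] * 2,
    "Latitude": [[[40.0, 30.0], [40.0, 30.0]]] * 2,
    "Temperature": [
        [[20.0, 21.7], [19.2, 20.7]],
        [[18.2, 19.3], [22.0, 19.2]],
    ],
}


def dimension_variances(records):
    """Return [varies along dimension 1, varies along dimension 2]."""
    # Dimension 1: compare each row against the first row.
    dim1 = any(
        arr[i][j] != arr[0][j]
        for arr in records
        for i in range(len(arr))
        for j in range(len(arr[0]))
    )
    # Dimension 2: compare each column against the first column.
    dim2 = any(
        arr[i][j] != arr[i][0]
        for arr in records
        for i in range(len(arr))
        for j in range(len(arr[0]))
    )
    return [dim1, dim2]


dim_vary = {name: dimension_variances(recs) for name, recs in conceptual.items()}
```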

Table 1.3: Example CDF, Specification for 2-Dimensional Representation

When the record and dimension variances have been defined correctly, the amount of physical storage needed for the CDF is drastically reduced. In the above example, a 2-dimensional array is not physically stored for each rVariable in each CDF record. Instead, the physical storage consists of just one value for Time in each CDF record, a single 1-dimensional array of values for each of the Longitude and Latitude rVariables (in the first CDF record only), and a full 2-dimensional array of values for Temperature in each CDF record. The actual physical storage (physical view) is shown in Table 1.4. The conceptual view of the CDF, however, is still that of one 2-dimensional array per rVariable in each CDF record, as shown in Table 1.2 (where the physically stored values are shown in boldface type).
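
The storage saving can be estimated with a short calculation. The sketch below assumes two records and the variances described in the text; `physical_count` and `physical_index` are hypothetical helpers that only model the idea of collapsing non-varying records and dimensions, and they do not describe the CDF library's actual file layout.

```python
dim_sizes = (2, 2)  # sizes of the Longitude and Latitude dimensions
num_records = 2     # hypothetical number of distinct Time values

# (record variance, dimension variances) per rVariable, as in the text.
variances = {
    "Time": (True, [False, False]),
    "Longitude": (False, [True, False]),
    "Latitude": (False, [False, True]),
    "Temperature": (True, [True, True]),
}


def physical_count(rec_vary, dim_vary):
    """Number of values that must be physically stored for one rVariable."""
    per_record = 1
    for size, varies in zip(dim_sizes, dim_vary):
        if varies:
            per_record *= size
    return per_record * (num_records if rec_vary else 1)


def physical_index(rec_vary, dim_vary, record, indices):
    """Map a conceptual position to the position of its stored value.

    Any record or dimension index that does not vary collapses to 0.
    """
    phys_record = record if rec_vary else 0
    phys_indices = tuple(i if v else 0 for i, v in zip(indices, dim_vary))
    return phys_record, phys_indices


conceptual_total = len(variances) * num_records * dim_sizes[0] * dim_sizes[1]
physical_total = sum(physical_count(rv, dv) for rv, dv in variances.values())
```

Under these assumptions the conceptual view holds 32 values while only 14 are physically stored, and the gap widens as records are added, because only Time and Temperature grow with the number of records.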

Table 1.4: Example CDF, 2-Dimensional Representation (Physical)

cdfsupport@nssdca.gsfc.nasa.gov