Although these two cases are arranged differently in CF-netCDF, each one contains, sometimes implicitly, a datum or a coordinate conversion formula and is therefore regarded as a coordinate reference construct by the data model. A grid mapping name or the standard name of a parametric vertical coordinate corresponds to a string-valued scalar parameter of a coordinate conversion formula. A grid mapping parameter which has more than one value (as is possible with the "standard parallel" attribute) corresponds to a vector parameter of a coordinate conversion formula. A data variable referenced by a formula_terms attribute corresponds to the term of a coordinate conversion formula—either a domain ancillary construct or, if it is zero-dimensional, a scalar parameter.
This representation also allows one to have a variable number of elements in different profiles, at the cost of some wasted space. In that case, any unused elements of the data and auxiliary coordinate variables must contain missing data values (section 9.6). If the profile instances have the same number of elements and the vertical coordinate values are identical for all instances, you may use the orthogonal multidimensional array representation. This has either a one-dimensional coordinate variable, z, provided the vertical coordinate values are ordered monotonically, or a one-dimensional auxiliary coordinate variable, alt, where o is the element dimension. In the former case, listing the vertical coordinate variable in the coordinates attributes of the data variables is optional. A "reduced" longitude-latitude grid is one in which the points are arranged along constant latitude lines with the number of points on a latitude line decreasing toward the poles.
Storing this type of gridded data in two-dimensional arrays wastes space, and results in the presence of missing values in the 2D coordinate variables. We recommend that this type of gridded data be stored using the compression scheme described in Section 8.2, "Lossless Compression by Gathering". Compression by gathering preserves structure by storing a set of indices that allows an application to easily scatter the compressed data back to two-dimensional arrays.
The compressed latitude and longitude auxiliary coordinate variables are identified by the coordinates attribute. To geo-reference data horizontally with respect to the Earth, a grid mapping variable may be provided by the data variable, using the grid_mapping attribute. If the coordinate variables for a horizontal grid are not longitude and latitude, then a grid_mapping variable provides the information required to derive longitude and latitude values for each grid location. If no grid mapping variable is referenced by a data variable, then longitude and latitude coordinate values shall be supplied in addition to the required coordinates. For example, the Cartesian coordinates of a map projection may be supplied as coordinate variables and, in addition, two-dimensional latitude and longitude variables may be supplied via the coordinates attribute on a data variable. The use of the axis attribute with values X and Y is recommended for the coordinate variables .
If the time series instances have the same number of elements and the time values are identical for all instances, you may use the orthogonal multidimensional array representation. This has either a one-dimensional coordinate variable, time, provided the time values are ordered monotonically, or a one-dimensional auxiliary coordinate variable, time, where o is the element dimension. In the former case, listing the time variable in the coordinates attributes of the data variables is optional. In data for discrete sampling geometries written according to the rules of this section, wherever there are unused elements in data storage, the data variable and all its auxiliary coordinate variables must contain missing values. This situation may arise for the incomplete multidimensional array representation, and in any representation if the instance dimension is set to a larger size than the number of features currently stored. Data variables should also contain missing values to indicate when there is no valid data available for the element, although the coordinates are valid.
The orthogonal multidimensional array representation , the simplest representation, can be used if each feature instance in the collection has identical coordinates along the element axis of the features. For example, for a collection of the timeSeries that share a common set of times, or a collection of profiles that share a common set of vertical levels, this is likely to be the natural representation to use. In both examples, there will be longitude and latitude coordinate variables, x, y, that are one-dimensional and defined along the instance dimension. If there is only a single feature to be stored in a data variable, there is no need for an instance dimension and it is permitted to omit it.
The data will then be one-dimensional, which is a special case of the multidimensional array representation. The instance variables will be scalar coordinate variables; the data variable and other auxiliary coordinate variables will have only an element dimension and not have an instance dimension, e.g. data and t for a single timeSeries. To represent data at scattered locations and times with no implied relationship among of coordinate positions, both data and coordinates must share the same instance dimension. Because each feature contains only a single data element, there is no need for a separate element dimension. The representation of point features is a special, degenerate case of the standard four representations.
The coordinates attribute is used on the data variables to unambiguously identify the relevant space and time auxiliary coordinate variables. Character strings labelling the elements of an axis are regarded as string-valued auxiliary coordinate variables. The coordinatesattribute of the data variable names the variable that contains the string array. An application processing the variables listed in the coordinatesattribute can recognize a string-valued auxiliary coordinate variable because it has a type of char or string.
If the variable has a type of char, the inner dimension is the maximum length of each string, and the other dimensions are axis dimensions. If an auxiliary coordinate variable has a type of string and has no dimensions, or has a type of char and has only one dimension , it is a string-valued scalar coordinate variable (seeSection 5.7, "Scalar Coordinate Variables"). As such, it has the same information content and can be used in the same contexts as a string-valued auxiliary coordinate variable of a size one dimension.
If a data variable has two or more scalar coordinate variables, they are regarded as though they were all independent coordinate variables with dimensions of size one. At the cost of some wasted space, the multidimensional array representation also allows one to have a variable number of profiles for different trajectories, and varying numbers of levels for different profiles. In these cases, any unused elements of the data and auxiliary coordinate variables must contain missing data values (section 9.6). At the cost of some wasted space, the multidimensional array representation also allows one to have a variable number of profiles for different stations, and varying numbers of levels for different profiles. When storing multiple trajectories in the same file, and the number of elements in each trajectory is the same, one can use the multidimensional array representation.
This representation also allows one to have a variable number of elements in different trajectories, at the cost of some wasted space. This is both to enhance conformance to COARDS and to facilitate the use of generic applications that recognize the NUG convention for coordinate variables. An application that is trying to find the latitude coordinate of a variable should always look first to see if any of the variable's dimensions correspond to a latitude coordinate variable. If the latitude coordinate is not found this way, then the auxiliary coordinate variables listed by the coordinates attribute should be checked.
Note that it is permissible, but optional, to list coordinate variables as well as auxiliary coordinate variables in the coordinates attribute. CF-netCDF variables named by the formula_terms attribute of a CF-netCDF coordinate variable correspond to domain ancillary constructs. These CF-netCDF variables may be coordinate, scalar coordinate, or auxiliary coordinate variables, or they may be data variables. The CF conventions generalize and extend the COARDS conventions . This standard also relaxes the COARDS constraints on dimension order and specifies methods for reducing the size of datasets.
For each of the tie point coordinate and auxillary coordinate variables having a bounds_tie_points attribute, add the bounds attribute to the target coordinate variable and create the target domain bounds variable. For each interpolation subarea, apply the interpolation method to reconstitute the target domain bound values and store these in the target domain bound variables. Repeat this step for each combination of indices of the non-interpolated dimensions. If an axis attribute is attached to an auxiliary coordinate variable, it can be used by applications in the same way the axis attribute attached to a coordinate variable is used. For instance, auxiliary coordinate variables which lie on the horizontal surface can be identified as such by their dimensions being horizontal.
Horizontal dimensions are those whose coordinate variables have an axis attribute of X or Y, or a units attribute indicating latitude and longitude. The dimensions of the domain must be stored with the dimensionsattribute, and the presence of a dimensions attribute will identify the variable as a domain variable. Therefore thedimensions attribute must not be present on any variables that are to be interpreted as data variables.
It is necessary to list these dimensions, rather than inferring them from the contents of the other attributes, as it can not be guaranteed that the referenced variables span all of the required dimensions . The value of the dimensionsattribute is a blank separated list of the dimension names. There is no restriction on the order in which the dimensions appear in thedimensions attribute string. If a domain has no named dimensions then the value of the dimensions attribute must be an empty string, as could be the case if the dimensions of the domain are all defined implicitly by scalar coordinate variables.
The use of scalar coordinate variables is a convenience feature which avoids adding size one dimensions to variables. A numeric scalar coordinate variable has the same information content and can be used in the same contexts as a size one numeric coordinate variable. Similarly, a string-valued scalar coordinate variable has the same meaning and purposes as a size one string-valued auxiliary coordinate variable (Section 6.1, "Labels"). Note however that use of this feature with a latitude, longitude, vertical, or time coordinate will inhibit COARDS conforming applications from recognizing them. For each of the tie point coordinate and auxillary coordinate variables, create the corresponding target coordinate variable.
For each interpolation subarea, apply the interpolation method, as described in Section J.3, "Interpolation Methods", to reconstitute the target domain coordinate values and store these in the target domain coordinate variables. CF-netCDF auxiliary coordinate variables and non-numeric scalar coordinate variables correspond to auxiliary coordinate constructs. Reconstitution of the uncompressed coordinate and auxiliary coordinate variables is based on interpolation. To accomplish this, the target domain is segmented into smaller interpolation subareas, for each of which the interpolation method is applied independently. For the reconstitution of the uncompressed coordinate and auxiliary coordinate variables within an interpolation subarea, the interpolation method is permitted to access its defining tie points, and no others.
A data variable defines its domain via its own attributes, but a domain variable provides the description of a domain in the absence of any data values. The variable should be a scalar (i.e. it has no dimensions) of arbitrary type, and the value of its single element is immaterial. It acts as a container for the attributes that define the domain.
The purpose of a domain variable is to provide domain information to applications that have no need of data values at the domain's locations, thus removing any ambiguity when retrieving a domain from a dataset. Ancillary variables and cell methods are not part of the domain, because they are only defined in relation to data values. The expanded form of the grid_mapping attribute is required if one wants to store coordinate information for more than one coordinate reference system. In this case each coordinate or auxiliary coordinate is defined explicitly with respect to no more than one grid_mapping variable. This syntax may be used to explicitly link coordinates and grid mapping variables where only one coordinate reference system is used.
In this case, all coordinates and auxiliary coordinates of the data variable not named in the grid_mapping attribute are unrelated to any grid mapping variable. All coordinate names listed in the grid_mapping attribute must be coordinate variables or auxiliary coordinates of the data variable. The subscripts o and p distinguish the data elements that compose a single feature. For example in a collection of timeSeries features, each time series instance, i, has data values at various times, o. In a collection of profile features, the subscript, o, provides the index position along the vertical axis of each profile instance.
We refer to data values in a feature as its elements , and to the dimensions of o and p as element dimensions . Each feature can have its own set of element subscripts o and p. For instance, in a collection of timeSeries features, each individual timeSeries can have its own set of times. The notation t means there is a set of times with subscripts o for the elements of each feature i.
Feature instances within a collection need not have the same numbers of elements. If the features do all have the same number of elements, and the sequence of element coordinates is identical for all features, savings in simplicity and space are achievable by storing only one copy of these coordinates. This is the essence of the orthogonal multidimensional representation (see section 9.3.1). For some applications the coordinates of a data variable can require considerably more storage than the data itself. Space may be saved in the netCDF file by storing a subsample of the coordinates that describe the data. The uncompressed coordinate and auxiliary coordinate variables can be reconstituted by interpolation, from the subsampled coordinate values to the domain of the data (i.e. the target domain).
The creator of the compressed dataset can control the accuracy of the reconstituted coordinates through the degree of subsampling and the choice of interpolation method, see Appendix J, Coordinate Interpolation Methods. A grid mapping variable may be referenced by a data variable in order to explicitly declare the coordinate reference system used for the horizontal spatial coordinate values. For example, if the horizontal spatial coordinates are latitude and longitude, the grid mapping variable can be used to declare the figure of the earth (WGS84 ellipsoid, sphere, etc.) they are based on. The direction of positive (i.e., the direction in which the coordinate values are increasing), whether up or down, cannot in all cases be inferred from the units. The direction of positive is useful for applications displaying the data.
For this reason the attribute positive as defined in the COARDS standard is required if the vertical axis units are not a valid unit of pressure — otherwise its inclusion is optional. This attribute may be applied to either coordinate variables or auxiliary coordinate variables that contain vertical coordinate data. A domain axis construct (Figure I.3) comprises a positive integer which specifies the number of cells lying along an independent axis of the domain. In CF-netCDF, it is usually defined either by a netCDF dimension or by a scalar coordinate variable, which implies a domain axis of size one. The field construct's data array spans the domain axis constructs of the domain, except that the size-one axes may optionally be omitted, because their presence makes no difference to the order of the elements. Hence, the data array may be zero-dimensional (i.e. scalar) if there are no domain axis constructs of size greater than one.
If all of the profiles along any given trajectory have the same set of vertical coordinates values, the vertical auxiliary coordinate variable could be dimensioned alt. In the latter case, listing the vertical coordinate variable in the coordinates attribute is optional. If all of the profiles at any given station have the same set of vertical coordinates values, the vertical auxiliary coordinate variable could be dimensioned alt.
The variable, row_size , is the count variable containing the length of each time series feature. It is identified by having an attribute with name sample_dimension whose value is name of the sample dimension . The sample dimension could optionally be the netCDF unlimited dimension.
The variable bearing the sample_dimension attribute must have the instance dimension as its single dimension, and must have an integer type. This variable implicitly partitions into individual instances all variables that have the sample dimension. The auxiliary coordinate variables lat , lon , alt and station_name are station variables. Table 9.2 illustrates the storage of a data variable using the orthogonal multidimensional array representation. The individual features, distinguished by color, are sequenced along the horizontal axis by the instance dimension indices, i1, i2, i3, i4.
Each instance contains three elements, sequenced along the vertical with element dimension indices, o1, o2, o3. The i and o subscripts would be interchanged (i.e. Table 9.2 would be transposed) if the element dimension were the netCDF unlimited dimension. A bounds_tie_points attribute must be defined for each tie point coordinate variable corresponding to reconstituted coordinates with cell boundaries. An example of the usage of the bounds_tie_points is shown in Example 8.7. Because identification of a coordinate type by its units is complicated by requiring the use of an external software package , we provide two optional methods that yield a direct identification. The attribute axis may be attached to a coordinate variable and given one of the values X, Y, Z or T which stand for a longitude, latitude, vertical, or time axis respectively.
Alternatively the standard_name attribute may be used for direct identification. But note that these optional attributes are in addition to the required COARDS metadata. The NUG conventions provide the _FillValue, missing_value,valid_min, valid_max, and valid_range attributes to indicate missing data. Missing data is allowed in data variables and auxiliary coordinate variables. Generic applications should treat the data as missing where any auxiliary coordinate variables have missing values; special-purpose applications might be able to make use of the data. There are two methods used to identify variables that contain coordinate data.
The first is to use the NUG-defined "coordinate variables." The use of coordinate variables is required for all dimensions that correspond to one dimensional space or time coordinates . In cases where coordinate variables are not applicable, the variables containing coordinate data are identified by the coordinates attribute. For each of the target coordinate and auxillary coordinate variable having a bounds attribute, add the bounds_tie_points attribute to the tie point coordinate variable and create the bounds tie point variable.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.