========================================================================
                                C D F
       F R E Q U E N T L Y    A S K E D   Q U E S T I O N S
========================================================================

FAQ Maintainer: Space Physics Data Facility (SPDF)
Email:          nasa-cdf-support@nasa.onmicrosoft.com
Address:        Space Physics Data Facility
                Heliophysics Science Division
                Sciences and Exploration Directorate
                NASA Goddard Space Flight Center
                Greenbelt, MD 20771, U.S.A.

Last Updated:   February 2023

========================================================================

------- Introductory CDF ----------------------------------------------------

 1.  What is CDF?
 2.  What is the current operational version of CDF?
 3.  What CDF utility programs are available?
 4.  What platforms does CDF run on?
 5.  Where do I get source code and information relevant to CDF and
     CDF utilities?
 6.  How widely used is CDF?
 7.  What are the differences between CDF and NetCDF, and CDF and HDF?

------- Manuals, Documentation, Man Pages --------------------------------

 8.  What documentation is available?
 9.  What are some publications/references relating to CDF?

------- Mailing List, User Support, Etc -----------------------------------

 10. How can I contact CDF User Support?
 11. How do I make a bug report?
 12. Is there a mailing list for CDF questions and announcements?

------- Common CDF Questions ---------------------------------------------

 13. What programming languages does CDF support?
 14. Are CDF data sets/files platform-independent?
 15. What's the difference between a single-file CDF and multi-file CDF?
 16. What are the differences between the Standard Interface, Internal 
     Interface, and the CDF Java APIs?
 17. What is a "variable"? 
 18. What compression algorithms does CDF support? 
 19. Can CDF delete a range of records? 
 20. Does CDF support random file access? 
 21. How do I optimize a CDF file so I can read data fast?
 22. How do I know a CDF is not corrupted or compromised?

------- General CDF --------------------------------------------------------

 23. What is the best way to represent my data using CDF?
 24. Can new versions of CDF applications read CDF files written using older
     versions of the CDF library?
 25. Can my application programs which work with old versions of the CDF
     library always be compiled with new versions of CDF?
 26. Is there any commercial or public domain visualization software that
     accepts CDF files?
 27. Are there any conversion programs available to convert non-CDF files
     into CDF files or vice versa?
 28. Can I convert HDF4, HDF5, netCDF, or FITS to CDF?
 29. How can I contribute my software to the CDF user community?


------------------------------------------------------------------------------

1. What is CDF? CDF is the Common Data Format. It is a conceptual data abstraction for storing, manipulating, and accessing multidimensional data sets. The basic component of CDF is a software programming interface that is a device-independent view of the CDF data model. The application developer is insulated from the actual physical file format for reasons of conceptual simplicity, device independence, and future expandability. CDF files created on any given platform can be transported to any other platform onto which CDF is ported and used with any CDF tools or layered applications. The CDF software, documentation, and user support services are provided by NASA and available to the public free of charge. There are no license agreements or costs involved in obtaining or using CDF. A more detailed introduction to CDF can be found in the CDF User's Guide.

2. What is the current operational version of CDF? CDF V3.9.0.0 is the current operational version. It was released in February 2023. CDF V3.0 or later will read a file that was created with any of the previous CDF versions, but a file created from scratch with CDF 3.0 or later won't be readable by applications built with pre-V3.0 libraries (e.g., 2.7, 2.6, 2.5, or 2.4). Note: the official CDF version of the ISTP project is CDF V2.5.

3. What CDF utility programs are available? The following utility programs (a.k.a. the CDF toolkit) are available in the CDF distribution package for the supported platforms:

   CDFcompare
      Displays the differences between two CDF files. Many options can be
      specified to select which data entities in the CDFs are compared and
      how.

   CDFconvert
      Converts various properties of a CDF. In all cases new CDFs are
      created (existing CDFs are not modified). Any combination of the
      following properties may be changed when converting a CDF:
      - A CDF created with an earlier release of the CDF library (e.g.,
        CDF V2.5) may be converted to the current library release.
      - The format of the CDF may be changed.
      - The data encoding of the CDF may be changed.
      - The variable majority of the CDF may be changed.

   CDFdir
      Displays a directory listing of a CDF's files. The .cdf file is
      displayed first, followed by the rVariable files and then the
      zVariable files (if either exist in a multi-file CDF) in numerical
      order.

   CDFedit
      Allows the display and/or modification of practically all of the
      contents of a CDF by way of a full-screen interface. It is also
      possible to run CDFedit in a browse-only mode if desired.

   CDFexport
      Allows the entire contents or a portion of a CDF file to be exported
      to the terminal screen, a text file, or another CDF. The variables to
      be exported can be selected, along with a filter range for each
      variable, which allows a subset of the CDF to be generated. When
      exporting to another CDF, a new compression and sparseness can be
      specified for each variable. When exporting to the terminal screen or
      a text file, the format of the output can be tailored as necessary.

   CDFinquire
      Displays the version of the CDF distribution being used and the
      default toolkit qualifiers.

   CDFmerge
      Merges two or more CDF files into a single CDF. It can merge metadata
      and/or data.

   CDFstats
      Produces a statistical report on a CDF's variable data. Both
      rVariables and zVariables are analyzed. For each variable it
      determines the actual minimum and maximum values (over all of the
      variable's records), the minimum and maximum values within the valid
      range (if the VALIDMIN and VALIDMAX vAttributes and corresponding
      entries are present in the CDF), and the monotonicity. An option
      exists to allow fill values (specified by the FILLVAL vAttribute) to
      be ignored when collecting statistics.

   SkeletonTable
      Creates an ASCII text file, called a skeleton table, containing
      information about a given CDF (SkeletonTable can also be instructed
      to output the skeleton table to the terminal screen). It reads a CDF
      file and writes the following information into the skeleton table:
      1. Format (single or multi file), data encoding, and variable
         majority.
      2. Number of dimensions and dimension sizes for the rVariables.
      3. gAttribute definitions (and gEntry values).
      4. rVariable and zVariable definitions and vAttribute definitions
         (with rEntry/zEntry values).
      5. Data values for all or a subset of the CDF's variables.
         Traditionally, only NRV variable values are written to a skeleton
         table; RV variable values may now also be written.
      The above information is written in a format that can be "understood"
      by the SkeletonCDF program, which reads a skeleton table and creates
      a new CDF (called a skeleton CDF).

   SkeletonCDF
      Makes a fully structured CDF, called a skeleton CDF, by reading a
      text file called a skeleton table. The skeleton table contains the
      information necessary to create a CDF that is complete in all
      respects except for record-variant (RV) variable values. (RV
      variables vary from record to record.) RV values are then written to
      the CDF by the execution of an application program. The SkeletonCDF
      program allows a CDF to be created with the following:
      1. The necessary header information - the number of dimensions and
         dimension sizes for the rVariables, format, data encoding, and
         variable majority.
      2. The gAttribute definitions and any number of gEntries for each.
      3. The rVariable and zVariable definitions.
      4. The vAttribute definitions and the entries corresponding to each
         variable.
      5. The data values for those variables that are non-record-variant
         (NRV). NRV variables do not vary from record to record.

   CDFValidate
      Validates whether a given file is a valid CDF file and, if it is,
      checks the integrity of the CDF's internal data structures.

   CDFDump
      Dumps the data contents of a CDF in a readable form. It can dump
      metadata and/or data from selected variables over a range of records.
      This program can also be used to validate a CDF.

   CDFIRsDump
      A diagnostic tool that dumps the internal data records (in hex) of a
      CDF. It shows the internal data structure of the file and, if there
      is a question about the file, may show what the problem is.

   CDFLeapSecondsInfo
      Shows how the CDF-based leap second table is being accessed, either
      externally or internally. Optionally, the table content can be shown.

4. On what platforms does CDF run?

   Platform                          Operating Systems
   --------                          -----------------
   DEC Alpha                         OSF/1, OpenVMS
   DECstation                        Ultrix, VMS
   HP 9000 series                    HP-UX
   PC                                Windows XP/7/8/10, Linux, Solaris,
                                     QNX, Cygwin, MinGW
   IBM RS6000 series                 AIX
   Macintosh                         Mac OS X
   NeXT                              Mach
   SGI Iris, Power series, Indigo    IRIX
   Sun                               SunOS, Solaris
   VAX                               VMS
   ARM (Raspbian/Fedora/Ubuntu)      Linux
   Itanium                           OpenVMS

The latest CDF should support all the platforms listed above, although the least popular ones, e.g., the HP-UX and IBM AIX operating systems, are no longer actively tested (due to lack of interest and hardware). If you need to run the latest CDF software on such platforms, please contact nasa-cdf-support@nasa.onmicrosoft.com.

5. Where do I get source code and information relevant to CDF and CDF utilities? The CDF software distribution is available at spdf.gsfc.nasa.gov/pub/software/cdf/dist for the Unix, Macintosh, Windows, and VMS operating systems. General information on CDF, including this FAQ, as well as the software, is also available at cdf.gsfc.nasa.gov.

6. How widely used is CDF? The CDF software package is used by hundreds of government agencies, universities, and private and commercial organizations as well as independent researchers on both national and international levels. CDF has been adopted by the International Solar-Terrestrial Physics (ISTP) project as well as the Central Data Handling Facilities (CDHF) as their format of choice for storing and distributing key parameter data.

7. What are the differences between CDF and netCDF, and CDF and HDF? The comparisons below are based on a high-level overview of each format, drawn from various publicly available documentation and articles. To the best of our knowledge, the information is accurate, although it may become dated as new versions of each package are released. The best and most complete way to evaluate which package fulfills your requirements is to acquire the documentation and software from each institution and examine them thoroughly.

  • CDF vs. netCDF

    CDF was designed and developed in 1985 by the National Space Science Data Center (NSSDC) at NASA/GSFC. CDF was originally written in FORTRAN and was only available in the VAX/VMS environment. NetCDF was developed a few years later at Unidata, part of the University Corporation for Atmospheric Research (UCAR). The netCDF model was based on the CDF conceptual model but provided a number of additional features (such as C language bindings, portability to a number of platforms, a machine-independent data format, etc.). Both models and their software have matured substantially since then and are quite similar in most respects, although they differ in the following ways:

    • Although the interfaces do provide the same basic functionality they do differ syntactically. (See users guides for details.)

    • NetCDF supports named dimensions (i.e., TEMP[x, y, ...]) whereas CDF utilizes the traditional logical (i.e., TEMP[true, true, ...]) method of indicating dimensionality.

    • CDF supports both single-file and multi-file formats, whereas netCDF supports only a single-file format.

    • CDF software can transparently access data files in any encoding currently supported by the CDF library (For example: a CDF application running on a Sun can read and write data encoded in a VAX format.) in addition to the machine-independent (XDR) encoding. netCDF-3 software reads and writes data in only the XDR data encoding, but netCDF-4 supports native encodings by default, using a "reader makes right" approach for portability.

    • The CDF library supports an internal caching algorithm in which the user can make modifications (if so desired) to tweak performance.

    • The netCDF data object is currently accessible via the HDF software; CDF is not.

    • As part of the CDF distribution, there exist a number of easy-to-use tools and utilities that enable the user to edit, browse, list, prototype, subset, export to ASCII, compare, etc. the contents of CDF data files.

  • CDF vs. HDF4

    CDF is a scientific data management software package and format based on a multidimensional (array) model. HDF is a Hierarchical Data Format developed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois. The HDF4 data model is based on hierarchical relationships and dependencies among data. Although the two models differ significantly in their level of abstraction and in the way their inherent structures are defined and accessed (in many ways it is like comparing apples to oranges), there is a large overlap in the types of scientific data that each can support. Some of the obvious differences are as follows:

    • The HDF4 structure is based on a tagged format, storing tag identifiers (i.e., utility, raster image, scientific data set, and Vgroup/Vdata tags) for each inherent data object. The basic structure of HDF consists of an index with the tags of the objects in the file, pointers to the data associated with the tags, and the data themselves. The CDF structure is based on variable definitions (name, data type, number of dimensions, sizes, etc.) where a collection of data elements is defined in terms of a variable. The structure of CDF allows one to define an unlimited number of variables that are completely independent of one another (loosely coupled) and disparate in nature, a group of variables with strong dependencies on one another (tightly coupled), or both simultaneously. In addition, CDF supports extensive metadata capabilities (called attributes), which enable the user to further define the contents of a CDF file.

    • HDF4 supports a set of interface routines for each supported object type (Raster Image, Palette, Scientific Data Set, Annotation, Vset, and Vgroup). CDF supports two interfaces from which a CDF file can be accessed: the Internal Interface and the Standard Interface. The Internal Interface is very robust and consists of one variable-argument subroutine call that enables a user to utilize all the functionality supported by the CDF software. The Standard Interface is built on top of the Internal Interface and consists of 23 subroutine calls with fixed argument lists. The Standard Interface provides a mechanism by which novice programmers can quickly and easily create a CDF data file.

    • HDF4 currently offers some compression for storing certain types of data objects, such as images. CDF supports compression of any data type with a choice of run-length encoding, Huffman, adaptive Huffman, and GNU ZIP (GZIP) algorithms.

    • CDF supports an internal cache in which the user can modify the size through the Internal Interface to enhance performance on specific machines.

    • HDF4 data files are difficult to update. Data records are physically stored in a contiguous fashion. Therefore, if a data record needs to be extended it usually means that the entire file has to be rewritten. CDF maintains an internal directory of pointers for all the variables in a CDF file and does not require all the data elements for a given variable to be contiguous. Therefore, existing variables can be extended, modified, and deleted, and new variables added to the existing file.

    • In the late 1980s the CDF software was redesigned and rewritten (CDF 2.0) in C. With little or no impact on performance, the redesign provided an open framework that could be easily extended to incorporate new functionality and features when needed. CDF is currently at Version 3.9, and performance has been enhanced significantly.

    • CDF supports both host encoding and the machine-independent (XDR) encoding. In addition, the CDF software can transparently access data files in any encoding currently supported by the CDF library (For example, a CDF application running on a Sun can read and write data encoded in a VAX format.) HDF4 supports both host encoding and the machine-independent (XDR) encoding.

  • CDF vs. HDF5

    CDF is a scientific data management software package and format based on a multidimensional (array) model. HDF is a Hierarchical Data Format developed at the National Center for Supercomputing Applications (NCSA) at the University of Illinois. The data model of CDF is very similar to HDF5's: both have two basic objects, data and attributes, where a data object holds the data itself and an attribute describes the data. HDF5 allows similar objects to be grouped together, but CDF does not have a grouping mechanism.

Although HDF4 and HDF5 were developed by the same organization, the data model of HDF5 is totally different from that of HDF4, and their formats are incompatible.


8. What documentation is available?

The documentation set consists of the CDF User's Guide (UG), the CDF C 
Reference Manual (CRM), the CDF Fortran Reference Manual (FRM), the CDF 
Internal Format Description (IFD), and reference manuals for the Java, 
Perl, C#, and Visual Basic APIs.

The CDF documents are available in Adobe Portable Document Format (PDF) files,
and they are located at spdf.gsfc.nasa.gov/pub/software/cdf/doc/latest_version.

The latest CDF release is Version 3.9.0.0 and its documentation consists of: 

     cdf390ug.pdf  (CDF User's Guide)
     cdf390crm.pdf (CDF C Reference Manual)
     cdf390frm.pdf (CDF Fortran Reference Manual)
     cdf390jrm.pdf (CDF Java Reference Manual)
     cdf390prm.pdf (CDF Perl Reference Manual)
     cdf390csrm.pdf (CDF C# Reference Manual)
     cdf390vbrm.pdf (CDF Visual Basic Reference Manual)
     cdf36ifd.pdf (CDF Internal Format Description)

Some general information on CDF, including this FAQ, and software are also
available from the CDF Home Page on the World Wide Web with the following URL:

	cdf.gsfc.nasa.gov

9. What are some publications/references relating to CDF? A list of publications about or referring to CDF can be found at spdf.gsfc.nasa.gov/pub/software/cdf/doc/papers/Bibliography.html. The same directory contains PostScript and/or text files for a few of the papers. The following is a list of some references:

   Mathews, G. J., and S. S. Towheed, "OMNIWeb: The First Space Physics Data
   WWW-Based Data Browsing and Retrieval System," Computer Networks and ISDN
   Systems, Proceedings of the Third International WWW Conference, Vol. 27,
   No. 6, April 1995, pp. 801-808.

   Goucher, G. W., and G. J. Mathews, "A Comprehensive Look at CDF,"
   NSSDC/WDC-A-R&S 94-07, NASA/Goddard Space Flight Center, August 1994.

   Brown, S., M. Folk, G. Goucher, and R. Rew, "Software for Portable
   Scientific Data Management," Computers in Physics, Vol. 7, No. 3,
   pp. 304-308, May/June 1993.

   Salem, K., "MR-CDF: Managing Multi-Resolution Scientific Data," CESDIS
   TR 92-81, NASA/Goddard Space Flight Center, Greenbelt, Maryland, March
   1992.

   Treinish, L. A. (ed.), "Data Structures and Access Software for
   Scientific Visualization," A Report on a Workshop at Siggraph '90,
   Computer Graphics, 25, No. 2, April 1991.

   Treinish, L. A., and G. W. Goucher, "A Data Abstraction for the
   Source-Independent Storage and Manipulation of Data," National Space
   Science Data Center Technical Paper, NASA/Goddard Space Flight Center,
   August 1988.

   Treinish, L. A., and M. L. Gough, "A Software Package for the
   Data-Independent Storage of Multi-Dimensional Data," EOS Transactions,
   American Geophysical Union, 68, pp. 633-635, 1987.

   IDL Scientific Data Formats, Version 3.6, Research Systems Incorporated,
   Boulder, Colorado, April 1994.

10. How can I contact CDF User Support? Comments, suggestions, and questions may be directed to CDF User Support via

  • Electronic mail:
    	Internet   : nasa-cdf-support@nasa.onmicrosoft.com
    

  • U.S. mail:
    	Attention: CDF USER SUPPORT OFFICE
    	Space Physics Data Facility
    	Code 672.0
    	NASA/Goddard Space Flight Center
    	Greenbelt, Maryland 20771-0001
    	U.S.A.
    

  • Telephone:
    	Voice: (301) 286-9884
    	FAX:   (301) 286-1771
    
Before asking a question, please check the FAQ and CDF User's Guide.

11. How do I make a bug report? All bug reports, comments, suggestions, and questions should go to CDF User Support (nasa-cdf-support@nasa.onmicrosoft.com).

------------------ Template for bug report ------------------------

To: nasa-cdf-support@nasa.onmicrosoft.com
Subject: [area]: [synopsis]   [replace with actual AREA and SYNOPSIS]

VERSION:
   [CDF library/toolkit version (e.g., CDF Version 2.5.1). You can obtain
    the version of your CDF distribution using the CDFinquire toolkit
    program and specifying the "/ID" argument as in the examples below:
       From MS-DOS: cdfinq -id
       From UNIX:   cdfinquire -id
       From VMS:    CDFINQUIRE /ID ]

USER:
   [Name, telephone number, and address of the person reporting the bug
    (email address if possible).]

AREA:
   [Area of the CDF source tree affected, e.g., lib, tools, tests,
    top-level. If there are bugs in more than one AREA, please use a
    separate bug report for each AREA.]

SYNOPSIS:
   [Brief description of the problem and where it is located.]

MACHINE / OPERATING SYSTEM:
   [e.g., Sparc/SunOS 4.1.3, HP9000/730-HPUX9.01, ...]

COMPILER:
   [e.g., native cc, native ANSI cc, Borland C++ V3.1, MPW, ...]

DESCRIPTION:
   [Detailed description of the problem.]

REPEAT BUG BY:
   [What you did to get the error; include a test program or session
    transcript if at all possible. If you include a program, make sure it
    depends only on libraries in the CDF distribution, not on any vendor
    or third-party libraries. Please be specific; if we can't reproduce
    it, we can't fix it. Tell us exactly what we should see when the
    program is run.]

SAMPLE FIX:
   [If available, please send context diffs (diff -c).]

[PLEASE make your Subject (SYNOPSIS) line as descriptive as possible.]
[Remove all the explanatory text in brackets before mailing.]
[Send to nasa-cdf-support@nasa.onmicrosoft.com or to
    Attn: CDFSUPPORT
    NASA/GSFC/SED/ESED/SPDF
    Code 672.0
    Greenbelt, MD 20771, U.S.A.]

------------------ End of Bug Report Template ----------------------

12. Is there a mailing list for CDF questions and announcements? The CDF users mailing list is used to distribute major announcements and events to CDF users. To subscribe, send an email to the "join mailing list" address (no text in the body is required); to unsubscribe, send an email to the "leave mailing list" address. For administrative questions, or for additions, deletions, or corrections to the general mailing list, please send a message to CDF User Support (nasa-cdf-support@nasa.onmicrosoft.com).

13. What programming languages does CDF support? The CDF library comes with C, Java, and Fortran Application Programming Interfaces (APIs) that provide the essential framework on which graphical and data analysis packages can be created. In addition, Perl and C# APIs are available as optional packages for download. The CDF library allows developers of CDF-based systems to easily create applications that permit users to slice data across multidimensional subspaces, access entire structures of data, perform subsampling of data, and access one data element independently regardless of its relationship to any other data element. On Windows, C# and Visual Basic APIs are also available.

14. Are CDF data sets/files platform-independent? CDF data sets are portable across any platforms supported by CDF. These currently consist of VAX (OpenVMS and POSIX shell), Sun (SunOS & SOLARIS), DECstation (ULTRIX), DEC Alpha (OSF/1 & OpenVMS), Silicon Graphics Iris and Power Series (IRIX), IBM RS6000 series (AIX), HP 9000 series (HP-UX), NeXT (Mach), Intel-based PC (Windows XP/7/8/10, Linux, QNX, Solaris, Cygwin & MinGW), Macintosh (68K & Power PC running Mac OS X or Linux), ARM (Raspbian/Fedora/Ubuntu) and Itanium (OpenVMS).

15. What's the difference between a single-file CDF and a multi-file CDF? The CDF library gives the user the option to choose one of two file formats: single-file and multi-file. A single-file CDF (the default, and the recommended format) contains the control information, metadata, and data values for each of the variables in one file, whereas a multi-file CDF has two parts: a metadata file plus one data file for each variable. The main advantage of the single-file format is that it minimizes the number of files one has to manage and makes it easier to transport CDFs across a network. The organization of the data within the single file may, however, become somewhat convoluted, slightly increasing the data access time. The multi-file format, on the other hand, clearly delimits the data from the metadata and is organized in a consistent fashion within the files. Updating, appending, and accessing data are also done with optimum efficiency. Certain restrictions apply to multi-file CDFs:

   - Compression: compression is not allowed for the CDF or any of its
     variables.
   - Sparseness: sparse records or arrays for variables are not allowed.
   - Allocation: pre-allocation of records or blocks of records is not
     allowed. For each variable, the maximum written record is the last
     allocated record.
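
For illustration, here is a minimal sketch in C of selecting the format at creation time with the Internal Interface (the file name "example" is a hypothetical choice; see the CDF C Reference Manual for the authoritative calling sequences):

    #include "cdf.h"    /* CDF C header from the distribution */

    int main (void) {
      CDFid id;                  /* handle for the CDF file */
      CDFstatus status;
      long dimSizes[1] = {0L};   /* no rVariable dimensions declared */

      /* Create a new CDF; single-file is the default format. */
      status = CDFlib (CREATE_, CDF_, "example", 0L, dimSizes, &id,
                       NULL_);
      if (status != CDF_OK) return 1;

      /* Switch to the multi-file format; this must be done before
         any variables or records are written. */
      status = CDFlib (SELECT_, CDF_, id,
                       PUT_, CDF_FORMAT_, MULTI_FILE,
                       CLOSE_, CDF_,
                       NULL_);
      return (status == CDF_OK) ? 0 : 1;
    }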

16. What are the differences between the Standard Interface, Internal Interface, and the CDF Java APIs? The CDF library comes with two types of Application Programming Interfaces (APIs) for C and Fortran: the Standard Interface and the Internal Interface. The Standard Interface allows the user to perform specific, basic operations such as creating a CDF file, opening a CDF file, and reading and writing data. See chapters 5, 6, and 7 of the CDF Fortran Reference Manual or the CDF C Reference Manual for a complete list of the APIs available for the Standard and Internal Interfaces and their detailed descriptions.

There are two types of Standard Interface: Original and Extended. The Original Standard Interface was introduced in the early 1990s and provides only limited functionality within the CDF library. For example, it can only handle rVariables; it cannot handle zVariables and has no access to the attribute entries corresponding to zVariables (zEntries). Up until CDF 2.7.2, if you wanted to create or access zVariables and zEntries, you had to use the Internal Interface, which is harder to use. These limitations were addressed with the introduction of the Extended Standard Interface in CDF 3.1, which provides almost all of the operations that were previously available only through the Internal Interface.

The Internal Interface consists of only one routine, CDFlib, which can perform all possible operations on a CDF. In fact, all of the Standard Interface functions are implemented using the Internal Interface. CDFlib must be used to perform operations not possible with the Standard Interface functions (e.g., specifying a single-file format for a CDF, accessing zVariables, compressing a CDF file or variables, or specifying a pad value for an rVariable or zVariable). Note that CDFlib can also be used to perform certain operations more efficiently than with the Standard Interface functions.

The Standard/Internal distinction does not apply to the CDF Java APIs. The CDF Java APIs can do everything the C and Fortran APIs can do and more, such as copying a variable with or without data. A detailed description of the CDF Java APIs can be found at cdf.gsfc.nasa.gov/cdfjava_doc/cdf390/index.html.
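
As a minimal sketch (the CDF file "example" and variable "MyVar" are hypothetical; error handling omitted), the same lookup can be written either way in C:

    #include "cdf.h"

    void lookup_sketch (void) {
      CDFid id;
      long varNum;

      /* Extended Standard Interface: one fixed-argument call
         per operation. */
      CDFopenCDF ("example", &id);
      varNum = CDFgetzVarNum (id, "MyVar");
      CDFcloseCDF (id);

      /* Internal Interface: the single variable-argument routine
         CDFlib, chaining operations and ending with NULL_. */
      CDFlib (OPEN_, CDF_, "example", &id,
              GET_, zVAR_NUMBER_, "MyVar", &varNum,
              CLOSE_, CDF_,
              NULL_);
    }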

17. What is a "variable"? A CDF "variable" is a generic object that represents data, where the data can be 0-dimensional (scalar) or multi-dimensional (up to 10 dimensions), and it has no scientific context associated with it. For example, a variable can represent an independent variable, a dependent variable, a time and date value, or whatever the data might be (e.g., an image, an XML file, etc.). In other words, a variable carries no hidden meaning beyond the data itself. One may describe a variable's relationship with other variables through "attributes".

There are two types of variables (rVariables and zVariables), and they can coexist in the same CDF file. Every rVariable in a CDF must have the same number of dimensions and dimension sizes, whereas each zVariable has its own dimensionality. Suppose there are two rVariables (v1, v2) in a CDF, and v2 is defined as 2:[20,10] - a 2-dimensional array of size 20 x 10 (20 rows and 10 columns). Then v1 MUST also be defined as 2:[20,10], even though it only needs 1:[8], because it is an rVariable. If the same model is implemented using zVariables, v2 and v1 can be defined as 2:[20,10] and 1:[8] instead of both being 2:[20,10]. Since all rVariables must have the same dimensions and dimension sizes, a lot of disk space is wasted when a few variables need big arrays and many variables need small arrays.

So why would you want to use rVariables over zVariables? There is no reason to use rVariables at all (zVariables are much more efficient) if you are creating a new CDF file. But if you are analyzing data files that were created with early CDF releases or that contain rVariables for some reason, you will need to use rVariables. One may wonder why there are both rVariables and zVariables rather than just zVariables: when CDF was first introduced, only rVariables were available. The inefficiencies of rVariables were quickly realized and addressed with the introduction of zVariables in later CDF releases.
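
A sketch of the v1/v2 example above in C, using zVariables and the Extended Standard Interface (the variable names and the CDF_REAL4 data type are hypothetical choices):

    #include "cdf.h"

    void define_vars (CDFid id) {
      long v1Num, v2Num;
      long v2Sizes[2] = {20L, 10L};    /* v2: 2:[20,10] */
      long v2Varys[2] = {VARY, VARY};
      long v1Sizes[1] = {8L};          /* v1: 1:[8] */
      long v1Varys[1] = {VARY};

      /* Each zVariable carries its own dimensionality; as rVariables,
         both would have been forced to 2:[20,10]. */
      CDFcreatezVar (id, "v2", CDF_REAL4, 1L, 2L, v2Sizes, VARY, v2Varys,
                     &v2Num);
      CDFcreatezVar (id, "v1", CDF_REAL4, 1L, 1L, v1Sizes, VARY, v1Varys,
                     &v1Num);
    }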

18. What compression algorithms does CDF support? Compression may be specified for a single-file CDF, and the CDF library can be instructed to compress a CDF as it is written to disk. This compression occurs transparently to the user. When a compressed CDF is opened, it is automatically decompressed by the CDF library; an application does not even have to know that a CDF is compressed, and any type of access is allowed on a compressed CDF. When a compressed CDF is closed by an application, it is automatically recompressed as it is written back to disk. The individual variables of a CDF can also be compressed. The CDF library handles the compression and decompression of the variable values transparently; the application does not have to know that the variable is compressed as it accesses the variable's values. The following compression algorithms are supported by the CDF library:

   - Run-Length Encoding
   - Huffman
   - Adaptive Huffman
   - GZIP

See chapter 2 of the CDF User's Guide for more detailed information on the compression algorithms mentioned above.
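
For illustration, a sketch in C of specifying GZIP compression through the Internal Interface (the variable name "MyVar" and the GZIP level are hypothetical choices):

    #include "cdf.h"

    void compression_sketch (CDFid id) {
      long cParms[1] = {6L};   /* GZIP level: 1 (fastest) to 9 (best) */

      /* Compress the whole (single-file) CDF ... */
      CDFlib (SELECT_, CDF_, id,
              PUT_, CDF_COMPRESSION_, GZIP_COMPRESSION, cParms,
              NULL_);

      /* ... and/or an individual zVariable. */
      CDFlib (SELECT_, CDF_, id,
              SELECT_, zVAR_NAME_, "MyVar",
              PUT_, zVAR_COMPRESSION_, GZIP_COMPRESSION, cParms,
              NULL_);
    }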

19. Can CDF delete a range of records? Yes. The Standard Interface doesn't have a function that allows users to delete a range of records, but the Internal Interface and the CDF Java APIs do:

   C or Fortran:  <DELETE_, rVAR_RECORDS_> or <DELETE_, zVAR_RECORDS_>
   CDF Java API:  the deleteRecords(firstRec, lastRec) method in the
                  Variable class

Chapter 6 of the CDF C Reference Manual and the CDF Fortran Reference Manual contain more detailed descriptions of the <DELETE_, rVAR_RECORDS_> and <DELETE_, zVAR_RECORDS_> operations. For a detailed description of the deleteRecords method, see the CDF Java API documentation (cdf.gsfc.nasa.gov/cdfjava_doc/cdf390/index.html) and select the "Variable" link in the lower left frame.
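
A sketch in C of deleting a range of records from a zVariable via the Internal Interface ("MyVar" and the record numbers are hypothetical; record numbers are zero-based in the C APIs):

    #include "cdf.h"

    void delete_sketch (CDFid id) {
      /* Delete records 10 through 19 (inclusive) of zVariable "MyVar". */
      CDFlib (SELECT_, CDF_, id,
              SELECT_, zVAR_NAME_, "MyVar",
              DELETE_, zVAR_RECORDS_, 10L, 19L,
              NULL_);
    }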

20. Does CDF support random file access? Yes. One of the strengths of CDF is random file access and the ability to specify the granularity of data reads and writes. Suppose you have a CDF file that has 1000 records in it. You can extract data by specifying either the individual record number you want to extract data from, or the start and end record numbers from which data is to be retrieved. You can also specify a record interval that defines the number of records to skip between successive reads. For example, if you specify 100, 200, and 2 for the start record number, the end record number, and the record interval, respectively, the CDF library will start reading at record number 100 and read every other record (100, 102, 104, 106, and so on) until it reaches record number 200. If a record is an array (contains more than one value), you can even specify which element(s) you want to extract from each record. In the CDF Java APIs, the following methods in the Variable class provide random access for reading: getRecord, getScalarData, getSingleData, and getHyperData; and for writing: putRecord, putScalarData, putSingleData, and putHyperData. For a detailed description of these methods, see the CDF Java API documentation (cdf.gsfc.nasa.gov/cdfjava_doc/cdf390/index.html) and select the "Variable" link in the lower left frame.
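
The same 100-to-200, every-other-record read can be written in C with the Internal Interface's hyper-read selections; a minimal sketch, assuming a hypothetical 0-dimensional zVariable "MyVar" of type CDF_REAL8:

    #include "cdf.h"

    void hyper_read_sketch (CDFid id) {
      double buffer[51];   /* records 100, 102, ..., 200 -> 51 values */

      CDFlib (SELECT_, CDF_, id,
              SELECT_, zVAR_NAME_, "MyVar",
              SELECT_, zVAR_RECNUMBER_, 100L,    /* start record        */
              SELECT_, zVAR_RECCOUNT_, 51L,      /* records to read     */
              SELECT_, zVAR_RECINTERVAL_, 2L,    /* skip every other    */
              GET_, zVAR_HYPERDATA_, buffer,
              NULL_);
    }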

21. How do I optimize a CDF file so I can read data fast? Before reading data (whether sequentially or via random/direct access), it is important to minimize the number of internal pointers (i.e., to defragment the data) that are used to keep track of where data is located within the file. The CDFconvert utility, among the many other functions it performs, allows users to optimize a CDF file by entering the following command at the operating system prompt:

   CDFconvert <source cdf> <dest cdf>

   where <source cdf> = the name of the CDF file to be optimized
         <dest cdf>   = the name of the newly optimized file

For example, the following command reads a file called test.cdf and creates an optimized file called test_new.cdf:

   CDFconvert test.cdf test_new.cdf

NOTE: It is always a good idea to run CDFconvert on newly created CDF files, since reading data from an optimized file is very fast. This is especially true if the file is big.

22. How do I know a CDF file is not corrupted or compromised? Several safety measures have been added to the CDF library since CDF V3.2.0 to ensure the data integrity of CDF files. CDF V3.2.0 added the checksum feature: if the feature is enabled for a CDF, the file's checksum is verified whenever the file is accessed. Currently, MD5 is the only checksum algorithm used by CDF. Since CDF V3.2.1, file integrity has been further enhanced to prevent a potential buffer overflow vulnerability in the code when reading specially crafted (invalid) CDF files. Various sanity checks were added to the code to test data against expected values or ranges, so corrupted files should be identified as soon as they are accessed. A standalone tool, CDFValidate (mentioned in Item 3, "What CDF utility programs are available?", on this page), is provided to assist in validating a given CDF.
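
A sketch in C of enabling the MD5 checksum on an open CDF through the Internal Interface (error handling omitted):

    #include "cdf.h"

    void checksum_sketch (CDFid id) {
      /* Enable the MD5 checksum (available since CDF V3.2.0); the
         library then verifies the checksum when the file is opened
         and updates it when the file is written and closed. */
      CDFlib (SELECT_, CDF_, id,
              PUT_, CDF_CHECKSUM_, MD5_CHECKSUM,
              NULL_);
    }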

23. What is the best way to represent my data using CDF? There are many ways to represent the same information in any general-purpose data model, and there is no single "correct" way to store data in a CDF. The user has complete control over how the data values are stored in the CDF (within the confines of the variable array structure) depending on how the user views the data. This is the advantage of CDF. Data values are organized in whatever way makes sense to the user. We provide some guidelines in the CDF User's Guide (e.g., "Organizing Your Data in a CDF"), but we've found that a little experience helps. Occasionally, we have decided it was useful to change the structure of the CDF files after experience with how the data are used.

24. Can new versions of CDF applications read CDF files written using older versions of the CDF library? Our goal is to make CDF backward compatible in the sense that CDF files can always be read by new versions of the CDF library and tools. CDF 3.0 or later will read a file that was created with any of the previous CDF versions, but a file created from scratch with CDF 3.0 or later won't be readable by 2.7, 2.6, 2.5, or 2.4.

25. Can my application programs that work with old versions of the CDF library always be compiled with new versions of CDF? As CDF evolves, new functions are added, but they are transparent to applications since all existing functions of the Internal and Standard interfaces are unchanged. However, the CDF Version 1.0 "Obsolete" interface, which was only callable from Fortran, is no longer supported as of CDF V2.5. The Standard Interface is similar to the CDF V1.0 interface with several additions for new features. Those functions not available through the Standard Interface are made available through the Internal Interface. The Internal Interface makes CDF an easily extensible software package, so all applications using this interface will remain portable with newer versions of the CDF library. However, some advanced CDF users have exploited undocumented CDF functions that were removed from a later version of the library and then found their applications would no longer compile. This practice is NOT recommended because these functions are for internal use only and subject to change without notice in future releases. If any internal CDF function is useful to your application, you should notify us, and we will determine whether it is appropriate to make it visible in the next release. In the meantime, the portable approach is to copy the corresponding function(s) from the CDF library source, rename the function(s), insert them into your application code, and update the calling statements with the new function name(s).

26. Is there any commercial or public domain data analysis/visualization software that accepts/supports CDF files?

Commercial software:

   - Interactive Data Language (IDL)
   - MathWorks MATLAB Language (MATLAB)
   - Application Visualization System (AVS)
   - Weisang GmbH & Co. KG Data Analysis and Presentation (FlexPro)

Public domain software:

   - Autoplot - an interactive browser for data on the web.
   - SPEDAS (Space Physics Environment Data Analysis Software) - IDL-based
     plotting, analysis, and data downloading tools.
   - pySPEDAS - a Python implementation of many SPEDAS plotting, analysis,
     and downloading tools.
   - TOPCAT and STILTS - two table analysis packages built on the same
     infrastructure; TOPCAT is a GUI tool and STILTS provides command-line
     access to the same functionality. These are Java applications that
     deal with various file formats of interest to astronomers; their CDF
     handling is based on the JCDF library, which can be referenced in
     User Supplied Software.
   - Open Visualization Data Explorer - based on IBM's earlier Data
     Explorer; please search the web for it.
   - CDAWlib - a set of IDL routines that underlie the CDAWeb software.
     This is a "library" of routines useful for reading in data stored in
     the CDF format and for plotting the data variables as time series,
     images, radar, and spectrograms. To use this software you must have
     IDL installed on your local machine.
   - CDFx - an IDL software tool for displaying, editing, and listing the
     contents of ISTP-compliant CDF data files. It displays image plots,
     time-series plots, and CDF variables; it will list and store CDF data
     in plain ASCII text; and it will save subsets of data to new CDF
     files. Familiarity with CDF and the ISTP CDF conventions is helpful
     when using CDFx. ISTP specifications and master CDF file concept
     discussions are at the SPDF "Use of CDF" site.
   - LinkWinds (contact berkin@krazy.jpl.nasa.gov)

There are also several SPDF-developed WWW-based data systems that provide access to data stored in CDF and use IDL to generate time series plots of selected variables, available via the following URLs:

   CDAWeb  - cdaweb.gsfc.nasa.gov
   SSCWeb  - sscweb.gsfc.nasa.gov
   COHOWeb - cohoweb.gsfc.nasa.gov
   OMNIWeb - omniweb.gsfc.nasa.gov

Example visualizations can be found at cdf.gsfc.nasa.gov/html/examples.html.

27. Are there any conversion programs available that will convert non-CDF files into CDF files or vice versa?

  • MakeCDF is a CDF application that reads flat data sets, in both binary and text form, and generates an ISTP-compliant CDF data set from that data. It is available at spdf.gsfc.nasa.gov/makecdf.html.
  • In a bid to facilitate and promote data exchange among space scientists, the CDF office has developed a set of data translation programs (Data Translation Tools). Below is a list of the data translation modes that are currently available:
    • CDF-to-ASCII (performed by CDFexport or CDFdump)
    • CDF-to-CDFML (CDF representation in XML)
    • CDFML-to-CDF
    • CDF-to-netCDF
    • netCDF-to-CDF
    • CDF-to-FITS
    • FITS-to-CDF
    • CDF-to-JSON (CDF representation in JSON)
    • JSON-to-CDF
    • HDF4-to-CDF
  • CDFexport and CDFdump are C-based utility tools that can generate an ASCII text file. CDFexport can also generate a subsetted CDF of the selected variables from a CDF, among other things. It is a part of the CDF distribution package.
  • CDF to and from CDFML converters, both Java-based, are available in the CDF distribution package.
  • CDF to and from JSON converters have been added to the CDF package. These data format conversion tools are also Java-based. Like the CDFML converters, they depend on the CDF library and Java APIs.
  • Unidata's ncdump utility can display a netCDF file in text form.

28. Can I convert HDF4, HDF5, FITS, netCDF, or JSON to CDF? Yes. The CDF office has developed a set of data format translators that users can use to translate files in netCDF (Version 4, with limited data types), FITS, HDF4, and HDF5 (not an efficient tool) to CDF. The data translation tools are available for download at Data Translation Tools. As for translation from HDF4 to HDF5, the NCSA Hierarchical Data Format (HDF) office has developed a tool (h4toh5) that converts an HDF4 file to HDF5; it is available from the HDF home page (https://www.hdfgroup.org).

29. How can I contribute my software to the CDF user community? Contact CDF User Support indicating that you would like to contribute your software to the CDF user community. Please fill out the CDF Directory Entry form (spdf.gsfc.nasa.gov/pub/software/cdf/doc/forms/direct) and send the completed form to us. Your information will then be added to our Directory of CDF User Applications, and users may contact you for information on your application. If you want us to distribute your application, we will set up a directory for you to send the contribution package to. For other users' convenience, your contribution package should include the software itself, a Makefile if possible, a man page, test programs, and input data files for testing. A README file is also required; it should briefly describe the purpose, function, and limitations of the software; which platforms and operating systems it runs on; how to compile, install, and test it; and whom to contact for comments, suggestions, or bug reports. We encourage CDF users to let us know about their applications so that other CDF users can benefit from them as well. Some CDF-related software is available at spdf.gsfc.nasa.gov/pub/software/cdf/apps.


Comments, corrections, or additions to the CDF FAQ, along with any CDF-related comments or questions, should be sent to the CDF User Support Office via email to nasa-cdf-support@nasa.onmicrosoft.com.