FAQ Maintainer: Space Physics Data Facility (SPDF)
Email: nasa-cdf-support@nasa.onmicrosoft.com
Address: Space Physics Data Facility
Heliophysics Science Division
Sciences and Exploration Directorate
NASA Goddard Space Flight Center
Greenbelt, MD 20771, U.S.A.
Last Updated: February 2023
========================================================================
------- Introductory CDF ----------------------------------------------------
------- Manuals, Documentation, Man Pages --------------------------------
------- Mailing List, User Support, Etc -----------------------------------
------- Common CDF Questions ---------------------------------------------
------- General CDF --------------------------------------------------------
CDF is the Common Data Format. It is a conceptual data abstraction for storing, manipulating, and accessing multidimensional datasets. The basic component of CDF is a software programming interface that is a device-independent view of the CDF data model. The application developer is insulated from the actual physical file format for reasons of conceptual simplicity, device independence, and future expandability. CDF files created on any given platform can be transported to any other platform onto which CDF is ported and used with any CDF tools or layered applications.
The CDF software, documentation, and user support services are provided by NASA and available to the public free of charge. There are no license agreements or costs involved in obtaining or using CDF.
A more detailed introduction to CDF can be found in the CDF User Guide.
CDF V3.9.1.0 is the current operational version. It was released in October 2024.
CDF V3.0 or later will read a file that was created with any of the previous CDF versions. However, a file created from scratch with CDF V3.0 or later won’t be readable by applications built with pre-V3.0 libraries, e.g., 2.7, 2.6, 2.5, or 2.4.
Notes: The official version of the ISTP project is CDF V2.5.
Platforms                        Operating Systems
DEC Alpha                        OSF/1, OpenVMS
DECstation                       Ultrix, VMS
HP 9000 series                   HP-UX
PC                               Windows XP/7/8/10, Linux, Solaris, QNX, Cygwin, MinGW
IBM RS6000 series                AIX
Macintosh                        Mac OS X
NeXT                             Mach
SGI Iris, Power series, Indigo   IRIX
Sun                              SunOS, Solaris
VAX                              VMS
ARM (Raspbian/Fedora/Ubuntu)     Linux
Itanium                          OpenVMS
The latest CDF should support all the platforms listed above. However, the least popular ones, e.g., the HP-UX and IBM AIX operating systems, may not have been verified with it (due to lack of interest and hardware). If you need to run the latest CDF software on such platforms, please contact nasa-cdf-support@nasa.onmicrosoft.com.
The CDF software distribution is available at spdf.gsfc.nasa.gov/pub/software/cdf/dist for the Unix, Macintosh, Windows, and VMS operating systems.
Some general information on CDF, including this FAQ, and software are also available from the URL: cdf.gsfc.nasa.gov.
The CDF software package is used by hundreds of government agencies, universities, and private and commercial organizations as well as independent researchers on both national and international levels. CDF has been adopted by the International Solar-Terrestrial Physics (ISTP) project as well as the Central Data Handling Facilities (CDHF) as their format of choice for storing and distributing key parameter data.
See Comparison and conversion software with other formats
Multiple formats use the CDF acronym or the *.cdf file extension.
To distinguish CDF from netCDF: netCDF files start with “CDF”, whereas CDF files have the string “Common Data Format (CDF)” near the beginning (try the Unix strings command).
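The signature check described above can be sketched in Python. This is only a minimal illustration built on the two markers named here. The function names and the 1024-byte scan window are assumptions, and any file that matches neither marker is simply reported as unknown.

```python
def guess_format(header: bytes, scan_window: int = 1024) -> str:
    """Guess whether the start of a file looks like netCDF or CDF.

    `header` is the first chunk of the file. `scan_window` is an assumed
    limit on how far we search for the CDF identifying string.
    """
    head = header[:scan_window]
    # netCDF files begin with the three ASCII bytes "CDF".
    if head.startswith(b"CDF"):
        return "netCDF"
    # CDF files carry this string near the beginning of the file.
    if b"Common Data Format (CDF)" in head:
        return "CDF"
    return "unknown"


def guess_file_format(path: str, scan_window: int = 1024) -> str:
    """Read the first `scan_window` bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return guess_format(f.read(scan_window), scan_window)
```

Calling `guess_file_format("test.cdf")` on a file of unknown origin gives a quick first guess before choosing which library to open it with.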
See the CDF User Guide
A list of publications about or referring to CDF is available at spdf.gsfc.nasa.gov/pub/software/cdf/doc/papers/Bibliography.html. The same directory (spdf.gsfc.nasa.gov/pub/software/cdf/doc/papers) contains PostScript and/or text files for a few of the papers.
The following is a list of some references:
Mathews, G. J., and S. S. Towheed, “OMNIWeb: The First Space Physics Data WWW-Based Data Browsing and Retrieval System,” Computer Networks and ISDN Systems, Proceedings of the Third International WWW Conference, Vol. 27, No. 6, April 1995, pp. 801-808.
Goucher, G. W., and G. J. Mathews, “A Comprehensive Look at CDF,” NSSDC/WDC-A-R&S 94-07, NASA/Goddard Space Flight Center, August 1994.
Brown, S., M. Folk, G. Goucher, and R. Rew, “Software for Portable Scientific Data Management,” Computers in Physics, Vol. 7, No. 3, May/June 1993, pp. 304-308.
Salem, K., “MR-CDF: Managing Multi-Resolution Scientific Data,” CESDIS TR 92-81, NASA/Goddard Space Flight Center, Greenbelt, Maryland, March 1992.
Treinish, L. A. (ed.), “Data Structures and Access Software for Scientific Visualization,” A Report on a Workshop at SIGGRAPH ’90, Computer Graphics, 25, No. 2, April 1991.
Treinish, L. A., and G. W. Goucher, “A Data Abstraction for the Source-Independent Storage and Manipulation of Data,” National Space Science Data Center Technical Paper, NASA/Goddard Space Flight Center, August 1988.
Treinish, L. A., and M. L. Gough, “A Software Package for the Data-Independent Storage of Multi-Dimensional Data,” EOS Transactions, American Geophysical Union, 68, pp. 633-635, 1987.
IDL® Scientific Data Formats, Version 3.6, Research Systems Incorporated, Boulder, Colorado, April 1994.
Comments, suggestions, and questions may be directed to CDF User Support via
Internet: nasa-cdf-support@nasa.onmicrosoft.com
Attention: CDF USER SUPPORT OFFICE
Space Physics Data Facility
Code 672.0
NASA/Goddard Space Flight Center
Greenbelt, Maryland 20771-0001
U.S.A.
FAX: (301) 286-1771
Before asking a question, please check the FAQ and CDF User’s Guide.
All bug reports, comments, suggestions, and questions should go to CDF User Support (nasa-cdf-support@nasa.onmicrosoft.com).
------------------ Template for bug report ------------------------
To: nasa-cdf-support@nasa.onmicrosoft.com.
Subject: [area]: [synopsis] [replace with actual AREA and SYNOPSIS]
VERSION:
[CDF library/toolkit version (e.g., CDF Version 2.5.1).
You can obtain the version of your CDF distribution
using the CDFinquire toolkit program and specifying the "/ID"
argument as in the examples below:
From MS-DOS: cdfinq -id
From UNIX: cdfinquire -id
From VMS: CDFINQUIRE /ID ]
USER:
[Name, telephone number, and address of person reporting the bug.
(email address if possible)]
AREA:
[Area of the CDF source tree affected, e.g., lib, tools, tests,
top-level. If there are bugs in more than one AREA, please use
a separate bug report for each AREA.]
SYNOPSIS:
[Brief description of the problem and where it is located]
MACHINE / OPERATING SYSTEM:
[e.g. Sparc/SunOS 4.1.3, HP9000/730-HPUX9.01...]
COMPILER:
[e.g. native cc, native ANSI cc, Borland C++ V3.1, MPW, ...]
DESCRIPTION:
[Detailed description of problem.]
REPEAT BUG BY:
[What you did to get the error; include test program or session
transcript if at all possible. If you include a program, make
sure it depends only on libraries in the CDF distribution, not
on any vendor or third-party libraries. Please be specific;
if we can't reproduce it, we can't fix it. Tell us exactly what
we should see when the program is run.]
SAMPLE FIX:
[If available, please send context diffs (diff -c).]
[PLEASE make your Subject (SYNOPSIS): line as descriptive as possible.]
[Remove all the explanatory text in brackets before mailing.]
[Send to nasa-cdf-support@nasa.onmicrosoft.com or to
Attn: CDFSUPPORT
NASA/GSFC/SED/ESED/SPDF
Code 672.0
Greenbelt, MD 20771, U.S.A. ]
------------------ End of Bug Report Template ----------------------
For administrative questions, additions to or deletions from the general mailing list, or corrections to the list, please send a message to CDF User Support (nasa-cdf-support@nasa.onmicrosoft.com).
The CDF users mailing list is used to distribute major announcements and events to CDF users.
To subscribe, send an email to [join mailing list](mailto:gsfc-cdf-announcements-subscribe@lists.nasa.gov?Subject=Subscribe to CDF announcements) (no text in the body is required).
To unsubscribe, send an email to [leave mailing list](mailto:gsfc-cdf-announcements-leave@lists.nasa.gov?Subject=Leave CDF announcements).
The CDF library comes with C, Java, and Fortran Application Programming Interfaces (APIs) that provide the essential framework on which graphical and data analysis packages can be created. Perl and C# APIs are also available as an optional package for download. The CDF library allows developers of CDF-based systems to easily create applications that permit users to slice data across multidimensional subspaces, access entire structures of data, perform subsampling of data, and access one data element independently regardless of its relationship to any other data element. On Windows, C# and Visual Basic APIs were also available.
CDF datasets are portable across all platforms supported by CDF. These currently consist of VAX (OpenVMS and POSIX shell), Sun (SunOS & Solaris), DECstation (ULTRIX), DEC Alpha (OSF/1 & OpenVMS), Silicon Graphics Iris and Power Series (IRIX), IBM RS6000 series (AIX), HP 9000 series (HP-UX), NeXT (Mach), Intel-based PC (Windows XP/7/8/10, Linux, QNX, Solaris, Cygwin & MinGW), Macintosh (68K & Power PC running Mac OS X or Linux), ARM (Raspbian/Fedora/Ubuntu), and Itanium (OpenVMS).
The CDF library gives the user the option to choose between two file formats: single-file and multi-file. A single-file CDF contains the control information, metadata, and the data values for each of the variables in one file. A multi-file CDF has two parts: a metadata file and the data files (one for each variable). Single-file is the default file format, and it is recommended over multi-file.
The main advantage of the single-file format is that it minimizes the number of files one has to manage and makes it easier to transport CDFs across a network. The organization of the data within the single file may, however, become somewhat convoluted, slightly increasing the data access time. The multi-file format, on the other hand, clearly delimits the data from the metadata and is organized in a consistent fashion within the files. In the multi-file format, updating, appending, and accessing data are also done with optimum efficiency.
For multi-file format CDFs, certain restrictions are applied. They are:
Compression: Compression is not allowed for the CDF or any of its variables.
Sparseness: Sparse records or arrays for variables are not allowed.
Allocation: Pre-allocation of records or blocks of records is not allowed. For each variable, the maximum written record is the last allocated record.
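The three restrictions above can be expressed as a simple pre-flight check. The sketch below is purely illustrative: `VariablePlan` and `multi_file_violations` are hypothetical names, not part of the CDF library, and the check only mirrors the rules as stated in this FAQ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VariablePlan:
    """Hypothetical description of how one variable is meant to be stored."""
    name: str
    compressed: bool = False
    sparse: bool = False
    preallocated_records: int = 0


def multi_file_violations(cdf_compressed: bool,
                          variables: List[VariablePlan]) -> List[str]:
    """Return the multi-file restrictions that a planned CDF would break."""
    problems = []
    # Compression is not allowed for the CDF as a whole...
    if cdf_compressed:
        problems.append("CDF-level compression is not allowed")
    for v in variables:
        # ...or for any of its variables.
        if v.compressed:
            problems.append(f"{v.name}: variable compression is not allowed")
        # Sparse records or arrays are not allowed.
        if v.sparse:
            problems.append(f"{v.name}: sparseness is not allowed")
        # Records cannot be pre-allocated.
        if v.preallocated_records > 0:
            problems.append(f"{v.name}: pre-allocation is not allowed")
    return problems
```

An empty list means the plan is compatible with the multi-file format. Otherwise the single-file format is the one to use.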
The CDF library comes with two types of APIs for C and Fortran: the Standard Interface and the Internal Interface. The Standard Interface lets you perform specific, basic operations such as creating a CDF file, opening a CDF file, and reading and writing data. See chapters 5, 6, and 7 of the CDF Fortran Reference Manual or the CDF C Reference Manual for a complete list of the APIs available in the Standard Interface and the Internal Interface, along with their detailed descriptions.
There are two types of Standard Interfaces: Original and Extended. The Original Standard Interface was introduced in the early 1990s and provides only very limited functionality within the CDF library. For example, it can handle only rVariables, not zVariables, and it has no access to the attribute entries that correspond to zVariables (zEntries). Up until CDF 2.7.2, if you wanted to create or access zVariables and zEntries, you had to use the Internal Interface, which is harder to use. The limitations of the Original Standard Interface were addressed with the introduction of the Extended Standard Interface in CDF 3.1. The Extended Standard Interface provides almost all operations that were previously available only through the Internal Interface.
The Internal Interface consists of only one routine, CDFlib. CDFlib is used to perform all possible operations on a CDF. In fact, all of the Standard Interface functions are implemented using the Internal Interface. CDFlib must be used to perform operations not possible with the Standard Interface functions (e.g., specifying a single-file format for a CDF, accessing zVariables, compressing a CDF file or variables, or specifying a pad value for an rVariable or zVariable). Note that CDFlib can also be used to perform certain operations more efficiently than with the Standard Interface functions.
The Standard Interface and Internal Interface do not apply to the CDF Java APIs. The CDF Java APIs can do everything the C and Fortran APIs can do and more, such as copying a variable with or without data. A detailed description of the CDF Java APIs can be found at cdf.gsfc.nasa.gov/cdfjava_doc/cdf390/index.html.
A CDF “variable” is a generic name for an object that represents data, where the data can be 0-dimensional (scalar data) or multi-dimensional (up to 10 dimensions). It does not have any scientific context associated with it. For example, a variable can hold data representing an independent variable, a dependent variable, a time and date value, or whatever the data might be (e.g., an image, an XML file, etc.). In other words, the variable doesn’t carry any hidden meaning beyond the data itself. One may describe a variable’s relationship with other variables through “attributes.”
There are two types of variables (rVariable and zVariable), and they can coexist in the same CDF file. Every rVariable in a CDF must have the same number of dimensions and the same dimension sizes, whereas each zVariable has its own number of dimensions and dimension sizes. Suppose there are 2 rVariables (v1, v2) in a CDF. Let’s say v2 is defined as 2:[20,10], a 2-dimensional array of size 20 x 10 (20 rows and 10 columns). Then v1 MUST also be defined as 2:[20,10], even though it only needs 1:[8], because it is an rVariable. But if this model is implemented using zVariables, then v1 and v2 can be defined as 1:[8] and 2:[20,10] instead of 2:[20,10] and 2:[20,10]. Since all the rVariables must have the same dimensions and dimension sizes, a lot of disk space is wasted if a few variables need big arrays and many variables need small arrays.
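The storage difference in the v1/v2 example can be worked out with a few lines of Python. This is a back-of-the-envelope sketch rather than a description of the library's actual on-disk layout: it counts values per record only, and it assumes no compression and that every dimension of each rVariable is fully stored.

```python
from math import prod


def values_per_record(dim_sizes):
    """Number of values in one record of a variable with these dimension sizes."""
    return prod(dim_sizes)  # prod([]) is 1, which covers scalar variables


# What each variable actually needs, as in the example above.
needed = {"v1": [8], "v2": [20, 10]}

# rVariables: every variable is forced to the shared shape 2:[20,10].
r_shape = [20, 10]
r_total = sum(values_per_record(r_shape) for _ in needed)

# zVariables: each variable keeps its own shape.
z_total = sum(values_per_record(shape) for shape in needed.values())

print(f"rVariables: {r_total} values per record")
print(f"zVariables: {z_total} values per record")
print(f"Unused with rVariables: {r_total - z_total} values per record")
```

With these two variables the rVariable layout reserves 400 values per record against 208 for zVariables, and the gap widens as more small variables are added.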
So why would you want to use rVariables over zVariables? If you are creating a new CDF file, there’s no reason to use rVariables at all, since zVariables are much more efficient. But if you are analyzing data files that were created with early CDF releases, or that contain rVariables for some other reason, you’ll need to use rVariables. One may wonder why there are both rVariables and zVariables, not just zVariables. When CDF was first introduced in the early 1990s, only rVariables were available. The inefficiencies of rVariables were quickly realized and addressed with the introduction of zVariables in later CDF releases.
Compression may be specified for a single-file CDF and the CDF library can be instructed to compress a CDF as it is written to disk. This compression occurs transparently to the user. When a compressed CDF is opened, it is automatically decompressed by the CDF library. An application does not have to even know that a CDF is compressed. Any type of access is allowed on a compressed CDF. When a compressed CDF is closed by an application, it is automatically recompressed as it is written back to disk.
The individual variables of a CDF can also be compressed. The CDF library handles the compression and decompression of the variable values transparently. The application does not have to know that the variable is compressed as it accesses the variable’s values.
The compression algorithms supported by the CDF library are described in more detail in chapter 2 of the CDF User’s Guide.
Yes. The Standard Interface doesn’t have a function that allows users to delete a range of records, but the Internal Interface and the CDF Java APIs provide APIs for deleting a range of records.
The following APIs/methods allow you to delete a range of records:
C or Fortran: <DELETE_, rVAR_RECORDS> or <DELETE_, zVAR_RECORDS>
CDF Java API: deleteRecords(firstRec, lastRec) method in the Variable class
Chapter 6 of the CDF C Reference Manual and of the CDF Fortran Reference Manual contains a more detailed description of the <DELETE_, rVAR_RECORDS> and <DELETE_, zVAR_RECORDS> APIs/functions.
See the “Variable” class in the CDF Java API documentation (cdf.gsfc.nasa.gov/cdfjava_doc/cdf390/index.html) for a detailed description of the deleteRecords method.
Yes. One of the strengths of CDF is random file access and the ability to specify the granularity of data reads and writes. Suppose you have a CDF file that has 1000 records in it. You can extract data by specifying:
the individual record number you want to extract data from
OR
the start and end record number from which data is to be retrieved. You can also specify the record interval that defines the number of records to skip between successive reads. For example, if you specified 100, 200, and 2 for the start record number, the end record number, and the record interval respectively, the CDF system will start reading data from the record number 100 and read every other record (100, 102, 104, 106, and so on) until it reaches the record number 200. If a record is an array (contains more than one value), you can even specify which element(s) you want to extract for every record.
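The record selection described above (start record, end record, record interval, and optional element selection) can be mimicked in plain Python. The sketch does not call the CDF library. The function names are hypothetical, and the in-memory list of records stands in for a CDF variable whose list index is taken as the record number.

```python
def select_records(start, end, interval=1):
    """Record numbers visited when reading from start to end (inclusive)."""
    if interval < 1:
        raise ValueError("record interval must be at least 1")
    return list(range(start, end + 1, interval))


def read_elements(records, record_numbers, element_indices=None):
    """Read whole records, or only the chosen elements of each record."""
    result = []
    for rec_num in record_numbers:
        record = records[rec_num]
        if element_indices is None:
            result.append(list(record))
        else:
            result.append([record[i] for i in element_indices])
    return result


# Stand-in for a variable with 1000 records of 3 values each.
records = [[rec * 10 + elem for elem in range(3)] for rec in range(1000)]

wanted = select_records(100, 200, 2)        # 100, 102, 104, ..., 200
first_and_last = read_elements(records, wanted, element_indices=[0, 2])
```

Here `wanted` holds every other record number from 100 through 200, and `first_and_last` keeps only elements 0 and 2 of each of those records.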
In the CDF Java API, the following methods of the Variable class provide random file access for reading: getRecord, getScalarData, getSingleData, getHyperData.
The following methods of the Variable class provide random file access for writing: putRecord, putScalarData, putSingleData, putHyperData.
See the “Variable” class in the CDF Java API documentation (cdf.gsfc.nasa.gov/cdfjava_doc/cdf390/index.html) for a detailed description of the methods/APIs described above.
Before reading data (whether sequentially or by random/direct access), it’s important to defragment the file, i.e., to minimize the number of internal pointers that are used to keep track of where data is located within the file. The CDFconvert utility, among the many functions it performs, can optimize a CDF file. To do so, enter the following command at the operating system prompt:
CDFconvert <source cdf> <dest cdf>
where <source cdf> = the name of the CDF file to be optimized
<dest cdf> = the name of the newly optimized file
For example, the following command reads a file called test.cdf and creates an optimized file called test_new.cdf.
CDFconvert test.cdf test_new.cdf
NOTE: It’s always a good idea to run CDFconvert on newly created CDF files since reading data from an optimized file is very fast. This is especially true if the size of the file is big.
Several safety measures have been added since CDF V3.2.0 to ensure data integrity in CDF files.
From CDF V3.2.0, the checksum feature was added. If the feature is used for a CDF, the file’s checksum will be verified when it is accessed. Currently, MD5 checksum is the only algorithm used by CDF.
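The idea behind checksum verification can be illustrated with Python's standard `hashlib` module. This is a generic sketch of MD5-based integrity checking, not the CDF library's implementation. Appending the 16-byte digest to the end of the data is an assumption made for this example only, and the function names are hypothetical.

```python
import hashlib

MD5_DIGEST_SIZE = 16  # an MD5 digest is always 16 bytes long


def add_checksum(payload: bytes) -> bytes:
    """Return the payload followed by its MD5 digest (layout assumed for this sketch)."""
    return payload + hashlib.md5(payload).digest()


def verify_checksum(blob: bytes) -> bool:
    """Recompute the MD5 digest of the payload and compare it with the stored one."""
    if len(blob) < MD5_DIGEST_SIZE:
        return False
    payload, stored = blob[:-MD5_DIGEST_SIZE], blob[-MD5_DIGEST_SIZE:]
    return hashlib.md5(payload).digest() == stored


protected = add_checksum(b"example data values")
corrupted = b"X" + protected[1:]  # flip the first byte to simulate damage
```

`verify_checksum(protected)` succeeds, while `verify_checksum(corrupted)` fails because the recomputed digest no longer matches the stored one.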
Since CDF V3.2.1, CDF file integrity checking has been further enhanced to prevent potential buffer overflow vulnerabilities when reading specially crafted (invalid) CDF files. Various sanity checks were added to the code to compare data against their expected values or ranges. Any corrupted files are expected to be identified immediately when they are accessed. A standalone tool, CDFValidate (mentioned in Item 3: What CDF utility programs are available? on this page), was written to assist in validating one or more CDF files.
There are many ways to represent the same information in any general-purpose data model, and there is no single “correct” way to store data in a CDF. The user has complete control over how the data values are stored in the CDF (within the confines of the variable array structure) depending on how the user views the data. This is the advantage of CDF. Data values are organized in whatever way makes sense to the user.
We provide some guidelines in the CDF User’s Guide (e.g., “Organizing Your Data in a CDF”), but we’ve found that a little experience helps. Occasionally, we have decided it was useful to change the structure of the CDF files after experience with how the data are used.
Our goal is to make CDF backward compatible in the sense that CDF files can always be read by new versions of the CDF library and tools.
CDF 3.0 or later will read a file that was created with any of the previous CDF versions, but a file created from scratch with CDF 3.0 or later won’t be readable by 2.7, 2.6, 2.5, or 2.4.
As CDF evolves, new functions are added, but they are transparent to applications since all existing functions of the Internal and Standard interfaces are unchanged. However, the CDF Version 1.0 “Obsolete” interface, which was only callable from Fortran, is no longer supported as of CDF V2.5. The Standard Interface is similar to the CDF V1.0 interface with several additions for new features. Those functions not available using the Standard Interface are made available using the Internal Interface. The Internal Interface makes CDF an easily extensible software package, so all applications using this interface will remain portable with newer versions of the CDF library.
However, some advanced CDF users have relied on undocumented CDF functions that were removed from a later version of the library, and then could not compile their applications. This practice is NOT recommended because these functions are for internal use only and are subject to change without notice in future releases. If any internal CDF function is useful to your application, you should notify us, and we will determine whether it is appropriate to make it visible in the next release. In the meantime, the portable approach would be to copy the corresponding function(s) from the CDF library source, rename them, insert them into your application code, and update the calling statements with the new function name(s).
Commercial software:
Interactive Data Language ([IDL](http://www.harrisgeospatial.com))
MathWorks MATLAB Language ([MATLAB](http://www.mathworks.com))
Application Visualization System ([AVS](http://www.avs.com))
Weisang GmbH & Co. KG Data Analysis and Presentation ([FlexPro](http://www.weisang.com))
Public domain software:
Autoplot ([Autoplot](http://autoplot.org))
An interactive browser for data on the web.
SPEDAS ([SPEDAS](http://spedas.org/wiki)) Space Physics Environment Data Analysis Software
IDL-based plotting, analysis, and data downloading tools.
pySPEDAS ([pySPEDAS](https://github.com/spedas/pyspedas))
Python implementation of many SPEDAS plotting, analysis, and downloading tools.
TOPCAT ([TOPCAT](http://www.starlink.ac.uk/topcat))
STILTS ([STILTS](http://www.starlink.ac.uk/stilts))
These are two table analysis packages built on the same infrastructure;
TOPCAT is a GUI tool and STILTS provides command-line access to
the same functionality. These are Java applications that deal with various
file formats of interest to astronomers, and the CDF handling is based on the
JCDF library that can be referenced in [User Supplied Software](/html/user_supplied_sw.html).
Open Visualization Data Explorer, based on IBM's earlier Data Explorer (search the web for it).
[CDAWlib](//spdf.gsfc.nasa.gov/CDAWlib.html) - A set of IDL® routines that underlie the CDAWeb software.
This is a "library" of routines useful for reading in data
that is stored in the CDF format and also for plotting the
data variables as time series, images, radar, and
spectrograms. To use this software you must have IDL®
installed on your local machine.
[CDFx](//cdaweb.gsfc.nasa.gov/cdfx) - an IDL® software tool for displaying, editing, and listing the
contents of ISTP-compliant CDF data files. It displays image plots,
time-series plots, CDF variables; it will list and store CDF data
in plain ASCII text; and it will save subsets of data to new CDF
files.
Familiarity with CDF and ISTP CDF conventions will be helpful when
using CDFx. ISTP specifications and master CDF file concept
discussions are at the SPDF Use of CDF site.
LinkWinds (contact berkin@krazy.jpl.nasa.gov)
There are also several SPDF developed WWW-based data systems that
provide access to data stored in CDF and use IDL® to generate time
series plots of selected variables, which are available via the
following URLs:
CDAWeb - [cdaweb.gsfc.nasa.gov](//cdaweb.gsfc.nasa.gov)
SSCWeb - [sscweb.gsfc.nasa.gov](//sscweb.gsfc.nasa.gov)
COHOWeb - [cohoweb.gsfc.nasa.gov](//cohoweb.gsfc.nasa.gov)
OMNIWeb - [omniweb.gsfc.nasa.gov](//omniweb.gsfc.nasa.gov)
Example visualizations can be found via the URL:
[cdf.gsfc.nasa.gov/html/examples.html](examples.html)
See Comparison and conversion software with other formats
We encourage CDF users to let us know about their applications so that other CDF users can benefit from them as well. Contact CDFSUPPORT and indicate that you would like to contribute your software to the CDF user community.
Some CDF-related software is available from the URL: spdf.gsfc.nasa.gov/pub/software/cdf/apps.
Comments, corrections, or additions to the CDF FAQ along with any CDF-related comments or questions should be sent to the CDF User Support Office via email to cdfsupport.