
1.6 Common Data Formats Used by Geo Scientists

Filename Extension Format Name Comments
ASC ESRI ASCII grid format; also widely used for any ASCII files, often tabular Table of ASCII values, preceded by a 6- or 7-line header; a very widely used format for GIS grids
BAT DOS batch file format Records basic DOS commands, as in a script.  A close relative is the CMD format, the newer version for Microsoft "NT" systems after Windows 2000.
BMP Bitmap image format Microsoft format, usually stored without compression, so files tend to be quite large
BPW Geo-referencing world file for BMP  
BSB Format for raster nautical charts A BSB image file has a .KAP extension and can be optionally accompanied by a .BSB file which stores further cartographic data and relates multiple .KAPs together.
BUFR Binary universal format for representation (usually of meteorological data) Self-describing format widely used for global data reporting into World Meteorological Organization systems; heavily depends on the use of external lookup tables to explain contained codes
CDF Common data format Self-describing data format for the storage of scalar and multidimensional data; open in ncBrowse software.  This format is a distant relative of NetCDF, and is sometimes incorrectly identified as "GMT NetCDF"
CDL Common data language ASCII version of NetCDF; created from NetCDF by ncdump utility; creates NetCDF through ncgen utility
CMD See in BAT, above  
CNV See in SeaSoft description Special tabular ASCII format produced by Seabird processing systems for CTD data
CSV Comma-separated values; usually applied to ASCII tables Usually opens directly in Excel
DAT Generic term for any data file (often ASCII) Least useful of all "extensions"
DBF dBase file; auxiliary file to ESRI shape format (SHP) Historically named format for obsolete data software systems; required component of ESRI's SHP format; can be imported by Saga separately as a table object; can also be edited separately in some spreadsheet programs
DOC, DOCX Microsoft Word document Additional X denotes Microsoft Office 2007 and later
DXF Drawing Exchange Format CAD/CAM format from Autodesk, also used for interchange between GIS systems
E00 ESRI concatenation format Format for storing more than a dozen types of ESRI GIS map objects; must be "unpacked" prior to object use
FLT Floating-point single precision binary grid For use with ESRI software; must be accompanied by a small auxiliary ASCII header (HDR) file that provides the geo-referencing information. The two together are equivalent to the ASC file above.
GFW Geo-referencing world file for GIF  
GIF Graphics interchange format Format widely used on webpages; its LZW compression was formerly covered by patents; limited to 256 colors per image (called the "GIF palette", although there are many different color mapping tables)
GINI GOES Ingest and NOAAPORT Interface image format Format for AWIPS satellite data
GML Geography Markup Language "The XML grammar defined by the Open Geospatial Consortium (OGC) to express geographical features" [Wikipedia] - Noteworthy for providing a non-commercial, logical, ASCII replacement for shapefiles and for coverages (i.e. rasters)
GRD
  • Filename extension used for NetCDF grids in GMT
  • Surfer grids (Golden software)
  • Generic extension used for grid files, often ASCII
 
GRIB, GRIB2 Gridded binary format Self-describing format widely used for global meteorological data products coming from World Meteorological Organization systems; heavily depends on the use of external lookup tables to explain contained codes
GZ Compressed file created with GZip  
HDF Hierarchical data format Mainly used for grids, but especially satellite imagery at all product levels.   A confusing array of types exists:
  • HDF4
  • HDF4-EOS
  • HDF5
  • HDF5-EOS

EOS denotes NASA's Earth Observing System. Versions 4 and 5 are completely different formats, but conversion routines exist. EOS refers more to content quality guidelines than to technical specifications.

JGW Geo-referencing world file for JPG  
JPG, JPEG Joint photographic experts' group compression scheme; image format based on the scheme Widely used image format
KAP Raster format for nautical charts A BSB image file has a .KAP extension and can be optionally accompanied by a .BSB file which stores further cartographic data and relates multiple .KAPs together.
KML Google's keyhole markup language Metadata "wrapper" around local or remote images; can also contain vector drawing instructions like a shapefile.
KMZ Zipped form of KML files (see above) KMZ files physically include any images cited by the KML file; change the extension from KMZ to ZIP to allow manipulation/examination with WinZip or 7Zip
MGD77 Marine Geophysical Data Exchange Format *.m77t - Data spreadsheet (TSV, ASCII) that is easily handled by Saga.  Currently set up for 9 common geologic survey data types.  *.h77t - Companion metadata format with long list of fields to identify/use survey data
MGRD Saga grid (metadata)  
NC, NetCDF Network Common Data Form Becoming the dominant format for marine data; since version 4 it is built on the HDF5 storage layer, which makes it easier to include remote sensing data; has an ASCII analog called CDL (common data language); variable names and other core terminology are usually controlled by NetCDF Conventions to maximize interoperability of data and systems; opens automatically in ncBrowse software
NCD   A file that aggregates multiple NetCDF files; documentation unavailable
NCML NetCDF Markup Language; NcML Cookbook Metadata wrapper around a NetCDF file, containing additional tags or use information
NCWMS ncWMS "ncWMS is a  Web Map Service for geospatial data that are stored in  CF-compliant  NetCDF files."
ODV Ocean Data View collection format Main "index" file format for ODV collections (consisting of many other files in specially organized folders)
PGW Geo-referencing world file for PNG  
PNG Portable network graphics Open, patent-free format designed as a replacement for GIF; also widely used in place of JPG where lossless compression is needed
PPT, PPTX, PPS, PPSX Microsoft PowerPoint Additional X denotes Microsoft Office 2007 and later; S denotes slide-show versions that open directly in presentation mode
PRJ ESRI projection file for GIS shapes "Projection format; the coordinate system and projection information, a plain text file describing the projection using well-known text (WKT) format" [Wikipedia article]
RAR Compression format Can be unzipped with 7zip or Winzip
SDAT Saga grid (binary array)  
SGRD Saga grid (auxiliary header)  
SHP ESRI shapefile Comes with 2 mandatory auxiliary files (SHX and DBF) but also up to 13 optional special-purpose files; mainly for vectors, but can hold grids in the point shape form (e.g. World Ocean Atlas analysis grids)
SHX Auxiliary file to ESRI shape format (SHP) Required for all shapes
SPRJ Saga project Holds data object display information for desired map combinations
SPRM Saga properties Save often-used properties settings in this format, and then re-load when needed for any individual map object.
S-57 Vector interchange format used for maritime charts, developed by the International Hydrographic Organization (IHO) for Electronic Navigational Charts (ENC) Actual files have extensions like .000, .001, .002, etc. The principal shape appears to be the .000 item.
TAB ASCII table format used by Pangaea environmental data Can be converted to Ocean Data View spreadsheet format with Pan2Applic software
TAR Concatenation format for UNIX; often followed by zipping with GZip to yield "tar.gz" files  
TSV Tab-separated values; usually applied to ASCII tables Usually opens directly in Excel
TXT Text - Often applied to tab-separated ASCII tables Use any spreadsheet software for column-wise editing; when very large, such as ASCII conversions from satellite images, use ConTEXT or NOTEPAD++ to edit.
TFW Geo-referencing world file for TIF/TIFF  
TIF, TIFF Tagged image file format Can hold images, but the specification also allows data rasters to be stored (for example, floating point numbers or integers); both forms can include internal geo-referencing tags, or external TFW files (see above).  When the georeferencing is internal, then the name GeoTIFF can also be used.  When the file contains numerical data, then it is always called a "raster".
VRT GDAL virtual format Description (in XML) of an accompanying data file. In programs that read VRT, open this file rather than the data file itself; the VRT then tells the program how to read the data. "The VRT driver is a format driver for GDAL that allows a virtual GDAL dataset to be composed from other GDAL datasets with repositioning, and algorithms potentially applied as well as various kinds of metadata altered or added. VRT descriptions of datasets can be saved in an XML format normally given the extension .vrt." [GDAL Tutorial]
World File BPW, GFW, JGW, PGW, TFW (see above)  6-line ASCII files that geo-reference images
XIDV Aggregation format used by IDV to store data object locations (local or online) Use to re-view or share (e.g. by email) specific suites of data; for example synoptic data from a particular storm event
XLS, XLSX Microsoft Excel spreadsheet Additional X denotes Microsoft Office 2007 and later
XML Extensible Markup Language A "standard" that provides a "set of rules for encoding documents in a format that is both human-readable and machine-readable...widely used for the representation of arbitrary data structures" [Website]
XYZ XYZ Simple table format, with 3 data columns headed X (for east-west coordinate), Y (for north-south coordinate) and Z (the data value).  Usually tab-separated for ready use in many GIS systems.
ZIP Data compression and archiving format Widely used format, originally developed by the company PKWARE for its PKZIP program
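
The ASC entry in the table describes a short keyword header followed by a table of ASCII values. The following is a minimal sketch of how such a file might be read in Python, assuming the commonly seen header keywords (ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value). The function name read_esri_ascii and the sample values are illustrative only and do not come from any particular software package.

```python
import io

def read_esri_ascii(stream):
    """Parse an ESRI ASCII grid from a text stream into (header, rows).

    Hypothetical helper for illustration: header lines are assumed to be
    'keyword value' pairs, and every other non-empty line is a data row.
    """
    header = {}
    rows = []
    for line in stream:
        parts = line.split()
        if not parts:
            continue
        if parts[0][0].isalpha():
            # Header line, e.g. "ncols 3"
            header[parts[0].lower()] = float(parts[1])
        else:
            # Data line: one row of grid values
            rows.append([float(v) for v in parts])
    return header, rows

# Made-up sample grid with a 6-line header
sample = """ncols 3
nrows 2
xllcorner 10.0
yllcorner 50.0
cellsize 0.5
NODATA_value -9999
1 2 3
4 -9999 6
"""

header, rows = read_esri_ascii(io.StringIO(sample))
nodata = header.get("nodata_value")
# Replace the no-data marker with None so it cannot be mistaken for a value
grid = [[None if v == nodata else v for v in row] for row in rows]
print(header["ncols"], header["nrows"], grid)
```

Real files can be large, so in practice a GIS package or a raster library is usually a better choice than hand parsing; the sketch only shows how simple the layout is.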
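
The world file entries in the table (BPW, GFW, JGW, PGW, TFW) are all 6-line ASCII files. As a rough sketch, assuming the usual line order (x pixel size, two rotation terms, negative y pixel size, then the x and y map coordinates of the centre of the upper-left pixel), the six numbers define an affine transform from pixel position to map coordinates. The function names and sample numbers below are hypothetical.

```python
def parse_world_file(text):
    """Return the six world-file parameters in file order: A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(token) for token in text.split())
    return a, d, b, e, c, f

def pixel_to_world(params, col, row):
    """Map a pixel (col, row) to map coordinates with the affine transform."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

# Made-up world file: 0.5-unit pixels, no rotation, upper-left at (100, 200)
sample = "0.5\n0.0\n0.0\n-0.5\n100.0\n200.0\n"
params = parse_world_file(sample)

print(pixel_to_world(params, 0, 0))  # (100.0, 200.0)
print(pixel_to_world(params, 2, 4))  # (101.0, 198.0)
```

The y pixel size is normally negative because image rows count downward while map northings count upward.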
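
The KMZ entry in the table notes that a KMZ is a zipped KML plus any images it cites. Because of this, the contents can also be inspected without renaming the file, for example with Python's standard zipfile module. The sketch below builds a tiny KMZ in memory purely for demonstration; the member names doc.kml and files/overlay.png are illustrative assumptions, not taken from a real data set.

```python
import io
import zipfile

def list_kmz_members(kmz_bytes):
    """Return the names of the files packed inside a KMZ archive."""
    with zipfile.ZipFile(io.BytesIO(kmz_bytes)) as archive:
        return archive.namelist()

# Build a small in-memory KMZ so the example does not need a file on disk
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("doc.kml", "<kml></kml>")
    archive.writestr("files/overlay.png", b"")

names = list_kmz_members(buffer.getvalue())
print(names)
```

For a file on disk, passing the path directly to zipfile.ZipFile works the same way, whatever the extension.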