Introduction and Background
This is a real example of a recently used filename in an MDL exercise:
- ssh_vals_annu_100m_glo_woa05_ingrid_anal.nc
Why is it so long? Do you take the time to assign carefully structured, complete filenames to your data products? And do you have any problems finding them weeks or months later? Maybe you should read this article.
The following is a recommended file-naming convention for datasets used in all your earth science data work. It is based on years of experience working with large numbers of datasets in classroom settings and in the compilation of regional marine atlases. Both situations often lead to confusion about the identities of files, due to name similarities and to the proliferation of "child" files derived from other files by sequential processing steps.
General
- Give as much detail as possible - but only the Parameter (see below) is absolutely mandatory
- Use familiar, intelligible and simple tokens for each filename section
- Use only numbers and lower-case letters
- Use an underscore (not a hyphen; not a space) to separate filename sections
- Use the letter p to denote a decimal point, for example 0p5 instead of 0.5
- Very new recommendation, not yet followed throughout MDL exercises
- Use no special characters, such as &, -, +, %, etc.
- Use no Greek letters, such as µ; use u instead
- SPECIAL NOTE: Filenames inherited from other comprehensive naming conventions should usually be left unchanged (or perhaps slightly augmented), for example the MODIS filenames containing coded metadata details
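The general rules above can be checked mechanically. The following is a minimal sketch, not part of the original convention: the function names encode_decimal and check_name are hypothetical, and it assumes the rules apply to the part of the name before the final extension.

```python
import os
import re

# Assumption: a valid section holds only digits and lower-case letters.
_SECTION = re.compile(r"[a-z0-9]+")

def encode_decimal(value):
    """Write a plain decimal number with 'p' for the decimal point."""
    # Intended for simple decimals such as 0.5; not for exponent notation.
    return str(value).replace(".", "p")

def check_name(filename):
    """Return a list of rule violations found in the name (extension ignored)."""
    stem, _ext = os.path.splitext(filename)
    problems = []
    for section in stem.split("_"):
        # Empty sections (double underscores) are reported as well.
        if not _SECTION.fullmatch(section):
            problems.append(f"bad section: {section!r}")
    return problems

print(encode_decimal(0.5))
print(check_name("ssh_vals_annu_100m_glo_woa05_ingrid_anal.nc"))
print(check_name("temp_0.5deg.nc"))
```

An empty list means the name passed these checks. Inherited names, such as the MODIS filenames mentioned in the special note, would be expected to fail and can simply be excluded from checking.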
Filename Sections in the Order They Should Appear
- The first 3 sections are easily standardized words, numbers and acronyms
- The final 4 sections are a bit less easily standardized terms
- Use as many sections as you need; nothing is sacred
Parameter or Object
- Examples: temp, sal, relhum, airt, phos, current_u, current_v, ssh, sst, waves, secchi, par, kd490, etc.
- Examples: coast(line), traj(ectory), depth, height, grat(icule), aoi (area of interest box), etc.
- Examples: multiple entry indicates multiple parameters, e.g. airt_relhum_windv
- Optional: add "_img" if it contains an image of data, and not data (unnecessary if the filename extension is obvious, e.g. jpg or png)
- Optional: add "_grid" or "_grids" if it contains gridded data, e.g. currentv_grid, temp_grid, ssh_grid
- Optional: add "_cons" if it contains contour lines from a gridded analysis, e.g. phos_cons
- Optional: add "_vecs" if it contains vector arrows for wind, current, or motion, e.g. current_vecs
- Optional: add "_anal" if it contains an analysis product, such as a table from Ocean Data View's surface mode tables
Date and/or Time
- Think about climatological, specific time or specific interval
- For climatologies, use letters/abbreviations for months and annu for annual average; do not use season names
- Use the ISO Format Standard 8601 for all dates, as detailed as possible
- Examples: 2014-01-19T09:44:30Z, 2005, amj (April-June), jul (July), annu
- Indicate general time-of-day for satellite sensor products, when applicable, e.g. day, nite or night
- Double entry indicates time interval, e.g. 200301_200303 (jfm of 2003)
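The climatological and interval tokens above can be generated rather than typed by hand. This is a rough sketch under stated assumptions: the function names are hypothetical, and the three-letter abbreviations other than jul are assumed, since the text gives only that one.

```python
from datetime import date

# Assumption: lower-case English three-letter month abbreviations.
MONTH_ABBR = ["jan", "feb", "mar", "apr", "may", "jun",
              "jul", "aug", "sep", "oct", "nov", "dec"]

def climatology_token(first_month, n_months=1):
    """Token for n consecutive months starting at first_month (1-12)."""
    if n_months == 1:
        return MONTH_ABBR[first_month - 1]
    # Multi-month periods use month initials, wrapping past December.
    return "".join(MONTH_ABBR[(first_month - 1 + i) % 12][0]
                   for i in range(n_months))

def month_interval_token(start, end):
    """Double entry yyyymm_yyyymm for the interval from start to end."""
    return f"{start:%Y%m}_{end:%Y%m}"

print(climatology_token(4, 3))
print(climatology_token(7))
print(month_interval_token(date(2003, 1, 1), date(2003, 3, 31)))
```

Building the token from month numbers keeps season names out of the filename, as recommended above.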
Depth/Height
- Examples: surf, 0m, 100m, bot(tom), 700mbar - sense (depth or height) taken from the parameter
- Double entry indicates depth/height interval, e.g. 50m_100m
Location
- Examples: nami/namibia, afr/africa, atl, ind/indo, pac, arct/arctic, ant/antarct, balt, black, casp, bbeng, carib, med/medit, natl/noratl, satl/soatl, spac/sopac, glo/glob/global, soc/sooc, namer, samer, eu, asia, seasia, au/aus, etc.
Originator
- Marine program, project or model that originally produced the data (to include major synthesis activities)
- Examples: woce, modis, woa05, wod05, czcs, seawifs, argo, gpd, hycom, mercator, etc.
Provider
- Agency or website where the data have been re-formatted, re-packaged, re-distributed or downloaded
- Not necessary if provider is the same as the originator
- Examples: nvods, coriolis, ingrid, colorweb, pangaea, oceanportal, poet, wist, etc.
Extra
- Completely optional, as appropriate and necessary to identify the data
- Examples: 4u or 11u to indicate 4-micron or 11-micron wave band for satellite measurements
- Examples: Satellite - lev1, lev2, lev3, lev4; using standard Data Processing Levels
- Examples: Satellite - processing algorithm name, e.g. "gordon" for optical properties of sea water
- In Situ examples: raw, anal(yzed), mean, std (standard deviation), flag(ged), unflag(ged), etc.
- Format acronym: hdf, flt, grb, etc.; only needed if the file is compressed, e.g. zip, tar, gz or bz2
- Example: sst_clim_liberia_jun_colorweb_hdf.gz
- Software - name of the program used to create the file, e.g. surfer, saga, arcgis, odv, etc.
- Resolution - indicate the scale (in the inverse sense) for map objects, when important; for example "250k" could indicate a 1:250000 version of a coastline file
- Gridsize - indicate the size of the grid cells; for example "0p1deg" indicates grid cells are 0.1 degree on each side
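Putting the sections together in a fixed order is easy to automate. The sketch below is one possible helper, not an established tool: the name build_filename and its keyword arguments are hypothetical, chosen to mirror the seven sections described above.

```python
def build_filename(parameter, date=None, depth=None, location=None,
                   originator=None, provider=None, extra=None, ext=""):
    """Join the sections that are present, in the recommended order."""
    if not parameter:
        # Only the Parameter section is mandatory.
        raise ValueError("the Parameter section is mandatory")
    sections = [parameter, date, depth, location,
                originator, provider, extra]
    # Skip sections that were not supplied; separate the rest with underscores.
    stem = "_".join(s for s in sections if s)
    return f"{stem}.{ext}" if ext else stem

print(build_filename("ssh_vals", date="annu", depth="100m", location="glo",
                     originator="woa05", provider="ingrid", extra="anal",
                     ext="nc"))
```

Using keyword arguments makes it harder to put a token in the wrong position, and omitted sections simply drop out of the name.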
Filename Examples
- ssh_vals_annu_100m_glo_woa05_ingrid_anal.nc
- airt_relhum_grids_20050101_500mbar_namer.flt
- airt_relhum_grids_20050101_500mbar_namer.hdr - companion header file for the above flt file
- chloro_jfm_surf_glo_modis_colorweb_lev3_hdf.bz2
- coastline_namibia_wvs_gebco2003_250k.shp
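Companion files, such as the flt and hdr pair above, share everything except the extension, so they can be matched up by name. The following is a small sketch of that idea; group_companions is a hypothetical helper name.

```python
import os
from collections import defaultdict

def group_companions(filenames):
    """Map each name (without extension) to the sorted extensions found."""
    groups = defaultdict(list)
    for name in filenames:
        stem, ext = os.path.splitext(name)
        groups[stem].append(ext.lstrip("."))
    return {stem: sorted(exts) for stem, exts in groups.items()}

files = [
    "airt_relhum_grids_20050101_500mbar_namer.flt",
    "airt_relhum_grids_20050101_500mbar_namer.hdr",
    "coastline_namibia_wvs_gebco2003_250k.shp",
]
for stem, exts in group_companions(files).items():
    print(stem, exts)
```

A name that appears with only one of its expected extensions is a quick hint that a companion file went missing during copying or processing.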
Special Files
- Some programs have special files to describe/specify aggregates of particular files and sometimes also the special settings to be applied. Example programs include Saga (PRJ, SPRM), IDV (XIDV), ODV (ODV). You should make your own rules on how to do these, but be consistent.