Combine multiple NetCDF files into a time series multidimensional array in Python


I am using data from multiple netCDF files (in a folder on my computer). Each file holds data for the entire USA, for a time period of 5 years. Locations are referenced by the index of an x and y coordinate. I am trying to create a time series for multiple locations (grid cells), compiling the 5-year periods into a 20-year period (this means combining 4 files). Right now I am able to extract the data from all files for one location and compile it into an array using numpy append. However, I would like to extract the data for multiple locations, placing it into a matrix where the rows are the locations and the columns contain the time series of precipitation data. I think I have to create a list or dictionary, but I am not sure how to allocate the data to the list/dictionary within a loop.

I am new to Python and netCDF, so forgive me if this has an easy solution. I have been using this code as a guide, but haven't figured out how to adapt it for what I'd like to do: Python reading multiple NetCDF rainfall files of variable size

Here is my code:

import glob
from netCDF4 import Dataset
import numpy as np

# Define x & y index for the grid cell of interest
# Pittsburgh is 37,89
yindex = 37  # first number
xindex = 89  # second number

# Path
path = '/users/lmc/research data/narccap/'
folder = 'mm5i_ccsm/'

## Load data file names
all_files = glob.glob(path + folder + '*.nc')
all_files.sort()

## Initialize lists of time periods and locations
yindexlist = [yindex, 38, 39]  # y indices for all grid cells of interest
xindexlist = [xindex, xindex, xindex]  # x indices for all grid cells of interest
ngridcell = len(yindexlist)
ntimestep = 58400  # this is for 4 files of 14600 timesteps

## Initialize np array
timeseries_per_gridcell = np.empty(0)

## Start loop for file import
for timestep, datafile in enumerate(all_files):
    fh = Dataset(datafile, mode='r')
    days = fh.variables['time'][:]
    lons = fh.variables['lon'][:]
    lats = fh.variables['lat'][:]
    precip = fh.variables['pr'][:]

    # Only the first grid cell is extracted so far
    for i in range(1):
        timeseries_per_gridcell = np.append(
            timeseries_per_gridcell,
            precip[:, yindexlist[i], xindexlist[i]] * 10800)

    fh.close()

print(timeseries_per_gridcell)

I put 3 files on Dropbox so you can access them, but I am only allowed to post 2 links. Here they are:

https://www.dropbox.com/s/rso0hce8bq7yi2h/pr_mm5i_ccsm_2041010103.nc?dl=0
https://www.dropbox.com/s/j56undjvv7iph0f/pr_mm5i_ccsm_2046010103.nc?dl=0

Nice start. I would recommend the following to help solve your issues.

First, check out ncrcat to concatenate your individual netCDF files into a single file. I highly recommend downloading NCO for netCDF manipulations, especially in this instance, where it will ease your Python coding later on.

Let's say the files are named precip_1.nc, precip_2.nc, precip_3.nc, and precip_4.nc. You can concatenate them along the record dimension to form a new precip_all.nc, whose record dimension has length 58400, with

ncrcat precip_1.nc precip_2.nc precip_3.nc precip_4.nc -o precip_all.nc 
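If there are more files than you want to type by hand, the same command can be assembled from Python. This is only a minimal sketch: the helper names build_ncrcat_command and concatenate are made up for illustration, and it assumes NCO is installed, ncrcat is on your PATH, and the file names sort in chronological order.

```python
import glob
import subprocess


def build_ncrcat_command(input_files, output_file):
    """Return the ncrcat argument list for concatenating input_files."""
    if not input_files:
        raise ValueError("no input files given")
    # Sort so the files are concatenated in chronological order,
    # assuming the file names sort by date.
    return ["ncrcat"] + sorted(input_files) + ["-o", output_file]


def concatenate(pattern, output_file):
    """Run ncrcat on every file matching pattern (requires NCO on PATH)."""
    cmd = build_ncrcat_command(glob.glob(pattern), output_file)
    subprocess.run(cmd, check=True)


# Show the command that would be run for two example file names.
print(build_ncrcat_command(["precip_2.nc", "precip_1.nc"], "precip_all.nc"))
```

Calling concatenate('precip_*.nc', 'precip_all.nc') would then run the command shown above on whatever files match the pattern.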

In Python we now just need to read in that new single file and then extract and store the time series for the desired grid cells. Something like this:

import netCDF4
import numpy as np

yindexlist = [1, 2, 3]
xindexlist = [4, 5, 6]
ngridcell = len(xindexlist)
ntimestep = 58400

# Define an empty 2D array to store the time series of precip for a set of grid cells
timeseries_per_grid_cell = np.zeros([ngridcell, ntimestep])

ncfile = netCDF4.Dataset('path/to/file/precip_all.nc', 'r')

# Note that precip is 3D, so we need to read in all dimensions.
# Use the variable name from your own files (the question's files call it 'pr').
precip = ncfile.variables['precip'][:, :, :]

# One row per grid cell, one column per time step
for i in range(ngridcell):
    timeseries_per_grid_cell[i, :] = precip[:, yindexlist[i], xindexlist[i]]

ncfile.close()
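Once precip has been read into memory as a NumPy array, the loop over grid cells can also be replaced by a single indexing step. The sketch below uses a small synthetic array in place of the real data, and extract_cells is a hypothetical helper name. Note that this relies on NumPy's paired-list indexing. As far as I know, netCDF4 variables treat list indices independently per dimension, so apply it to the array you get after reading with [:], not to the variable object itself.

```python
import numpy as np


def extract_cells(precip, yindexlist, xindexlist):
    """Return an (ngridcell, ntimestep) array from a (time, y, x) NumPy array."""
    # Paired integer lists pick the points (y0, x0), (y1, x1), ...
    # giving shape (ntimestep, ngridcell); transpose to put cells in rows.
    return precip[:, yindexlist, xindexlist].T


# Synthetic stand-in for the real data: 5 time steps on a 4 x 7 grid.
demo = np.arange(5 * 4 * 7).reshape(5, 4, 7)
cells = extract_cells(demo, [1, 2, 3], [4, 5, 6])
print(cells.shape)
```

Each row of the result is the full time series for one (y, x) pair, which matches the layout of timeseries_per_grid_cell.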

If you have to use Python only, you'll need to keep track of the chunks of time indices that the individual files contribute to the full time series. 58400/4 = 14600 time steps per file. So you'll have another loop to read in each individual file and store the corresponding slice of times, i.e. the first file will populate 0-14599, the second 14600-29199, etc.
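The per-file slicing described above could look roughly like the following. Treat it as a sketch under assumptions, not a tested solution for your files: read_precip and fill_timeseries are names I made up, the variable name 'pr' is taken from the code in the question, and each file is assumed to hold a (time, y, x) array. The slice boundaries are taken from each file's own length, so files do not all need to have 14600 steps.

```python
import numpy as np


def read_precip(files, varname="pr"):
    """Yield the (time, y, x) array stored under varname in each file."""
    # Imported here so fill_timeseries can be tried without netCDF4 installed.
    from netCDF4 import Dataset
    for datafile in files:
        fh = Dataset(datafile, mode="r")
        try:
            yield fh.variables[varname][:]
        finally:
            fh.close()


def fill_timeseries(chunks, yindexlist, xindexlist, ntimestep):
    """Place each chunk's grid-cell series into its slice of the time axis."""
    ngridcell = len(yindexlist)
    timeseries = np.zeros([ngridcell, ntimestep])
    start = 0
    for precip in chunks:
        stop = start + precip.shape[0]
        if stop > ntimestep:
            raise ValueError("files hold more time steps than ntimestep")
        for i in range(ngridcell):
            timeseries[i, start:stop] = precip[:, yindexlist[i], xindexlist[i]]
        start = stop  # the next file continues where this one ended
    return timeseries


# Synthetic check: two "files" of 3 time steps each on a 4 x 5 grid.
chunks = [np.full((3, 4, 5), 1.0), np.full((3, 4, 5), 2.0)]
result = fill_timeseries(chunks, [0, 1], [2, 3], 6)
print(result)
```

With real data you would call fill_timeseries(read_precip(all_files), yindexlist, xindexlist, 58400), where all_files is the sorted list of file names from the question.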

