A notebook of assorted Linux tricks. Some of them I learned from the Internet; the rest I figured out by experimenting on my own.
1.1. Continue an interrupted download
Use the “-C” option. Specifically, if we want cURL to determine automatically where to resume the transfer, use “-C -”.
curl -C - -O https://example.com/sample.tgz
This can also rescue a file transfer that wget started but failed to finish. Continue reading
This post records some interesting uses of CDO, to save myself frequent Googling…
1. Convert a GRIB1 file from a reduced Gaussian grid to a regular Gaussian grid:
cdo setgridtype,regular <infile> <outfile>
This is mainly used to pre-process ECMWF data. For an explanation of the reduced Gaussian grid, see the reference at the end of this post. Specifically, this handles the following issue:
Warning (cdfDefRgrid) : Creating a netCDF file with data on a gaussian reduced grid.
Warning (cdfDefRgrid) : The further processing of the resulting file is unsupported!
2. Delete selected timesteps from a NetCDF file
cdo delete,timestep=1,10,20 <infile> <outfile>
Note that the timestep counter starts from 1.
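As a small extra (my own sketch, not from any official recipe; file names are placeholders), these operators combine naturally with CDO's output-format flag, and the ntime operator gives a quick sanity check before deleting anything:

```shell
# Regrid from reduced to regular Gaussian grid and write NetCDF output
# in one call via the output-format flag (-f nc); names are placeholders.
cdo -f nc setgridtype,regular input.grb output.nc

# Print the number of timesteps before deciding which ones to delete
cdo ntime output.nc
```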
Reference: Reduced Gaussian grid.
This post is a record of how to process the HadGEM2-CC data for WRF downscaling. Unlike GFDL-ESM2M, the atmospheric fields in this dataset are on pressure levels, which saves a lot of trouble. However, the data available on the portal are masked out below the terrain, so some interpolation is required to fill the masked points and avoid WRF errors during initialization (e.g. unreasonably large RH values produced by interpolation).
After a long time of working with reanalysis data, I have finally come to running WRF with CMIP5 data. Unfortunately, there are not many resources available online (even though there have been plenty of publications based on WRF downscaling of CMIP5). The most useful one I found is the set of instructions at the CORDEX experiment site, written for MIROC5 model output. It is worth pointing out that there is one bug in those instructions (in step 2.4, the ps variable should not be removed at that step). In my case, I need to drive WRF with CESM4, HadGEM2-CC and GFDL-ESM2M data.
The CESM4 data have been bias-corrected and prepared for WRF by NCAR as dataset ds316.1, which saves a lot of time. Steps for HadGEM2-CC are covered in part (2) of this post. GFDL-ESM2M happens not to be fully compliant with the CF-1.4 convention, so it gave me some extra trouble. Luckily, I was able to solve all of the issues, and here is a record of the steps needed to digest the GFDL-ESM2M data. Continue reading
HYSPLIT is a software package for computing air parcel trajectories. It is developed by NOAA ARL and has been used in a wide range of studies. In my work I use it for tracing air moisture, and since there are numerous parcels to trace, I figured it would be much easier to work in Linux (i.e. HYSPLIT driven by shell scripts). There is some confusion about how to install it, and some modifications to the default configuration are needed (at least for me).
Post-processing WRF output is a big task, especially since different tasks come up at different times. Most of the time I can find fairly simple (and elegant) solutions in NCL, so I decided to learn more about it. I will add to this post as I collect new useful code.
NCL stands for NCAR Command Language, and here is the official website. It is designed for handling climate data, so WRF is more or less natively supported: it ships a script library called “WRFUserARW.ncl”, which is super helpful for processing WRF results.
NCL has 3 main references, and they are great tools to start with. Here I mainly handle NetCDF files, and I am only putting down some quick notes (or short code pieces) in case I need them. Continue reading
Just a follow-up to what I mentioned in “Software installation in CentOS 7 for Scientific Computation”. The GMT project team is now maintaining both GMT 4 and GMT 5, and this will remain the case until GMT 6 is announced. The biggest (and probably the first) change you will notice is that GMT 5 no longer ships individual executables like “psxy”; instead, modules are called as “gmt psxy”.
Unfortunately, there are still lots of scripts written in GMT 4 syntax. I am facing this problem now, and there are two methods that can quickly get you back to a GMT 4 “environment”. But be cautious: the usage of some commands differs slightly between GMT 4 and 5, so these methods do not guarantee that old GMT 4 scripts will work as expected under GMT 5. Besides, GMT 4 will reach its end of support sooner or later, so the better way is, as you can guess, a fresh start in GMT 5. Continue reading
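As a quick sketch of one possible workaround (my own illustration, not necessarily one of the two methods from the full post), the classic command names can be shimmed onto the GMT 5 wrapper with shell functions:

```shell
# Shim classic GMT 4 command names onto the GMT 5 "gmt <module>" syntax.
# The module list here is illustrative, not exhaustive.
for cmd in psxy psbasemap pscoast pstext grdimage; do
  eval "$cmd() { gmt $cmd \"\$@\"; }"
done

# Old-style calls now forward to GMT 5, e.g.:
# psxy data.txt -R0/10/0/10 -JX10c -W1p > map.ps
```

This keeps old scripts superficially runnable, but it does not paper over the commands whose options changed between the two versions.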
Just found a good place to practice regular expressions (and also to debug/experiment with my Perl code). There seem to be a lot of similar platforms available, but I would go with this one, as it is so clean.
So here comes one of the nice parts of working with Linux. I am trying to figure out how to run several WRF simulations in parallel (I am not talking about an MPI run of WRF) on the school cluster. For a single simulation, we just compile the code and run the model. But what if I want multiple runs at the same time? I do not want to go through the “build-run-delete” cycle for each one; instead, I would like to keep several sets of namelists alongside a single wps.exe/wrf.exe to do the simulations. This would also make it handy when wrf.exe or wps.exe needs to be recompiled. I am writing down what I have tested, but there is no guarantee that it is correct. Continue reading
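The basic idea can be sketched as follows (paths and run names are hypothetical, and this is my own illustration rather than a verified recipe): each run directory holds its own namelist.input but symlinks to the shared executable, so recompiling wrf.exe updates every run at once.

```shell
#!/bin/sh
# Sketch: several run directories sharing one compiled wrf.exe.
# WRF_BUILD and the run names are placeholders for illustration.
WRF_BUILD=${WRF_BUILD:-$HOME/Build_WRF/WRF/run}

for run in run01 run02 run03; do
  mkdir -p "$run"
  # symlink the shared executable into each run directory
  ln -sf "$WRF_BUILD/wrf.exe" "$run/wrf.exe"
  # each run keeps its own namelist (copied in by hand or by script), e.g.
  # cp "namelists/namelist.input.$run" "$run/namelist.input"
done

ls -l run01/wrf.exe
```

The same pattern would apply to wps.exe and the static data files that WPS/WRF expect to find in the working directory.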
I came across this problem again, haha.
I was running a program that needed 8000+ files open at the same time, but it failed while reading the 1022nd file, just like before. This time I decided to solve the problem permanently.
Add this into /etc/security/limits.conf:
* soft nofile 9000
* hard nofile 9000
And you are all set. Alternatively, if you only need the higher limit for the current session, use this:
ulimit -n 9000
I have seen a post about this problem before; it said that by default CentOS limits each process to 1024 open file descriptors. Within these 1024, two are reserved by the system (stdout and stderr). In my case, another descriptor was occupied by the script itself, hence the trouble around the 1022nd file.
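To see what limits are actually in effect (a quick sketch; the numbers printed depend on your system), the shell's built-in ulimit can report both the soft and the hard values:

```shell
#!/bin/sh
# Report the soft (-S) and hard (-H) limits on open file descriptors (-n).
# The soft limit is what a process actually hits; an unprivileged user
# may raise it up to, but not beyond, the hard limit.
echo "soft limit: $(ulimit -Sn)"
echo "hard limit: $(ulimit -Hn)"
```

This is also a handy check after editing /etc/security/limits.conf, since the new values only apply to sessions started after the change.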