started to implement the nearest neighbor search for regridding
corrected calculation of divergence in real space.
corrected handling of maximum stress deviation (removed mask)
condensed makefile syntax.
can now use a system-defined LAPACK instead of ACML (required for OS X...).
fixed bug that 'ULTRA' was not using -O3 for most of the compiling...
-removed to long lines
-restructured f2py modules and merged make_DAMASK2Python into setup processing
-setup_code.py now sets library path in makefile and asks for compile switches for spectral code
-substituted \ in format strings with $
restructured DAMASK_spectral:
-more logical output and structure of code
-better input for spectral debug parameters
rearranged some logic here and there.
(hopefully) improved readability of debug/standard output.
restarting logic would need some discussion with Martin/Krishna still…
introduced parameters for selective debugging of spectral code and partly introduced the advanced divergence calculation again which is controlled by debug.config
added switch in numerics to control divergence behavior (uncorrected and corrected by phenomenological factor)
added precision directive to all values I found
dislocation velocities are stored in the state, so we actually now have three "parts" of the state, the basic states that are updated by "constitutive_dotState" come first, then the dependent states that are calculated by "constitutive_microstructure" follow, and finally we have a last part reserved for other variables that just use the memory reserved by the state array and are updated somewhere else.
constitutive:
LpAndItsTangent does not need the full state, but only the local state, so changed that at least for the nonlocal constitutive law
restart write is on per default
restart read is switched on by using --restart or -r INT where INT gives step at which the calculation should restart
setting INT to a value <1 will turn restart write off
It is set in the DAMASK root folder by running damask_env.sh
Much needed in the testing routines so far (try ./testing/run_tests.py)
damask_env.sh could also trigger other scripts to get to a working setup after fresh checkout. For example it could run
./processing/setup/setup_processing.py
./code/setup/setup_code.py
if it finds that we have a fresh checkout.
enabled finetuning of FFTW
added some debugging options
reading in rotation of boundary conditions
using header in geometry file
corrected error in calculating tolerance for stress BC
polishing of output, variable declaration, and variable names
use "-l, --load, --loadcase" to specify loadcase file and
"-g, --geom, --geometry" to specify geometry file
!incremental update, wait for commit of damask_spectral.f90 before checking out
Date: Tue, 13 Sep 2011 17:46:44 +0200
From: m.diehl@mpie.de
To: wangleyu@msu.edu, lebenso@lanl.gov, denny.tjahjanto@imdea.org,
o.guevenc@mpie.de, n.jia@mpie.de, m.diehl@mpie.de, c.kords@mpie.de,
c.zambaldi@mpie.de, p.eisenlohr@mpie.de, f.roters@mpie.de
Subject: update: /home/svn/repos/DAMASK to 1000
Message-ID: <4e6f7ae4.Xjnd/szYAh8QCXTo%m.diehl@mpie.de>
User-Agent: Heirloom mailx 12.2 01/07/07
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
A processing/pre/FromEBSD/
A processing/pre/FromEBSD/Hex2Cub.cpp
A processing/pre/FromEBSD/SpectralMethodFromEBDS
A processing/pre/FromEBSD/patchFromReconstructedBoundaries
D processing/pre/patchFromReconstructedBoundaries
added two small (quick and dirty) tools to convert data from EBSD to input files for spectral method, put them together with patchFromReconstructedBoundaries into new folder.
Date: Tue, 13 Sep 2011 17:54:06 +0200
From: m.diehl@mpie.de
To: wangleyu@msu.edu, lebenso@lanl.gov, denny.tjahjanto@imdea.org,
o.guevenc@mpie.de, n.jia@mpie.de, m.diehl@mpie.de, c.kords@mpie.de,
c.zambaldi@mpie.de, p.eisenlohr@mpie.de, f.roters@mpie.de
Subject: update: /home/svn/repos/DAMASK to 1001
Message-ID: <4e6f7c9e.v9E4JVN2a6bg5tL8%m.diehl@mpie.de>
User-Agent: Heirloom mailx 12.2 01/07/07
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
U code/DAMASK_spectral.f90
U code/DAMASK_spectral_interface.f90
U code/IO.f90
U code/crystallite.f90
U code/homogenization_RGC.f90
U code/lattice.f90
U code/makefile
U code/material.f90
U code/mesh.f90
did a lot of polishing:
- removed unnecessary "return" before end of subroutine or function:
- changed undetermined array length (:) to (1:3)
To prevent problems with some code analysing tools:
- "3D oneliner loops" (with ";) only for "do" and "enddo" at the same time
- removed line continuation in OMP statements
made the makefile more flexible, removed heap-arrays switch
Date: Tue, 13 Sep 2011 17:57:07 +0200
From: m.diehl@mpie.de
To: wangleyu@msu.edu, lebenso@lanl.gov, denny.tjahjanto@imdea.org,
o.guevenc@mpie.de, n.jia@mpie.de, m.diehl@mpie.de, c.kords@mpie.de,
c.zambaldi@mpie.de, p.eisenlohr@mpie.de, f.roters@mpie.de
Subject: update: /home/svn/repos/DAMASK to 1002
Message-ID: <4e6f7d53.IEDDzo+JSzDWNSBr%m.diehl@mpie.de>
User-Agent: Heirloom mailx 12.2 01/07/07
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
A documentation/Compiling/
A documentation/Compiling/Stack+usage.pdf
A documentation/ParallelizationAndTuning/
A documentation/ParallelizationAndTuning/BSC_tools_Overview.pdf
A documentation/ParallelizationAndTuning/Intro_Perf.pdf
A documentation/ParallelizationAndTuning/Kcachegrind.pdf
A documentation/ParallelizationAndTuning/LRZ210703_1.pdf
A documentation/ParallelizationAndTuning/LRZ210703_2.pdf
A documentation/ParallelizationAndTuning/MUST_Overview.pdf
A documentation/ParallelizationAndTuning/NPB-MZ-MPI-BT_Exercise.pdf
A documentation/ParallelizationAndTuning/PAPI.pdf
A documentation/ParallelizationAndTuning/PSC_Exercise_BT-MPI.pdf
A documentation/ParallelizationAndTuning/Paraver_Exercise.pdf
A documentation/ParallelizationAndTuning/Periscope_Overview.pdf
A documentation/ParallelizationAndTuning/SIONlib.pdf
A documentation/ParallelizationAndTuning/Scalasca_Examples.pdf
A documentation/ParallelizationAndTuning/Scalasca_Exercise_BTMZ.pdf
A documentation/ParallelizationAndTuning/Scalasca_Overview.pdf
A documentation/ParallelizationAndTuning/Scalasca_Patterns.pdf
A documentation/ParallelizationAndTuning/TAU.pdf
A documentation/ParallelizationAndTuning/VIHPS-TW8.pdf
A documentation/ParallelizationAndTuning/Vampir_Exercise.pdf
A documentation/ParallelizationAndTuning/Vampir_Overview.pdf
A documentation/ParallelizationAndTuning/instructions_periscope.pdf
A documentation/ParallelizationAndTuning/manualf06.pdf
added some information from Tuning workshop in Aachen regarding tuning/parallelization
added slides with information how to prevent segmentation fauld
- removed unnecessary "return" before end of subroutine or function:
- changed undetermined array length (:) to (1:3)
To prevent problems with some code analysing tools:
- "3D oneliner loops" (with ";) only for "do" and "enddo" at the same time
- removed line continuation in OMP statements
made the makefile more flexible, removed heap-arrays switch
* use "math_invert3x3" instead of "math_inv3x3" for inversion of Fe
* for dislocation stress calculation: first regular case, then special case of dead dislocations in central ip
* "dv_dtau" now given for each dislocation type, so is a (ns,4) array
* deleted unused variables in "_LpAndItsTangent"
* corrected contribution of deads in "_LpAndItsTangent"
* the NaN variables defined in math did not give a proper NaN value, so use 0.0/0.0 again
* neighbors with nonlocal constitution but local properties (i.e. /nonlocal/ flag not set) are also considered for incoming fluxes
made convergence independent of size and resolution,
polishing output in DAMASK_spectral.f90
added function to compute eigenvalues without eigenvectors and function to convert a 3x3 logical to a 9 vector in math.f90
removed obsolete variable in numerics.f90
corrected calculation of stress BC condition. Depending on given BC, the stiffness matrix is reduced and than inversed. Then it is filled with zeros and used for the calculation of the correct change of deformation gradient. All calculation is done using dP/dF
fixed bug in bc_temperature assignment that was hitting memory.
Temperature is taken from the first loadcase and evolves from there in an adiabatic fashion for the moment. I.e. T-specifications from later loadcases are ignored...
* dislocation flux is blocked if we encounter a sign change in the resolved shear stress from the central ip to the neighbor
* do not set density to zero if below certain threshold; this creates an artificial sink term
* damper initialized with one
* inversion of Mandelized stiffness tensor does not work, have to use plain tensor
* new functions in math that allow for conversion between Mandel and Plain tensors
you have to specify the job you are restarting from in the job description (cae), if you prepare your input file by hand this is the first line after *Heading
example: if the first job was using Oldjob.inp the first entry in the job description needs to be Oldjob (without the .inp)
as for Marc restart works only from last converged increment, i.e. ther restart writing should be specified like this:
*retsart, write, frequency=1, overlay
Overlay is not essential but saves a lot of disk space and as stated before you can only restart from the last converged increment anyway
* Marc: node displacements are added to initial node coordinates (mesh_node0) to get current node positions (mesh_node), then ip coordinates are deduced
* Abaqus: ip coordinates are directly updated, no update of node coordinates!
* Spectral: for the moment no update of either ip or node coordinates! passing only dummy values with initial ip coordinates
* replaced "dble" intrinsic function by "real" with pReal kind in constitutive_nonlocal.f90
* removed useless line breaks in output of state in CPFEM.f90
* Also added some more openmp directives to increase percentage of parallelized code.
* "implicit none" was missing in two subroutines of homogenization and constitutive.
0 : only version infos and all from "hypela2"/"umat"
1 : basic outputs from "CPFEM.f90", basic output from initialization routines, debug_info
2 : extensive outputs from "CPFEM.f90", extensive output from initialization routines
3 : basic outputs from "homogenization.f90"
4 : extensive outputs from "homogenization.f90"
5 : basic outputs from "crystallite.f90"
6 : extensive outputs from "crystallite.f90"
7 : basic outputs from the constitutive files
8 : extensive outputs from the constitutive files
If verbosity is equal to zero, all counters in debug are not set during calculation (e.g. debug_StressLoopDistribution or debug_cumDotStateTicks). This might speed up parallel calculation, because all these need critical statements which extremely slow down parallel computation.
In order to keep it like that, please follow these simple rules:
DON'T use implicit array subscripts:
example: real, dimension(3,3) :: A,B
A(:,2) = B(:,1) <--- DON'T USE
A(1:3,2) = B(1:3,1) <--- BETTER USE
In many cases the use of explicit array subscripts is inevitable for parallelization. Additionally, it is an easy means to prevent memory leaks.
Enclose all write statements with the following:
!$OMP CRITICAL (write2out)
<your write statement>
!$OMP END CRITICAL (write2out)
Whenever you change something in the code and are not sure if it affects parallelization and leads to nonconforming behavior, please ask me and/or Franz to check this.
* removed input variables in constitutive_collectDotState and constitutive_postResults that are not needed anymore (because of recent changes in constitutive_nonlocal)
Now it is possible to compile a single precision spectral solver/crystal plasticity by replacing mesh.f90 and prec.f90 with mesh_single.f90 and prec_single.f90.
For the spectral method, just call "make precision=single" instead of "make". Use "make clean" evertime you switch precision