DAMASK_EICMD/processing/post/addCalculation.py

#!/usr/bin/env python2
# -*- coding: UTF-8 no BOM -*-

import os,re,sys
import math                                                                                         # noqa
import numpy as np
from optparse import OptionParser
import damask

scriptName = os.path.splitext(os.path.basename(__file__))[0]
scriptID   = ' '.join([scriptName,damask.version])

# --------------------------------------------------------------------
#                                MAIN
# --------------------------------------------------------------------

parser = OptionParser(option_class=damask.extendableOption, usage='%prog options [file[s]]', description = """
Add or alter column(s) with values derived from user-defined arithmetic operations on column(s).
Column labels are tagged by '#label#' in formulas. Use ';' for ',' in functions.
Numpy is available as np.

Special variables: #_row_# -- row index
Examples: (1) magnitude of vector -- "np.linalg.norm(#vec#)" (2) rounded root of row number -- "round(math.sqrt(#_row_#);3)"

""", version = scriptID)

parser.add_option('-l','--label',
                  dest = 'labels',
                  action = 'extend', metavar = '<string LIST>',
                  help = '(list of) new column labels')
parser.add_option('-f','--formula',
                  dest = 'formulas',
                  action = 'extend', metavar = '<string LIST>',
                  help = '(list of) formulas corresponding to labels')

parser.add_option('-c','--condition',
                  dest   = 'condition', metavar='string',
                  help   = 'condition to filter rows')

parser.set_defaults(condition = None,
                   )

(options,filenames) = parser.parse_args()

if options.labels is None or options.formulas is None:
  parser.error('no formulas and/or labels specified.')
if len(options.labels) != len(options.formulas):
  parser.error('number of labels ({}) and formulas ({}) do not match.'.format(len(options.labels),len(options.formulas)))

for i in xrange(len(options.formulas)):
  options.formulas[i] = options.formulas[i].replace(';',',')

# --- loop over input files -------------------------------------------------------------------------

if filenames == []: filenames = [None]

for name in filenames:
  try:
    table = damask.ASCIItable(name = name,
                              buffered = False)
    output = damask.ASCIItable(name = name,
                               buffered = False)
  except Exception:                                                                                   # skip files that cannot be opened
    continue
  damask.util.report(scriptName,name)

# ------------------------------------------ read header -------------------------------------------  

  table.head_read()

# -----------------------------------------------------------------------------------------------------
  specials = {
               '_row_': 0,
             }

# ------------------------------------------ Evaluate condition ---------------------------------------
  if options.condition:  
    interpolator = []
    condition = options.condition                                                                     # copy per file, since might be altered inline
    breaker = False
  
    for position,operand in enumerate(set(re.findall(r'#(([s]#)?(.+?))#',condition))):                # find three groups
      condition = condition.replace('#'+operand[0]+'#',
                                    {  '': '{%i}'%position,
                                     's#':'"{%i}"'%position}[operand[1]])
      if operand[2] in specials:                                                                      # special label 
        interpolator += ['specials["%s"]'%operand[2]]
      else:
        try:
          interpolator += ['%s(table.data[%i])'%({  '':'float',
                                                  's#':'str'}[operand[1]],
                                                 table.label_index(operand[2]))]                     # could be generalized to indexrange as array lookup
        except:
          damask.util.croak('column "{}" not found.'.format(operand[2]))
          breaker = True
        
    if breaker: continue                                                                              # found mistake in condition evaluation --> next file
  
    evaluator_condition = "'" + condition + "'.format(" + ','.join(interpolator) + ")"

  else: condition = ''

# ------------------------------------------ build formulae ----------------------------------------

  evaluator = {}
  
  for label,formula in zip(options.labels,options.formulas):
    for column in re.findall(r'#(.+?)#',formula):                                                   # loop over column labels in formula
      idx = table.label_index(column)
      dim = table.label_dimension(column)
      if column in specials:
        replacement = 'specials["{}"]'.format(column)
      elif dim == 1:                                                                                # scalar input
        replacement = 'float(table.data[{}])'.format(idx)                                           # take float value of data column
      elif dim > 1:                                                                                 # multidimensional input (vector, tensor, etc.)
        replacement = 'np.array(table.data[{}:{}],dtype=float)'.format(idx,idx+dim)                 # use (flat) array representation
      else:
        damask.util.croak('column {} not found, skipping {}...'.format(column,label))
        options.labels.remove(label)
        break

      formula = formula.replace('#'+column+'#',replacement)

    evaluator[label] = formula

    
# ------------------------------------------ process data ------------------------------------------

  firstLine   = True
  outputAlive = True

  while outputAlive and table.data_read():                                                          # read next data line of ASCII table
    specials['_row_'] += 1                                                                          # count row
    output.data_clear()
    
# ------------------------------------------ calculate one result to get length of labels  ---------

    if firstLine:
      firstLine = False
      labelDim  = {}
      for label in list(options.labels):                                                            # iterate over a copy, since labels may be removed
        labelDim[label] = np.size(eval(evaluator[label]))
        if labelDim[label] == 0: options.labels.remove(label)

# ------------------------------------------ assemble header ---------------------------------------

      output.labels_clear()
      tabLabels = table.labels()
      for label in tabLabels:
        dim = labelDim[label] if label in options.labels \
                              else table.label_dimension(label)
        output.labels_append(['{}_{}'.format(i+1,label) for i in xrange(dim)] if dim > 1 else label)

      for label in options.labels:
        if label in tabLabels: continue
        output.labels_append(['{}_{}'.format(i+1,label) for i in xrange(labelDim[label])]
                             if labelDim[label] > 1
                             else label)

      output.info = table.info
      output.info_append(scriptID + '\t' + ' '.join(sys.argv[1:]))
      output.head_write()

# ------------------------------------------ process data ------------------------------------------

    for label in output.labels():
      oldIndices = table.label_indexrange(label)
      Nold = max(1,len(oldIndices))                                                                  # Nold could be zero for new columns
      Nnew = len(output.label_indexrange(label))
      output.data_append(eval(evaluator[label]) if label in options.labels and
                                                   (condition == '' or eval(eval(evaluator_condition)))
                     else np.tile([table.data[i] for i in oldIndices]
                                  if label in tabLabels
                                  else np.nan,
                                  int(np.ceil(float(Nnew)/Nold)))[:Nnew])                            # spread formula result into given number of columns (np.tile needs integer reps)

    outputAlive = output.data_write()                                                                # output processed line

# ------------------------------------------ output finalization -----------------------------------

  table.input_close()                                                                                # close ASCII tables
  output.close()                                                                                     # close ASCII tables