DAMASK_EICMD/processing/post/groupTable.py

#!/usr/bin/env python2.7
# -*- coding: UTF-8 no BOM -*-

import os,sys
import math                                                                                       # noqa
import numpy as np
from optparse import OptionParser
import damask

scriptName = os.path.splitext(os.path.basename(__file__))[0]
scriptID   = ' '.join([scriptName,damask.version])

# --------------------------------------------------------------------
#                                MAIN
# --------------------------------------------------------------------

parser = OptionParser(option_class=damask.extendableOption, usage='%prog options [file[s]]', description = """
Apply a user-specified function to condense all rows for which column 'label' has identical values into a single row.
Output table will contain as many rows as there are different (unique) values in the grouping column.

Examples:
For grain-averaged values, replace all rows sharing a particular 'texture' value with a single row containing their average.
""", version = scriptID)

parser.add_option('-l','--label',
                  dest = 'label',
                  type = 'string', metavar = 'string',
                  help = 'column label for grouping rows')
parser.add_option('-f','--function',
                  dest = 'function',
                  type = 'string', metavar = 'string',
                  help = 'mapping function [%default]')
parser.add_option('-a','--all',
                  dest = 'all',
                  action = 'store_true',
                  help = 'apply mapping function also to grouping column')

parser.set_defaults(function = 'np.average')

(options,filenames) = parser.parse_args()

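try:
  funcModule,funcName = options.function.rsplit('.',1)                                            # split 'module.function' at the last dot
  mapFunction = getattr(locals().get(funcModule) or
                        globals().get(funcModule) or
                        __import__(funcModule),
                        funcName)
except Exception:                                                                                 # malformed name, unknown module, or missing attribute
  mapFunction = None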

if options.label is None:
  parser.error('no grouping column specified.')
if not callable(mapFunction):
  parser.error('function "{}" is not callable.'.format(options.function))


# --- loop over input files -------------------------------------------------------------------------

if filenames == []: filenames = [None]

for name in filenames:
  try:    table = damask.ASCIItable(name = name,
                                    buffered = False)
  except: continue
  damask.util.report(scriptName,name)

# ------------------------------------------ sanity checks ---------------------------------------  

  table.head_read()
  if table.label_dimension(options.label) != 1:
    damask.util.croak('column {} is not of scalar dimension.'.format(options.label))
    table.close(dismiss = True)                                                                     # close ASCIItable and remove empty file
    continue
  else:
    grpColumn = table.label_index(options.label)

# ------------------------------------------ assemble info ---------------------------------------  

  table.info_append(scriptID + '\t' + ' '.join(sys.argv[1:]))
  table.head_write()

# ------------------------------------------ process data -------------------------------- 

  table.data_readArray()
  rows,cols  = table.data.shape

  table.data = table.data[np.lexsort([table.data[:,grpColumn]])]                                  # sort data by grpColumn
  
  values,index = np.unique(table.data[:,grpColumn], return_index = True)                          # unique grpColumn values and their positions
  index = np.append(index,rows)                                                                   # add termination position
  grpTable = np.empty((len(values), cols))                                                        # initialize output
  
  for i in xrange(len(values)):                                                                   # iterate over groups (unique values in grpColumn)
    grpTable[i] = np.apply_along_axis(mapFunction,0,table.data[index[i]:index[i+1]])              # apply mapping function
    if not options.all: grpTable[i,grpColumn] = table.data[index[i],grpColumn]                    # restore grouping column value
  
  table.data = grpTable

# ------------------------------------------ output result -------------------------------  

  table.data_writeArray()
  table.close()                                                                                     # close ASCII table
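
The grouping step above can be tried without the damask package. The following is a minimal sketch using plain NumPy only; the function name `group_rows`, its parameters, and the sample table are hypothetical and not part of the script. It follows the same approach (sort by the grouping column, find the start of each group with `np.unique`, apply the mapping function column-wise), but assumes the data is already a 2D float array.

```python
import numpy as np

def group_rows(data, grp_column, func=np.average, apply_to_all=False):
    """Condense all rows sharing a value in grp_column into a single row (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # stable sort so that the rows of each group become contiguous
    data = data[np.argsort(data[:, grp_column], kind='mergesort')]
    # unique group values and the position of their first row in the sorted data
    values, index = np.unique(data[:, grp_column], return_index=True)
    index = np.append(index, data.shape[0])                 # add termination position
    out = np.empty((len(values), data.shape[1]))
    for i in range(len(values)):
        block = data[index[i]:index[i + 1]]                 # all rows of group i
        out[i] = np.apply_along_axis(func, 0, block)        # apply function to each column
        if not apply_to_all:
            out[i, grp_column] = values[i]                  # restore grouping column value
    return out

# sample table (made up): column 0 is the grouping column, column 1 holds values
table = np.array([[1.0, 10.0],
                  [2.0, 30.0],
                  [1.0, 20.0],
                  [2.0, 50.0]])
result = group_rows(table, grp_column=0)
print(result)
```

With the default `np.average`, the four sample rows collapse to two rows, one per unique value in column 0. Passing `apply_to_all=True` mirrors the `--all` option, so the mapping function is also applied to the grouping column itself.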