General Syntax

FITSIO Home
次へ: Bit Masks 上へ: Row Filtering Specification 戻る: Row Filtering Specification 目次

General Syntax

The row filtering expression can be an arbitrarily complex series of operations performed on constants, keyword values, and column data taken from the specified FITS TABLE extension. The expression must evaluate to a boolean value for each row of the table, where a value of FALSE means that the row will be excluded.

For complex or commonly used filters, one can place the expression into a text file and import it into the row filter using the syntax '[@filename.txt]'. The expression can be arbitrarily complex and extend over multiple lines of the file. Any lines in the external text file that begin with 2 slash characters ('//') will be ignored and may be used to add comments into the file.

Keyword and column data are referenced by name. Any string of characters not surrounded by quotes (ie, a constant string) or followed by an open parentheses (ie, a function name) will be initially interpreted as a column name and its contents for the current row inserted into the expression. If no such column exists, a keyword of that name will be searched for and its value used, if found. To force the name to be interpreted as a keyword (in case there is both a column and keyword with the same name), precede the keyword name with a single pound sign, '#', as in '#NAXIS2'. Due to the generalities of FITS column and keyword names, if the column or keyword name contains a space or a character which might appear as an arithmetic term then enclose the name in '$' characters as in $MAX PHA$ or #$MAX-PHA$. Names are case insensitive.

To access a table entry in a row other than the current one, follow the column's name with a row offset within curly braces. For example, 'PHA{-3}' will evaluate to the value of column PHA, 3 rows above the row currently being processed. One cannot specify an absolute row number, only a relative offset. Rows that fall outside the table will be treated as undefined, or NULLs.

Boolean operators can be used in the expression in either their Fortran or C forms. The following boolean operators are available:

    "equal"         .eq. .EQ. ==  "not equal"          .ne.  .NE.  !=
    "less than"     .lt. .LT. <   "less than/equal"    .le.  .LE.  <= =<
    "greater than"  .gt. .GT. >   "greater than/equal" .ge.  .GE.  >= =>
    "or"            .or. .OR. ||  "and"                .and. .AND. &&
    "negation"     .not. .NOT. !  "approx. equal(1e-7)"  ~

Note that the exclamation point, '!', is a special UNIX character, so if it is used on the command line rather than entered at a task prompt, it must be preceded by a backslash to force the UNIX shell to ignore it.

The expression may also include arithmetic operators and functions. Trigonometric functions use radians, not degrees. The following arithmetic operators and functions can be used in the expression (function names are case insensitive). A null value will be returned in case of illegal operations such as divide by zero, sqrt(negative) log(negative), log10(negative), arccos(.gt. 1), arcsin(.gt. 1).

    "addition"           +          "subtraction"          -
    "multiplication"     *          "division"             /
    "negation"           -          "exponentiation"       **   ^
    "absolute value"     abs(x)     "cosine"               cos(x)
    "sine"               sin(x)     "tangent"              tan(x)
    "arc cosine"         arccos(x)  "arc sine"             arcsin(x)
    "arc tangent"        arctan(x)  "arc tangent"          arctan2(y,x)
    "hyperbolic cos"     cosh(x)    "hyperbolic sin"       sinh(x)
    "hyperbolic tan"     tanh(x)    "round to nearest int" round(x)
    "round down to int"  floor(x)   "round up to int"      ceil(x)
    "exponential"        exp(x)     "square root"          sqrt(x)
    "natural log"        log(x)     "common log"           log10(x)
    "modulus"            x % y      "random # [0.0,1.0)"   random()
    "random Gaussian"    randomn()  "random Poisson"       randomp(x)
    "minimum"            min(x,y)   "maximum"              max(x,y)
    "cumulative sum"     accum(x)  "sequential difference" seqdiff(x)
    "if-then-else"       b?x:y
    "angular separation"  angsep(ra1,dec1,ra2,de2) (all in degrees)
    "substring"      strmid(s,p,n) "string search"         strstr(s,r)

Three different random number functions are provided: random(), with no arguments, produces a uniform random deviate between 0 and 1; randomn(), also with no arguments, produces a normal (Gaussian) random deviate with zero mean and unit standard deviation; randomp(x) produces a Poisson random deviate whose expected number of counts is X. X may be any positive real number of expected counts, including fractional values, but the return value is an integer.

When the random functions are used in a vector expression, by default the same random value will be used when evaluating each element of the vector. If different random numbers are desired, then the name of a vector column should be supplied as the single argument to the random function (e.g., "flux + 0.1 * random(flux)", where "flux' is the name of a vector column). This will create a vector of random numbers that will be used in sequence when evaluating each element of the vector expression.

An alternate syntax for the min and max functions has only a single argument which should be a vector value (see below). The result will be the minimum/maximum element contained within the vector.

The accum(x) function forms the cumulative sum of x, element by element. Vector columns are supported simply by performing the summation process through all the values. Null values are treated as 0. The seqdiff(x) function forms the sequential difference of x, element by element. The first value of seqdiff is the first value of x. A single null value in x causes a pair of nulls in the output. The seqdiff and accum functions are functional inverses, i.e., seqdiff(accum(x)) == x as long as no null values are present.

In the if-then-else expression, "b?x:y", b is an explicit boolean value or expression. There is no automatic type conversion from numeric to boolean values, so one needs to use "iVal!=0" instead of merely "iVal" as the boolean argument. x and y can be any scalar data type (including string).

The angsep function computes the angular separation in degrees between 2 celestial positions, where the first 2 parameters give the RA-like and Dec-like coordinates (in decimal degrees) of the first position, and the 3rd and 4th parameters give the coordinates of the second position.

The substring function strmid(S,P,N) extracts a substring from S, starting at string position P, with a substring length N. The first character position in S is labeled as 1. If P is 0, or refers to a position beyond the end of S, then the extracted substring will be NULL. S, P, and N may be functions of other columns.

The string search function strstr(S,R) searches for the first occurrence of the substring R in S. The result is an integer, indicating the character position of the first match (where 1 is the first character position of S). If no match is found, then strstr() returns a NULL value.

The following type casting operators are available, where the enclosing parentheses are required and taken from the C language usage. Also, the integer to real casts values to double precision:

                "real to integer"    (int) x     (INT) x
                "integer to real"    (float) i   (FLOAT) i

In addition, several constants are built in for use in numerical expressions:

        #pi              3.1415...      #e             2.7182...
        #deg             #pi/180        #row           current row number
        #null         undefined value   #snull         undefined string

A string constant must be enclosed in quotes as in 'Crab'. The "null" constants are useful for conditionally setting table values to a NULL, or undefined, value (eg., "col1==-99 ? #NULL : col1").

There is also a function for testing if two values are close to each other, i.e., if they are "near" each other to within a user specified tolerance. The arguments, value_1 and value_2 can be integer or real and represent the two values who's proximity is being tested to be within the specified tolerance, also an integer or real:

                    near(value_1, value_2, tolerance)

When a NULL, or undefined, value is encountered in the FITS table, the expression will evaluate to NULL unless the undefined value is not actually required for evaluation, e.g. "TRUE .or. NULL" evaluates to TRUE. The following two functions allow some NULL detection and handling:

         "a null value?"              ISNULL(x)
         "define a value for null"    DEFNULL(x,y)

The former returns a boolean value of TRUE if the argument x is NULL. The later "defines" a value to be substituted for NULL values; it returns the value of x if x is not NULL, otherwise it returns the value of y.

FITSIO Home
次へ: Bit Masks 上へ: Row Filtering Specification 戻る: Row Filtering Specification 目次

HAMABE Masaru 平成22年9月17日