Docs: ERPSS System Tour - Page 4

Averaging Data

Raw ERPSS data consist of a continuous record of the EEG during the course of an experiment (excluding "pause" periods, such as breaks between trial blocks). In addition, the log file summarizes the events and their points of occurrence in the raw data file. Averaging data thus requires that discrete segments (termed epochs) of the continuous data be extracted from the continuous record, sorted into different categories, and averaged together. The ERPSS approach to this step of analysis is to employ an additional file, termed a binlist file, to specify the assignment of each raw trial to various averaging bins; it thus provides the sorting information needed to create averages of different event categories. Averaging then uses the log, raw, and binlist files as inputs and generates the averaged data file as an output. This pre-sorting of the raw data trials allows one to average according to any scheme and do it with only one pass through the raw file. In addition, one can employ multiple binlist files and form multiple types of averages, with different effective sampling rates, epoch lengths, etc., although this will require multiple passes through the raw file.
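The extract, sort, and average sequence described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how avgerps is implemented: the function name average_epochs, the single-channel list of samples, and the representation of the bin assignments as a list of bin numbers per event are all assumptions made for the example, and they do not reflect the actual raw, log, or binlist file formats.

```python
from collections import defaultdict

def average_epochs(raw, events, bins_per_event, pre, post):
    """Average fixed-length epochs of one channel, sorted by bin.

    raw            -- list of samples (stand-in for the continuous record)
    events         -- list of (sample_index, event_code), as a log might give
    bins_per_event -- list of bin-number lists, parallel to `events`
                      (hypothetical stand-in for binlist information)
    pre, post      -- samples to keep before and after each event
    """
    length = pre + post
    sums = defaultdict(lambda: [0.0] * length)
    counts = defaultdict(int)
    # Single pass: each epoch is cut once and added to every bin it belongs to.
    for (index, _code), bins in zip(events, bins_per_event):
        start, stop = index - pre, index + post
        if start < 0 or stop > len(raw):
            continue  # epoch would run off the end of the record
        epoch = raw[start:stop]
        for b in bins:
            sums[b] = [s + x for s, x in zip(sums[b], epoch)]
            counts[b] += 1
    averages = {b: [s / counts[b] for s in sums[b]] for b in sums}
    return averages, dict(counts)

raw = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
events = [(2, 10), (6, 20), (8, 10)]
bins_per_event = [[0], [1], [0, 1]]   # the third trial goes into both bins
averages, counts = average_epochs(raw, events, bins_per_event, pre=1, post=2)
print(averages)   # {0: [4.0, 5.0, 6.0], 1: [6.0, 7.0, 8.0]}
```

Because the bin assignments are fixed before averaging starts, any sorting scheme can be served by the same single loop over the record.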

Although one can average data using a minimum of a raw file, a log file, and a binlist file, in many cases one will wish to reject artifactual data trials. This is accomplished by employing one or more artifact rejection test (termed art) files during averaging (see arf (E4)). Artifact rejection is a tricky business; the ultimate word on artifact rejection goes like this:

There is no substitute for good data - it is folly to believe that
artifact rejection is going to convert bad data into good. Artifact 
rejection is most effective when the data are generally good and 
there are a few obviously contaminated trials; if every trial is 
more or less "dirty", artifact rejection may actually reduce the 
quality of the resulting average.

The following diagram gives an overview of the processing sequence of the acquired data files (i.e., log, raw, etc.) after they have been collected and transferred to the UNIX system. We'll cover most of the individual steps during the rest of this tutorial, but it might be helpful to see the "whole picture" beforehand so that you may better understand what role each step plays in the processing of ERPSS data.

Binlist Files

Binlist files have a specific format that is described in binlist (E4). They are ASCII files that can be generated or modified with any standard text editor, although this can be tedious; automated generation of binlist files is the norm. The major portion of a binlist file consists of a sequential list of the events that occurred and, for each event, the bins into which the corresponding trial should be averaged when it is encountered in the raw file. In addition to this sequential list of raw/log items and their corresponding averaging bins, the binlist file specifies the total number of bins that will be needed, a description of each bin, and, optionally, various averaging parameters on a bin-by-bin basis. Hence, the binlist file is the main means of telling the averaging process which types of averages to produce and which parameters to use for them.

Creation of a binlist file employs the log file and some description of which events should be averaged together. This can be done by hand, using the log file as a guide to which events occurred when. Most frequently, however, one employs another program that processes the log file, together with a description of how events should be sorted, to automate the generation of the binlist file. Currently, there is a specific program available for this purpose: ecdbl (E1). Ecdbl is an ERPSS version of the well-known (and loved?) cdbl program from the CD system. It will be briefly described below.

Artifact Rejection

Artifact rejection for ERPSS averaging employs artifact rejection test (art) files. These art files are ASCII text files that are generated using an editor, and consist of a series of tests that will be applied to each raw trial after it is extracted from the continuous record but before it is averaged into the bin specified in the binlist file. If a trial fails any of the tests specified in the art file, it is not averaged into the corresponding bin. Currently it is not possible to reject only certain channels of data - if a trial is rejected on any basis, the data are excluded for all channels. Below is a sample from an art file:


    #       Thresh  Func     Slot   Name    Chan    Low     High    Arg
            250     ppa         2   eyemove 00      -500    990
            250     ppa         3   blink   01      -500    990
            55      ahiwpts     4   block   00      -500    990     10
            55      alowpts     4   block   00      -500    990     10
            55      ahiwpts     4   block   01      -500    990     10
            55      alowpts     4   block   01      -500    990     10
            55      ahiwpts     4   block   02      -500    990     10
            55      alowpts     4   block   02      -500    990     10
            55      ahiwpts     4   block   03      -500    990     10

Artifact rejection tests do not really detect artifacts; instead they calculate various signal parameters for the specific trial. Thus, one may attempt to detect eye blinks by assessing peak-to-peak amplitudes, or amplifier blocking by stretches of extraordinarily flat data. However, it should be noted that this is really a problem of signal detection, and it is difficult to reject every artifactual trial and also accept every good trial. Each artifact rejection test has a name and requires a number of mandatory parameters, such as the latencies over which to calculate the signal parameters, the channel of application, etc. Some tests require additional optional arguments. In any case, each test calculates a scalar value which is compared to a threshold: if the calculated value is larger than the threshold, the trial is rejected on the basis of that test, otherwise it is not. Only trials that pass all tests are averaged, and application of tests ceases as soon as the data are rejected by a particular test.
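The test logic just described (compute a scalar, compare it to a threshold, stop at the first failure) can be sketched as follows. This is a hedged illustration only: peak_to_peak and longest_flat_run are hypothetical stand-ins written for this example and are not the ERPSS test functions, whose exact definitions are given in arf (E4); the tuple layout of a test is likewise an assumption.

```python
def peak_to_peak(samples):
    """Difference between the largest and smallest sample."""
    return max(samples) - min(samples)

def longest_flat_run(samples, tolerance=0.0):
    """Length of the longest run in which successive samples differ by
    no more than `tolerance` (a crude flatness measure)."""
    if not samples:
        return 0
    best = run = 1
    for prev, cur in zip(samples, samples[1:]):
        run = run + 1 if abs(cur - prev) <= tolerance else 1
        best = max(best, run)
    return best

def first_failed_test(trial, tests):
    """Return the name of the first test the trial fails, or None.

    trial -- dict mapping channel number to a list of samples
    tests -- list of (name, channel, function, threshold) tuples
    """
    for name, channel, func, threshold in tests:
        if func(trial[channel]) > threshold:
            return name   # testing stops here; the whole trial is rejected
    return None

tests = [
    ("eyemove", 0, peak_to_peak, 250),
    ("block", 1, longest_flat_run, 3),
]
flat_trial = {0: [0, 10, -5, 3], 1: [2, 2, 2, 2, 5]}
clean_trial = {0: [0, 10, -5, 3], 1: [1, 2, 3, 4, 5]}
print(first_failed_test(flat_trial, tests))    # block
print(first_failed_test(clean_trial, tests))   # None
```

Note that a failure on any one channel removes the trial for all channels, which matches the all-or-nothing rejection described earlier.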

The thresholds are set by the experimenter by examining the raw data file using rawfile (E1) along with the results of applying the tests in an art file. One examines artifactual data and "good" data, noting the test results of each, and attempts to determine a set of thresholds which will separate the good from the bad. Since each raw data file contains data from a different subject, possibly recorded using different amplifier settings, the signal parameters that are used to detect artifacts can vary from data file to data file. Hence, it is usually necessary to individually tailor a separate art file to each raw data file. In addition, since one can request varying epoch lengths with varying decimation rates and/or filters for different bins, it is usually necessary to employ multiple art files, one for each epoch length, decimation rate/filter, and raw data file combination. Each of these must be calibrated separately using rawfile.
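The idea of choosing a threshold that separates good from bad trials can be made concrete with a small sketch. It assumes one has already noted, by inspection, the value a given test produced on a handful of trials judged good and a handful judged artifactual; the function suggest_threshold is hypothetical and is not part of rawfile or any other ERPSS program.

```python
def suggest_threshold(good_values, artifact_values):
    """Suggest a threshold halfway between the largest value seen on a
    good trial and the smallest value seen on an artifactual trial.

    Returns None when the two sets overlap, in which case no single
    threshold separates them and some misclassification is unavoidable.
    """
    highest_good = max(good_values)
    lowest_artifact = min(artifact_values)
    if highest_good >= lowest_artifact:
        return None
    return (highest_good + lowest_artifact) / 2

print(suggest_threshold([40, 85, 120], [310, 280, 450]))   # 200.0
print(suggest_threshold([40, 300], [280]))                 # None
```

The overlapping case is the signal detection problem mentioned above: when it occurs, the experimenter has to decide which kind of error is more tolerable.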

The format of art files is described in the manual page arf (E4), along with the issues and considerations involved in using them.

Sorting Trials - ecdbl

As mentioned, binlist files can be created by hand but are most frequently obtained as the output of an automated sorting program. Ecdbl (E1) is one such program. It uses an ASCII bin descriptor file, along with the log file, to generate a binlist file. The bin descriptor file contains specifications that are used to assign each event in the log file to bin(s) for averaging or raw data trial separation.

Bin descriptor files have a certain syntax and semantics that allow one to select bins for log events on the basis of the event code, sequences of event codes, temporal contingencies, and status of the flag bits. They can be difficult to use, however, and one is advised to verify that the resulting binlist file matches one's intentions, perhaps using logfile (E1) to compare the log and binlist files. For a more detailed description of bin descriptor files, see bdf (E4).
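To give a feel for the kind of sorting involved, here is a minimal sketch that assigns events to bins by event code, by the code of the preceding event, and by the time elapsed since that event. The rule dictionaries and the function assign_bins are invented for this illustration; they are not bin descriptor syntax (see bdf (E4) for that), and flag bits are left out entirely.

```python
def assign_bins(events, rules):
    """Return, for each event, the list of bins it should be averaged into.

    events -- list of (time_ms, event_code), in order of occurrence
    rules  -- list of dicts with keys 'bin' and 'code', plus optional
              'preceded_by' (code of the previous event) and
              'within_ms' (maximum delay since that previous event)
    """
    assignments = []
    for i, (time_ms, code) in enumerate(events):
        bins = []
        for rule in rules:
            if code != rule["code"]:
                continue
            prior = rule.get("preceded_by")
            if prior is not None:
                if i == 0:
                    continue   # nothing precedes the first event
                prev_time, prev_code = events[i - 1]
                if prev_code != prior:
                    continue
                if time_ms - prev_time > rule.get("within_ms", float("inf")):
                    continue
            bins.append(rule["bin"])
        assignments.append(bins)
    return assignments

events = [(0, 1), (500, 2), (2500, 2), (3000, 1), (3400, 2)]
rules = [
    {"bin": 0, "code": 2},                                       # every code 2
    {"bin": 1, "code": 2, "preceded_by": 1, "within_ms": 600},   # 2 soon after 1
]
print(assign_bins(events, rules))   # [[], [0, 1], [0], [], [0, 1]]
```

Printing the assignments next to the events, as done here, is a simple version of the verification step recommended above.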

Averaging - avgerps

Once one has the raw data file(s), log file(s), binlist file(s), and (possibly) artifact rejection test file(s) in hand, averaging itself is quite straightforward. The details on operating avgerps (E1) are well described in the manual entry; hopefully, the above outline of the general approach will assist in diagnosing problems and suggesting how a solution might be obtained.