Docs: ERPSS System Tour - Page 3

Collecting Data

Although provisions have been made in the design of the ERP software system for collecting many types of raw data, including "bump" data, at the present time only channel-multiplexed, raw data files can be created and processed using ERPSS programs. Hence, for now this section focuses on those data and the considerations involved in their collection.

General

The general theme of data collection using ERPSS programs is to defer as much processing as possible to subsequent programs that need not operate in "real time", and to record as much experimental information as possible during the actual digitization and collection. This approach allows the maximum rate of sampling and the most processing flexibility, at the cost of somewhat more storage space and more complex subsequent processing steps.

In a similar vein, the machine used to digitize data is not also used to present stimuli or perform other tasks not directly related to data acquisition. These tasks are performed by separate machines, again maximizing performance and simplifying programming. Thus, the laboratory configuration used with ERPSS data collection includes at least one other machine to present stimuli. Additional machines may be needed to assist in performing the specific experiment, and the various machines are connected in a specific manner.

The following diagram specifies the general ERPSS laboratory setup. Arrows represent dataflow of one kind or another between objects.

All experimental events are transmitted to the Acquisition machine by way of 8-bit digital codes. These codes are usually provided by the stimulus device or computer. Provisions are included to allow additional equipment to send codes indicating amplifier blocking, eye movements, or any other relevant experimental events to the Acquisition machine. All of these inputs, along with internally generated codes, are recorded in both the log and raw files during data collection.

Monitoring the performance of the subject is an option of the Acquisition program. Such performance monitoring allows one to verify that the subject is performing the intended task(s) and that the stimulus presentation equipment is working properly. The behavioral monitor runs side-by-side with the rest of the Acquisition processes to collect and analyze incoming data as it happens.

Acquisition Configuration Prerequisites

Acquisition has a few important configuration parameters that the experimenter must understand thoroughly before attempting to use the program. First, one must sample fast enough to capture the relevant experimental data. Mid-latency responses require higher sampling rates than P300 responses. Second, while one can digitally decimate the data after collection to reduce the effective sampling rate, it is not possible to reconstruct information that was not recorded; that is, one cannot increase the sampling rate after the fact.

This point is stated mathematically by Nyquist's theorem: the sampling rate must be more than twice the highest frequency component present in the data to prevent loss or distortion of information via aliasing. Since this is the theoretical limit and assumes perfect subsequent interpolation of the sampled data (which is impossible in practice), a good rule of thumb is to digitize faster than this. For example, if the amplifiers are set to pass only frequencies below 100 Hz, one might sample at 250 Hz or more.

However, since much of the high-frequency information in EEG is noise (at least in comparison to the ERP components of interest), and aliased noise is still noise that will average out, it is possible to cheat a little here; one is usually more concerned with the frequency content of the ERP waveforms to be recorded than with the bandpass of the amplifier. Nonetheless, one will obtain the cleanest and most veridical data by obeying Nyquist's theorem. For more information on sampling and digital filtering in general, refer to idf (E1).
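To make the aliasing argument concrete, the following is a minimal Python sketch and not part of ERPSS. The function names and the default margin of 2.5 are assumptions chosen only to match the 100 Hz to 250 Hz example above. It shows where a pure sinusoid would appear after sampling at a given rate.

```python
def min_sampling_rate(highest_hz: float, margin: float = 2.5) -> float:
    """Suggest a sampling rate for a signal band-limited to highest_hz.

    The margin must exceed 2.0 (the Nyquist limit). The default of 2.5 is
    an assumed rule of thumb, not an ERPSS setting.
    """
    if margin <= 2.0:
        raise ValueError("margin must be greater than 2.0")
    return highest_hz * margin


def aliased_frequency(signal_hz: float, sampling_rate_hz: float) -> float:
    """Return the apparent frequency of a pure sinusoid after sampling.

    Frequencies above half the sampling rate fold back into the range
    0 .. sampling_rate_hz / 2.
    """
    folded = signal_hz % sampling_rate_hz
    return min(folded, sampling_rate_hz - folded)


if __name__ == "__main__":
    print(min_sampling_rate(100.0))          # 250.0
    print(aliased_frequency(60.0, 250.0))    # 60.0 (below 125 Hz, unchanged)
    print(aliased_frequency(180.0, 250.0))   # 70.0 (folded back into band)
```

A 180 Hz component sampled at 250 Hz is indistinguishable from a 70 Hz component once recorded, which is why the amplifier low-pass setting and the sampling rate need to be chosen together.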

These are salient considerations when selecting a sampling rate, but there are other pragmatic considerations. Why not just sample as fast as possible and retain everything? The tradeoff is that the faster the sampling rate, the more raw data must be stored and subsequently processed. In addition, when the data are averaged, more sample points will need to be retained to cover a specific period of time. On the other hand, one must keep in mind that the sampling rate must be high enough to capture the events of interest and allow ease of subsequent measurement and processing.
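The storage side of this tradeoff can be estimated with simple arithmetic. The sketch below is illustrative only: the 2 bytes per sample, the 32 channels, and the one-hour session are assumptions, and the estimate ignores any file headers or event codes that a real raw file would also contain.

```python
def raw_data_bytes(sampling_rate_hz: float,
                   n_channels: int,
                   duration_s: float,
                   bytes_per_sample: int = 2) -> int:
    """Estimate the size of the sampled EEG data alone, in bytes.

    bytes_per_sample = 2 assumes 16-bit samples; adjust to the actual
    digitizer. Header and event-code overhead is not included.
    """
    return int(sampling_rate_hz * n_channels * duration_s * bytes_per_sample)


if __name__ == "__main__":
    one_hour = 3600
    print(raw_data_bytes(250, 32, one_hour))   # 57600000
    print(raw_data_bytes(500, 32, one_hour))   # 115200000
```

Doubling the sampling rate doubles the storage requirement and the amount of data every later processing step must read.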

As an example, let's consider recording an auditory N1-P2, which somewhat resembles a single cycle of a 10 Hz sine wave. While it's true that much of the energy of the waveform is at 10 Hz, it's also true that substantial energy at frequencies up to 100 Hz is needed to localize the waveform in time and clearly define small details of the waveshape (if desired, one can use plotidf (E1) to evaluate the spectrum of an ERP waveform). Hence, we will want to low-pass filter the EEG at 100 Hz and sample at 250 Hz. This corresponds to a sample period of 4.0 milliseconds; that is, successive samples will be separated by 4.0 msec. If one were to retain 256 points for averaged data epochs, each epoch would be 1024 msec long.
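The arithmetic in this example can be checked with a few lines of Python. The function names below are hypothetical helpers written for illustration, not ERPSS routines.

```python
import math


def sample_period_ms(sampling_rate_hz: float) -> float:
    """Time between successive samples, in milliseconds."""
    return 1000.0 / sampling_rate_hz


def epoch_length_ms(n_points: int, sampling_rate_hz: float) -> float:
    """Duration covered by n_points samples, in milliseconds."""
    return n_points * sample_period_ms(sampling_rate_hz)


def points_for_epoch(epoch_ms: float, sampling_rate_hz: float) -> int:
    """Smallest number of points that covers an epoch of epoch_ms."""
    return math.ceil(epoch_ms * sampling_rate_hz / 1000.0)


if __name__ == "__main__":
    print(sample_period_ms(250.0))        # 4.0
    print(epoch_length_ms(256, 250.0))    # 1024.0
    print(points_for_epoch(1024, 250.0))  # 256
```

The same helpers show the cost of a faster rate: at 500 Hz, covering the same 1024 msec would take 512 points per averaged epoch instead of 256.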

Digitize

Once one has properly configured the Acquisition PC, the software is used to record data and generate the log and raw files. This program is quite installation-specific, and must be tailored to the idiosyncratic hardware on a particular machine. Hence, the details of its use should be gleaned from the local manual page.

Data are obtained from many different places in the digitization process. There are stimulus codes being generated by the stimulus presentation machine, there are response codes being generated by the subject, and there are EEG data coming from the amplifiers. All of this input is eventually fed into the digitizer, where it is "time stamped" and placed in the corresponding log and raw files.
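One way to picture a time-stamped event is as an 8-bit code paired with the index of the sample at which it arrived. The sketch below is purely illustrative: the TimedEvent class and the use of a sample index as the time stamp are assumptions, and the actual ERPSS log and raw file layouts are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedEvent:
    """Hypothetical record: an 8-bit event code tagged with a sample index."""
    sample_index: int
    code: int

    def __post_init__(self) -> None:
        # An 8-bit code can only take values 0 through 255.
        if not 0 <= self.code <= 255:
            raise ValueError("event code must fit in 8 bits (0-255)")
        if self.sample_index < 0:
            raise ValueError("sample_index must be non-negative")

    def time_ms(self, sampling_rate_hz: float) -> float:
        """Convert the sample index to milliseconds since recording began."""
        return self.sample_index * 1000.0 / sampling_rate_hz


if __name__ == "__main__":
    event = TimedEvent(sample_index=250, code=17)
    print(event.time_ms(250.0))  # 1000.0
```

Tagging events by sample index ties every stimulus and response code to the EEG stream on a common clock, whatever device produced the code.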

This diagram graphically depicts the flow of data in the PC digitization process. Any kind of stimuli may be used, as long as stimulus codes can be generated "in sync" with stimulus events.

Note that during the actual digitization, the log file that is created is not in its final form, as segments of the log file that are deleted are only marked at the ends of the offending segment. When digitization is complete but before Acquisition exits, the log file is "cooked" to mark each deleted entry appropriately.
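The "cooking" step can be sketched as follows. This is an assumption-laden illustration, not the Acquisition program's actual code or file format: here the deleted segments are represented as hypothetical (first, last) index pairs noted during digitization, and cooking simply flags every entry inside each pair.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log entry: an event code plus a deleted flag."""
    code: int
    deleted: bool = False


def cook_log(entries: Sequence[LogEntry],
             deleted_segments: Sequence[Tuple[int, int]]) -> List[LogEntry]:
    """Return a copy of entries with every entry in each segment flagged.

    Each segment is an inclusive (first, last) pair of entry indices,
    standing in for the end-of-segment marks made during digitization.
    """
    cooked = list(entries)
    for first, last in deleted_segments:
        if not 0 <= first <= last < len(cooked):
            raise ValueError(f"bad segment ({first}, {last})")
        for i in range(first, last + 1):
            cooked[i] = replace(cooked[i], deleted=True)
    return cooked


if __name__ == "__main__":
    log = [LogEntry(code) for code in (1, 2, 3, 4, 5)]
    for entry in cook_log(log, [(1, 3)]):
        print(entry)
```

Marking only the segment ends during recording keeps the real-time work small; the per-entry pass is deferred until digitization has finished, in keeping with the general approach described earlier.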

One note is in order. It is not necessary to record all data from an experiment in one set of raw and log files; the averaging program allows one to average together as many separate sets of files as necessary. However, this will entail creating a separate binlist file for each log and raw pair. Good luck.

Recovering Lost Data

Oops. The log file was accidentally deleted! What to do? This common occurrence can be remedied with the program getlog (E1). It takes a raw file and reconstructs the corresponding log file.