Audio File Format Specifications

File Description: ESPS sampled data file
File Extension: Commonly .sd
File Byte Order: Little-endian or big-endian

Prof. Peter Kabal, MMSP Lab, ECE, McGill University: Last update: 2017-01-20

ESPS Audio File Specifications

Entropic Research Laboratories used a proprietary file format for its WAVES+ display program, ESPS signal processing library, and the HTK hidden Markov model speech recognition toolkit. WAVES+ and ESPS were withdrawn when Entropic was acquired by Microsoft in October 1999. HTK is now maintained and distributed free of charge by Cambridge University: HTK Hidden Markov Model Toolkit - speech recognition research toolkit. The current version of HTK accepts a number of file formats, including the Esignal format designated by Entropic to supersede the ESPS file format.

The ESPS file system is complicated by design. ESPS files are notorious for the length of their headers: the standard ESPS programs embed the header of the input file(s), along with the processing parameters, into the header of the output file to provide a complete history of the processing steps. From the Entropic document, The Esignal File Format, draft document 1995-08-24:

The self-description portion of a FEA file is stored in a header at the beginning of the file. There is no simple description of the format for such headers because their specification is essentially algorithmic - in effect, the format is defined as the representation that is read and written by the ESPS library functions read_header and write_header. For example, these functions include hash coding for the tables contained in the header. ...

The situation is complicated by licensing considerations; the structure of ESPS and the patterns of desirable ESPS program usage made it necessary to perform product license checks within read_header and write_header.

Nonetheless, it is possible to read sound data from ESPS files.

The Festival speech synthesis system from the Centre for Speech Technology Research at the University of Edinburgh includes routines for reading and writing ESPS files: Festival download.
The AFsp package can read ESPS sound files.
Man pages: esps.5, FEA.5, FEASD.5.

ESPS FEA-SD files (feature files with sampled data) contain a preamble and a fixed part of the header which in fact gives most of the information necessary to extract the data from the file.

Preamble

Offset	Length	Type	Contents
0	4	integer	Machine code
4	4	integer	Version check code: (`3000`)
8	4	integer	Data offset
12	4	integer	Record size
16	4	character	ESPS magic number: `0x00006A1A` for big-endian data, or `0x1A6A0000` for little-endian data
20	4	integer	EDR flag, 0 or non-zero
24	4	integer	Alignment pad
28	4	integer	Pointer to foreign header, or `-1`

The machine code indicates the type of machine the file was written on and thus implicitly the file byte order. The file byte order is primarily determined by the ESPS magic number (which appears again in the fixed part of the header). The EDR flag indicates that the data was written in a machine-independent format, which is big-endian.
The data offset is the offset in bytes from the beginning of the file to the start of data.
The record size is the size in bytes of a frame of data. Each frame consists of a set of values (one per channel in the case of audio signals).
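
As a minimal sketch of how the preamble might be decoded, the Python function below picks the byte order from the magic number at offset 16 and then unpacks the eight 4-byte fields. The function name `parse_preamble`, the dictionary keys, and the reading of the two magic values as the byte sequences `00 00 6A 1A` and `1A 6A 00 00` in file order are assumptions made for illustration; they do not come from the ESPS library.

```python
import struct

# Assumed byte sequences of the magic number as stored in the file.
MAGIC_BIG = bytes([0x00, 0x00, 0x6A, 0x1A])
MAGIC_LITTLE = bytes([0x1A, 0x6A, 0x00, 0x00])

# Illustrative names for the eight preamble fields, in table order.
PREAMBLE_FIELDS = (
    "machine_code", "version_check", "data_offset", "record_size",
    "magic", "edr_flag", "align_pad", "foreign_header",
)

def parse_preamble(buf):
    """Return (struct byte-order prefix, dict of preamble fields)."""
    if len(buf) < 32:
        raise ValueError("preamble needs at least 32 bytes")
    magic = bytes(buf[16:20])
    if magic == MAGIC_BIG:
        endian = ">"
    elif magic == MAGIC_LITTLE:
        endian = "<"
    else:
        raise ValueError("not an ESPS file (bad magic number)")
    # Eight consecutive 4-byte integers, offsets 0 to 31.
    values = struct.unpack_from(endian + "8i", buf, 0)
    return endian, dict(zip(PREAMBLE_FIELDS, values))
```

With a file opened in binary mode, `parse_preamble(f.read(32))` would give the byte-order prefix to use for all later `struct` calls, together with the data offset and record size.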

Fixed Part of the Header

Offset	Length	Type	Contents
32	2	integer	File type (`13` for an `FEA` file)
34	2	integer	-
36	4	character	ESPS magic number, `0x00006A1A` or reversal
40	26	character	Date, e.g. `Fri May 26 23:57:43 1995`
66	8	character	Header version, e.g. `1.85`
74	16	character	Program, e.g. `copysps`
90	8	character	Program version, e.g. `3.14`
98	26	character	Program date, e.g. `6/19/91`
124	4	integer	Number of data records (frames)
128	2	integer	Tag flag, `0` or non-zero
130	2	integer	-
132	4	integer	Number of 64-bit floats in a record
136	4	integer	Number of 32-bit floats in a record
140	4	integer	Number of 32-bit integers in a record
144	4	integer	Number of 16-bit integers in a record
148	4	integer	Number of characters in a record
152	4	integer	Length of the fixed part of the header (`40`)
156	4	integer	Total length of the header
160	8	character	User name: `sccsmas`
168	20	character	(always seen as zeros)

Each record of the data can consist of mixed data items, where the number of items of each type is given in the header. For audio files, normally all samples will be of the same data type, and so only one of the number fields will be non-zero.
The size of the record from the preamble should match the sum, over the data types, of the item size times the number of items of that type. If the data is tagged, this check will fail. However, sampled data is not tagged.
The number of data records is not always reliable: the reported value may be too high or too low.
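
The record size check, and one way to cope with an unreliable record count, can be sketched as follows. This is an illustration under stated assumptions: the names `record_layout` and `frames_from_file_size` are hypothetical, characters are taken to be one byte each, and estimating the frame count from the file size is a workaround suggested here, not a procedure taken from the ESPS documentation.

```python
import struct

def record_layout(header, endian=">"):
    """Read the record count and per-type item counts (offsets 124 to 151)."""
    (n_records,) = struct.unpack_from(endian + "i", header, 124)
    (tag_flag,) = struct.unpack_from(endian + "h", header, 128)
    n_double, n_float, n_long, n_short, n_char = struct.unpack_from(
        endian + "5i", header, 132)
    # Item size times item count, summed over the five data types.
    expected_size = (8 * n_double + 4 * n_float + 4 * n_long
                     + 2 * n_short + 1 * n_char)
    return {
        "n_records": n_records,
        "tag_flag": tag_flag,
        "counts": (n_double, n_float, n_long, n_short, n_char),
        "expected_record_size": expected_size,
    }

def frames_from_file_size(file_size, data_offset, record_size):
    """Estimate the frame count from the file length (assumed fallback)."""
    if record_size <= 0 or file_size < data_offset:
        raise ValueError("inconsistent sizes")
    return (file_size - data_offset) // record_size
```

If `expected_record_size` agrees with the record size from the preamble, the file is untagged and the frame count from `frames_from_file_size` can be compared with the stored number of data records.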

Feature File Header Information

The feature header has the following structure.

Offset	Length	Type	Contents
188	2	integer	Feature file type (`8` for a FEA-SD file)
190	2	integer	Labelled flag
192	2	integer	Number of fields

The next block of data starting at offset 194 contains unknown data. The data includes snippets of directory strings.
We get back on track at offset 232. The following 32 bytes seem to be zeros.
We are now at offset 264. If the number of fields is N, the next blocks of data are 4N, 4N, 2N, and 2N bytes long.
The next block (at offset 276 for N=1) contains the counts of each type of data in each record. For sampled data files, this repeats the information earlier in the fixed part of the header.
The next block (at offset 316 for N=1) contains N field names. Each field name is specified by a structure with a 4 byte count, followed by a character string. This string is "samples" for sampled data files.
After a further 6 zero bytes, FEA items start at offset 333 for N=1.
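
The offsets quoted for the single-field case can be cross-checked with a little arithmetic. The sketch below covers only N=1 and takes the offset 316 for the field-name block directly from the description above; the function name `feature_header_offsets` and the dictionary keys are made up for this illustration, and no claim is made about the layout for N greater than 1.

```python
def feature_header_offsets(field_name=b"samples"):
    """Offsets for the single-field (N = 1) case described in the text."""
    n = 1
    # The 32 zero bytes start at offset 232, so the per-field blocks follow.
    blocks_start = 232 + 32
    # Four blocks of 4N, 4N, 2N and 2N bytes precede the counts block.
    counts_start = blocks_start + (4 + 4 + 2 + 2) * n
    # Start of the field-name block, as quoted in the text for N = 1.
    names_start = 316
    # 4-byte count, the name string, then 6 zero bytes.
    items_start = names_start + 4 + len(field_name) + 6
    return {
        "blocks_start": blocks_start,
        "counts_start": counts_start,
        "names_start": names_start,
        "items_start": items_start,
    }
```

For the field name "samples" this reproduces the offsets 264, 276, and 333.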

FEA Items

The preamble and fixed part of the header give us most of the information needed. What is missing is the sampling frequency. This and other information is contained in Generic Items. Each Generic Item has the following form.

Offset	Length	Type	Contents
0	2	integer	Code: `13`
2	2	integer	Identifier length (number of 4 byte words)
4	n	character	Identifier string (null terminated, multiple of 4 bytes), e.g. `record_freq`
n+4	4	integer	Data count
n+8	2	integer	Data type: 64-bit float (`1`), 32-bit float (`2`), 32-bit integer (`3`), 16-bit integer (`4`)
n+10	-	-	Data

The Generic Item of most interest is the sampling frequency (record_freq), which contains one 64-bit float value. This Generic Item is required for sampled data files.
The Generic Item indicating the starting time for the data (start_time) contains one 64-bit float value. This Generic Item is required for sampled data files.
The Generic Item indicating the maximum data value (max_value) contains either one 64-bit float (for all channels) or one value per channel. The maximum value is an upper bound. This Generic Item is optional.
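
A Generic Item laid out as in the table above could be decoded along the following lines. This is a sketch, not the ESPS library's own reader: `parse_generic_item` and `TYPE_FORMATS` are hypothetical names, identifiers are assumed to be ASCII, and the returned end position assumes that no padding follows the data, which the table does not state.

```python
import struct

GENERIC_ITEM_CODE = 13
# Data type codes from the table mapped to struct format characters.
TYPE_FORMATS = {1: "d", 2: "f", 3: "i", 4: "h"}

def parse_generic_item(buf, pos=0, endian=">"):
    """Return (identifier, values, position just past the data)."""
    code, n_words = struct.unpack_from(endian + "hh", buf, pos)
    if code != GENERIC_ITEM_CODE:
        raise ValueError(f"not a Generic Item (code {code})")
    n = 4 * n_words                      # identifier length in bytes
    raw_name = bytes(buf[pos + 4:pos + 4 + n])
    name = raw_name.split(b"\0", 1)[0].decode("ascii")
    # Data count (4 bytes) and data type (2 bytes) follow the identifier.
    count, type_code = struct.unpack_from(endian + "ih", buf, pos + 4 + n)
    if type_code not in TYPE_FORMATS:
        raise ValueError(f"unsupported data type code {type_code}")
    fmt = endian + str(count) + TYPE_FORMATS[type_code]
    data_pos = pos + n + 10
    values = struct.unpack_from(fmt, buf, data_pos)
    return name, values, data_pos + struct.calcsize(fmt)
```

For a `record_freq` item, the identifier occupies 12 bytes (11 characters plus the terminating null, i.e. 3 words), and `values` holds the sampling frequency as a single float.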
Some of the other FEA items seen in ESPS files are as follows:
- Terminator, code 0, identifier length 0.
- Directory, code 1, same layout as the first part of a Generic Item.
- Typed text, code 2, same layout as the first part of a Generic Item.
- Embedded header, code 4, same layout as the first part of a Generic Item.
- Filter specification, code 8.
- Comment string, code 11, same layout as the first part of a Generic Item. Embedded newline characters separate lines.
- Generic Item, code 13 (as in the table above).
- Source file, code 15, same layout as the first part of a Generic Item.

Sample Files

addf8.sd (48 kB): ESPS file, big-endian byte order, 16-bit data, 8 kHz sampling rate.
speech12.sd (84 kB): ESPS file, big-endian byte order, 32-bit float data, 12 kHz sampling rate.