KIHT-Paper dataset

Description

This dataset comprises recordings from 40 adult writers, acquired on paper. Five recordings are reserved for the test set and come from writers who do not appear in the training data. In total, there are 2983 examples for training (training and validation splits combined) and 281 in the test set.

Every recording session generates files from the data acquisition mobile app. The sensor signals file has 13 columns: milliseconds, rear accelerometer (x, y, z), gyroscope (x, y, z), magnetometer (x, y, z), and force signals. Tablet signal files contain milliseconds, position coordinates (x, y, z), and pressure force signals. Other KIHT datasets contain an additional front accelerometer (x, y, z), which is not taken into account in the experiments when training on paper and tablet data jointly.
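As a rough illustration of the sensor file layout described above, the sketch below parses rows into named fields. The delimiter (comma), the absence of a header row, and all field names are assumptions made for this example; they are not documented here. Only the first ten columns are named after the listed signals, and any remaining columns are treated as force channels.

```python
import csv
import io

# Assumed names for the first ten columns, following the order listed above.
# The real files may use a different delimiter or include a header row.
LEADING_COLUMNS = [
    "ms",
    "acc_x", "acc_y", "acc_z",
    "gyr_x", "gyr_y", "gyr_z",
    "mag_x", "mag_y", "mag_z",
]


def read_sensor_rows(handle, delimiter=","):
    """Parse sensor rows into dicts of floats.

    Columns after the first ten are named force_0, force_1, ...
    """
    rows = []
    for record in csv.reader(handle, delimiter=delimiter):
        if not record:
            continue  # skip blank lines
        values = [float(v) for v in record]
        extra = len(values) - len(LEADING_COLUMNS)
        names = LEADING_COLUMNS + [f"force_{i}" for i in range(extra)]
        rows.append(dict(zip(names, values)))
    return rows


# Synthetic two-row sample with 13 columns (values are made up).
sample = (
    "0,0.1,0.2,9.8,0.0,0.0,0.0,30,-12,45,0.5,0.4,0.3\n"
    "10,0.1,0.3,9.7,0.1,0.0,0.0,31,-12,44,0.6,0.4,0.3\n"
)
rows = read_sensor_rows(io.StringIO(sample))
```

Passing an open file object instead of `io.StringIO` would work the same way, provided the real files match the assumed layout.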

The transcription (labels) file contains the labels and the start and stop timestamps for every sample. Additional files with the sensor calibration and recording metadata are also provided.
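The start and stop timestamps can be used to cut each labelled sample out of a recording. The following is a minimal sketch, assuming the label timestamps share the millisecond time base of the signal files, that signal timestamps are sorted in ascending order, and that the interval is inclusive at both ends. The function name and the `(label, start, stop)` tuple layout are hypothetical.

```python
from bisect import bisect_left, bisect_right


def slice_by_label(timestamps, start_ms, stop_ms):
    """Return the (lo, hi) index range of rows with start_ms <= t <= stop_ms.

    `timestamps` must be sorted in ascending order.
    """
    lo = bisect_left(timestamps, start_ms)
    hi = bisect_right(timestamps, stop_ms)
    return lo, hi


# Synthetic example: signal timestamps in milliseconds and two labelled samples.
timestamps = [0, 10, 20, 30, 40, 50]
labels = [("a", 10, 30), ("b", 40, 50)]

segments = {}
for label, start, stop in labels:
    lo, hi = slice_by_label(timestamps, start, stop)
    segments[label] = timestamps[lo:hi]
```

The same index range can be applied to the full list of signal rows, so each label maps to its slice of sensor readings.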

Data Acquisition

The recording process begins by selecting a set of predefined scripts to be written on the tablet surface using the Digipen. While recording, the user holds the pen with its on/off switch facing up, which is the natural way to hold the Digipen because its grips are designed to position the fingers properly. The composition of the dataset is detailed below.

        characters  words  sentences  equations  drawings
Train         1411    404        276        185       147
Valid          334     98         56         42        30
Test           147     50         39         25        20

References

If you use the KIHT-Paper dataset, you agree to cite the following reference:

[1] Paper in submission


How to get the dataset?

By downloading the dataset, you agree that it is released under the CLIC licence and can only be used for research purposes. To receive the download link, please complete the contact form.
