IRISA-KIHT-S and KIHT-public Datasets
- Contact: Florent IMBERT (florent.imbert@irisa.fr), Eric ANQUETIL (eric.anquetil@irisa.fr)
- Partner:
- Funding: ANR
- Start date: 01.10.2021
- End date: 31.12.2024
Links
- SHADoc team (formerly Intuidoc team)
- Project webpage
- Download the IRISA-KIHT-S Dataset
- Download the KIHT-public Dataset
Digital devices can help pupils and teachers in the learning process by promoting active learning techniques and providing immediate feedback. The e-learning literature shows that computer-based analysis of handwriting can be accurate, sensitive, and reliable enough to produce relevant and consistent feedback for correction or guidance.
The IRISA-KIHT-S dataset was presented in [1] for the task of reconstructing handwriting from sensor data. The sensor data come from a digital pen called the STABILO Digipen. Note that these data can also be used for classification purposes.
Conditions of Use
1. Purpose and Scope
- 1.1 The database is provided for research purposes only.
- 1.2 Users must agree to use the database solely for academic, educational, or scientific research. Commercial use is strictly prohibited unless explicitly authorized in writing.
2. Citations
- 2.1 For publications using the IRISA-KIHT-S database, please cite reference [1].
- 2.2 For publications using the KIHT-Public database, please cite reference [2].
Datasets description
The IRISA-KIHT-S dataset contains 30 recordings and the KIHT-Public dataset contains 149 recordings.
Each recording session produces a set of files generated by the data acquisition mobile app. The sensor signals file contains a millisecond timestamp, front accelerometer (x, y, z), rear accelerometer (x, y, z), gyroscope (x, y, z), magnetometer (x, y, z), and force signals; the full column list is given under Sensor Data. Tablet signal files contain a millisecond timestamp, position coordinates (x, y, z), and pressure force signals.
The transcription (labels) file contains the label and the start and stop timestamps for every sample. Additional files with the sensor calibration and recording metadata are also provided.
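Because each label comes with start and stop timestamps, the sensor stream can be cut into one segment per written sample. The sketch below shows one way to do this, assuming that the label timestamps share the same clock as the sensor timestamps and that the sensor timestamps are non-decreasing. The function name and the `(label, start, stop)` tuple layout are hypothetical, not part of the dataset.

```python
from bisect import bisect_left, bisect_right

def segment_by_labels(millis, labels):
    """Return (label, lo, hi) index ranges such that millis[lo:hi]
    lies inside the closed interval [start, stop] of each label.

    millis: non-decreasing list of sensor timestamps (assumption).
    labels: list of (label, start, stop) tuples (hypothetical layout).
    """
    segments = []
    for label, start, stop in labels:
        lo = bisect_left(millis, start)   # first index with millis >= start
        hi = bisect_right(millis, stop)   # one past last index with millis <= stop
        segments.append((label, lo, hi))
    return segments

# Made-up timestamps and labels, for illustration only.
demo_millis = [0, 10, 20, 30, 40, 50]
demo_labels = [("a", 10, 30), ("b", 35, 50)]
demo_segments = segment_by_labels(demo_millis, demo_labels)
```

The returned index ranges can then be used to slice every sensor channel of the recording in the same way.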
Data Acquisition
The recording process begins by selecting a set of predefined scripts to be written on the tablet surface using the Digipen. The two datasets are made up of the following two recording types:
- KIHT_TABLET_MIXED consists of 34 samples, written one by one during a single recording session. It is composed of five groups: 15 characters, 10 words, 5 equations, 2 shapes and 2 word groups.
- KIHT_TABLET_MIXED_EXTENDED consists of 57 samples, written one by one during a single recording session. It is composed of five groups: 30 characters, 10 words, 5 equations, 4 shapes and 8 word groups.
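The composition of the two recording types can be written down as a small lookup table, which is convenient for checking that a loaded session has the expected number of samples. The dictionary layout and the group key names below are illustrative choices; only the counts come from the description above.

```python
# Sample counts per group, taken from the two recording types described above.
RECORDING_TYPES = {
    "KIHT_TABLET_MIXED": {
        "characters": 15, "words": 10, "equations": 5,
        "shapes": 2, "word groups": 2,
    },
    "KIHT_TABLET_MIXED_EXTENDED": {
        "characters": 30, "words": 10, "equations": 5,
        "shapes": 4, "word groups": 8,
    },
}

def samples_per_session(recording_type):
    """Total number of samples written in one session of the given type."""
    return sum(RECORDING_TYPES[recording_type].values())

mixed_total = samples_per_session("KIHT_TABLET_MIXED")
extended_total = samples_per_session("KIHT_TABLET_MIXED_EXTENDED")
```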
While recording, the user holds the pen with its on/off switch facing up. This is the natural way to hold the Digipen, because the grip on the pen is designed to position the fingers properly.
Sensors
Each Digipen is equipped with five sensors:
- Front accelerometer (STM LSM6DSL)
- Gyroscope (STM LSM6DSL)
- Rear accelerometer (Freescale MMA8451Q)
- Magnetometer (ALPS HSCDTD008A)
- Force sensor (ALPS HSFPAR003A)
Sensor Data
The sensors’ raw data stream is provided in the files called sensor_data.csv. Each file consists of 15 columns:
- Millis: The timestamp when the data were processed on the tablet computer that the pen was connected to during recording
- Acc1 X, Acc1 Y, Acc1 Z: The values of the front accelerometer in three dimensions
- Acc2 X, Acc2 Y, Acc2 Z: The values of the rear accelerometer in three dimensions
- Gyro X, Gyro Y, Gyro Z: The gyroscope values in three dimensions
- Mag X, Mag Y, Mag Z: The magnetometer values in three dimensions
- Force: The force with which the pen tip touches the surface
- Time: A sample counter
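The column list above can be turned into a small reader that checks a file for the 15 expected columns. This is a minimal sketch: the exact header strings and the delimiter used in the real sensor_data.csv files are assumptions, and the demo input is a synthetic in-memory file with made-up values rather than real data.

```python
import csv
import io

# Column names as listed above. The exact header spelling in the
# real files is an assumption and may need adjusting.
SENSOR_COLUMNS = [
    "Millis",
    "Acc1 X", "Acc1 Y", "Acc1 Z",
    "Acc2 X", "Acc2 Y", "Acc2 Z",
    "Gyro X", "Gyro Y", "Gyro Z",
    "Mag X", "Mag Y", "Mag Z",
    "Force",
    "Time",
]

def read_sensor_rows(handle, delimiter=";"):
    """Read a sensor file into a list of {column: float} dicts.

    The default delimiter is an assumption; pass another one if needed.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [name for name in SENSOR_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{name: float(row[name]) for name in SENSOR_COLUMNS} for row in reader]

# Synthetic two-row file (made-up values) standing in for sensor_data.csv.
demo = io.StringIO(
    ";".join(SENSOR_COLUMNS) + "\n"
    + ";".join(str(i) for i in range(15)) + "\n"
    + ";".join(str(i * 2) for i in range(15)) + "\n"
)
rows = read_sensor_rows(demo)
```

For a real recording, the same function could be called on an opened file, for example `read_sensor_rows(open("sensor_data.csv", newline=""))`, after confirming the delimiter and header names against the downloaded files.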
Citation
If you use the IRISA-KIHT-S or KIHT-Public dataset, please cite: