*************README file for**************************
HYPA-p: Hydrodynamic patterns dataset (processed data)
******************************************************
Last modified: 2024-01-10 (yyyy-mm-dd)

This dataset was generated by Pauline Rothmann-Brumm (2023) as part of her dissertation at the Technical University of Darmstadt, Germany.

Title of the dissertation: Visualisierung, Analyse und Modellierung von fluiddynamischen Musterbildungsphänomenen im Zylinderspalt unter Anwendung von Maschinellem Lernen (German) / Visualization, analysis and modeling of fluid dynamic pattern formation phenomena in the cylinder gap using machine learning (English translation)

---------------------

DATASET DESCRIPTION

URL: https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3841

The HYPA-p dataset contains processed image data of a variety of gravure-printed patterns. The processed image data was obtained by automated processing of raw image data. The raw image data was created by high-resolution scanning of printed samples from industrial gravure printing web presses. The pattern types represented in the dataset are dot patterns, finger patterns and mixed patterns. The HYPA-p dataset is primarily intended for data-driven analysis of hydrodynamic pattern formation in gravure printing.

HYPA-p stands for hydrodynamic patterns. The suffix -p stands for processed data.

In the related HYPA-r dataset (see https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3840), the corresponding raw (-r) image data can be found. The Python code for automated processing of HYPA-r to HYPA-p is provided within this submission in the folder 'code_HYPA-p.zip'.

---------------------

METADATA

Only a selection of the metadata is given here. The full metadata of the HYPA-p dataset can be found in the dissertation of Pauline Rothmann-Brumm.

The dataset consists of S-fields, S-subfields and L-fields (S stands for small and L for large), which are grouped into ZIP-folders that correspond to a specific printing experiment (e.g. G1-01). Beware that the ZIP-folders are relatively large (between 10 and 167 GB), see also section DOWNLOAD!

S-fields originate from S-scans (from the HYPA-r dataset) and S-subfields originate from S-fields. Each S-field is divided into 16 S-subfields. L-fields originate from L-scans (from the HYPA-r dataset). See the thumbnail of the dataset for a graphical overview. The resolution of all images is 2,400 pixels per inch (ppi) (= 10.5833 micrometers/pixel). The printing direction goes from bottom to top in all images. S-fields have a total size of 1,040 pixels x 1,040 pixels, S-subfields 260 pixels x 260 pixels and L-fields 5,859 pixels x 16,819 pixels. In total, the HYPA-p dataset contains 55,680 S-fields, 890,880 S-subfields and 3,904 L-fields.
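The scale and size figures above can be cross-checked with a few lines of Python. This is only a minimal sketch: all variable and function names are illustrative and are not taken from 'code_HYPA-p.zip'.

```python
# Cross-check of the scale and size figures stated in this README.
# All names are illustrative; they are not part of 'code_HYPA-p.zip'.

MICROMETERS_PER_INCH = 25_400
RESOLUTION_PPI = 2_400

# Edge length of one pixel in micrometers.
um_per_pixel = MICROMETERS_PER_INCH / RESOLUTION_PPI


def size_in_mm(width_px, height_px):
    """Convert an image size in pixels to millimeters."""
    return (width_px * um_per_pixel / 1000, height_px * um_per_pixel / 1000)


s_field_mm = size_in_mm(1_040, 1_040)
s_subfield_mm = size_in_mm(260, 260)
l_field_mm = size_in_mm(5_859, 16_819)

# Each S-field is divided into 16 S-subfields.
n_s_subfields = 55_680 * 16

print(f"Scale:      {um_per_pixel:.4f} micrometers/pixel")
print(f"S-field:    {s_field_mm[0]:.2f} mm x {s_field_mm[1]:.2f} mm")
print(f"S-subfield: {s_subfield_mm[0]:.2f} mm x {s_subfield_mm[1]:.2f} mm")
print(f"L-field:    {l_field_mm[0]:.2f} mm x {l_field_mm[1]:.2f} mm")
print(f"S-subfields in total: {n_s_subfields}")
```

The same conversion can be used to turn any length measured in pixels (for example a finger width) into a physical length.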

The images are named according to their printing parameters:

Example for S-field:
G1-01_WCPvbam_V005_ESA0_S01_R060_T005_S.tif

Example for S-subfield:
G1-01_WCPvbam_V005_ESA0_S01_R060_T005_S_part1-1.tif

Example for L-field:
G1-01_WCPvbam_V005_ESA0_L01_R060_T100_L.tif

G1-01    --> Name of the printing experiment. G1-01: G1 stands for the first printing day at the Gallus printing machine, -01 stands for the first velocity ramp. For more information see the dissertation of Pauline Rothmann-Brumm.
WCPvbam  --> Character string that encodes printing parameters of the printing experiment. WCPvbam: Water (W) based ink on coated paper (CP). Ink with base (= very high) viscosity (vb). Medium doctor blade angle (am). For more information see the dissertation of Pauline Rothmann-Brumm.
V005     --> Printing velocity of 5 m/min. Other printing velocities used: 10, 15, 30, 60, 90, 120, 180 and 240 m/min.
ESA0     --> ESA0: Electrostatic printing assist (ESA) off. ESA1: ESA on.
L01      --> Type and number of sample. L: Samples with 4 large fields (L-scans). S: Samples with 80 small fields (S-scans).
R060     --> Raster frequency of printing form of 60 lines/cm. Other raster frequencies used: 70, 80 and 100 lines/cm.
T005     --> Tonal value of printing form of 5 %. Other tonal values used: 10, 15, ... 100 %.
S        --> Redundant information. S: Small sample. L: Large sample.
part1-1  --> Location of the S-subfield within the S-field.
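
For data-driven analysis it is often convenient to read the printing parameters directly from the filename. The following is a minimal sketch of such a parser. The function 'parse_filename', the regular expression and the dictionary keys are hypothetical and not part of 'code_HYPA-p.zip'. The pattern is inferred only from the three example filenames above and may need to be adapted for other experiments.

```python
import re

# Hypothetical helper, not part of 'code_HYPA-p.zip'. The pattern is inferred
# only from the three example filenames given in this README.
FILENAME_PATTERN = re.compile(
    r"^(?P<experiment>[A-Za-z0-9]+-\d+)"             # e.g. G1-01
    r"_(?P<parameters>[A-Za-z]+)"                    # e.g. WCPvbam
    r"_V(?P<velocity>\d+)"                           # printing velocity in m/min
    r"_ESA(?P<esa>[01])"                             # ESA off (0) or on (1)
    r"_(?P<sample_type>[SL])(?P<sample_number>\d+)"  # e.g. S01 or L01
    r"_R(?P<raster>\d+)"                             # raster frequency in lines/cm
    r"_T(?P<tonal>\d+)"                              # tonal value in %
    r"_(?P<field_type>[SL])"                         # redundant S or L
    r"(?:_part(?P<part>\d+-\d+))?"                   # only present for S-subfields
    r"\.tif$"
)


def parse_filename(name):
    """Return the printing parameters encoded in a HYPA-p image filename."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected filename: {name!r}")
    groups = match.groupdict()
    return {
        "experiment": groups["experiment"],
        "parameters": groups["parameters"],
        "velocity_m_per_min": int(groups["velocity"]),
        "esa_on": groups["esa"] == "1",
        "sample_type": groups["sample_type"],
        "sample_number": int(groups["sample_number"]),
        "raster_frequency_lines_per_cm": int(groups["raster"]),
        "tonal_value_percent": int(groups["tonal"]),
        "part": groups["part"],  # None for S-fields and L-fields
    }


info = parse_filename("G1-01_WCPvbam_V005_ESA0_S01_R060_T005_S_part1-1.tif")
print(info)
```

Filenames that do not follow the assumed scheme raise a ValueError instead of being parsed silently into wrong values.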

The electromechanically engraved printing form used has tonal values between 5 and 100 %, raster frequencies of 60, 70, 80 and 100 lines/cm, and a raster angle of 59.35°, which corresponds to HELL raster angle #2 (HELL Gravure Systems GmbH & Co. KG, Kiel, Germany). The stylus angle for engraving is 120°.

---------------------

PYTHON CODE

The Python code 'code_HYPA-p.zip' is used for automated processing of the HYPA-r dataset to the HYPA-p dataset. 

Step by step guide:

1. Download the folder 'code_HYPA-p.zip' from the HYPA-p dataset (https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3841). Unzip the folder.

2. Set up a suitable Python environment by installing the provided 'requirements.yml' file using the conda package manager:
conda env create -f requirements.yml

3. To test the code, download the two example images 'G1-01_WCPvbam_V005_ESA0_L01.tif' (exemplary L-scan) and 'G1-01_WCPvbam_V005_ESA0_S01.tif' (exemplary S-scan) from the HYPA-r dataset (see https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3840). The download will take some time, since each example image has a file size of 3.2 GB. The example images are used as input for the code.

4. First, run the script 'crop_fields.py'. It automatically detects the type of scan (S-scan or L-scan) from the filename of the example-images and then crops out the fields that are visible on the scan. Cropped fields from L-scans are called 'L-fields' and cropped fields from S-scans are called 'S-fields'. Each L-scan yields 4 L-fields and each S-scan yields 80 S-fields.

5. Second, run the script 'cut_into_pieces.py'. It further divides each S-field into 16 'S-subfields'. Select the 80 S-fields that were created from the example image 'G1-01_WCPvbam_V005_ESA0_S01.tif' in step 4 as input for the script.
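
The division performed in step 5 can be illustrated with a short sketch. It is not the code of 'cut_into_pieces.py'. It only shows how a 1,040 pixel x 1,040 pixel S-field maps onto 16 tiles of 260 pixels x 260 pixels, which implies a 4 x 4 grid. The function 'subfield_boxes' is hypothetical, and the assumption that the keys (row, column) correspond to the 'partR-C' suffix in the filenames is not confirmed by this README.

```python
def subfield_boxes(width=1040, height=1040, rows=4, cols=4):
    """Return crop boxes for a regular grid of tiles.

    Hypothetical helper, not taken from 'cut_into_pieces.py'.
    Keys are 1-based (row, column) tuples counted from the top left;
    whether this matches the 'partR-C' naming is an assumption.
    Values are (left, upper, right, lower) pixel coordinates, the box
    convention also used by Pillow's Image.crop().
    """
    if width % cols or height % rows:
        raise ValueError("Image size is not divisible by the grid")
    tile_w, tile_h = width // cols, height // rows
    boxes = {}
    for r in range(rows):
        for c in range(cols):
            left, upper = c * tile_w, r * tile_h
            boxes[(r + 1, c + 1)] = (left, upper, left + tile_w, upper + tile_h)
    return boxes


boxes = subfield_boxes()
for key in [(1, 1), (1, 2), (4, 4)]:
    print(key, boxes[key])
```

Each box can be passed to an image library of choice to crop the corresponding tile out of an S-field.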

---------------------

DOWNLOAD

Before downloading the dataset, please make sure that your hard drive has enough free space, since the dataset is very large! If necessary, change the download path in your browser to an external hard drive.
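
The free space at the download location can be checked in advance with the Python standard library. This is a minimal sketch: the helper names, the use of decimal gigabytes and the safety factor of 2 (room for the ZIP-folder plus its unzipped content) are assumptions and should be adapted as needed.

```python
import shutil

BYTES_PER_GB = 10**9  # assumption: sizes are given in decimal gigabytes


def free_space_gb(path="."):
    """Return the free space in GB on the drive that contains 'path'."""
    return shutil.disk_usage(path).free / BYTES_PER_GB


def has_enough_space(path, required_gb, safety_factor=2.0):
    """Check whether 'path' can hold a download of 'required_gb' GB.

    The safety factor is an assumed margin for keeping the ZIP-folder
    and its unzipped content on the same drive.
    """
    return free_space_gb(path) >= required_gb * safety_factor


download_path = "."  # replace with the intended download folder
largest_zip_gb = 167  # largest ZIP-folder size stated in section METADATA

print(f"Free space: {free_space_gb(download_path):.1f} GB")
print("Enough space for the largest ZIP-folder:",
      has_enough_space(download_path, largest_zip_gb))
```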

To get a quick impression of the dataset, a small number of example images is provided separately for download (see folder 'examples.zip').

---------------------

CONTACT

Pauline Rothmann-Brumm
Technical University of Darmstadt, Department of Mechanical Engineering, Institute of Printing Science and Technology (IDD), Magdalenenstr. 2, 64289 Darmstadt, Germany
rothmann-brumm@idd.tu-darmstadt.de