Scope

This guideline is primarily intended for users of TU Darmstadt. Partner universities that also use this system may have different rules. Find more about those here: Contact

Explanation of Terms
General Information
Submission
Management
Publishing
Licensing
Accessing datasets
- How can I access a restricted dataset?
- How can I give unknown persons access to non-published data, e.g. in the context of peer review?

Explanation of Terms

What is Research Data?

The Guidelines on Digital Research Data at TU Darmstadt define research data as "all digital data that are produced during the process or as the outcome of experiments, measurements, simulations, software developments, studies of primary sources, inquiries or surveys." The spectrum ranges from images and multidimensional models to audio and video recordings, texts, tables, databases, computer programs (source code and applications) and subject- and device-specific raw data in various formats.
The documentation and software necessary for understanding research data are also inseparably linked to the research data. Moreover, research data in each scientific discipline are available at different levels of aggregation and often in different, sometimes very specific digital formats.

What is TUdatalib?

TUdatalib is the institutional repository of TU Darmstadt for research data that were created or that were used at TU Darmstadt. TUdatalib enables structured storage of research data and descriptive metadata, long-term archiving (at least 10 years) and - if desired - publication of metadata and/or files including DOI assignment. In addition, there is a fine-grained rights and role management. TUdatalib is jointly operated by ULB and HRZ and is based upon the open source software DSpace. The service has been available to all members of TU Darmstadt since August 2019.

TUdatalib makes the metadata accompanying the data available to the outside world via an OAI-PMH interface. The metadata are automatically included in common data search engines (e.g. Google Dataset Search, BASE, OpenAIRE, DataCite Commons, CORE), so that the data are as findable as possible.
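For readers who want to harvest the metadata themselves, the following sketch builds a standard OAI-PMH ListRecords request. The verb and the oai_dc metadata prefix are defined by the OAI-PMH protocol. The endpoint path /oai/request is only the usual DSpace default and is an assumption here, as is the function name, so please check the actual address with the TUdata team.

```python
from urllib.parse import urlencode

# Assumption: the host is taken from the example dataset link in this FAQ; the
# path "/oai/request" is the common DSpace default and may differ for TUdatalib.
OAI_ENDPOINT = "https://tudatalib.ulb.tu-darmstadt.de/oai/request"


def build_list_records_url(endpoint: str = OAI_ENDPOINT,
                           metadata_prefix: str = "oai_dc") -> str:
    """Return an OAI-PMH ListRecords request URL for the given endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    # Only prints the URL; no network request is made.
    print(build_list_records_url())
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the response is an XML document that lists metadata records.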

The metadata schema used in TUdatalib guarantees compatibility with the DataCite metadata standard. In addition, each dataset can be described with freely selected keywords. The type of the dataset is categorized via standardized terms (e.g. "pure dataset", "software", "text" etc.) and its subject affiliation is described by the assignment to one or more subjects from the DFG subject classification system. The language of the dataset, if the dataset contains text, can be specified in standardized form.

What is a community in TUdatalib?

In TUdatalib, the organisational units of the TU are preset as communities, sorted by departments, institutes, research groups. If a research group or similar is missing, please contact us. For each community in TUdatalib there should be at least one administrator (two are recommended). Administrators manage the rights and roles for their community and can create sub-communities and collections.

What is a collection in TUdatalib?

A collection contains the datasets. Each dataset in TUdatalib must be within a collection. The term collection can be used flexibly, e.g. for all datasets of a certain sub-working group, for a certain project, for a publication, for a PhD project, etc. Collection administrators manage the rights and roles for their collection. Each community administrator is automatically also the admin for all associated collections. For intensive use it is recommended to appoint additional collection admins.

What is a dataset in TUdatalib?

A dataset can consist of one or more files and is characterised by common metadata, i.e. the descriptive metadata applies to all files in the dataset. It's up to you to decide which of your files you see as belonging together. Note that Digital Object Identifiers (DOIs) can be assigned to datasets, not to individual files.

What is metadata?

Metadata is data about data, i.e. the description and documentation of data. This includes, for example, title, authors, date of origin, and DFG subject area, but also subject-specific information such as methods, software or measuring instruments used to generate the data. Furthermore, links to related publications or third party funded projects are possible.
Additionally, TUdatalib allows for the addition of (almost) any other metadata fields for your subject context (i.e. your community or collection) - please contact us! This is the only way to make a subject-specific search for data possible in the long term.

General Information

Who can use TUdatalib?

The primary users of TUdatalib are members of TU Darmstadt. They log in with their TU-ID. As a rule, you will be automatically assigned to your research group's community after login and can then use TUdatalib according to the rights granted to you by the community administrator.
External users can log in via their ORCID identifier followed by e-mail verification and can be assigned rights by an administrator.
Non-registered users have read access only to openly available datasets.

I want to upload a dataset, what do I have to do?

If your research group ("Fachgebiet") already has a TUdatalib admin, please contact them directly. They can grant you the appropriate rights.

How are the admins appointed?

Please clarify within your research group ("Fachgebiet") who can fill this role. At least two persons are to be named. At least one of these persons should work at the TU Darmstadt on a long-term basis. Once these persons are identified, they log in once with their TU-ID, and the research group head then reports their names and e-mail addresses to the TUdata team.
More detailed information on the tasks and possibilities of such an administrator can be found in the corresponding guidelines.

What does the service cost?

The use of TUdatalib is generally free of charge. For archiving large amounts of data, the following applies: Up to 2 TB total volume of new data per year and research group is free of charge. For larger volumes of new data, the research group is required to make a one-time contribution to the costs for the data volume exceeding 2 TB. This is currently 250 € / TB for 10-year archiving. Invoicing is handled by the HRZ.
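As a rough worked example of this cost rule, the sketch below computes the one-time cost share for a given yearly volume of new data. The 2 TB allowance and the rate of 250 € / TB come from the text above. The function name and the assumption that partial terabytes are billed proportionally are illustrative only, so the actual invoice from the HRZ may differ.

```python
FREE_TB_PER_YEAR = 2.0     # free volume of new data per year and research group
PRICE_EUR_PER_TB = 250.0   # one-time cost share per TB above the free volume


def archiving_cost_eur(new_data_tb: float) -> float:
    """Estimate the one-time cost share for one year's new data volume.

    Assumption: volumes above the free allowance are billed proportionally,
    i.e. 0.5 TB above the limit costs half the per-TB rate.
    """
    if new_data_tb < 0:
        raise ValueError("data volume must not be negative")
    billable_tb = max(0.0, new_data_tb - FREE_TB_PER_YEAR)
    return billable_tb * PRICE_EUR_PER_TB


if __name__ == "__main__":
    for volume_tb in (1.5, 2.0, 5.0):
        print(f"{volume_tb} TB of new data -> {archiving_cost_eur(volume_tb):.2f} EUR")
```

Under these assumptions, a research group that archives 5 TB of new data in one year would pay for the 3 TB above the allowance, i.e. 3 × 250 € = 750 €.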

How is the data secured?

All data in TUdatalib are professionally secured against unintentional loss via the Backup Service of the University Computing Centre. This includes a second data copy stored in Frankfurt. Further information on the backup solution is available on the Computing Centre website.

Should I use TUdatalib if there is also a subject-specific repository for the data from my subject?

If there is a subject-specific repository in your discipline, we recommend that you use it. Subject-specific repositories usually offer more specific indexing and search options than an institutional repository such as TUdatalib, e.g. by using subject-specific metadata schemas. Furthermore, the visibility in your subject community is usually higher. When choosing a subject-specific repository, you should consider criteria such as regular backup cycles, data security and sustainable service. You can find a repository that suits your needs at

https://www.re3data.org

In general it can be said that there are hardly any alternatives to TUdatalib in the field of engineering sciences. If you need further advice, please contact us.

Where can I get support for questions about TUdatalib?

Please contact us at tudata@tu-darmstadt.de for all questions concerning research data management in general and TUdatalib in particular.
Telephone contact details can be found here: https://www.tu-darmstadt.de/tudata/tudata/referat_forschungsdaten/index.de.jsp.

Submission

What kind of data can I archive?

In principle, you can archive all types of data related to your research. Personal data that is subject to data protection is currently excluded. If necessary, the data must be anonymised.

Which data formats are suitable?

TUdatalib is not limited in the choice of file formats. However, there are generally no guarantees that certain formats will still be readable after many years. It is therefore advisable to convert your files into one of the recommended formats before uploading, if possible. Recommended formats for long-term archiving and information on possible conversions can be found here:

https://www.forschungsdaten.info/themen/veroeffentlichen-und-archivieren/formate-erhalten/

The TUdata team would also be happy to advise you.
In general, you should not upload your files in a container format (zip or similar) as this makes long-term archiving and indexing very difficult. Please only use container formats if you want to maintain a folder structure for your data or if the number of files would otherwise be significantly greater than 20, which would limit display clarity and usability of TUdatalib. In the case of container formats, please use an uncompressed archive (see Tutorial on creating uncompressed archives).
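If you do need a container to preserve a folder structure, an uncompressed archive can be created with the Python standard library, for example. This is only a minimal sketch: the choice of the tar format, the function name and the example paths are assumptions, and the tutorial mentioned above may recommend a different tool or format.

```python
import tarfile
from pathlib import Path


def create_uncompressed_tar(source_dir: str, archive_path: str) -> list[str]:
    """Pack source_dir into a plain (uncompressed) tar archive.

    Returns the list of member names so the result can be checked.
    """
    source = Path(source_dir)
    # Mode "w" writes a tar archive without any compression
    # (in contrast to "w:gz", "w:bz2" or "w:xz").
    with tarfile.open(archive_path, mode="w") as archive:
        archive.add(source, arcname=source.name)
    # Mode "r:" opens the file only if it is an uncompressed tar archive.
    with tarfile.open(archive_path, mode="r:") as archive:
        return archive.getnames()


if __name__ == "__main__":
    # Hypothetical paths; replace them with your own folder and target file.
    for name in create_uncompressed_tar("measurements", "measurements.tar"):
        print(name)
```

Because the files are stored without compression, their contents remain byte-for-byte identical inside the archive.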

How are research data linked to a text publication (e.g. journal articles) or other resources?

In step 1 "Describe" of the submission process, select an applicable "relation type" (for example "Is source of" or "Is referenced by") and an "identifier type" (DOI, arXiv-ID, URL, URN, ISBN or PubMed-ID), and then enter the identifier of the text publication in the field next to it. An example of such a link is

https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/1915.2

What is the maximum file size?

We allow uploading of files of any size to TUdatalib. The graphical user interface allows uploads of files of 50 - 100 GB, given a stable internet connection, ideally from within the TU Darmstadt. Other ways to transfer large files are described in the user guide. If you have issues uploading large files, please contact us. We will be happy to help and advise you in detail.

Can I archive and publish personal data?

Unfortunately, it is currently not possible to archive and publish personal data in TUdatalib for organisational reasons. Personal data is therefore prohibited by the user agreement. There is no objection to archiving and publishing completely anonymised data, as this is not considered personal data. Pseudonymisation is not sufficient.

How should I name my dataset?

It is advisable to choose meaningful names for the datasets, even for outsiders, and to avoid special characters.

How do I find third-party funded projects that I want to link to my dataset?

Links to externally funded projects are generally not made as free text, but via a selection list ("lookup"). In submission step 2, the input field "Drittmittelprojekt" (third-party funded project) allows a standardized selection of third-party funded projects that can be assigned to the dataset. In the lookup list, projects can be searched for using the search line. It contains about 6,000 third-party funded projects of the last few years. Because the quality of the entries in the list still needs improvement, it is best to search for the grant number, and only alternatively for the funding body, acronym or title of a project. The latter are sometimes only included in abbreviated versions. Attention: You can only search here with one term, not with several terms! If a third-party funded project is missing or you cannot find it, please contact us.

How do I link author information to an ORCID?

The TU Darmstadt recommends the use of an ORCID in its publication guidelines, see

https://www.ulb.tu-darmstadt.de/service/elektronisches_publizieren/orcid.de.jsp

In submission step 1, when specifying the authors, the registered name can be searched in the ORCID registry by clicking on "Lookup" and then inserted into TUdatalib. Due to the comparison with the ORCID registry, this step takes several seconds.

Is there an API to archive data automatically?

Yes, automated upload of files and metadata is possible via a REST API. Sample code in the form of a Python script is available for this purpose. Detailed information on the REST interface and the scripts can be found in the guide for administrators. Please contact us if you want to use the REST interface and need help.

Management

How can I set up a collection?

As a community administrator, you can create collections yourself by clicking "Create Collection" in the navigation bar of your section.

What permissions can be assigned?

In TUdatalib, rights and roles are assigned exclusively to groups (and not to individual users). Users can be assigned four different roles by being included in these authorization groups:

Administrator: (see also in particular the section "Administration of data sets"). The main rights include:
1. creating collections
2. assigning and managing roles and rights
3. editing metadata
4. registering DOIs
5. mirroring datasets between collections
Reader: Users who are allowed to view the datasets in a collection. This is the default right for users who were automatically assigned to a community after logging in.
Submitter: Users who are allowed to submit new datasets to a collection.
Reviewer: Users who are allowed to accept or reject submitted datasets and edit their metadata.

How is a new user group created?

In the "Edit Collection" menu, groups can be defined to assign permissions to users. This is done in the tab "Assign Roles". User groups can generally also be members of other user groups themselves.
Attention: In the DEFAULT_READ groups the permissions are not checked recursively, i.e. only users should be members of such a DEFAULT_READ group, no other groups. In particular: If a collection is to be publicly visible, the DEFAULT_READ group in question should be completely deleted.

I cannot add a certain person to a group, what can I do?

If a user is not found in the user list, please make sure that the user has logged in to TUdatalib before.

Is there versioning? Can I update my dataset?

Versioning of datasets is possible, i.e. a new, slightly changed dataset (e.g. with small changes in the actual files) replaces a previous dataset. The metadata does not have to be re-entered for the new version. To create a new version, click the button "Create version of this item" in the section CONTEXT in the navigation bar. After versioning, the old dataset can no longer be found by searching in TUdatalib, but remains accessible directly via URL or DOI (Digital Object Identifier), where a note about the new version is displayed. For all datasets with a DOI, versioning rather than modification is the method of choice, since it keeps the history of the data traceable.

How do I register a DOI (Digital Object Identifier)?

If the group "Anonymous" has read access to at least the metadata of a dataset, the administrator of the collection can register a DOI for this record in the "Edit item" tab.
If there are access restrictions for the metadata, DOI assignment is not possible, but you can remove the access restrictions with a click on "Make metadata public" or "Make dataset publicly available (open access)", which makes DOI assignment possible. Attention: DOI allocation is irreversible! If you only want to do the DOI registration later, but already need the DOI link, you can use the pre-reserved one, which is already displayed at the same place.

What does the DOI refer to - a file or a dataset?

The DOIs in TUdatalib always refer to one version of a dataset.
If a new version of this dataset is created, a new DOI is also created.

How do I delete a dataset?

Only the system administrator (the ULB) can permanently delete an entire dataset. Please contact us for this. However, for reasons of academic reproducibility, this is only possible in justified exceptional cases. Datasets for which a DOI has been assigned cannot be deleted.
Community and collection administrators can, however, withdraw datasets. These records are then no longer searchable in the interface.

How do I delete a collection?

Community and collection administrators can delete collections. However, a collection can be deleted only if there are no datasets in it. If necessary, first move all datasets to another existing collection in your community before clicking "Delete Collection" under "Edit Collection".

Publishing

Is the dataset automatically publicly visible?

Not by default. The details depend on the collection to which you submit your data, but visibility can also be specified for each dataset individually, independent of the collection's settings. In step 3 "Access" of the submission process you will see the settings of the collection and can set additional visibility settings for the dataset. Only collection administrators can subsequently change the access rights. If a collection is to be publicly visible, the relevant DEFAULT_READ group should be completely deleted.

Am I allowed to publish my research data at all?

Many funders demand publication of the research data as a prerequisite for project funding. Publication may be prevented, for example, by contractual regulations in cooperation with industrial partners, or by data privacy requirements if the research data are based on personal information of interview participants and the like. If you are unsure about the scope or conditions under which you may publish your research data, we will be happy to advise you.

What is checked before publication?

The research groups themselves are responsible for checking the content. Collection administrators can set up the user group "Controllers" for this purpose. The users appointed in this user group will then check the submissions. A check for formal aspects (e.g. dead links, links with ORCID etc.) is carried out by the ULB.

Does an additional publication contract have to be concluded?

No, an additional publishing contract is not necessary. According to the User Agreement, you give us the order to publish your data by submitting it to a public collection or by setting the access rights to your data such that it is publicly visible.

Can the data be published with a blocking period (Embargo)?

No. But administrators can adjust the read permissions for records at any time.

How long is the data kept?

The data will be kept for at least 10 years in accordance with the DFG Code of Good Scientific Practice. Published data records should not be deleted. Detailed regulations on what happens after the 10 years are still being worked on at the TU Darmstadt.

May I publish data elsewhere?

Yes. By agreeing to the user agreement, you grant TU Darmstadt only non-exclusive usage rights. This gives you complete freedom in handling your data and does not prevent you from storing or publishing it elsewhere.

Licensing

Do I need a license?

If a dataset is to be published in TUdatalib, it should be provided with an appropriate license. A license regulates the rights and obligations in case of potential subsequent use by third parties. In TUdatalib you can choose from many different standard licenses or alternatively, you can link your own license text. If you do not issue a license, a note on general copyright protection will be displayed.

Which license is suitable for my data?

The basic rule is: as open as possible, as closed as necessary. Evaluate your data: Do parts of it need special protection, for example sensitive data, commercially usable data, or data from cooperations for which contractual arrangements have been made? Release your data for subsequent use as far as this is possible.
You can obtain further information on licenses, for example, at

http://forschungslizenzen.de or
https://www.forschungsdaten.info/themen/rechte-und-pflichten/lizenzvergabe.
A decision-making aid based on guiding questions is provided by https://choosealicense.com.

If you have a special need for advice, please contact us.

What happens if I publish my data without a license?

If you do not grant a license, any subsequent use of your data is only regulated by the general copyright protection, which leaves many questions unanswered. Your data can then only be used by other researchers to a very limited extent or only after consultation with you.

Accessing datasets

How can I access a restricted dataset?

Datasets whose metadata can be freely viewed can be found by unauthorized users, even if all or individual files are not accessible to the general public. In TUdatalib you have the possibility to request access to the files. This access request will be sent by email to the responsible community administrator. If desired, they can add you to the authorization group after you have logged in to TUdatalib. Alternatively, they can send you a temporarily valid download link for the files.

How can I give unknown persons access to non-published data, e.g. in the context of peer review?

In this case, two procedures are available:

- Admins can create access links via tokens for non-public datasets in the collections they manage. These are valid for 30 days and can be extended by request to the TUdata team if necessary. See documentation for details.
- The TUdata team can configure anonymous accounts for you to access the relevant data. Please feel free to contact us about this.