5 June 2022
You probably know already that DiSSCo’s ultimate goal is a unified European collection that digitally brings together all European natural science assets under a common framework for access, curation, policies and practices. What you might not know yet is that DiSSCo comes with a delightful bonus: a wide catalogue of services that will probably change the way you look at Natural History Collections.
DiSSCo services will revolutionize the way researchers have interacted and worked with natural science collections for centuries. They will improve physical and digital access to collections, take digitisation to an industrial scale and provide mechanisms to make the best of the wealth of specimen data hosted in European collections.
In this piece, some members of our community sat down together to give us their thoughts on these services, on the current (global) effort in collection digitisation, and on the challenges awaiting Natural History Collections in the future. Enjoy!
Contributors
Lorenzo Cecchi (University of Florence); Dag Endresen (University of Oslo, Natural History Museum); Henry Engledow (Meise Botanic Garden, Belgium); Pierre-Yves Gagnier (MNHN Paris, France); Helen Hardy (NHM London, UK); Anne Koivunen (LUOMUS, Finland); Bram Langeveld (Natural History Museum Rotterdam, Netherlands); Lesley Scott (Royal Botanic Garden Edinburgh, UK); Myriam van Walsum (Naturalis Biodiversity Center, Netherlands); Luc Willemse (Naturalis Biodiversity Center, Netherlands).
Taking digitisation to an industrial scale is at the very core of DiSSCo’s endeavour.
1. Could you tell us about your roles and responsibilities at your institution?
Lorenzo Cecchi: I’m one of the four curators/technicians in charge of the botanical collections. In addition, I’m deeply involved in the activities of the Museum working group, which aims to manage and take part in both national and international initiatives and communities, such as CETAF, DiSSCo etc. Finally, I’m one of the people in charge of organizing the museum’s digitization activity and leading digitization projects at the national level.
Dag Endresen: I am the full-time GBIF participant node manager for Norway. I am also the contact point for the Norwegian DiSSCo node, together with our DiSSCo node manager Hugo de Boer. I also represent the museum in various projects with a focus on biodiversity informatics topics.
Henry Engledow: I am the database manager for the Living and Preserved Collections. This task includes many issues surrounding data: standardisation, normalisation, cleaning, support concerning data collection and management.
Pierre-Yves Gagnier: My main role is to promote digital innovation in the process of collection management. My basic role is to make documentation of collection specimens (images and databases) available to the community and to ensure the conservation of this documentation (long-term archiving).
Helen Hardy: I am the Science Digital Programme Manager with responsibility for mass digitisation and strategic digitisation planning. I currently manage a team of 17, mostly digitisers, as part of the Informatics division (the wider division are responsible for NHM’s data portal). I am closely involved in NHM’s international engagement on these topics including DiSSCo; in discussions with other UK collections; and in the Museum’s NHM@Harwell Programme to develop a new collections, research and digitisation centre at Harwell science campus.
Anne Koivunen: My title is Digitisation Manager and I lead our digitisation team. It is my responsibility to develop digitisation processes and workflows and to supervise their implementation. I also work closely with our ICT team to make the data openly accessible through our Laji.fi portal.
Bram Langeveld: As curator I am responsible for the collection, that includes acquisitions, exhibitions and (facilitating) loans, research, sampling and digitization.
Lesley Scott: Loans (incoming and outgoing), incoming and outgoing herbarium specimens (duplicates, new specimens to the collection), visiting researchers and artists.
Myriam van Walsum: I am product owner (or system owner) of the applications in the specimen domain (Collection Registration systems, public collections portal, medialibrary etc) and I am responsible for procedures and policies around data flow in this domain.
Luc Willemse: I am collection manager Orthopteroids (a relatively small group of the entomological collections) and chair of the CETAF Collections group.
2. How do you think that recent digitisation technologies and projects have impacted your role?
Henry Engledow: (…) The pressure to get information online is high. As the data is often put online before the basic cleaning has been done, emails start coming in about things that need to be cleaned (we are grateful for this input, but it does add to the pressure). I am also aware of where many of the issues lie, but there is no automatic fix (despite what certain IT personnel like to believe).
Helen Hardy: (…) We are also excited about developing a greater and more structured range of digital services or ‘on demand’ digitisation in the medium term; and about digital process improvements for collections e.g. centralising some of the databasing needed when acquiring new collections objects.
Anne Koivunen: (…) New technologies made mass digitisation possible and created a need to organise the work as efficiently as possible, hence the digiteam was formed.
Bram Langeveld: They have created a better overview of the specimens kept in the collection and have significantly contributed to use of the collection by external researchers, students and also from beyond the field of natural history. They help in understanding strengths and weaknesses in the collection and help aim future endeavours concerning acquisitions and research.
Lesley Scott: We now image all specimens before they are sent out on loan. We can reduce the number of specimens physically sent out on loan with the offer of providing images. Researchers from around the world, who are not able to visit Edinburgh can request images of specimens to be made available in their field of interest.
Luc Willemse: They affected work processes and protocols in many ways, gave collection managers tools to gather far more detailed information about their collections, at both species and specimen level, and offered the possibility of making their collections accessible worldwide.
DiSSCo will create a unique gate for integrated analysis and interpretation of data through a catalogue of digital services.
3. What do you see as the biggest challenges for the next few years regarding your role?
Lorenzo Cecchi: (…) coordinating many other small to medium-sized NHM in Italy, which represent one of Italy’s hidden treasures but are still under-represented on the national and international scene and are very often still unaware of the potential of new technologies and networks (…)
Dag Endresen: (…) Citation metrics of our museum specimens as scientific first-class citizens, in other words citation metrics directly pointing to the reuse of (data from) individual specimens, and not only current metrics limited to scientific journal article publications. And much deeper/granular than the GBIF citation metrics for entire datasets. Another large challenge is the linking of DNA sequence data to the museum collection material, and contributing to progress towards expanding the scope or mandate for museum collections to store tissue samples and environment samples.
Perhaps less directly related to DiSSCo are expectations to integrate new data types relevant for ecological research, in particular data or environment samples linked to ecosystems or environments and not only to discrete, taxonomically uniform entities as supported by current data models.
Henry Engledow: (…) Educating our researchers on how they can deliver their data in a FAIR format. There is quite a resistance to the latter, as it costs the researcher’s time and they see little benefits (new metrics are needed to evaluate FAIR research so that this can form part of their evaluation).
Pierre-Yves Gagnier: AI development for indexation of images or data of natural history specimens in the first place
Helen Hardy: Resources/funding – either we won’t have the funding to achieve our aims of scaling up digitisation of our very large collections; or we will have funding and will therefore have the more pleasant challenge of scaling our team and operations rapidly and extensively to deliver. (…)
Anne Koivunen: The goals for the number of specimens to be digitised yearly are quite high, so I think the biggest challenge is to find novel ways to increase the digitisation rate, perhaps by finding and adopting new technologies.
Bram Langeveld: Keeping up digitization efforts by our volunteers with the volume of acquisitions when a whole generation of avid private collectors slowly starts to deposit their collections in museums.
Lesley Scott: (…) Customs restrictions (particularly new EU restrictions) are making it harder for us to physically send specimens out on loan and to return loans which we have requested from other herbaria. We don’t know what further restrictions and increased costs we will experience in the future.
Myriam van Walsum: (…) The balance between digitising more records at a basic level (which is easily measurable) and digitising fewer specimens but at higher quality is a big challenge. The standardisation of data entry and data quality through cleaning is important for re-use, but it gets little attention because it is hard to get funding for this.
The European Loans and Visits System (ELViS) portal, already operative.
4. Among DiSSCo Services, which one(s) will your role most benefit from?
Dag Endresen: Maybe the DiSSCo DOI registration entity? Professional mechanisms for persistent identification at the physical specimen level will be a critical foundation for many other dependent services to build on. (…) UCAS & NSIDR.org: these services for persistent specimen-level identification (or still only planned at digital-specimen-level?) and annotation could be a giant step forward for museums – and of huge direct benefit to tasks under my role at the museum in Oslo!!
Henry Engledow: SDR – this interests me the most. As I mostly work with the data and analysis – this could greatly help me in improving the quality of our collections.
Pierre-Yves Gagnier: ELViS, Dashboard, Knowledgebase, Helpdesk and Authorisation infrastructure are of interest to researchers or collection managers but not to the role (delegate for digital innovation) that I play in my institution, beyond the development of these applications. My role will benefit more from the Digital Specimen concept from the Digital Specimen Repository and Specimen Data Refinery for innovation and specimen data quality control.
Helen Hardy: I hope that DiSSCo will provide a catalyst, a platform and potentially a funding/grant mechanism for us to offer digitisation on demand and digital services such as additional imaging or even chemical and genomic analyses. It will be a challenge to be ready for this, but it will be really great if there is a good way to link users/demand with those services and with sources of funding.
Data standards and similar frameworks developed with or around DiSSCo are enormously valuable for data linkage and usefulness. (…)
Bram Langeveld: Collection Digitisation Dashboard, Specimen Data Refinery and Digital Specimen Repository seem most promising as they will increase our digitization efficiency and may result in more use of the results.
Myriam van Walsum: To improve our data quality, it would be interesting to implement UCAS; getting enrichment through annotations. However, this will have a major impact on our collection application landscape. How do we approve data and get that data back into our own CMS and collection portal? (…)
5. If you could make one or two suggestions to help DiSSCo be more supportive or responsive to your role needs, what would they be?
Lorenzo Cecchi: (…) For those that will be outside at the beginning, the benefits of joining DiSSCo should always be greater than the effort needed to join. To this end, I expect that the training strategy, especially in terms of user-friendly and multilingual communication documents and tutorials, will play a central role.
Dag Endresen: Support with finding a model for alignment of the GBIF node manager role and the DiSSCo node role. There are many overlapping mandates and tasks. (…) Some kind of progress on citation metrics services at the granularity of individual physical museum specimens. And FAIR data certification for museums in DiSSCo. Such services would most likely be a powerful demonstration of the need for DiSSCo when approaching national research councils for funding.
Henry Engledow: We need a good CMS that is supported by the community and not defined by national borders. The development or support of tools for data cleaning, or that support research (…)
Helen Hardy: Be clearer about DiSSCo strategic vision and goals – not just what it might be but the impact it is intended to have and key communications messages about that for us all to use (…) – this would help me as a national/local leader in digitisation strategy.
Anne Koivunen: Currently, it would be nice to find the outputs produced in the DiSSCo projects more easily e.g. the Knowledge base could be updated more regularly.
Myriam van Walsum: We make short-term and longer-term plans for our applications. The DiSSCo services will have an impact on these plans. To allow for these developments, we need to know which requirements we must meet to be able to implement them at a later date. (…)
Luc Willemse: Communicate a lot, lot more with the community based on use cases and the role DiSSCo will play in improving or changing these and how this benefits researchers, collection managers etc.