Results of the 2022 NDSA Web Archiving Survey Report Now Available

The Web Archiving Survey Working Group is excited to announce the publication of the results of the 2022 Web Archiving Survey. The 2022 survey builds upon surveys previously conducted in 2017, 2016, 2013, and 2011. Although previous surveys focused on the United States, the 2022 survey was open to international audiences as well. The 26-question survey was completed by 190 respondents over a 10-day period. This report details the outcomes of the web archiving survey that was distributed to various local, national, and international professional organizations and topical groups in October 2022. Topics discussed included archiving policies, tools and services, and access and discovery.

Some major takeaways from the report include:

  • Most respondents indicated that “staff capacity” was the biggest barrier to web archiving. Few organizations dedicate more than one full-time employee to web archiving, and very rarely is someone’s entire position dedicated to it.
  • Since the first Web Archiving Survey in 2011, the landscape of web archiving has dramatically changed, and the 2022 survey results show an increase in web archiving practices by grassroots organizations, volunteer groups, and non-academic communities who seek to document their lived experiences.
  • An overwhelming majority of respondents indicated concerns about their ability to collect social media, in particular Twitter, Instagram, Facebook, and Reddit. Content housed within social networks has always been difficult to capture for a myriad of reasons, and recent changes to numerous social platforms have made this task harder.

Thank you to all NDSA members and others who participated in the survey. We appreciate your time and effort spent in providing the information to us.

Thank you to the members of the Web Archiving Survey Working Group who worked over the last year and a half to make the report possible.

A Spotlight on Dedication, Creativity, and Effectiveness: Jes Neal on the NDSA Excellence Awards

Jessica C. Neal (she/they) is an archivist, records manager, and memory worker. She is currently the Records Management Project Manager at the Massachusetts Institute of Technology, and an archival consultant with Vanguard Archives Consulting. Jes’s work centers archives, preservation, data management, and developing ethical frameworks to better steward digital collections and projects that specifically focus on Black-led and -created social movements, oral histories, art, and literary history and culture.

We caught up with Jes recently, and she offered her perspective on the NDSA Excellence Awards.

 

In what way are you connected to the National Digital Stewardship Alliance?

I’ve been involved as a member of the NDSA since 2017 and have been fortunate enough to serve on the leadership branch of NDSA, the Coordinating Committee, since 2020. I am passionate about the long-term preservation and stewardship of digital information, and NDSA has been a great space to build my professional network, expand my digital preservation skillset, and give and receive support from colleagues at various stages of their digital heritage preservation efforts. In our current rapidly evolving digital landscape, the need to preserve digital heritage is critical. As we navigate the challenges of format obsolescence, data integrity, and ever-growing volumes of digital content, recognizing and celebrating outstanding efforts in digital preservation becomes an essential endeavor.

From your perspective, what do the NDSA Excellence Awards represent?

The NDSA Excellence Awards were established in 2012 to highlight and commend all forms of creative and meaningful contributions by individual professionals, future stewards, educators, organizations, projects, and sustainability activities to the field of digital preservation. At their core, the Excellence Awards were established to recognize and encourage exemplary achievement in the field of digital preservation stewardship at a level of national or international importance. However, the Excellence Awards also provide a spotlight on the dedication, creativity, and effectiveness in tackling the multifaceted challenges of digital preservation.

 

What do you currently see as some of the biggest challenges in digital preservation?

Digital preservation is not just about preserving archival records, datasets, digital images, websites, and emails; it’s about protecting our history, culture, and knowledge for future generations. In a climate of constantly evolving technologies, it is important that digital artifacts remain accessible and usable by wide and varied audiences. To that end, for as long as there have been digital artifacts, there have been archivists and records managers to implement preservation strategies.  

What efforts/advances/ideas of the last few years have you been impressed with or admired in the field of data stewardship and/or digital preservation? 

One aspect of ongoing digital preservation efforts that I’ve followed closely, admired, and participated in over the years is the evolving conversations, imaginings, and application of metadata as more than a record in cultural heritage institutions, especially those that collect and make accessible African American collections. Community involvement and applications of archival description afford marginalized groups the opportunity to regain autonomy and ownership of their narratives, heritage, and history, while also drawing attention to historical injustices, social justice, and systemic racism, which is essential to the preservation of cultural heritage.

How do you feel the Excellence Awards encourage practitioners of digital stewardship/preservation?

Whether focusing on metadata and archival description, technological advances in systems and software, storage, creating resources, discovery and innovation of emergent digital preservation tools, collaborative road mapping of local and best practices, or developing digital preservation programs and policies, the work and ideas of practitioners are critically needed to ensure that the efforts of today are sustainable for tomorrow. Storage, sustainability, and the environmental impact of digital preservation are ever-present challenges. It is only through collective sense-making, creativity, and innovation that we can together remedy these issues.

One way to acknowledge and celebrate the achievement of information professionals and organizations is through recognition. The NDSA Excellence Awards, in addition to DPC’s Digital Preservation Awards, continue to be a means of inspiration, encouragement, and validation for our exemplary digital stewards, who remain committed to advancing digital preservation and stewardship.

 

You can keep up with Jes on Twitter @JestheArchivist or Instagram @vanguardarchives!

It’s here, the 2023 NDSA Preservation Storage Infrastructure Survey!

Do you manage and preserve content? If so, we’d like to know about your preservation storage infrastructure! The NDSA Preservation Storage Infrastructure Working Group is back again, with our latest iteration of the NDSA Preservation Storage Infrastructure survey. Please contribute to our longitudinal research designed to gain insight into how organizations worldwide use preservation storage to ensure long-term access to their digital content and to learn how real-world capacity and best practices differ.

The survey will remain open until November 22, 2023, and should take approximately 30 minutes to complete. We will make our best effort to protect your individual responses so that no one will be able to connect your responses with you or your organization. Any personal information that could identify you or your organization will be removed or changed before survey results are made public. We will combine your responses along with the responses of others and make the aggregated results public in mid-2024. To assist with completing the survey, all of the survey questions can be viewed in advance.

The results of our previous studies are provided here for your perusal:  2011, 2013, 2019

Please do not hesitate to reach out to the survey Co-chairs, Amy Allen (ala005 [at] uark [dot] edu) and Sibyl Schaefer (sschaefer [at] ucsd [dot] edu) with any questions you have about the survey.  

Thank you for your participation, and thank you for helping NDSA and our community define and advance digital preservation!

~ The NDSA Storage Infrastructure Survey Working Group

Announcing Incoming NDSA Coordinating Committee Members for 2024-2026

Please join me in welcoming the three newly elected Coordinating Committee members: Michael Barera, Chelsea Denault, and Jessica Venlet. Their terms begin January 1, 2024, and run through December 31, 2026. Read more about their backgrounds and interests below.

Michael Barera

Michael Barera has been the Assistant Archivist and Digitization Specialist at the Milwaukee County Historical Society (MCHS) Research Library since June 2022. This position ranges broadly from traditional archival responsibilities such as digitization, processing, and reference to unique and often innovative programs and projects related to Milwaukee history, including creating questions for and calling Milwaukee History Trivia Nights at local breweries and leading historical kayak tours on the Milwaukee River. Michael earned a Bachelor of Arts (BA) in history from the University of Michigan in 2012 and obtained a Master of Science in Information (MSI) in both Archives and Records Management (ARM) and Preservation of Information (PI) from the University of Michigan School of Information in 2014. Prior to taking his current position at MCHS, he served as an Assistant Archivist at the Texas A&M University-Commerce Libraries (from 2015 to 2019) and as the University and Labor Archivist at the University of Texas at Arlington Libraries (from 2019 to 2022). He has been a Certified Archivist since 2016.

Michael ran for NDSA Coordinating Committee for two primary reasons. The first is to bring the perspective of a small but innovative county historical society to the committee. The second is to learn from the committee and engage more deeply with NDSA as a whole, with the ultimate goal of learning more born-digital and digitization best practices that can be realistically implemented at MCHS and thus raise its level of practice.

Chelsea Denault

Chelsea leads the Michigan Digital Preservation Network, a program of the Midwest Collaborative for Library Services with support from the Library of Michigan. As the MDPN’s Coordinator, she works to build a community-centered statewide service focused on leveraging shared resources and expertise to make digital preservation affordable and accessible to all cultural memory institutions. As part of her efforts, Chelsea provides guidance and training on digital preservation in Michigan and leads the MDPN’s policy development and member recruitment. She also serves as the PI for the MDPN’s IMLS-funded grant to explore simplifying digital preservation workflows and provide training for non-technical users at under-resourced institutions in Michigan and beyond. Chelsea has served the NDSA on the DigiPres Conference Planning Committee (2021-2023) and the Long-Term Conference Planning Working Group. She also represents the MDPN in the Private LOCKSS Network (PLN) Community, and contributes to the Cross-PLN Technical Committee and the Shared Messaging Group. Before joining the MDPN, Chelsea was a public historian engaged in community outreach and collections work, and she holds an MA and a PhD in Public History/US History from Loyola University Chicago. Chelsea is guided by the MDPN’s commitment to small, underserved organizations, and plans to represent their needs on the Coordinating Committee.

Jessica Venlet

Jessica Venlet works as the Assistant University Archivist for Digital Records and Records Management at the University of North Carolina at Chapel Hill Libraries. In this role, she is responsible for a variety of things related to both records management and digital preservation. In particular, she leads the processing and management of born-digital archival materials.

Jessica is drawn to participation with NDSA because of how valuable the resources and network are to her work and to the profession overall. She has recently participated in working groups for the 2019 Levels of Digital Preservation Reboot (assessment subgroup), the 2021 NDSA Staffing Survey, and the 2023 NDSA Excellence Awards. She is excited to join the Coordinating Committee and contribute to the continued development of the NDSA organization and all its associated programs and working groups.

 

We are also grateful to all of the very talented, qualified candidates who participated in this election.

We are indebted to our outgoing Coordinating Committee members, Elizabeth England, Jes Neal, and Linda Tadic, for their service and many contributions. To sustain a vibrant, robust community of practice, we rely on and deeply value the contributions of all members, including those who took part in voting.

Bethany Scott, Vice Chair, on behalf of the NDSA Coordinating Committee

Catching up with past NDSA Excellence Awards Winners: Tessa Walsh

The NDSA Individual Excellence Award honors individuals making significant contributions to the digital preservation community. In 2019, Tessa Walsh was one of two awardees in this category. Tessa has created an evolving suite of robust open source tools meeting many core needs of the stewardship community in appraising, processing, and reporting upon born-digital collections. At the time of the award, her projects included the Brunnhilde characterization tool; Bulk Reviewer, for identifying PII and other sensitive information; the METSFlask viewer for Archivematica METS files; SCOPE, an access interface for Archivematica dissemination information packages; and CCA Tools, for creating submission packages from a variety of folder and disk image sources. Taken together, these tools support a very wide gamut of both technical and curatorial activities.

We recently caught up with Tessa to chat about the Excellence Awards. Read on to hear more about what Tessa has been working on recently! 

1) What have you been doing since receiving an NDSA Excellence Award?

I’ve been busy! Other than the whole global pandemic bit, I shifted from an archivist/librarian coding off the side of my desk to a professional software developer working on open source digital preservation tools, which has been a dream.

From March 2020 (the same week lockdown started here in Montreal) to September 2022, I worked as a Software Developer at Artefactual Systems, primarily on the Archivematica and Access to Memory (AtoM) projects. Getting a chance to grow leaps and bounds as a developer while working on open source software that the digital preservation and archival communities are heavily invested in was a dream come true. And as anyone who has had the chance to work with the folks at Artefactual will know, it’s a really supportive environment filled with kind, curious, multi-skilled people. I’m proud of some of the features I was able to work on there, including implementing a storage adapter for Archivematica to work with nearly any cloud storage provider, adding single sign-on to Archivematica and AtoM, helping users with their migration and theming projects, and working on some supplementary tools for things like reporting and audit logging.

In September 2022, I took a new role as Senior Applications and Tools Engineer at Webrecorder. Getting to work on a friendly and talented small team developing user-friendly open source solutions to challenging problems in web archiving has been fantastic. Since starting at Webrecorder, I’ve made contributions to pywb and Browsertrix Crawler, and have been heavily involved in the development of Browsertrix Cloud, a new open source cloud-native browser-based crawling service that unifies several Webrecorder tools into a single easy-to-use web application for creating, managing, curating, and sharing web archives. We’ve been hard at work developing both the software and a sustainable open source business model around it, and will be launching a hosted service as well as support models for the open source software in the coming months and year. It’s a really exciting time to be at Webrecorder, and I’m excited for us to continue furthering Webrecorder’s mission of web archiving for all.

I’ve also kept developing and maintaining a small set of my own open source projects, including putting out several releases of Bulk Reviewer (https://github.com/bulk-reviewer/bulk-reviewer/), a desktop application that aids users in finding and managing private and sensitive information in digital archives that is now included in the BitCurator Environment.

Finally, I’ve had the pleasure of being involved in a few research projects that I hope are helping to push forward thinking on topics that are of special interest to me. With Keith Pendergrass, Walker Sampson, and Laura Alagna, I published the paper “Toward Environmentally Sustainable Digital Preservation” in American Archivist in late 2019, which explores the environmental impact of digital preservation practice and suggests ways for the field to move forward in a more sustainable fashion. With Aliza Leventhal and Julie Collins, I published “Of Grasshoppers and Rhinos: A Visual Literacy Approach to Born-Digital Design Records,” also in American Archivist, in 2021. The paper applies a visual literacy approach to notoriously difficult digital design records such as CAD/BIM and 3D models in architectural archives with the hopes of making these materials more approachable to those responsible for preserving and providing access to them. And finally, with Jess Whyte, I’ve also been conducting interviews with Canadian memory workers on the issues they face and strategies they use in managing private and sensitive information in digital collections. Our paper, titled “‘Carefully and Cautiously’: How Canadian Cultural Memory Workers Review Digital Materials for Private and Sensitive Information,” will be published later this year in the open access journal Partnership: The Canadian Journal of Library and Information Practice and Research.

2) What did receiving the NDSA award mean to you?

Receiving the NDSA award validated the work that I was doing in trying to develop and maintain open source software that makes digital archiving and digital preservation work easier for practitioners. It helped me get over a bit of imposter syndrome and find the confidence to pursue software development as a career rather than just an interest, which I’m deeply grateful for! I hope and suspect it also introduced some new folks to some of the tools that I’d been working on, which is always nice.

3) What efforts/advances/ideas of the last few years have you been impressed with or admired in the field of data stewardship and/or digital preservation?

I think the conversations around environmental sustainability that have been happening in the last few years are wonderful and needed, especially as we see the effects of climate change unfold in real time. Digital stewardship will need to respond to increasing risks of events like data center outages, and it behooves us to try to reduce our footprint where we can through classic archival practices like careful selection and new techniques like threat modeling and using defined levels of preservation tiers appropriate for various types of content being stored.

In the web archiving space, I’ve been really excited about the possibilities afforded by client-side replay in the browser made possible by Webrecorder’s replayweb.page tool. By being able to render and rewrite web archives in the browser, we remove the need to upload data to a server in order to replay web archives and open up exciting new possibilities for access such as embedding web archive viewers into preservation and access systems (for more on that, see: https://replayweb.page/docs/embedding). I’m a big proponent of putting the focus on access to content that we’re preserving and I think this is a big step forward for web archives on that front!

4) How has your work evolved since you won the Excellence Award?

Since winning the Excellence Award, I’ve been fortunate to receive a lot of mentoring and have grown into a senior developer, which is really exciting personally. I’ve also had the opportunity to deepen my thinking on the sustainability of open source projects that the digital stewardship and preservation fields rely on through firsthand experience as a solo maintainer and as a person working on larger open source projects with many contributors. It’s a difficult thing to get right but really important, as we don’t want the burden of maintaining these tools to fall on individuals who aren’t compensated for their labor or for projects to become abandoned after being widely adopted.

5) What do you currently see as some of the biggest challenges or opportunities in digital preservation?

One thing I see as both a challenge and an opportunity currently is beginning to shift the focus from preserving content to providing open, sophisticated, useful access. Ultimately the goal of preservation is (or should be!) for someone to come use what we’re preserving. As the field matures and gets more comfortable in our preservation practices, I think there are a lot of interesting opportunities to demonstrate our value by connecting preserved content to users in forms that are useful to them, whether that means providing computational access to data, making it easier to integrate preserved content with our access systems, or pushing content to where people already are.

I’d also love to see us continue to lower the technical barriers to entry for digital preservation practice. A lot of the tools we rely on assume a certain level of competence with command line interfaces and scripting languages. Those tools can be great for providing a lot of flexibility to practitioners, and the field has done a lot to make learning these skills easier. That said, requiring such skills can also make it difficult to hire and mentor the next generation of digital stewards. I’d love to see our common toolsets continue to get more approachable and easier to use so that we can continue to grow and diversify our field of practitioners.

6) Are you working on any new digital preservation related tools at the moment? If so, could you please share a bit about the tool(s).

I’ve mentioned a few tools already, but I’d like to talk a little bit more about Browsertrix Cloud, the focus of a lot of my activity at Webrecorder these days. In the early days of development, a lot of our focus was on supporting functionality that was already possible through tools like Browsertrix Crawler in a more user-friendly and modern user interface. Now we’re focusing on building features that are new to Webrecorder, such as building and publicly sharing curated collections of web content, and integrating Browsertrix Cloud with existing tools like the archiveweb.page Chrome extension for manually archiving websites in your browser. By the end of the year, we’ll be working on some features that are, I think, relatively new to the web archiving field as a whole. I’m particularly excited about starting to work on software-assisted quality assurance (QA) of crawls, where we will be analyzing the WACZ files created by our crawler and presenting information to the end user about the relative quality of capture for the pages that have been crawled. That’s really just a start and I’m sure we’ll continue to refine what assisted QA can entail, but it aligns super well with my personal mission of using software to make currently onerous tasks easier for digital stewards, freeing them to use their time on the tasks where their expertise is most valuable.

Click here to read about other winners from the 2019 NDSA Innovation Awards!

How do you use the NDSA Levels of Digital Preservation?

Earlier this year the Levels Steering Group gathered feedback from the community about how the Levels of Digital Preservation were used. Some of our findings have already been shared in two earlier blog posts: “Finding out more about the use of the NDSA Levels of Digital Preservation” and “National Libraries and the Levels of Digital Preservation.” This is the third and final blog post in this series.

As noted in previous blog posts, there was a modest number of responses to our scenarios gathering exercise, so the results shared below offer only a small snapshot of community use of the Levels rather than representing a statistically valid set of results.

Here we focus on just one of the questions that was asked in our scenarios survey, though note this was actually several questions in one, digging into some of the different aspects of how organizations use the Levels. We’ll take each of those sub-questions in turn…

Describe how your organization uses the NDSA Levels.

The responses included some interesting examples of how people use the NDSA Levels, with some using them in quite specific ways.

  • One organization uses an approximation of the NDSA storage levels to define the different types of storage available for their digital assets.
  • One respondent uses the Levels to create and implement a basic repository-wide digital preservation system.
  • Another respondent notes a preference for DPC’s Rapid Assessment Model (DPC RAM), but states that they do like to directly refer to the NDSA Levels when they have a reason to articulate maturity levels relating to a specific row (for example metadata).
  • A couple of other responses mention using the Levels alongside other tools – one as a quick assessment tool alongside DPC RAM and another in conjunction with both DPC RAM and the DiAGRAM tool.
  • Another mentioned using the Levels as a means of teaching both students and practicing professionals about digital preservation.

One response to this question was that the Levels are not currently used at all because initial steps to persuade senior administrators about the need for digital preservation have not yet been made.

How frequently do you use them?

Out of the answers provided to this question, it seemed that the Levels are not typically used on a regular cycle (for example as an activity that is carried out to an agreed schedule every 1 or 2 years); rather, they are used in a more ad hoc fashion or as a result of specific triggers or drivers.

  • One respondent mentioned that “we do not use the NDSA Levels in a systematic way”.
  • Another noted that though they aim to complete an assessment annually, in reality they might have a delay of two or three years in between assessments.
  • Other respondents noted particular triggers that could lead to a re-assessment using the Levels, such as when digital preservation policies and plans are being reviewed or when other tools that rely on the Levels are being applied (the DiAGRAM tool from The National Archives UK was mentioned in this context).
  • Another respondent mentioned that they had used the NDSA Levels at the start of their digital preservation journey and have used them to check in on progress a couple of times since. They have a plan to continue to incorporate regular assessment going forward.

Who gets involved?

There was a range of responses to this question, with some answers stating that the digital archivist will carry out a self-assessment using the Levels alone, a couple of answers mentioning that colleagues in either their department or in IT will also be involved, and another stating that though the assessment is driven by the digital archivist, other internal stakeholders would be consulted as appropriate.

Who is the resulting information communicated to?

Some of the responses to this included:

  • Information produced as part of an assessment using the Levels is typically communicated internally with colleagues, including staff within the respondent’s department, senior administrative staff, and other stakeholders. In particular, senior staff were considered to be an important audience for this information.
  • One respondent noted that the information was used “as a way to help explain community expectations to traditional IT staff”.
  • Another noted that the information was communicated outside of their organization as evidence for their application for Archives Accreditation.

It is encouraging to see the NDSA Levels being used as an advocacy and communication tool within several organizations.

What documentation is maintained about the process?

There was little response to this sub-question, but one respondent noted that they retain a filled-in report for their records and to facilitate the tracking of progress over time. Another noted that documentation about their self-assessment using the Levels is kept as part of their organizational records and another specified that both the assessment and notes about it were kept on their internal wiki area. This is encouraging to see. Given that the Levels can be used to track and monitor progress over time, keeping records of previous assessments and notes relating to why a particular level was selected is an important way to facilitate future comparisons.

Does this process tie in with organizational review and planning cycles?

Several respondents mention how their use of the Levels ties in with planning.

  • One states the Levels are used to plan their digital preservation program and also that the use of the Levels does tie in with planning for future software and equipment upgrades.
  • Another states that their use of the Levels ties into planning cycles so they can identify target areas for resource investment.
  • Another mentions that they have referenced the Levels in their Digital Preservation Strategic Plan and mapped out strategic priorities and actions against them.
  • One person stated that “it does not tie in with any organizational review or planning cycles.”

There are likely to be benefits in regularly using the Levels as a check-in when reviewing progress and planning future work, so it is encouraging to see that several people were using them in this way.

Summary

It is interesting as ever to share information about the different ways that the community uses the NDSA Levels of Digital Preservation. Hopefully this small snapshot gives you some ideas to take away and use within your own organizations.

Thank you and next steps

Thanks again to all of those who submitted information to us on their uses of the Levels. As always, we encourage the whole community to provide feedback on the Levels at any time. We are currently considering whether a review or update to the Levels is required in 2024 and are interested in hearing from the community if there are things that you think need to change. You are also welcome to come to our next Open Office Hour session on October 18th at 11:30 AM Eastern Time. Connection details and notes from past sessions are available here.

 

NDSA Welcomes Two New Members in Quarter 3 of 2023

As of September 2023, the NDSA Leadership unanimously voted to welcome its two most recent applicants into the membership. Each new member brings a host of skills and experience to our group. Keep an eye out for them on your calls and be sure to give them a shout out. Please join me in welcoming our new members! To review our list of all members, you can see them here.

~ Bethany Scott, NDSA Coordinating Committee Vice-Chair

Loras College Center for Dubuque History

The Center for Dubuque History is home to many rare images, documents, and AV materials, and they are committed to making them more accessible through digitization. They are still in the early stages of this process, but after attending a Digital POWRR Institute, they are on their way and are eager to join a group where they can both learn and share their experiences with others as they gain expertise.

Open Preservation Foundation

The Open Preservation Foundation is a global not-for-profit membership organization working to advance shared standards and solutions for the long-term preservation of digital content. Through the development of open source tools, they enable memory institutions to preserve their digital collections. Two of their staff are already contributing to NDSA by being part of the DigiPres Program Committee.

 
