Where are they now? Part II

Written by Sarah Newhouse on July 27th, 2012

The last time this blog heard from me, I had finished processing the papers of Dr. Stella Kramrisch at the Philadelphia Museum of Art. In that blog post, you can tell that I’m a little surprised that the processing went so well. I thought it would be complex to reconcile two different phases of previous processing that had separated a collection into two physical groups.  I can laugh at that now, because it turns out 1-year-ago-Sarah had no idea how complex processing could really be. (Oh, little baby archivist, just you wait.)

Since I left the Hidden Collections project, I’ve worked on two projects at The Historical Society of Pennsylvania (which participated in the CLIR grant but alas, I was not part of the team that worked there). The first was as project archivist for the Digital Center for Americana Project, Phase II. Both phases of this project had, at their heart, the drive to create access to the collections at HSP through digitization. Phase I focused on collections relating to the Civil War and Phase II on collections that documented immigrant families, individuals, and communities in the Philadelphia area. I feel especially lucky that I got to work on this project given the subject matter. Many researchers know about HSP’s treasures – and there are some amazing things in those holdings, believe me – but fewer researchers know about these collections that document the immigrant experience or represent minority groups. The history of the Philadelphia area is mostly a narrative of Western European families who, yes, were all immigrants themselves, but very well-documented immigrants. So I’m happy to be adding to the richness of that narrative by making collections of less well-documented minority and immigrant groups accessible to the public.

The project involved some MPLP and some full processing. Collections had to be arranged, described, housed, inventoried, conserved, and digitized. Some collections received full digitization, like the beautiful 18th and 19th century bound volumes in the Abraham H. Cassel collection and the tapes and transcripts in the Balch Institute’s South Asian Immigrants in the Philadelphia Area Oral History Project.  Others received “signpost” images, meaning that I selected items for digitization that represented the contents of the collection. This was actually a bit of a challenge, because I had to resist the urge to digitize the most unusual, amazing, or funniest items in a collection and just digitize things that wouldn’t mislead a researcher as to the collection’s contents.  So, for the Athena Tacha papers, rather than digitize a letter from one of Tacha’s famous artist friends, I chose one of her many letters to her family in Greece.

One of the biggest challenges with this project was the language barrier. I can read some German (but don’t ask me to speak it), as well as Japanese, Latin, and a tiny bit of Spanish, but this project also included Greek, Swedish, and French, languages that I had zero experience with. Luckily, I was able to fall back on the skills of two interns who were natives of Sweden and Greece. Without their help, the finding aids for these collections would have been a lot less informative and the processing experience a lot less fun. The interns had different levels of archives experience, so I relied on them mostly as translators rather than processors. But even our clever Swedish intern, who spoke German fluently, was stumped by some of the spidery, 18th century German handwriting and syntax we encountered.

Working on the DCAII has given me a deep respect and thankfulness for the work that Holly and Courtney did on the PACSCL CLIR project. Transitioning from a student processing intern to a project archivist had a very, very, very steep learning curve. But luckily I had some understanding coworkers who created a support system of archivists, conservators, and digital technicians, all willing to put up with my mistakes and answer my questions (although in hindsight, one of my biggest mistakes was not asking more questions). Coordinating moving collections between three departments was difficult, as was getting used to budgeting my time on a project for which I had to keep track of and participate in processing, conservation, and digitization tasks. I also managed interns, ordered supplies, blogged, helped organize an exhibit, helped arrange a talk, and generally tried to look like I knew what I was doing. (As the internet says: fail.)

Of course, I would not be where I am now — happily processing the papers of the Woodlands Cemetery Company at HSP — if I hadn’t been selected as a student processor for the Hidden Collections project. This project and others like it are truly wonderful ways for archives and LIS students to get their feet wet in the processing pool. Especially if they’re managed as well as we were, with readily available guidance and frequent on-site supervision, processing interns gain not only skills they’ll need for those first few jobs, but the confidence to use them.

For further reading, here are some links with information about the projects I’ve done since Hidden Collections:

HSP’s Digital Library: http://digitallibrary.hsp.org/

HSP’s finding aids: http://hsp.org/collections/catalogs-research-tools/finding-aids

HSP’s archives blog, “Fondly, Pennsylvania:” http://hsp.org/blogs/fondly-pennsylvania

Please feel free to contact me if you have any questions about the DCAII and its collections, Woodlands Cemetery, or my experience with the PACSCL-CLIR Hidden Collections project. snewhouse@hsp.org

 

Hidden Collections Initiative for Pennsylvania Small Archival Repositories

Written by Celia Caust-Ellenbogen on May 7th, 2012

If you’ve been following this blog of the PACSCL-CLIR Hidden Collections Processing Project, you might be interested in learning about the Hidden Collections Initiative for Pennsylvania Small Archival Repositories (HCI-PSAR, or the “Small Repository Project” for short). This post could be filed under “PACSCL-CLIR Student Processors–Where Are They Now?” since fellow former student processor Michael Gubicza and I are both currently employed on the Small Repository Project. But before you conjure up too many thoughts of drug-addicted 80s TV stars and one-hit-wonder 90s teen queens, think of this post as also belonging under the headings “Lessons Learned” and “Project Legacy.” The Small Repository Project carries on PACSCL’s commitment to uncovering hidden archival collections, and builds on the PACSCL-CLIR methodology, tools, and infrastructure–with a few new twists, of course.

Another creative storage solution at Millbrook Society! Hatboro Borough records, stored in a biscuit box.

First, some background on the Small Repository Project. It’s an initiative of the Historical Society of Pennsylvania–not coincidentally, one of the repositories where I processed for PACSCL-CLIR–with funding from the Andrew W. Mellon Foundation. The Small Repository Project aims to make better known and more accessible the important archival collections held at the many small, primarily volunteer-run historical societies, historic sites, and museums in the Philadelphia region. It was envisioned as a three-part project, and right now we’re in the midst of Phase I, which focuses on Philadelphia and Montgomery Counties. My title is Project Surveyor, so my job is to visit all of the small repositories in those two counties and survey their archival collections. There are two major components to the survey work: description and assessment.

Historical Society of Tacony: Frank Shuman, a Tacony resident, developed the world's first solar power plant in 1912-1913!

Description

In just six months of surveying, we’ve already discovered many amazing collections, from big names–like Pennsylvania Governor Samuel Pennypacker and Civil War naval engineer John Ericsson–to names that didn’t make the history books–like Frank Shuman, who built the world’s first solar power plant in 1912, or Dr. Hiram Corson, an abolitionist and prominent advocate for women physicians. To make these important resources more visible, we are creating what amount to “stub” finding aids: we don’t have the time to physically process any collections, but we can provide collection-level descriptions with brief summary information. To be as fast yet thorough as possible, Michael and I use the Archivists’ Toolkit, Holly and Courtney’s data-entry best practices, and an Excel-to-XML worksheet of my own devising that was heavily inspired by Matt Herbison’s.

PACSCL and the University of Pennsylvania recently agreed to host our finding aids, so they will be on the PACSCL Finding Aid Site together with the PACSCL-CLIR “Hidden Collections” finding aids. I am personally thrilled about this detail, because it means Philadelphia will be one step closer to having one central database where all area archival collections could be searched. In one place, you will be able to search collections from the biggest professionally-run PACSCL member to the smallest all-volunteer historical society! None of the Small Repository Project finding aids are up quite yet, but keep an eye on the site…

Old York Road Historical Society

Assessment

As I mentioned, the Hidden Collections Project doesn’t have the time to physically process all the collections that we survey, but we do hope that at least some of them will be processed in the not-too-distant future! Toward that end, we not only describe but also assess each of the collections we survey. We look at the condition of the material, quality of housing, degree of intellectual access (existence of finding aids), physical accessibility (organization), and research value (a combination of an interest ranking and a rating for how well those interesting topics are documented). These ratings help establish collection care and processing priorities–a collection with a high research value rating but low accessibility ratings should be processed first.
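To make the prioritization idea concrete, here is a minimal Python sketch. Everything specific in it is an assumption for illustration only: the 1-to-5 scale, the field names, modeling research value as a simple sum, and the priority formula are all hypothetical and are not the project's actual scoring rules.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One surveyed collection; all ratings are on an assumed 1-5 scale."""
    name: str
    condition: int            # condition of the material
    housing: int              # quality of housing
    intellectual_access: int  # existence of finding aids
    physical_access: int      # organization
    interest: int             # interest ranking
    documentation: int        # how well the interesting topics are documented

    @property
    def research_value(self) -> int:
        # Assumption: the "combination" is modeled as a plain sum.
        return self.interest + self.documentation

    @property
    def accessibility(self) -> float:
        # Assumption: average of the two access ratings.
        return (self.intellectual_access + self.physical_access) / 2


def processing_priority(a: Assessment) -> float:
    # High research value and low accessibility both raise the priority.
    return a.research_value - a.accessibility


def rank(assessments):
    """Return assessments ordered from highest to lowest priority."""
    return sorted(assessments, key=processing_priority, reverse=True)


# Made-up example collections, not real survey data.
collections = [
    Assessment("Collection A", 3, 3, 4, 4, 4, 4),
    Assessment("Collection B", 2, 2, 1, 2, 5, 4),
    Assessment("Collection C", 4, 4, 2, 2, 2, 2),
]

for a in rank(collections):
    print(a.name, a.research_value, a.accessibility, processing_priority(a))
```

In this made-up data, Collection B comes out on top because it pairs the highest research value with the lowest accessibility, which matches the rule of thumb above.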

PACSCL did the same sort of assessments for its member institutions a few years back (PACSCL Consortial Survey Initiative), based on a survey project at the Historical Society of Pennsylvania before that. The collections processed for the PACSCL-CLIR “Hidden Collections” Processing Project were those identified by the PACSCL survey as having the highest potential research value.

The assessment methodology that we use in the Small Repository Project, down to the assessment criteria and ratings descriptions, is modeled after the PACSCL survey. Check out Matthew Lyons’ blog post about our methodology. We strive for consistency so that our ratings will be comparable to PACSCL’s. Only the future can say whether anyone will undertake a large-scale, multi-repository processing project like PACSCL-CLIR “Hidden Collections.” But our assessments can help individual small repositories best allocate their own limited resources.

Social Media

While I worked on the PACSCL-CLIR project, I loved sharing my favorite “finds” from the collections I processed on the project Flickr page and blog. We do the same thing at the Small Repository Project! Check out our blog and our photo albums. For updates, follow us on Facebook or Twitter.

Finally, I’d like to take this opportunity to thank Holly, Courtney, and everyone who has worked on the PACSCL-CLIR Hidden Collections Project. The tools, techniques, and wisdom they developed and shared on their project website have proved invaluable to us in implementing the Small Repository Project. I’m sure that many other important and innovative archival projects will build on the PACSCL-CLIR project, and we all, collectively, thank you for enriching our communal knowledge.

 

Excel to EAD-XML to AT—the spreadsheet from heaven.

Written by Holly Mengel on March 19th, 2012

Although it seems like a million years, it actually was not so long ago that our students were processing at the Independence Seaport Museum.  While we were there, we were faced with one of the limitations of our minimal processing time frames.  The archivist there, Matt Herbison (now at Drexel University College of Medicine Legacy Center) had a few spreadsheets detailing information on ships’ plans—information that made the collections truly useful to researchers.  Problem was, there were thousands of entries in the spreadsheets and we knew that our processors could never re-key or copy/paste that information into the Archivists’ Toolkit in the time allotted for the processing.

Because we knew that this information would really make a difference for users, we thought and thought of ways to make this work, but our best solution involved saving the spreadsheet as a pdf and linking to it from the finding aid–not very elegant. And then Matt, who really is extraordinarily techie, created this amazing spreadsheet that solved the problem.  To sweeten the deal even more, he offered Courtney and me the use of the spreadsheet for the project.

I will now make a very bold statement:  this spreadsheet made it possible for us to finish the project within the time frame.  Not only did we use it at the Seaport, our processors used it for original data entry at repositories that had spotty internet connections, technical troubles, and/or did not adopt the Archivists’ Toolkit.  Our Archivists’ Toolkit cataloger used it as a starting point for almost all electronic legacy finding aids.

Matt has offered to share this spreadsheet with everyone. It is available here and we have created a guide for using the spreadsheet. In a nutshell, each column in the spreadsheet maps to a specific field in the Archivists’ Toolkit. It has three levels of hierarchy below the collection level, so it is not the tool of choice if your finding aid has sub-sub series and items, but for most modern finding aids, it is the ticket. I should say, though, that it is not necessarily a quick process if you are starting with existing data … time needs to be taken to combine columns, format data, and check for errors. If you know how to use regular expressions, you can really streamline some of this work. If you are doing original data entry, the use of the spreadsheet is incredibly efficient for getting container lists into the Archivists’ Toolkit.
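For readers who want a feel for what a column-to-field mapping with three levels of hierarchy involves, here is a rough Python sketch. It is not Matt's spreadsheet or its logic: the column names (level, title, box, folder), the sample rows, and the regular expression are all assumptions made up for illustration. The element names (dsc, c01 to c03, did, unittitle, unitdate, container) are standard EAD 2002 elements.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical spreadsheet export; the real spreadsheet's columns differ.
SAMPLE = """level,title,box,folder
1,"Correspondence, 1770-1818",,
2,"Incoming letters, 1770-1790",,
3,"Letters, 1770-1775",1,1
3,"Letters, 1776-1790",1,2
1,"Financial records, 1800-1810",,
2,"Account books, 1800-1805",2,1
"""

# Assumption: a trailing ", YYYY" or ", YYYY-YYYY" is the date.
TITLE_DATE = re.compile(r"^(?P<title>.*?),\s*(?P<date>\d{4}(?:-\d{4})?)$")


def split_title_date(text):
    """Split 'Title, 1770-1775' into ('Title', '1770-1775')."""
    match = TITLE_DATE.match(text.strip())
    if match:
        return match.group("title"), match.group("date")
    return text.strip(), None


def rows_to_dsc(rows):
    """Nest rows into <c01>/<c02>/<c03> components under a <dsc> element."""
    dsc = ET.Element("dsc")
    parents = {0: dsc}  # most recent component seen at each level
    for row in rows:
        level = int(row["level"])
        if level not in (1, 2, 3):
            raise ValueError(f"unsupported level: {level}")
        if level - 1 not in parents:
            raise ValueError(f"row has no parent component: {row['title']}")
        comp = ET.SubElement(parents[level - 1], f"c0{level}")
        did = ET.SubElement(comp, "did")
        title, date = split_title_date(row["title"])
        ET.SubElement(did, "unittitle").text = title
        if date:
            ET.SubElement(did, "unitdate").text = date
        for kind in ("box", "folder"):
            if row[kind]:
                ET.SubElement(did, "container", type=kind).text = row[kind]
        parents[level] = comp
        # Forget deeper levels so later rows cannot attach to a stale parent.
        for deeper in range(level + 1, 4):
            parents.pop(deeper, None)
    return dsc


dsc = rows_to_dsc(csv.DictReader(io.StringIO(SAMPLE)))
print(ET.tostring(dsc, encoding="unicode"))
```

The regular expression is the kind of small helper that saves time with legacy data: it pulls a trailing date out of a combined title-and-date cell, and leaves anything it does not recognize untouched for a human to check.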

This means that anyone with knowledge of MS Excel can create finding aids and take legacy information from an electronic format to xml. Pretty awesome! I will say that a little knowledge of EAD is very useful and understanding the Archivists’ Toolkit will make decisions in data entry easier. Many of our students preferred working with the spreadsheet rather than the Archivists’ Toolkit, but it is a matter of preference. I think it is a little harder to see the hierarchy when using the spreadsheet, but it is a thousand times easier to fix errors in Excel than in the Archivists’ Toolkit. Check it out, try it out and see if it changes your life.

Yes, I did say that … I think it could change your life!

Thanks SO much to Matt Herbison!

 

Legacy finding aids: a trial (by any definition)!

Written by Holly Mengel on February 13th, 2012

77 “substandard” or legacy guides are now in the Archivists’ Toolkit and final editing is underway.  And I am happy about that … however, almost none of these look as good as they could or should.  Garrett Boos, Archivists’ Toolkit cataloger, and I spoke many times about the limitations of this part of the project.

We decided that there were several problems:  working remotely from the collections; the format, structure and quality of the finding aids that were given to us; and, to be perfectly honest, our own expectations for the final product.

Before Garrett started, I decided that working remotely was going to be the most logical way to approach this part of the project. Garrett worked in our office at Penn and entered the collections into our own instance of the Archivists’ Toolkit. We then exported the finding aids from his AT and imported them into each repository’s instance of the Archivists’ Toolkit. I decided to have Garrett work at Penn primarily because of logistics: otherwise, he would have had to work at 18 different repositories and, as we have learned, technology and space are two of the greatest challenges of the project. Not to mention the instances when security clearances would need to be run, etc. However, now that Garrett is done with the project, I have been trying to decide if it would have been better for him to work on-site and I am torn. On the one hand, it would have made a lot of factors easier, especially checking on locations, vague titles and missing dates, to name only a few. On the other hand, it would almost certainly have stopped being a “legacy finding aid conversion” project and turned into a “reprocessing” project. So I guess I need to stand by my decision to work off-site, even if it was limiting.

The reason I say that it would have turned into a “reprocessing” project is because Garrett and I think that at least 60% of the collections should have had some physical and intellectual work before the finding aid was considered final.  As with all aspects of this project, the legacy finding aid component was an experiment and therefore, the grant allowed repositories to send us any “substandard finding aids.” This resulted in several types of “tools.”  Garrett took them all on:  lists, card catalogs, databases and more traditional finding aids.  The biggest problem we found was that very few of these guides were organized hierarchically which meant that we had to do a lot of guessing—was something a folder, or was it an item?  Should the paragraph connected to a folder title be added as a scope note or was it actually part of the folder title?  What to do with the information about the contents of a letter, or the condition of the material?  What happens when there is no biographical/historical note and no scope and content note?  Thank goodness for email and helpful repository staff! 

I should say that there were a number of finding aids that came to us in absolutely perfect shape … putting that finding aid into the Archivists’ Toolkit was a piece of cake and the resulting finding aid was beautiful. Others that were written before finding aids were standardized did not work nearly so well. Because we forced non-hierarchical guides into AT, a system designed to organize information hierarchically, some of the finding aids are actually less user-friendly than the originals. Many of these legacy guides had item level description, something our stylesheet doesn’t handle well, resulting in what Garrett and I have termed, “really ugly finding aids.” Moreover, of 77 finding aids, only 15 did not require some enhancement of biographical/historical or scope and contents notes–which is pretty tricky when working off-site. Titles and dates almost always needed to be reformatted for DACS compliance. Our primary goal was to maintain every bit of information that was in the original, but it worries me that we have created online guides that are potentially overwhelming and off-putting to researchers.

Some repositories have told me that I should not worry—that getting the guide online is enough.  Others, though, I know are really disappointed with the result. We surveyed our participating repositories about the effectiveness of the project and their satisfaction, and while we have not heard from all, the component of the project that proved least satisfying is the legacy finding aid component. I know that it is, by far, the part of the project with which I am least pleased.

Does this mean that you should not do a legacy finding aid conversion project? No! Do a legacy finding aid conversion, but do it with some structure and guidelines! In order to have a successful legacy finding aid conversion project, we learned that repository staff will have to do some (or a lot of) front line work prior to unleashing the guide on the cataloger.

Before handing over a finding aid, repository staff should identify (in pencil is okay):

• Folder title (underlined in one color)
• Folder date (underlined in another color)
• Box number
• Folder number
• If there is additional material, into what field in the Archivists’ Toolkit/EAD should it be entered?
• Biographical/historical note (does not need to be narrative, but the information should be provided by an “expert”)
• Scope and content note (same as the bio note)

If, as you go through this process, it becomes obvious that reprocessing is necessary, take the collection off your conversion list and place it on a priority list for processing.  Processing the collection may be quick and speedy and your result will almost certainly be better! In fact, I think, in some cases, we spent more time forcing data into AT than it would have taken to reprocess the collection.

Identifying these essentials should result in finding aids that are more standardized and allow researchers greater access to your awesome stuff. Don’t count on it being a quick process, however: the prep work is time consuming, the conversion is time consuming, and the proofing and editing is REALLY time consuming. This is not a task that can be placed only on the person converting the finding aid … even after the finding aid was in AT, Courtney and I, with fresh pairs of eyes, found lots of mistakes in spelling, hierarchy and grammar which would have been embarrassing and, even worse, would have potentially prevented people from finding that for which they were looking. Which is, of course, the whole point of all our work!

 

Description in MPLP is counter-intuitive

Written by Holly Mengel on February 7th, 2012

Courtney and I both felt strongly, from the very beginning of the project, that sacrificing description for speed was a risk in this project.  Although we know that every collection could still use additional work, we worked hard to make it so that the repository did not feel that additional work was necessary before they made the collection public.  Moreover, we knew from the start, that many of the collections would NEVER be worked on again.  Unfortunately, that is just how it is.

So what have we learned about description? We learned that description takes a lot of time. In fact, that is probably the first thing we learned in this project when we tested the manual and discovered that even an experienced processor could not arrange and describe a fairly straightforward collection from start to finish in 2 hours per linear foot. As a result, Courtney and I created processing plans that included a preliminary biographical/historical note before processing started. In general, we have learned that it takes roughly the same amount of time to describe a collection as it does to arrange a collection.

I’m not going to lie … I am pro description … few things give me more professional pleasure than a beautifully crafted folder title or a paragraph in a scope and content note that I know will help a user determine if this collection is going to help them with their research. That is the whole point: letting researchers know that we have the stuff that they need. As a result, the PACSCL/CLIR team took it seriously. Description is the one part of training that has probably evolved most over the course of the project. We developed exercises to help our processors write better and more descriptive folder titles and structure notes so that they are both concise and informative. The project didn’t have a lot of time, so we tried to make our processors think like a user and learn to quickly assess the contents of a folder. For the most part, we are really pleased with our finding aids and I think, nine times out of ten, researchers will be able to determine from the finding aid if the collection is worth their time in looking at it.

One of the really interesting things we learned is, to me, still the most counter-intuitive.  A collection with extremely tidy existing arrangement usually results in a collection with less thorough description.  I am going to use two specific collections to illustrate this issue.

The first collection is the Dillwyn and Emlen family correspondence, 1770-1818, housed at the Library Company of Philadelphia (unquestionably one of my favorite collections in this project—as well as being one of my biggest disappointments, archivally speaking).  When I sat down to process this collection, I was really confident—the collection was 2 linear feet and was already arranged.  At one point in time, it had been bound in volumes and at another point in time, the letters were removed from the volumes and placed in very acidic folders.  Every letter had a catalog number written on the document.  While a few of the letters were out of chronological order, the vast majority of the collection was arranged very effectively; each folder containing letters from a span of dates.

This collection desperately needed to be re-foldered. Not only were the folders highly acidic, but they were too small and some of the letters were showing a bit of damage. I re-foldered the 130 folders in the collection which took about 2.5 hours. Then I entered the folder list into the Archivists’ Toolkit which probably took only about 15 to 20 minutes. So in roughly 3 hours (three quarters of my allotted time), I had the collection rehoused and the folder list in the Archivists’ Toolkit, which left me 1 hour to write a scope and content note. Should have been easy, right? Well, no. Because this collection was perfectly arranged, I did not need to look at even one document in order to create the container list. Moreover, the container list is not very helpful to a researcher. All it contains is a list of dates which means that the scope and content note should be full of the subjects addressed in the correspondence. Problem is, I did not know anything about the letters. There was no way that I could read enough of the letters in an hour to discover all the topics addressed in the letters that will almost certainly be interesting to researchers. I did my best: I valiantly scanned through as many letters as I could and wrote down key topics that popped up more than once or twice, and as each minute passed, my heart sank just a little more. I knew perfectly well that I could never do this extraordinary collection justice, even with twice the time. Prior to beginning processing, I had performed my research for the biographical note and I had discovered that several authors had used portions of the collection in their published works … so I turned to them for expertise on this collection. They wrote about only a tiny portion of the collection, Susanna Dillwyn Emlen’s bout with breast cancer. I soaked up every bit of information in their books and included it in my scope note in order to give users the most information possible, but I feel like the project failed this collection. Perhaps I feel this so strongly because I had been so confident in significantly improving access to it.

I have beheld the second collection, the Belfield collection, 1697-1977, housed at the Historical Society of Pennsylvania, with equal amounts of awe, excitement and horror since I first laid eyes on it.  Never have I seen such a mess of a collection—please see just a few photographs as words cannot effectively describe the condition of this collection.  Courtney and I spoke with Matthew Lyons of HSP and he said that he was not expecting much more than good box level descriptions of the contents.  Even with these reduced expectations, we thought it wise to double our forces and therefore, Michael, Celia, Courtney and I all worked together on this collection.  I am happy to say that this collection will, for quite a few series, contain folder level description, but even more than that, the scope and content note for this collection is rich, deep and full of the flavor of the four generations of family who lived at Belfield.

So why does a collection that was the biggest (filthiest) mess of all time result in a better finding aid than a small and beautifully arranged collection?   I know it is because we were forced to sift through the messy collection in order to create any order, and it is amazing how much one absorbs simply by looking at the collection.  In the end, I feel that this is one of the biggest rapid maximal processing successes of the entire project.  We took the collection from utterly unusable chaos to an order that could certainly be refined, but is beyond serviceable.

When selecting collections for a minimal/rapid maximal processing project, consider your time frames and what result you want from the project. If you want a container list in a hurry, select a well-organized collection. If you want fuller description, a collection that needs some arrangement will probably be the best choice. From a purely selfish perspective, I would pick a wreck of a collection over a tidy one every time: the sense of accomplishment and success is so much sweeter than the despair I still feel when I think of the Dillwyn and Emlen letters.

I mentioned in an earlier blog post that there are about 3 collections that I don’t feel enormously benefited from this project.  In every case, the collections had existing arrangement that I felt either prevented me from starting from scratch or were in good enough order that I did not learn valuable content that I could then share with researchers.

 

The decision to minimally process should be a collection-by-collection decision …

Written by Holly Mengel on January 27th, 2012

Fairly early in this project, Courtney and I determined that “MPLP 2 Hours” was not going to be a wholesale success—most collections simply cannot be processed in that time frame, regardless of the shortcuts taken (our average across the board is 3.2 hours per linear foot).  And in some cases, those shortcuts resulted in a product that we did not feel was more useful to a researcher post-processing.  What we have determined is essentially this … it is difficult, if not impossible, to say that collections can be processed in a set or determined amount of time, but it is possible to make educated estimates allowing us to allocate human resources to process collections efficiently.

There are several factors that allow us to better determine a time frame for the processing of collections:  age, type of collection, and original arrangement of the collection are the three biggies. None of these factors work independently—they are all intertwined to help determine the time frame.  So, based upon the data collected for 125 collections, processors have physically processed collections with the oldest material dating from the:

17th century at an average of 4.1 hours per linear foot;

18th century at an average of 3.3 hours per linear foot;

19th century at an average of 3.4 hours per linear foot;

20th century at an average of 2.9 hours per linear foot.

Processors have processed:

artificial collections at an average of 3.6 hours per linear foot;

institutional/corporate records at an average of 2.5 hours per linear foot;

personal papers at an average of 3.7 hours per linear foot;

family papers at an average of 4.2 hours per linear foot.
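As a rough illustration of how the averages above could feed an educated estimate, here is a small Python sketch. The approach is an assumption on my part: since the factors are intertwined rather than independent, it does not try to combine them into one number. It simply looks up the per-type and per-century averages listed above and reports the range between them. The function and variable names are hypothetical.

```python
# Averages reported above, in hours per linear foot.
HOURS_BY_OLDEST_CENTURY = {17: 4.1, 18: 3.3, 19: 3.4, 20: 2.9}
HOURS_BY_TYPE = {
    "artificial": 3.6,
    "institutional/corporate": 2.5,
    "personal": 3.7,
    "family": 4.2,
}
OVERALL_AVERAGE = 3.2  # project-wide average, hours per linear foot


def estimate_hours(linear_feet, collection_type, oldest_century):
    """Return a (low, high) range of estimated processing hours.

    Assumption: the two averages are treated as the bounds of a range,
    not blended into a single prediction.
    """
    rates = (
        HOURS_BY_TYPE[collection_type],
        HOURS_BY_OLDEST_CENTURY[oldest_century],
    )
    low, high = min(rates), max(rates)
    return round(low * linear_feet, 1), round(high * linear_feet, 1)


# Example: 10 linear feet of family papers, oldest material from the 1700s.
low, high = estimate_hours(10, "family", 18)
print(f"Estimated {low} to {high} hours")
print(f"At the overall average: {round(OVERALL_AVERAGE * 10, 1)} hours")
```

An estimate like this is only a starting point for allocating processors; the original arrangement of the collection, which is not captured in these tables, can push the real figure well outside the range.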

Age seems like it should be the most logical factor, but in fact, it has proven to be the least certain factor in our ability to judge the time frame for processing. We thought originally that old collections (pre 1850s for certain) would take us significantly longer to process, but this is not necessarily the case. Age does not seem to keep us from efficiently processing an “old” collection. Age does, however, quite frequently deter us from describing the collections well. Quickly skimming for content in folders of 17th, 18th and 19th century handwritten material is not easy, and it absolutely results in less thorough description. However, if the collection is arranged and available for research use, perhaps this is where we ask for help … as researchers use the collections, we can ask them to provide more robust description of what the correspondence, journals, etc. contain. Finding aids CAN be iterative … especially with technology such as the Archivists’ Toolkit. “Newer” collections may or may not be easier to process … certainly there is more typewritten material that makes it immediately easier to categorize series/subseries/folders and describe the contents of the folders more thoroughly. However, in the end, the ease of the processing relies more heavily on the type of collection than on the age.

For this project, we have divided collections into four basic types:  institutional/corporate records, personal papers, family papers and artificial collections.  Again, there is no one size fits all … each collection is unique (is that not why archival collections are so awesome?).  Generally speaking though, an institution or company’s records can be processed most quickly, followed by personal papers and then family papers.  Artificial collections are usually the fastest or the slowest depending entirely upon the collector.  Usually, they are speedy—the collector is in love with the topic they are collecting and as a result, they arrange the collection for their own personal satisfaction and use—all the letters of a children’s book author are arranged chronologically by date sent or alphabetically by the recipients’ names.  If this is the case, the artificial collection is a dream to process and it usually requires only description.  In a few instances, however, we have found collections where the collector simply collects … they probably know that the stuff is important, but they are not organizers.  At that point, trying to create a system out of a group of randomly acquired material can be quite difficult.

Institutional and business records are usually quick and easy and this is because the functions of a business or an institution generally follow the same basic structures and are fairly predictable.  Usually, you will find financial records, minutes, committee records, administrative records, subject files, correspondence, etc.  Because the function generates the records, it is logical and easy to determine a good organizational scheme for the papers.  But as always, the collections are unique and we have found that different creators generate different levels of tidiness, logical order, and structure.

Personal papers are the next quickest to process (generally speaking), especially if the creator was involved in several major movements, careers, and/or activities.  However, the ability to efficiently process a person’s personal collection often depends upon how intermingled those pursuits are with family, friends, and work.

Family papers have been, fairly consistently, the most time-consuming collections to process.  The problems that arise with family papers that generally do not exist with personal papers are the intertwining relationships that make determining to whom a certain group of materials belongs challenging and, sometimes, impossible.  When every generation in a family has a woman named Sarah, determining generations becomes a trial.  Many a day passed at the Historical Society of Pennsylvania with the following conversation: “So wait, this is Sarah Logan Wister Starr?”  “No, this is Sarah Logan Starr Blain!”  Or:  “Here is a letter to Grandma Sarah from Sarah … does that mean it is Sarah Logan Starr Blain?”  “No!  It could be Sarah Logan Starr Blain OR Sarah Logan Wister Starr OR Sarah Tyler Boas Wister!”  Egads … I wanted to buy a baby name book for this family!  Not surprisingly, this kind of questioning takes time … lots of time.

The third main factor in determining time for processing a collection is existing arrangement.  A collection of 20th century business records thrown into boxes will take longer than a collection of 18th century business records that are housed in volumes.  A collection of family papers organized by the donor into distinct family members’ papers can probably be processed more quickly than a collection of personal papers that are completely unsorted.  I have intentionally not used the term original order, which implies that the order was generated by the creator.  Existing arrangement may have been generated by the creator, but in many cases, it is generated by an archivist who starts processing the collection but does not complete the project.  Unfortunately, the hardest collections to process efficiently are often collections that someone else has started to process.  Trying to understand an undocumented order that has been imposed or continue with an arrangement scheme that does not seem logical is much more difficult than imposing order from absolute chaos.  And without question, the collections that take the absolute longest are ones in which parts of the collection have received item level treatment.  The next blog post will address how this type of existing arrangement affects description of collections.

So, basically what we have said here is that every collection is different and unique and there is absolutely no way to say that one time will work even within a date frame or a type of record. Our observations are backed by Greene and Meissner who say that “MPLP … advises vigorously against adopting cookie-cutter approaches … and [recommends] flexible approaches,” (page 176).  In order to make educated estimates for allocating resources, we believe that a base-line starting time frame is needed:  institutional/corporate collections should be given 3 hours per linear foot.  Based upon the existing arrangement, tack on another hour per linear foot if it is in a shambles.  If the bulk of the material is from the 18th century, tack on yet another hour per linear foot for increased perusal time which will result in more effective description.  So, in this case, your estimated processing time is 5 hours per linear foot.  Could you do it in three?  Yes, probably.  However, with allowances for age and existing arrangement, you will almost unquestionably have a better product, still at just over ½ the rate of traditional processing.

Based upon our experience, the PACSCL/CLIR project believes that the following base-line processing time estimates would work well:

Artificial collections:  3 hours per linear foot

Institutional/corporate collections:  3 hours per linear foot

Personal papers:  4 hours per linear foot

Family papers:  6 hours per linear foot

Our averages clearly show how quickly collections can be processed … but the base-line estimate with upgrades allows us to provide the best possible product while being mindful of available resources.
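For anyone who wants to plug these numbers into a planning spreadsheet or script, here is a minimal sketch of the base-line-plus-upgrades arithmetic described above. The base-line rates and the two one-hour upgrades come from this post; the function name, the parameter names, and the choice to apply the same upgrades to every collection type (the post only walks through the institutional example) are assumptions made for illustration.

```python
# Base-line hours per linear foot, taken from the estimates in this post.
BASELINE_HOURS_PER_FOOT = {
    "artificial": 3,
    "institutional": 3,
    "personal": 4,
    "family": 6,
}


def estimate_hours(collection_type, linear_feet,
                   in_shambles=False, mostly_18th_century=False):
    """Estimate total processing hours for one collection.

    Assumption: the one-hour upgrades described for institutional
    records are applied the same way to every collection type.
    """
    rate = BASELINE_HOURS_PER_FOOT[collection_type]
    if in_shambles:
        rate += 1  # existing arrangement is a mess
    if mostly_18th_century:
        rate += 1  # extra perusal time for older material
    return rate * linear_feet


# A hypothetical 10 linear foot institutional collection, in a shambles,
# with the bulk of its material from the 18th century: 5 hours per foot.
print(estimate_hours("institutional", 10,
                     in_shambles=True, mostly_18th_century=True))  # 50
```

Treat the result as a starting point for allocating resources, not a promise; as noted above, every collection is different.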

 

Historic recipes: always a great way to celebrate!

Written by Holly Mengel on January 24th, 2012

At the end of 2011, the PACSCL/CLIR “Hidden Collections” project gathered our processors, our repository staff, and our extraordinary helpers together to celebrate the successful completion of the project.  It was a way for Courtney and me to thank everyone who worked so hard and made this project work!

We celebrated in the beautiful Ewell Sale Stewart Library and Archives at the Academy of Natural Sciences, Philadelphia, thanks to the generosity of the archivist, Clare Fleming.  And many of our project team brought dishes, straight out of the past!  As we processed, the foodies among us took photographs of recipes we found in the collections, so it turned out that we had quite a nice pile of historic recipes to choose from when selecting our fabulous menu.  Photographs of our recipes can be found in our Flickr set Eating in the Archives.

I made five recipes and it occurred to me as I was running off to the grocery what a different world we live in from the late 1700s and 1800s.  For example, to make my five dishes, my shopping list included the following ingredients:  butter, shortening, flour, eggs, baking powder (lots of it!), a little sugar, milk, rice and a few spices.  I tend to think of myself as rather in touch with history, but I remember sitting for a few moments staring at the list and thinking, “what of my fabulous vanilla from Mexico?  what of cocoa?  what of lemon zest?”  I also remember thinking, in a cold sweat, of what I would have to eat in the midst of December if it were not for grocery stores, airplanes, ships, railroads, commercial farms with irrigation systems, etc., bringing fresh fruit and exotic ingredients from around the world.  The cold sweat returned as I baked–I have a whole new appreciation for Epicurious and cookbooks with instructions … I decided not to make “soft gingerbread” because the recipe included a list of ingredients, but no other instructions.  I am pleased to say that my braver colleague, Sarah, made the gingerbread with great success.

Now that I have talked about food (one of two conversations everyone eventually has with me–and usually sooner rather than later), I would like to publicly thank a few people:  our amazing project team; repository staff, who took us in and trusted us with their world-class collections; UPenn for hosting Courtney and me; Laura Blanchard, PACSCL staff member extraordinaire; Delphine Khanna, who is responsible for our fantastic PACSCL Finding Aids Site; Matt Herbison, who created a spreadsheet of wonder that helped make our project succeed (blog post on this forthcoming); Christa Williford from CLIR for all her support throughout the last 2.5 years; and Christine DiBella, who was responsible for the PACSCL Survey Initiative and helped me out, so much, particularly at the beginning of the project.  And finally, Courtney Smerz, who has brought her archival skill, pride of work, and enthusiasm to this project.

We have until the end of March to pull together all our loose ends and then we will leave behind this amazing project.  Thanks PACSCL and CLIR for this amazing opportunity!  I have enjoyed every minute!

 

27 months, 125 collections — How’d we do?

Written by Courtney Smerz on January 18th, 2012

After two years of speed processing across the Delaware Valley, Holly and I thought it prudent to take one last look at the collections before calling it quits. From September to the end of November we traveled from site to site reviewing our work and gathering information on the quality and accuracy of our efforts. In doing so, we learned a lot about the limitations of minimal processing AND our approach to training.

We processed 125 collections and spot checked 103.  Our approach varied a little from collection to collection, but generally speaking we followed the same protocol across the board, and created a worksheet to keep us on task. We took note of the overall condition of each collection, and recorded data on the condition of folders and whether folder labels were complete and legible.  We remeasured each collection (including counting containers and volumes), and carefully reviewed the contents of several boxes (every fifth or tenth box, for example, depending on the size of the collection).  Within boxes, we counted folders and reviewed the title and contents of at least one folder (sometimes many more) in each box, comparing the physical collection to what was recorded in the finding aid.  Here’s what we found:

  • Collections or parts of collections that benefited from new housing were infinitely easier to review than collections that remained in their original housing, particularly when it came to counting files, and reading and understanding the information provided on folder labels.
  • Inconsistent and incomplete folder labeling was a recurring issue in 32% of the collections we reviewed.  In particular, one of the more frustrating problems we encountered was that students frequently sacrificed recording the box and folder number or collection name on folder labels.
  • Another major issue we encountered was mistakes in box and folder numbering.  9% of the boxes we checked had numbering issues.  9% doesn’t seem like a lot, but renumbering boxes and folders (141 of ‘em, to be precise) is incredibly time consuming.  One mistake in numbering, as you probably know, means the entire box must be renumbered and updated in the database.
  • We identified 17 items that were unaccounted for in finding aids.
  • Happily, 96% of the files we checked for accuracy in description, when compared to the finding aid, were correctly described!
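If you want to plan a similar spot check, the “every fifth or tenth box” rule described above can be sketched in a few lines. This is only an illustration: the function name is made up, and starting the count at the fifth (or tenth) box and always reviewing at least one box in a small collection are assumptions, not a record of exactly how we picked boxes.

```python
def boxes_to_review(box_count, step):
    """Return the 1-based box numbers to pull for a spot check.

    Picks every `step`-th box (for example, every fifth or tenth).
    Assumption: a collection with fewer boxes than `step` still gets
    its last box reviewed, so no collection is skipped entirely.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if box_count < 1:
        return []
    picked = list(range(step, box_count + 1, step))
    return picked or [box_count]


# A hypothetical 23-box collection, checking every fifth box.
print(boxes_to_review(23, 5))  # [5, 10, 15, 20]
```

A larger step keeps the review quick for big collections; a smaller one makes sense where the first few boxes already show labeling or numbering problems.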

What we learned:

We had the good fortune to find and hire bright and enthusiastic student processors — nearly all of whom planned to become professional archivists.  We sometimes forgot, however, that they were not yet professional archivists and, though we provided a lot of training and feedback in certain areas, we placed less emphasis on others, perhaps assuming the importance of some tasks to be common knowledge.  We absolutely provided instruction on how to handle, house and label the physical collection, but in training (and in supervision) I think we inadvertently placed more importance on the quality of the finding aid.  That we employed MPLP, where less work is done physically, probably exacerbated this problem.

Though we were not able to gather data for all of these issues, anecdotally, I can say, the biggest offenders that detracted from the overall physical quality of the processed collections were: (1) failure to replace all of the damaged and/or brittle folders, (2) failure to re-record information provided on file labels with deteriorating adhesive, (3) inconsistency in folder labeling, (4) neglecting to record the collection name or number on folder labels, and (5) neglecting to record box and folder numbers on folder labels.

These issues not only made the collections look messy, but made them difficult to use.  Incomplete and inconsistent folder labels will certainly make research and reference (particularly returning files to their rightful place) difficult. And the failure to re-record information from failing adhesive labels risks losing some or all identifying information when those labels inevitably fall off and are lost.

If we had the chance to do it again, we would definitely add to our training and change how we supervised the students. At the very least, we would incorporate reference exercises into boot camp to place greater emphasis on how the condition of the physical collection impacts research and reference. I think there would also be less remote supervision, though in our case this was hard to avoid. While we pored over finding aids, making endless edits (four rounds of editing!), we should have made more time to review the actual collection together with the processors.  We had lots of conversations along the way about how to approach arrangement, but little time was made to discuss the mechanics of processing. Doing so would also have provided the opportunity for processors to fix their own mistakes (rather than Holly and me doing it for them, after they’ve moved on), which, in my opinion, is one of the best ways to learn.

We are revamping our training and processing materials to reflect what we learned over the last few months, so be on the lookout for a tweet or blog post announcing when they are ready.

 

Last minute holiday gift for the archivist in your life …

Written by Holly Mengel on December 22nd, 2011

Over the last two years, Courtney and I have had 17 graduate students work for us and we appreciated every minute of the time and energy that they extended to the project. So, as a gift at the end of their service to the project, we gave them what we like to call the Archivists’ Kit Bag (the obvious Archivists’ Toolkit being taken). We hoped that this bag of tools would make them ready for whatever job came their way–and a job came for each and every one of them!

I personally want one of these kit bags, as do Courtney and a few repository staff members who have seen them, so if you are still looking for the perfect gift for the archivist in your life, consider putting together this bag of goodies. We bought some tools from Gaylord, but you can actually go to craft stores (Dick Blick, Michaels, A.C. Moore, etc.) to get the bulk of it, if you are in a hurry.

Our kit bags included:

  • Bone folder
  • Micro spatula
  • Mechanical pencil with extra lead and erasers
  • Eraser
  • pH pen
  • Knife
  • Measuring tape
  • Plasti-clips
  • Gloves
  • Notebook

Our original bags were made by John Armstrong, surveyor during the PACSCL Survey Initiative, but after he moved to New England, we had bags made by an artist on Etsy who is, unfortunately, no longer able to make them for the price he had originally quoted.  They are waxed cotton and pretty awesome; however, a pencil case would work just as well–just make certain the bag is big enough to hold the micro spatula.

Your favorite archivist will love it!

 

A suite of related collections poses problems with controlled vocabulary at the PMA

Written by Courtney Smerz on December 12th, 2011

As you know, we processed at the Philadelphia Museum of Art this summer, where Susie Anderson, the archivist there, selected several collections of related institutional records for processing.  All were post-1950 collections of institutional records, each representing a distinct department within the Museum.  Seems like the ideal situation for MPLP processing, right?  Well, not exactly.  After processing a couple of the collections we realized that subject matter overlapped, and we ran into trouble with controlling vocabulary both within and across collections.  That means, whether browsing the collection folder lists or doing a keyword search on the finding aid site, a researcher will not find everything they are looking for in one place or under one heading.

The issue of controlled vocabulary has posed a problem throughout our project, particularly in regard to the faceted search on our finding aid site, which I won’t get into here.  In fact, I am not talking about name and subject authorities, I am talking about titling folders.

5 of us processed at the Art Museum.  When you have 5 different people whipping through collections at record speeds, you risk getting 5 different ways of arranging and describing records.  This problem is amplified when the collection’s creator wasn’t so great about consistent filing either.  Even when processors make it a point to talk to each other about filing decisions, which we did, some files and topics fall through the cracks and end up dispersed throughout the collection, rather than forming a cohesive (and easy to use) group of related records.

In cases like the Art Museum’s, where you (and 4 of your colleagues) are processing a group of related collections, describing collections consistently is important.  It’s also a really hard thing to achieve in MPLP.  What I found most frustrating was how obvious the mistakes were when we were completing data entry.  In a different environment, we’d have time to correct those discrepancies, but in our world we must live with them!