The Value of List Archives
A recent post on Policies on Unused JISCMail Lists highlighted the potential value of JISCMail lists which are no longer active but which host content that may provide historical insights into digital library developments. As suggested by the recent JISC ITT for an Analysis of the Value and Benefits of Text Mining and Text Analytics in UK FE and HE, data mining tools have grown in sophistication since the JISCMail service was launched in 2000. It may now be timely to perform data mining work on our email archives, particularly as email messages have recently been sent to owners of unused lists inviting them to delete archives without mentioning the implications of doing so. The current importance of JISCMail as an archive rather than as a communications tool is also suggested by the JISCMail statistics, which show that the majority of lists (5,840) have no recent posts (and this number is steadily increasing), while 1,583 have between 1 and 10 posts, 707 have between 11 and 100 and only 114 have over 100 posts.
However, as was discussed in the comments on the recent post, it is unclear whether closed lists which are no longer in use can be made open. Does the 30 year rule apply to JISCMail lists? According to Wikipedia, this rule states that “Public records ….other than those to which members of the public have had access before their transfer …., shall not be available for public inspection until they have been in existence for [thirty] years or such other period….as the Lord Chancellor may,…. for the time being prescribe as respects any particular class of public records”. However, as Chris Rusbridge has pointed out, Section 7 of the JISCMail Acceptable Use Policy states that:
“Messages sent to a JISCMail list will normally be archived, and these archives can then be retrieved by any member of that same list. These archives may also (at the discretion of the listowner) be made publicly available on the web, and thus be available to anyone. …
Archives or collections of the messages sent to a JISCMail list may not be made publicly available at another site unless the listowner has granted explicit permission, and the list members have been informed.”
It would therefore appear that as listowner I can make the LIS-ELIB-MANAGERS archive available (and also make the archive available elsewhere) provided I inform the list members. However, although the FAQ suggests that the decision to open access to a closed list rests with the list owner, the list owner still needs to judge whether it is appropriate for a particular list to be made open. Clearly there may be lists which contain confidential, sensitive, embarrassing or even potentially illegal content which should not be made available. In addition, as described in a JISCMail page on Copyright:
When you send a message to a JISCMail list, you retain your copyright in that message. You also retain your moral right to be identified as the author of the work, and your moral right against derogatory treatment.
The extent to which your message is made available across the internet will depend on the level of access that has been decided by the listowner.
What steps should be taken to decide whether or not to open up a closed list archive? This post describes the process which is being followed for the LIS-ELIB-MANAGERS archive.
Processes For Informing List Members
Auditing The List
The LIS-ELIB-MANAGERS list currently has 26 members. A message was sent to the list in order to see how many of the email addresses were still valid. There were 11 bounced messages, and only four people replied to the request to respond. It does not seem to be possible to find out how many people in total have ever subscribed to a list: for data protection reasons, when users leave JISCMail their names and email addresses are removed from the JISCMail database. However, the ownership of email messages relates to list members who have posted to a list and not to those who have only lurked. It therefore seemed feasible to explore the numbers of people who have posted to the list and the numbers of messages they have posted.
Unfortunately there doesn’t seem to be an easy way of getting reports on the numbers of people who have posted to a JISCMail list or the number of messages they have posted. I therefore used the advanced search function to search for the numbers of messages posted for each year.
Since the service does not report the numbers of messages posted by individual list members, I looked at the list archives in order to find the email addresses of people who had posted to the list, and then searched for each address in order to see the total number of messages posted from it. I did this for 30 users, including those whose names were familiar to me and who I felt were likely to have posted significant numbers of messages to the list. The details are given below.
In addition, I skimmed through some of the messages in order to gain a feel for the issues being discussed and to see if there appeared to be any sensitive topics being discussed or flame wars breaking out. As can be seen from the list of subjects which are illustrated, there don’t appear to be any sensitive issues being routinely discussed. However, I subsequently came across one post containing personal information about a member of the community, which I feel should be deleted if the list archives are to be made open.
|#|Name|No. of posts|
|---|---|---|
|1|Chris Rusbridge|119 [118 + 1]|
|3|John Kirriemuir|29 [16 (UKOLN) + 6 (ILRT) + 7 (OMNI)]|
|4|Kelly Russell|24 [13 + 8 + 3]|
|10|Brian Kelly|11 [10 + 1]|
|17|Lorcan Dempsey|5 [4 + 1]|
|25|Liora Rolfe Stubbs|3|
In the above table it should be noted that one person (John Kirriemuir) posted from three different organisational addresses. In addition four others posted from different variants of the same email address (e.g. email@example.com and firstname.lastname@example.org).
This table does not include everyone who has posted, and may not include everyone who has posted significant numbers of messages, since there are 155 messages not attributed to a sender. However, we seem to have listed the most active participants, including those who worked for the eLib programme team and those who worked at UKOLN, which hosted the eLib programme Web site and was actively involved in the design of the eLib programme. Having skimmed through the list archives, especially for the most active period in 1996-1998, it seems that many of the remaining posts will have been from a long tail of people posting informational messages about their projects, events, publications, etc.
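The manual tally described above could in principle be automated. The following is a minimal sketch in Python, assuming that the archived messages can be obtained as raw email text with standard headers; this is an assumption, not a documented JISCMail export feature. The function name `count_posts`, the `aliases` mapping and the sample addresses are all hypothetical and are used only for illustration.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr

# Label used for messages with no usable From header (hypothetical name).
UNATTRIBUTED = "(unattributed)"


def count_posts(raw_messages, aliases=None):
    """Tally posts per sender, merging known variants of the same address.

    raw_messages: iterable of raw email texts (headers plus body).
    aliases: optional dict mapping a variant address to a canonical one.
    """
    # Compare addresses case-insensitively.
    aliases = {k.lower(): v.lower() for k, v in (aliases or {}).items()}
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # parseaddr splits "Name <addr>" into (name, addr).
        _, address = parseaddr(msg.get("From", ""))
        address = address.lower()
        if not address:
            counts[UNATTRIBUTED] += 1
            continue
        counts[aliases.get(address, address)] += 1
    return counts


# Illustrative data only: two variants of one address and one orphaned post.
sample = [
    "From: Alice Example <alice@example.org>\nSubject: Project news\n\nHello",
    "From: ALICE@lists.example.org\nSubject: Re: Project news\n\nHi",
    "Subject: No sender header\n\nOrphaned",
]
aliases = {"alice@lists.example.org": "alice@example.org"}

totals = count_posts(sample, aliases)
for sender, number in totals.most_common():
    print(sender, number)
```

The alias mapping still has to be compiled by hand, as it was for the table above, since there is no reliable automatic way to tell that two addresses belong to the same person.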
Policy and Processes for Changing Access to the List Archives
Following this audit I have been in touch with Chris Rusbridge and Lorcan Dempsey in order to solicit feedback on the following proposed policy and implementation processes:
- Information on the audit of the lis-elib-managers JISCMail list will be published and promoted to those who were active in the eLib community in order to solicit their views on opening access to the lis-elib-managers archives.
- Current and previous list members will be informed that the list owner and others involved in managing the list when it was in active use feel that the list was made closed so that it could address a particular audience and minimise distractions.
- Any posts found to contain personal information which we feel would be inappropriate to publish openly will be deleted.
- Individuals who have posted to the list and who have concerns about confidentiality, legality or related issues can request further information about their posts.
- If no specific concerns are raised after a period of a month, the list archives will be made open. This policy on openness will allow the archives to be published elsewhere, such as on the Markmail.org service. If concerns are raised, these will be discussed by Brian Kelly, Chris Rusbridge and Lorcan Dempsey.
- Rachel Bruce and Neil Grindley from the JISC, who have an interest in preservation policies, will be informed of the proposed change in status of the list and the processes which have been used prior to this change.
In brief, the process for opening up access to a mailing list archive, which may be applicable to other lists, consists of:
- Auditing the archive in order to identify the numbers of people who have posted messages and the numbers of messages that have been posted.
- Identifying the reasons why the list was set up as a closed list.
- Gaining an understanding of possible risks in opening up access to list archives.
- Formulating a policy decision with key stakeholders.
- Communicating the policy and gathering feedback.
- Analysing the feedback and reviewing any changes to the proposed policy.
- Implementing the policy.
This post began by describing the value which text mining tools can potentially provide by exploring the contents of email archives. It is important to note that such text mining need not be carried out by the organisation hosting the archives; indeed there may be advantages in allowing an email distribution service to focus on the challenges of delivering large volumes of email for the higher education sector and allowing other organisations with expertise in data mining to provide this service.
The proposed changes to the policy will also allow the content to be reused elsewhere, such as on the Markmail.org service. As can be seen, this contains a large amount of content about JISC (1,235 messages from the 8,371 lists it currently indexes). However, this does not include lists which are hosted by JISCMail, due to the JISCMail policy which prohibits archives from being hosted elsewhere without the permission of the list owner. I hope that this post has outlined one way in which a closed list can be made open, and how such openness can be exploited by allowing a service which can demonstrably add value to make use of the valuable archives provided by the JISCMail service.
Is this an appropriate approach? I’d welcome your feedback.