Tag Archives: Anna Kijas

Florence Nightingale 1858 diagram of mortality rates

March 2019 Data Visualization Display @ O’Neill Library

For the months of March and April, the O’Neill digital display (by the POP collection) will feature a curated “Women Also Know Data” visualization display to highlight diagrams, projects, and software developed by women over the last 160 years.

Diagram of the Causes of Mortality in the Army in the East

Florence Nightingale (1820-1910) is commonly referred to as the founder of modern nursing, but did you know that she was also a statistician and created hand-drawn data visualizations? She created “coxcombs” (or diagrams) and used them to report on conditions of medical care during the Crimean War. The “Diagram of the Causes of Mortality in the Army in the East” (1858) is an example of a polar area diagram that depicts the number of soldiers’ deaths by cause (preventable diseases in blue, wounds in red, and all other causes in black) according to month and year. Nightingale was recognized for her pioneering work and was the first woman elected (1859) into the Royal Statistical Society.

Los Angeles: The City and the Library

Dr. Colleen Jaurretche’s composition class at UCLA developed the “Los Angeles: The City and the Library” site as a way to explore the history of Los Angeles. This project exemplifies how faculty can collaborate with librarians and archivists on a course-long assignment. Students attended library research sessions to develop their research paper topics and worked with the UCLA Special Collections to select images and create annotations about the materials examined in the archive, plotting each artifact on a map. The infrastructure for this project is built on Flâneur, a Jekyll theme for maps and texts, developed by Dawn Childress (UCLA, Digital Scholarship Librarian) & Niqui O’Neil (NCSU, Digital Technologies Development Librarian).

Dear Data: A friendship in data, drawing and postcards

Giorgia Lupi, an information designer, artist and entrepreneur, and Stefanie Posavec, a designer, embarked on a yearlong project during which they hand-drew data visualizations on postcards and mailed them to each other across the Atlantic. Their visualizations were based on data they collected about their lives, including such things as door patterns, laughter, and clocks. You can find images of the postcards on their project site, as well as a few videos.

Deconstructing Space Oddity one dimension at a time

Following the death of musician David Bowie (1947-2016), designer Valentina D’Efilippo and researcher Miriam Quick developed the “Deconstructing Space Oddity one dimension at a time” project, in which they visualized data from Bowie’s song “Space Oddity,” written in 1969. They selected this piece due to its significance in Bowie’s legacy: it was his first breakthrough single, his first British top 5 hit, and his first US Top 20. D’Efilippo and Quick deconstructed the song and visualized data according to narrative, recording, texture, rhythm, harmony, structure, melody, lyrics, trip, and emotions. For example, the data about the recording visualizes the master tracks of this song, including the lead vocals, backing vocals, and instrumentation (flute, strings, mellotron, stylophone, guitar, bass, and drums).

The Women of Data Viz

Alli Torban, a data visualization designer, created this visualization based on survey results collected by Elijah Meeks. In total, 142 women who identify as data visualization practitioners took the survey and responded to questions about themselves and their work. Torban’s visualization is an example of a free-form visualization of data that can be, as she puts it, “not only beautiful and engaging, but also something that helps you connect with your data.” Torban hosts a podcast called “Data Viz Today” and you can hear more about this visualization and other projects in Episode 28: How to Build a Connection With Your Data Through Original Visualization. You can also view Meeks’ original survey data and results in his GitHub repository, https://github.com/emeeks/data_visualization_survey.


The visualization display is curated by Allison Xu (Data and Visualization Librarian) and Anna Kijas (Digital Scholarship Librarian).

visualization of US immigration trends

February 2019 Data Visualization Display @ O’Neill Library

For the rest of January through February, the O’Neill Library digital display (by the POP collection) will showcase a selection of data visualizations that covers a variety of topics, including health, politics, immigration, and food. Each source is linked to the original site where you can further explore the associated data, visualization, or literature.

2018 Midterm Election

The beginning of a year is always a good time to look back at the past year. 2018 will be remembered in many ways, and one of them is the midterm election, which has been one of the most popular topics in the media for quite a while. The visualization by Bloomberg maps the 2018 election for the House, Senate and Governor races. The data provided is extensive and impeccably organized by the three races, which can be further broken down by state, races with women, open races, key races, committee chairs, and flipped seats. The visualization combines a lot of information into a single map: users can change the view from cartogram to map and switch between the elections and states easily.

World Coffee Production

Do you like coffee? If so, you will probably find this visualization interesting. Nitin Paighowal visualizes the world’s “Coffee Bean Belt,” which shows the areas with the most coffee production, and shows which nations produce the most coffee according to coffee variety. The visualization was originally created in Tableau and published on Tableau Public. Because of its popularity, it was selected as one of the best visualizations of 2018 in the Tableau Public Gallery.

Rhythm of food

Powerful data visualization can translate complex information into beautiful visual representations for storytelling. Rhythm of food, a visualization project created by Google News Lab in collaboration with Truth & Beauty, charts 12 years of food-related search trends based on Google search data. They collected weekly Google Trends data for hundreds of dishes and ingredients over 12 years, and plotted the results on a year clock to discover the interplay between seasons, years, holidays and the rhythm of food around the world.

Food trends across the country

When it comes to restaurants, every US city has its own favorite(s). Have you ever wondered what the most popular local cuisine is when you travel to a new city? A visualization by Google News Lab and design studio Polygraph will answer your question with a map. In this visualization, you will find out that Boston ranked No.2 for Pizza and No.4 for Burger out of all US cities.

Searching for health

Another visualization that we found was also created with Google search data. Google News Lab collaborated with Schema and Alberto Cairo to create “Searching for Health”, a visualization that tracks the top searches for common health issues in the United States, from cancer to diabetes, and compares them with the actual location of occurrences for those same health conditions. By using data from both the Google Trends API and the Centers for Disease Control and Prevention (CDC), the visualization allows the reader to find potential geographic relationships between those who search and the actual prevalence of health conditions across the country.

The Simulated Dendrochronology of U.S. Immigration 1790-2016

America is a nation of immigrants. Simulated Dendrochronology of US Immigration visualizes the history of immigration to the United States over the past two centuries. The visualization was created by Pedro Cruz, John Wihbey, Avni Ghael, and Felipe Shibuya from Northeastern University. Data was collected from IPUMS-USA, which contains Census data from 1790 to 2016. Pedro Cruz explains the method for creating this visualization in his paper, “Process of Simulating Tree Rings for Immigration in The U.S.” A video version of this visualization is also available.

This month’s data visualization blog post was written by Allison Xu (Data and Visualization Librarian). The visualization display was curated by Allison Xu and Anna Kijas (Digital Scholarship Librarian).

front cover of Thomas D. Craven diary

Encoding the Thomas D. Craven Diary

In the spring of 2018, several library and archives staff from Thomas P. O’Neill (Nancy Adams, Meg Critch, Sarah DeLorme, Anna Kijas) and John J. Burns Library (Kathleen Monahan, Annalisa Moretti) began a collaborative transcription and encoding project of a 1917 diary written by Boston College student, Thomas D. Craven. This diary was written during the spring semester of Craven’s senior year when he began serving in the Army Air Corps Medical Corps during World War I.

Thomas D. Craven, c.1917

Thomas D. Craven (The Sub turri: The Yearbook of Boston College, 1917).

Nancy and I created a Guide to Transcription to help guide the transcription process for the team members. This guide also provides basic TEI encoding directions, because we wanted to begin identifying elements and attributes as we transcribed each entry, with the hope that it would make the later review and encoding phase a bit easier and more streamlined. After the project team reviewed the guide and provided input, everyone began transcribing approximately 50 entries per person. Kathleen and Sarah began working on a prosopography to identify people, places, and organizations mentioned in the diary. Meg began developing the TEI header for the diary, which will include descriptions about the electronic edition and manuscript source. The transcription phase was completed in December 2018, and the next phase has begun: to review and make corrections, as well as do a closer encoding of the text.
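To give a rough sense of what identifying elements and attributes during transcription can look like, here is a minimal sketch in Python that builds one diary entry as TEI-style XML. It is an illustration only, not the markup our guide prescribes: the choice of `div`, `dateline`, `date`, `persName`, and `placeName` (all real TEI elements) is an assumption, and the bracketed text is a placeholder rather than Craven's words.

```python
import xml.etree.ElementTree as ET

# One diary entry as a TEI <div>. The element choices here are
# illustrative assumptions, not the project's actual encoding decisions.
entry = ET.Element("div", {"type": "entry"})

# Record the entry date in machine-readable form with @when.
dateline = ET.SubElement(entry, "dateline")
date = ET.SubElement(dateline, "date", {"when": "1917-01-01"})
date.text = "January 1, 1917"

# Bracketed placeholders stand in for the transcription itself.
p = ET.SubElement(entry, "p")
p.text = "[transcribed text] "
person = ET.SubElement(p, "persName")
person.text = "[name of a person mentioned]"
person.tail = " [more transcribed text] "
place = ET.SubElement(p, "placeName")
place.text = "[name of a place mentioned]"

# Serialize the entry to a string of XML.
tei_xml = ET.tostring(entry, encoding="unicode")
print(tei_xml)
```

Tagging names and places as they are transcribed is what lets a later pass link them to entries in a prosopography.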

Here is the first page from the diary dated January 1, 1917 followed by the first draft of the encoded transcription:

Page 1 of Thomas D. Craven Diary from January 1, 1917

Diary entry dated January 1, 1917


Encoded transcription of diary entry dated January 1, 1917

Our next task is to review the transcriptions and further encode the text according to the TEI. This will also require a discussion on the use of specific elements and attributes. The group agreed that we will use TAPAS to render and publish the TEI files from this project, although we may consider creating a stand alone project website where we can present the edition with additional content, images, or visualizations.

The work of this group aims to not only make this content more accessible and visible to a wider community, but to expand our own expertise and understanding of the TEI through project-based learning. The TEI files and guidelines will demonstrate how we chose to encode these texts and can be re-used for other projects or pedagogical purposes. In addition, encoding these materials will make them easier to discover and access online and will further promote the John J. Burns Library collections. Project-based learning can be used as a model for future initiatives at Boston College that aim to develop expertise and skills in areas of digital scholarship.

This project is currently under development, but you can view a sample of encoded text from this diary (created previously) and other special collections materials found in our TEI Learning Docs project hosted in TAPAS. It is part of our ongoing effort to learn the TEI, explore research and pedagogical applications of the TEI to primary source documents, and make the process and contents visible and accessible to a wider community of students, scholars, and archives/library professionals.


Source citation: Diary, Thomas D. Craven papers, BC.2004.121, John J. Burns Library, Boston College, http://hdl.handle.net/2345.2/BC2004_121_ref5.

Screenshot of network graph

Representing Musicians in the SCCIM as a Network

We have made some strides since the last post where I described our initial work to represent relationships between musicians in The Séamus Connolly Collection of Irish Music (SCCIM) as linked data and a network graph. Recently, Kelly and Meg completed their work in creating links between the artists and recordings in MusicBrainz and this has resulted in a rich set of metadata. Each artist and instrument is now connected to each relevant recording.

Screenshot of Connolly data in MusicBrainz

Fig. 1 Connolly data in MusicBrainz

This results in linked records with aggregated data that shows all of the recordings a specific musician is connected to and the instrument they performed on per recording.  

Screenshot of MusicBrainz data for Tina Lech

Fig. 2 MusicBrainz record for Tina Lech

Another goal of this project is to create a network graph of the musicians in the SCCIM. To visualize the relationships between each musician, I am using Gephi, an open source network visualization tool, to generate the network graphs and the Sigma.js library to render them.

A force-directed layout (ForceAtlas2) was applied to render this network graph. This layout is useful for smaller graphs, such as the SCCIM network, which has 158 nodes and 224 edges. Within the network graph, the musicians are represented as nodes (dots) and relationships are edges (lines). A musician is connected to other musicians by an edge if they performed together on one or multiple tunes. The network also shows single nodes, which represent a musician who composed and/or performed the tune as a soloist. All of the data was drawn directly from the metadata in the SCCIM.

Screenshot of dataset

Fig. 3 Snapshot of edge data
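The rule for building edges can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow used to prepare the Gephi data: the `build_network` function, the tune titles, and the musician names are all hypothetical placeholders.

```python
from itertools import combinations

def build_network(tunes):
    """Return (nodes, edges) for a co-performance network.

    `tunes` maps a tune title to the list of musicians credited on it.
    """
    nodes = set()
    edges = set()
    for musicians in tunes.values():
        nodes.update(musicians)
        # Every pair of musicians sharing a tune gets one undirected edge.
        # Sorting gives each pair a fixed order, so a pair that appears
        # on several tunes still yields a single edge.
        for a, b in combinations(sorted(set(musicians)), 2):
            edges.add((a, b))
    return nodes, edges

# Hypothetical records, for illustration only.
tunes = {
    "Tune A": ["Musician 1", "Musician 2", "Musician 3"],
    "Tune B": ["Musician 4"],  # a soloist: a node with no edges
    "Tune C": ["Musician 1", "Musician 2"],
}

nodes, edges = build_network(tunes)
connected = {musician for edge in edges for musician in edge}
isolated = nodes - connected
print(len(nodes), len(edges), sorted(isolated))
```

Musician 1 and Musician 2 share two tunes here but are joined by only one edge, and the soloist remains in the graph as a single node.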


Screenshot of group legend

Fig. 4 Group legend

A color is assigned to each node in order to represent groupings by role. Group 1 includes musicians who are only performers; Group 2 includes both performers/composers; and Group 3 includes only composers. Relationships are defined as a collaboration between musicians on a tune within the SCCIM collection. 

Screenshot of sidebar view

Fig. 5 Sidebar view

When a node is selected, a sidebar opens up on the right with data about the musician and relationships within this network. Their name and role are listed, as is the degree (number of edges), which tells you how many collaborations this specific musician has within the context of the SCCIM. The musicians they collaborated with appear as links under “Connections.” When one of these names is clicked, the graph adjusts to show their connections. If Alice Bérubé is selected, for example, you’ll see that she collaborated with four different musicians: three of these are performers (Jeannine Webb, Pete Sutherland, and Ken Perlman), and one is a composer/performer (Seamus Connolly). Bérubé composed the tune “Don’t Get Me Anything” and also played the fiddle with Webb (fiddle), Perlman (banjo), Sutherland (piano), and Connolly (fiddle).

Screenshot of Berube connections

Fig. 6 Alice Bérubé’s connections
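The degree and the list of connections shown in the sidebar can be derived from the edge list alone. The sketch below assumes a simple undirected graph and uses only the collaborations named above for Alice Bérubé; the `sidebar` function and the data layout are hypothetical and are not the Sigma.js code behind the interactive graph.

```python
from collections import defaultdict

# Edges for Alice Bérubé as described in this post.
edges = [
    ("Alice Bérubé", "Jeannine Webb"),
    ("Alice Bérubé", "Pete Sutherland"),
    ("Alice Bérubé", "Ken Perlman"),
    ("Alice Bérubé", "Seamus Connolly"),
]

# Roles of her collaborators as described in this post.
roles = {
    "Jeannine Webb": "performer",
    "Pete Sutherland": "performer",
    "Ken Perlman": "performer",
    "Seamus Connolly": "composer/performer",
}

def sidebar(musician, edges, roles):
    """Collect a musician's degree and connections, grouped by role."""
    neighbors = set()
    for a, b in edges:
        if a == musician:
            neighbors.add(b)
        elif b == musician:
            neighbors.add(a)
    by_role = defaultdict(list)
    for name in sorted(neighbors):
        by_role[roles.get(name, "unknown")].append(name)
    # With at most one edge per pair, degree equals the neighbor count.
    return {"degree": len(neighbors), "connections": dict(by_role)}

info = sidebar("Alice Bérubé", edges, roles)
print(info["degree"], info["connections"])
```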

We focused solely on collaborations because it was not possible to identify other types of relationships, such as who may have influenced whom, who someone studied with, or band membership; this data was not part of the original collection. In some cases, the stories written by Connolly shed some light on these additional relationships. For example, it might be mentioned that one of the musicians was a student of a certain individual or that they were influenced by a specific musician. This information might be useful to scholars and musicians interested in learning more about the way that musical traditions are transmitted, taught, and shared in the traditional Irish music community.

Visualizing the musicians who contributed compositions, performances, or both as a network graph provides a bird’s-eye view of the distribution of roles. Using the “Group Selector,” you can easily see that there are 91 performers, 43 musicians who are both composers/performers, and 24 composers in this network. You can view the makeup of each group by selecting one specific group. These views depict the level of collaboration by role.

A longer term goal of this project is to not only show a general list of connections for each node, but to provide further context through the use of RDF and LOD so that users will be able to see how many times a musician interacted with another musician, on which tune(s), and which instrument they played, or whether they composed the tune that was performed. This data, which will live in RDF pages, such as this example, would be dynamically generated and accessed when someone selects a musician (node). This project is ongoing and additional updates will be shared along the way. You can view the version of the interactive network graph as described in this post online.

Desegregating Boston Schools Poster

Visualizing racial disparity in Boston, c. 1970

During the spring and summer of this year, I collaborated on an exhibit, Desegregating Boston Schools: Crisis and Community Activism, 1963-1977, with Sarah Melton and Dr. Eric Weiskott. The main exhibit is at the John J. Burns Library, and a smaller complementary exhibit is on view in the Reading Room, Level 3, Thomas P. O’Neill, Jr. Library. Curating this exhibit required doing research in special collections at John J. Burns Library, specifically in the Louise Bonar and Carol Wolfe collection, Citywide Coordinating Council Records, and the Robert F. Drinan, SJ Congressional Papers.

One aspect of this exhibit was to create visualizations and infographics using racial demographic data for the City of Boston, racial distribution of students within the Boston Public Schools, and outcomes of the Boston School Committee election of 1973. The data for these visualizations was drawn from the materials in the Bonar/Wolfe collection, Citywide Coordinating Council Records, 1970 Census, and Analyze Boston.

To complement the materials in the exhibit in the John J. Burns Library, I created these three density maps. The exhibit materials include a map depicting the total black population in the City of Boston (1970) juxtaposed with the wards won by the only black candidate, Patricia Bonner-Lyons, who ran for the Boston School Committee in 1973. The maps were created with tract-level 1970 Census data and depict the neighborhoods within the City of Boston as established by the Bureau of the Census. The shading (light to dark) of each neighborhood correlates with the number (low to high) of people according to race, as documented in the 1970 Census. From these visualizations it is easy to see that neighborhoods including South Boston, West Roxbury, Roslindale, and Jamaica Plain were predominantly white, while the neighborhoods of Roxbury and Dorchester were predominantly black.

Density map of population by racial demographics in City of Boston, ca. 1970.

Density map depicting population according to racial demographics (white, black, and Hispanic) in the City of Boston, ca. 1970. (Click on the image to open the interactive map in a separate tab)
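The idea behind the shading can be illustrated with a short Python sketch. It assumes a simple linear scale from the lowest to the highest count, which may differ from how Tableau Public computes its color ramp; the `shade` function, the tract names, and the counts are hypothetical and are not 1970 Census figures.

```python
def shade(count, low, high):
    """Map a count linearly onto 0.0 (lightest) to 1.0 (darkest)."""
    if high == low:
        return 0.0  # all areas have the same count, so use one shade
    return (count - low) / (high - low)

# Hypothetical population counts per tract, for illustration only.
counts = {"Tract A": 120, "Tract B": 2400, "Tract C": 5100}

low, high = min(counts.values()), max(counts.values())
shades = {tract: round(shade(n, low, high), 2) for tract, n in counts.items()}
print(shades)
```

The tract with the fewest people gets the lightest shade, the tract with the most gets the darkest, and every other tract falls in between.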

There are many different GIS platforms and tools available, but for this project I used Tableau Public, freely available software that enables you to create interactive data visualizations (not just maps!). The neighborhoods in these maps are created with a shapefile that I generated from the Neighborhood Change Database 1970-2010. Tableau Public provides the option to connect a spatial file, which will then allow you to render a spatial visualization and identify the specific dimensions (for this map: population by race) that will be shown in an info box upon clicking or hovering over the map.

Screen-shot showing the pop-up box.

Dimensions are visible in the pop-up box.

The full workbook for this visualization can be downloaded from the “City of Boston 1970 (test)” page on my Tableau Public profile page.

Screenshot of XML

Collaborative TEI

This semester we will be opening up our Collaborative TEI group to the wider Boston College and digital humanities community. This group is interested in developing an understanding of transcription and encoding standards using the TEI XML-based schema. Those interested in using other schemas, such as the MEI, are also welcome to participate. Currently, we are uploading all of our learning docs and texts into the TAPAS Project, where you can immediately render and view your files.

We will have a monthly meet-up during which we encourage you to bring a text or document that you are interested in transcribing and encoding. We will spend the 1.5 hours working alongside each other in an environment that embraces discussion and questions. The purpose of this group is not to provide instruction at each session—although there may be opportunities for this—but rather to foster a learning community around the application of XML encoding standards, specifically TEI.  

If you are interested in joining us, please register in advance. Also, we ask that you bring your own laptop and install a text editor that recognizes and validates XML (such as Oxygen XML or Atom). We will meet in the Digital Studio open conference area. If you would like to schedule a consultation outside of this group, please contact Anna Kijas, Digital Scholarship Librarian, anna.kijas at bc.edu.

City of Boston map

ARL Digital Scholarship Institute: Part 2

It’s hard to believe that two months have passed since the inaugural Association of Research Libraries’ (ARL) Digital Scholarship Institute hosted at Boston College. In the previous post, you can read Sarah Melton’s overview of the goals of the Institute, and takeaways from the keynote by Jennifer Vinopal, Associate Director for Information Technology at The Ohio State University Libraries, and an opening workshop with Alex Gil, Digital Scholarship Coordinator at Columbia University Libraries. The ARL Digital Scholarship Institute was developed by a group of individuals from five institutions brought together by ARL in October 2016 to support one of the primary goals of the ARL Academy: “to foster the development of an agile, diverse and highly-motivated workforce as well as the inspiring leadership necessary to meet present and future challenges.”


digital studio space

Announcing Digital Studio’s new software request policy

The Digital Studio provides access to a wide variety of software applications that support tasks, such as data analysis, design, multimedia, and visualization. In order to better anticipate software needs in the Digital Studio we are launching a new software request policy and form that outlines criteria and timeframes for submitting requests.

We encourage BC faculty, instructors, and staff to refer to this policy and submit the form if there is software they would like to use as part of their teaching, course support, research, or projects. This software would be installed on the public machines in the Digital Studio space. The policy and form can be viewed here and are also available through the “Library Resources” –> “Library Links” section within Canvas.

For questions regarding the policy or form, please contact Anna Kijas (anna.kijas at bc.edu). You can also provide feedback related to the Digital Studio via this form.

music as data

Small Data Rescue

Libraries+ Network recently published a blog post, “Engaging in Small Data Rescue,” in which I describe data rescue efforts at Boston College during Endangered Data Week (April 17-21, 2017). Our participation came about after conversations with colleagues in the Music Library Association (MLA), the Society of American Archivists (SAA), and at Penn Libraries and DataRefuge. In this blog post, I examine several initiatives for data rescue or data archiving of federal agency data, our aims and strategies in rescuing IMLS and NEH data, and the actual workflow of pulling data and creating records in a shared CKAN repository.