Citations on the web

While programming Fidus Writer, we needed something to format bibliographic citations in the text and in the bibliography. At first we programmed something simple ourselves, but it became a major headache to make it handle more specific cases and more advanced options. Ian Mulvany of eLife finally pointed us to the Citation Style Language (CSL) project, a little-known but widely used project for adding bibliographic capabilities to web pages. Fidus Writer has joined the long list of projects that use CSL. Frank Bennett and Rintze Zelle of the CSL project have been kind enough to explain a bit about their work:

You both are contributors to the Citation Style Language (CSL) project. What is CSL, and what is it used for?

Frank Bennett of the Citation Style Language project

Scientific papers are full of references, and for good reason. They are important for attributing prior ideas and results, crucial in building scientific arguments, and they lay out paths for the reader through the scientific literature. The same is true for citations in law, and many other fields. Writing references by hand is an arduous task, aggravated by the wide variety of journal-specific styles (the burden of referencing might be a reason we call these fields “disciplines”). The Citation Style Language is an XML-based language that allows the computer to automatically generate references for you. The computer only needs the metadata of the items you want to reference (like the details of a journal article), a CSL citation style, and a CSL processor. This processor is a piece of software that interprets the CSL style and produces the formatted citations. CSL was created by Bruce D’Arcus, and further shaped through an early collaboration with developer Simon Kornblith and others in the core Zotero team. Zotero itself was released to the public in 2006, as the first program to use CSL. Over the years, CSL has become more and more popular. We currently offer over 6,000 CSL citation styles, there are multiple open source CSL processors, and CSL is used by over a dozen desktop and online tools, including the popular reference managers Zotero, Mendeley, and Papers. And Fidus Writer, of course.
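To make the three ingredients above concrete (item metadata, a style, and a processor), here is a minimal sketch in Python. It is a toy, not CSL itself: real CSL styles are XML files interpreted by a processor such as citeproc-js, whereas the `STYLES` templates and the `render` function below are made up for illustration. The item is hypothetical data laid out loosely like CSL-JSON.

```python
# Toy illustration only: real CSL styles are XML files interpreted by a
# CSL processor. Nothing here is the actual CSL or citeproc API.

# 1. Item metadata (hypothetical data, loosely following the CSL-JSON layout).
item = {
    "type": "article-journal",
    "author": [{"family": "Doe", "given": "Jane"}],
    "issued": {"date-parts": [[2010]]},
    "title": "An example article",
    "container-title": "Journal of Examples",
    "volume": "12",
    "page": "34-56",
}

# 2. Two made-up "styles": each has a template for the in-text citation
#    and one for the bibliography entry.
STYLES = {
    "author-date": {
        "citation": "({family}, {year})",
        "bibliography": "{family}, {initial}. ({year}). {title}. "
                        "{journal}, {volume}, {page}.",
    },
    "numeric": {
        "citation": "[{number}]",
        "bibliography": "{number}. {initial}. {family}, {title}, "
                        "{journal} {volume} ({year}) {page}.",
    },
}


# 3. A toy "processor": combines one item with one style.
def render(item, style_name, part, number=1):
    """Format `item` as a 'citation' or 'bibliography' entry in a style."""
    first_author = item["author"][0]
    fields = {
        "family": first_author["family"],
        "initial": first_author["given"][0],
        "year": item["issued"]["date-parts"][0][0],
        "title": item["title"],
        "journal": item["container-title"],
        "volume": item["volume"],
        "page": item["page"],
        "number": number,
    }
    # Fields a template does not mention are simply ignored by str.format.
    return STYLES[style_name][part].format(**fields)


print(render(item, "author-date", "citation"))   # (Doe, 2010)
print(render(item, "numeric", "bibliography"))
```

The point of the separation is that the same item can be re-rendered in any style without touching the metadata, which is what lets a reference manager switch journal styles with one setting.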

What are your roles in the CSL project, and how did you become involved?

Rintze Zelle of the Citation Style Language project

Rintze: I discovered Zotero in 2007 during my PhD studies, but ran into a few limitations of CSL as I was writing my manuscripts. I joined the CSL project with the hope to address them, and over time became more and more involved. I wrote the CSL specification, started maintaining the CSL schema and managing new CSL releases. While CSL has had many contributors, I think it’s fair to say that Frank and I were the main drivers for the CSL 1.0 and 1.0.1 releases in 2010 and 2012, respectively. I currently spend most of my CSL time maintaining styles in the CSL styles repository, together with Sebastian Karcher.

Frank: I teach in a law faculty in Japan, and I stumbled across Zotero in 2008 while looking for tools that might be useful to our students. While Zotero and CSL were a brilliantly good fit for us in most respects, I was keen to promote a few extensions and improvements. Persistent tinkering, and discussions with Rintze and others, resulted in the citeproc-js processor which now runs in Zotero, Fidus Writer, and other projects. I follow CSL development channels closely to pick up bug reports against the processor, and I pester the group from time to time for advice on CSL-m, a fork of official CSL used in Multilingual Zotero (MLZ) that aims to provide support for legal and multilingual research (MLZ is itself a fork of Zotero). I’m optimistic that the extended features will be merged into official CSL when the time is right, but meanwhile the two projects are happy neighbors.

Since many people now read their papers digitally, do we still need formatted references? Can’t we just use links instead?

Rintze: I think that references still have a role in the digital age. It’s true that readers don’t rely on formatted references as much anymore for looking up papers, as most literature is now available online, and publishers often add links to bibliographies to make referenced papers easy to find. But formatted references are still an important part of the text. Citations can be unobtrusive and informative markers: a “traditional” citation like “(D’Arcus, 2003)” in the text provides the reader with much more context than a DOI link (“dx.doi.org/10.1046/j.1467-8330.2003.00347.x”). Bibliographies give a quick overview of referenced materials. And formatted references work very well in print (I personally still read most of my papers on paper). There is clearly plenty of room to innovate, though. Fortunately, publishers like Elsevier and Springer are starting to standardize the citation styles used by their journals, and are also becoming more flexible in accepting manuscripts, allowing the use of any citation style. Others, such as the eLife journal, are experimenting with new ways to display references in their online articles.

Frank: My main practical interest is in support for legal writing, and in the law we don’t yet have a system of identifiers. There is a reason, although not a particularly good one: to the present, the two (now three) leading publishers of primary legal materials that dominate the US market don’t have a strong incentive to participate in lightweight identifier schemes that could undermine their position. The largest, West Publishing, famously sued (repeatedly) to assert copyright in the page numbers of its publications. As text-processing and networking costs decline, the legal publishing sector is starting to loosen up, but it will be a long time before working lawyers and judges feel inclined to go all-electronic and abandon the familiar citation on the printed page.

How many different citation styles are really needed? Couldn’t we all just use one single style?

Rintze: There are now CSL styles for over 6,000 journals. Many of those use identical citation formats, but we still have over 800 unique styles. Some of that diversity is hard to eliminate, since styles have to cover different languages and different disciplines. On the other hand, many publishers still have a good way to go in reducing the number of citation styles in use.

Frank: I kind of like that there is some variety in citation styles. Apart from giving publishers and authors expressive choices, it helps to remind us all that citations are distinct from the full metadata of the sources they identify. I’m with Rintze that the extent of trivial variation we have today is hard to justify, though. Now that the burden of current practices is becoming visible in a single public archive, we can hope that publishers and the communities they serve will start reeling in some of the excess.

Mendeley and Papers have recently been acquired by Elsevier and Springer, respectively. Since both of these tools use CSL, what is your current relationship with these publishers?

Rintze: It might surprise you, but I’ve had little contact with publishers in my time with the CSL project. From our side, it’s always hard to reach the right people at these large publishing companies. And since the CSL project has so far been run without funding, we don’t get many opportunities to visit conferences where we can talk about CSL. From their perspective, I guess we might appear a bit unorganized: CSL doesn’t have an office, employees, telephone number or even a central email address. People’s roles within the project are also informally established and can easily change over time. Still, I think it’s clear that we have a shared interest in improving research and publishing workflows, so I welcome any dialogue between the two worlds. Fortunately, since their purchase, Mendeley and Papers have, in a way, become our communication channels with Elsevier and Springer. Their developers now provide us with metadata of most Elsevier and Springer journals, from which we were able to generate thousands of new CSL styles.

For LaTeX, a system known as BibLaTeX has emerged in recent years to style citations and bibliographies. CSL and BibLaTeX are very different, but have a similar purpose. Do you guys not communicate with one another?

Frank: Designing a citation formatting language demands a huge investment of time in the study and abstract expression of style requirements. The replication of that effort in the TeX and CSL communities is one of the growing pains that comes with exploration; and it does tend to wed us to the solutions in which we have invested our time. That said, it would be fair to say that CSL is the more general of the two approaches. CSL processors currently drive academic repositories, blogs, and word processor plugins, and can even be used to cast BibTeX cites for onward processing by LaTeX systems. A CSL-based tool could be slotted into the TeX processing chain more directly: but if that does happen one day, LaTeX will gain a bibliographic formatting option, without loss to existing systems like BibLaTeX.

What does the future hold for CSL?

Rintze: CSL development has slowed down a bit recently. This is partially because CSL has become more mature, but also because many CSL developers are busy with other things (like their day jobs). As CSL has become more popular, I also spend more time helping users who wish to contribute styles to the repository. So far the CSL project has been completely unfunded and mostly community-driven, so it can be difficult at times to keep momentum. I hope we can keep the project relevant and moving forward, while reducing the current reliance on just a few volunteers to maintain the central CSL infrastructure (such as the style repository). Hopefully publishers and funding agencies will pick up the gauntlet and help us find ways to operate the project in a more sustainable manner, while maximizing the benefits that CSL can bring to the publishing workflows of the research community. If there are any interested parties out there, a few well-placed grants could certainly help speed things along. That said, there are already plenty of exciting developments. The style repository shows steady growth, new tools that use CSL pop up left and right (like Fidus Writer!), and CSL seems to have become the standard choice for new citation software. Several open source CSL processors are being actively maintained, and Frank has been busy working on his fork of Zotero, called Multilingual Zotero, that has improved support for multilingual and legal citations. His book that describes Multilingual Zotero was published just a few months ago!

Frank: What Rintze said. 🙂

 

Thanks for the interview! With CSL you have made it quite a bit easier to program editors like Fidus Writer and other bibliography-centric web projects. I hope you will continue your work for a long time to come!

3 thoughts on “Citations on the web”

  1. Hello, I use Zotero as my main bibliography manager. Can I use it together with Fidus Writer as the source of my bibliography?

    • Zotero is a priority for us, which is why we built the BibTeX import filter specifically with it in mind. We would also like to create a direct connection to the Zotero database. This depends on funding more than anything.
