BibTeX Done Right
Online bibliographies and automation for better citations
Unlike in disciplines such as the humanities, computer science students are
normally not especially trained in the art of correct citation. There are
several reasons for this. One is that our work usually does not require heavy
text work with direct quotations and precise references to text passages.
Furthermore, LaTeX lets us use the powerful BibTeX tool, which creates
beautiful bibliographies from structured text files, the .bib bibliography
files. Apart from filling your .bib files with bibliography entries, all you
need to do is place \cite{paper-key} commands at suitable positions in the
LaTeX document.
For example, a .bib file could contain the following entry:
@inproceedings{steinhoefel.haehnle-22,
author = {Dominic Steinh{\"{o}}fel and
Reiner H{\"{a}}hnle},
editor = {Maurice H. ter Beek and
Annabelle McIver and
Jos{\'{e}} N. Oliveira},
title = {Abstract Execution},
booktitle = {Formal Methods - The Next 30 Years - Third World Congress, {FM} 2019,
Porto, Portugal, October 7-11, 2019, Proceedings},
series = {Lecture Notes in Computer Science},
volume = {11800},
pages = {319--336},
publisher = {Springer},
year = {2019},
url = {https://doi.org/10.1007/978-3-030-30942-8\_20},
doi = {10.1007/978-3-030-30942-8\_20},
}
You can use this by adding \cite{steinhoefel.haehnle-22} to your .tex file, as
in:
Abstract Execution~\cite{steinhoefel.haehnle-22} is a framework for proving the
correctness of statement-level program transformations.
Unfortunately, in my personal experience, many computer science students still
produce bibliographies of poor quality due to, e.g., missing fields, wrong
entry types, and wrong field contents in .bib files. For example, it would be
wrong to omit the title field in the above BibTeX snippet, or to use an
@article type instead of @inproceedings.
It is not strictly wrong to omit the digital object identifier (DOI) or the
URL; however, it is good practice to include this information to make it
easier for readers and tools to locate and correctly identify a paper. Yet the
DOI is not even included in the .bib file supplied by the publisher of the
paper referenced above.
There’s plenty of information on the internet on how to use BibTeX in your LaTeX documents (e.g., on Overleaf, Wikipedia, and university websites). I warmly recommend reading these resources. However, I think that such recommendations are insufficient: students are already busy enough writing their theses, conducting experiments, and so on, and you don’t get credit points for educating yourself to use BibTeX correctly.
This article does not aim to be yet another guide on BibTeX requirements or how to use BibTeX in your paper. Instead, I want to show how online resources and tools can help to get your bibliography correct automatically. This is what you’d expect from a computer scientist, right? 😊
I will expound the following three suggestions, plus an optional fourth:
- Use online bibliographies (I recommend DBLP) to obtain the correct BibTeX entry for a published article.
- Follow my recommendations for referencing online resources.
- Check your file for problems using BibLatex-Check.
- (Optional) Format your BibTeX file using BibTool.
Online Bibliographies
OK, I lied: I won’t talk about online bibliographies in general here, but only about the single one I use regularly: the DBLP database.
Whenever you want to cite an article, first look for it on DBLP.
In my experience, the BibTeX snippets from DBLP are the most correct and complete ones you can get for computer science publications on the Internet. Using DBLP to obtain BibTeX code is easy:
- Visit https://dblp.org/.
- Enter parts of the paper’s title and authors' names until you have sufficiently narrowed the search results.
- Hover on the “export record” icon and choose “BibTeX.”
- Click on “download as a .bib file” or copy-paste the displayed BibTeX code.
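If you already know the DBLP key of a record, the manual export can also be scripted. The following is a minimal sketch, not an official DBLP client: it assumes that the .bib export of a record lives at the address DBLP itself writes into the biburl field of its exports (for example https://dblp.org/rec/conf/fm/SteinhofelH19.bib). The function names dblp_bib_url and fetch_bibtex are made up for this illustration.

```python
import urllib.request

# Assumption: record URLs follow the pattern found in the "biburl" field
# of entries exported by DBLP.
DBLP_RECORD_BASE = "https://dblp.org/rec/"


def dblp_bib_url(dblp_key):
    """Map a DBLP key such as 'DBLP:conf/fm/SteinhofelH19' to its .bib URL."""
    prefix = "DBLP:"
    if dblp_key.startswith(prefix):
        dblp_key = dblp_key[len(prefix):]
    return DBLP_RECORD_BASE + dblp_key + ".bib"


def fetch_bibtex(dblp_key, timeout=10.0):
    """Download the BibTeX record as text (requires network access)."""
    with urllib.request.urlopen(dblp_bib_url(dblp_key), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_bibtex("DBLP:conf/fm/SteinhofelH19") should then return the same text that the “export record” menu displays, provided the URL pattern assumed above still holds.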
The screen cast below demonstrates how to obtain the reference information for the famous “Dragon Book” by Aho et al.
If you do not find a publication on DBLP, you might look for it on other pages,
such as Google Scholar, CiteSeerX, or Semantic Scholar. However, the results
from these websites are usually much worse than those from DBLP. For example,
the citation exported for the Dragon Book by Semantic Scholar has the
@inproceedings type instead of @book, which is plain wrong. You can also use
online tools that support the manual creation of BibTeX files by displaying the
possible fields and flagging the required ones for the available BibTeX entry
types. If you want to cite a web page, please read on!
Citing Online Resources
Web pages are not “published” in the usual sense. They are not peer-reviewed, usually come without a permanent version identifier (i.e., they can be updated), and there is no BibTeX entry specifically for web pages (however, there is a suitable BibLaTeX entry). If possible, you should not cite websites at all. If you want to refer to a software project’s repository, for example, it is better to use a footnote than a citation. However, there are valid reasons for citing web pages. Consider the following sentence from one of my papers:
“When Companies Become Prisoners of Legacy Systems:” This title of a Wall Street Journal article [41] well describes the predominant perception of legacy software systems in academia.
It’s perfectly legitimate to use a citation for this online article of the Wall Street Journal. Here’s how to do this using BibTeX.
Online Resources in BibTeX
Since BibTeX does not provide us with an entry type for online resources, we
have to resort to the @misc type. This entry type does not have any required
fields, and offers the optional fields author, title, howpublished, month,
year, and note. I strongly recommend using at least the following fields for
web pages:
- title: The title of the web page. Every web page should have a title, so it should always be possible to extract this information.
- howpublished: The URL of the online resource.
- note: Here, you enter the date when you last accessed this resource. This permits at least some “versioning” information for updatable resources.
Additionally, you should provide the author, month, and year fields if this
information is available, which it is for the Wall Street Journal article. The
final entry for our article would be:
@Misc{ schneider-13,
Author = {Schneider, Adam},
Title = {{When Companies Become Prisoners of Legacy Systems}},
Month = oct,
Year = 2013,
HowPublished = {\url{https://deloitte.wsj.com/articles/when-companies-become-prisoners-of-legacy-systems-1380600092}},
Note = {Accessed: 2022-11-03}
}
Note that BibTeX does not care about the case of entry types and field names:
it makes no difference whether you write @Misc or @misc. The above snippet was
processed by BibTool, which normalized the entry type and field names to “upper
camel case.” The astute reader will furthermore have spotted that I surrounded
the title with two pairs of curly braces. This preserves the title case in the
LaTeX output. I talk about this in the section on formatting.
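If you cite web pages regularly, the @misc recommendations above are easy to turn into a small helper. The following Python sketch is only an illustration: the function misc_entry and its parameters are hypothetical, and it assumes that your document loads a package providing \url (such as url or hyperref).

```python
def misc_entry(key, title, url, accessed, author=None, month=None, year=None):
    """Render an @misc entry for a web page as a string.

    title, url, and accessed (the date of last access) are always emitted;
    author, month, and year only if they are known.
    """
    fields = []
    if author:
        fields.append(("author", "{" + author + "}"))
    # Double braces keep the capitalization of the title.
    fields.append(("title", "{{" + title + "}}"))
    if month:
        # Three-letter month macro such as oct, written without braces.
        fields.append(("month", month))
    if year:
        fields.append(("year", "{" + str(year) + "}"))
    fields.append(("howpublished", "{\\url{" + url + "}}"))
    fields.append(("note", "{Accessed: " + accessed + "}"))
    body = ",\n".join("  " + name + " = " + value for name, value in fields)
    return "@misc{" + key + ",\n" + body + "\n}"
```

For instance, misc_entry("schneider-13", "When Companies Become Prisoners of Legacy Systems", url, "2022-11-03", author="Schneider, Adam", month="oct", year=2013), with url set to the address of the article, produces an entry with the same fields as the one shown above, in lower-case spelling.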
Online Resources in BibLaTeX
If you use BibLaTeX, you can (and should) use the @online entry type:
@Online{ schneider-13,
Author = {Schneider, Adam},
Title = {{When Companies Become Prisoners of Legacy Systems}},
Month = oct,
Year = 2013,
URL = {https://deloitte.wsj.com/articles/when-companies-become-prisoners-of-legacy-systems-1380600092},
URLDate = {2022-11-03}
}
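The difference between the two variants boils down to two field translations: howpublished with a \url{...} wrapper becomes URL, and the “Accessed:” note becomes URLDate. As a rough sketch, assuming the fields of an entry are already available as a Python dictionary with lower-case names (the function misc_to_online is hypothetical), the translation could look like this:

```python
import re


def misc_to_online(fields):
    """Translate @misc web-page fields into their @online counterparts.

    Fields that do not follow the conventions recommended above are
    left untouched. The input dictionary is not modified.
    """
    online = dict(fields)
    # howpublished = \url{...}  ->  url = ...
    url_match = re.fullmatch(r"\\url\{(.*)\}", fields.get("howpublished", "").strip())
    if url_match:
        online["url"] = url_match.group(1)
        del online["howpublished"]
    # note = Accessed: YYYY-MM-DD  ->  urldate = YYYY-MM-DD
    date_match = re.fullmatch(r"Accessed:\s*(\d{4}-\d{2}-\d{2})", fields.get("note", "").strip())
    if date_match:
        online["urldate"] = date_match.group(1)
        del online["note"]
    return online
```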
Checking BibTeX Files for Problems
If you obtained all your BibTeX entries from DBLP and came up with perfect entries for your online references, you can skip this point. However, we are also using linters, integration tests etc. for our code, even though we carefully tried to avoid any formatting or functional errors. I think that we should apply the same care to our bibliographies!
To check your BibTeX files for errors automatically, I recommend the BibLatex-Check tool. To use it, follow these steps:
- Clone the BibLatex-Check repository:
git clone https://github.com/rindPHI/BibLatex-Check
The URL for the tool, which was not created by myself, points to my fork of BibLatex-Check where I fixed the option to show the analysis results nicely rendered in your default web browser.
- Call BibLatex-Check:
python3 BibLatex-Check/biblatex_check.py -b path/to/my.bib
This requires Python (version 3) on your system. If you don’t have Python, get it! It’s a great prototyping language 😃
You can also ask BibLatex-Check to show you a rendering of the results in your web browser:
python3 BibLatex-Check/biblatex_check.py -b path/to/my.bib -o result.html -v
- Fix any errors 😎
Here’s a screen cast of me exercising steps 1 and 2:
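To give an idea of what such a linter looks for, here is a deliberately small Python sketch that reports missing required fields. It is not how BibLatex-Check is implemented. It only knows the required fields of three standard BibTeX entry types, assumes that field values are written in braces (not in quotation marks), and all names in it (REQUIRED, entries, missing_fields) are made up for this illustration.

```python
import re

# Required fields of three standard BibTeX entry types.
# A tuple means "at least one of these fields".
REQUIRED = {
    "article": ["author", "title", "journal", "year"],
    "inproceedings": ["author", "title", "booktitle", "year"],
    "book": [("author", "editor"), "title", "publisher", "year"],
}

ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
FIELD_NAME = re.compile(r"(\w[\w-]*)\s*=")


def entries(bibtex):
    """Yield (entry_type, key, field_names) for each entry in the text."""
    for match in ENTRY_START.finditer(bibtex):
        depth, pos, names = 1, match.end(), set()
        # Walk through the entry body; field names only occur at brace depth 1.
        while pos < len(bibtex) and depth > 0:
            char = bibtex[pos]
            if char == "{":
                depth += 1
            elif char == "}":
                depth -= 1
            elif depth == 1:
                field = FIELD_NAME.match(bibtex, pos)
                if field:
                    names.add(field.group(1).lower())
                    pos = field.end()
                    continue
            pos += 1
        yield match.group(1).lower(), match.group(2), names


def missing_fields(bibtex):
    """Return {key: [missing required fields]} for all incomplete entries."""
    problems = {}
    for entry_type, key, names in entries(bibtex):
        missing = []
        for requirement in REQUIRED.get(entry_type, []):
            options = requirement if isinstance(requirement, tuple) else (requirement,)
            if not any(option in names for option in options):
                missing.append("/".join(options))
        if missing:
            problems[key] = missing
    return problems
```

Applied to an @inproceedings entry without a title field, missing_fields returns a dictionary that maps the key of that entry to ["title"]. A real linter such as BibLatex-Check covers many more entry types and problems.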
Formatting Your BibTeX
I marked this part as optional because (1) you don’t need to refactor your .bib
files for correct bibliographies, since the generated entries in the rendered
file will look just the same; and (2) preserving “Title Case” in bibliography
entries is more controversial than I thought. Still, I recommend following the
suggestions in this section! You will be rewarded with nice, crisp, and
uniform-looking BibTeX code and titles in Title Case. The world as it should
be.
Automatically Refactoring BibTeX
Maybe you’ve heard the term refactoring before: Changing your code such that
it still provides the same functionality, but is easier for humans to
understand or maintain. That’s the first part of my two formatting suggestions:
Use BibTool to automatically clean up your BibTeX and bring it into a uniform
shape. BibTool is available on GitHub, but is also bundled with many LaTeX
distributions, so you may not need to install it manually. Simply type
bibtool -h into a terminal window and see if you get an error or a help text.
Assuming you have BibTool installed, you can use my reformatBibliography.sh
script to format your bibliography in-place. I recommend using a version
control system like Git and committing your changes to the .bib file
beforehand, since the script overwrites the existing file.
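If your .bib file is not under version control, a plain backup copy offers a similar safety net. The following Python sketch is one way to do this; it assumes that reformatBibliography.sh is on your PATH and takes the .bib file as its only argument, as described in this section. The function names backup and reformat_with_backup are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def backup(bib_path):
    """Copy e.g. my.bib to my.bib.bak and return the path of the copy."""
    source = Path(bib_path)
    target = source.with_name(source.name + ".bak")
    shutil.copy2(source, target)
    return target


def reformat_with_backup(bib_path, script="reformatBibliography.sh"):
    """Back up the .bib file, then run the in-place reformatting script."""
    if shutil.which(script) is None:
        raise FileNotFoundError(script + " not found on PATH")
    saved = backup(bib_path)
    # check=True raises an exception if the script exits with an error.
    subprocess.run([script, str(bib_path)], check=True)
    return saved
```

If the result is not what you expected, copy the .bak file back over the original.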
To install the script on your (Linux/UNIX/macOS) system, follow these steps:
- Change into a directory in your $PATH, e.g., a local ~/bin directory:
cd ~/bin
- Download the sources:
wget https://gist.github.com/rindPHI/bff52e25d70c8acd0b81eab84fad8fca/archive/0c96e4328309fb85ef3ef0c84dd813aef2583c73.zip -O reformatBibliography.zip
- Unzip and delete the archive:
unzip reformatBibliography.zip
rm reformatBibliography.zip
- Make the script executable:
chmod +x reformatBibliography.sh
Using the script is simple: Call reformatBibliography.sh my.bib to reformat a
file my.bib. For example, assume my.bib consists of the DBLP entry for my
Abstract Execution paper, which already served us as an example above:
@inproceedings{DBLP:conf/fm/SteinhofelH19,
author = {Dominic Steinh{\"{o}}fel and
Reiner H{\"{a}}hnle},
editor = {Maurice H. ter Beek and
Annabelle McIver and
Jos{\'{e}} N. Oliveira},
title = {Abstract Execution},
booktitle = {Formal Methods - The Next 30 Years - Third World Congress, {FM} 2019,
Porto, Portugal, October 7-11, 2019, Proceedings},
series = {Lecture Notes in Computer Science},
volume = {11800},
pages = {319--336},
publisher = {Springer},
year = {2019},
url = {https://doi.org/10.1007/978-3-030-30942-8\_20},
doi = {10.1007/978-3-030-30942-8\_20},
timestamp = {Sat, 12 Oct 2019 12:51:42 +0200},
biburl = {https://dblp.org/rec/conf/fm/SteinhofelH19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Running reformatBibliography.sh my.bib changes the content of my.bib to the
following entry:
@InProceedings{ steinhoefel.haehnle-19,
Author = {Dominic Steinh{\"{o}}fel and Reiner H{\"{a}}hnle},
Editor = {Maurice H. ter Beek and Annabelle McIver and Jos{\'{e}} N.
Oliveira},
Title = {Abstract Execution},
BookTitle = {Formal Methods - The Next 30 Years - Third World Congress,
{FM} 2019, Porto, Portugal, October 7-11, 2019,
Proceedings},
Series = {Lecture Notes in Computer Science},
Volume = {11800},
Pages = {319--336},
Publisher = {Springer},
Year = {2019},
URL = {https://doi.org/10.1007/978-3-030-30942-8\_20},
DOI = {10.1007/978-3-030-30942-8\_20},
timestamp = {Sat, 12 Oct 2019 12:51:42 +0200},
biburl = {https://dblp.org/rec/conf/fm/SteinhofelH19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Note that BibTool changed the key of the entry. I personally prefer uniform keys
and don’t want to figure them out myself. If you want to deactivate this
behavior, delete -f "{%-2n(author) # %-2n(editor)}-%2d(year)" from
reformatBibliography.sh.
Be aware of key collisions: if two entries in my.bib are assigned the same key,
one of the entries will receive the key steinhoefel.haehnle-19*1. I am quite
sure that this issue can be prevented by adding new entries to the end of the
.bib file. It won’t hurt, however, to check whether the citations in your
rendered document are still referring to the correct publications if you receive
any ...*1 entries.
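To see which entries are at risk of such a collision, you can approximate the key scheme yourself. The sketch below does not implement BibTool’s format language; it merely reproduces the keys shown in this article (up to two last names in lower case, umlauts transliterated, then the last two digits of the year). It expects names in plain Unicode rather than LaTeX encoding and does not treat name particles such as “ter” specially. The names make_key and colliding_keys are made up for this illustration.

```python
import re
from collections import Counter

# Transliteration as observed in the generated keys (an assumption about
# what the script's BibTool configuration does in general).
UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def last_name(name):
    """Extract the last name from 'First Last' or 'Last, First'."""
    name = name.strip()
    if "," in name:
        return name.split(",")[0].strip()
    return name.split()[-1]


def make_key(names, year, max_names=2):
    """Build a key such as 'steinhoefel.haehnle-19'."""
    parts = []
    for name in names[:max_names]:
        lowered = last_name(name).lower()
        for umlaut, ascii_form in UMLAUTS.items():
            lowered = lowered.replace(umlaut, ascii_form)
        # Drop everything that is not a plain lower-case letter.
        parts.append(re.sub(r"[^a-z]", "", lowered))
    return ".".join(parts) + "-" + str(year)[-2:]


def colliding_keys(keys):
    """Return the keys that occur more than once, in sorted order."""
    return sorted(key for key, count in Counter(keys).items() if count > 1)
```

Two papers by the same first two authors from the same year end up with the same key under this scheme, which is exactly the situation in which a ...*1 key appears.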
Preserving Title Case and Removing Superfluous Fields
Now, this is the controversial part 😃 If your title field is only enclosed by
single braces (or quotation marks) as in Title = {Abstract Execution}, many
bibliography styles will render the title as “Abstract execution.” The second
word appears in lower case; the title case formatting is not preserved. I don’t
like this: Titles should be in title case, full stop. However, some people think
that you should not fuss with bib entries. I honestly don’t understand why this
should be the case; in the actual papers, the titles are also in Title Case,
right? Well, I always change the title fields to look like
Title = {{Abstract Execution}}, i.e., enclosed by double braces, which preserves
the case.
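Adding the second pair of braces by hand gets tedious for larger files. The following Python sketch automates it under simplifying assumptions: the title field must start its line, use braces (not quotation marks), and fit on a single line. Titles that already start with two braces are left alone, even if the inner pair only protects the first word. The name protect_titles is hypothetical.

```python
import re

# A title field at the start of a line whose value is wrapped in a single
# pair of braces. BookTitle does not match because the field name must
# directly follow the indentation.
SINGLE_BRACED_TITLE = re.compile(
    r"^(?P<head>[ \t]*title[ \t]*=[ \t]*)\{(?!\{)(?P<value>.*)\}(?P<tail>[ \t]*,?[ \t]*)$",
    re.IGNORECASE | re.MULTILINE,
)


def protect_titles(bibtex):
    """Wrap single-braced, single-line title values in a second brace pair."""
    return SINGLE_BRACED_TITLE.sub(r"\g<head>{{\g<value>}}\g<tail>", bibtex)
```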
I generally remove fields like timestamp, biburl, or bibsource added by DBLP.
They won’t appear in the rendered document, and only clutter my BibTeX file.
Also, I usually change URL to x-URL, which effectively comments the URL out (you
could also remove the field, but I’m always a little reluctant to do so). The
reason for this is that in bibliography styles which consider the DOI field, the
link to the doi.org page would appear twice otherwise, which does not make sense
in my opinion.
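This cleanup can be scripted, too. Here is a minimal Python sketch, assuming that each of the affected fields occupies a single line, as in the DBLP export shown earlier; the function name tidy_fields is made up. A trailing comma left behind after the last remaining field is harmless in BibTeX.

```python
import re

# Fields added by DBLP that never show up in the rendered bibliography.
DROP = ("timestamp", "biburl", "bibsource")


def tidy_fields(bibtex):
    """Drop DBLP bookkeeping fields and rename url to x-URL."""
    kept = []
    for line in bibtex.splitlines():
        name = re.match(r"\s*([\w-]+)\s*=", line)
        field = name.group(1).lower() if name else None
        if field in DROP:
            continue  # only safe for fields that fit on one line
        if field == "url":
            # Rename the field; styles ignore names they do not know.
            line = re.sub(r"(?i)\burl\b", "x-URL", line, count=1)
        kept.append(line)
    return "\n".join(kept)
```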
Thus, the “Abstract Execution” entry in my BibTeX file finally has the following
shape (here with URL removed, and not commented out):
@InProceedings{ steinhoefel.haehnle-19,
Author = {Dominic Steinh{\"{o}}fel and Reiner H{\"{a}}hnle},
Editor = {Maurice H. ter Beek and Annabelle McIver and Jos{\'{e}} N.
Oliveira},
Title = {{Abstract Execution}},
BookTitle = {Formal Methods - The Next 30 Years - Third World Congress,
{FM} 2019, Porto, Portugal, October 7-11, 2019,
Proceedings},
Series = {Lecture Notes in Computer Science},
Volume = {11800},
Pages = {319--336},
Publisher = {Springer},
Year = {2019},
DOI = {10.1007/978-3-030-30942-8\_20},
}
That’s it! If you get used to
- Using DBLP
- Citing online sources correctly
- Checking your BibTeX for errors using an automatic linter
you will obtain a nice and correct bibliography in your paper or thesis almost every time! Additionally, if you
- Use BibTool, remove superfluous fields, and wrap titles (in Title Case) in double braces,
your BibTeX code will look nice and uniform, including the keys, titles will appear in tidy Title Case, and readers of your bibliography will be happy 😊 Enjoy!