The Most Reliable Travelogues Database

Our bibliography of travel accounts currently has more than 1,400 entries and has thus already become a significant source of information on the genre. Collecting the data, however, is only a small step toward our final goal: we aim to build a fully accurate and reliable database of travelogues. We therefore keep gathering, cleaning, and improving our data.

We obtain our metadata on travel accounts from the most reliable sources, namely the best research libraries in the world. However, even the data from the best databases (such as those of HathiTrust and Princeton University Library) contain various inaccuracies and inconsistencies. As a result, we need to inspect every single entry manually and check the information on authors, translators, editors, publishers, and years and places of publication. One common problem, for example, is that the downloaded metadata list translators and editors (and sometimes even publishers) as authors. This inflates the number of travelogue writers, which is a serious problem for our purposes.
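A first automated pass can at least flag suspicious author entries for this manual check. The following is a minimal sketch under stated assumptions: the record layout (a dictionary with an `authors` list), the function name `flag_author_roles`, and the role markers are hypothetical, and real catalogue exports use many more variants. It does not replace manual inspection.

```python
import re

# Hypothetical role markers (assumption); real exports use many more variants.
ROLE_PATTERNS = {
    "translator": re.compile(r"\b(translator|transl?\.|tr\.)", re.IGNORECASE),
    "editor": re.compile(r"\b(editor|eds?\.)", re.IGNORECASE),
    "publisher": re.compile(r"\b(publisher|publ?\.)", re.IGNORECASE),
}


def flag_author_roles(record):
    """Split a record's 'authors' list into likely authors and flagged entries.

    Returns (authors, flagged), where flagged is a list of (name, role)
    tuples that a human reviewer should confirm.
    """
    authors, flagged = [], []
    for name in record.get("authors", []):
        # Take the first role whose marker appears in the name string.
        role = next(
            (r for r, pattern in ROLE_PATTERNS.items() if pattern.search(name)),
            None,
        )
        if role is None:
            authors.append(name)
        else:
            flagged.append((name, role))
    return authors, flagged


# Invented example record, for illustration only.
record = {
    "title": "Example travel account",
    "authors": ["Doe, Jane", "Roe, Richard, translator", "Poe, Ann, ed."],
}
authors, flagged = flag_author_roles(record)
print(authors)
print(flagged)
```

Entries without a role marker stay in the author list, so a translator recorded with no marker at all will still slip through; this is one reason why every entry is also checked by hand.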

Another problem that we face is separating real travel accounts from “fake” or “quasi-” ones. We have detected several fictional travel accounts that were added to our bibliography through semi-automatic bibliography-building processes. Moreover, there are reports and treatises on the Middle East that seem, at first sight, to result from personal experience and observation in the region, yet were in fact written without the author ever visiting it. Such works, despite the potentially valuable information that they may contain, do not fit our (rather wide) definition of a travel account. They are therefore removed from our bibliography.

In short, through meticulous efforts in collecting and cleaning the data on the travelogues, we are building “the” most accurate and reliable database of travel accounts about the 19th-century Middle East.


“A Hypertext Map” of Mark Twain’s Travel Accounts

“With the help of computer technology this map attempts to display two different things. The black line tracks the route MT took on the Quaker City excursion. At the same time, the 13 places named in blue are active links. Clicking on any of them will take you to a passage in Innocents Abroad about that place. And when MT’s book is reconceived this way — geographically, or geo-culturally — what begins to appear is a map of the racial and ethnic prejudices shared by the book and its American audience.”



Mark Twain’s Innocents Abroad; or, The New Pilgrim’s Progress:

“Published in 1869, this account of a trip east to the Old World was a great popular success. Within its first year it sold over 70,000 copies, and it remained the best-selling of MT’s books throughout his lifetime. The book began as a series of travel letters written mainly for the Alta California, a San Francisco paper that sponsored MT’s participation in the Quaker City trip to Europe and the Holy Land in 1867. Revising the letters into a book was suggested by Elisha Bliss, who published Innocents as a subscription book on July 20th, 1869.”


Poster Presentation at DHd 2019

We will have a poster presentation at DHd 2019.

(“6. Jahrestagung Digital Humanities im deutschsprachigen Raum,” Frankfurt & Mainz, 25.03.–29.03.2019)

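Poster Title: “Linked Open Travel Data: Erschließung gesellschaftspolitischer Veränderungen im Osmanischen Reich im 19. Jahrhundert im Spiegel von Reiseberichten durch ein multimediales Online-Portal mit LOD und Text Mining Funktionalität” (in English: “Linked Open Travel Data: Making Sociopolitical Change in the 19th-Century Ottoman Empire Accessible, as Reflected in Travel Accounts, through a Multimedia Online Portal with LOD and Text Mining Functionality”)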


We have exceeded 1,000 entries in our travelogues bibliography! This number includes different editions and translations of some popular works, which will give us a chance to analyze variations in different editions.

We will soon update the bibliography on the website and add more information about the authors.

Notes on Method

As a first step, we have identified the published reports of European travelers and entered them into a Zotero bibliography, accompanied by related primary and secondary literature. We have identified about 530 individual reports so far. Digital copies of approximately 95% of this initial sample are available from Google Books and HathiTrust. These copies will be used for the project. An online portal will provide information about the travelers and their reports, and it will also give access to the copyright-free full texts of these works.

The processed data, which include texts, images, and metadata, will be presented via the GIS functionality of the platform (integrated into OSDS). The itineraries will thus be visualized. Copyright-free images found in the travelogues will be assigned to the relevant locations and displayed or linked on the platform. This will make the travel experience vividly comprehensible and place the texts and relevant media in their geographical context.
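One common way to hand an itinerary to a mapping component is GeoJSON. The sketch below is only an illustration: whether the platform ingests GeoJSON is an assumption, the function name `itinerary_to_geojson` and the property names are hypothetical, and the coordinates are approximate values chosen for the example rather than project data.

```python
import json


def itinerary_to_geojson(traveler, stops):
    """Build a GeoJSON FeatureCollection for one itinerary.

    stops is an ordered list of (place_name, latitude, longitude) tuples.
    The result holds one LineString for the route and one Point per stop.
    """
    if len(stops) < 2:
        raise ValueError("A route needs at least two stops.")
    # GeoJSON orders coordinates as [longitude, latitude].
    route = {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            "coordinates": [[lon, lat] for _, lat, lon in stops],
        },
        "properties": {"traveler": traveler},
    }
    points = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name, "order": order},
        }
        for order, (name, lat, lon) in enumerate(stops, start=1)
    ]
    return {"type": "FeatureCollection", "features": [route] + points}


# Invented itinerary with approximate coordinates, for illustration only.
stops = [
    ("Constantinople", 41.01, 28.98),
    ("Smyrna", 38.42, 27.14),
    ("Beirut", 33.89, 35.50),
]
collection = itinerary_to_geojson("Example traveler", stops)
print(json.dumps(collection, indent=2))
```

Keeping each stop as its own Point feature leaves room for attaching images or text passages to a location through additional properties.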

To index the reports, we import the files into a search index and extract the relevant entities. This makes a faceted search in the Apache Solr index possible. As the search environment we use Open Semantic Search, which offers integrated named entity recognition (NER). In a further development phase, we will use the Linked Data capacities of Open Semantic Search to integrate additional metadata about the travelers and the places that they visited. Targeted place names can be fed into the NER automatically via Wikidata. Since errors cannot be avoided in the automated generation of facets and metadata, manual post-processing and enrichment of the texts will be necessary.
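A faceted query against a Solr index can be sketched as follows. The request parameters (`q`, `rows`, `wt`, `facet`, `facet.field`, `facet.mincount`) are standard Solr query parameters; the host, the core name `travelogues`, and the field names `person_ss` and `location_ss` are assumptions for illustration and will differ in an actual Open Semantic Search installation. The sketch only builds the request URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, core, text, facet_fields, rows=10):
    """Return the URL of a Solr select request with faceting enabled."""
    params = [
        ("q", text),            # full-text query
        ("rows", rows),         # number of documents to return
        ("wt", "json"),         # response format
        ("facet", "true"),      # switch faceting on
        ("facet.mincount", 1),  # hide facet values with zero hits
    ]
    # Solr accepts one facet.field parameter per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical host, core, and field names (assumptions).
url = build_facet_query(
    "http://localhost:8983/solr",
    "travelogues",
    "Jerusalem",
    ["person_ss", "location_ss"],
)
print(url)
```

The JSON response to such a request lists, for each facet field, the extracted values together with their document counts, which is what a faceted search interface displays as filters.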