This data catalog gives an overview of the Linked Datasets that were available on the Web in August 2014.

The content of the catalog originates from two sources:

  1. We performed a crawl of the Web of Linked Data in April 2014 and analyzed the discovered Linked Data sets for their compliance with the Linked Data best practices. The catalog contains all 1091 datasets that were discovered during the crawl, together with the results of our analysis in the form of tags. These tags indicate, for instance, whether a data source provides licensing or provenance meta-information, or which vocabularies a data source uses. The results of the analysis of the crawled data are found at State of the LOD Cloud 2014. The original crawled data is found in the Supplementary Material at http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/.

  2. The Linked Data community collects meta-information about datasets in the datahub.io catalog. We added to this catalog all dataset descriptions from the DataHub catalog that were tagged on August 22nd, 2014 with the tags lod, lodcloud, …

Thus, this catalog contains a mixture of community-provided metadata and metadata derived from the Linked Data crawl. By combining metadata from both sources, it provides rather comprehensive descriptions of the Linked Data sets that were available on the Web in August 2014.

This catalog formed the basis for the August 2014 version of the Linked Data cloud diagram, the April 2014 version of the crawlable Linked Data diagram, and the analysis in the paper "Adoption of the Linked Data Best Practices in Different Topical Domains" presented at ISWC2014.

Both versions of the LOD cloud diagram contain only datasets that are interlinked with other datasets. The analysis presented in the paper covers all published datasets, regardless of whether they are interlinked or not.

To indicate whether a dataset in this catalog was visited during our crawl and whether it appears in the LOD clouds, we marked the corresponding datasets in the catalog with the following tags:

  1. LinkedDataCrawl2014: Dataset was visited during our crawl, and the metadata obtained during our analysis was added to the catalog using the cataloging guidelines defined at http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation.

  2. crawledLinkedDataCloud2014: Dataset is part of the crawled Linked Data Cloud 2014, meaning that it was visited by our crawler and that our crawler discovered RDF links pointing at other linked datasets.

  3. LODCloud2014: Dataset is part of the LOD Cloud diagram 2014, which includes all datasets that set RDF links to other datasets and are either cataloged in the DataHub catalog or were discovered during our crawl.
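
The tag definitions above can be read as a small set of rules. The following is a minimal sketch of one possible reading, not the procedure actually used to build the catalog. The Dataset record, its field names, and the catalog_tags function are hypothetical and exist only for illustration. In particular, the split between links seen by the crawler and links declared in community metadata is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical record; none of these field names come from the catalog.
    name: str
    crawled: bool                     # visited during the April 2014 crawl
    in_datahub: bool                  # described in the DataHub catalog
    links_seen_by_crawler: bool       # crawler found RDF links to other datasets
    links_declared_in_metadata: bool  # community metadata lists such links


def catalog_tags(ds: Dataset) -> set[str]:
    """Return the tags a dataset would carry under this reading of the rules."""
    tags: set[str] = set()

    # Rule 1: every dataset visited by the crawler.
    if ds.crawled:
        tags.add("LinkedDataCrawl2014")

    # Rule 2: visited by the crawler, and the crawler itself found outgoing links.
    if ds.crawled and ds.links_seen_by_crawler:
        tags.add("crawledLinkedDataCloud2014")

    # Rule 3: sets RDF links (assumed: from either source of evidence) and is
    # either cataloged in the DataHub or was discovered during the crawl.
    sets_links = ds.links_seen_by_crawler or ds.links_declared_in_metadata
    if sets_links and (ds.in_datahub or ds.crawled):
        tags.add("LODCloud2014")

    return tags


if __name__ == "__main__":
    example = Dataset("example-dataset", crawled=False, in_datahub=True,
                      links_seen_by_crawler=False, links_declared_in_metadata=True)
    print(sorted(catalog_tags(example)))
```

Under this reading, a dataset that is only known from the DataHub catalog and declares outgoing links carries LODCloud2014 alone, while a crawled dataset without any outgoing links carries LinkedDataCrawl2014 alone.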

Note that this catalog is static. If you want to add datasets or update the description of datasets, please perform your changes in the datahub.io catalog according to the guidelines found at http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation.

The creation of this catalog as well as the LOD cloud diagrams has been supported by the EU project PlanetData.