CliqueSquare RDF platform on Hadoop available for download

We are pleased to announce the source code release of CliqueSquare, an RDF data management system based on Hadoop.

CliqueSquare is a system for storing and querying large RDF graphs that relies on Hadoop’s distributed file system (HDFS) and on Hadoop’s open-source MapReduce implementation. Its novel partitioning and storage scheme allows 1-level joins to be evaluated locally using efficient map-only joins. In addition, CliqueSquare has an optimization algorithm based on graphs and cliques that generates highly parallelizable flat query plans relying on n-ary equality joins.
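To make the idea of locally evaluated joins concrete, here is a minimal Python sketch. It is not CliqueSquare code. It assumes a scheme in which every triple is stored three times, on the nodes chosen by hashing its subject, its property, and its object, so that triples sharing a join value end up on the same node. The names NUM_NODES, node_for, partition and local_subject_join are hypothetical, as is the sample data.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size, chosen only for this illustration


def node_for(value):
    """Deterministically map an RDF term to a node index."""
    return zlib.crc32(value.encode("utf-8")) % NUM_NODES


def partition(triples):
    """Store each triple on the nodes chosen by its subject, property and object."""
    nodes = defaultdict(lambda: {"s": [], "p": [], "o": []})
    for s, p, o in triples:
        nodes[node_for(s)]["s"].append((s, p, o))
        nodes[node_for(p)]["p"].append((s, p, o))
        nodes[node_for(o)]["o"].append((s, p, o))
    return nodes


def local_subject_join(nodes, prop_a, prop_b):
    """Evaluate { ?x prop_a ?y . ?x prop_b ?z } node by node, with no data movement."""
    results = []
    for node in nodes.values():
        # Index the prop_b triples held in this node's subject partition.
        by_subject = defaultdict(list)
        for s, p, o in node["s"]:
            if p == prop_b:
                by_subject[s].append(o)
        # Probe the index with the prop_a triples from the same partition.
        for s, p, o in node["s"]:
            if p == prop_a:
                for z in by_subject[s]:
                    results.append((s, o, z))
    return sorted(results)


TRIPLES = [
    ("alice", "knows", "bob"),
    ("alice", "worksAt", "inria"),
    ("bob", "knows", "carol"),
    ("bob", "worksAt", "acme"),
    ("carol", "knows", "alice"),
]

nodes = partition(TRIPLES)
matches = local_subject_join(nodes, "knows", "worksAt")
```

Because all triples with the same subject hash to the same node, each node can complete its share of the join from its own data, which is what a map-only job needs.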
The system is described in an upcoming ICDE 2015 paper as well as an ICDE 2015 demonstration (see

CliqueSquare Features
* Scalable RDF storage using novel partitioning algorithms specially designed for Hadoop and HDFS that take into account the peculiarities of the RDF structure to reduce query-generated network traffic
* Scalable processing of SPARQL Basic Graph Pattern (BGP) queries relying on:
(i) novel optimization algorithms aiming to produce highly parallelizable query plans;
(ii) efficient MapReduce physical operators maximizing the usage of the Hadoop cluster.
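The flat plans mentioned above rest on n-ary equality joins, which combine any number of inputs on a shared variable in one step instead of chaining binary joins. The following Python sketch illustrates the operator in memory only. It is an assumption-laden illustration, not the system's MapReduce operator, and the function name nary_equality_join and the sample bindings are hypothetical.

```python
from collections import defaultdict
from itertools import product


def nary_equality_join(inputs, var):
    """Join any number of binding lists on one shared variable in a single step."""
    if not inputs:
        return []
    # Group every input by the value it binds to the join variable.
    groups = []
    for bindings in inputs:
        group = defaultdict(list)
        for b in bindings:
            group[b[var]].append(b)
        groups.append(group)
    # Only values present in every input can produce a result.
    common = set(groups[0])
    for group in groups[1:]:
        common &= set(group)
    results = []
    for key in sorted(common):
        # Combine one binding from each input for this join value.
        for combo in product(*(group[key] for group in groups)):
            merged = {}
            for b in combo:
                merged.update(b)
            results.append(merged)
    return results


knows = [{"x": "alice", "y": "bob"}, {"x": "bob", "y": "carol"}]
works = [{"x": "alice", "z": "inria"}, {"x": "bob", "z": "acme"}]
lives = [{"x": "alice", "w": "paris"}]

joined = nary_equality_join([knows, works, lives], "x")
```

A plan built from binary joins would need two join levels for these three inputs. The n-ary form needs one, which is what keeps plans flat.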

Minimum system requirements
* Hadoop 1.2.1
* Linux / Mac OS
* Java 6

The initial release of CliqueSquare is available at:

Feature to be added soon: support for grouping and aggregation

Try it out and help us improve it by sending us your feedback:

Best regards,

The CliqueSquare Team

Team Website:
