Organize your data

Make your data easily accessible across all your devices, scalable, understandable and easy to analyse

More Info

Make your data rock

Model your data, interpret them and take appropriate action

More Info

Save your time

Automate your analysis and save time and money right now

More Info

Stat'Rgy news

Stat'Rgy in action

Latest Posts

When relational databases are not good enough

Relational databases are often a good fit for your data, especially if the data are well defined in advance and the structure is not expected to change much. Everything goes fine until your data volumes become too large. Then it is time to think of another solution, a NoSQL solution such as HBase, Cassandra, […]

Continue reading

Spark Streaming: a simple example streaming data from HDFS

This is a short example of how to count words from incoming files stored in HDFS. Note that Spark Streaming can read data from HDFS but also from Flume, Kafka, Twitter and ZeroMQ. For our example, the virtual machine (VM) from Cloudera was used (CDH5.3). From the command line, let’s open the spark shell […]

Continue reading

Best data processing engine: Flink vs Spark

Flink recently graduated to a top-level Apache project. Flink started as a research project at the Technical University of Berlin in 2009. Spark was originally developed in the AMPLab at UC Berkeley in 2009 and became an Apache top-level project in February 2014. Although Flink is less well known than Spark, especially outside Europe, it is […]

Continue reading

Best SQL-on-Hadoop tool?

SQL on Hadoop: Drill, Impala or Spark SQL

Drill, Impala and Spark SQL all fit into the SQL-on-Hadoop category. Apache Drill and Spark are both top-level Apache projects. Impala is developed by Cloudera and shipped by Cloudera, MapR, Oracle and Amazon. Spark SQL is part of the Spark project and is mainly supported by […]

Continue reading

Stat'Rgy works with all kinds of companies, from international groups to small and medium-sized companies