maven

Working with JSON in Scala using the json4s library (Part one).

In this article, you can find a comparison between Scala JSON libraries in terms of parsing speed. One of the best results was achieved by the json4s library. In this first part I will describe the library and its main functions, while in the second part I’ll go deeper with some more detailed examples. As usual, let’s create a Maven Scala project with Eclipse, adding the following dependency to the Maven pom.xml file:

<dependency>
  <groupId>org.json4s</groupId>
  <artifactId>json4s-native_${scala.version}</artifactId>
  <version>3.2.10</version>
</dependency>

Substitute ${scala.version} with your version of Scala (2.10 for example). If you don’t know how to create a Maven project with Scala in Eclipse, follow this article (just the first part, which shows how to set up Eclipse with the Scala plugin). At the time of writing, I ran into some problems with version 3.2.11 (the latest one), but the previous version worked smoothly. Now let’s create a Scala object with a main function to run:

package com.nosqlnocry.test
import org.json4s._
import org.json4s.JsonDSL._
// json4s-native (the artifact declared in the pom) provides JsonMethods in the native package
import org.json4s.native.JsonMethods._
object Json4sTest {  
  def main(args: Array[String]): Unit = {
    ...
  }
}
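
As a rough idea of what could go inside main, here is a minimal sketch that builds a JSON value with the DSL, serializes it, and reads a field back after parsing. It assumes the json4s-native artifact from the pom, hence the org.json4s.native.JsonMethods import (with json4s-jackson the import would be org.json4s.jackson.JsonMethods instead). The names Json4sSketch, build and langOf are made up for this illustration and are not part of the library.

```scala
import org.json4s._
import org.json4s.JsonDSL._
import org.json4s.native.JsonMethods._

// Hypothetical demo object; only parse, render, compact, ~ and \ come from json4s.
object Json4sSketch {

  // Build a JSON object with the DSL and serialize it to a compact string.
  def build(): String = {
    val json = ("lang" -> "scala") ~ ("version" -> 2)
    compact(render(json))
  }

  // Parse a JSON string and read the "lang" field back out of the tree.
  def langOf(text: String): Option[String] =
    (parse(text) \ "lang") match {
      case JString(s) => Some(s)
      case _          => None // field missing or not a string
    }

  def main(args: Array[String]): Unit = {
    val text = build()
    println(text)         // {"lang":"scala","version":2}
    println(langOf(text)) // Some(scala)
  }
}
```

Returning an Option from langOf keeps the "field missing" case explicit instead of throwing; this is a design choice of the sketch, not something the library imposes.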

Before starting, we have to take a look at how json4s models JSON: every document is represented as an abstract syntax tree (AST). …
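
To make the AST idea more concrete, the sketch below walks a parsed value and names the node type found at each position. JObject, JArray, JString, JInt, JDouble, JBool and JNull are json4s node types; the object AstSketch and its methods describe and describeJson are hypothetical helpers written for this example, and the import again assumes json4s-native.

```scala
import org.json4s._
import org.json4s.native.JsonMethods._

// Hypothetical helper that prints the shape of a json4s AST.
object AstSketch {

  // Name the node type of a value, recursing into objects and arrays.
  // The final wildcard covers node types not listed here (e.g. JNothing).
  def describe(v: JValue): String = v match {
    case JObject(fields) =>
      fields.map { case (k, x) => k + ": " + describe(x) }.mkString("object(", ", ", ")")
    case JArray(items) =>
      items.map(describe).mkString("array(", ", ", ")")
    case JString(_) => "string"
    case JInt(_)    => "int"
    case JDouble(_) => "double"
    case JBool(_)   => "bool"
    case JNull      => "null"
    case _          => "other"
  }

  // Convenience wrapper: parse the text first, then describe the tree.
  def describeJson(text: String): String = describe(parse(text))

  def main(args: Array[String]): Unit =
    println(describeJson("""{"name":"json4s","tags":["a","b"],"stars":3}"""))
}
```

Because every node is a plain case class or case object, ordinary pattern matching is enough to traverse a document.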

How to build a Spark fat jar in Scala and submit a job

Are you looking for a ready-to-use solution to submit a job in Spark? These are short instructions on how to start a Spark Scala project and build a fat jar that can be executed in a Spark environment. I assume you have already installed Maven (and a Java JDK) and Spark (locally or on a real cluster); you can either compile the project from your shell (as I’ll show here) or “import an existing Maven project” with Eclipse and build it from there (read this other article to see how).

Requirements: Maven installation, Spark installation.

Simply download the following Maven project from GitHub: https://github.com/H4ml3t/spark-scala-maven-boilerplate-project

If you have git installed, you can clone the repository:

git clone git@github.com:H4ml3t/spark-scala-maven-boilerplate-project.git
cd spark-scala-maven-boilerplate-project

Without git, download the zip archive from here: https://github.com/H4ml3t/spark-scala-maven-boilerplate-project/archive/master.zip (to extract it, use: unzip master.zip)

Here is the Maven pom.xml file: …