Programmer's Stack: What is Apache Spark?

What is Apache Spark?

- Fast, scalable distributed data processing engine

- Provides high-level APIs for in-memory processing, which can give significant performance improvements over Hadoop MapReduce

- Spark SQL for working with structured or tabular data

- Spark Streaming for processing streaming data in near real time

- MLlib for machine learning and GraphX for graph processing

- It's written in Scala (a JVM language)

- It provides APIs for Scala, Java, Python and R
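
To get a feel for what the high-level API looks like, here is a rough sketch of the classic word count. It is plain Python with no Spark dependency, so it only imitates the shape of the flatMap, map, reduceByKey chain on one machine; the helper names (flat_map, reduce_by_key, word_count) are made up for this illustration, and a real Spark job would run the same steps in parallel across a cluster.

```python
from collections import defaultdict
from functools import reduce
from operator import add


def flat_map(func, items):
    """Apply func to each item and flatten the results into one list."""
    return [out for item in items for out in func(item)]


def reduce_by_key(func, pairs):
    """Group (key, value) pairs by key, then fold each group's values with func."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(func, values) for key, values in groups.items()}


def word_count(lines):
    """Count words with a flatMap -> map -> reduceByKey style chain.

    Roughly what this would look like on a Spark RDD (assuming an existing
    SparkContext named sc):
        sc.parallelize(lines)
          .flatMap(lambda line: line.split())
          .map(lambda word: (word.lower(), 1))
          .reduceByKey(add)
    """
    words = flat_map(str.split, lines)             # split each line into words
    pairs = [(word.lower(), 1) for word in words]  # emit (word, 1) pairs
    return reduce_by_key(add, pairs)               # sum the 1s per word


if __name__ == "__main__":
    sample = ["Spark is fast", "Spark is scalable"]
    print(word_count(sample))
```

The point of the sketch is the style: you describe a chain of transformations over a collection, and the engine (here just a list comprehension, in Spark a distributed scheduler) decides how to execute it.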
