Sumit Kumar and Sourav Gulati are technology evangelists with deep experience in envisioning and implementing solutions, as well as in solving complex problems that deal with large, high-velocity data. Every time I have talked to them about a complex problem statement, they have provided an innovative and scalable solution.
I have over 17 years of experience in the IT industry, specializing in envisioning, architecting, and implementing various enterprise solutions revolving around a variety of business domains, such as hospitality, healthcare, risk management, and insurance.

I have known Sumit and Sourav for 5 years as developers/architects who have worked closely with me implementing various complex big data solutions. From their college days, they were inclined toward exploring and implementing distributed systems. As if implementing solutions around big data systems were not enough, they also started sharing their knowledge and experience with the big data community. They have actively contributed to various blogs and tech talks, and in no circumstances do they pass up an opportunity to help their fellow technologists.

Knowing Sumit and Sourav, I am not surprised that they have started authoring a book on Spark, and I am writing the foreword for their book, Apache Spark 2.x for Java Developers.

Their passion for technology has again resulted in the terrific book you now have in your hands.

This book is the product of Sumit's and Sourav's deep knowledge and extensive implementation experience in Spark for solving real problems that deal with large, fast, and diverse data.

Several books on distributed systems exist, but Sumit's and Sourav's book closes a substantial gap between theory and practice. Their book offers comprehensive, detailed, and innovative techniques for leveraging Spark and its extensions/API for implementing big data solutions. This book is a precious resource for practitioners envisioning big data solutions for enterprises, as well as for undergraduate and graduate students keen to master Spark and its extensions using its Java API.

This book starts with an introduction to Spark and then covers the overall architecture and concepts such as RDD, transformation, and partitioning. It also discusses in detail various Spark extensions, such as Spark Streaming, MLlib, Spark SQL, and GraphX.

Each chapter is dedicated to a topic and includes an illustrative case study that covers state-of-the-art Java-based tools and software. Each chapter is self-contained, providing great flexibility of usage. The accompanying website provides the source code and data. This is truly a gem for both students and big data architects/developers, who can experiment first-hand with the methods just learned, or can deepen their understanding of the methods by applying them to real-world scenarios.

As I was reading the various chapters of the book, I was reminded of the passion and enthusiasm that Sumit and Sourav have for distributed frameworks. They have communicated the concepts described in the book with clarity and with the same passion. I am positive that you, as a reader, will feel the same. I will certainly keep this book as a personal resource for the solutions I implement, and strongly recommend it to my fellow architects.

Sumit Gupta
Sourav Gulati has been associated with the software industry for more than 7 years. He started his career with Unix/Linux and Java and then moved toward the big data and NoSQL world. He has worked on various big data projects. He has recently started a technical blog called Technical Learning as well. Apart from the IT world, he loves to read about mythology.

Sumit Kumar is a developer with industry insights in telecom and banking. At different junctures, he has worked as a Java and SQL developer, but it is shell scripting that he finds both challenging and satisfying at the same time. Currently, he delivers big data projects focused on batch/near-real-time analytics and the distributed indexed querying system. Besides IT, he takes a keen interest in human and ecological issues.
About the Reviewer

Prashant Verma started his IT career in 2011 as a Java developer at Ericsson, working in the telecom domain. After a couple of years of Java EE experience, he moved into the big data domain, and has worked on almost all the popular big data technologies, such as Hadoop, Spark, Flume, Mongo, Cassandra, etc. He has also played with Scala. Currently, he works with QA Infotech as Lead Data Engineer, working on solving e-learning problems using analytics and machine learning.

Prashant has also worked on Apache Spark 2.x for Java Developers, Packt, as a Technical Reviewer.

I want to thank Packt Publishing for giving me the chance to review the book, as well as my employer and my family for their patience while I was busy working on this book.