Wednesday 20 February 2013

Books that every developer must read!

        I have a rack full of books! I just wanted to make a wishlist of all the books I've read and have yet to read! :)

1.
Design Patterns: Elements of Reusable Object-Oriented Software
Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides

2.
Structure and Interpretation of Computer Programs, Second Edition
Harold Abelson, Gerald Jay Sussman

3.
Refactoring to Patterns
Joshua Kerievsky

4.
Types and Programming Languages
Benjamin C. Pierce

5.
Code: The Hidden Language of Computer Hardware and Software
Charles Petzold

6.
Object-Oriented Analysis and Design with Applications (2nd Edition)
Grady Booch

7.
Code Complete: A Practical Handbook of Software Construction, Second Edition
Steve McConnell

8.
The Design of the UNIX Operating System [Prentice-Hall Software Series]
Maurice J. Bach

9.
The Pragmatic Programmer: From Journeyman to Master
Andrew Hunt, David Thomas

10.
Practical API Design: Confessions of a Java Framework Architect
Jaroslav Tulach

11.
The Practice of Programming (Addison-Wesley Professional Computing Series)
Brian W. Kernighan, Rob Pike


12.
Programming Pearls (2nd Edition)
Jon Bentley

13.
Writing Secure Code, Second Edition
Michael Howard, David LeBlanc

14.
The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition (2nd Edition)
Frederick P. Brooks Jr.

15.
Patterns of Enterprise Application Architecture
Martin Fowler

16.
Introduction to Functional Programming (Prentice Hall International Series in Computing Science)
Richard Bird

17.
The Art of Computer Programming
Donald E. Knuth

18.
Effective Java (2nd Edition)
Joshua Bloch


19.
Thinking in Java (4th Edition) 
Bruce Eckel

20.
Programmers at Work: Interviews With 19 Programmers Who Shaped the Computer Industry (Tempus)
Susan Lammers


21.
Coders at Work: Reflections on the Craft of Programming
Peter Seibel

Well, I compiled all of the above from Amazon. I will keep adding to the list as I remember more :)
Happy Learning! :)

Monday 11 February 2013

Hadoop Hangover: How to launch a CDH4 Hadoop cluster [MRv1 / YARN + Ganglia] using Apache Whirr


  This post is about how to launch a CDH4 MRv1 or CDH4 YARN cluster on EC2 instances. It's said that you can launch a cluster with Whirr in a matter of 5 minutes! That is true if and only if everything works out well! ;)

Hopefully, this article helps you in that regard.
So, let's row the boat...
  • Download the stable version of Apache Whirr, i.e. whirr-0.8.1.tar.gz.
  • Extract the tarball.
  • Generate the key.
  • Make a properties file to launch the cluster with that configuration.
  • Now let me tell you how to avoid getting headaches!
    • Cluster name: keep your cluster name simple. Avoid names like testCluster or testCluster1, i.e. no capitals or numerals.
    • Decide judiciously on the number of datanodes you want.
    • Your launch may fail if Java is not installed, so make sure the image has Java. However, this properties file takes care of that.
    • It is better to go ahead with MRv1 for now and switch to MRv2 later, when a production-stable release is available.
    • This is the minimal set of configurations for launching a Hadoop cluster, but you can do a lot of performance tuning on top of it.
    • I launched this cluster from an EC2 instance. Initially I faced errors regarding the user. Setting the configuration below solved the problem.
    • Set proper permissions for ~/.ssh and the whirr-0.8.1 folder before launching.
  • Well, we are ready to launch the cluster. Name the properties file "whirr_cdh.properties".
In the console you can see links to the NameNode and JobTracker web UIs. At the end, it also prints how to ssh to the instances.
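
The steps so far, from downloading Whirr to launching the cluster, might look roughly like this in a shell. Treat it as a sketch under assumptions rather than the exact files used for this post: the archive URL, the key name id_rsa_whirr, the cluster name, the node counts and the choice of install_openjdk are all illustrative, and the property names are written from memory of the Whirr recipes, so check them against the recipes/ folder in the tarball.

```shell
#!/bin/sh
# Sketch only: nothing runs until one of the functions below is called.

WHIRR_VERSION=0.8.1
WHIRR_HOME="$HOME/whirr-$WHIRR_VERSION"
KEY_FILE="$HOME/.ssh/id_rsa_whirr"   # hypothetical key name

# Download and unpack the release into $HOME (archive URL is an assumption).
fetch_whirr() {
    (
        cd "$HOME" &&
        curl -O "https://archive.apache.org/dist/whirr/whirr-$WHIRR_VERSION/whirr-$WHIRR_VERSION.tar.gz" &&
        tar -xzf "whirr-$WHIRR_VERSION.tar.gz"
    )
}

# Generate a passphrase-less key pair used only for this cluster,
# and keep ~/.ssh private.
generate_key() {
    mkdir -p "$HOME/.ssh" &&
    chmod 700 "$HOME/.ssh" &&
    ssh-keygen -t rsa -P '' -f "$KEY_FILE"
}

# Write a minimal MRv1 properties file to the path given as $1.
# The quoted 'EOF' keeps the shell from expanding Whirr's ${...} placeholders.
write_properties() {
    cat > "$1" <<'EOF'
whirr.cluster-name=hadoopcluster
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,2 hadoop-datanode+hadoop-tasktracker
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa_whirr
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa_whirr.pub
whirr.env.repo=cdh4
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.configure-function=configure_cdh_hadoop
whirr.java.install-function=install_openjdk
EOF
}

# Launch the cluster described by the properties file given as $1.
launch_cluster() {
    "$WHIRR_HOME/bin/whirr" launch-cluster --config "$1"
}
```

With AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY exported, the order would be fetch_whirr, generate_key, write_properties whirr_cdh.properties and finally launch_cluster whirr_cdh.properties. The cluster name is all lowercase letters, in line with the advice above.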

  • Now you should have the generated files. Under ~/.whirr/<cluster-name> you will be able to see these files: instances, hadoop-proxy.sh and hadoop-site.xml
  • Start the proxy.
  • Open another terminal, and type
  • You should be able to access HDFS.
  • Alternatively, you can download a Hadoop tarball and launch with
  • Okay! So I know that you will not be satisfied unless you see a web UI.
So, we are good to go! 
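
For reference, the proxy and HDFS steps above could be sketched like this. It assumes the cluster was named hadoopcluster, that Whirr wrote its files under ~/.whirr/<cluster-name>, and that a Hadoop client of a matching version is already on the PATH of the machine you launched from. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical helpers, not part of Whirr.

CLUSTER_NAME=hadoopcluster   # must match whirr.cluster-name

# Directory holding instances, hadoop-proxy.sh and hadoop-site.xml.
cluster_dir() {
    echo "$HOME/.whirr/$1"
}

# Terminal 1: start the SSH tunnel. It stays in the foreground until Ctrl-C.
start_proxy() {
    sh "$(cluster_dir "$1")/hadoop-proxy.sh"
}

# Terminal 2: point the local Hadoop client at the generated
# hadoop-site.xml and list the root of HDFS.
list_hdfs_root() {
    HADOOP_CONF_DIR="$(cluster_dir "$1")" hadoop fs -ls /
}
```

Run start_proxy "$CLUSTER_NAME" in one terminal and list_hdfs_root "$CLUSTER_NAME" in another. For the web UIs, the proxy script opens a SOCKS tunnel (believed to be on local port 6666, but check the script itself), so the browser has to be pointed at that proxy before the NameNode and JobTracker links printed at launch will load.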
  • If you want to launch MRv2, use this.
and follow the same process!
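
As a rough idea of what an MRv2 properties file changes, here is a sketch. It is reconstructed from memory of the CDH4 YARN recipe, not taken from the file used for this post, so every role name and function name below is an assumption to verify against the recipes/ folder in whirr-0.8.1. The cluster name and node counts are illustrative, and the provider, credential and key settings are the same as for MRv1 and are left out.

```shell
#!/bin/sh
# Sketch only: writes the YARN-specific part of a properties file to $1.
# Role and function names are assumptions; check them against the
# recipes shipped with Whirr before launching.
write_yarn_properties() {
    cat > "$1" <<'EOF'
whirr.cluster-name=hadoopyarn
whirr.instance-templates=1 hadoop-namenode+yarn-resourcemanager+mapreduce-historyserver,2 hadoop-datanode+yarn-nodemanager
whirr.env.repo=cdh4
whirr.env.MAPREDUCE_VERSION=2
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.configure-function=configure_cdh_hadoop
whirr.yarn.configure-function=configure_cdh_yarn
whirr.yarn.start-function=start_cdh_yarn
whirr.mr_jobhistory.start-function=start_cdh_mr_jobhistory
EOF
}
```

The main difference from MRv1 is the instance template: the JobTracker and TaskTracker roles give way to a ResourceManager, NodeManagers and a job history server.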
Happy Learning! :)