Monday, September 9, 2013

Get Pig LogicalPlan

Recently I've been wanting to get hold of the logical plan (a graph representation) for a Pig script without running it. The main reason is that the logical plan is a fairly language- and platform-agnostic representation of a dataflow. Once you have the logical plan, I can think of several fun things you could do with it:


  • Serialize it as JSON and send it to any number of arbitrary tools
  • Visualize it in a web browser
  • Edit it with a web app
  • Compile it into an execution (physical) plan for whatever backend framework makes sense beyond Hadoop MapReduce (Storm, S4, Spark)
Ok, so maybe those are the fun things I actually plan on doing with it, but what's the difference?
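
To make the first idea on that list concrete, here is a minimal sketch of what "serialize it as JSON" could look like. It does not touch Pig at all: ToyPlan and PlanNode are hypothetical stand-ins I made up for illustration, and the operator names are just strings. A real version would have to walk the actual LogicalPlan instead.

```ruby
require 'json'

# Hypothetical stand-in for one operator in a plan. Not a Pig class.
PlanNode = Struct.new(:id, :operator, :alias_name)

# Hypothetical toy plan: operators keyed by id, plus directed edges
# from each operator to its successor. Not a Pig class either.
class ToyPlan
  attr_reader :nodes, :edges

  def initialize
    @nodes = {}
    @edges = []
  end

  # Register an operator under a numeric id.
  def add(id, operator, alias_name)
    @nodes[id] = PlanNode.new(id, operator, alias_name)
    self
  end

  # Add a directed edge; both endpoints must already exist.
  def connect(from, to)
    unless @nodes.key?(from) && @nodes.key?(to)
      raise ArgumentError, "unknown node in edge #{from} -> #{to}"
    end
    @edges << [from, to]
    self
  end

  # Plain hash form, ready for JSON.generate.
  def to_h
    {
      'nodes' => @nodes.values.map do |n|
        { 'id' => n.id, 'operator' => n.operator, 'alias' => n.alias_name }
      end,
      'edges' => @edges.map { |from, to| { 'from' => from, 'to' => to } }
    }
  end
end

# A three-step dataflow: load, filter, store.
plan = ToyPlan.new
plan.add(1, 'LOLoad', 'data')
    .add(2, 'LOFilter', 'filtered')
    .add(3, 'LOStore', 'filtered')
plan.connect(1, 2).connect(2, 3)

json = JSON.generate(plan.to_h)
puts json
```

A nodes-plus-edges layout like this is about the simplest shape a browser-side graph tool could consume, which is why I'd start there.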

Problem

Pig doesn't make it easy to get this. After spending several hours digging through the way Pig parses and runs a script, I've come away somewhat shaken up. The parsing logic is deeply coupled with the execution logic. Yes, yes, this is supposed to change as we go forward (e.g. PIG-3419), but what about in the meantime?

Hack/Solution

So, I've written this little JRuby script to return the LogicalPlan for a Pig script. Right now it does exactly what putting an 'EXPLAIN' operator in your script does. However, since it exposes the LogicalPlan, you could easily extend it to do whatever you like with the plan.


6 comments:

  1. I was able to get the logical plan in textual format. Can anybody suggest a Java library to convert the generated logical plan into a directed acyclic graph in graphical form?