The explain operator displays the logical, physical, and MapReduce execution plans of a relation.
Syntax
Given below is the syntax of the explain operator.
grunt> explain relation_name;
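Pig's reference documentation also lists optional flags for explain, including -brief, -out, and -dot. The following is a minimal sketch rather than a complete reference; explain_out is a hypothetical directory name, and flag behavior may differ between Pig versions.

```
-- explain_out is a hypothetical output directory chosen for illustration
grunt> explain -brief relation_name;
grunt> explain -out explain_out relation_name;
grunt> explain -dot -out explain_out relation_name;
```

Here -brief leaves nested plans unexpanded, -out writes the plans to files under the given directory instead of the console, and -dot emits a format that can be passed to the Graphviz dot utility.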
Example
Assume we have a file student_data.txt in HDFS with the following content.
001,rajiv,reddy,9848022337,hyderabad
002,siddarth,battacharya,9848022338,kolkata
003,rajesh,khanna,9848022339,delhi
004,preethi,agarwal,9848022330,pune
005,trupthi,mohanthy,9848022336,bhuwaneshwar
006,archana,mishra,9848022335,chennai
We have read it into a relation named student using the load operator, as shown below.
grunt> student = load 'hdfs://localhost:9000/pig_data/student_data.txt' using PigStorage(',')
   as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );
Now, let us explain the relation named student using the explain operator, as shown below.
grunt> explain student;
Output
It will produce the following output.
grunt> explain student;
2015-10-05 11:32:43,660 [main] info org.apache.pig.newplan.logical.optimizer.logicalplanoptimizer -
{rules_enabled=[addforeach, columnmapkeyprune, constantcalculator,
groupbyconstparallelsetter, limitoptimizer, loadtypecastinserter, mergefilter,
mergeforeach, partitionfilteroptimizer, predicatepushdownoptimizer,
pushdownforeachflatten, pushupfilter, splitfilter, streamtypecastinserter]}
#-----------------------------------------------
# new logical plan:
#-----------------------------------------------
student: (name: lostore schema: id#31:int,firstname#32:chararray,lastname#33:chararray,phone#34:chararray,city#35:chararray)
|
|---student: (name: loforeach schema: id#31:int,firstname#32:chararray,lastname#33:chararray,phone#34:chararray,city#35:chararray)
| |
| (name: logenerate[false,false,false,false,false] schema: id#31:int,firstname#32:chararray,lastname#33:chararray,phone#34:chararray,city#35:chararray)columnprune:inputuids=[34, 35, 32, 33, 31]columnprune:outputuids=[34, 35, 32, 33, 31]
| | |
| | (name: cast type: int uid: 31)
| | |
| | |---id:(name: project type: bytearray uid: 31 input: 0 column: (*))
| | |
| | (name: cast type: chararray uid: 32)
| | |
| | |---firstname:(name: project type: bytearray uid: 32 input: 1 column: (*))
| | |
| | (name: cast type: chararray uid: 33)
| | |
| | |---lastname:(name: project type: bytearray uid: 33 input: 2 column: (*))
| | |
| | (name: cast type: chararray uid: 34)
| | |
| | |---phone:(name: project type: bytearray uid: 34 input: 3 column: (*))
| | |
| | (name: cast type: chararray uid: 35)
| | |
| | |---city:(name: project type: bytearray uid: 35 input: 4 column: (*))
| |
| |---(name: loinnerload[0] schema: id#31:bytearray)
| |
| |---(name: loinnerload[1] schema: firstname#32:bytearray)
| |
| |---(name: loinnerload[2] schema: lastname#33:bytearray)
| |
| |---(name: loinnerload[3] schema: phone#34:bytearray)
| |
| |---(name: loinnerload[4] schema: city#35:bytearray)
|
|---student: (name: loload schema: id#31:bytearray,firstname#32:bytearray,lastname#33:bytearray,phone#34:bytearray,city#35:bytearray)requiredfields:null
#-----------------------------------------------
# physical plan:
#-----------------------------------------------
student: store(fakefile:org.apache.pig.builtin.pigstorage) - scope-36
|
|---student: new for each(false,false,false,false,false)[bag] - scope-35
| |
| cast[int] - scope-21
| |
| |---project[bytearray][0] - scope-20
| |
| cast[chararray] - scope-24
| |
| |---project[bytearray][1] - scope-23
| |
| cast[chararray] - scope-27
| |
| |---project[bytearray][2] - scope-26
| |
| cast[chararray] - scope-30
| |
| |---project[bytearray][3] - scope-29
| |
| cast[chararray] - scope-33
| |
| |---project[bytearray][4] - scope-32
|
|---student: load(hdfs://localhost:9000/pig_data/student_data.txt:pigstorage(',')) - scope-19
2015-10-05 11:32:43,682 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.mrcompiler - file concatenation threshold: 100 optimistic? false
2015-10-05 11:32:43,684 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.multiqueryoptimizer - mr plan size before optimization: 1
2015-10-05 11:32:43,685 [main] info org.apache.pig.backend.hadoop.executionengine.mapreducelayer.multiqueryoptimizer - mr plan size after optimization: 1
#--------------------------------------------------
# map reduce plan
#--------------------------------------------------
mapreduce node scope-37
map plan
student: store(fakefile:org.apache.pig.builtin.pigstorage) - scope-36
|
|---student: new for each(false,false,false,false,false)[bag] - scope-35
| |
| cast[int] - scope-21
| |
| |---project[bytearray][0] - scope-20
| |
| cast[chararray] - scope-24
| |
| |---project[bytearray][1] - scope-23
| |
| cast[chararray] - scope-27
| |
| |---project[bytearray][2] - scope-26
| |
| cast[chararray] - scope-30
| |
| |---project[bytearray][3] - scope-29
| |
| cast[chararray] - scope-33
| |
| |---project[bytearray][4] - scope-32
|
|---student: load(hdfs://localhost:9000/pig_data/student_data.txt:pigstorage(',')) - scope-19--------
global sort: false
----------------
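The map reduce plan above contains only a map plan, since loading the file and casting its fields requires no shuffle. As a sketch of how the plans change with the script, explaining a relation that groups the data should typically add a reduce plan to the output. The relation names by_city and city_count and the field name total are illustrative and not part of the original example.

```
-- by_city, city_count, and total are hypothetical names used for illustration
grunt> by_city = group student by city;
grunt> city_count = foreach by_city generate group as city, COUNT(student) as total;
grunt> explain city_count;
```

Note that built-in function names such as COUNT are case sensitive in Pig Latin, while keywords such as group and foreach are not.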