Apache Pig Tutorial on Apache Pig Filter Operator

The filter operator selects the tuples of a relation that satisfy a given condition.

Syntax

Given below is the syntax of the filter operator.

grunt> relation2_name = filter relation1_name by (condition);
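The condition is a boolean expression over the fields of the relation, and the parentheses around it are optional. A minimal sketch of two common shapes follows; the relation people and its fields age and city are hypothetical names used only for illustration, not part of this tutorial's data.

```pig
-- people, age and city are hypothetical names, for illustration only.

-- Comparison operators: ==, !=, <, >, <=, >=
adults = filter people by (age >= 18);

-- Null test: keep only tuples where city has a value
has_city = filter people by city is not null;
```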

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.

student_details.txt

001,rajiv,reddy,21,9848022337,hyderabad
002,siddarth,battacharya,22,9848022338,kolkata
003,rajesh,khanna,22,9848022339,delhi
004,preethi,agarwal,21,9848022330,pune
005,trupthi,mohanthy,23,9848022336,bhuwaneshwar
006,archana,mishra,23,9848022335,chennai
007,komal,nayak,24,9848022334,trivendram
008,bharathi,nambiayar,24,9848022333,chennai

And we have loaded this file into Pig with the relation name student_details as shown below. Note that function names such as PigStorage are case sensitive in Pig, unlike keywords.

grunt> student_details = load 'hdfs://localhost:9000/pig_data/student_details.txt' using PigStorage(',')
   as (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

Let us now use the filter operator to get the details of the students whose city is chennai. String comparison is case sensitive, so the literal must match the stored value exactly.

grunt> filter_data = filter student_details by city == 'chennai';

Verification

Verify the relation filter_data using the dump operator as shown below.

grunt> dump filter_data;

Output

It produces the following output, displaying the contents of the relation filter_data. The ids print as 6 and 8 rather than 006 and 008 because id was loaded as an int.

(6,archana,mishra,23,9848022335,chennai)
(8,bharathi,nambiayar,24,9848022333,chennai)
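A condition can also combine several tests. The following is a minimal sketch, assuming the student_details relation loaded earlier; the relation names chennai_over_23, outside_chennai and r_names are made up for illustration.

```pig
-- Both tests must hold: city is chennai and age is above 23.
chennai_over_23 = filter student_details by city == 'chennai' and age > 23;

-- not negates a condition: every student whose city is not chennai.
outside_chennai = filter student_details by not (city == 'chennai');

-- matches takes a Java regular expression that must match the whole value:
-- here, first names that start with r.
r_names = filter student_details by firstname matches 'r.*';
```

With the sample data, dump chennai_over_23 should print only the tuple for id 8, since archana (id 6) is 23 and fails the age test.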
