Getting Started: DataFu Pig

Apache DataFu Pig is a collection of user-defined functions for working with large-scale data in Apache Pig. It has a number of useful functions available:

Statistics: compute quantiles, median, variance, Wilson binary confidence, etc.
Set operations: perform set intersection...