python - Spark DataFrame TimestampType - how to get Year, Month, Day values from field?
I have a Spark DataFrame; take(5) returns the top rows as follows:
[Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=1, value=638.55),
 Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=2, value=638.55),
 Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=3, value=638.55),
 Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=4, value=638.55),
 Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=5, value=638.55)]
Its schema is defined as:
elevdf.printSchema()
root
 |-- date: timestamp (nullable = true)
 |-- hour: long (nullable = true)
 |-- value: double (nullable = true)
How do I get the year, month, and day values from the 'date' field?
You can use a simple map, as with any other RDD:
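import datetime
from pyspark.sql import Row

# sc and sqlContext are assumed to exist already, as in the PySpark shell.
elevdf = sqlContext.createDataFrame(sc.parallelize([
    Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=1, value=638.55),
    Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=2, value=638.55),
    Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=3, value=638.55),
    Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=4, value=638.55),
    Row(date=datetime.datetime(1984, 1, 1, 0, 0), hour=5, value=638.55)]))

# Map over the underlying RDD of Row objects and read the field by name;
# .rdd is needed on Spark 2.0+, where DataFrame.map no longer exists.
(elevdf.rdd
    .map(lambda row: (row.date.year, row.date.month, row.date.day))
    .collect())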
And the result is:
[(1984, 1, 1), (1984, 1, 1), (1984, 1, 1), (1984, 1, 1), (1984, 1, 1)]
BTW: datetime.datetime stores the hour anyway, so keeping it in a separate column seems to be a waste of memory.
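To illustrate that remark, here is a minimal plain-Python sketch of folding the hour into the timestamp itself. The helper name merge_hour and the two sample rows are hypothetical, and it assumes the hour column counts hours since midnight of the date, which the question does not state.

```python
import datetime

# Hypothetical sample mirroring the question's rows: (date, hour, value).
rows = [
    (datetime.datetime(1984, 1, 1, 0, 0), 1, 638.55),
    (datetime.datetime(1984, 1, 1, 0, 0), 2, 638.55),
]


def merge_hour(date, hour):
    """Fold a separate hour offset into the timestamp.

    Assumption: `hour` counts hours since midnight of `date`. A timedelta is
    used instead of date.replace(hour=...) so that a value of 24 rolls over
    to the next day rather than raising ValueError.
    """
    return date + datetime.timedelta(hours=hour)


merged = [(merge_hour(d, h), v) for d, h, v in rows]

# Year, month, day and hour are all read from the single timestamp.
parts = [(ts.year, ts.month, ts.day, ts.hour) for ts, _ in merged]
print(parts)
```

With the timestamp carrying the hour, the separate hour column could be dropped, if that assumption about its meaning holds for your data.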
Since Spark 1.5 you can use a number of built-in date processing functions:
import datetime
from pyspark.sql.functions import year, month, dayofmonth

# sc is assumed to exist already, as in the PySpark shell.
elevdf = sc.parallelize([
    (datetime.datetime(1984, 1, 1, 0, 0), 1, 638.55),
    (datetime.datetime(1984, 1, 1, 0, 0), 2, 638.55),
    (datetime.datetime(1984, 1, 1, 0, 0), 3, 638.55),
    (datetime.datetime(1984, 1, 1, 0, 0), 4, 638.55),
    (datetime.datetime(1984, 1, 1, 0, 0), 5, 638.55)
]).toDF(["date", "hour", "value"])

# Extract each date part as its own named column.
elevdf.select(
    year("date").alias('year'),
    month("date").alias('month'),
    dayofmonth("date").alias('day')
).show()
# +----+-----+---+
# |year|month|day|
# +----+-----+---+
# |1984|    1|  1|
# |1984|    1|  1|
# |1984|    1|  1|
# |1984|    1|  1|
# |1984|    1|  1|
# +----+-----+---+