A DataCamp tutorial gave me a challenging exercise that I felt I should share here. The exercise pre-loads a .csv file as a Pandas DataFrame and then works with one of its columns, a Pandas Series, to analyze the data.
You can think of DataFrame columns as single-dimension arrays called Series.
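To make that concrete, here is a minimal sketch of my own. The exercise's .csv file is not reproduced here, so the two rows below are made-up sample data, and the "lang" column is only an assumption added to show a second column. Selecting one column from the DataFrame gives back a one-dimensional Series.

```python
import pandas as pd

# Hypothetical sample data standing in for the tutorial's .csv file
df = pd.DataFrame({
    "created_at": [
        "Tue Mar 29 23:40:17 +0000 2016",
        "Tue Mar 29 23:40:19 +0000 2016",
    ],
    "lang": ["en", "en"],  # assumed extra column, for illustration only
})

# Selecting a single column returns a Series, not a DataFrame
column = df["created_at"]

print(type(column).__name__)  # name of the column's type
print(column.ndim)            # number of dimensions of the column
print(df.ndim)                # number of dimensions of the whole DataFrame
```

Because a Series is iterable, it can be looped over directly, which is what the list comprehensions in the exercise rely on.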
The example below uses the DataFrame df and a list comprehension to extract the 12th to 19th characters of each string in the “created_at” column, which hold the clock time. The second list comprehension adds a condition, so an entry is kept only when entry[17:19] is equal to '19'.
The code is below:
# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time]

# Print the extracted times
print(tweet_clock_time)

# Access the 12th to 19th characters in the string to extract the time.
# Add a conditional expression that checks whether entry[17:19] is equal to '19'.
tweet_clock_time = [entry[11:19] for entry in tweet_time if entry[17:19] == '19']

# Print the extracted times
print(tweet_clock_time)
Both print() statements print the contents of the variable tweet_clock_time, which holds the clock times of the records in the .csv file, under different conditions. The first prints the time of every record; the second prints only the times whose seconds value is 19.
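For anyone who wants to try this without the tutorial's data, here is a self-contained sketch. The three timestamps are made-up sample values in the same layout as the “created_at” strings, and the names all_times, times_at_19 and vectorised_at_19 are my own, not the exercise's. The last part shows an alternative that the exercise did not use: the Pandas .str accessor, which slices every string in the Series at once.

```python
import pandas as pd

# Hypothetical sample data; the real exercise loads these from a .csv file
df = pd.DataFrame({
    "created_at": [
        "Tue Mar 29 23:40:17 +0000 2016",
        "Tue Mar 29 23:40:19 +0000 2016",
        "Tue Mar 29 23:41:19 +0000 2016",
    ]
})
tweet_time = df["created_at"]

# Plain list comprehension: characters 12 to 19 are the clock time (HH:MM:SS)
all_times = [entry[11:19] for entry in tweet_time]

# Conditional list comprehension: keep only entries whose seconds are '19'
times_at_19 = [entry[11:19] for entry in tweet_time if entry[17:19] == "19"]

# Alternative (not part of the exercise): slice with the .str accessor
# and filter with a boolean mask instead of looping in Python
clock = tweet_time.str[11:19]
vectorised_at_19 = clock[tweet_time.str[17:19] == "19"].tolist()

print(all_times)
print(times_at_19)
print(vectorised_at_19)
```

Both approaches give the same filtered list here. The list comprehension is what the exercise asks for, while the .str version stays inside Pandas and may read more naturally on larger DataFrames.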